Re: [Freeipmi-devel] lib: raw command and threads

2013-11-22 Thread Thomas Cadeau

Thanks a lot for your answer.

The approach you propose does not fit what we want to do.
I re-ran on safe CPUs without any trouble.

Once I have a real pool of CPUs with no other problems, I will
let you know if the problem occurs again.


Thomas


___
Freeipmi-devel mailing list
Freeipmi-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/freeipmi-devel


Re: [Freeipmi-devel] lib: raw command and threads

2013-11-21 Thread Albert Chu
Hi Thomas,

I did a quick sanity test on my system and it worked (of course, it may
not have been exactly the way you did things).

The trace indicates the segfault is here:

 #0  0x7f4e278c89a9 in inb (ctx=0x7f4e28001770) at /usr/include/sys/io.h:48

That is during port I/O (inb reads from an x86 I/O port).  I suppose a
segfault could happen if the in/out call was going to a bad address.  It
might suggest some corruption is happening.  Is it possible you're
corrupting some data structure somewhere?  Perhaps the
close/destroy/re-create works because it fixes the corruption?

In all of FreeIPMI (especially the multi-ranged host access in the
tools), we create a context per thread for communication, e.g.

launch_thread
    ctx = ipmi_ctx_create();
    ipmi_ctx_find_inband(ctx, ...);
    loop
        ipmi_cmd_raw
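
The context-per-thread pattern outlined above could be sketched in C with POSIX threads roughly as follows. This is only an illustration under stated assumptions: so that it builds without libfreeipmi or IPMI hardware, `fake_ctx_open`, `fake_cmd_raw` and `fake_ctx_close` are hypothetical stand-ins for `ipmi_ctx_create` plus `ipmi_ctx_find_inband`, `ipmi_cmd_raw`, and `ipmi_ctx_close` plus `ipmi_ctx_destroy`, and `run_workers` is an invented helper name. One possible reason the pattern matters for in-band access (an assumption, not something confirmed in this thread): on Linux, port access granted through ioperm(2) or iopl(2) applies to the calling thread, so a thread that neither set up nor inherited that access can fault in inb().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for ipmi_ctx_t, so the sketch builds without
   libfreeipmi or a BMC. */
struct fake_ctx { int commands_sent; };

/* Stand-in for ipmi_ctx_create() followed by ipmi_ctx_find_inband(). */
static struct fake_ctx *fake_ctx_open(void) {
    return calloc(1, sizeof(struct fake_ctx));
}

/* Stand-in for ipmi_cmd_raw(); returns 0 on success. */
static int fake_cmd_raw(struct fake_ctx *ctx) {
    ctx->commands_sent++;
    return 0;
}

/* Stand-in for ipmi_ctx_close() followed by ipmi_ctx_destroy(). */
static void fake_ctx_close(struct fake_ctx *ctx) { free(ctx); }

struct worker_args { int iterations; int completed; };

/* Each worker opens, uses, and closes its own context; the context
   pointer never leaves the thread that created it. */
static void *worker(void *arg) {
    struct worker_args *wa = arg;
    assert(wa != NULL);
    struct fake_ctx *ctx = fake_ctx_open();
    if (ctx == NULL)
        return NULL;
    for (int i = 0; i < wa->iterations; i++) {
        if (fake_cmd_raw(ctx) < 0)
            break;
        /* use result */
    }
    wa->completed = ctx->commands_sent;
    fake_ctx_close(ctx);
    return NULL;
}

/* Starts nthreads workers and returns the total number of commands
   completed, or -1 on bad input or if a thread could not be started. */
int run_workers(int nthreads, int iterations) {
    if (nthreads <= 0)
        return -1;
    pthread_t *tids = calloc((size_t)nthreads, sizeof(*tids));
    struct worker_args *args = calloc((size_t)nthreads, sizeof(*args));
    int started = 0, total = 0, failed = 0;
    if (tids == NULL || args == NULL) {
        free(tids);
        free(args);
        return -1;
    }
    for (; started < nthreads; started++) {
        args[started].iterations = iterations;
        if (pthread_create(&tids[started], NULL, worker, &args[started]) != 0) {
            failed = 1;
            break;
        }
    }
    /* Join only the threads that were actually started. */
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        total += args[i].completed;
    }
    free(tids);
    free(args);
    return failed ? -1 : total;
}
```

Calling `run_workers(4, 3)` starts four threads that each issue three commands on their own context. In real code the stand-ins would be replaced by the FreeIPMI calls, with every return code checked.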

Have you considered doing it this way?

Al


-- 
Albert Chu
ch...@llnl.gov
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory





[Freeipmi-devel] lib: raw command and threads

2013-11-21 Thread Thomas Cadeau



Hi all,


I am currently trying to call a raw command several times.
Here are the functions I call:


ctx = ipmi_ctx_create()

ipmi_ctx_find_inband(ctx,
                     NULL,                // driver_type
                     0,                   // disable_auto_probe
                     0,                   // driver_address
                     0,                   // register_spacing
                     0,                   // driver_device
                     0,                   // workaround_flags
                     IPMI_FLAGS_DEFAULT   // flags (0)
                     )

ipmi_cmd_raw(ctx,
             0x00,               // lun (logical unit number)
             0x3A,               // net_fn (instead of IPMI_NET_FN_SENSOR_EVENT_RQ)
             bytes_rq,           // request data (const void *)
             2,                  // request length in bytes
             bytes_rs,           // response buffer (void *)
             IPMI_RAW_MAX_ARGS   // maximum response length
             )

I check all return codes.

If I create a simple example with a loop, I have no problem.

ctx = ipmi_ctx_create()
ipmi_ctx_find_inband(...)
for (...) {
    ipmi_cmd_raw(...)
    // use result
}


Then I tried it inside an internal project. During initialization I call
the three functions; after that, each time I want to update, a thread is
created that calls ipmi_cmd_raw(...) and does all the operations.



ctx = ipmi_ctx_create()
ipmi_ctx_find_inband(...)
ipmi_cmd_raw(...)
// use result
...
// at a fixed frequency:
launch thread
    ipmi_cmd_raw(...)
    // use result

In this case I have no problem on some CPUs, but on others I get a
segfault (core dump):

#0  0x7f4e278c89a9 in inb (ctx=0x7f4e28001770) at /usr/include/sys/io.h:48
#1  _ipmi_kcs_get_status (ctx=0x7f4e28001770) at driver/ipmi-kcs-driver.c:533
#2  0x7f4e278c8e50 in _ipmi_kcs_wait_for_ibf_clear (ctx=0x7f4e28001770)
    at driver/ipmi-kcs-driver.c:656
#3  0x7f4e278c91d6 in ipmi_kcs_write (ctx=0x7f4e28001770, buf=0x7f4e28003420, buf_len=3)
    at driver/ipmi-kcs-driver.c:845
#4  0x7f4e27898bc1 in _kcs_cmd_write (ctx=0x7f4e28005190, obj_cmd_rq=<value optimized out>,
    obj_cmd_rs=0x7f4e28001ae0) at api/ipmi-kcs-driver-api.c:255
#5  api_kcs_cmd (ctx=0x7f4e28005190, obj_cmd_rq=<value optimized out>, obj_cmd_rs=0x7f4e28001ae0)
    at api/ipmi-kcs-driver-api.c:398
#6  0x7f4e27899091 in api_kcs_cmd_raw (ctx=0x7f4e28005190, buf_rq=0x7f4e2e390a60, buf_rq_len=2,
    buf_rs=0x7f4e2e38f8c0, buf_rs_len=4512) at api/ipmi-kcs-driver-api.c:750
#7  0x7f4e2788f9a9 in ipmi_cmd_raw (ctx=0x7f4e28005190, lun=<value optimized out>,
    net_fn=<value optimized out>, buf_rq=0x7f4e2e390a60, buf_rq_len=2, buf_rs=0x7f4e2e38f8c0,
    buf_rs_len=4512) at api/ipmi-api.c:1983

If I force a reconnect, I have no problem, but this workaround is not a
good solution:

ctx = ipmi_ctx_create()
ipmi_ctx_find_inband(...)
ipmi_cmd_raw(...)
// use result
...
// at a fixed frequency:
launch thread
    ipmi_ctx_close(ctx)
    ipmi_ctx_destroy(ctx)
    ctx = ipmi_ctx_create()
    ipmi_ctx_find_inband(...)
    ipmi_cmd_raw(...)
    // use result

Note that I checked the BMC version on each node, and I use
freeipmi-1.2.1.

I also have a safeguard that ensures only one use of ctx can happen at a time.
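
A guard of the kind mentioned above might look like the following sketch, which serializes every use of one shared context behind a POSIX mutex. It rests on assumptions: `struct shared_ctx`, `stub_cmd_raw`, `guarded_cmd_raw` and `run_two_updaters` are hypothetical names, and `stub_cmd_raw` merely stands in for `ipmi_cmd_raw` so the code builds without libfreeipmi or a BMC. A mutex like this prevents overlapping use, but it does not change which thread issues the command, so it would not help if the fault depends on the calling thread.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a shared ipmi_ctx_t plus its guard. */
struct shared_ctx {
    pthread_mutex_t lock;
    int busy;          /* 1 while a command is in flight */
    int overlaps;      /* commands that started while another was in flight */
    int commands_sent;
};

/* Stand-in for ipmi_cmd_raw(); records any overlapping use. */
static int stub_cmd_raw(struct shared_ctx *ctx) {
    if (ctx->busy)
        ctx->overlaps++;
    ctx->busy = 1;
    ctx->commands_sent++;
    ctx->busy = 0;
    return 0;
}

/* Every caller goes through this wrapper, so at most one command
   touches the context at any time. */
int guarded_cmd_raw(struct shared_ctx *ctx) {
    assert(ctx != NULL);
    pthread_mutex_lock(&ctx->lock);
    int rv = stub_cmd_raw(ctx);
    pthread_mutex_unlock(&ctx->lock);
    return rv;
}

/* Update thread: issues 1000 guarded commands on the shared context. */
static void *update_thread(void *arg) {
    for (int i = 0; i < 1000; i++)
        guarded_cmd_raw(arg);
    return NULL;
}

/* Runs two update threads against one context and returns the number
   of commands sent, or -1 if a thread could not be started. */
int run_two_updaters(struct shared_ctx *ctx) {
    pthread_t a, b;
    int rv = -1;
    assert(ctx != NULL);
    pthread_mutex_init(&ctx->lock, NULL);
    ctx->busy = 0;
    ctx->overlaps = 0;
    ctx->commands_sent = 0;
    if (pthread_create(&a, NULL, update_thread, ctx) == 0) {
        if (pthread_create(&b, NULL, update_thread, ctx) == 0) {
            pthread_join(b, NULL);
            rv = 0;
        }
        pthread_join(a, NULL);
    }
    pthread_mutex_destroy(&ctx->lock);
    return rv == 0 ? ctx->commands_sent : -1;
}
```

After `run_two_updaters(&ctx)` returns, `ctx.overlaps` stays at zero because the mutex never lets two commands run on the context at once.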

Do you have any idea what is happening, and whether I am doing something wrong?
Is there a function to check whether the connection is still open and whether I
need to reopen it?


Thank you for your help.

Thomas Cadeau
