On 30-Dec-10 03:11, Greg Walton wrote:
> Ok, I've convinced myself that my previous idea is (at least really
> close to)correct, messages after the first one that get written into
> dispatch_buffer are not aligned.
>
> Here's some debug cut and past showing testcpg bus error, gdb prints of
> the pointers/structs in question, with dmesg output showing the bus
> error address which matches the address accessed for dispatch_data->id.
> After that there is some fprintf() debugging output that i inserted into
> coroipcc.c and coroipcs.c to see what's happening to distpatch_buffer etc.
>
> No services are started in the corosync.conf file used.
>
> so now the question is... where do i add some code to pad the end of a
> message in dispatch_buffer or shift the start of one to an aligned address?
>
> r...@serva:/usr/local/src/cluster/flatiron# uname -a
> Linux serva 2.6.32-5-kirkwood #1 Fri Nov 26 07:01:06 UTC 2010 armv5tel
> GNU/Linux
> r...@serva:/usr/local/src/cluster/flatiron# corosync
> r...@serva:/usr/local/src/cluster/flatiron# test/testcpg
> Local node id is 3301c80a
> membership list
> node id 855754762 pid 4206
> Type EXIT to finish
>
> ConfchgCallback: group 'GROUP'
> joined node/pid 855754762/4206 reason: 1
> nodes in group now 1
> node/pid 855754762/4206
> asdf
> DeliverCallback: message (len=6)from node/pid 855754762/4206: 'asdf
> '
> asdf
> Bus error (core dumped)
> r...@serva:/usr/local/src/cluster/flatiron#
> r...@serva:/usr/local/src/cluster/flatiron# gdb test/testcpg core
> GNU gdb (GDB) 7.0.1-debian
> Copyright (C) 2009 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "arm-linux-gnueabi".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /usr/local/src/cluster/flatiron/test/testcpg...done.
> Reading symbols from /usr/lib/libcpg.so.4...done.
> Loaded symbols for /usr/lib/libcpg.so.4
> Reading symbols from /usr/lib/libcoroipcc.so.4...done.
> Loaded symbols for /usr/lib/libcoroipcc.so.4
> Reading symbols from /lib/librt.so.1...(no debugging symbols found)...done.
> Loaded symbols for /lib/librt.so.1
> Reading symbols from /lib/libpthread.so.0...(no debugging symbols
> found)...done.
> Loaded symbols for /lib/libpthread.so.0
> Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
> Loaded symbols for /lib/libdl.so.2
> Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
> Loaded symbols for /lib/libc.so.6
> Reading symbols from /lib/ld-linux.so.3...(no debugging symbols
> found)...done.
> Loaded symbols for /lib/ld-linux.so.3
> Core was generated by `test/testcpg'.
> Program terminated with signal 7, Bus error.
> #0 cpg_dispatch (handle=<value optimized out>,
> dispatch_types=<value optimized out>) at cpg.c:339
> 339 switch (dispatch_data->id) {
> (gdb) bt
> #0 cpg_dispatch (handle=<value optimized out>,
> dispatch_types=<value optimized out>) at cpg.c:339
> #1 0x00008f10 in main (argc=<value optimized out>, argv=<value
> optimized out>)
> at testcpg.c:237
> (gdb) print dispatch_data
> $1 = (coroipc_response_header_t *) 0x403b41a6
> (gdb) print dispatch_data->id
> $2 = 5
> (gdb) print &dispatch_data->id
> $3 = (int *) 0x403b41ae
> (gdb) quit
> r...@serva:/usr/local/src/cluster/flatiron# dmesg
> [1968597.391683] Alignment trap: testcpg (4206) PC=0x40031c40
> Instr=0xe5942008 Address=0x403b41ae FSR 0x001
> r...@serva:/usr/local/src/cluster/flatiron# cat /tmp/corosync.debug
> coroipcc.c circular_memory_map() - start of memory - *buf:0x403b4000
> coroipcs.c shared_mem_dispatch_bytes_left n_read:0 n_write:0
> bytes_left:1048575
> coroipcc.c coroipc_dispatch_get() dispatch_buffer:0x403b4000
> control_buffer->read:0
> coroipcc.c coroipc_dispatch_get()
> dispatch_buffer[control_buffer->read]:0x403b4000
> coroipcc.c coroipc_dispatch_put() addr:0x403b4000 read_idx:0
> header->size:232 dispatch_size:1048576
> coroipcc.c coroipc_dispatch_put() modulus calc:232
> coroipcs.c shared_mem_dispatch_bytes_left n_read:232 n_write:232
> bytes_left:1048575
> coroipcc.c coroipc_dispatch_get() dispatch_buffer:0x403b40e8
> control_buffer->read:232
> coroipcc.c coroipc_dispatch_get()
> dispatch_buffer[control_buffer->read]:0x403b40e8
> coroipcc.c coroipc_dispatch_put() addr:0x403b4000 read_idx:232
> header->size:190 dispatch_size:1048576
> coroipcc.c coroipc_dispatch_put() modulus calc:422
> coroipcs.c shared_mem_dispatch_bytes_left n_read:422 n_write:422
> bytes_left:1048575
> coroipcc.c coroipc_dispatch_get() dispatch_buffer:0x403b41a6
> control_buffer->read:422
> coroipcc.c coroipc_dispatch_get()
> dispatch_buffer[control_buffer->read]:0x403b41a6
> r...@serva:/usr/local/src/cluster/flatiron#
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais

Ok, last reply to myself for today, but here is an additional test that supports the misalignment idea:
If I run testcpg and send only 2 chars, which ends up as a length of 4 for the message (so, I'm assuming, the message as written to dispatch_buffer is some multiple of 4 bytes), then there is no alignment error.

Here's the log: every len=4 message succeeds, and the failure comes on the first message after one whose length is not a multiple of 4:

DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
asd
DeliverCallback: message (len=5)from node/pid 855754762/5792: 'asd
'
as
Bus error (core dumped)
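
To make the padding idea from the quoted message concrete, here is a minimal C sketch that rounds each message size up to a multiple of 8 before advancing the read index. The helper names align_up and next_read_index are hypothetical and are not corosync functions; the sizes 232 and 190 and the buffer size 1048576 are taken from the debug output above. It only illustrates the arithmetic. Where such rounding would belong in coroipcc.c or coroipcs.c is not settled here, and the writer and the reader would both have to apply the same rounding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of corosync): round n up to the next
 * multiple of align. align must be a non-zero power of two. */
static size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical helper (not part of corosync): advance a circular-buffer
 * read index past a message of msg_size bytes, padding the message so the
 * next one starts on an align-byte boundary. Assumes buf_size is itself a
 * multiple of align and the buffer base address is aligned. */
static size_t next_read_index(size_t read_idx, size_t msg_size,
                              size_t buf_size, size_t align)
{
    return (read_idx + align_up(msg_size, align)) % buf_size;
}
```

With the logged sizes, the unpadded offsets are 0, 232 and 232 + 190 = 422, so the third header starts at 0x403b4000 + 422 = 0x403b41a6, which is 2 mod 4. With 8-byte rounding, 190 becomes 192 and the third header would start at offset 424 (0x403b41a8), which is aligned.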
