On 30-Dec-10 03:11, Greg Walton wrote:
> Ok, I've convinced myself that my previous idea is (at least really
> close to) correct: messages after the first one that get written into
> dispatch_buffer are not aligned.
>
> Here's some debug cut and paste showing the testcpg bus error, gdb prints
> of the pointers/structs in question, and dmesg output showing the bus
> error address, which matches the address accessed for dispatch_data->id.
> After that there is some fprintf() debugging output that I inserted into
> coroipcc.c and coroipcs.c to see what's happening to dispatch_buffer etc.
>
> No services are started in the corosync.conf file used.
>
> So now the question is... where do I add some code to pad the end of a
> message in dispatch_buffer, or shift the start of one to an aligned address?
>
> r...@serva:/usr/local/src/cluster/flatiron# uname -a
> Linux serva 2.6.32-5-kirkwood #1 Fri Nov 26 07:01:06 UTC 2010 armv5tel
> GNU/Linux
> r...@serva:/usr/local/src/cluster/flatiron# corosync
> r...@serva:/usr/local/src/cluster/flatiron# test/testcpg
> Local node id is 3301c80a
> membership list
> node id 855754762 pid 4206
> Type EXIT to finish
>
> ConfchgCallback: group 'GROUP'
> joined node/pid 855754762/4206 reason: 1
> nodes in group now 1
> node/pid 855754762/4206
> asdf
> DeliverCallback: message (len=6)from node/pid 855754762/4206: 'asdf
> '
> asdf
> Bus error (core dumped)
> r...@serva:/usr/local/src/cluster/flatiron#
> r...@serva:/usr/local/src/cluster/flatiron# gdb test/testcpg core
> GNU gdb (GDB) 7.0.1-debian
> Copyright (C) 2009 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "arm-linux-gnueabi".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /usr/local/src/cluster/flatiron/test/testcpg...done.
> Reading symbols from /usr/lib/libcpg.so.4...done.
> Loaded symbols for /usr/lib/libcpg.so.4
> Reading symbols from /usr/lib/libcoroipcc.so.4...done.
> Loaded symbols for /usr/lib/libcoroipcc.so.4
> Reading symbols from /lib/librt.so.1...(no debugging symbols found)...done.
> Loaded symbols for /lib/librt.so.1
> Reading symbols from /lib/libpthread.so.0...(no debugging symbols
> found)...done.
> Loaded symbols for /lib/libpthread.so.0
> Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
> Loaded symbols for /lib/libdl.so.2
> Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
> Loaded symbols for /lib/libc.so.6
> Reading symbols from /lib/ld-linux.so.3...(no debugging symbols
> found)...done.
> Loaded symbols for /lib/ld-linux.so.3
> Core was generated by `test/testcpg'.
> Program terminated with signal 7, Bus error.
> #0  cpg_dispatch (handle=<value optimized out>,
>     dispatch_types=<value optimized out>) at cpg.c:339
> 339                     switch (dispatch_data->id) {
> (gdb) bt
> #0  cpg_dispatch (handle=<value optimized out>,
>     dispatch_types=<value optimized out>) at cpg.c:339
> #1  0x00008f10 in main (argc=<value optimized out>, argv=<value
> optimized out>)
>     at testcpg.c:237
> (gdb) print dispatch_data
> $1 = (coroipc_response_header_t *) 0x403b41a6
> (gdb) print dispatch_data->id
> $2 = 5
> (gdb) print &dispatch_data->id
> $3 = (int *) 0x403b41ae
> (gdb) quit
> r...@serva:/usr/local/src/cluster/flatiron# dmesg
> [1968597.391683] Alignment trap: testcpg (4206) PC=0x40031c40
> Instr=0xe5942008 Address=0x403b41ae FSR 0x001
> r...@serva:/usr/local/src/cluster/flatiron# cat /tmp/corosync.debug
> coroipcc.c circular_memory_map() - start of memory - *buf:0x403b4000
> coroipcs.c shared_mem_dispatch_bytes_left n_read:0 n_write:0
> bytes_left:1048575
> coroipcc.c coroipc_dispatch_get() dispatch_buffer:0x403b4000
> control_buffer->read:0
> coroipcc.c coroipc_dispatch_get()
> dispatch_buffer[control_buffer->read]:0x403b4000
> coroipcc.c coroipc_dispatch_put() addr:0x403b4000 read_idx:0
> header->size:232 dispatch_size:1048576
> coroipcc.c coroipc_dispatch_put() modulus calc:232
> coroipcs.c shared_mem_dispatch_bytes_left n_read:232 n_write:232
> bytes_left:1048575
> coroipcc.c coroipc_dispatch_get() dispatch_buffer:0x403b40e8
> control_buffer->read:232
> coroipcc.c coroipc_dispatch_get()
> dispatch_buffer[control_buffer->read]:0x403b40e8
> coroipcc.c coroipc_dispatch_put() addr:0x403b4000 read_idx:232
> header->size:190 dispatch_size:1048576
> coroipcc.c coroipc_dispatch_put() modulus calc:422
> coroipcs.c shared_mem_dispatch_bytes_left n_read:422 n_write:422
> bytes_left:1048575
> coroipcc.c coroipc_dispatch_get() dispatch_buffer:0x403b41a6
> control_buffer->read:422
> coroipcc.c coroipc_dispatch_get()
> dispatch_buffer[control_buffer->read]:0x403b41a6
> r...@serva:/usr/local/src/cluster/flatiron#
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais
Ok, last reply to myself for today, but here is an additional test that
supports the misalignment idea:

If I run testcpg and send only 2 chars, the message ends up as len=4
(and, I'm assuming, its total size as written to dispatch_buffer is
some multiple of 4), and there is no alignment error.
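
To make the arithmetic concrete, here is a minimal sketch that redoes the
offset calculation from the quoted debug output. The helper names
next_offset() and is_aligned() are made up for illustration and are not
part of coroipcc.c; only the numbers (base 0x403b4000, sizes 232 and 190,
dispatch_size 1048576) come from the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: offset of the next message in a circular buffer of
 * buf_size bytes, mirroring the "modulus calc" values in the debug log. */
static size_t next_offset(size_t read_idx, size_t msg_size, size_t buf_size)
{
    assert(buf_size > 0);
    return (read_idx + msg_size) % buf_size;
}

/* Hypothetical helper: nonzero if addr is a multiple of alignment,
 * where alignment must be a power of two. */
static int is_aligned(uintptr_t addr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}
```

With the logged values, next_offset(0, 232, 1048576) is 232, which is
4-byte aligned, but next_offset(232, 190, 1048576) is 422, and
0x403b4000 + 422 = 0x403b41a6 is not. The id field sits 8 bytes further
on, at 0x403b41ae, which is the address in the dmesg alignment trap.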

Here's the log: the len=4 messages succeed, then the first message after
a non-4-byte message (len=5) fails:

DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
as
DeliverCallback: message (len=4)from node/pid 855754762/5792: 'as
'
asd
DeliverCallback: message (len=5)from node/pid 855754762/5792: 'asd
'
as
Bus error (core dumped)
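
One possible shape for the padding idea, as a rough sketch only: round
each message size up to an alignment boundary before advancing the buffer
index. The names align_up() and advance_padded() are hypothetical and this
is not the actual corosync code or fix; it also assumes the writer side
(coroipcs.c) and the reader side (coroipcc.c) would both apply the same
rounding, otherwise the two indexes would drift apart.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of alignment.
 * alignment must be a nonzero power of two. */
static size_t align_up(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical index advance: step past a message of msg_size bytes plus
 * padding, wrapping at buf_size. Assumes buf_size is a multiple of
 * alignment so that wrapped offsets stay aligned too. */
static size_t advance_padded(size_t idx, size_t msg_size,
                             size_t buf_size, size_t alignment)
{
    assert(buf_size > 0 && buf_size % alignment == 0);
    return (idx + align_up(msg_size, alignment)) % buf_size;
}
```

With the sizes from the earlier debug output, advance_padded(232, 190,
1048576, 8) gives 424 instead of 422, so the next header would start on an
8-byte boundary. I picked 8 here as a guess at a safe alignment; 4 would
already avoid the trap seen above.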

