Hello,

On Fri, 11 May 2012 18:52:04 +0400 Andrew Savchenko wrote:
> I use maui-3.3.1. showstats with no arguments or with -s/-v argument
> segfaults, in both cases gdb backtrace is the same:
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x00007ffff7535206 in __rawmemchr_sse2 () from /lib64/libc.so.6
> (gdb) bt
> #0 0x00007ffff7535206 in __rawmemchr_sse2 () from /lib64/libc.so.6
> #1 0x00007ffff751f570 in _IO_str_init_static_internal () from
> /lib64/libc.so.6
> #2 0x00007ffff750e0f5 in __isoc99_vsscanf () from /lib64/libc.so.6
> #3 0x00007ffff750e088 in __isoc99_sscanf () from /lib64/libc.so.6
> #4 0x0000000000406e35 in MCShowSchedulerStatistics (
> Buffer=0x22fd15f "1336747509 0 101 0 0 0 16 124 514144 16 124 514144 40
> 40 16946.078789 5.612000 917.944445 252951046507.442596 0.000000 0 0
> 248.240124 0.112964 0.000000 0 30 0.000000 6 0.016667 0.000000 0 1.075000 2
> 3"...) at omclient.c:3773
> #5 0x000000000040fea8 in main (argc=<optimized out>, argv=<optimized out>)
> at mclient.c:510
> (gdb)
>
> Maui was compiled with CFLAGS="-march=core2 -O2 -ggdb".

1) This bug occurs only when Maui is compiled with a non-zero optimization
level: with -O0 it works, with -O1 and higher it fails as above. This is a
good hint of some memory misalignment or misuse in the code, because the -O1
optimization level is stable and safe. My compiler is gcc-4.5.3, and my
system is Gentoo, built entirely with this compiler using even more
aggressive options without any trouble.

2) The problem was a bad server reply, which was mishandled by the client's
parser. In a normal reply the server returns two strings in the ARG field
for a CMD=showstat request:

"CK=1007dd0424073223 TS=1336781650 AUTH=root DT=SC=1 ARG=1336781623\n1336781343 1 27970 0 0 0 16 124 514144 16 124 514144 40 40 18071.720111 5.612000 917.944445 269753236437.359100 0.000000 0 0 2"...

In a corrupted reply the first string was omitted, e.g.:

"CK=1007dd0424073223 TS=1336781650 AUTH=root DT=SC=1 ARG=1336781343 1 27970 0 0 0 16 124 514144 16 124 514144 40 40 18071.720111 5.612000 917.944445 269753236437.359100 0.000000 0 0 2"...

I found that the problem lies in moab/MSched.c, in the function
MSchedStatToString(): the sprintf call on line 4104 uses the Buf string as
both the destination and an argument. This is wrong and must be avoided,
because man sprintf says:

  However, the standards explicitly note that the results are undefined
  if source and destination buffers overlap when calling sprintf()

With optimization enabled, this undefined behavior results in the first
string being lost, which produces the corrupted reply shown above.

The attached patch fixes this issue by joining the two sprintf calls into a
single one without buffer overlaps.

Best regards,
Andrew Savchenko
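
P.S. For illustration only, here is a minimal sketch of the pattern involved. It is not the code from moab/MSched.c and not the attached patch: the function name build_stat_reply, its parameters, and the two-field second string are hypothetical, chosen just to mimic the "timestamp, newline, statistics" layout of the ARG field. The unsafe form would be something like sprintf(Buf, "%s%ld %d", Buf, ...), where Buf is both destination and source. The sketch instead appends at an offset, so the buffer is never passed as its own argument:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Maui): builds "<now>\n<start> <jobs>"
 * in buf. Returns the number of characters written (excluding the
 * terminating NUL), or -1 if the buffer is too small.
 */
int build_stat_reply(char *buf, size_t size, long now, long start, int jobs)
{
    int n;
    size_t used;

    assert(buf != NULL && size > 0);

    /* First string: the timestamp, terminated by a newline. */
    n = snprintf(buf, size, "%ld\n", now);
    if (n < 0 || (size_t)n >= size)
        return -1;
    used = (size_t)n;

    /*
     * Second string: written directly after the first one.
     * The destination is buf + used; buf itself is never an argument,
     * so source and destination cannot overlap.
     */
    n = snprintf(buf + used, size - used, "%ld %d", start, jobs);
    if (n < 0 || (size_t)n >= size - used)
        return -1;

    return (int)(used + (size_t)n);
}
```

Calling build_stat_reply(buf, sizeof buf, 1336781623L, 1336781343L, 1) with a 64-byte buffer leaves "1336781623\n1336781343 1" in buf. Joining everything into a single sprintf call, as the patch does, avoids the overlap equally well; the offset form is just one alternative that also checks for truncation.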
maui-3.3.1-showstats.patch
_______________________________________________ mauiusers mailing list [email protected] http://www.supercluster.org/mailman/listinfo/mauiusers
