On 06/07/2012 04:37 PM, Lee Garrett wrote:
I'm trying to debug a 100% CPU lockup with aox. I'm experiencing the bug in 3.1.3 and in the master branch of the git repo (up to commit e96c93d).

There is an IMAP folder with 2000+ new messages, my mail client accesses it ( UID fetch 1:* (FLAGS) (CHANGEDSINCE 13466) ), the SQL query takes maybe a sec or two and after that aox "loops" in Allocator::allocate.

359         if ( taken< capacity ) {
(gdb) bt
#0  0x0000000000547a88 in Allocator::allocate (this=0x21f5270, size=24, pointers=1) at core/allocator.cpp:359
#1  0x0000000000547b0e in Allocator::allocate (this=0x232b310, size=24, pointers=1) at core/allocator.cpp:393
#2  0x0000000000547b0e in Allocator::allocate (this=0x2363d60, size=24, pointers=1) at core/allocator.cpp:393
#3  0x0000000000547b0e in Allocator::allocate (this=0x232abb0, size=24, pointers=1) at core/allocator.cpp:393
#4  0x0000000000547b0e in Allocator::allocate (this=0x232b7b0, size=24, pointers=1) at core/allocator.cpp:393
#5  0x0000000000547b0e in Allocator::allocate (this=0x231ab80, size=24, pointers=1) at core/allocator.cpp:393
#6  0x0000000000547b0e in Allocator::allocate (this=0x21f1850, size=24, pointers=1) at core/allocator.cpp:393
#7  0x0000000000547b0e in Allocator::allocate (this=0x23aaca0, size=24, pointers=1) at core/allocator.cpp:393
[...]
Sounds like memory corruption affecting the allocator. Do the 'this' values ever repeat?
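If the backtrace is long, checking for repeats by eye gets tedious. A minimal sketch of a helper that scans saved gdb output for a repeated 'this' value follows; firstRepeatedThis is a hypothetical name, not something in aox, and it assumes each frame prints its pointer as "this=0x..." followed by a comma, space or closing parenthesis.

```cpp
#include <set>
#include <string>

// Hypothetical helper, not part of aox: returns the first 'this' value
// that appears more than once in a gdb backtrace, or "" if none repeats.
std::string firstRepeatedThis( const std::string & backtrace )
{
    const std::string key = "this=";
    std::set<std::string> seen;
    std::string::size_type pos = 0;
    while ( ( pos = backtrace.find( key, pos ) ) != std::string::npos ) {
        pos += key.size();
        // The pointer value ends at the next separator in the frame line.
        std::string::size_type end = backtrace.find_first_of( ", )\n", pos );
        std::string value = backtrace.substr(
            pos, end == std::string::npos ? end : end - pos );
        // insert() reports false in .second if the value was already seen.
        if ( !seen.insert( value ).second )
            return value;
    }
    return "";
}
```

A non-empty result would suggest the chain of allocators loops back on itself; an empty result only says that the frames captured so far are distinct.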
If you change BlockShift in core/allocator.cpp to e.g. 20 or 24, does the stack change? If you build with jam -sOPTIM= install, does the bug go away or change?
17 means "allocate 2^17 bytes from the OS at a time", i.e. 128k. 20 is 1MB and 24 is 16MB.
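The arithmetic behind those numbers, as a small sketch. blockBytes is a hypothetical helper written for this mail, not a function in core/allocator.cpp; it only assumes that the shift value is used as a power of two.

```cpp
#include <cstddef>

// Hypothetical helper, not part of aox: the number of bytes requested
// from the OS per block for a given BlockShift value, i.e. 2^blockShift.
constexpr std::size_t blockBytes( unsigned int blockShift )
{
    return static_cast<std::size_t>( 1 ) << blockShift;
}

// Checked at compile time, so a wrong figure would fail the build.
static_assert( blockBytes( 17 ) == 128 * 1024, "17 is 128k" );
static_assert( blockBytes( 20 ) == 1024 * 1024, "20 is 1MB" );
static_assert( blockBytes( 24 ) == 16 * 1024 * 1024, "24 is 16MB" );
```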
taken and capacity are always equal, so that block gets skipped and AFAICS a new Allocator object gets added to the chain. I don't know enough about aox's memory allocator to understand the gist of the problem. What strikes me as odd is that apparently Allocator::alloc tries to find 4.3 billion 44-byte ranges?
No, that function is alloc( number of bytes, maximum number of pointers ) and the default number of pointers is UINT_MAX, meaning "all of the bytes are pointers". So it's looking for a 44-byte object which may contain pointers anywhere.
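To illustrate where the 4.3 billion comes from: with a 32-bit unsigned int, UINT_MAX is 4294967295, and that is simply what a debugger shows when the second argument is left at its default. The sketch below uses made-up names (Request, describeAlloc, mayHoldPointersAnywhere) to show the default-argument convention described above; it is not the real declaration.

```cpp
#include <climits>

// Hypothetical stand-in for an allocation request: a byte count plus
// the maximum number of pointers the object may contain.
struct Request {
    unsigned int bytes;
    unsigned int pointers;
};

// Mirrors the convention described above: leaving out the second
// argument means "all of the bytes may be pointers".
constexpr Request describeAlloc( unsigned int bytes,
                                 unsigned int pointers = UINT_MAX )
{
    return Request{ bytes, pointers };
}

// True if the request used the "pointers anywhere" default.
constexpr bool mayHoldPointersAnywhere( Request r )
{
    return r.pointers == UINT_MAX;
}
```

So describeAlloc( 44 ) describes one 44-byte object that may hold pointers anywhere, while describeAlloc( 24, 1 ) matches the size=24, pointers=1 calls in the backtrace.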
Arnt
