Here's some background. Apologies if you know this already.

The "secondary V bits table" holds V (definedness) bits for a select few parts of the process's address space: just the parts where bytes are partially defined, that is, neither completely undefined nor completely defined. There are relatively few of these.

The table (secVBitTable) is actually an OSet, essentially an AVL tree which maps guest addresses to the V bits for that address. Because it would be rather wasteful of space to have one tree node for each partially defined byte in the address space, instead each node contains the definedness data for BYTES_PER_SEC_VBIT_NODE (16) bytes at a time. Accordingly the associated OSet key is rounded down to the nearest 16 byte boundary.

Memcheck is bombing in "get_sec_vbits8(Addr a)" because, following consultation of other data structures, it has determined that the byte at "a" is partially defined, so it needs to look up in said table, its exact definedness info. Problem is there is no entry in the table. That means, either:

1. no entry was ever made for "a" (really, for VG_ROUNDDN(a, BYTES_PER_SEC_VBIT_NODE)), or

2. there was an entry, but it has since been deleted, or

3. some other snafu.

Let's chase (1) first: in set_sec_vbits8 I'd add

   VG_(printf)("setting line %p\n", aAligned)

let it run, presumably accumulating a large log file. When it borks, have a look in the log file, to see if the aAligned causing the assertion in get_sec_vbits8 was actually entered in the first place. Yell if that don't make sense.

If that looks OK (iow, there is at least one corresponding log file entry), let's consider 2. Periodically gcSecVBitTable() walks over said AVL tree. If all 16 bytes in a given chunk are completely defined or completely undefined, then the chunk is redundant, and can be deleted from the tree. There are two complications, though:

(a) we don't want to be chucking lines out of the tree too enthusiastically, for performance reasons. So a line has to have no part-defined bytes for MAX_STALE_AGE consecutive checks before it gets dumped.

(b) we can't delete nodes from the tree at the same time we're iterating over it (using VG_(OSet_Next)). So instead, the survivor lines are copied into a new tree (OSet) and the old one is nuked afterwards.
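
In case the shape of all that is easier to see in code, here is a toy model of the scheme described above. It is only a sketch and not the code in mc_main.c: it uses a flat array where Memcheck uses an OSet, it judges staleness from the stored V bits rather than from Memcheck's other data structures, and it fills a new node's other bytes with "fully defined". The names Node, Table, TABLE_CAP, stale_checks, round_down, find_node, set_vbits8, get_vbits8, has_partdefined and gc_table are made up for illustration, and the value 2 for MAX_STALE_AGE is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BYTES_PER_SEC_VBIT_NODE 16  /* from the description above */
#define MAX_STALE_AGE           2   /* assumed value, illustration only */
#define TABLE_CAP               64  /* hypothetical fixed capacity */

/* Sketch convention: V byte 0x00 = fully defined, 0xFF = fully undefined,
   anything else = partially defined. */
typedef struct {
    uintptr_t a;                               /* key: 16-byte-aligned address */
    uint8_t   vbits8[BYTES_PER_SEC_VBIT_NODE]; /* one V byte per guest byte */
    unsigned  stale_checks;  /* consecutive GCs that saw no part-defined byte */
} Node;

/* Hypothetical stand-in for the OSet: a flat array searched linearly. */
typedef struct {
    Node   nodes[TABLE_CAP];
    size_t used;
} Table;

/* Round an address down to the start of its 16-byte line. */
uintptr_t round_down(uintptr_t a)
{
    return a & ~(uintptr_t)(BYTES_PER_SEC_VBIT_NODE - 1);
}

Node *find_node(Table *t, uintptr_t a_aligned)
{
    for (size_t i = 0; i < t->used; i++)
        if (t->nodes[i].a == a_aligned)
            return &t->nodes[i];
    return NULL;
}

/* Record the V bits for one byte, creating its line if needed. */
void set_vbits8(Table *t, uintptr_t a, uint8_t vbits8)
{
    uintptr_t a_aligned = round_down(a);
    Node *n = find_node(t, a_aligned);
    if (n == NULL) {
        assert(t->used < TABLE_CAP);
        n = &t->nodes[t->used++];
        n->a = a_aligned;
        n->stale_checks = 0;
        /* Assumption: the other bytes of a new line start fully defined. */
        for (size_t i = 0; i < BYTES_PER_SEC_VBIT_NODE; i++)
            n->vbits8[i] = 0x00;
    }
    n->vbits8[a - a_aligned] = vbits8;
}

/* Returns the V bits for one byte, or -1 if the line is missing
   (the situation in which Memcheck's assertion fires). */
int get_vbits8(Table *t, uintptr_t a)
{
    uintptr_t a_aligned = round_down(a);
    Node *n = find_node(t, a_aligned);
    return n ? n->vbits8[a - a_aligned] : -1;
}

int has_partdefined(const Node *n)
{
    for (size_t i = 0; i < BYTES_PER_SEC_VBIT_NODE; i++)
        if (n->vbits8[i] != 0x00 && n->vbits8[i] != 0xFF)
            return 1;
    return 0;
}

/* GC pass: copy survivors into a fresh table instead of deleting
   from the old one while iterating over it. */
void gc_table(const Table *old, Table *fresh)
{
    fresh->used = 0;
    for (size_t i = 0; i < old->used; i++) {
        Node n = old->nodes[i];
        n.stale_checks = has_partdefined(&n) ? 0 : n.stale_checks + 1;
        if (n.stale_checks < MAX_STALE_AGE)  /* the "keep" case */
            fresh->nodes[fresh->used++] = n;
    }
}
```

In this model a lookup for 0x6FA9EAC goes to the line keyed 0x6FA9EA0, and get_vbits8 returns -1 either when set_vbits8 was never called for that line (case 1) or when gc_table has dropped it after MAX_STALE_AGE stale passes (case 2).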

So anyway, you see "if ( keep )" at line 918. In the case (!keep), add a printf to show "n->a" of the line being dumped and see if any dumped line matches the missing one causing the assertion failure.

Hmm, on rereading previous messages, all of (2) is irrelevant if you disabled gcSecVBitTable and the problem still exists. So it's either 1. or 3. Can you at least try 1. ?

J

On Thursday 06 December 2007 22:43, Nicholas Nethercote wrote:
> On Thu, 6 Dec 2007, Tom Hughes wrote:
> >>> Memcheck: mc_main.c:957 (get_sec_vbits8): Assertion 'n' failed.
> >>> Memcheck: get_sec_vbits8: no node for address 0x6FA9EA0 (0x6FA9EAC)
> >>
> >> It's a problem with the secondary V bits table in Memcheck. That table
> >> holds the full V bits for all memory bytes that are partially defined.
> >> It's happened a couple of times, but always in situations that are
> >> impossible for me to reproduce. If you are able to reduce it to a small
> >> test, or are able to do any debugging yourself, that would be very
> >> helpful.
> >
> > It is 100% repeatable for me but, interestingly, only on my machine at
> > home. My machine at work doesn't have the same problem. Both are
> > x86_64 machines with two cores and 4Gb of memory and both are running
> > Fedora 8!
> > [...]
> > Any suggestions for the best way to debug it?
>
> The relevant code starts with this line, around line 838:
>
> /* --------------- Secondary V bit table ------------ */
>
> It's a fairly basic data structure, the only notable thing is that we
> periodically garbage collect (GC) it, ie. remove stale nodes. The easy
> first thing to try is to turn off the GC, ie. make gcSecVBitTable() do
> nothing. If that makes the problem go away, then we know that the GC is
> removing nodes it shouldn't.
>
> It might also be useful if you can run with -v. The "memcheck GC" lines
> indicate when GCs are happening.
>
> Nick
>
> -------------------------------------------------------------------------
> SF.Net email is sponsored by:
> Check out the new SourceForge.net Marketplace.
> It's the best place to buy or sell services for
> just about anything Open Source.
> http://sourceforge.net/services/buy/index.php
> _______________________________________________
> Valgrind-developers mailing list
> Valgrind-developers@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/valgrind-developers