Author: sir_richard
Date: Sun Jun 6 03:04:03 2010
New Revision: 47605
URL: http://svn.reactos.org/svn/reactos?rev=47605&view=rev
Log:
[NTOS]: Fix for the the bug that broke ARM3 paged pool (and has been corrupting
ReactOS paged pool behind the scenes for years):
When a KCB (key stuff) is allocated, the key name associated with it
receives an NCB (name stuff). In case this name is already used, a cache
exists, and an existing NCB is grabbed, and its reference count is increased.
When the KCB goes away, its NCB loses a reference. When all references are
gone, the NCB is destroyed. Simple enough.
It turns out that what was currently happening is that an NCB would get
dereferenced to 0, deleted, but still remained attached to a valid KCB
(shouldn't happen). When that KCB went away, the NCB's reference count was
dropped to... -1, and then -2, -3, -4, etc. Remember this is a FREED NCB. In
other words, freed pool, that might now belong to someone else, was getting
"-1" operations on it. So any value stored in that freed pool would get
decremented by one. In ARM3 paged pool, because the allocator keeps a linked
list, what would happen is that the FLINK pointer would be 0xE0F01234 instead
of 0xE1A01234. What happened is that "0xE1A0" was treated as the reference
count of the freed NCB, and it kept getting dereferenced down to 0xE0F0.
Proving this was easy, by adding an ASSERT(Ncb->RefCount >= 1) to the
routine that dereferences NCBs. Obviously, we should not try to dereference an
NCB that has a reference count of 0, because that NCB is now gone. Adding this
ASSERT immediately caught the error, regardless of which pool implementation
was being used, so this was a problem in ReactOS today, right now.
My first thought was that we were taking references to NCBs without
incrementing the reference count. The NCB gets referenced in two places: when
it gets created, and everytime a cached NCB is re-used for a new KCB (all this
in CmpGetNameControlBlock).
After adding some tracing code, I discovered that
CmpGetNameControlBlock would sometimes return an NCB that was cached, but
without referencing it. I did not understand why, since the code says "if
(Found) Ncb->RefCount++".
Further analysis showed that what would happen, on this particular
instance, is that NCB "Foo" was being Found, but NCB "Bar" was returned
instead. Therefore, causing some serious issues: First, NCB Foo was receiving
too many references. Secondly, NCB Bar was not being referenced.
Worse though, it turns out this would happen when "Foo" was the CORRECT
NCB, and "Bar" was an INCORRECT NCB. What do we mean by correct and incorrect?
Well, because NCBs are hashed, it's possible for two NCB hashes to be VERY
SIMILAR, but only ONE OF THOSE NCBs will be the right one -- for example,
HKLM\Software\Hello vs HKLM\Software\Hell.
In our case, when a KCB for "Hello" was searching for the "Hello" NCB,
the "Hello NCB would get a reference, but the "Hell" NCB would be returned. In
other words, whenever a HASH COLLISION happened, the incorrect NCB was
returned, probably messing up registry code in the process. Subsequently, when
the KCB was dereferneced, it was attached to this incorrect, under-referenced
NCB.
Since in ANY hash collision with "Hell", in our example, the "Hell" NCB
would come first, subsequent searches for "Hellmaster", "Hellboy", "Hello
World" would all still return "Hell". Eventually when all these KCBs would go
away, the "Hell" NCB would reach even -18 references.
The simple solution? When the CORRECT NCB is found, STOP SEARCHING! By
adding a simple "break" statement. Otherwise, even after the correct NCB is
found, further, incorrect, collided NCBs are found, and eventually the last one
("Hell", in our example) got returned, and under-referenced, while "Hellmaster"
and "Hellboy" were not returned, but LEAKED REFERENCES.
There you have it folks, MEMORY CORRUPTION (USE-AFTER-FREE), INCORRECT
REGISTRY NAME PARSHING, REFERENCE LEAKS and REFERENCE UNDERRUNS, all due to ONE
missing "break;".
-r
Modified:
trunk/reactos/ntoskrnl/config/cmkcbncb.c
Modified: trunk/reactos/ntoskrnl/config/cmkcbncb.c
URL:
http://svn.reactos.org/svn/reactos/trunk/reactos/ntoskrnl/config/cmkcbncb.c?rev=47605&r1=47604&r2=47605&view=diff
==============================================================================
--- trunk/reactos/ntoskrnl/config/cmkcbncb.c [iso-8859-1] (original)
+++ trunk/reactos/ntoskrnl/config/cmkcbncb.c [iso-8859-1] Sun Jun 6 03:04:03
2010
@@ -234,7 +234,9 @@
if (Found)
{
/* Reference it */
+ ASSERT(Ncb->RefCount != 0xFFFF);
Ncb->RefCount++;
+ break;
}
}
@@ -320,6 +322,7 @@
CmpAcquireNcbLockExclusiveByKey(ConvKey);
/* Decrease the reference count */
+ ASSERT(Ncb->RefCount >= 1);
if (!(--Ncb->RefCount))
{
/* Find the NCB in the table */
@@ -579,7 +582,7 @@
NewRefCount = OldRefCount - 1;
/* Check if we still have references */
- if( (NewRefCount & 0xFFFF) > 0)
+ if ((NewRefCount & 0xFFFF) > 0)
{
/* Do the dereference */
if (InterlockedCompareExchange((PLONG)&Kcb->RefCount,