On (19/08/13 21:05), Lukas Slebodnik wrote: >On (19/08/13 11:20), Simo Sorce wrote: >>On Mon, 2013-08-19 at 14:58 +0200, Lukas Slebodnik wrote: >>> On (19/08/13 07:58), Lukas Slebodnik wrote: >>> >ehlo, >>> > >>> >detail description in attached patches. >>> > >>> >LS >>> >>> >From 295dd1e0966df7bf1fc9a51f84a0daa81a279b5e Mon Sep 17 00:00:00 2001 >>> >From: Lukas Slebodnik <[email protected]> >>> >Date: Mon, 19 Aug 2013 05:39:28 +0200 >>> >Subject: [PATCH 1/2] mmap_cache: Skip records which doesn't have same hash >>> > >>> >Record in data table has two hashes hash1 and hash2. Hash table refers >>> >to the first record with the searched hash key. We do a record chaining >>> >in case of hash collision. >>> > >>> >When we removed record from cache hash_table was automaticaly updated to >>> >the next record from chain. But it is not very likely that two following >>> >records will have the same hashes (hash1 and hash2). Therefore it can >>> >happen, >>> >that some hash table keys refers to records, which both hashes has >>> >different >>> >values like hash key. Such record can be removed from memory cache, >>> >but there will be reference in hash table with different key, which could >>> >not be removed, but it points to removed data or in the worst case >>> >it points in the middle of newly added record. And this was a reason >>> >of crash in nss. >>> > >>> >>> I will try to write example of this behaviour. >>> >>> 1. Adding new record with R1(hash1:111, hash2:222, next:INVALID_VAL) >>> --record R1 will be added to data table to slot with index:0 >>> --hash keys fom record R1 will refer to slot_0 >>> hash_table[111] refers to slot_0 >>> hash_table[222] refers to slot_0 >>> >>> 2. Adding another record R2(hash1:111, hash2:333, next:INVALID_VAL) >>> --record R1 will be added to data table to the free slot with index:3 >>> --there is collision of hash key 111 >>> hash_table[111] will still refers to slot0, but we will add R2 to >>> chain >>> R1->next will refer to slot_3 (R2 is stored ins slot_3) >>> hash_table[333] refers to slot_3 (there was no colision) >>> >>> 3. Removing R1 >>> --remove records from chains using R1->hash{1,2} >>> (sss_mc_rm_rec_from_chain) >>> in this situation hash_table[R1->hash1] and hash_table[R1->hash2] >>> refers >>> directly to R1, It is just a simplification of this example. >>> --set hash_table[R1->hash1] to R1->next >>> hash_table[111] will refer to slot_3 >>> --set hash_table[R1->hash2] to R1->next >>> hash_table[222] will refer to slot_3 >>> >>> Current situation: >>> hash_table[111] refers to slot_3(R2) >>> hash_table[222] refers to slot_3(R2) >>> hash_table[333] refers to slot_3(R2) >> >>Ah yeah it is a mistake that [222] here still holds a reference to R2, >>as R2 has no such hash. >> >>> 4. Removing R2. >>> --remove rerords from chain using R2->hash{1,2} >>> R2->next has value INVALID_VAL (there is not another record after R2) >>> --set hash_table[R2->hash1] to R2->next (INVALID_VAL) >>> hash_table[111] will refer to empty slot >>> --set hash_table[R2->hash2] to R2->next (INVALID_VAL) >>> hash_table[333] will refer to empty slot >>> >>> Current situation: >>> hash_table[222] refers to slot_3 (but R2 was removed). >>> >>> 5. Adding new record R3(hash1: 999, hash2: 888, next:INVALID_VAL) >>> this record is very log (10 slots) and it will be added to >>> the first empty slot. (slot_0). This is just a simplification, because >>> algorithm is different, but it can happen. >>> >>> >>> Current situation: >>> hash_table[888] refers to slot_0(R3) >>> hash_table[999] refers to slot_0(R3) >>> hash_table[222] refers to slot_3 (in the middle of R3). >>> ^^^^^^^^^^^^^^^ >>> possible crash >> >>Indeed, and I am sure this is how one of the crashes where we had stuff >>pointing in the middle of a multi-slot field happened. >> >>Thanks a lot for catching this. >> >>However I am not completely sure the solution works correctly, I suspect >>it may cause orphans. >> >>Let me work out here an example and let's see where it leads. >> >>Assume we have a high collision rate, and the following records are >>added in order (the next lines are { record-id, slot1, slot2 } >> >>R1, 111, 222 >>R2, 222, 333 >>R3, 111, 333 >>R4, 333, 444 >>R5, 222, 333 >> >>The hash chains look like after each insertion step (hash1/slot1 is >>inserted first): >> >>1. >>[111] -> R1 >>[222] -> R1 >>[333] >>[444] >> >>2. >>[111] -> R1 -> R2 >>[222] -> R1 -> R2 >>[333] -> R2 >>[444] >> >>note how having only 1 ->next in the record causes duplication in the >>chains, this is accounted for in the code already, it just makes the >>flow a little bit messier and harder to follow unfortunately. >> >>3. >>[111] -> R1 -> R2 -> R3 >>[222] -> R1 -> R2 -> R3 >>[333] -> R2 -> R3 >>[444] >> >>4. >>[111] -> R1 -> R2 -> R3 -> R4 >>[222] -> R1 -> R2 -> R3 -> R4 >>[333] -> R2 -> R3 -> R4 >>[444] -> R4 >> >>5. >>[111] -> R1 -> R2 -> R3 -> R4 -> R5 >>[222] -> R1 -> R2 -> R3 -> R4 -> R5 >>[333] -> R2 -> R3 -> R4 -> R5 >>[444] -> R4 -> R5 >> >> >>Original table for easier reference in next step: >>R1, 111, 222 >>R2, 222, 333 >>R3, 111, 333 >>R4, 333, 444 >>R5, 222, 333 >> >>Now let's try to remove R2. >>The first step with your patch checks if the hashes of next record (R3) >>matches the current hash chain hash. >>The first hash chain we check when removing R2 is [222], and R3 has >>[111], [333], so it doesn't match, and is skipped. >>The second step checks R4 that has 333, 444, again no match. >>The third step matches R5 as it is also in the 222 chain. >>So what we do is that we set the next record of R1 (which is R2's >>previous record in chain [222] to R5. >> >>After removing R2 from first chain: >>[111] -> R1 -> R5 >>[222] -> R1 -> R5 >>[333] -> R2 -> R3 -> R4 -> R5 >>[444] -> R4 -> R5 >> >>Then we look at the second hash chain of R2 which is [333] >>We check R3, which is again the next record to see if it is a record >>that belongs to [333] by matching R3's hashes (111, 333). It does, so we >>just set [333] to point directly at R3 as R2 was its first element. >> >>This is how finally the chains look: >>[111] -> R1 -> R5 >>[222] -> R1 -> R5 >>[333] -> R3 -> R4 -> R5 >>[444] -> R4 -> R5 >> >>As you can see they are not correct. >>These are the remaining records: >>R1, 111, 222 >>R3, 111, 333 >>R4, 333, 444 >>R5, 222, 333 >> >>R1 is properly linked by both chains. >>R3 is not linked in 111 as it should. >>R4 and R5 are referenced by both their chains. >> >>So R3 now is 'lost' on one chain. >> >>The problem here is that we are doing separate removals, but are not >>considering the shared nature of the 'next' pointer when it comes to the >>affected chains. >> >> >>I think we should instead use the original approach to remove chains, >>but then re-validate them: >> >>Original table for easier reference in next step: >>R1, 111, 222 >>R2, 222, 333 >>R3, 111, 333 >>R4, 333, 444 >>R5, 222, 333 >> >>Original status with all Record chained: >>[111] -> R1 -> R2 -> R3 -> R4 -> R5 >>[222] -> R1 -> R2 -> R3 -> R4 -> R5 >>[333] -> R2 -> R3 -> R4 -> R5 >>[444] -> R4 -> R5 >> >>Let's remove R4 this time (chose R4 because it leaves a reference to R5 >>in the [444] as an orphan with the original method). >> >>Removing R4 (from chains 333 and 444) with the original method leaves >>this table: >> >>[111] -> R1 -> R2 -> R3 -> R5 >>[222] -> R1 -> R2 -> R3 -> R5 >>[333] -> R2 -> R3 -> R5 >>[444] -> R5 >> >>After removal from both the chains R4 referred to we revalidate them (ie >>we only revalidate 333 and 444: >> >>First we revalidate [333], which has -> R2 -> R3 -> R5 as a chain. >>R2 does have 333 as hash so it is valid >>The next step needs to insure that the link R2 -> R5 makes sense. It >>makes sense if any of the previous record hashes are matched. >>So for the second step we need to match [333] plus R2's own hashes {222, >>333}. >>The list to match is therefore {222, 333} and R3 matches it with 333. >>For the next record the list to match is {111, 333} + {222, 333}, or >>{111, 222, 333}, and R5 matches and is the last element. >>The chain is revalidated >> >>The second chain to check is [444], which has only -> R5 >>R5 (222, 333) does not match, so we set [444] to R5->next, and continue. >>Turns out R5 was the last element in the chain so [444] is empty. >> >>Final situation after revalidation: >>[111] -> R1 -> R2 -> R3 -> R5 >>[222] -> R1 -> R2 -> R3 -> R5 >>[333] -> R2 -> R3 -> R5 >>[444] >> >>This algorithm seem to work with the R4 removal case (and your example >>too). >> >>It's a bit annoying to doi the validation for long chains as it requires >>a dynamic array of hash values to test ->next elements against, >>optimizations in there would be a nice to have if we can come up with >>any. >> >>Simo. >> > >MAIN_PROBLEM: We know that main problem is when hash table entry referes >to record, which hashes are different. > (hash_key != rec->hash1 && hash_key != rec->hash2) > >This can only happend if data in hash table are changed. Data in hash table can >be changed in two situation (adding record, removing record) > >A) Adding record to hash table: > 1) hash_table[hash_key] == MC_INVALID VAL > there is no record for hash_key and we can safely refer to new record. > hash_table[hash_key] refers to valid first record. > MAIN_PROBLEM cannot occur in this situation. > 2) hash_table[hash_key] refers to valid object. > adding record to chain. > hash_table[hash_key] refers to valid record. > MAIN_PROBLEM cannot occur in this situation. > >AS we can see, there is no problem with adding record to hash table. > > >B) Removing record from hash table > We need to remove record from chain (function sss_mc_rm_rec_from_chain) > 1) Record is the first in chain and *HAS NOT* successor > (rec->next == MC_INVALID_VAL) > hash_table[hash_key] will be set to MC_INVALID_VAL > this mean that there is no record for hash_key. > MAIN_PROBLEM cannot occur in this situation. > 2) Record is the first in chain and *HAS* successor > hash_table[hash_key] will be set to rec->next > *** MAIN_PROBLEM can occur *** > This will not be problem with attached patch > 3) Record is in the middle of chain. > record will be removed from chain. (nothing else) > hash_table[hash_key] refers to valid record, because > 1st record is untouched. > MAIN_PROBLEM cannot occur in this situation. > 4) Record is at the end of chain. > The situation is the same as in previous point. > MAIN_PROBLEM cannot occur in this situation. > > >We can see that only one operation is problematic (B3). > >Attached patch fixes this problematic situation. Chains will not be changed. >on hash_table[hash_key] will refer to first valid record after removed record. hash_table[hash_key] will refer to first valid record after removed record. or it will refer to MC_INVALID_VAL, there was no record with hash_key in chain. > >Simo, thank you very much for review. I hope this version is correct. > >LS
>From 74b7282102f7d2be8690368f92ced68bd53135f8 Mon Sep 17 00:00:00 2001 >From: Lukas Slebodnik <[email protected]> >Date: Mon, 19 Aug 2013 05:39:28 +0200 >Subject: [PATCH] mmap_cache: Skip records which doesn't have same hash > >Record in data table has two hashes hash1 and hash2. Hash table refers >to the first record with the searched hash key. We do a record chaining >in case of hash collision. > >When we removed record from memory cache, entry in the hash_table was >automaticaly updated to the next record from chain. But it is possible, >that following record will have different hashes (hash1 and hash2). >Therefore it can happen, that some hash table keys refers to records, >which both hashes has different values like hash key. Such record >can be removed from memory cache, but there will be reference in hash table >with different key, which could not be removed, but it points to removed data >or in the worst case it points in the middle of newly added record. >And this was a reason of crash in nss. > >This patch fixes removing fix entry from chain. > >Resolves: >https://fedorahosted.org/sssd/ticket/2018 >--- > src/responder/nss/nsssrv_mmap_cache.c | 23 ++++++++++++++++++++++- > 1 file changed, 22 insertions(+), 1 deletion(-) > >diff --git a/src/responder/nss/nsssrv_mmap_cache.c >b/src/responder/nss/nsssrv_mmap_cache.c >index >cd5a6436e005b4c7f5622eaff2f259de3bbe5d29..aa9d535e404456c18eaf9e0b57557a77500cdbea > 100644 >--- a/src/responder/nss/nsssrv_mmap_cache.c >+++ b/src/responder/nss/nsssrv_mmap_cache.c >@@ -134,6 +134,27 @@ static void sss_mc_add_rec_to_chain(struct sss_mc_ctx >*mcc, > cur->next = MC_PTR_TO_SLOT(mcc->data_table, rec); > } > >+static inline uint32_t >+sss_mc_get_next_slot_with_hash(struct sss_mc_ctx *mcc, >+ struct sss_mc_rec *start_rec, >+ uint32_t hash) >+{ >+ struct sss_mc_rec *rec; >+ uint32_t slot; >+ >+ slot = start_rec->next; >+ while (slot != MC_INVALID_VAL) { >+ rec = MC_SLOT_TO_PTR(mcc->data_table, slot, struct sss_mc_rec); >+ if (rec->hash1 == hash || rec->hash2 == hash) { >+ break; >+ } >+ >+ slot = rec->next; >+ } >+ >+ return slot; >+} >+ > static void sss_mc_rm_rec_from_chain(struct sss_mc_ctx *mcc, > struct sss_mc_rec *rec, > uint32_t hash) >@@ -151,7 +172,7 @@ static void sss_mc_rm_rec_from_chain(struct sss_mc_ctx >*mcc, > slot = mcc->hash_table[hash]; > cur = MC_SLOT_TO_PTR(mcc->data_table, slot, struct sss_mc_rec); > if (cur == rec) { >- mcc->hash_table[hash] = rec->next; >+ mcc->hash_table[hash] = sss_mc_get_next_slot_with_hash(mcc, rec, >hash); > } else { > slot = cur->next; > while (slot != MC_INVALID_VAL) { >-- >1.8.3.1 > >_______________________________________________ >sssd-devel mailing list >[email protected] >https://lists.fedorahosted.org/mailman/listinfo/sssd-devel _______________________________________________ sssd-devel mailing list [email protected] https://lists.fedorahosted.org/mailman/listinfo/sssd-devel
