Re: [SSSD] [PATCHES] mmap_cache: Skip records which doesn't have same hash

Lukas Slebodnik Mon, 19 Aug 2013 12:12:51 -0700

On (19/08/13 21:05), Lukas Slebodnik wrote:
>On (19/08/13 11:20), Simo Sorce wrote:
>>On Mon, 2013-08-19 at 14:58 +0200, Lukas Slebodnik wrote:
>>> On (19/08/13 07:58), Lukas Slebodnik wrote:
>>> >ehlo,
>>> >
>>> >detail description in attached patches.
>>> >
>>> >LS
>>> 
>>> >From 295dd1e0966df7bf1fc9a51f84a0daa81a279b5e Mon Sep 17 00:00:00 2001
>>> >From: Lukas Slebodnik <[email protected]>
>>> >Date: Mon, 19 Aug 2013 05:39:28 +0200
>>> >Subject: [PATCH 1/2] mmap_cache: Skip records which doesn't have same hash
>>> >
>>> >Record in data table has two hashes hash1 and hash2. Hash table refers
>>> >to the first record with the searched hash key. We do a record chaining
>>> >in case of hash collision.
>>> >
>>> >When we removed record from cache hash_table was automaticaly updated to
>>> >the next record from chain. But it is not very likely that two following
>>> >records will have the same hashes (hash1 and hash2). Therefore it can 
>>> >happen,
>>> >that some hash table keys refers to records, which both hashes has 
>>> >different
>>> >values like hash key. Such record can be removed from memory cache,
>>> >but there will be reference in hash table with different key, which could
>>> >not be removed, but it points to removed data or in the worst case
>>> >it points in the middle of newly added record. And this was a reason
>>> >of crash in nss.
>>> >
>>> 
>>> I will try to write example of this behaviour.
>>> 
>>> 1. Adding new record with R1(hash1:111, hash2:222, next:INVALID_VAL)
>>>    --record R1 will be added to data table to slot with index:0
>>>    --hash keys fom record R1 will refer to slot_0
>>>          hash_table[111] refers to slot_0
>>>          hash_table[222] refers to slot_0
>>> 
>>> 2. Adding another record R2(hash1:111, hash2:333, next:INVALID_VAL)
>>>    --record R1 will be added to data table to the free slot with index:3
>>>    --there is collision of hash key 111
>>>          hash_table[111] will still refers to slot0, but we will add R2 to 
>>> chain
>>>              R1->next will refer to slot_3 (R2 is stored ins slot_3)
>>>          hash_table[333] refers to slot_3 (there was no colision)
>>> 
>>> 3. Removing R1
>>>    --remove records from chains using R1->hash{1,2} 
>>> (sss_mc_rm_rec_from_chain)
>>>      in this situation hash_table[R1->hash1] and hash_table[R1->hash2] 
>>> refers
>>>      directly to R1, It is just a simplification of this example.
>>>         --set hash_table[R1->hash1] to R1->next
>>>               hash_table[111] will refer to slot_3
>>>         --set hash_table[R1->hash2] to R1->next
>>>               hash_table[222] will refer to slot_3
>>> 
>>> Current situation:
>>>     hash_table[111] refers to slot_3(R2)
>>>     hash_table[222] refers to slot_3(R2)
>>>     hash_table[333] refers to slot_3(R2)
>>
>>Ah yeah it is a mistake that [222] here still holds a reference to R2,
>>as R2 has no such hash.
>>
>>> 4. Removing R2.
>>>    --remove rerords from chain using R2->hash{1,2}
>>>      R2->next has value INVALID_VAL (there is not another record after R2)
>>>         --set hash_table[R2->hash1] to R2->next (INVALID_VAL)
>>>               hash_table[111] will refer to empty slot
>>>         --set hash_table[R2->hash2] to R2->next (INVALID_VAL)
>>>               hash_table[333] will refer to empty slot
>>> 
>>> Current situation:
>>>     hash_table[222] refers to slot_3 (but R2 was removed).
>>> 
>>> 5. Adding new record R3(hash1: 999, hash2: 888, next:INVALID_VAL)
>>>       this record is very log (10 slots) and it will be added to
>>>       the first empty slot. (slot_0). This is just a simplification, because
>>>       algorithm is different, but it can happen.
>>> 
>>> 
>>> Current situation:
>>>     hash_table[888] refers to slot_0(R3)
>>>     hash_table[999] refers to slot_0(R3)
>>>     hash_table[222] refers to slot_3 (in the middle of R3).
>>>                                        ^^^^^^^^^^^^^^^
>>>                                        possible crash
>>
>>Indeed, and I am sure this is how one of the crashes where we had stuff
>>pointing in the middle of a multi-slot field happened.
>>
>>Thanks a lot for catching this.
>>
>>However I am not completely sure the solution works correctly, I suspect
>>it may cause orphans.
>>
>>Let me work out here an example and let's see where it leads.
>>
>>Assume we have a high collision rate, and the following records are
>>added in order (the next lines are { record-id, slot1, slot2 }
>>
>>R1, 111, 222
>>R2, 222, 333
>>R3, 111, 333
>>R4, 333, 444
>>R5, 222, 333
>>
>>The hash chains look like after each insertion step (hash1/slot1 is
>>inserted first):
>>
>>1.
>>[111] -> R1
>>[222] -> R1
>>[333]
>>[444]
>>
>>2.
>>[111] -> R1 -> R2
>>[222] -> R1 -> R2
>>[333] -> R2
>>[444]
>>
>>note how having only 1 ->next in the record causes duplication in the
>>chains, this is accounted for in the code already, it just makes the
>>flow a little bit messier and harder to follow unfortunately.
>>
>>3.
>>[111] -> R1 -> R2 -> R3
>>[222] -> R1 -> R2 -> R3
>>[333] -> R2 -> R3
>>[444]
>>
>>4.
>>[111] -> R1 -> R2 -> R3 -> R4
>>[222] -> R1 -> R2 -> R3 -> R4
>>[333] -> R2 -> R3 -> R4
>>[444] -> R4
>>
>>5.
>>[111] -> R1 -> R2 -> R3 -> R4 -> R5
>>[222] -> R1 -> R2 -> R3 -> R4 -> R5
>>[333] -> R2 -> R3 -> R4 -> R5
>>[444] -> R4 -> R5
>>
>>
>>Original table for easier reference in next step:
>>R1, 111, 222
>>R2, 222, 333
>>R3, 111, 333
>>R4, 333, 444
>>R5, 222, 333
>>
>>Now let's try to remove R2.
>>The first step with your patch checks if the hashes of next record (R3)
>>matches the current hash chain hash.
>>The first hash chain we check when removing R2 is [222], and R3 has
>>[111], [333], so it doesn't match, and is skipped.
>>The second step checks R4 that has 333, 444, again no match.
>>The third step matches R5 as it is also in the 222 chain.
>>So what we do is that we set the next record of R1 (which  is R2's
>>previous record in chain [222] to R5.
>>
>>After removing R2 from first chain:
>>[111] -> R1 -> R5
>>[222] -> R1 -> R5
>>[333] -> R2 -> R3 -> R4 -> R5
>>[444] -> R4 -> R5
>>
>>Then we look at the second hash chain of R2 which is [333]
>>We check R3, which is again the next record to see if it is a record
>>that belongs to [333] by matching R3's hashes (111, 333). It does, so we
>>just set [333] to point directly at R3 as R2 was its first element.
>>
>>This is how finally the chains look:
>>[111] -> R1 -> R5
>>[222] -> R1 -> R5
>>[333] -> R3 -> R4 -> R5
>>[444] -> R4 -> R5
>>
>>As you can see they are not correct.
>>These are the remaining records:
>>R1, 111, 222
>>R3, 111, 333
>>R4, 333, 444
>>R5, 222, 333
>>
>>R1 is properly linked by both chains.
>>R3 is not linked in 111 as it should.
>>R4 and R5 are referenced by both their chains.
>>
>>So R3 now is 'lost' on one chain.
>>
>>The problem here is that we are doing separate removals, but are not
>>considering the shared nature of the 'next' pointer when it comes to the
>>affected chains.
>>
>>
>>I think we should instead use the original approach to remove chains,
>>but then re-validate them:
>>
>>Original table for easier reference in next step:
>>R1, 111, 222
>>R2, 222, 333
>>R3, 111, 333
>>R4, 333, 444
>>R5, 222, 333
>>
>>Original status with all Record chained:
>>[111] -> R1 -> R2 -> R3 -> R4 -> R5
>>[222] -> R1 -> R2 -> R3 -> R4 -> R5
>>[333] -> R2 -> R3 -> R4 -> R5
>>[444] -> R4 -> R5
>>
>>Let's remove R4 this time (chose R4 because it leaves a reference to R5
>>in the [444] as an orphan with the original method).
>>
>>Removing R4 (from chains 333 and 444) with the original method leaves
>>this table:
>>
>>[111] -> R1 -> R2 -> R3 -> R5
>>[222] -> R1 -> R2 -> R3 -> R5
>>[333] -> R2 -> R3 -> R5
>>[444] -> R5
>>
>>After removal from both the chains R4 referred to we revalidate them (ie
>>we only revalidate 333 and 444:
>>
>>First we revalidate [333], which has -> R2 -> R3 -> R5 as a chain.
>>R2 does have 333 as hash so it is valid
>>The next step needs to insure that the link R2 -> R5 makes sense. It
>>makes sense if any of the previous record hashes are matched.
>>So for the second step we need to match [333] plus R2's own hashes {222,
>>333}.
>>The list to match is therefore {222, 333} and R3 matches it with 333.
>>For the next record the list to match is {111, 333} + {222, 333}, or
>>{111, 222, 333}, and R5 matches and is the last element.
>>The chain is revalidated
>>
>>The second chain to check is [444], which has only -> R5
>>R5 (222, 333) does not match, so we set [444] to R5->next, and continue.
>>Turns out R5 was the last element in the chain so [444] is empty.
>>
>>Final situation after revalidation:
>>[111] -> R1 -> R2 -> R3 -> R5
>>[222] -> R1 -> R2 -> R3 -> R5
>>[333] -> R2 -> R3 -> R5
>>[444]
>>
>>This algorithm seem to work with the R4 removal case (and your example
>>too).
>>
>>It's a bit annoying to doi the validation for long chains as it requires
>>a dynamic array of hash values to test ->next elements against,
>>optimizations in there would be a nice to have if we can come up with
>>any.
>>
>>Simo.
>>
>
>MAIN_PROBLEM: We know that main problem is when hash table entry referes
>to record, which hashes are different.
>   (hash_key != rec->hash1 && hash_key != rec->hash2)
>
>This can only happend if data in hash table are changed. Data in hash table can
>be changed in two situation (adding record, removing record)
>
>A) Adding record to hash table:
>    1) hash_table[hash_key] == MC_INVALID VAL
>       there is no record for hash_key and we can safely refer to new record.
>       hash_table[hash_key] refers to valid first record.
>       MAIN_PROBLEM cannot occur in this situation.
>    2) hash_table[hash_key] refers to valid object.
>       adding record to chain.
>       hash_table[hash_key] refers to valid record.
>       MAIN_PROBLEM cannot occur in this situation.
>
>AS we can see, there is no problem with adding record to hash table.
>
>
>B) Removing record from hash table
>    We need to remove record from chain (function sss_mc_rm_rec_from_chain)
>    1) Record is the first in chain and *HAS NOT* successor
>              (rec->next == MC_INVALID_VAL)
>       hash_table[hash_key] will be set to MC_INVALID_VAL
>       this mean that there is no record for hash_key.
>       MAIN_PROBLEM cannot occur in this situation.
>    2) Record is the first in chain and *HAS* successor
>       hash_table[hash_key] will be set to rec->next
>       *** MAIN_PROBLEM can occur ***
>       This will not be problem with attached patch
>    3) Record is in the middle of chain.
>       record will be removed from chain. (nothing else)
>       hash_table[hash_key] refers to valid record, because
>           1st record is untouched.
>       MAIN_PROBLEM cannot occur in this situation.
>    4) Record is at the end of chain.
>       The situation is the same as in previous point.
>       MAIN_PROBLEM cannot occur in this situation.
>
>
>We can see that only one operation is problematic (B3).
>
>Attached patch fixes this problematic situation. Chains will not be changed.
>on hash_table[hash_key] will refer to first valid record after removed record.
    hash_table[hash_key] will refer to first valid record after removed record.
    or it will refer to MC_INVALID_VAL, there was no record with hash_key in
    chain.
>
>Simo, thank you very much for review. I hope this version is correct.
>
>LS


>From 74b7282102f7d2be8690368f92ced68bd53135f8 Mon Sep 17 00:00:00 2001
>From: Lukas Slebodnik <[email protected]>
>Date: Mon, 19 Aug 2013 05:39:28 +0200
>Subject: [PATCH] mmap_cache: Skip records which doesn't have same hash
>
>Record in data table has two hashes hash1 and hash2. Hash table refers
>to the first record with the searched hash key. We do a record chaining
>in case of hash collision.
>
>When we removed record from memory cache, entry in the hash_table was
>automaticaly updated to the next record from chain. But it is possible,
>that following record will have different hashes (hash1 and hash2).
>Therefore it can happen, that some hash table keys refers to records,
>which both hashes has different values like hash key. Such record
>can be removed from memory cache, but there will be reference in hash table
>with different key, which could not be removed, but it points to removed data
>or in the worst case it points in the middle of newly added record.
>And this was a reason of crash in nss.
>
>This patch fixes removing fix entry from chain.
>
>Resolves:
>https://fedorahosted.org/sssd/ticket/2018
>---
> src/responder/nss/nsssrv_mmap_cache.c | 23 ++++++++++++++++++++++-
> 1 file changed, 22 insertions(+), 1 deletion(-)
>
>diff --git a/src/responder/nss/nsssrv_mmap_cache.c 
>b/src/responder/nss/nsssrv_mmap_cache.c
>index 
>cd5a6436e005b4c7f5622eaff2f259de3bbe5d29..aa9d535e404456c18eaf9e0b57557a77500cdbea
> 100644
>--- a/src/responder/nss/nsssrv_mmap_cache.c
>+++ b/src/responder/nss/nsssrv_mmap_cache.c
>@@ -134,6 +134,27 @@ static void sss_mc_add_rec_to_chain(struct sss_mc_ctx 
>*mcc,
>     cur->next = MC_PTR_TO_SLOT(mcc->data_table, rec);
> }
> 
>+static inline uint32_t
>+sss_mc_get_next_slot_with_hash(struct sss_mc_ctx *mcc,
>+                               struct sss_mc_rec *start_rec,
>+                               uint32_t hash)
>+{
>+    struct sss_mc_rec *rec;
>+    uint32_t slot;
>+
>+    slot = start_rec->next;
>+    while (slot != MC_INVALID_VAL) {
>+        rec = MC_SLOT_TO_PTR(mcc->data_table, slot, struct sss_mc_rec);
>+        if (rec->hash1 == hash || rec->hash2 == hash) {
>+            break;
>+        }
>+
>+        slot = rec->next;
>+    }
>+
>+    return slot;
>+}
>+
> static void sss_mc_rm_rec_from_chain(struct sss_mc_ctx *mcc,
>                                      struct sss_mc_rec *rec,
>                                      uint32_t hash)
>@@ -151,7 +172,7 @@ static void sss_mc_rm_rec_from_chain(struct sss_mc_ctx 
>*mcc,
>     slot = mcc->hash_table[hash];
>     cur = MC_SLOT_TO_PTR(mcc->data_table, slot, struct sss_mc_rec);
>     if (cur == rec) {
>-        mcc->hash_table[hash] = rec->next;
>+        mcc->hash_table[hash] = sss_mc_get_next_slot_with_hash(mcc, rec, 
>hash);
>     } else {
>         slot = cur->next;
>         while (slot != MC_INVALID_VAL) {
>-- 
>1.8.3.1
>

>_______________________________________________
>sssd-devel mailing list
>[email protected]
>https://lists.fedorahosted.org/mailman/listinfo/sssd-devel

_______________________________________________
sssd-devel mailing list
[email protected]
https://lists.fedorahosted.org/mailman/listinfo/sssd-devel

Re: [SSSD] [PATCHES] mmap_cache: Skip records which doesn't have same hash

Reply via email to