On (20/08/13 21:50), Simo Sorce wrote: >On Tue, 2013-08-20 at 10:32 +0200, Lukas Slebodnik wrote: >> On (19/08/13 16:49), Simo Sorce wrote: >> >On Mon, 2013-08-19 at 21:05 +0200, Lukas Slebodnik wrote: >> >> MAIN_PROBLEM: We know that main problem is when hash table entry referes >> >> to record, which hashes are different. >> >> (hash_key != rec->hash1 && hash_key != rec->hash2) >> >> >> >> This can only happend if data in hash table are changed. Data in hash >> >> table can >> >> be changed in two situation (adding record, removing record) >> >> >> >> A) Adding record to hash table: >> >> 1) hash_table[hash_key] == MC_INVALID VAL >> >> there is no record for hash_key and we can safely refer to new >> >> record. >> >> hash_table[hash_key] refers to valid first record. >> >> MAIN_PROBLEM cannot occur in this situation. >> >> 2) hash_table[hash_key] refers to valid object. >> >> adding record to chain. >> >> hash_table[hash_key] refers to valid record. >> >> MAIN_PROBLEM cannot occur in this situation. >> >> >> >> AS we can see, there is no problem with adding record to hash table. >> >> >> >> >> >> B) Removing record from hash table >> >> We need to remove record from chain (function >> >> sss_mc_rm_rec_from_chain) >> >> 1) Record is the first in chain and *HAS NOT* successor >> >> (rec->next == MC_INVALID_VAL) >> >> hash_table[hash_key] will be set to MC_INVALID_VAL >> >> this mean that there is no record for hash_key. >> >> MAIN_PROBLEM cannot occur in this situation. >> >> 2) Record is the first in chain and *HAS* successor >> >> hash_table[hash_key] will be set to rec->next >> >> *** MAIN_PROBLEM can occur *** >> >> This will not be problem with attached patch >> >> 3) Record is in the middle of chain. >> >> record will be removed from chain. (nothing else) >> >> hash_table[hash_key] refers to valid record, because >> >> 1st record is untouched. >> >> MAIN_PROBLEM cannot occur in this situation. >> >> 4) Record is at the end of chain. >> >> The situation is the same as in previous point. >> >> MAIN_PROBLEM cannot occur in this situation. >> >> >> >> >> >> We can see that only one operation is problematic (B3). >> >> >> >> Attached patch fixes this problematic situation. Chains will not be >> >> changed. >> >> on hash_table[hash_key] will refer to first valid record after removed >> >> record. >> >> >> >> Simo, thank you very much for review. I hope this version is correct. >> > >> >After quite some thinking I couldn't find a way to break this, indeed I >> >guess the only case where a record can remain orphaned and accessible on >> >a chain is if it is the second record on the chain but brought there >> >only because it is a next record on another chain, and the first record >> >on that chain is removed. >> > >> >So this new patch seem to address the problem correctly, however I would >> >like to see comments before bot the critical assignments in the code >> >that points out that the difference in treatment is intentional (so that >> >if someone needs to modify the code at a later time he will not 'fix' >> >the discrepancy bymistake). >> done. >> >> > >> >Also I think we need to rewrite the comment of the commit, I still do >> >not think it is clear enough to convey what the actual issue is. >> > >> >What about this: >> > >> >"The code uses 2 hashes for each record, but only one hash table to >> >index them both, furthermore each record has only one single 'next' >> >pointer. >> > >> >This means that in certain conditions a record main end up being on a >> >hash chain even though its hashes do not match the hash chain. This can >> >happen when another record 'drags' it in from another hash chain where >> >they both belong. >> > >> >If the record without matching hashes happens to be the second of the >> >chain and the first record is removed, then the non matching record is >> >left on the wrong chain. On removal of the non-matching record the hash >> >chain will not be updated and the hash chain will end up pointing to an >> >invalid slot. >> >This slot may be later reused for another record and may not be the >> >first slot of this new record. In this case the hash chain will point to >> >arbitrary data and may cause issues if the slot is interpreted as the >> >head of a record. >> > >> >By skipping any block that has no matching hashes upon removing the >> >first record in a chain we insure that dangling references cannot be >> >left in the hash table" >> >> I am highly influenced by debugging this, so my point of view is not >> objective. >> I reused your proposal without any change. >> >> I also changed resolves section to https://fedorahosted.org/sssd/ticket/2049, >> because I used reproducers from BZ997406 and reporter used sssd from master >> (including Michal's patch). >> >> Functionality of 1st patch is not changed. >> >> Thank you very much for review. > >Looks good to me! > >Simo. >
Reproducer for this bug: https://bugzilla.redhat.com/show_bug.cgi?id=997406#c9 While I was investigating the root of the problem I wrote two helper functions, because it was impossible to find out solution from coredump. Those functions check each entry in hash table, whether hash_key refers to valid slot. Functions are called after adding new record and after removing record. I think patches can be used as a base for unit tests. Patches are really ugly. Macro EXCEPTION causes SEGFAULT :-) But it was an aim, because I debug optimized code at the beginning. I had an idea about optimized code, but luckily it was not a true. LS
>From af11020b18f82458527b92c0280619d75881410b Mon Sep 17 00:00:00 2001 From: Lukas Slebodnik <[email protected]> Date: Sat, 17 Aug 2013 09:13:34 +0200 Subject: [PATCH 1/2] check hash table --- src/responder/nss/nsssrv_mmap_cache.c | 72 +++++++++++++++++++++++++++++++++++ 1 file changed, 72 insertions(+) diff --git a/src/responder/nss/nsssrv_mmap_cache.c b/src/responder/nss/nsssrv_mmap_cache.c index c34997b8034d05796687ff7d380b367d5c7bba06..4afee22486cd45fd9a10d86b0eedc85d677ab5ee 100644 --- a/src/responder/nss/nsssrv_mmap_cache.c +++ b/src/responder/nss/nsssrv_mmap_cache.c @@ -161,6 +161,68 @@ static uint32_t sss_mc_hash(struct sss_mc_ctx *mcc, return murmurhash3(key, len, mcc->seed) % MC_HT_ELEMS(mcc->ht_size); } +static char sync_var=0; + +#ifdef DEBUG_MCC +static char seg_var=0; +#define EXCEPTION do {\ + seg_var+=1; \ + return seg_var; \ +}while(0) + +#else /* !DEBUG_MCC */ + #define EXCEPTION do {\ + char *ptr =NULL;\ + char b =0;\ + return b + ptr[0];\ + }while(0) +#endif /* DEBUG_MCC */ + +char sss_check_hash_table(struct sss_mc_ctx *mcc) +{ + volatile uint32_t i; + + const uint32_t ht_elems = MC_HT_ELEMS(mcc->ht_size); + uint32_t *hash_table = mcc->hash_table; + + volatile uint32_t slot; + + const uint32_t dt_elems = mcc->dt_size / MC_SLOT_SIZE; + + struct sss_mc_rec * volatile rec; + for (i=0; i<ht_elems; i++) { + slot = hash_table[i]; + if (slot == MC_INVALID_VAL32) { + continue; + } + + if (slot >= dt_elems) { + EXCEPTION; + } + + rec = MC_SLOT_TO_PTR(mcc->data_table, slot, struct sss_mc_rec); + if ((rec->b1 == MC_INVALID_VAL) || + (rec->b1 != rec->b2)) { + EXCEPTION; + } + + if (!MC_CHECK_RECORD_LENGTH(mcc, rec)) { + EXCEPTION; + } + + if (rec->hash1 >= ht_elems) { + EXCEPTION; + } + + if (rec->hash2 >= ht_elems) { + EXCEPTION; + } + } + + /* all tests passed */ + return true; +} + static void sss_mc_add_rec_to_chain(struct sss_mc_ctx *mcc, struct sss_mc_rec *rec, uint32_t hash) @@ -278,9 +340,15 @@ static void sss_mc_free_slots(struct sss_mc_ctx *mcc, struct sss_mc_rec *rec) } } +static uint8_t backup_memory[6806312+1]; + static void sss_mc_invalidate_rec(struct sss_mc_ctx *mcc, struct sss_mc_rec *rec) { + static struct sss_mc_rec backup; + memcpy(&backup, rec, sizeof(backup)); + memcpy(backup_memory, mcc->mmap_base, mcc->mmap_size); + if (rec->b1 == MC_INVALID_VAL) { /* record already invalid */ return; @@ -305,6 +373,9 @@ static void sss_mc_invalidate_rec(struct sss_mc_ctx *mcc, rec->hash1 = MC_INVALID_VAL32; rec->hash2 = MC_INVALID_VAL32; MC_LOWER_BARRIER(rec); + + sync_var += sss_check_hash_table(mcc); + (void) backup; /* unused, debug purpose */ } static bool sss_mc_is_valid_rec(struct sss_mc_ctx *mcc, struct sss_mc_rec *rec) @@ -733,6 +804,7 @@ errno_t sss_mmap_cache_pw_store(struct sss_mc_ctx **_mcc, /* finally chain the rec in the hash table */ sss_mmap_chain_in_rec(mcc, rec); + sync_var += sss_check_hash_table(mcc); return EOK; } -- 1.8.3.1
>From 6c3eaa2da526f5566b0896c408549546d7833129 Mon Sep 17 00:00:00 2001 From: Lukas Slebodnik <[email protected]> Date: Sat, 17 Aug 2013 13:32:39 +0200 Subject: [PATCH 2/2] Check if hash key refers to right slot --- src/responder/nss/nsssrv_mmap_cache.c | 43 +++++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) diff --git a/src/responder/nss/nsssrv_mmap_cache.c b/src/responder/nss/nsssrv_mmap_cache.c index 4afee22486cd45fd9a10d86b0eedc85d677ab5ee..c88d0d8ef6c7aa49632a28479875f630dd03b556 100644 --- a/src/responder/nss/nsssrv_mmap_cache.c +++ b/src/responder/nss/nsssrv_mmap_cache.c @@ -223,6 +223,47 @@ char sss_check_hash_table(struct sss_mc_ctx *mcc) return true; } +char sss_check_hash_table2(struct sss_mc_ctx *mcc) +{ + volatile uint32_t i; + + const uint32_t ht_elems = MC_HT_ELEMS(mcc->ht_size); + uint32_t *hash_table = mcc->hash_table; + + volatile uint32_t slot; + + const uint32_t dt_elems = mcc->dt_size / MC_SLOT_SIZE; + + struct sss_mc_rec * volatile rec; + for (i=0; i<ht_elems; i++) { + slot = hash_table[i]; + if (slot == MC_INVALID_VAL32) { + continue; + } + + if (slot >= dt_elems) { + EXCEPTION; + } + + rec = MC_SLOT_TO_PTR(mcc->data_table, slot, struct sss_mc_rec); + if ((rec->b1 == MC_INVALID_VAL) || + (rec->b1 != rec->b2)) { + EXCEPTION; + } + + if (!MC_CHECK_RECORD_LENGTH(mcc, rec)) { + EXCEPTION; + } + + if (rec->hash1 != i && rec->hash2 != i ) { + EXCEPTION; + } + } + + /* all tests passed */ + return true; +} + static void sss_mc_add_rec_to_chain(struct sss_mc_ctx *mcc, struct sss_mc_rec *rec, uint32_t hash) @@ -375,6 +416,7 @@ static void sss_mc_invalidate_rec(struct sss_mc_ctx *mcc, MC_LOWER_BARRIER(rec); sync_var += sss_check_hash_table(mcc); + sync_var += sss_check_hash_table2(mcc); (void) backup; /* unused, debug purpose */ } @@ -805,6 +847,7 @@ errno_t sss_mmap_cache_pw_store(struct sss_mc_ctx **_mcc, /* finally chain the rec in the hash table */ sss_mmap_chain_in_rec(mcc, rec); sync_var += sss_check_hash_table(mcc); + sync_var += sss_check_hash_table2(mcc); return EOK; } -- 1.8.3.1
_______________________________________________ sssd-devel mailing list [email protected] https://lists.fedorahosted.org/mailman/listinfo/sssd-devel
