On Wed, Aug 21, 2013 at 05:56:48PM +0200, Lukas Slebodnik wrote: > On (20/08/13 21:50), Simo Sorce wrote: > >On Tue, 2013-08-20 at 10:32 +0200, Lukas Slebodnik wrote: > >> On (19/08/13 16:49), Simo Sorce wrote: > >> >On Mon, 2013-08-19 at 21:05 +0200, Lukas Slebodnik wrote: > >> >> MAIN_PROBLEM: We know that main problem is when hash table entry referes > >> >> to record, which hashes are different. > >> >> (hash_key != rec->hash1 && hash_key != rec->hash2) > >> >> > >> >> This can only happend if data in hash table are changed. Data in hash > >> >> table can > >> >> be changed in two situation (adding record, removing record) > >> >> > >> >> A) Adding record to hash table: > >> >> 1) hash_table[hash_key] == MC_INVALID VAL > >> >> there is no record for hash_key and we can safely refer to new > >> >> record. > >> >> hash_table[hash_key] refers to valid first record. > >> >> MAIN_PROBLEM cannot occur in this situation. > >> >> 2) hash_table[hash_key] refers to valid object. > >> >> adding record to chain. > >> >> hash_table[hash_key] refers to valid record. > >> >> MAIN_PROBLEM cannot occur in this situation. > >> >> > >> >> AS we can see, there is no problem with adding record to hash table. > >> >> > >> >> > >> >> B) Removing record from hash table > >> >> We need to remove record from chain (function > >> >> sss_mc_rm_rec_from_chain) > >> >> 1) Record is the first in chain and *HAS NOT* successor > >> >> (rec->next == MC_INVALID_VAL) > >> >> hash_table[hash_key] will be set to MC_INVALID_VAL > >> >> this mean that there is no record for hash_key. > >> >> MAIN_PROBLEM cannot occur in this situation. > >> >> 2) Record is the first in chain and *HAS* successor > >> >> hash_table[hash_key] will be set to rec->next > >> >> *** MAIN_PROBLEM can occur *** > >> >> This will not be problem with attached patch > >> >> 3) Record is in the middle of chain. > >> >> record will be removed from chain. (nothing else) > >> >> hash_table[hash_key] refers to valid record, because > >> >> 1st record is untouched. > >> >> MAIN_PROBLEM cannot occur in this situation. > >> >> 4) Record is at the end of chain. > >> >> The situation is the same as in previous point. > >> >> MAIN_PROBLEM cannot occur in this situation. > >> >> > >> >> > >> >> We can see that only one operation is problematic (B3). > >> >> > >> >> Attached patch fixes this problematic situation. Chains will not be > >> >> changed. > >> >> on hash_table[hash_key] will refer to first valid record after removed > >> >> record. > >> >> > >> >> Simo, thank you very much for review. I hope this version is correct. > >> > > >> >After quite some thinking I couldn't find a way to break this, indeed I > >> >guess the only case where a record can remain orphaned and accessible on > >> >a chain is if it is the second record on the chain but brought there > >> >only because it is a next record on another chain, and the first record > >> >on that chain is removed. > >> > > >> >So this new patch seem to address the problem correctly, however I would > >> >like to see comments before bot the critical assignments in the code > >> >that points out that the difference in treatment is intentional (so that > >> >if someone needs to modify the code at a later time he will not 'fix' > >> >the discrepancy bymistake). > >> done. > >> > >> > > >> >Also I think we need to rewrite the comment of the commit, I still do > >> >not think it is clear enough to convey what the actual issue is. > >> > > >> >What about this: > >> > > >> >"The code uses 2 hashes for each record, but only one hash table to > >> >index them both, furthermore each record has only one single 'next' > >> >pointer. > >> > > >> >This means that in certain conditions a record main end up being on a > >> >hash chain even though its hashes do not match the hash chain. This can > >> >happen when another record 'drags' it in from another hash chain where > >> >they both belong. > >> > > >> >If the record without matching hashes happens to be the second of the > >> >chain and the first record is removed, then the non matching record is > >> >left on the wrong chain. On removal of the non-matching record the hash > >> >chain will not be updated and the hash chain will end up pointing to an > >> >invalid slot. > >> >This slot may be later reused for another record and may not be the > >> >first slot of this new record. In this case the hash chain will point to > >> >arbitrary data and may cause issues if the slot is interpreted as the > >> >head of a record. > >> > > >> >By skipping any block that has no matching hashes upon removing the > >> >first record in a chain we insure that dangling references cannot be > >> >left in the hash table" > >> > >> I am highly influenced by debugging this, so my point of view is not > >> objective. > >> I reused your proposal without any change. > >> > >> I also changed resolves section to > >> https://fedorahosted.org/sssd/ticket/2049, > >> because I used reproducers from BZ997406 and reporter used sssd from master > >> (including Michal's patch). > >> > >> Functionality of 1st patch is not changed. > >> > >> Thank you very much for review. > > > >Looks good to me! > > > >Simo. > > > > Reproducer for this bug: https://bugzilla.redhat.com/show_bug.cgi?id=997406#c9 > > While I was investigating the root of the problem I wrote two helper > functions, > because it was impossible to find out solution from coredump. > Those functions check each entry in hash table, whether hash_key refers to > valid slot. Functions are called after adding new record and after removing > record. I think patches can be used as a base for unit tests. > > Patches are really ugly. Macro EXCEPTION causes SEGFAULT :-) > But it was an aim, because I debug optimized code at the beginning. > I had an idea about optimized code, but luckily it was not a true. > > LS
Nice work, we should really turn these into unit tests: https://fedorahosted.org/sssd/ticket/2055 _______________________________________________ sssd-devel mailing list [email protected] https://lists.fedorahosted.org/mailman/listinfo/sssd-devel
