On 10/27/2011 03:59 PM, Mathieu Desnoyers wrote:
> * Lai Jiangshan ([email protected]) wrote:
>>
>> Is it a bug?
>>
>> It depends.
>>
>> The current code (a combination of add_replace and add_unique)
>> never generates duplicate keys, but it only guarantees snapshot
>> semantics: if we take a snapshot of the hash table, we can observe
>> that there are no duplicate keys in the snapshot.
>>
>> But this definition does not work in practice: we don't take a
>> snapshot before observing, we actually observe the nodes one by one.
>> Example:
>>
>> 	cds_lfht_lookup(&iter);
>> 	while (!should_end(&iter)) {
>> 		do_something_with(&iter);
>> 		cds_lfht_next(&iter);
>> 	}
>>
>> With the old code, we may do_something_with()/observe 2 duplicate
>> nodes in this loop!
>>
>> Here is an identical-hash-value node chain with no duplicate keys:
>>
>> 	-->[p1]-->[p2]-->[p3]-->[p4]-->
>>
>> Now thread 1 is the observing thread: it is traversing the chain and
>> doing something with the nodes. Thread 2 deletes [p1], and thread 3
>> add_unique's [p1'] with the same key as [p1]:
>>
>> thread 1                    thread 2       thread 3
>> ---------------------------------------------------
>> do something with [p1]
>> do something with [p2]
>>                             delete [p1]
>>                                            uniquely add [p1']
>> (-->[p2]-->[p3]-->[p4]-->[p1']-->)
>> do something with [p3]
>> do something with [p4]
>> do something with [p1']
>> ---------------------------------------------------
>>
>> Now thread 1 unexpectedly handles the same key twice.
>> It is a BUG!
>>
>> If we only provide snapshot semantics, it is not useful.
>>
>> So we need to provide stricter add_replace/add_unique semantics:
>> 1) no duplicate keys should ever be observable in the table
>>    (snapshot semantics);
>> 2) no duplicate keys should ever be observed by forward iteration
>>    over the table.
>>
>> The fix:
>> To provide the forward-iteration-observation semantics, I ensure the
>> newly inserted node (add_unique/add_replace) is inserted as the first
>> node of the identical-hash-value node chain.
>>
>> One other behavior is changed by this patch: we no longer do gc in
>> __cds_lfht_next_duplicate_node(), because it does not need it.
>>
>
> I agree that your semantic fix is needed for ensuring that cds_lfht_next
> is semantically correct with the add_unique semantics.
>
> Just a note: the most important lookup/get-next API semantic in my view
> is lookup + cds_lfht_next_duplicate, which is the typical use case for
> looking up a node and getting all the duplicate keys.
>
> Ideally, we want both get_next and get_next_duplicate to provide the
> correct semantics.
>
> I think the reason your fix here seems to work without breaking the
> behavior of cds_lfht_next_duplicate is that you only put nodes added
> with add_unique at the beginning of the same-hash-value chain. It's a
> good thing that you don't try doing this for the other add modes
> (which allow duplicates), because cds_lfht_next_duplicate requires
> duplicate nodes to be next to one another. And your patch keeps this
> behavior.
>
> A few comments about the code below,
>
>>
>>
>> Signed-off-by: Lai Jiangshan <[email protected]>
>> ---
>>  rculfhash.c |   31 ++++++++++++++++++++++++++-----
>>  1 files changed, 26 insertions(+), 5 deletions(-)
>>
>> diff --git a/rculfhash.c b/rculfhash.c
>> index 6ba3971..72a444d 100644
>> --- a/rculfhash.c
>> +++ b/rculfhash.c
>> @@ -833,6 +833,10 @@ int _cds_lfht_replace(struct cds_lfht *ht, unsigned long size,
>>  	return 0;
>>  }
>>
>> +static void
>> +__cds_lfht_next_duplicate_node(struct cds_lfht *ht, struct cds_lfht_iter *iter,
>> +		struct cds_lfht_node *node);
>> +
>>  static
>>  struct cds_lfht_node *_cds_lfht_add(struct cds_lfht *ht,
>>  		unsigned long size,
>> @@ -862,20 +866,37 @@ struct cds_lfht_node *_cds_lfht_add(struct cds_lfht *ht,
>>  			goto insert;
>>  		if (likely(clear_flag(iter)->p.reverse_hash >
>>  				node->p.reverse_hash))
>>  			goto insert;
>> +
>>  		/* dummy node is the first node of the identical-hash-value chain */
>>  		if (dummy && clear_flag(iter)->p.reverse_hash ==
>>  				node->p.reverse_hash)
>>  			goto insert;
>> +
>>  		next = rcu_dereference(clear_flag(iter)->p.next);
>>  		if (unlikely(is_removed(next)))
>>  			goto gc_node;
>> +
>> +		/* uniquely add */
>>  		if ((mode == ADD_UNIQUE)
>>  				&& !is_dummy(next)
>> -				&& clear_flag(iter)->p.reverse_hash == node->p.reverse_hash
>> -				&& !ht->compare_fct(node->key, node->key_len,
>> -					clear_flag(iter)->key,
>> -					clear_flag(iter)->key_len)) {
>> -			return clear_flag(iter);
>> +				&& clear_flag(iter)->p.reverse_hash == node->p.reverse_hash) {
>> +			struct cds_lfht_iter d_iter = { .next = iter, };
>> +
>> +			/*
>> +			 * Uniquely adding inserts the node as the first
>> +			 * node of the identical-hash-value node chain.
>> +			 *
>> +			 * This semantic ensures that no duplicate keys
>> +			 * should ever be observable in the table
>> +			 * (including observing the nodes one by one
>> +			 * through forward iteration).
>> +			 */
>> +			__cds_lfht_next_duplicate_node(ht, &d_iter, node);
>
> Why did we need to split cds_lfht_next_duplicate at all ?
>
> Could we simply pass a:
>
> struct cds_lfht_iter d_iter = { .node = node, .iter = iter };
>
> cds_lfht_next_duplicate(ht, &d_iter);
>
> without splitting it ?
>
> I am probably missing something here...
The result is correct/the same, but it may introduce some confusion,
because the reason I didn't use cds_lfht_next_duplicate is that "@node
is not already in the list". (I didn't realize at the time that the
results are the same.)

Since the results are the same, I will respin it.

Thanks,
Lai

>
> Thanks,
>
> Mathieu
>
>> +			if (!d_iter.node)
>> +				goto insert;
>> +
>> +			return d_iter.node;
>>  		}
>> +
>>  		/* Only account for identical reverse hash once */
>>  		if (iter_prev->p.reverse_hash != clear_flag(iter)->p.reverse_hash
>>  				&& !is_dummy(next))
>> --
>> 1.7.4.4
>>

_______________________________________________
ltt-dev mailing list
[email protected]
http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
