Thanks for comments, Ilya.
> On Apr 14, 2020, at 21:11, Ilya Maximets <[email protected]> wrote:
>
> On 4/14/20 6:00 AM, Eiichi Tsukata wrote:
>>
>>
>> @@ -1575,11 +1573,16 @@ check_tries(struct trie_ctx trie_ctx[CLS_MAX_TRIES],
>> unsigned int n_tries,
>> * fields using the prefix tries. The trie checks are done only as
>> * needed to avoid folding in additional bits to the wildcards mask. */
>> for (j = 0; j < n_tries; j++) {
>> - /* Is the trie field relevant for this subtable, and
>> - is the trie field within the current range of fields? */
>> - if (field_plen[j] &&
>> - flowmap_is_set(&range_map, trie_ctx[j].be32ofs / 2)) {
>> - struct trie_ctx *ctx = &trie_ctx[j];
>> + struct trie_ctx *ctx = &trie_ctx[j];
>> + /* Check if trie field is relevant for this subtable and
>> + * synchronized with field_plen. */
>> + if (!field_plen[j] || !ctx->trie->field) {
>> + continue;
>> + }
>> +
>> + uint8_t be32ofs = ctx->trie->field->flow_be32ofs;
>
> What will prevent 'field' being cleared between above two lines and produce
> a NULL pointer dereference here or inside trie_lookup()?
Main thread calls ovsrcu_synchronize() before it sets cls->tries[i]->field =
NULL.
Here ovsrcu_synchronize() waits for other readers(revalidator/handler thread)
of rcu protected pointers going through RCU quiescent state.
So here, main thread who called ovsrcu_synchronize() waits for
revalidator/handler thread
going through quiescent state. It means 'be32ofs =
ctx->trie->field->flow_be32ofs;'
is executed before main thread executes 'cls->tries[i]->field = NULL;’.
>
> trie->field is not protected by anything and not even an atomic variable.
> Since it always points to some static memory objects it doesn't need to
> be RCU protected, but it should be an atomic variable and it must be checked
> after any read. In fact it might be made rcu protected just to make things
> easier and more accurate.
>
I also had thought that protecting trie->field by RCU is a good idea, but found
the
following race condition:
[main thread] [revalidator/handler thread]
========================================================================
ovsrcu_set(&trie->field, NULL);
ovsrcu_synchronize();
Quiecense State;
for (int i = 0; i < cls->n_tries; i++)
/* field == NULL */
field = ovsrcu_get(struct mf_field *,
&trie->field);
be32ofs = field->flow_be32ofs;
/* n_tries == 0 */
cls->n_tries = n_tries;
The problem stems from n_tries and tries are not atomically updated.
Protecting them by RCU can be a pretty big patch, so I selected the current
way to fix though I’m still trying to find a better way.
What do you think is the best way to fix it?
Thanks
Eiichi
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev