On Thu, 2012-08-16 at 11:55 -0700, Linus Torvalds wrote:
> Shai - *please* don't send stuff to me personally unless I'm
> absolutely the only person who can help. Which is not normally the
> case at all.
>
> Trond, this BUG_ON() seems to have been introduced in commit
> a427b9ec4eda8 ("NFS: Fix a number of bugs in the idmapper") from David
> Howells.
>
> David? Why is there a BUG_ON() there? Killing the machine helps us exactly
> how?
>
> There seems to be some missing locking wrt idmap->idmap_key_cons
> accesses. Instead, it uses the insane
>
> + cons = ACCESS_ONCE(idmap->idmap_key_cons);
> + idmap->idmap_key_cons = NULL;
>
> sequence that makes no sense. What's so special about it that it needs
> ACCESS_ONCE()? If that access is unlocked, then setting the field to
> NULL directly afterwards seems singularly unsafe. What's the logic
> behind such an access pattern?
Hi Linus,
I believe that this is the same issue that William Dauchy hit last week,
and for which we have already have a diagnosis: it is due to the
upstream code failing to clear idmap->idmap_key_cons if/when the legacy
upcall fails.
I do have a couple of patches available to fix this issue in linux-next,
but we found a problem with one of them during testing earlier today, so
I'm holding back sending them upstream pending further testing.
Cheers
Trond
--
Trond Myklebust
Linux NFS client maintainer
NetApp
[email protected]
www.netapp.com