On Thu, 2012-08-16 at 15:22 -0400, Trond Myklebust wrote:
> On Thu, 2012-08-16 at 11:55 -0700, Linus Torvalds wrote:
> > Shai - *please* don't send stuff to me personally unless I'm
> > absolutely the only person who can help. Which is not normally the
> > case at all.
> >
> > Trond, this BUG_ON() seems to have been introduced in commit
> > a427b9ec4eda8 ("NFS: Fix a number of bugs in the idmapper") from David
> > Howells.
> >
> > David? Why is there a BUG_ON() there? Killing the machine helps us exactly
> > how?
> >
> > There seems to be some missing locking wrt idmap->idmap_key_cons
> > accesses. Instead, it uses the insane
> >
> > + cons = ACCESS_ONCE(idmap->idmap_key_cons);
> > + idmap->idmap_key_cons = NULL;
> >
> > sequence that makes no sense. What's so special about it that it needs
> > ACCESS_ONCE()? If that access is unlocked, then setting the field to
> > NULL directly afterwards seems singularly unsafe. What's the logic
> > behind such an access pattern?
>
> Hi Linus,
>
> I believe that this is the same issue that William Dauchy hit last week,
> and for which we have already have a diagnosis: it is due to the
> upstream code failing to clear idmap->idmap_key_cons if/when the legacy
> upcall fails.
>
> I do have a couple of patches available to fix this issue in linux-next,
> but we found a problem with one of them during testing earlier today, so
> I'm holding back sending them upstream pending further testing.
OK. I've just sent out a new version of the those 2 patches for testing,
and I made sure to Cc: Shai.
Shai, can you please test if they fix the problem you are seeing?
Cheers
Trond
--
Trond Myklebust
Linux NFS client maintainer
NetApp
[email protected]
www.netapp.com
N�����r��y����b�X��ǧv�^�){.n�+������z)����w*jg��������ݢj/���z�ޖ��2�ޙ����&�)ߡ�a�����G���h��j:+v���w��٥