Hi Colin,

The fix is out for RHEL 5.8.z in rgmanager-2.0.52-28.el5_8.2 or higher.
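
If you want to double-check what you are running, the usual rpm/yum commands
are enough (nothing specific to this fix):

  rpm -q rgmanager       # shows the installed rgmanager build
  yum update rgmanager   # pulls the fixed build from RHN once your channel has it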

The RHEL 6.4 fix has been built but has not yet been verified by our QA team.

Fabio

On 8/30/2012 12:39 PM, Colin Simpson wrote:
> Did this fix make it as yet?
> 
> Thanks
> 
> Colin
> 
> On Thu, 2012-05-17 at 11:57 +0200, Fabio M. Di Nitto wrote:
>> Hi Colin,
>>
>> On 5/17/2012 11:47 AM, Colin Simpson wrote:
>>> Thanks for all the useful information on this.
>>>
>>> I realise the bz is not for this issue; I just included it because it contains
>>> the suggestion that nfsd should actually live in user space (which seems
>>> sensible).
>>
>> Understood. I can't really say whether userland or kernel would make any
>> difference in this specific unmount issue, but for "safety reasons" I
>> need to assume their design is the same and behaves the same way. When/if
>> there is a switch, we will need to look more deeply into it. With the
>> current kernel implementation we (cluster guys) need to use this approach.
>>
>>>
>>> Out of interest is there a bz # for this issue?
>>
>> Yes, one for RHEL 5 and one for RHEL 6, but they are both private at the
>> moment because they contain customer data.
>>
>> I expect that the workaround/fix (whatever you want to label it) will be
>> available via RHN in 2-3 weeks.
>>
>> Fabio
>>
>>>
>>> Colin
>>>
>>>
>>> On Thu, 2012-05-17 at 10:26 +0200, Fabio M. Di Nitto wrote:
>>>> On 05/16/2012 08:19 PM, Colin Simpson wrote:
>>>>> This is interesting.
>>>>>
>>>>> We very often see the filesystems fail to umount on busy clustered NFS
>>>>> servers.
>>>>
>>>> Yes, I am aware of the issue, since I have been investigating it in detail
>>>> for the past couple of weeks.
>>>>
>>>>>
>>>>> What is the nature of the "real fix"?
>>>>
>>>> First, the bz you mention below is unrelated to the unmount problem we
>>>> are discussing. Clustered nfsd locks are a slightly different story.
>>>>
>>>> There are two issues here:
>>>>
>>>> 1) cluster users' expectations
>>>> 2) nfsd's internal design
>>>>
>>>> (and note I am not blaming either cluster or nfsd here)
>>>>
>>>> Generally cluster users expect to be able to do things like (fake meta
>>>> config):
>>>>
>>>> <service1..
>>>>  <fs1..
>>>>   <nfsexport1..
>>>>    <nfsclient1..
>>>>     <ip1..
>>>> ....
>>>> <service2
>>>>  <fs2..
>>>>   <nfsexport2..
>>>>    <nfsclient2..
>>>>     <ip2..
>>>>
>>>> and be able to move services around cluster nodes without problems. Note
>>>> that the fs used is irrelevant; it can be clustered or not.
>>>>
>>>> This setup unfortunately clashes with the nfsd design.
>>>>
>>>> When a service is shut down (whether due to a stop or a relocation makes
>>>> no difference):
>>>>
>>>> the IP is removed
>>>> exportfs -u .....
>>>> (and that's where we hit the nfsd design limitation)
>>>> umount fs..
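>>>>
>>>> For illustration only (addresses and paths made up), the stop path is
>>>> roughly equivalent to:
>>>>
>>>>   ip addr del 192.168.1.10/24 dev eth0   # drop the service VIP
>>>>   exportfs -u '*:/srv/export1'           # stop accepting new requests for it
>>>>   umount /srv/export1                    # fails while nfsd holds a reference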
>>>>
>>>> By design (though I can't say exactly why it is done this way without
>>>> speculating), nfsd will continue to serve open sessions via rpc;
>>>> exportfs -u will only stop new incoming requests.
>>>>
>>>> If nfsd is serving a client, it will continue to hold a lock on the
>>>> filesystem (in kernel) that prevents the fs from being unmounted.
>>>>
>>>> The only ways to effectively close the sessions are:
>>>>
>>>> - drop the VIP and wait for the connections to time out (nfsd would
>>>>   effectively also drop the lock on the fs), but this is slow and how long
>>>>   it takes is not always consistent
>>>>
>>>> - restart nfsd.
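>>>>
>>>> On RHEL 5/6 that restart amounts, roughly, to the stock init scripts
>>>> (illustrative only, not necessarily what the resource agent will run):
>>>>
>>>>   service nfslock restart   # restart the lock daemons
>>>>   service nfs restart       # restart nfsd, which drops its fs references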
>>>>
>>>>
>>>> The "real fix" here would be to wait for nfsd containers that do support
>>>> exactly this scenario. Allowing unexport of single fs and lock drops
>>>> etc. etc. This work is still in very early stages upstream, that doesn't
>>>> make it suitable yet for production.
>>>>
>>>> The patch I am working on is basically a way to handle the clash in the
>>>> best way possible.
>>>>
>>>> A new nfsrestart="" option will be added to both fs and clusterfs: if
>>>> force_unmount is set and the filesystem cannot be unmounted, it will
>>>> perform an extremely fast restart of nfslock and nfsd.
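>>>>
>>>> In the same fake meta config style as above, that would look something
>>>> like this (attribute values made up, and the exact syntax may still change
>>>> before the patch lands):
>>>>
>>>>   <fs1 .. force_unmount="1" nfsrestart="1" ..
>>>>    <nfsexport1..
>>>>     ...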
>>>>
>>>> We can argue that it is not the final solution; I think we can agree
>>>> that it is more of a workaround, but:
>>>>
>>>> 1) it will allow service migration instead of service failure
>>>> 2) it will match cluster users' expectations (allowing different exports
>>>> to live peacefully together).
>>>>
>>>> The only negative impact that we have been able to evaluate so far (the
>>>> patch is still in a heavy testing phase), besides having to add a config
>>>> option to enable it, is that there will be a small window in which the
>>>> clients connected to a certain node, for all nfs services on that node,
>>>> will not be served because nfsd is restarting.
>>>>
>>>> So if you are migrating export1 and there are clients using export2,
>>>> export2 will also be affected for the few ms required to restart nfsd
>>>> (assuming export1 and export2 are running on the same node, of course).
>>>>
>>>> Putting things in perspective for a cluster, I think that it is a lot
>>>> better to be able to unmount an fs and relocate services as necessary than
>>>> to have a service fail completely and maybe the node be fenced.
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> I like the idea of NFSD fully being in user space, so killing it would
>>>>> definitely free the fs.
>>>>>
>>>>> Alan Brown (who's on this list) recently posted to a RH BZ that he was
>>>>> one of the people who moved it into kernel space, for performance reasons
>>>>> that are no longer relevant:
>>>>>
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=580863#c9
>>>>>
>>>>> , but I doubt this is the fix you have in mind.
>>>>
>>>> No, that's a totally different issue.
>>>>
>>>>>
>>>>> Colin
>>>>>
>>>>> On Tue, 2012-05-15 at 20:21 +0200, Fabio M. Di Nitto wrote:
>>>>>> This solves different issues at startup, relocation and recovery.
>>>>>>
>>>>>> Also note that there is a known limitation in nfsd (on both RHEL 5 and 6)
>>>>>> that could cause problems under some conditions in your current
>>>>>> configuration. A permanent fix is being worked on at the moment.
>>>>>>
>>>>>> Without going into extreme detail: you might have two of those services
>>>>>> running on the same node, and attempting to relocate one of them can fail
>>>>>> because the fs cannot be unmounted. This is due to nfsd holding a lock (at
>>>>>> kernel level) on the FS. Changing the config to the suggested one masks the
>>>>>> problem pretty well, but more testing for a real fix is in progress.
>>>>>>
>>>>>> Fabio
>>>>>>
>>>>
>>>
>>>
>>>
>>
> 
> 
> 

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
