Re: [Nfs-ganesha-devel] IP based recovery

Jeremy Bongio Thu, 17 Sep 2015 11:56:00 -0700

I mistakenly thought that we removed clientids from the recovery 
directory after grace only during EVENT_TAKE_NODEID. Actually, we remove 
them only during non-takeover events, which excludes both 
EVENT_TAKE_NODEID and EVENT_TAKE_IP. The main difference between the two 
events is the naming scheme used for the recovery directories.


In the IP address case we have one recovery directory per server IP. For 
the nodeid case we have one recovery directory per ganesha node.
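
To make the naming difference concrete, here is a minimal sketch of 
choosing a recovery directory name per event type. The enum, the function 
name recov_dir_name(), and the "ip-"/"node-" path layout are all 
hypothetical and only illustrate the idea; they are not the names or the 
on-disk layout Ganesha actually uses.

```c
#include <stdio.h>

/* Hypothetical event kinds, mirroring EVENT_TAKE_IP / EVENT_TAKE_NODEID. */
enum recov_event { RECOV_TAKE_IP, RECOV_TAKE_NODEID };

/*
 * Build the recovery directory path for a takeover event.
 * Assumed layout (illustration only):
 *   <root>/ip-<server IP>   one directory per server IP
 *   <root>/node-<nodeid>    one directory per ganesha node
 * Returns 0 on success, -1 if the path does not fit in buf.
 */
static int recov_dir_name(char *buf, size_t buflen, const char *root,
			  enum recov_event ev, const char *ip,
			  unsigned int nodeid)
{
	int n;

	if (ev == RECOV_TAKE_IP)
		n = snprintf(buf, buflen, "%s/ip-%s", root, ip);
	else
		n = snprintf(buf, buflen, "%s/node-%u", root, nodeid);

	/* snprintf reports the length it wanted; treat truncation as failure. */
	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With a layout like this, the takeover path only needs to know which IP or 
which node it is taking over in order to find the clientids to recover.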

It looks like clientids in general _are_ garbage collected and removed 
from the recovery directory by the reaper thread in 
reap_hash_table()->nfs_client_id_expire(). My concern is that in 
nfs4_read_recov_clids() we don't add the clientids that we're recovering 
to the general clientid hashtable, but to a separate clientid list in 
nfs4_recovery.c:

...
struct glist_head clid_list = GLIST_HEAD_INIT(clid_list);  /*< Clients */
...

I don't see where we are deleting the clientids that _aren't_ reclaimed. 
But if a clientid _is_ reclaimed, then it will be put in the general 
list and eventually be garbage collected by the reaper thread.
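
One possible shape for the missing cleanup, as a sketch only: once grace 
ends, walk the recovered-clientid list and drop every entry that was never 
reclaimed. Everything below (struct recov_entry, recov_add(), 
recov_mark_reclaimed(), recov_purge_after_grace(), and the remove_cb 
callback) is hypothetical and uses a plain singly linked list rather than 
Ganesha's glist, so it shows the idea and not the existing code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an entry on the recovery list. */
struct recov_entry {
	char *clientid;			/* clientid read from the recovery dir */
	bool reclaimed;			/* set when the client reclaims in grace */
	struct recov_entry *next;
};

/* Push a recovered clientid onto the list; returns the new head. */
static struct recov_entry *recov_add(struct recov_entry *head,
				     const char *clientid)
{
	struct recov_entry *e;
	size_t len;

	assert(clientid != NULL);
	e = malloc(sizeof(*e));
	if (e == NULL)
		return head;
	len = strlen(clientid) + 1;
	e->clientid = malloc(len);
	if (e->clientid == NULL) {
		free(e);
		return head;
	}
	memcpy(e->clientid, clientid, len);
	e->reclaimed = false;
	e->next = head;
	return e;
}

/* Mark a clientid as reclaimed; returns false if it is not on the list. */
static bool recov_mark_reclaimed(struct recov_entry *head,
				 const char *clientid)
{
	for (; head != NULL; head = head->next) {
		if (strcmp(head->clientid, clientid) == 0) {
			head->reclaimed = true;
			return true;
		}
	}
	return false;
}

/*
 * After grace: free the whole list. For each entry that was never
 * reclaimed, call remove_cb (if given) so the caller can delete its
 * recovery directory entry. Returns the number of unreclaimed entries.
 */
static int recov_purge_after_grace(struct recov_entry **head,
				   void (*remove_cb)(const char *clientid))
{
	struct recov_entry *e = *head;
	int purged = 0;

	while (e != NULL) {
		struct recov_entry *next = e->next;

		if (!e->reclaimed) {
			if (remove_cb != NULL)
				remove_cb(e->clientid);
			purged++;
		}
		free(e->clientid);
		free(e);
		e = next;
	}
	*head = NULL;
	return purged;
}
```

Reclaimed entries are only freed from this list, on the assumption that 
they have already moved to the general clientid hashtable and will be 
expired from there by the reaper thread.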


On 09/17/2015 12:47 PM, Frank Filz wrote:
>> We've previously used node-based recovery, which is basically what's
>> implemented right now in Ganesha. However, for a number of reasons we
>> need an IP-based recovery solution. Malahal told me that Red Hat wants an
>> IP-based solution as well. Soumya (or anyone else), have you been working
>> on this? Do you have anything to show yet?
> I thought the recovery was IP based already...
>
> There are basically three scenarios of interest:
>
> 1. Node goes down and IPs are moved to other nodes
> 2. Interface on Node goes down and IP is moved to another node
> 3. Actually there's just two scenarios, because failback when a node or
> interface comes back online is just scenario 2...
>
> This means there are three main actions to manage the transfer of state:
>
> 1. RELEASE_IP, if an interface goes down, the node that is losing that IP
> needs to release state associated with that IP
>
> 2. TAKE_IP, whichever node is acquiring an IP address (whether from a failed
> node, failed interface, or failback) must notify v3 clients and then accept
> reclaims from the appropriate v3 and v4 clients.
>
> 3. Somewhere in there all nodes need to cooperate to enforce grace period.
>
> Frank
>

-- 
Jeremy Bongio

[email protected]
IBM Linux Technology Center - Linux Filesystems Team
Linux Development Engineer


