Hi,

inline

-- 
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-707-0660
fax.  734-769-8938
cel.  734-216-5309

----- Original Message -----
> From: "Jeremy Bongio" <jbon...@linux.vnet.ibm.com>
> To: nfs-ganesha-devel@lists.sourceforge.net
> Sent: Tuesday, November 17, 2015 11:46:06 AM
> Subject: [Nfs-ganesha-devel] memory leak related to same client using v3 and v4 mounts and fixes
> 
> There is a memory leak that is caused by V3 and V4 duplicate request
> caches being shared.
> 
> We don't keep track of whether a DRC was used for NFSv4 or NFSv3 in the
> hashkey, only the address ...  but each cache _does_ have a protocol
> type. This type is used later to decide which requests/replies should be
> cached and which shouldn't. This results in large operations like READs
> being cached when there is a mismatch between the request protocol-type
> and the DRC protocol-type. This can quickly (in a few minutes) eat up
> all memory (and trigger the OOM killer) in targeted testing.
> 
> 1. Either we can include the DRC type when creating the hashkey for the DRC.

you could include it in the sort, but not the hash key, because the hash key
is pre-computed by the XDR layer (and we want to continue doing that)
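
To illustrate the distinction, here is a minimal sketch with invented names
(sketch_drc_key, sketch_drc_cmp, and the integer addr are stand-ins, not the
real nfs-ganesha structures). It assumes the DRC lookup picks a tree partition
from hk and then orders entries with a comparator; the type goes only into the
comparator, and hk is left exactly as computed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; not the real nfs-ganesha definitions. */
enum sketch_drc_type { SKETCH_DRC_V3 = 3, SKETCH_DRC_V4 = 4 };

struct sketch_drc_key {
	uint64_t hk;               /* pre-computed hash, never modified here */
	uint32_t addr;             /* stand-in for the client socket address */
	enum sketch_drc_type type; /* protocol the DRC was created for */
};

/* Tree comparator: order by address first, then by protocol type.
 * A v3 and a v4 DRC for the same client share hk (and therefore the
 * same partition) but still compare as two distinct entries. */
int sketch_drc_cmp(const struct sketch_drc_key *a,
		   const struct sketch_drc_key *b)
{
	assert(a != NULL && b != NULL);

	if (a->addr != b->addr)
		return (a->addr < b->addr) ? -1 : 1;
	if (a->type != b->type)
		return (a->type < b->type) ? -1 : 1;
	return 0;
}
```

With a comparator like this, a lookup for a v4 request from a client that
already has a v3 DRC would miss and allocate a separate v4 DRC, without any
change to how the hash is produced.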

> 
> 2. Or we could stop relying on the type of the DRC and rely instead on
> the type of the current request. In nfs_dupreq_start(), this would mean
> using get_drc_type(req) instead of drc->type.

that would be fine
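
A rough sketch of option 2, to show why the mismatch matters. All names here
(ex_req, ex_drc, ex_cacheable, ex_req_type) are made up, and the rule table is
purely illustrative, not the actual nfs_dupreq.c logic. The only protocol
facts relied on are that procedure 6 is READ in NFSv3 and procedure 1 is
COMPOUND in NFSv4, so the same procedure number means different things
depending on which protocol type is consulted:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum ex_proto { EX_NFS_V3 = 3, EX_NFS_V4 = 4 };

struct ex_req { enum ex_proto vers; uint32_t proc; }; /* incoming request */
struct ex_drc { enum ex_proto type; };                /* shared per-client DRC */

/* Stand-in for get_drc_type(req): derive the type from the request itself. */
enum ex_proto ex_req_type(const struct ex_req *req)
{
	assert(req != NULL);
	return req->vers;
}

/* Made-up cacheability rules, keyed on protocol type:
 *   v3: do not cache READ (proc 6)
 *   v4: do not cache COMPOUND (proc 1), pretend it is handled elsewhere */
bool ex_cacheable(enum ex_proto type, uint32_t proc)
{
	switch (type) {
	case EX_NFS_V3:
		return proc != 6;
	case EX_NFS_V4:
		return proc != 1;
	}
	return false;
}

/* Current behaviour: trust the type recorded in the (possibly shared) DRC. */
bool ex_decide_by_drc(const struct ex_drc *drc, const struct ex_req *req)
{
	assert(drc != NULL && req != NULL);
	return ex_cacheable(drc->type, req->proc);
}

/* Option 2: trust the type of the request being processed. */
bool ex_decide_by_req(const struct ex_req *req)
{
	return ex_cacheable(ex_req_type(req), req->proc);
}
```

Under these assumed rules, a v3 READ that lands in a DRC created for v4 is
judged cacheable by ex_decide_by_drc() but not by ex_decide_by_req(), which is
the shape of the leak described in the report.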


Matt

> 
> 
> What do you think is the best fix?
> 
> Here is one quick fix I tested that worked. However, is it safe to
> simply add to the hashkey? I think so, but maybe I'm not thinking of all
> scenarios.
> @@ -574,6 +574,12 @@ nfs_dupreq_get_drc(struct svc_req *req)
>                                               "get drc for addr: %s", str);
>                          }
> 
> +                       /* Now include the nfsv3 or nfsv4 type in hashkey.
> +                        * Otherwise we confuse V4 and V3 caches which will
> +                        * later mess up the process for deciding if a
> +                        * request is cacheable or not. */
> +                       drc_k.d_u.tcp.hk += dtype;
> +
>                          t = rbtx_partition_of_scalar(&drc_st->tcp_drc_recycle_t,
>                                                       drc_k.d_u.tcp.hk);
>                          DRC_ST_LOCK();
> 
> 
> Here is a simple script I use to reproduce the defect:
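> #!/usr/bin/perl
> use strict;
> use warnings;
> 
> my $server_ip = "10.10.0.11";
> 
> # Alternate NFSv4 and NFSv3 mounts from the same client, reading a file
> # each time, so both protocol versions hit the same per-address DRC.
> while (1) {
>      my $output = `mount -t nfs -o vers=4 $server_ip:/ibma /mnt/cthon; cat /mnt/cthon/a; umount /mnt/cthon`;
>      $output = `mount -t nfs -o vers=3 $server_ip:/ibm/gpfs0/a /mnt/cthon; cat /mnt/cthon/a; umount /mnt/cthon`;
> }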
> 
> --
> Jeremy Bongio
> 
> jbon...@us.ibm.com
> IBM Linux Technology Center - Linux Filesystems Team
> Linux Development Engineer
> 
> 
> ------------------------------------------------------------------------------
> _______________________________________________
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
> 
