Re: [Nfs-ganesha-devel] [Gluster-users] boot auto mount NFS-Ganesha exports failed

2020-03-19 Thread Soumya Koduri

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
Hi,

Since it's working on your test machine, this is most likely an NFS 
client-side issue. Please check if there are any kernel fixes between 
those versions which may have caused this.


I see a similar issue reported in the threads below [1] [2]. As suggested 
there, could you try disabling the kerberos module and specifying "sec=sys" 
during mount? If the issue persists, please use the "-vvv" option during 
mount to get more verbose output.
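
For example, a quick client-side test could look like this (commands are a 
sketch; adjust the server address and export path to your fstab entry):

# force AUTH_SYS, skipping the gss/kerberos upcall
mount -t nfs4 -o sec=sys,vers=4.2 192.168.11.90:/dev /data

# if it still fails, re-run with maximum verbosity
mount -vvv -t nfs4 -o sec=sys,vers=4.2 192.168.11.90:/dev /data

As an aside, the -512 in the dmesg error below is ERESTARTSYS, i.e. the 
mount call got interrupted, which fits a boot-time ordering/race issue; one 
commonly suggested workaround (not verified here) is adding the 
x-systemd.automount option in fstab so the mount is deferred until first 
access.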



Thanks,
Soumya

[1] 
https://askubuntu.com/questions/854591/nfs-not-mounting-since-upgrading-from-14-04-to-16-04
[2] 
https://superuser.com/questions/1201159/nfs-v4-nfs4-discover-server-trunking-unhandled-error-512-after-reboot/1207346


On 3/13/20 7:08 PM, Renaud Fortier wrote:

Hi community,

Maybe someone could help me with this one: half the time, mounting 
nfs-ganesha NFS4 exports fails at boot. I've searched a lot about this 
problem, but because it doesn't happen at every boot it's difficult to 
pinpoint the exact cause. It mounts perfectly after boot is completed.


-

-Debian 9 (up to date) on all Gluster servers and clients (4 apache2 web 
servers).


-Gluster version 6.7

-NFS-Ganesha version 2.8.3

-NFS 4.2

-fstab example: 192.168.11.90:/dev /data nfs4 
noatime,nodiratime,vers=4.2,_netdev 0 0


-NFS-Ganesha export example:

EXPORT {

     Export_Id = 2;

     Path = "/dev";

     Pseudo = "/dev";

     Access_Type = RW;

     Squash = No_root_squash;

     Disable_ACL = true;

     Protocols = "4";

     Transports = "UDP","TCP";

     SecType = "sys";

     FSAL {

     Name = "GLUSTER";

    Hostname = localhost;

     Volume = "dev";

     }

}

-Dmesg log: NFS: nfs4_discover_server_trunking unhandled error -512. 
Exiting with error EIO


-Systemd log:

     systemd[1]: Failed to mount /data.

     systemd[1]: Dependency failed for Remote File Systems.

     systemd[1]: remote-fs.target: Job remote-fs.target/start failed 
with result 'dependency'.


     systemd[1]: data.mount: Unit entered failed state.

---

I tried to reproduce the same problem with a test machine, but it mounts 
perfectly at every reboot. So I'm pretty sure the problem is on my 
clients. Also, there is no problem with the fuse mount.


Any help or direction to follow will be greatly appreciated.

Thank you

Renaud Fortier






Community Meeting Calendar:

Schedule -
Every Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
gluster-us...@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users





___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Crash in libntirpc with 1.6.3 version

2018-10-15 Thread Soumya Koduri

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
A similar crash (in 'svc_release_it') is hit when the RPC callback 
channel is used. Jiffin reported the same on github [1]. Probably 
somewhere in the cbk paths, an xprt->ref is not taken.
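
If so, the missing piece would be of this shape in whichever cbk path hands 
the xprt to the output queue (a rough sketch only; svc_ioq_cbk_send() below 
is a hypothetical stand-in for the actual enqueue point, while 
SVC_REF/SVC_RELEASE are the existing refcounting wrappers from 
ntirpc/rpc/svc.h):

/* Sketch: whoever queues a callback reply must hold its own xprt
 * reference, so the transport (and its xp_ops) cannot be freed
 * while svc_ioq_write() is still running. */
SVC_REF(xprt, SVC_REF_FLAG_NONE);       /* take a ref before queueing */
svc_ioq_cbk_send(xprt, xioq);           /* hypothetical enqueue point */

/* svc_ioq_write() then drops that ref on completion, as it already
 * does today via svc_release_it(): */
SVC_RELEASE(xprt, SVC_RELEASE_FLAG_NONE);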


Thanks,
Soumya


[1] https://github.com/nfs-ganesha/ntirpc/issues/153

On 10/16/18 1:58 AM, Naresh Babu wrote:



Hi All,
We are using a custom FSAL with NFS-Ganesha 2.6.3 and libntirpc 1.6.3. 
We are consistently running into the following crash in libntirpc and 
are wondering if this is a known issue. We'd appreciate any help 
resolving it.


(gdb) bt
#0  0x7fdc88000478 in ?? ()
#1  0x7fdd5375f6d8 in svc_release_it (xprt=0x7fdc880430d0, flags=0, 
tag=0x7fdd5376ea36 <__func__.8221> "svc_ioq_write", line=233) at 
/home/naresh/clfsrepo3/external/nfs/src/libntirpc/ntirpc/rpc/svc.h:433
#2  0x7fdd5375fc46 in svc_ioq_write (xprt=0x7fdc880430d0, 
xioq=0x7fdcf0003160, ifph=0x12c8c10) at 
/home/naresh/clfsrepo3/external/nfs/src/libntirpc/src/svc_ioq.c:233
#3  0x7fdd5375fd88 in svc_ioq_write_callback (wpe=0x7fdcf00031c8) at 
/home/naresh/clfsrepo3/external/nfs/src/libntirpc/src/svc_ioq.c:257
#4  0x7fdd537605e0 in work_pool_thread (arg=0x7fdc900034d0) at 
/home/naresh/clfsrepo3/external/nfs/src/libntirpc/src/work_pool.c:181
#5  0x7fdd5255be25 in start_thread (arg=0x7fdc80e8e700) at 
pthread_create.c:308
#6  0x7fdd51e6834d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:113


(gdb) p *xprt->xp_ops
$3 = {xp_recv = 0x7fdc88000458, xp_stat = 0x7fdc88000458, xp_decode = 
0x7fdc880430c0, xp_reply = 0x7fdc880430c0, xp_checksum = 0x7fdc88000478, 
xp_destroy = 0x7fdc88000478, xp_control = 0x7fdc88000488, 
xp_free_user_data = 0x7fdc88000488}


(gdb) info symbol 0x7fdc88000478
No symbol matches 0x7fdc88000478.

(gdb) p *xprt
$2 = {xp_ops = 0x7fdc88000468, xp_dispatch = {process_cb = 
0x7fdc88000468, rendezvous_cb = 0x7fdc88000468}, xp_parent = 
0x7fdc880430c0, xp_tp = 0x7fdc880430c0 "\020", xp_netid = 0x7fdcf4000bc0 
"\240\246\227S\335\177", xp_p1 = 0x7fdc88005238, xp_p2 = 0x7fdc88043288,
   xp_p3 = 0x7fdc880051b0, xp_u1 = 0x7fdc880051b0, xp_u2 = 
0x7fdc88005238, xp_local = {nb = {maxlen = 2281730480, len = 32732, buf 
= 0x6}, ss = {ss_family = 0, __ss_align = 0,
       __ss_padding = '\000' <repeats ... times>, 
"\001\000\000\001\000\000\000\000\000\000\000\377\377\377\377\377\377\377\377", 
'\000' <repeats ... times>}}, xp_remote = {nb = {maxlen = 0, len = 0, buf 
= 0x0}, ss = {ss_family = 12728, __ss_align = 0,
       __ss_padding = 
"\000\000\000\000\000\000\000\000\377\377\377\377", '\000' <repeats ... 
times>, " \000\000\000\000\000\000\000 \020", '\000' <repeats ... times>, 
"`\001\000\000\000\000\000\000\204", '\000' <repeats ... times>, 
"\270\061\004\210\334\177\000"}},
   xp_lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 
0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
__size = '\000' <repeats ... times>, __align = 0}, xp_fd = 0, xp_ifindex 
= 0, xp_si_type = 0, xp_type = 0, xp_refcnt = 0,

   xp_flags = 64}



___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel






Re: [Nfs-ganesha-devel] Backport list for 2.5.4

2017-11-01 Thread Soumya Koduri

Please include the below commits as well:


commit f5c48022176a656143b28824c2fb913518bd8cc4
Author: Soumya Koduri <skod...@redhat.com>
Date:   Thu Oct 26 12:22:38 2017 +0530

FSAL_GLUSTER: Use the new API to be able to set lkowner


commit e03420fa445a0dac3e05c11f4718b6120744f1b5
Author: Kinglong Mee <kinglong...@gmail.com>
Date:   Thu Oct 26 06:23:58 2017 -0400

FSAL_GLUSTER: avoid overwrite the old errno in SET_GLUSTER_CREDS

commit 80afbd0f9d54682913de09f286664090beb3600d
Author: Kinglong Mee <kinglong...@gmail.com>
Date:   Thu Oct 26 06:03:04 2017 -0400

FSAL_GLUSTER: use the correct errno after glusterfs error out

commit 9306c968bc28d57ed4f8a57866fec02290049a4d
Author: Kinglong Mee <kinglong...@gmail.com>
Date:   Mon Oct 16 10:55:36 2017 +0800

GLUSTER: fix use after free of globalfd in open2



and

https://review.gerrithub.io/385085 - FSAL_GLUSTER: Fix memory leak while 
reading dirents

(yet to be merged; it got +2)

Thanks,
Soumya

On 10/31/2017 11:37 PM, Daniel Gryniewicz wrote:
Here's the set of commits that downstream Ceph needs.  Gluster can also 
use the non-Ceph related ones.


Note, these are oldest first, not newest first.

Daniel


commit b862fe360b2a0f1b1d9d5d6a8b91f1550b66b269
Author: Gui Hecheng <guihech...@cmss.chinamobile.com>
AuthorDate: Thu Mar 30 10:44:25 2017 +0800
Commit: Frank S. Filz <ffilz...@mindspring.com>
CommitDate: Fri Aug 11 14:31:22 2017 -0700

 SAL: extract fs logic from nfs4_recovery

 This is a prepare patch for modulized recovery backends.
 - define recovery apis: struct nfs_recovery_backend
 - define hooks for recovery_fs module

 Change-Id: I45523ef9a0e6f9a801fc733b095ba2965dd8751b
 Signed-off-by: Gui Hecheng <guihech...@cmss.chinamobile.com>
commit cb787a1cf4a4df4da672c6b00cb0724db5d99e4d
Author: Gui Hecheng <guihech...@cmss.chinamobile.com>
AuthorDate: Thu Mar 30 10:50:18 2017 +0800
Commit: Frank S. Filz <ffilz...@mindspring.com>
CommitDate: Fri Aug 11 14:31:23 2017 -0700

 SAL: introduce new recovery backend based on rados kv store

 Use rados OMAP API to implement a kv store for client tracking data

 Change-Id: I1aec1e110a2fba87ae39a1439818a363b6cfc822
 Signed-off-by: Gui Hecheng <guihech...@cmss.chinamobile.com>
commit fbc905015d01a7f2548b81d84f35b76524543f13
Author: Gui Hecheng <guihech...@cmss.chinamobile.com>
AuthorDate: Wed May 3 09:58:34 2017 +0800
Commit: Frank S. Filz <ffilz...@mindspring.com>
CommitDate: Fri Aug 11 14:31:23 2017 -0700

 cmake: make modulized recovery backends compile as modules

 - add USE_RADOS_RECOV option for new rados kv backend
 - keep original fs backend as default

 Change-Id: I26c2c4f9a433e6cd70f113fa05194d6817b9377a
 Signed-off-by: Gui Hecheng <guihech...@cmss.chinamobile.com>
commit eb4eea1343251f17fe39de48426bc4363eaef957
Author: Gui Hecheng <guihech...@cmss.chinamobile.com>
AuthorDate: Thu May 4 22:43:17 2017 +0800
Commit: Frank S. Filz <ffilz...@mindspring.com>
CommitDate: Fri Aug 11 14:31:23 2017 -0700

 config: add new config options for rados_kv recovery backend

 - new config block: RADOS_KV
 - new option: ceph_conf, userid, pool

 Change-Id: Id44afa70e8b5adb2cb2b9d48a807b0046f604f30
 Signed-off-by: Gui Hecheng <guihech...@cmss.chinamobile.com>
commit f7a09d87851f64a68c2438fdc09372703bcbebec
Author: Matt Benjamin <mbenja...@redhat.com>
AuthorDate: Thu Jul 20 15:21:00 2017 -0400
Commit: Frank S. Filz <ffilz...@mindspring.com>
CommitDate: Thu Aug 17 14:46:29 2017 -0700

 config: add config_url and RADOS url provider

 Provides a mechanism to load nfs-ganesha config sections (e.g.,
 export blocks) from a generic URL.  Includes a URL provider
 which maps URLs to Ceph RADOS objects.

 Change-Id: I9067eaef2b38a78e9f1a877dfb9eb3c176239e71
 Signed-off-by: Matt Benjamin <mbenja...@redhat.com>
commit b6ce63479c965c12d2d3417abd1dd082cf0967b8
Author: Matt Benjamin <mbenja...@redhat.com>
AuthorDate: Fri Sep 22 14:21:46 2017 -0400
Commit: Frank S. Filz <ffilz...@mindspring.com>
CommitDate: Fri Sep 22 14:06:12 2017 -0700

 rpm spec: add RADOS_URLS

 Change-Id: I60ebd4cb5bc3b3184704b8951a5392ed91846cdd
 Signed-off-by: Matt Benjamin <mbenja...@redhat.com>
commit 247c4a61cd743e7b3430bb0a9780c3f6d3f73a44
Author: Matt Benjamin <mbenja...@redhat.com>
AuthorDate: Fri Sep 22 15:38:37 2017 -0400
Commit: Frank S. Filz <ffilz...@mindspring.com>
CommitDate: Fri Sep 22 14:06:28 2017 -0700

 rados url: handle error from rados_read()

 Change-Id: If437a989ddaea108216c28af99fab6da0f089e01
 Signed-off-by: Matt Benjamin <mbenja...@redhat.com>
commit d9f0536b7f3cbe6b9b4d0dc5b4e4acd3337d41b5
Author: Jeff Layton <jlay...@redhat.com>
AuthorDate: Fri Oct 6 14:23:23 2017 -0400
Commit: Frank S.

Re: [Nfs-ganesha-devel] UID and GID mapping

2017-10-31 Thread Soumya Koduri
The Anonymous_uid & Anonymous_gid options can be used in the EXPORT {} 
block to set anonuid/anongid [1]; a sketch follows below.


Thanks,
Soumya

[1] 
https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/config_samples/config.txt#L210
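
For example, a rough nfs-ganesha equivalent of a knfsd exports entry like 
"rw,all_squash,anonuid=65534,anongid=65534" would be the below (untested 
sketch; option names as documented in [1]):

EXPORT {
    Export_Id = 1;
    Path = "/";
    Pseudo = "/";
    Access_Type = RW;
    Squash = All_Squash;        # squash all users...
    Anonymous_Uid = 65534;      # ...mapping them to this uid
    Anonymous_Gid = 65534;      # ...and this gid
    FSAL {
        Name = CEPH;
    }
}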


On 10/30/2017 09:43 PM, William Allen Simpson wrote:

On 10/30/17 3:56 AM, Nitesh Sharma wrote:

Do you have any idea about nfs-ganesha UID and GID mapping


The developer's list might

What version are you using?



How can I map this entry in nfs-ganesha export which is above CephFS

/mnt/tvault_automation  *(rw,all_squash,anonuid=65534,anongid=65534)


My sample file is
=
root@d00-0c-29-05-b8-ca:~ # cat /etc/ganesha/ganesha.conf
EXPORT
{
   Export_Id = 1; # Each export needs to have a unique 'Export_Id' 
(mandatory)

   Path = "/"; # Export path in the related CephFS pool (mandatory)
   Pseudo = "/"; # Target NFS export path (mandatory for NFSv4)
   Access_Type = RW; # 'RO' for read-only access, default is 'None'
  #Squash = No_Root_Squash; # NFS squash option
   Squash=All, All_Squash, AllSquash, All_Anonymous, AllAnonymous;
   FSAL { # Exporting 'File System Abstraction Layer'
 Name = CEPH; # Ganesha backend, 'CEPH' for CephFS or 'RGW' for 
RADOS Gateway

   }
}
===



--
Thanks and Regards,
*Nitesh Sharma.*



-- 


Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel




Re: [Nfs-ganesha-devel] V2.5-stable maintenance

2017-10-09 Thread Soumya Koduri



On 10/08/2017 02:09 AM, Malahal Naineni wrote:
Soumya, I added that commit in addition to what I posted. I tried merging 
all 34 commits, but a couple of things failed, so I bailed out on the 
following. If you need any of these, please let me know!




Thanks Malahal. That's fine by me.

-Soumya


1) 09303d9b1 FSAL_PROXY : storing stateid from background NFS server
 Merge was successful, but compilation failed. Looks like it needs 
some other commit(s) as well.


2) d89d67db2 nfs: fix error handling in nfs_rpc_v41_single
 Merge failed but upon further inspection, this is NOT applicable to 
V2.5-stable.


The other 32 commits are all cherry-picked, with a few needing merge 
conflict resolution. Here is the branch: 
https://github.com/malahal/nfs-ganesha/commits/V2.5-stable


I will publish it early next week (it may take a few commits from the dev.13 
tag as well) after some trivial testing!


Regards, Malahal.

On Thu, Oct 5, 2017 at 8:32 PM, Soumya Koduri <skod...@redhat.com 
<mailto:skod...@redhat.com>> wrote:


Hi Malahal,

On 10/05/2017 09:06 AM, Malahal Naineni wrote:

85bd9217d GLUSTER: make sure to free xstat when meeting error



Before applying the above patch, I request backporting the below commit
as well:

39119aa FSAL_GLUSTER: Use glfs_xreaddirplus_r for readdir

Thanks,
Soumya




--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] V2.5-stable maintenance

2017-10-05 Thread Soumya Koduri

Hi Malahal,

On 10/05/2017 09:06 AM, Malahal Naineni wrote:

85bd9217d GLUSTER: make sure to free xstat when meeting error



Before applying the above patch, I request backporting the below commit as 
well:


39119aa FSAL_GLUSTER: Use glfs_xreaddirplus_r for readdir

Thanks,
Soumya

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] cthon04 tests in the CentOS CI uses the deprecated/removed create-export-ganesha.sh script

2017-09-29 Thread Soumya Koduri



On 09/29/2017 03:35 PM, Niels de Vos wrote:

On Fri, Sep 29, 2017 at 03:06:39PM +0530, Soumya Koduri wrote:



On 09/28/2017 01:15 AM, Niels de Vos wrote:

Hi Soumya and Arthy,

The GlusterFS 3.12 release is now the standard version that gets
installed for CentOS users. We already identified a regression (related
to the 'gluster volume create ... force' command), and an emergency
update is being pushed to the CentOS mirrors for this. However, the
cthon04 test uses a script that is not included in Gluster anymore.
Could you look into a replacement for this?

A test-run fails due to the missing create-export-ganesha.sh script:
https://ci.centos.org/job/nfs_ganesha_cthon04/1511/console

The actual export creation is done in the basic-gluster.sh script:

https://github.com/nfs-ganesha/ci-tests/blob/centos-ci/common-scripts/basic-gluster.sh#L133

Please send a pull request, and feel free to merge it after you did a
manual run.



Done [1]. For now I have made changes to download those scripts from
the glusterfs-3.10 github sources. Maybe it's better to have a copy of
those scripts in our centos-ci project itself.


Thanks! Is it really intentional to not have these scripts in GlusterFS
3.12, or could they have been removed in error?


These scripts were part of the glusterfs-ganesha package, which was 
removed in glusterfs-3.11 in favor of storhaug.




We should have test-cases that install+configure environments as close
as possible to what we advise users to do. If these scripts are indeed
intentionally dropped, should the basic-gluster.sh script use gdeploy to
setup nfs-ganeasha? (gdeploy has recently been added to the CentOS
Storage SIG repository)


Sounds good. But we need to check whether gdeploy supports configuring 
and exporting a volume via a single-node ganesha server (as we haven't 
switched to the HA configuration in our upstream CI tests yet).


Thanks,
Soumya



Thanks,
Niels




Thanks,
Soumya

[1] https://github.com/nfs-ganesha/ci-tests/pull/17


Thanks!
Niels (I'm out-of-office on Thursday)



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] cthon04 tests in the CentOS CI uses the deprecated/removed create-export-ganesha.sh script

2017-09-29 Thread Soumya Koduri



On 09/28/2017 01:15 AM, Niels de Vos wrote:

Hi Soumya and Arthy,

The GlusterFS 3.12 release is now the standard version that gets
installed for CentOS users. We already identified a regression (related
to the 'gluster volume create ... force' command), and an emergency
update is being pushed to the CentOS mirrors for this. However, the
cthon04 test uses a script that is not included in Gluster anymore.
Could you look into a replacement for this?

A test-run fails due to the missing create-export-ganesha.sh script:
   https://ci.centos.org/job/nfs_ganesha_cthon04/1511/console

The actual export creation is done in the basic-gluster.sh script:
   
https://github.com/nfs-ganesha/ci-tests/blob/centos-ci/common-scripts/basic-gluster.sh#L133

Please send a pull request, and feel free to merge it after you did a
manual run.



Done [1]. For now I have made changes to download those scripts from 
the glusterfs-3.10 github sources. Maybe it's better to have a copy of 
those scripts in our centos-ci project itself.


Thanks,
Soumya

[1] https://github.com/nfs-ganesha/ci-tests/pull/17


Thanks!
Niels (I'm out-of-office on Thursday)



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Recommended stable release for NFS-Ganesha

2017-09-14 Thread Soumya Koduri



On 09/15/2017 02:30 AM, Madhu Venugopal wrote:

Hi,

I am writing to enquire about the recommended stable release for 
NFS-Ganesha. I see that 2.5.2 is out on 
https://github.com/nfs-ganesha/nfs-ganesha/releases. But the wiki page 
at https://github.com/nfs-ganesha/nfs-ganesha/wiki only talks about 
versions 2.3 and 2.4. It has a line in there saying : "The current 2.4 
release is 2.4.1. We recommend users upgrade to this version.” Is this 
still the case? Or does 2.5.2 serve as the new stable release?


Yes. That wiki may not have been updated, but V2.5.2 is the latest and 
recommended stable release.


Thanks,
Soumya



We use NFS-Ganesha to serve ESXi datastores using NFS v3. Currently we 
are on version 2.2.0 and are looking to upgrade to a newer version. 
Hence the question.


Thanks,
Madhu


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot



___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel





[Nfs-ganesha-devel] Request clarification on mdcache_readdir_chunked()

2017-08-18 Thread Soumya Koduri

Hi Frank,

While scanning through the readdir code path, I noticed that we create a 
cache entry for each dirent (if not present) in the chunk as part of the 
callback, "mdc_readdir_chunked_cb()". But there is a repetitive check 
right after that (in "mdcache_readdir_chunked()") to verify whether all 
those entries exist in the cache or not.
Could you please clarify whether that double check is necessary? Can there 
be any case wherein we have the chunk read (from FSAL or cache) but the 
dirents/entries are not yet cached?


Also, it is followed by a getattrs() call which seems to be made 
unconditionally, even though the attributes have just been fetched as part 
of the FSAL readdir call and are still valid. Can we add a check that the 
attrs have indeed expired before re-fetching them? (See the sketch below.)
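
Something like the below is what I have in mind (pseudo-code only; both 
helpers named here are stand-ins for whatever attribute-expiry test and 
refresh call fit mdcache best):

/* in mdcache_readdir_chunked(), per dirent -- sketch, not actual code */
if (!mdc_attrs_valid(entry)) {
	/* attrs expired or never cached: refresh them from the FSAL */
	status = mdc_refresh_attrs(entry);
}
/* else: the attrs were just filled in by the FSAL readdir callback
 * and are still valid, so the extra getattrs() round-trip is skipped */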



Thanks,
Soumya


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Proposed backports for 2.5.2

2017-08-10 Thread Soumya Koduri


> commit 7f2d461277521301a417ca368d3c7656edbfc903
>  FSAL_GLUSTER: Reset caller_garray to NULL upon free
>

Yes

On 08/09/2017 08:57 PM, Frank Filz wrote:

39119aa Soumya Koduri FSAL_GLUSTER: Use glfs_xreaddirplus_r for
readdir

Yes? No? It's sort of a new feature, but may be critical for some use cases.
I'd rather it go into stable than end up separately backported for
downstream.



Right. As it is more of a new feature, w.r.t. upstream we wanted it to be 
part of 2.6 onwards only, so as not to break the stable branch (in case 
there are any nit issues).


But yes, we may end up back-porting it downstream if we do not rebase to 
2.6 by then.


Thanks,
Soumya

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Weekly conference call timing

2017-08-10 Thread Soumya Koduri



On 08/10/2017 01:18 AM, Frank Filz wrote:

My daughter will be starting a new preschool, possibly as early as August
22nd. Unfortunately it's Monday, Tuesday, Wednesday, and I will need to
drop her off at 9:00 AM Pacific Time, which is right in the middle of our
current time slot...

We could keep the time slot and move to Thursday (or even Friday), or I
could make it work to do it an hour earlier.

I'd like to make this work for the largest number of people, so if you could
give me an idea of what times DON'T work for you that would be helpful.

7:30 AM to 8:30 AM Pacific Time would be:
10:30 AM to 11:30 AM Eastern Time
4:30 PM to 5:30 PM Paris Time
8:00 PM to 9:00 PM Bangalore Time (and 9:00 PM to 10:00 PM when we switch
back to standard time)


An hour earlier (same day) is fine with me as well.

Regards,
Soumya

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] NFSv4 delegation in Ganesha

2017-08-08 Thread Soumya Koduri


- Original Message -
> From: "Soumya Koduri" <skod...@redhat.com>
> To: "Jeff Layton" <jlay...@redhat.com>
> Cc: nfs-ganesha-devel@lists.sourceforge.net, "Yun-Chih Chen" 
> <yunchih@gmail.com>
> Sent: Monday, August 7, 2017 11:59:21 AM
> Subject: Re: [Nfs-ganesha-devel] NFSv4 delegation in Ganesha
> 
> 
> 
> On 08/03/2017 08:42 PM, Jeff Layton wrote:
> > On Fri, 2017-05-19 at 11:55 +0530, Soumya Koduri wrote:
> >> Hi,
> >>
> >> On 05/18/2017 08:04 PM, Yun-Chih Chen wrote:
> >>> Hi, friends:
> >>>
> >>> I'm studying the effect of NFSv4 on Linux.  As far as I know, NFS
> >>> implemented in Linux kernel supports read delegation but not write
> >>> delegation.  I wonder if Ganesha implements write delegation?  If yes,
> >>> can it take effect if accessed by Linux NFS client?
> >>>
> >>> I use the following config file to observe delegation in Ganesha:
> >>>
> >>> EXPORT {
> >>>  Export_Id = 77;
> >>>  Path = /e;
> >>>  Pseudo = /e;
> >>>  Access_Type = RW;
> >>>  FSAL {
> >>>  Name = XFS;
> >>>  }
> >>>
> >>>  Delegations = readwrite;
> >>> }
> >>>
> >>> NFSV4 {
> >>>  Delegations = true;
> >>> }
> >>>
> >>> However, I did not see any trace of delegation under various workloads
> >>> (example: repeated read [1], repeated write [2]) using tools like
> >>> nfstrace or tcpdump.  When running in debug mode, I always got the
> >>> following in the log file:
> >>>
> >>> ganesha.nfsd-8999[main] display_fsinfo :FSAL :DEBUG :  delegations = 0
> >>>
> >>> indicating that delegation was not on.
> >>> Is there something wrong with my config file?  Or any other clue
> >>> regarding NFS delegation?  Thanks ( ;
> >>
> >> AFAIK except for FSAL_GPFS, other FSALs do not have support for
> >> delegations yet. Also after switching to new extended APIs
> >> (>=nfs-ganesha-2.4), delegations are disabled. It needs some additional
> >> work. We are planning to address it as part of adding this support for
> >> FSAL_GLUSTER (hopefully in 2.6 release). WRT XFS, I am not sure if
> >> anyone is actively looking at it.
> >>
> >> Thanks,
> >> Soumya
> >>
> > 
> > Hi Soumya,
> > 
> > I just started looking at delegation support in ganesha (mostly with an
> > eye toward plumbing in delegation support for Ceph).
> > 
> > I think we probably need to rework the whole delegation interface (maybe
> > even give it a dedicated FSAL op), and was wondering if you had started
> > any work along those lines.
> > 
> 
> Hi Jeff,
> 
> Yes. I started it, but so far have done only a POC, mainly to test the
> glusterfs lease support (which is an experimental feature). I could
> manage to get leases granted from and returned to the backend.
> 
> The only change I have made so far in the core ganesha layer is to
> uncomment the earlier delegation code path to get the request all the
> way to the FSAL [1]. I now intend to fix the SAL layer and am happy to
> collaborate with you. I agree that instead of clubbing it with lock, a
> dedicated fop would be better.
> 
> 

I have now added a new fop, "lease_op2()", and pushed the changes (along with 
some more fixes) to my github repo [1]. I have yet to test it out, but please 
let me know if the interface looks fine (rough shape sketched below).

Thanks,
Soumya

[1] https://github.com/soumyakoduri/nfs-ganesha/commits/delegation_2_6
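
For reference, the rough shape of the new op is the below (simplified; treat 
the exact parameter list as tentative until the branch settles):

/* A dedicated FSAL op for lease/delegation handling, decoupled from
 * the lock operations; sketch only, signature subject to review. */
fsal_status_t (*lease_op2)(struct fsal_obj_handle *obj_hdl,
			   struct state_t *state,
			   void *owner,
			   fsal_deleg_t deleg);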

> Thanks,
> Soumya
> 
> 
> [1] https://review.gerrithub.io/372998
> 
> > Thanks,
> > 
> 
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
> 

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] NFSv4 delegation in Ganesha

2017-08-07 Thread Soumya Koduri



On 08/03/2017 08:42 PM, Jeff Layton wrote:

On Fri, 2017-05-19 at 11:55 +0530, Soumya Koduri wrote:

Hi,

On 05/18/2017 08:04 PM, Yun-Chih Chen wrote:

Hi, friends:

I'm studying the effect of NFSv4 on Linux.  As far as I know, NFS
implemented in Linux kernel supports read delegation but not write
delegation.  I wonder if Ganesha implements write delegation?  If yes,
can it take effect if accessed by Linux NFS client?

I use the following config file to observe delegation in Ganesha:

EXPORT {
 Export_Id = 77;
 Path = /e;
 Pseudo = /e;
 Access_Type = RW;
 FSAL {
 Name = XFS;
 }

 Delegations = readwrite;
}

NFSV4 {
 Delegations = true;
}

However, I did not see any trace of delegation under various workloads
(example: repeated read [1], repeated write [2]) using tools like
nfstrace or tcpdump.  When running in debug mode, I always got the
following in the log file:

ganesha.nfsd-8999[main] display_fsinfo :FSAL :DEBUG :  delegations = 0

indicating that delegation was not on.
Is there something wrong with my config file?  Or any other clue
regarding NFS delegation?  Thanks ( ;


AFAIK except for FSAL_GPFS, other FSALs do not have support for
delegations yet. Also after switching to new extended APIs
(>=nfs-ganesha-2.4), delegations are disabled. It needs some additional
work. We are planning to address it as part of adding this support for
FSAL_GLUSTER (hopefully in 2.6 release). WRT XFS, I am not sure if
anyone is actively looking at it.

Thanks,
Soumya



Hi Soumya,

I just started looking at delegation support in ganesha (mostly with an
eye toward plumbing in delegation support for Ceph).

I think we probably need to rework the whole delegation interface (maybe
even give it a dedicated FSAL op), and was wondering if you had started
any work along those lines.



Hi Jeff,

Yes. I started it, but so far have done only a POC, mainly to test the 
glusterfs lease support (which is an experimental feature). I could manage 
to get leases granted from and returned to the backend.


The only change I have made so far in the core ganesha layer is to 
uncomment the earlier delegation code path to get the request all the way 
to the FSAL [1]. I now intend to fix the SAL layer and am happy to 
collaborate with you. I agree that instead of clubbing it with lock, a 
dedicated fop would be better.



Thanks,
Soumya


[1] https://review.gerrithub.io/372998


Thanks,



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] DBUS patch

2017-07-24 Thread Soumya Koduri

Hi Supriti,

On 07/24/2017 01:07 PM, Supriti Singh wrote:

Hello all,

I have submitted a new patch https://review.gerrithub.io/#/c/370835/ to allow 
only root users to access the dbus. In the current dbus configuration, there 
are some security issues. For example, even a non-root user can call shutdown 
on a ganesha process started by root. The easiest way to fix this is to allow 
only root for now.


Apart from shutting down the ganesha process, are there any other security 
issues which you are aware of? We had one user report a security threat 
with using DBus a while ago [1], but they hadn't provided many details.




For 2.6, we can have a better solution. As I understood, the plan is to 
support non-root as well in the future. Maybe we can have a user group 
"ganesha" and allow only those users to have access.


+1

This is simple and similar to what Kaleb had suggested to make the ganesha 
process runnable by a non-root user. But yes, instead of tweaking the .conf 
file manually, if no one objects we can add it to the default configuration. 
A rough sketch of what the bus policy could look like is below.
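
For the interim root-only behaviour, the bus policy could look roughly like 
this (a sketch in standard D-Bus policy syntax; the exact rules are whatever 
the gerrit change above ends up with):

<busconfig>
  <!-- only root may own or talk to the ganesha service -->
  <policy user="root">
    <allow own="org.ganesha.nfsd"/>
    <allow send_destination="org.ganesha.nfsd"/>
  </policy>
  <policy context="default">
    <deny send_destination="org.ganesha.nfsd"/>
  </policy>
</busconfig>

A group-based variant would simply use <policy group="ganesha"> instead.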




The other solution would be to handle authorization in code, for example 
using the API dbus_bus_get_unix_user() 
[https://dbus.freedesktop.org/doc/api/html/group__DBusBus.html#ga24d782c710f3d82caf1b1ed582dcf474].
I have just started looking into it. Maybe this solution is intrusive and 
hard to maintain. I will research it a bit more.

Please let me know your thoughts.


I do not know how complicated it would be, but as long as there is a way 
for the ganesha service to get the credentials of the user executing the 
dbus command and compare them with this unix_user, it should be good IMO. 
Something along the lines of the sketch below.
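
In libdbus terms, that check could be as small as the sketch below (error 
handling trimmed; whether resolving the sender this way is the right hook is 
exactly what needs researching):

#include <dbus/dbus.h>

/* Sketch: resolve the unix uid of the connection that sent 'msg'
 * and allow the method call only for root. */
static dbus_bool_t caller_is_root(DBusConnection *conn, DBusMessage *msg)
{
	DBusError err;
	unsigned long uid;

	dbus_error_init(&err);
	uid = dbus_bus_get_unix_user(conn, dbus_message_get_sender(msg),
				     &err);
	if (dbus_error_is_set(&err)) {
		dbus_error_free(&err);
		return FALSE;	/* cannot authenticate the caller: deny */
	}
	return (uid == 0);
}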


Thanks,
Soumya

[1] http://seclists.org/oss-sec/2016/q4/349



Thanks,
Supriti

--
Supriti Singh SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham 
Norton,
HRB 21284 (AG Nürnberg)






--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel





Re: [Nfs-ganesha-devel] NFSv4 delegation in Ganesha

2017-05-19 Thread Soumya Koduri
Hi,

On 05/18/2017 08:04 PM, Yun-Chih Chen wrote:
> Hi, friends:
>
> I'm studying the effect of NFSv4 on Linux.  As far as I know, NFS
> implemented in Linux kernel supports read delegation but not write
> delegation.  I wonder if Ganesha implements write delegation?  If yes,
> can it take effect if accessed by Linux NFS client?
>
> I use the following config file to observe delegation in Ganesha:
>
> EXPORT {
> Export_Id = 77;
> Path = /e;
> Pseudo = /e;
> Access_Type = RW;
> FSAL {
> Name = XFS;
> }
>
> Delegations = readwrite;
> }
>
> NFSV4 {
> Delegations = true;
> }
>
> However, I did not see any trace of delegation under various workloads
> (example: repeated read [1], repeated write [2]) using tools like
> nfstrace or tcpdump.  When running in debug mode, I always got the
> following in the log file:
>
> ganesha.nfsd-8999[main] display_fsinfo :FSAL :DEBUG :  delegations = 0
>
> indicating that delegation was not on.
> Is there something wrong with my config file?  Or any other clue
> regarding NFS delegation?  Thanks ( ;

AFAIK except for FSAL_GPFS, other FSALs do not have support for 
delegations yet. Also after switching to new extended APIs 
(>=nfs-ganesha-2.4), delegations are disabled. It needs some additional 
work. We are planning to address it as part of adding this support for 
FSAL_GLUSTER (hopefully in 2.6 release). WRT XFS, I am not sure if 
anyone is actively looking at it.

Thanks,
Soumya

>
> Ganesha V2.5-rc6 running on Fedora 25, accessed by CentOS 7 client.
>
> [1] repeated read: http://codepad.org/6KUat5AI
> [2] repeated write: http://codepad.org/cuBZ3XuT
>
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Systemd service file

2017-05-05 Thread Soumya Koduri


On 05/05/2017 07:04 PM, Supriti Singh wrote:
> Hello,
>
> In the file scripts/systemd/nfs-ganesha.service the EnvironmentFile
> expected is named "ganesha". But the file in systemd/sysconfig is named
> "nfs-ganesha". Is it intentional?
>

Yes. That file gets generated at runtime by nfs-ganesha-config.service 
-> /usr/libexec/ganesha/nfs-ganesha-config.sh. Roughly, the wiring looks 
like the excerpt below.
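
i.e., roughly this (illustrative excerpt from memory; paths can differ per 
distro):

# nfs-ganesha.service (excerpt)
[Unit]
Requires=nfs-ganesha-config.service
After=nfs-ganesha-config.service

[Service]
# the "ganesha" environment file is generated at start-up
# by nfs-ganesha-config.sh
EnvironmentFile=-/run/sysconfig/ganesha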

Thanks,
Soumya

> Thanks,
> Supriti
>
> --
> Supriti Singh
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nürnberg)
>
>
>
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>
>
>
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Broken V4 mounts on GPFS

2017-04-13 Thread Soumya Koduri


On 04/13/2017 06:33 PM, Daniel Gryniewicz wrote:
> I was seeing this on FSAL_MEM too, and thanks for pointing out the
> cause.  It was because I wasn't setting attrs.supported (only
> attrs.valid_mask) so required attributes were not supported.  I've
> updated FSAL_MEM, and will try to generate a patch for other FSALs.
>

That's right. I have fixed it for Gluster too (gist sketched below):
  https://review.gerrithub.io/#/c/357077/
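
The gist of the fix is a change of this shape in the FSAL's 
attribute-filling path (illustrative only; the exact mask macro differs 
per FSAL):

/* sketch: advertise which attributes the FSAL supports, not just
 * which ones happen to be valid in this particular reply; without
 * this, required v4 attributes appear unsupported and the mount
 * is refused */
attrs->valid_mask = ATTRS_POSIX;
attrs->supported = ATTRS_POSIX;	/* this assignment was missing */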

Thanks,
Soumya
> Daniel
>
> On 04/13/2017 06:38 AM, Soumya Koduri wrote:
>> I see this issue even on GlusterFS. That seems to be the reason behind
>> the https://ci.centos.org//job/nfs_ganesha_cthon04/563/console test
>> failure since this commit got merged.
>> NFSv3 mounts are working fine, but the v4 mounts are unsuccessful. I do
>> not see any errors in the pkt trace. I will enable debugging and update.
>>
>> Thanks,
>> Soumya
>>
>> On 04/05/2017 09:02 PM, Frank Filz wrote:
>>> Hmm, it may be some issue with what attribute support is claimed.
>>>
>>> I would start with a tcpdump trace and see if you can determine which 
>>> actual operation is failing.
>>>
>>> Turning on FSAL debugging might help.
>>>
>>> Frank
>>>
>>>> -Original Message-
>>>> From: Swen Schillig [mailto:s...@vnet.ibm.com]
>>>> Sent: Wednesday, April 5, 2017 5:25 AM
>>>> To: Frank Filz <ffilz...@mindspring.com>
>>>> Cc: nfs-ganesha-devel@lists.sourceforge.net
>>>> Subject: Broken V4 mounts on GPFS
>>>>
>>>> Hi Frank
>>>>
>>>> while testing some of my patches I figured your commit
>>>>
>>>> 65599c645a81edbe8b953cc29ac29978671a11be
>>>>
>>>> broke the ability to mount V4 filesystems on GPFS (ERR_ACCESS).
>>>>
>>>> Haven't had time to look into it yet,
>>>> Hopefully you already have a good idea what it could be.
>>>> Otherwise I will look into it later.
>>>>
>>>> Cheers Swen.
>>>
>>>
>>> ---
>>> This email has been checked for viruses by Avast antivirus software.
>>> https://www.avast.com/antivirus
>>>
>>>
>>> --
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>> ___
>>> Nfs-ganesha-devel mailing list
>>> Nfs-ganesha-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>>>
>>
>> --
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> ___
>> Nfs-ganesha-devel mailing list
>> Nfs-ganesha-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>>
>
>
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Broken V4 mounts on GPFS

2017-04-13 Thread Soumya Koduri
I see this issue even on GlusterFS. That seems to be the reason behind 
the https://ci.centos.org//job/nfs_ganesha_cthon04/563/console test 
failure since this commit got merged.
NFSv3 mounts are working fine, but the v4 mounts are unsuccessful. I do 
not see any errors in the pkt trace. I will enable debugging and update.

Thanks,
Soumya

On 04/05/2017 09:02 PM, Frank Filz wrote:
> Hmm, it may be some issue with what attribute support is claimed.
>
> I would start with a tcpdump trace and see if you can determine which actual 
> operation is failing.
>
> Turning on FSAL debugging might help.
>
> Frank
>
>> -Original Message-
>> From: Swen Schillig [mailto:s...@vnet.ibm.com]
>> Sent: Wednesday, April 5, 2017 5:25 AM
>> To: Frank Filz 
>> Cc: nfs-ganesha-devel@lists.sourceforge.net
>> Subject: Broken V4 mounts on GPFS
>>
>> Hi Frank
>>
>> while testing some of my patches I figured your commit
>>
>> 65599c645a81edbe8b953cc29ac29978671a11be
>>
>> broke the ability to mount V4 filesystems on GPFS (ERR_ACCESS).
>>
>> Haven't had time to look into it yet,
>> Hopefully you already have a good idea what it could be.
>> Otherwise I will look into it later.
>>
>> Cheers Swen.
>
>
> ---
> This email has been checked for viruses by Avast antivirus software.
> https://www.avast.com/antivirus
>
>
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Reg dynamic update of Export options

2017-04-13 Thread Soumya Koduri


On 04/12/2017 09:15 PM, Frank Filz wrote:
>> I'd prefer option 1, at this point.  Saving config in multiple places is
> likely to
>> cause more of these issues.  If, at some point, we need option 2, we could
>> add it then.
>
> I agree. It should be easy for the Gluster code to use
> op_ctx_export_has_option instead of the NFSv4_ACL_SUPPORT macro.
>
> One thing to consider though is that when this option is changed
> dynamically, there are cached attributes which either do or do not have ACLs
> depending on which way the change goes. This will cause some unexpected
> permission checking and erroneous getattr(ACL) until all attributes have
> expired.
>
> BTW, config_samples/config.txt does document the non-changeability of this
> option...
>
> Note also that the FSAL block in an EXPORT is not processed on update. If
> any options in the FSAL block will need to be dynamically updateable, then
> option 2, update_export() will be necessary (and the FSAL will have to
> manage parsing the FSAL block and processing the changes in a sensible way).

Thank you all for your inputs. For now I have modified FSAL_GLUSTER to 
read this option value from op_ctx [1] (sketch below). We will evaluate 
and update if we need any options in the FSAL block to be dynamically 
changed as well.

Thanks,
Soumya

[1] https://review.gerrithub.io/#/c/356999/
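
i.e., instead of caching glfs_export->acl_enabled at create_export time, 
the FSAL now checks the live export options on each operation, roughly 
like this (sketch; the exact option flag name is from memory):

/* query the (possibly dynamically updated) export options rather
 * than a snapshot taken when the export was created */
if (!op_ctx_export_has_option(EXPORT_OPTION_DISABLE_ACL)) {
	/* ACLs are currently enabled on this export:
	 * fetch/set the ACL from the backend */
}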

>
> Frank
>
>> On 04/12/2017 08:32 AM, Soumya Koduri wrote:
>>> Hi Frank,
>>>
>>> We ran into an issue with using dynamic export options update using
>>> "UpdateExport" dbus signal. This dbus signal invokes
>>> update_export_param() which shall update all the related export
>>> parameters in memory dynamically. But there are few options which each
>>> FSAL may choose to store in its local structure variables (like for
>>> eg., for ACL I see FSAL_GLUSTER storing the option value as
>>> glfs_export->acl_enabled and FSAL_GPFS has use_acl). These local
>>> structure variables seem to be getting updated only while creating the
>>> export but do not get changed with dynamic options update.
>>>
>>> To address this, either
>>> * we could modify FSALs to not have a local copy but read these export
>>> parameters always from op_ctx->export  (or)
>>> * have a new FSAL api (maybe update_exports - which is no op by
>>> default) to be invoked as part of update_export_param() and FSALs can
>>> choose to extend it to update their local structure values.
>>>
>>> Kindly share your inputs.
>>>
>>> Thanks,
>>> Soumya
>>>
>>> --
>>>  Check out the vibrant tech community on one of the world's
>>> most engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>> ___
>>> Nfs-ganesha-devel mailing list
>>> Nfs-ganesha-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>>>
>>
>>
>>
> 
> --
>> Check out the vibrant tech community on one of the world's most engaging
>> tech sites, Slashdot.org! http://sdm.link/slashdot
>> ___
>> Nfs-ganesha-devel mailing list
>> Nfs-ganesha-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>
>
> ---
> This email has been checked for viruses by Avast antivirus software.
> https://www.avast.com/antivirus
>
>
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Last call for 2.4.4

2017-03-20 Thread Soumya Koduri
Hi Kaleb,

Will it be possible to consider https://review.gerrithub.io/#/c/353578/ 
(coverity fix) as well?

Thanks,
Soumya

On 03/20/2017 04:08 PM, Kaleb Keithley wrote:
>
> The following patches have been cherry-picked to the V2.4-stable branch.
>
> Last call for any others before I tag 2.4.4
>
> Thanks,
>
>>> Fix-Coverity-CID-155159-Deadlock
>>> commit a6636acb1448b3fa3330e5b7bd20afe7719e7a0f
>>> Change-Id: I92023e942e8f9ade894d57dac9c0d48dc351b5a8
>>>
>>> Reduce-and-mitigate-a-rename-readir-race-window
>>> commit d8dbbcd66958acdb456e511ebf878d1d75612ba4
>>> Change-Id: I9a7e706b25a8a2f1df54a982b2952aad7134a89d
>>>
>>> MDCACHE: only remove cached entry on unlink on success
>>> commit 35449b0ae0d9f1f60b882261df5c5a8f41a9796b
>>> Change-Id: Ifc5d2d3806979010fc38a1b3d6c8447308c155d9
>>>
>>> SAL - Keep a pointer to the obj owning the state
>>> commit ee8e090a35b8ce4b6e69db7014c64bd55264dae3
>>> Change-Id: Ia7c327bc694f7be21bc5a9802abfefc3f697d133
>>>
>>> MDCACHE - Validate ATTRs on READDIR
>>> commit ba68c98a195732d479b5f11a2413fa8b9c09b081
>>> Change-Id: I16c78ce9bd6a1068e989066734b68d083b35e933
>>>
>>> Remove PSEUDO directories on unexport
>>> commit 1fe6d7699b32633f8d6e71436af645cdf813d90d
>>> Change-Id: Id92e6d8809ac9dd49cdc405a2c6576fcc0c4642c
>>>
>>> Fix SSE4_2 compile path
>>> commit c8fcf794647e2a871b66ed9a4f7f815177794659
>>> Change-Id: Ieb7d3b8166fda68bc8e15aec4a281c0af1019438
>>>
>>> MDCACHE: rename needs to update parent pointer for directory
>>> commit e74b24d7ebec8d5e7608f892553badec2aa0bc47
>>> Change-Id: I6898002e570c09652b754438790b479ae4732f18
>>>
>>> FSAL_GLUSTER: Share gfapi objects across entries exporting same volume
>>> commit d9ebe114025fa3729b49456452197c01fc654965
>>> Change-Id: Ie28d6c4031b4ad3c8654a5b149ceb7a0a5e77060
>>>
>>> FSAL_GLUSTER: use of inet_ntoa(3)
>>> commit 836182b6eb82e6d893e9e44e31a132dbe3d79f20
>>> Change-Id: Ia0831149b412f226b491873d3efc0a8ecd0017d1
>
>
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Request: I'm looking for good file system test programs

2017-01-17 Thread Soumya Koduri


On 01/17/2017 03:10 AM, Kevin C. wrote:
> Frank asked: What sorts of tests?
>
> My answer:
>
> I am hoping to benefit from the experience of this group since this group has 
> spent much more time than I working with NFS-Ganesha (and perhaps other file 
> systems). I hope that this group has identified useful tests (besides 
> throughput tests). For example, a file data write test would not only confirm 
> that the write function did not report errors but would also read back 
> previously written data to verify that the contents remain correct. Reads 
> immediately after write are very likely to read back from cache so I expect a 
> good test program would allow read back validation of data at a later time 
> (e.g. after reboot). This group is probably more knowledgeable than I am 
> regarding the many possible file system operations: create, delete, open, 
> close, read, write, get attribute, set attribute, etc.

The IOzone and Bonnie(++) test suites have sometimes proved useful for 
catching I/O errors along with performance benchmarking. These test 
suites do quite extensive writes, reads, re-writes and re-reads. I am 
not sure whether the content is verified while doing reads by default, 
though (see below).
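
If read-back verification is needed, IOzone's diagnostic mode may be worth 
a try (illustrative invocation; please double-check the flags against the 
man page):

# -a: full automatic mode; -+d: diagnostic mode with data verification
iozone -a -+d -g 4G -f /mnt/test/iozone.tmp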

HTH,
Soumya

>
>
> On 01/16/2017 02:59 PM, Frank Filz wrote:
>>> I'm looking for file system operation/stability test programs.
>>>
>>>
>>> I'm most interested in test programs that do many file operations that are
>>> then verified rather than programs that concentrate on performance tests.
>> What sorts of tests? There is the pjd-fstest test suite that tests POSIX
>> compliance.
>>
>> Frank
>>
>>
>> ---
>> This email has been checked for viruses by Avast antivirus software.
>> https://www.avast.com/antivirus
>>
>>
>
>
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Understandings of ganesha-ha.sh

2016-11-07 Thread Soumya Koduri
Hi,

On 11/05/2016 12:29 AM, ML Wong wrote:
> I'd like to ask for some recommendations here.
>
> 1) For /usr/libexec/ganesha/ganesha-ha.sh: we have been taking
> advantage of pacemaker+corosync for some other services; however, we
> always run into the issue of losing the other resources we set up in
> the cluster when we run ganesha-ha.sh add/delete. I'd just like to know
> whether that's expected. I have already tried 2 different setups; they
> both give me the same result. I'd like to better understand whether I
> need to find a workaround for our environment.

Do you mean you are using pacemaker+corosync to manage services other 
than nfs-ganesha, and you lose those resources when a ganesha-ha.sh 
add/delete operation is performed? Could you please provide details 
about the resources affected?

>
> 2) Has anyone of you run into a scenario with a demand to only add
> capacity to the Gluster volume, without necessarily adding more nodes
> to the cluster? Meaning, continuing to add bricks to the existing
> Gluster volume, but without doing the ganesha-ha.sh --add node
> process? Is it a bad idea? Or do Ganesha and Gluster have to have the
> same number of "HA_CLUSTER_NODES" as the number of Gluster peers?

It's perfectly acceptable. The Ganesha cluster is a subset of the Gluster 
trusted storage pool, so you can increase the capacity of the gluster 
volume without needing to alter the nfs-ganesha cluster.

>
> 3) Has anyone on the list tried adding nodes into Ganesha without
> using ganesha-ha.sh, just by using "pcs"? Ganesha team, am I missing
> any other resources or constraints for new nodes?
> a) nfs_setup/mon/grace-clone
> b) [node]-cluster_ip-1,
> c) location constraint for each member-node with score priority
> d) location-nfs-grace-clone
> e) order constraints

I am not sure if anyone has tried that out. We strongly recommend using 
the ganesha-ha.sh script, as there could be additions/fixes which went 
into that script w.r.t. cluster configuration. You would have to 
double-check the resources and constraints every time you configure them 
manually. Do you see any difference between using the script and the 
manual configuration? (The commands below may help compare.)
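
To compare the two, dumping the full cluster configuration before and after 
should show any drift, e.g.:

pcs status --full        # resources, clones and their current state
pcs constraint --full    # location/order constraints with ids
pcs config               # complete configuration dump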

Thanks,
Soumya

>
> Thanks all,
> Melvin
> For gluster-user, i am not sure if this is the right list to post. Sorry
> for the spam.
>
>
> --
> Developer Access Program for Intel Xeon Phi Processors
> Access to Intel Xeon Phi processor-based developer platforms.
> With one year of Intel Parallel Studio XE.
> Training and support from Colfax.
> Order your platform today. http://sdm.link/xeonphi
>
>
>
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

--
Developer Access Program for Intel Xeon Phi Processors
Access to Intel Xeon Phi processor-based developer platforms.
With one year of Intel Parallel Studio XE.
Training and support from Colfax.
Order your platform today. http://sdm.link/xeonphi
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] [Patch] SAL: Ref leak in lock merge codepath (V2.3-stable) branch

2016-09-28 Thread Soumya Koduri


On 09/27/2016 09:56 PM, Frank Filz wrote:
>> We observed a ref leak of a cache-inode entry (in the V2.3-stable
>> codepath) in case there are multiple locks issued on that entry which
>> may get merged. Attached is the fix for it.
>>
>> Kindly review and merge it into V2.3-stable branch.
>
> Ok, I had to walk the code carefully...
>
> Looks good.
>
> Ah, and I just looked at the 2.4 code... In the case of state_lock's call 
> to merge_lock_entry, it checks whether the list is empty BEFORE calling 
> merge_lock_entry; if so, it takes the needed object reference to "pin" the 
> object.

The commit which Malahal pointed out in the ganltc repo takes a similar 
approach. We could probably take that patch instead for 2.3.

Thanks,
Soumya

>
> Another way to fix this would have been to add the new lock entry to the list 
> BEFORE calling merge_lock_entry... Then merge_lock_entry would NEVER empty 
> the list since the entry to be merged is already in the list (and 
> merge_lock_entry already skips it).
>
> Frank
>
>
> ---
> This email has been checked for viruses by Avast antivirus software.
> https://www.avast.com/antivirus
>

--
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


[Nfs-ganesha-devel] [Patch] SAL: Ref leak in lock merge codepath (V2.3-stable) branch

2016-09-27 Thread Soumya Koduri

Hi,

We observed a ref leak of a cache-inode entry (in the V2.3-stable 
codepath) in case there are multiple locks issued on that entry which 
may get merged. Attached is the fix for it.


Kindly review and merge it into the V2.3-stable branch.

Thanks,
Soumya
From df75e83d4cc2ba03a69564807e7a890967505e64 Mon Sep 17 00:00:00 2001
From: Soumya Koduri <skod...@redhat.com>
Date: Tue, 27 Sep 2016 14:45:41 +0530
Subject: [PATCH] SAL: Ref leak in lock merge codepath

In state_lock(), we pin the cache_entry before processing the
lock operation. This shall be unref'ed at the end of the
operation unless it is the first lock entry for that cache
entry.
But in case of overlapping locks, in merge_lock_entry(), once
we merge the locks, we remove the older one. This could lead
to empty lock list for that cache_entry, but with the earlier
ref not taken out resulting in a leak.
The fix is to unpin the cache entry if the lock entry removed
was the last one in its lock list.

Change-Id: I9ed9e9ad8115d1e3c5dcc28bcc8e733b94b82de2
Signed-off-by: Soumya Koduri <skod...@redhat.com>
---
 src/SAL/state_lock.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/SAL/state_lock.c b/src/SAL/state_lock.c
index 1910077..22564ed 100644
--- a/src/SAL/state_lock.c
+++ b/src/SAL/state_lock.c
@@ -948,6 +948,13 @@ static void merge_lock_entry(struct state_hdl *ostate,
 		LogEntry("Merged", lock_entry);
 		LogEntry("Merging removing", check_entry);
 		remove_from_locklist(check_entry);
+
+		/* if check_entry was the last lock entry, unpin
+		 * the cache entry */
+		if (glist_empty(&entry->object.file.lock_list)) {
+			cache_inode_dec_pin_ref(entry, false);
+			cache_inode_lru_unref(entry, LRU_FLAG_NONE);
+		}
 	}
 }
 
-- 
2.5.0

--
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Final throes of V2.4

2016-09-21 Thread Soumya Koduri


On 09/22/2016 04:02 AM, Frank Filz wrote:
>> I have pushed rc7 with Matt's c++ compile changes and one final patch from
>> Daniel G.
>>
>> Please have at it. I'd like to get as many FSAL's verified against rc7 by
>> 9:00 AM PDT Thursday. At that time, unless some major fire has erupted, I
>> will tag V2.4.0 and push that so Kaleb can get on with his work to include
>> V2.4.0.
>
> Hmm, centos-ci is showing a failure in the Cthon04 lock tests. I ran NFS v3
> lock tests and got a pass.

I ran the cthon04 tests (using FSAL_GLUSTER) on v3 and v4 mounts. They 
seem to pass. But if I run in a loop, sometimes (very much spurious - 
hit only once) ganesha process seems to crash. One of the bt seen is


(gdb) bt
#0  0x7f570da9fa98 in __GI_raise (sig=sig@entry=6)
 at ../sysdeps/unix/sysv/linux/raise.c:55
#1  0x7f570daa169a in __GI_abort () at abort.c:89
#2  0x7f570dae2e1a in __libc_message (do_abort=do_abort@entry=2,
 fmt=fmt@entry=0x7f570dbf5a00 "*** Error in `%s': %s: 0x%s ***\n")
 at ../sysdeps/posix/libc_fatal.c:175
#3  0x7f570dae91e4 in malloc_printerr (action=,
 str=0x7f570dbf5a48 "corrupted double-linked list (not small)",
 ptr=, ar_ptr=) at malloc.c:5000
#4  0x7f570daebd5a in _int_free (av=0x7f565020, p=,
 have_lock=0) at malloc.c:4008
#5  0x7f570daeebcc in __GI___libc_free (mem=) at 
malloc.c:2962
#6  0x0044ad9b in gsh_free_size (p=0x7f56500a3290, n=1360)
 at 
/home/guest/Documents/workspace/nfs-ganesha/src/include/abstract_mem.h:287
#7  0x7f570f4000b9 in mem_free (p=0x7f56500a3290, n=1360)
 at 
/home/guest/Documents/workspace/nfs-ganesha/src/libntirpc/ntirpc/rpc/types.h:208
#8  0x7f570f4006c3 in free_rpc_msg (msg=0x7f56500a3290)
 at 
/home/guest/Documents/workspace/nfs-ganesha/src/libntirpc/src/svc.c:254
#9  0x004e7ba5 in nfs_dupreq_rele (req=0x7f56500a0ad8,
 func=0x54a190 )
 at 
/home/guest/Documents/workspace/nfs-ganesha/src/RPCAL/nfs_dupreq.c:1257
#10 0x0044a50a in nfs_rpc_execute (reqdata=0x7f56500a0ab0)
 at 
/home/guest/Documents/workspace/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1405


and another one (I lost the core) , but it was 
mdcache_lru_get->mdcache_lru_clean -> fsal_close()->close() . In 
FSAL_GLUSTER()->file_close(), below assert was hit -

assert(obj_hdl->type == REGULAR_FILE);

The obj_hdl->type was a large number and did not have any of the defined 
macros value. I tried reproducing, but haven't hit it again (neither of 
the above crashes).

But I would like to check if the above check is valid in FSALs close() 
routines ? I see this check in vfs_close() as well. But I assume 
mdcache_lru_clean could be called on an obj_hdl of any type but not 
restricted to REGULAR_FILE. Could you please confirm?

Thanks,
Soumya


>
> If there is something broken here, we really should try to fix ASAP.
>
> Thanks
>
> Frank
>
>
>
> ---
> This email has been checked for viruses by Avast antivirus software.
> https://www.avast.com/antivirus
>
>
> --
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

--
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Blocking locks in FSALs

2016-09-13 Thread Soumya Koduri
Hi Frank,

On 09/13/2016 03:30 AM, Frank Filz wrote:
>> As I have dug into things, I realize blocking locks (used by NFS v3 NLM
>> clients) are broken for FSALs that don't support async blocking locks.
>>
>> It looks like libgfapi and libcephfs both support blocking locks (libgfapi
> with
>> F_SETLKW and libcephfs by passing the last param (sleep) to ceph_ll_setlk
> as
>> true). If this is the case, I propose adding pseudo-async blocking lock
> support
>> to SAL for when support_ex is true but lock_support_async_block is false.
>>
>> The idea is the following:
>>
>> There will be a configurable number of blocking lock threads.
>>
>> Any time a blocking lock is not able to be granted immediately (we will
> first
>> use a non-blocking lock request to the FSAL), and the FSAL doesn't support
>> async blocking locks, it gets put on a queue.
>>
>> The blocked lock threads will pull items off the queue and do a blocking
> lock
>> request to the FSAL. This will have to be cancelable somehow (that may be
>> an issue, I need thoughts from GLUSTER and CEPH for that, for FSAL_VFS,
>> fcntl can be interrupted by a SIGIO to the thread).
>>
>> Another single thread will poll any blocked locks not handled by an async
>> capable FSAL or a blocking lock thread. It will poll each lock at some
> modest
>> interval (with a non-blocking lock).
>>
>> This all is modeled in the multilock tool in ml_posix_client. I don't know
> if I
>> actually tested it, but it's also modeled for libcephfs with
> ml_ceph_client.
>>
>> I'd appreciate any input on this, most especially from Gluster folks.
>
> Ok, I've done some investigation and asking around...
>
> It turns out neither libgfapi nor libcephfs blocking locks are
> interruptible...

yes. AFAIK, the gfapi blocking lock request cannot be cancelled and gets 
blocked till its granted by glusterFS server.

>
> So with that, the only FSAL that would benefit from the blocking lock
> threads is FSAL_VFS.
>
> So at this point, it may not make sense to implement the blocking threads
> currently, and just use a polled model.

Do you mean similar to using upcall to poll and check if any of those 
blocked locks are granted like the way FSAL_GPFS handled async locks I 
believe?

Thanks,
Soumya

>
> Ultimately what we really need from libgfapi and libcephfs is a true async
> blocking lock...
>
> Frank
>
>
>
> ---
> This email has been checked for viruses by Avast antivirus software.
> https://www.avast.com/antivirus
>
>
> --
> What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
> patterns at an interface-level. Reveals which users, apps, and protocols are
> consuming the most bandwidth. Provides multi-vendor support for NetFlow,
> J-Flow, sFlow and other flows. Make informed decisions using capacity
> planning reports. http://sdm.link/zohodev2dev
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

--
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


[Nfs-ganesha-devel] md-cache (setattr2) optimization

2016-08-29 Thread Soumya Koduri
Hi Dan/Frank,

I was looking at mdcache code a bit for performance optimizations 
possible. I have a couple of queries. Please let me know your inputs -

* In ex setattr2(), we have support for FSAL to update the attributes in 
the obj handle. Can we make use of them and get away with getattrs() 
post that in mdcache_setattr2()? Or better can we update just the 
attributes with the values we are modifying via setattrs instead of 
trying to re-fetch the attributes?

I am planning to work on necessary changes. Please let me know if you 
see any issues with it.

* Also where do we invalidate/update parent directory(s)'s attributes 
post successful unlink/rename operations?

Thanks,
Soumya


--
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] MDC up call

2016-08-23 Thread Soumya Koduri
Hi Dan,

On 08/22/2016 05:40 PM, Daniel Gryniewicz wrote:
> Excellent Marc, thanks.
>
> Soumya, could you test this with the Gluster up-calls to make sure I
> didn't break them?

I hit below crash when tested with the latest patchset -

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f2ca4f45700 (LWP 6475)]
0x00523598 in mdc_up_invalidate (export=0x1b73f60, 
handle=0x7f2cb4001b08, flags=271)
 at 
/home/guest/Documents/workspace/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_up.c:55
55  key.fsal = export->sub_export->fsal;
(gdb) bt
#0  0x00523598 in mdc_up_invalidate (export=0x1b73f60, 
handle=0x7f2cb4001b08, flags=271)
 at 
/home/guest/Documents/workspace/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_up.c:55
#1  0x00436ea2 in queue_invalidate (ctx=0x7f2cb4001b60) at 
/home/guest/Documents/workspace/nfs-ganesha/src/FSAL_UP/fsal_up_async.c:81
#2  0x004fbcb7 in fridgethr_start_routine (arg=0x7f2cb4001b60) 
at /home/guest/Documents/workspace/nfs-ganesha/src/support/fridgethr.c:550
#3  0x7f2cdffc860a in start_thread (arg=0x7f2ca4f45700) at 
pthread_create.c:334
#4  0x7f2cdf8e1bbd in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb) p export
$1 = (struct fsal_export *) 0x1b73f60
(gdb) p export->sub_export
$2 = (struct fsal_export *) 0x0
(gdb)

Thanks,
Soumya
>
> Thanks,
> Daniel
>
> On Sun, Aug 21, 2016 at 8:07 PM, Marc Eshel  wrote:
>> This time it did work.
>> Marc.
>>
>>
>>
>> From:   Daniel Gryniewicz 
>> To: Marc Eshel/Almaden/IBM@IBMUS
>> Cc: Frank Filz , NFS Ganesha Developers
>> 
>> Date:   08/21/2016 02:45 PM
>> Subject:Re: MDC up call
>>
>>
>>
>> In general, MDCACHE assumes it has op_ctx set, and I'd prefer to not
>> have that assumption violated, as it will complicate the code a lot.
>>
>> It appears that the export passed into the upcalls is already the
>> MDCACHE export, not the sub-export.  I've uploaded a new version of
>> the patch with that change.  Coud you try it again?
>>
>> On Fri, Aug 19, 2016 at 4:56 PM, Marc Eshel  wrote:
>>> I am not sure you need to set op_ctx
>>> I fixed it for this path by not calling  mdc_check_mapping() from
>>> mdcache_find_keyed() if op_ctx is NULL
>>> I think the mapping should already exist for calls that are coming from
>>> up-call.
>>> Marc.
>>>
>>>
>>>
>>> From:   Daniel Gryniewicz 
>>> To: Marc Eshel/Almaden/IBM@IBMUS
>>> Cc: Frank Filz ,
>>> nfs-ganesha-devel@lists.sourceforge.net
>>> Date:   08/19/2016 06:13 AM
>>> Subject:Re: MDC up call
>>>
>>>
>>>
>>> Marc, could you try with this patch: https://review.gerrithub.io/287904
>>>
>>> Daniel
>>>
>>> On 08/18/2016 06:55 PM, Marc Eshel wrote:
 Was up-call with MDC tested?
 It looks like it is trying to use op_ctx which is NULL.
 Marc.


 Program received signal SIGSEGV, Segmentation fault.
 [Switching to Thread 0x7fe867fff700 (LWP 18907)]
 0x00532b76 in mdc_cur_export () at

>>>
>> /nas/ganesha/new-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_int.h:376
 376 return mdc_export(op_ctx->fsal_export);
 (gdb) where
 #0  0x00532b76 in mdc_cur_export () at

>>>
>> /nas/ganesha/new-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_int.h:376
 #1  0x005342a1 in mdc_check_mapping (entry=0x7fe870001530) at

>>>
>> /nas/ganesha/new-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:210
 #2  0x0053584c in mdcache_find_keyed (key=0x7fe867ffe470,
 entry=0x7fe867ffe468) at

>>>
>> /nas/ganesha/new-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:636
 #3  0x005358c1 in mdcache_locate_keyed (key=0x7fe867ffe470,
 export=0x12d8f40, entry=0x7fe867ffe468, attrs_out=0x0)
 at

>>>
>> /nas/ganesha/new-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:670
 #4  0x0052feac in mdcache_create_handle (exp_hdl=0x12d8f40,
 hdl_desc=0x7fe880001088, handle=0x7fe867ffe4e8, attrs_out=0x0)
 at

>>>
>> /nas/ganesha/new-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1629
 #5  0x00433f36 in lock_avail (export=0x12d8f40,
 file=0x7fe880001088, owner=0x7fe87c302dc0, lock_param=0x7fe8800010a0)
>> at
 /nas/ganesha/new-ganesha/src/FSAL_UP/fsal_up_top.c:172
 #6  0x00438142 in queue_lock_avail (ctx=0x7fe880001100) at
 /nas/ganesha/new-ganesha/src/FSAL_UP/fsal_up_async.c:243
 #7  0x0050156f in fridgethr_start_routine (arg=0x7fe880001100)
>>> at
 /nas/ganesha/new-ganesha/src/support/fridgethr.c:550
 #8  0x7fea288a0df3 in start_thread (arg=0x7fe867fff700) at
 pthread_create.c:308
 #9  0x7fea27f603dd in clone () at
 

Re: [Nfs-ganesha-devel] Addtional parameters that might be interesting to dynamic update

2016-08-10 Thread Soumya Koduri


On 08/11/2016 11:08 AM, Soumya Koduri wrote:
>
> On 08/11/2016 01:23 AM, Frank Filz wrote:
>> Having vanquished (for the most part) dynamic export update, and the ease of
>> doing so, I have started to think about what other config parameters would
>> be useful to be able to dynamically update.
>
> How about LOG {} parameters? Changing log level without the need of
> restarting server could really help us in debugging.

Thanks to Malahal. There seems to be already a DBus interface [1] to 
change the log level dynamically. Please ignore this request.

Thanks,
Soumya

[1] 
https://github.com/nfs-ganesha/nfs-ganesha/wiki/Dbusinterface#orgganeshanfsdlog

>
>>
>> Please read over this and give feedback.
>>
>> Thanks
>>
>> Frank
>>
>> NFS_CORE_PARAM:
>>
>> All the ports and such probably aren't a good idea to dynamically update.
>>
>> Nb_Worker would certainly be useful to be able to change.
>>
>> Drop_.*_Errors, should be easy to update, someone might want that.
>>
>> DRC options are probably not good to tweak dynamically?
>>
>> RPC options are probably not good to tweak dynamically?
>>
>> NFS_IP_NAME
>>
>> Changing the expiration time probably is ok to change.
>>
>> NFS_KRB5
>>
>> I don't think any of these are candidates for dynamic update.
>>
>> NFSV4
>>
>> Changing any of these options will probably wreak havoc, though playing with
>> the numeric owners options dynamically is probably not too horrid.
>>
>> EXPORT { FSAL { } }
>>
>> I didn't explore any of these with dynamic export update. FSAL_VFS allows
>> configuring the type of fsid used for the filesystem, changing that
>> dynamically is a bad idea since it changes the format of file handles.
>>
>> CACHE_INODE
>>
>> Most of these should be updateable. NPart would not be changeable. I'm not
>> sure if any others would be problematical. I wonder if some of them are no
>> longer used.
>>
> Not related to this topic. But I see that though md-cache layer is tied
> to CACHE_INODE parameters, it doesn't read those options atm. We may
> need that support to be able to use some of the options ( like
> cache-size, attr_expiration_time etc) and even to dynamically update
> them right?
>
> Thanks,
> Soumya
>
>
>> 9P
>>
>> These don't look like good candidates for dynamic update.
>>
>> CEPH
>>
>> Changing the config path for libcephfs won't accomplish anything
>>
>> GPFS
>>
>> I'm not sure some of these should even be config variables, not sure if any
>> make sense for dynamic update
>>
>> RGW
>>
>> These need consideration, probably not candidates for dynamic update
>>
>> VFS/XFS
>>
>> Some of the same questionable options as GPFS
>>
>> ZFS
>>
>> Same options as VFS/XFS
>>
>> PROXY
>>
>> Not worth making dynamically updateable until we really make this thing
>> work...
>>
>>
>>
>>
>> ---
>> This email has been checked for viruses by Avast antivirus software.
>> https://www.avast.com/antivirus
>>
>>
>> --
>> What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
>> patterns at an interface-level. Reveals which users, apps, and protocols are
>> consuming the most bandwidth. Provides multi-vendor support for NetFlow,
>> J-Flow, sFlow and other flows. Make informed decisions using capacity
>> planning reports. http://sdm.link/zohodev2dev
>> ___
>> Nfs-ganesha-devel mailing list
>> Nfs-ganesha-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>>
>
> --
> What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
> patterns at an interface-level. Reveals which users, apps, and protocols are
> consuming the most bandwidth. Provides multi-vendor support for NetFlow,
> J-Flow, sFlow and other flows. Make informed decisions using capacity
> planning reports. http://sdm.link/zohodev2dev
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

--
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are 
consuming the most bandwidth. Provides multi-vendor support for NetFlow, 
J-Flow, sFlow and other flows. Make informed decisions using capacity 
planning reports. http://sdm.link/zohodev2dev
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] [Gluster-users] NFS-Ganesha lo traffic

2016-08-09 Thread Soumya Koduri


On 08/09/2016 09:06 PM, Mahdi Adnan wrote:
> Hi,
> Thank you for your reply.
>
> The traffic is related to GlusterFS;
>
> 18:31:20.419056 IP 192.168.208.134.49058 > 192.168.208.134.49153: Flags
> [.], ack 3876, win 24576, options [nop,nop,TS val 247718812 ecr
> 247718772], length 0
> 18:31:20.419080 IP 192.168.208.134.49056 > 192.168.208.134.49154: Flags
> [.], ack 11625, win 24576, options [nop,nop,TS val 247718812 ecr
> 247718772], length 0
> 18:31:20.419084 IP 192.168.208.134.49060 > 192.168.208.134.49152: Flags
> [.], ack 9861, win 24576, options [nop,nop,TS val 247718812 ecr
> 247718772], length 0
> 18:31:20.419088 IP 192.168.208.134.49054 > 192.168.208.134.49155: Flags
> [.], ack 4393, win 24568, options [nop,nop,TS val 247718812 ecr
> 247718772], length 0
> 18:31:20.420084 IP 192.168.208.134.49052 > 192.168.208.134.49156: Flags
> [.], ack 5525, win 24576, options [nop,nop,TS val 247718813 ecr
> 247718773], length 0
> 18:31:20.420092 IP 192.168.208.134.49049 > 192.168.208.134.49158: Flags
> [.], ack 6657, win 24576, options [nop,nop,TS val 247718813 ecr
> 247718773], length 0
> 18:31:20.421065 IP 192.168.208.134.49050 > 192.168.208.134.49157: Flags
> [.], ack 4729, win 24570, options [nop,nop,TS val 247718814 ecr
> 247718774], length 0
>

Looks like that is the traffic coming to the bricks local to that node 
(>4915* ports are used by glusterfs brick processes). It could be from 
nfs-ganesha or any other glusterfs client processes (like self-heal 
daemon etc). Do you see this traffic even when there is no active I/O 
from the nfs-client? If so, it could be from the self-heal daemon then. 
Verify if there are any files/directories to be healed.

> Screenshot from wireshark can be found in the attachments.
> 208.134 is the server IP address, and it's looks like it talking to
> itself via the lo interface, im wondering if this is a normal behavior
> or not.
yes. It is the expected behavior when there are clients actively 
accessing the volumes.

> and regarding the Ganesha server logs, how can i debug it to find why
> the servers not responding to the requests on time ?

I suggest again to take tcpdump. Sometimes nfs-ganesha server (glusterfs 
client) may have to communicate with all the bricks over the network 
(like LOOKUP) and that may result in delay if there are lots of bricks 
involved. Try capturing packets from the node where the nfs-ganesha 
server is running and examine the packets between any of the NFS-client 
request and its corresponding reply packet.

I usually use below cmd to capture the packets on all the interfaces -
#tcpdump -i any -s 0 -w /var/tmp/nfs.pcap tcp and not port 22

Thanks,
Soumya
>
>
> --
>
> Respectfully*
> **Mahdi A. Mahdi*
>
>
>
>> Subject: Re: [Gluster-users] NFS-Ganesha lo traffic
>> To: mahdi.ad...@outlook.com
>> From: skod...@redhat.com
>> CC: gluster-us...@gluster.org; nfs-ganesha-devel@lists.sourceforge.net
>> Date: Tue, 9 Aug 2016 18:02:01 +0530
>>
>>
>>
>> On 08/09/2016 03:33 PM, Mahdi Adnan wrote:
>> > Hi,
>> >
>> > Im using NFS-Ganesha to access my volume, it's working fine for now but
>> > im seeing lots of traffic on the Loopback interface, in fact it's the
>> > same amount of traffic on the bonding interface, can anyone please
>> > explain to me why is this happening ?
>>
>> Could you please capture packets on those interfaces using tcpdump and
>> examine the traffic?
>>
>> > also, i got the following error in the ganesha log file;
>> >
>> > 09/08/2016 11:35:54 : epoch 57a5da0c : gfs04 :
>> > ganesha.nfsd-1646[dbus_heartbeat] dbus_heartbeat_cb :DBUS :WARN :Health
>> > status is unhealthy. Not sending heartbeat
>> > 09/08/2016 11:46:04 : epoch 57a5da0c : gfs04 :
>> > ganesha.nfsd-1646[dbus_heartbeat] dbus_heartbeat_cb :DBUS :WARN :Health
>> > status is unhealthy. Not sending heartbeat
>> > 09/08/2016 11:54:39 : epoch 57a5da0c : gfs04 :
>> > ganesha.nfsd-1646[dbus_heartbeat] dbus_heartbeat_cb :DBUS :WARN :Health
>> > status is unhealthy. Not sending heartbeat
>> > 09/08/2016 12:06:04 : epoch 57a5da0c : gfs04 :
>> > ganesha.nfsd-1646[dbus_heartbeat] dbus_heartbeat_cb :DBUS :WARN :Health
>> > status is unhealthy. Not sending heartbeat
>> >
>> > is it something i should care about ?
>>
>> Above warnings are thrown when the outstanding rpc request queue count
>> doesn't change within two heartbeats, in other words the server may be
>> taking a while to process the requests and responding slowly to its
> clients.
>>
>> Thanks,
>> Soumya
>>
>> >
>> > My ganesha config is the following;
>> >
>> >
>> > EXPORT{
>> > Export_Id = 1 ;
>> > Path = "/vlm02";
>> >
>> > FSAL {
>> > name = GLUSTER;
>> > hostname = "gfs04";
>> > volume = "vlm02";
>> > }
>> >
>> > Access_type = RW;
>> > Disable_ACL = TRUE;
>> > Squash = No_root_squash;
>> > Protocols = "3" ;
>> > Transports = "TCP";
>> > }
>> >
>> >
>> > Im accessing it via a floating ip assigned by CTDB.
>> >
>> >
>> > Thank you.
>> > --
>> >
>> > Respectfully*
>> > **Mahdi A. Mahdi*
>> >
>> >
>> >
>> > 

Re: [Nfs-ganesha-devel] ESTALE on 'find .'

2016-07-29 Thread Soumya Koduri
> fsid was broken in the attribute-copy changes. I'll submit a potential
> fix for Frank to look at.

Thanks Dan.

James,
WRT to other SEGV with upcall issue, probably it is related to a 
potential memory corruption in upcall processing in FSAL_GLUSTER when 
using glusterFS master/3.8 branch. This is being fixed as part of [1]. 
Till it gets fixed, please turn off upcalls.

Thanks,
Soumya


[1] http://review.gluster.org/#/c/14701/

--
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] FSAL locks implementation

2016-07-28 Thread Soumya Koduri
Thanks again Frank for the clarification and the details.

Soumya.

On 07/27/2016 09:24 PM, Frank Filz wrote:
>>> That is why we have the release_IP dBus command, it signals the

--
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] FSAL locks implementation

2016-07-27 Thread Soumya Koduri
>
> That is why we have the release_IP dBus command, it signals the Ganesha that
> is giving up clients to drop their locks.

Thanks a lot Frank. Right now in our HA solution, we do not send any 
event. So by default I guess it takes "EVENT_TAKE_IP". Will try out 
"EVENT_RELEASE_IP".

So IIUC, the only difference between "EVENT_TAKE_IP" and 
"EVENT_RELEASEIP" is that TAKE_IP just loads nfsv4 clients in memory to 
validate them during recovery where as release IP shall expire all the 
existing clients of that IP (which involves releasing open and lock 
state) and then loads them in memory for recovery validation?

And what about NLM locks?


 else {
 nfs_release_nlm_state(gsp->ipaddr);
 if (gsp->event == EVENT_RELEASE_IP)
 nfs_release_v4_client(gsp->ipaddr);
 else
 nfs4_load_recov_clids_nolock(gsp);
 }
<

"nfs_release_nlm_state" seems to be notifying the NLM clients but I do 
not see any calls to flush the granted locks at the back-end as well?

Thanks,
Soumya



>
> Frank
>
>
> ---
> This email has been checked for viruses by Avast antivirus software.
> https://www.avast.com/antivirus
>

--
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are 
consuming the most bandwidth. Provides multi-vendor support for NetFlow, 
J-Flow, sFlow and other flows. Make informed decisions using capacity planning
reports.http://sdm.link/zohodev2dev
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] find causes sigsegv in fridgethr.c (2.4 dev)

2016-07-26 Thread Soumya Koduri


On 07/26/2016 10:57 PM, James Rose wrote:
> Hi Soumya,
>
> I have tested again with features.cache-invalidation disabled.
>
> This does improve the situation.  The find now succeeds while the rsync
> is in progress.

Okay. Could you please print fe->ctx and check if indeed gone stale. 
There had been few changes to upcall processing code-path.
@Dan, do you suspect any?

>
> There could be another issue too.  I have yet to create a reliable test
> to trigger this but I often receive stale file handles in situations
> where I would not expect to.  Almost always after a server restart where
> client does not umount and sometimes after data has been updated by
> another client.  I'll send a new message to the lists when I have some
> repeatable examples.

I see this issue too. Will look into it.

Thanks,
Soumya

>
> Thanks
>
> James.
>
>
>
> On 26 July 2016 at 13:55, Soumya Koduri <skod...@redhat.com
> <mailto:skod...@redhat.com>> wrote:
>
>
> Strange. I have been testing nfs-ganesha+gluster (master branch) on
> fedora. Haven't hit this issue. From both the bt looks like may be
> fe->ctx has gone bad. Could you please verify that in gdb?
>
> Also from the test case, since there are two clients involved, there
> could have been upcalls generated. Do you have the upcall
> (features.cache-invalidation) on for the gluster volume. If yes,
> could you turn it off and re-test it.
>
> Thanks,
> Soumya
>
>
> Dev 24 and 26 SIGSEGV when an nfs v4 client runs find on the
> test data
> set.  This is not triggered in dev 21.
>
> Test system is CentOS7.2
> Gluster 3.8.1-1.el7
> libntirpc-1.4.0-0.2pre2
>
> Clients are SL6.7
>
> To reproduce.
>
> Copy data to the cluster (SL6.7 nfs v4 client).
>
> from another client attempt to run find on the data.
>
>
> GDB backtrace for dev 24:
>
> (gdb) continue
> Continuing.
> [New Thread 0x7f8ad0ddd700 (LWP 4580)]
> [New Thread 0x7f8ad05dc700 (LWP 4581)]
> [New Thread 0x7f8acfddb700 (LWP 4582)]
> [New Thread 0x7f8acf5da700 (LWP 4583)]
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7f8acf5da700 (LWP 4583)]
> 0x in ?? ()
> (gdb) backtrace
> #0  0x in ?? ()
> #1  0x004f921b in fridgethr_start_routine
> (arg=0x7f8ae68ada80)
> at
> 
> /usr/src/debug/nfs-ganesha-2.4-dev-24-0.1.1-Source/support/fridgethr.c:550
> #2  0x7f8b015e4dc5 in start_thread (arg=0x7f8acf5da700) at
> pthread_create.c:308
> #3  0x7f8b00ec7ced in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
> (gdb) exit
>
>
> GDB bactrace for dev 26
>
> (gdb) continue
> Continuing.
> [New Thread 0x7fbdc4fff700 (LWP 4323)]
> [New Thread 0x7fbdf39e9700 (LWP 4324)]
> [New Thread 0x7fbdf0445700 (LWP 4325)]
> [New Thread 0x7fbdc07ff700 (LWP 4327)]
> [New Thread 0x7fbdbfffe700 (LWP 4328)]
> [New Thread 0x7fbdbf7fd700 (LWP 4329)]
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7fbdc07ff700 (LWP 4327)]
> pthread_cond_wait@@GLIBC_2.3.2 () at
> ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:134
> 134movlMUTEX_KIND(%r8), %eax
> (gdb) backtrace
> #0  pthread_cond_wait@@GLIBC_2.3.2 () at
> ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:134
> #1  0x7fbdf3cbfb0d in fridgethr_freeze (thr_ctx=0x7fbdd70e6300,
> fr=0x7fbdefc70c00) at
> /usr/src/debug/nfs-ganesha-2.4-dev-26/src/support/fridgethr.c:420
> #2  fridgethr_start_routine (arg=0x7fbdd70e6300) at
> /usr/src/debug/nfs-ganesha-2.4-dev-26/src/support/fridgethr.c:554
> #3  0x7fbdf1f65dc5 in start_thread (arg=0x7fbdc07ff700) at
> pthread_create.c:308
> #4  0x7fbdf1632ced in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
> (gdb) quit
>
>
> Thanks
>
> James
>
>
> 
> --
> What NetFlow Analyzer can do for you? Monitors network bandwidth
> and traffic
> patterns at an interface-level. Reveals which users, apps, and
> protocols are
> consuming the most bandwidth. Provides multi-vendor support for
>  

Re: [Nfs-ganesha-devel] Assertion `item->type == CONFIG_BLOCK' failed. when

2016-07-26 Thread Soumya Koduri


On 07/26/2016 01:26 AM, Ketan Dixit wrote:
> Log snippet of the failure for reference.
>
> On Mon, Jul 25, 2016 at 11:46 AM, Ketan Dixit  > wrote:
>
> Hello,
>
>
> I compiled nfs-ganesha source code  on ubuntu machine. (from the
> Next branch pulled on July 22 2016)
>
> I am hitting an error when starting the nfs-ganesha service. Logs
> are getting flooded with this message.
>  *ganesha.nfsd:
> /home/ubuntu/nfs-ganesha/src/config_parsing/config_parsing.c:1288:
> proc_block: Assertion `item->type == CONFIG_BLOCK' failed.*
>
> On turning on Verbose logging, I observe that the error occurs just
> after printing fsinfo in the FSAL component.
>  *ganesha.nfsd-1582[main] display_fsinfo :FSAL :DEBUG
> :FileSystem info: {
> *
> *  .*
> *  .*
> *  .*
> * ganesha.nfsd-1582[main] display_fsinfo :FSAL :DEBUG :}*
>
>
> I do not see anything obviously wrong with the config file. Any
> inputs will be appreciated. How do I debug this issue?
>

I am running Gluster(on fedora though) and the same config worked for me 
as well . What are the cmake options used? Maybe gdb could help?

Thanks,
Soumya

--
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are 
consuming the most bandwidth. Provides multi-vendor support for NetFlow, 
J-Flow, sFlow and other flows. Make informed decisions using capacity planning
reports.http://sdm.link/zohodev2dev
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] How do I prevent nfs-ganesha from keeping file handles open?

2016-07-21 Thread Soumya Koduri


On 07/21/2016 11:50 AM, steve landiss wrote:
> Yes, that worked.  I think that should be reflected in the docs.  It
> took me a bit of time to figure that out.
>

Done. I haven't looked at rest of the contents of that page. But updated 
this particular step for now.

Thanks,
Soumya

> Thanks
> Steve
>
>
> On Wednesday, July 20, 2016 10:45 PM, Soumya Koduri <skod...@redhat.com>
> wrote:
>
>
>
>
> On 07/20/2016 11:09 PM, steve landiss wrote:
>> I am following the instructions
>> at
> https://github.com/nfs-ganesha/nfs-ganesha/wiki/Dbusinterface#Configuring_the_System_DBus_Service
>>
>>
>> Ater copying the org.ganesha.nfsd.conf to /etc/dbus-1/system.d/ I get
>> the following error:
>>
>> "Error org.freedesktop.DBus.Error.ServiceUnknown: The name
>> org.ganesha.nfsd was not provided by any .service files"
>
> You may have to restart the "messagebus" service to load this file.
> Could you please try that?
>
> Thanks,
> Soumya
>
>>
>> Looks like I need a service file in /usr/share/dbus-1/services/ ?
>>
>> If so, where can I get a sample file?
>>
>> Steve
>>
>>
>>
>> On Wednesday, July 20, 2016 10:27 AM, Frank Filz
>> <ffilz...@mindspring.com <mailto:ffilz...@mindspring.com>> wrote:
>>
>>
>> Yes, dbus add/remove export should work for all FSALs.
>>
>> Frank
>>
>> *From:*steve landiss [mailto:steve.land...@yahoo.com
> <mailto:steve.land...@yahoo.com>]
>> *Sent:* Wednesday, July 20, 2016 10:07 AM
>> *To:* Jiffin Tony Thottan <jthot...@redhat.com
> <mailto:jthot...@redhat.com>>; Frank Filz
>> <ffilz...@mindspring.com <mailto:ffilz...@mindspring.com>>;
> nfs-ganesha-devel@lists.sourceforge.net
> <mailto:nfs-ganesha-devel@lists.sourceforge.net>;
>> nfs-ganesha-supp...@lists.sourceforge.net
> <mailto:nfs-ganesha-supp...@lists.sourceforge.net>
>> *Subject:* Re: [Nfs-ganesha-devel] How do I prevent nfs-ganesha from
>> keeping file handles open?
>>
>> Will this work for VFS as well?  That is, I just want to add and remove
>> local EBS Luns without restarting ganesha.  I am running ganesha in AWS,
>> and I dynamically attach and detach EBS volumes.  I want these to be
>> exported and unexported.  I am using the VFS FSAL.  So will dbus work in
>> this scenario?
>>
>> Thanks
>> Steve
>>
>> On Wednesday, July 20, 2016 12:15 AM, Jiffin Tony Thottan
>> <jthot...@redhat.com <mailto:jthot...@redhat.com>
> <mailto:jthot...@redhat.com <mailto:jthot...@redhat.com>>> wrote:
>>
>>
>>
>> On 19/07/16 23:38, steve landiss wrote:
>>
>>Got it.  Any examples on how I can use dbus to remove my export?
>>
>>
>> These two commands used in gluster[1] for add/remove export dynamically
>>
>> Remove export
>> #dbus-send --print-reply --system --dest=org.ganesha.nfsd
>> /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport
>> uint16:
>>
>> #dbus-send  --print-reply --system --dest=org.ganesha.nfsd
>> /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.AddExport
>> string:
>> string:"EXPORT(Path=/)"
>>
>> Just make sure file org.ganesha.nfsd.conf is present in
>> /etc/dbus-1/system.d/ (for fedora/centos etc) before running
>> this command
>>
>> [1]
>>
> https://github.com/gluster/glusterfs/blob/master/extras/ganesha/scripts/dbus-send.sh
>> --
>> Regards
>> Jiffin
>>
>>
>>
>>A related question - how do I add a new filesystem dynamically
>>without bouncing ganesha?
>>
>>On Tuesday, July 19, 2016 11:02 AM, Frank Filz
>><ffilz...@mindspring.com <mailto:ffilz...@mindspring.com>>
> <mailto:ffilz...@mindspring.com <mailto:ffilz...@mindspring.com>> wrote:
>>
>>You should not unmount a filesystem Ganesha is actively exporting…
>>
>>Using dbus to remove your export will close all open files.
>>
>>Frank
>>
>>*From:*steve landiss [mailto:steve.land...@yahoo.com
> <mailto:steve.land...@yahoo.com>]
>>*Sent:* Tuesday, July 19, 2016 10:53 AM
>>*To:* nfs-ganesha-devel@lists.sourceforge.net
> <mailto:nfs-ganesha-devel@lists.sourceforge.net>
>><mailto:nfs-ganesha-devel@lists.sourceforge.net
> <mailto:nfs-ganesha-devel@lists.sourceforge.net>>;
>>nfs-ganesha-supp...@lists.sourceforge.net
> <mailto:nfs-ganesha-supp...@lists.sourceforge.net>
>>  

Re: [Nfs-ganesha-devel] How do I prevent nfs-ganesha from keeping file handles open?

2016-07-20 Thread Soumya Koduri


On 07/20/2016 11:09 PM, steve landiss wrote:
> I am following the instructions
> at 
> https://github.com/nfs-ganesha/nfs-ganesha/wiki/Dbusinterface#Configuring_the_System_DBus_Service
>
>
> Ater copying the org.ganesha.nfsd.conf to /etc/dbus-1/system.d/ I get
> the following error:
>
> "Error org.freedesktop.DBus.Error.ServiceUnknown: The name
> org.ganesha.nfsd was not provided by any .service files"

You may have to restart the "messagebus" service to load this file. 
Could you please try that?

Thanks,
Soumya

>
> Looks like I need a service file in /usr/share/dbus-1/services/ ?
>
> If so, where can I get a sample file?
>
> Steve
>
>
>
> On Wednesday, July 20, 2016 10:27 AM, Frank Filz
>  wrote:
>
>
> Yes, dbus add/remove export should work for all FSALs.
>
> Frank
>
> *From:*steve landiss [mailto:steve.land...@yahoo.com]
> *Sent:* Wednesday, July 20, 2016 10:07 AM
> *To:* Jiffin Tony Thottan ; Frank Filz
> ; nfs-ganesha-devel@lists.sourceforge.net;
> nfs-ganesha-supp...@lists.sourceforge.net
> *Subject:* Re: [Nfs-ganesha-devel] How do I prevent nfs-ganesha from
> keeping file handles open?
>
> Will this work for VFS as well?  That is, I just want to add and remove
> local EBS Luns without restarting ganesha.  I am running ganesha in AWS,
> and I dynamically attach and detach EBS volumes.  I want these to be
> exported and unexported.  I am using the VFS FSAL.  So will dbus work in
> this scenario?
>
> Thanks
> Steve
>
> On Wednesday, July 20, 2016 12:15 AM, Jiffin Tony Thottan
> > wrote:
>
>
>
> On 19/07/16 23:38, steve landiss wrote:
>
> Got it.  Any examples on how I can use dbus to remove my export?
>
>
> These two commands used in gluster[1] for add/remove export dynamically
>
> Remove export
> #dbus-send --print-reply --system --dest=org.ganesha.nfsd
> /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport
> uint16:
>
> #dbus-send  --print-reply --system --dest=org.ganesha.nfsd
> /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.AddExport
> string:
> string:"EXPORT(Path=/)"
>
> Just make sure file org.ganesha.nfsd.conf is present in
> /etc/dbus-1/system.d/ (for fedora/centos etc) before running
> this command
>
> [1]
> https://github.com/gluster/glusterfs/blob/master/extras/ganesha/scripts/dbus-send.sh
> --
> Regards
> Jiffin
>
>
>
> A related question - how do I add a new filesystem dynamically
> without bouncing ganesha?
>
> On Tuesday, July 19, 2016 11:02 AM, Frank Filz
>   wrote:
>
> You should not unmount a filesystem Ganesha is actively exporting…
>
> Using dbus to remove your export will close all open files.
>
> Frank
>
> *From:*steve landiss [mailto:steve.land...@yahoo.com]
> *Sent:* Tuesday, July 19, 2016 10:53 AM
> *To:* nfs-ganesha-devel@lists.sourceforge.net
> ;
> nfs-ganesha-supp...@lists.sourceforge.net
> 
> *Subject:* [Nfs-ganesha-devel] How do I prevent nfs-ganesha from
> keeping file handles open?
>
> nfs-ganesha keeps file handles open after an operation.  This
> prevents me from unmounting an exported filesystem while nfs-ganesha
> is running.  How do I prevent ganesha from doing this?
>
> Steve
>
> 
> Avast logo
> 
> 
>   
> This email has been checked for viruses by Avast antivirus software.
> www.avast.com
> 
> 
>
>
>
>
>
> 
> --
>
> What NetFlow Analyzer can do for you? Monitors network bandwidth and
> traffic
>
> patterns at an interface-level. Reveals which users, apps, and
> protocols are
>
> consuming the most bandwidth. Provides multi-vendor support for
> NetFlow,
>
> J-Flow, sFlow and other flows. Make informed decisions using
> capacity planning
>
> reports.http://sdm.link/zohodev2dev
>
>
>
> ___
>
> Nfs-ganesha-devel mailing list
>
> Nfs-ganesha-devel@lists.sourceforge.net
> 
>
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>
>
>
>
>
>
>
> 
> Avast logo
> 
>   
> This email has been checked for viruses by Avast antivirus software.
> www.avast.com
> 

Re: [Nfs-ganesha-devel] Coverity issue 150368

2016-07-19 Thread Soumya Koduri
Hi Dan,

On 07/18/2016 11:55 PM, Daniel Gryniewicz wrote:
> Hi, Soumya
>
> Coverity caught an issue (150368) with glusterfs_setttr2() that looks
> legitimate.  The problem is that the error cases all set status, and
> then goto out. However, the first thing done after the out label is to
> overwrite status from retval.  It looks like either everything should be
> using status, or everything should be using retval.  This may mask error
> returns in this case, I believe.
>

Right. Thanks for reporting. Have submitted the fix for review-
  https://review.gerrithub.io/#/c/284432/

Thanks,
Soumya

> Thanks,
> Daniel

--
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are 
consuming the most bandwidth. Provides multi-vendor support for NetFlow, 
J-Flow, sFlow and other flows. Make informed decisions using capacity planning
reports.http://sdm.link/zohodev2dev
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] My big patch set I believe will be ready to merge this week

2016-07-15 Thread Soumya Koduri
Hi Frank,

On 07/15/2016 04:29 AM, Frank Filz wrote:
> I think the saga of support_ex, FSAL_MDCACHE, and copy attributes out
> instead of having them in the fsal_obj_handle is finally coming to a close.
>
> Soumya, if you can merge on top of my patch set, and examine any patches
> that change FSAL_VFS for any further changes you might need to pay attention
> to for FSAL_GLUSTER support_ex, I probably could merge your patches also.

I have re-based my patch set on top of yours and included attribute 
changes in open2, fetch_attrs (to the best of my knowledge).
I have done some basic testing this time and haven't hit any issues. So 
kindly review and merge the patches.

Thanks,
Soumya

>
> Frank
>
>
> ---
> This email has been checked for viruses by Avast antivirus software.
> https://www.avast.com/antivirus
>

--
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are 
consuming the most bandwidth. Provides multi-vendor support for NetFlow, 
J-Flow, sFlow and other flows. Make informed decisions using capacity planning
reports.http://sdm.link/zohodev2dev
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] All_Squash + anonuid = 0 + anongid = 0

2016-07-13 Thread Soumya Koduri


On 07/13/2016 02:54 AM, Frank Filz wrote:
> I haven’t heard a response from anyone else,
>
>
>
> Malahal, Soumya, anyone else do you concur?
>
>
>
> Thanks
>
>
>
> Frank
>
>
>
> *From:*Ben Werthmann [mailto:b...@apcera.com]
> *Sent:* Tuesday, July 12, 2016 1:51 PM
> *To:* Frank Filz 
> *Cc:* nfs-ganesha-devel@lists.sourceforge.net
> *Subject:* Re: [Nfs-ganesha-devel] All_Squash + anonuid = 0 + anongid = 0
>
>
>
> Is the consensus to move forward with leaving attributes completely
> alone on setattr?
>
>
>
> On Tue, Jun 14, 2016 at 7:20 PM, Frank Filz  > wrote:
>
> Ben,
>
>
>
> Thanks for bringing this to the list.
>
>
>
> For a while (really since shortly after we put the squashing of
> attributes in), I’ve been wondering if it really is the right thing.
> Back then we did acknowledge that Ganesha’s behavior would be
> different than knfsd.
>
>
>
> I think we probably should leave the attributes completely alone on
> setattr. This would allow this case of a SETATTR with a new owner
> and/or owner_group to succeed if anon_uid was 0.

Not sure if it is the right approach for all other anon_IDs, but atleast 
for anon_ID '0', it makes sense not to squash the attr->owner and have 
the same privileges as genuine root user.

Thanks,
Soumya

>
>
>
> On create, when owner and/or owner_group are provided, we perhaps
> should do what knfsd does, and that is drop owner if it is not the
> same as creds->uid unless creds->uid is 0, and drop owner_group if
> it is not creds->gid or a member of creds->alt_groups, unless
> creds->uid is 0.
>
>
>
> Now in the long run, I hear a need for containers (and maybe some
> other scenarios) to specify a certain id has root privileges within
> a particular export and we should look at ways to allow that, though
> it gets complex with the permission checking done for directory
> mutating operations (creates, unlink, link, rename).
>
>
>
> Thanks
>
>
>
> Frank
>
>
>
> *From:*Ben Werthmann [mailto:b...@apcera.com ]
> *Sent:* Tuesday, June 14, 2016 2:40 PM
> *To:* nfs-ganesha-devel@lists.sourceforge.net
> 
> *Subject:* [Nfs-ganesha-devel] All_Squash + anonuid = 0 + anongid = 0
>
>
>
> nfs-ganesha devs,
>
>
>
>
> Frank Filz and I had a quick chat about this in IRC yesterday. He
> recommended that I send a mail to this list. We've noticed in our
> testing that this config does not work the same as kernel NFS:
>
>
>
> Squash = All_Squash
> anonuid = 0
> anongid = 0
>
>
>
> In this case I think that squashing uid/gid in SETATTR is a bit
> heavy handed. The kernel nfs server will allow the anonuid to chown
> files to other IDs with squash all. This is not possible with
> Ganehsa at this time.
>
>
>
>
>
> Thanks,
>
>
>
> Ben Werthmann
>
>
>
> 
> 
>
>   
>
> Virus-free. www.avast.com
> 
> 
>
>
>
>
>
>
> 
> Avast logo
> 
>   
>
> This email has been checked for viruses by Avast antivirus software.
> www.avast.com
> 
>
>
>

--
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are 
consuming the most bandwidth. Provides multi-vendor support for NetFlow, 
J-Flow, sFlow and other flows. Make informed decisions using capacity planning
reports.http://sdm.link/zohodev2dev
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Announce Push of V2.4-dev-21

2016-06-21 Thread Soumya Koduri


On 06/21/2016 06:15 AM, Malahal Naineni wrote:
> I posted a patch for this here: https://review.gerrithub.io/281158
> But pynfs and cthon, they both crash ganesha due to
> get_state_obj_ref(state) returning NULL.

I had hit similar crash while running pynfs. Had submitted patch - 
https://review.gerrithub.io/#/c/280763 to handle such failures.

Thanks,
Soumya

>
> Regards, Malahal.
>
> Malahal Naineni [mala...@us.ibm.com] wrote:
>> Marc, I will have a fix today for this.
>>
>> Regards, Malahal.
>>
>> Marc Eshel [es...@us.ibm.com] wrote:
>>> This version is better I am mounting v3 and I can now do ls now, but
>>> coping a small file into the mount point I get
>>> Marc.
>>>
>>> Program received signal SIGSEGV, Segmentation fault.
>>> [Switching to Thread 0x7f6595b0e280 (LWP 10125)]
>>> 0x00528775 in mdc_cur_export () at
>>> /nas/ganesha/new-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_int.h:372
>>> 372 return mdc_export(op_ctx->fsal_export);
>>> (gdb) where
>>> #0  0x00528775 in mdc_cur_export () at
>>> /nas/ganesha/new-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_int.h:372
>>> #1  0x00529bde in mdc_check_mapping (entry=0x7f63cc0014e0) at
>>> /nas/ganesha/new-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:160
>>> #2  0x0052b2d2 in mdcache_find_keyed (key=0x7f6595b0cd50,
>>> entry=0x7f6595b0cd48) at
>>> /nas/ganesha/new-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:582
>>> #3  0x0052b32e in mdcache_locate_keyed (key=0x7f6595b0cd50,
>>> export=0xb5d7f0, entry=0x7f6595b0cd48) at
>>> /nas/ganesha/new-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:612
>>> #4  0x005263e4 in mdcache_create_handle (exp_hdl=0xb5d7f0,
>>> hdl_desc=0x7f6595b0cf50, handle=0x7f6595b0cdc0)
>>>  at
>>> /nas/ganesha/new-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1130
>>> #5  0x00521f76 in mdc_up_invalidate (export=0xb5d7f0,
>>> handle=0x7f6595b0cf50, flags=3) at
>>> /nas/ganesha/new-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_up.c:49
>>> #6  0x7f6595b1ac14 in GPFSFSAL_UP_Thread (Arg=0xb6ee60) at
>>> /nas/ganesha/new-ganesha/src/FSAL/FSAL_GPFS/fsal_up.c:310
>>> #7  0x7f6598491df3 in start_thread (arg=0x7f6595b0e280) at
>>> pthread_create.c:308
>>> #8  0x7f6597b513dd in clone () at
>>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
>>>
>>>
>>>
>>> From:   "Frank Filz" 
>>> To: "'nfs-ganesha-devel'" 
>>> Date:   06/17/2016 03:34 PM
>>> Subject:[Nfs-ganesha-devel] Announce Push of V2.4-dev-21
>>>
>>>
>>>
>>> Branch next
>>>
>>> Tag:V2.4-dev-21
>>>
>>> Release Highlights
>>>
>>> * Remove FSAL_PT, FSAL_HPSS, FSAL_LUSTRE, Add FSAL_RGW to everything.cmake
>>>
>>> * Some NFS v3 bug fixes
>>>
>>> * [fridgethr.c] Prevent infinite loop for timed out sync.
>>>
>>> * FSAL_GLUSTER : symlink operation fails when acl is enabled
>>>
>>> * MDCACHE - call reopen for reopen, not open
>>>
>>> Signed-off-by: Frank S. Filz 
>>>
>>> Contents:
>>>
>>> 758a361 Frank S. Filz V2.4-dev-21
>>> f8247e2 Daniel Gryniewicz MDCACHE - call reopen for reopen, not open
>>> 3c682c2 Jiffin Tony Thottan FSAL_GLUSTER : symlink operation fails when
>>> acl
>>> is enabled
>>> fd01c8c Swen Schillig [fridgethr.c] Prevent infinite loop for timed out
>>> sync.
>>> e0319db Malahal Naineni Stop MOUNT/NLM as additional services in NFSv4
>>> only
>>> environments
>>> 96adc4c Frank S. Filz Reorganize nfs3_fsstat.c, nfs3_link.c, and
>>> nfs3_write.c
>>> c97be4a Frank S. Filz Change behavior - put_ref and get_ref are required.
>>> b2a6ff2 Frank S. Filz In nfs3_Mnt.c do not release obj_handle
>>> a88544f Frank S. Filz Remove FSAL_PT
>>> 34fccf2 Frank S. Filz Remove FSAL_HPSS
>>> fe8476d Frank S. Filz Remove FSAL_LUSTRE
>>> 9aac4d8 Frank S. Filz Add FSAL_RGW to everything.cmake
>>>
>>>
>>> ---
>>> This email has been checked for viruses by Avast antivirus software.
>>> https://www.avast.com/antivirus
>>>
>>>
>>> --
>>> What NetFlow Analyzer can do for you? Monitors network bandwidth and
>>> traffic
>>> patterns at an interface-level. Reveals which users, apps, and protocols
>>> are
>>> consuming the most bandwidth. Provides multi-vendor support for NetFlow,
>>> J-Flow, sFlow and other flows. Make informed decisions using capacity
>>> planning
>>> reports. http://sdm.link/zohomanageengine
>>> ___
>>> Nfs-ganesha-devel mailing list
>>> Nfs-ganesha-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
>>> patterns at an interface-level. Reveals which users, apps, and 

Re: [Nfs-ganesha-devel] Copy Fails with NFSv4 having Kerberos Authentication in NFS Ganesha v2.3

2016-06-20 Thread Soumya Koduri
Hi,

On 06/20/2016 12:50 PM, Srivastava, AshishX wrote:
> Hi Soumya,
> We already tried the same. But not able to find any error logs. Also we 
> verified the wireshark logs at same time but from server no Error response is 
> send but still we see Permission Denied at client side.
> We verified the behavior on both Ubuntu and Fedora Client and still issue is 
> seen.

Hm. Maybe its an error generated on the client-side. Have you tried 
enabling client-side debug logs as well?

Thanks,
Soumya

>
> Thanks
> Ashish Kr Srivastava
>
>
> -Original Message-
> From: Soumya Koduri [mailto:skod...@redhat.com]
> Sent: Monday, June 20, 2016 12:42 PM
> To: Srivastava, AshishX <ashishx.srivast...@intel.com>; 
> nfs-ganesha-devel@lists.sourceforge.net
> Cc: Gadepalli, Naresh K <naresh.k.gadepa...@intel.com>; 'Monica Aggarwal' 
> <monica.aggar...@aricent.com>; Shivastava, RakeshX 
> <rakeshx.shivast...@intel.com>; Sekar, Arun KumarX 
> <arun.kumarx.se...@intel.com>; Gupta, ManishX <manishx.gu...@intel.com>; 
> Kumar, DeepakX X <deepakx.x.ku...@intel.com>
> Subject: Re: [Nfs-ganesha-devel] Copy Fails with NFSv4 having Kerberos 
> Authentication in NFS Ganesha v2.3
>
>
>
> On 06/20/2016 11:50 AM, Srivastava, AshishX wrote:
>> Hi Soumya,
>>
>> I applied the patch and test the same, but still we are facing the
>> same issue.
>>
>> **
>>
>> *Scenario Being Tested : *
>>
>> I am running a script which copies 10 1 GB file simultaneously in 1
>> iteration. In some iterations for some files it shows copy error with
>> reason Permission denied, at same time there is no error log in nfs-ganesha.
>
> Okay. Please try increasing log level to DEBUG or FULL_DEBUG and check if you 
> see any errors in the log file. You could increase the log level by adding 
> the below block in ganesha.conf file -
>
> LOG {
>   COMPONENTS {
>   ALL = DEBUG;
>   }
> }
>
>
> For more logging options, please refer to "src/config_samples/config.txt" 
> file in the source.
>
> Thanks,
> Soumya
>
>>
>> There is no issue if copy operation is performed then no issue is seen.
>>
>> Have to faced similar type of issue.
>>
>> Thanks
>>
>> Ashish Kr Srivastava
>>
>> -Original Message-
>> From: Srivastava, AshishX
>> Sent: Saturday, June 18, 2016 12:48 PM
>> To: Soumya Koduri <skod...@redhat.com>;
>> nfs-ganesha-devel@lists.sourceforge.net
>> Cc: Gadepalli, Naresh K <naresh.k.gadepa...@intel.com>; 'Monica
>> Aggarwal' <monica.aggar...@aricent.com>; Shivastava, RakeshX
>> <rakeshx.shivast...@intel.com>; Sekar, Arun KumarX
>> <arun.kumarx.se...@intel.com>; Gupta, ManishX
>> <manishx.gu...@intel.com>; Kumar, DeepakX X
>> <deepakx.x.ku...@intel.com>
>> Subject: RE: [Nfs-ganesha-devel] Copy Fails with NFSv4 having Kerberos
>> Authentication in NFS Ganesha v2.3
>>
>> Hi Soumya,
>>
>> Thanks for your reply. But the issue mentioned in the patch is related
>> to mount/remount. But in my case there is no issue in mount/remount.
>>
>> I am running a script which copies 10 1 GB file simultaneously in 1
>> iteration. In some iterations for some files it shows copy error with
>> reason Permission denied, at same time there is no error log in nfs-ganesha.
>>
>> There is no issue if copy operation is performed then no issue is seen.
>>
>> Have to faced similar type of issue.
>>
>> Thanks
>>
>> Ashish Kr Srivastava
>>
>> -Original Message-
>>
>> From: Soumya Koduri [mailto:skod...@redhat.com]
>>
>> Sent: Friday, June 17, 2016 6:57 PM
>>
>> To: Srivastava, AshishX <ashishx.srivast...@intel.com
>> <mailto:ashishx.srivast...@intel.com>>;
>> nfs-ganesha-devel@lists.sourceforge.net
>> <mailto:nfs-ganesha-devel@lists.sourceforge.net>
>>
>> Cc: Gadepalli, Naresh K <naresh.k.gadepa...@intel.com
>> <mailto:naresh.k.gadepa...@intel.com>>; 'Monica Aggarwal'
>> <monica.aggar...@aricent.com <mailto:monica.aggar...@aricent.com>>;
>> Shivastava, RakeshX <rakeshx.shivast...@intel.com
>> <mailto:rakeshx.shivast...@intel.com>>; Sekar, Arun KumarX
>> <arun.kumarx.se...@intel.com <mailto:arun.kumarx.se...@intel.com>>;
>> Gupta, ManishX <manishx.gu...@intel.com
>> <mailto:manishx.gu...@intel.com>>; Kumar, DeepakX X
>> <deepakx.x.ku...@intel.com <mailto:deepakx.x.ku...@intel.com>>
>>
&

Re: [Nfs-ganesha-devel] x-attr support in GlusterFS

2016-06-20 Thread Soumya Koduri
Hi,

On 06/18/2016 04:41 AM, Daniel Vega wrote:
> Hello,
>
> I've been working on implementing x-attr support for GlusterFS. Even
> though all the x-attr related functions are implemented and correctly
> referenced in handle.c, I still get "Operation not supported" when
> trying to set/get/list extended attributes.
>

I do not know what all options need to be set in NFS-Ganesha to enable 
xattr support, but may be via gdb you could check if they are passed to 
FSAL_GLUSTER. Do you see any errors in ganesha/gfapi log files?

Thanks,
Soumya

> Is there something else that needs to be done to make sure that the
> x-attr functions are going through my implementation in the Gluster FSAL?
>
> I'm using NFS-Ganesha 2.2-7.
>
> PS: I've set _NO_XATTRD OFF.
>
> Thanks,
>
> --
> Daniel
>
>
> --
> What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
> patterns at an interface-level. Reveals which users, apps, and protocols are
> consuming the most bandwidth. Provides multi-vendor support for NetFlow,
> J-Flow, sFlow and other flows. Make informed decisions using capacity planning
> reports. http://sdm.link/zohomanageengine
>
>
>
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

--
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are 
consuming the most bandwidth. Provides multi-vendor support for NetFlow, 
J-Flow, sFlow and other flows. Make informed decisions using capacity planning
reports. http://sdm.link/zohomanageengine
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Copy Fails with NFSv4 having Kerberos Authentication in NFS Ganesha v2.3

2016-06-20 Thread Soumya Koduri


On 06/20/2016 11:50 AM, Srivastava, AshishX wrote:
> Hi Soumya,
>
> I applied the patch and test the same, but still we are facing the same
> issue.
>
> **
>
> *Scenario Being Tested : *
>
> I am running a script which copies 10 1 GB file simultaneously in 1
> iteration. In some iterations for some files it shows copy error with
> reason Permission denied, at same time there is no error log in nfs-ganesha.

Okay. Please try increasing log level to DEBUG or FULL_DEBUG and check 
if you see any errors in the log file. You could increase the log level 
by adding the below block in ganesha.conf file -

LOG {
 COMPONENTS {
 ALL = DEBUG;
 }
}


For more logging options, please refer to 
"src/config_samples/config.txt" file in the source.

Thanks,
Soumya

>
> There is no issue if copy operation is performed then no issue is seen.
>
> Have to faced similar type of issue.
>
> Thanks
>
> Ashish Kr Srivastava
>
> -Original Message-
> From: Srivastava, AshishX
> Sent: Saturday, June 18, 2016 12:48 PM
> To: Soumya Koduri <skod...@redhat.com>;
> nfs-ganesha-devel@lists.sourceforge.net
> Cc: Gadepalli, Naresh K <naresh.k.gadepa...@intel.com>; 'Monica
> Aggarwal' <monica.aggar...@aricent.com>; Shivastava, RakeshX
> <rakeshx.shivast...@intel.com>; Sekar, Arun KumarX
> <arun.kumarx.se...@intel.com>; Gupta, ManishX <manishx.gu...@intel.com>;
> Kumar, DeepakX X <deepakx.x.ku...@intel.com>
> Subject: RE: [Nfs-ganesha-devel] Copy Fails with NFSv4 having Kerberos
> Authentication in NFS Ganesha v2.3
>
> Hi Soumya,
>
> Thanks for your reply. But the issue mentioned in the patch is related
> to mount/remount. But in my case there is no issue in mount/remount.
>
> I am running a script which copies ten 1 GB files simultaneously in one
> iteration. In some iterations, for some files, it shows a copy error with
> reason Permission denied; at the same time there is no error log in nfs-ganesha.
>
> If the copy operation is performed again, no issue is seen.
>
> Have you faced a similar type of issue?
>
> Thanks
>
> Ashish Kr Srivastava
>
> -Original Message-
>
> From: Soumya Koduri [mailto:skod...@redhat.com]
>
> Sent: Friday, June 17, 2016 6:57 PM
>
> To: Srivastava, AshishX <ashishx.srivast...@intel.com
> <mailto:ashishx.srivast...@intel.com>>;
> nfs-ganesha-devel@lists.sourceforge.net
> <mailto:nfs-ganesha-devel@lists.sourceforge.net>
>
> Cc: Gadepalli, Naresh K <naresh.k.gadepa...@intel.com
> <mailto:naresh.k.gadepa...@intel.com>>; 'Monica Aggarwal'
> <monica.aggar...@aricent.com <mailto:monica.aggar...@aricent.com>>;
> Shivastava, RakeshX <rakeshx.shivast...@intel.com
> <mailto:rakeshx.shivast...@intel.com>>; Sekar, Arun KumarX
> <arun.kumarx.se...@intel.com <mailto:arun.kumarx.se...@intel.com>>;
> Gupta, ManishX <manishx.gu...@intel.com
> <mailto:manishx.gu...@intel.com>>; Kumar, DeepakX X
> <deepakx.x.ku...@intel.com <mailto:deepakx.x.ku...@intel.com>>
>
> Subject: Re: [Nfs-ganesha-devel] Copy Fails with NFSv4 having Kerberos
> Authentication in NFS Ganesha v2.3
>
> Hi,
>
> If you are using NFS-Ganesha sources, could you try applying below patch
> and verify if it fixes the issue.
>
>  - https://review.gerrithub.io/#/c/274710/
>
> Thanks,
>
> Soumya
>
> On 06/17/2016 04:51 PM, Srivastava, AshishX wrote:
>
>  > Hi,
>
>  >
>
>  > We are using nfs ganesha v2.3. While copying with NFSv4 using
>
>  > Kerberos authentication, we are getting random write failures. For
>
>  > some files the write fails with *Input/output Error* and for some files
>
>  > with *Permission Denied*
>
>  >
>
>  > *Test Setup:*
>
>  >
>
>  > *Client :*Ubuntu/Fedora VM
>
>  >
>
>  > *Server :* Linux with *3.4.105* with NFS Ganesha, RAM 8GB (Free ~4.5GB)
>
>  >
>
>  > *Mount Point : Local Disk on Ubuntu VM.*
>
>  >
>
>  > *Mount Command : *mount -t nfs4 -o sec=krb5 10.0.5.226:/mnt/extraDisk/
>
>  > sdfs_qat_mnt/
>
>  >
>
>  > *Test Scenario : *Local Mount point is created on server side and same
>
>  > is mounted from client side. Now a script is running which is copying 10
>
>  > 1GB files in parallel, and this script is run multiple times.
>
>  >
>
>  > *Test Output : *It is seen that after some iterations the copy fails for
>
>  > some files, and in the next iteration no issue is seen for the same file.
>
>  >
>
>  > *Error reported at client side: *
>
>  >
>
>

Re: [Nfs-ganesha-devel] Copy Fails with NFSv4 having Kerberos Authentication in NFS Ganesha v2.3

2016-06-17 Thread Soumya Koduri
Hi,

If you are using NFS-Ganesha sources, could you try applying the below 
patch and verify whether it fixes the issue?
- https://review.gerrithub.io/#/c/274710/

Thanks,
Soumya

On 06/17/2016 04:51 PM, Srivastava, AshishX wrote:
> Hi,
>
> We are using nfs ganesha v2.3. While copying with NFSv4 using Kerberos
> authentication, we are getting random write failures. For some files the
> write fails with *Input/output Error* and for some files with *Permission Denied*.
>
> *Test Setup:*
>
> *Client :*Ubuntu/Fedora VM
>
> *Server :* Linux with *3.4.105* with NFS Ganesha, RAM 8GB (Free ~4.5GB)
>
> *Mount Point : Local Disk on Ubuntu VM.*
>
> *Mount Command : *mount -t nfs4 -o sec=krb5 10.0.5.226:/mnt/extraDisk/
> sdfs_qat_mnt/
>
> *Test Scenario : *A local mount point is created on the server side and the same
> is mounted from the client side. Now a script is running which is copying 10
> 1GB files in parallel, and this script is run multiple times.
>
> *Test Output : *It is seen that after some iterations the copy fails for
> some files, and in the next iteration no issue is seen for the same file.
>
> *Error reported at client side: *
>
> cp: writing
> `/root/users/Deepa/dsa_automation/sdfs_qat_mnt/2M_10GB_Iteration/0/8_47932':
> Input/output error
>
> cp: writing
> `/root/users/Deepa/dsa_automation/sdfs_qat_mnt/2M_10GB_Iteration/0/9_47932':
> Permission denied
>
> *Error Logs in nfsganesha at server side when issue is seen - *
>
> 02/05/2016 10:52:01 : epoch 5726ef84 : DSA_LDAP :
> ganesha.nfsd-4131[decoder] AuthenticateRequest :DISP :INFO :*Could not
> authenticate request... rejecting with AUTH_STAT=AUTH_REJECTEDCRED*
> 02/05/2016 10:54:01 : epoch 5726ef84 : DSA_LDAP :
> ganesha.nfsd-4131[decoder] AuthenticateRequest :DISP :INFO :*Could not
> authenticate request... rejecting with AUTH_STAT=AUTH_REJECTEDCRED*
>
> We debugged and found that in the libntirpc file authgss_hash.c, in the
> function authgss_hash_init, axp->gen was never initialized but was used
> directly; because of that it held some garbage value, its behavior was
> unexpected, and the above logs appeared multiple times. After the fix the
> issue is not seen.
>
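A minimal C sketch of the kind of fix described above (the struct and
helper names follow this mail's description, not the verbatim ntirpc
patch):

    /* authgss_hash.c (sketch): never let the generation counter start
     * from uninitialized heap memory. */
    struct authgss_x_part {
        uint64_t gen;               /* generation counter */
        /* ... other members elided ... */
    };

    static inline void
    authgss_x_part_init(struct authgss_x_part *axp)
    {
        axp->gen = 0;               /* was read before any assignment */
    }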
> But still, for some files we see a Permission Denied error during the
> copy operation. At this time there is no error log on the server side.
>
> We tested the same with Fedora & Ubuntu Clients.
>
> Can you please let us know whether this is a known issue or not.
>
> Thanks & Regards
> Ashish Kr Srivastava
>
>
>
>
>
>
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


[Nfs-ganesha-devel] Patches to backport to V2.3-stable branch

2016-05-19 Thread Soumya Koduri
Hi Malahal,

Request you to backport the below commits to the V2.3-stable branch.

  commit a1d57cc5260e88e29bc4172f47566b9f313cc7f0
  Author: Soumya Koduri <skod...@redhat.com>
  Date:   Fri Oct 30 23:11:35 2015 +0530

   nfs: Use option grace_period to determine grace timeout


commit 0ed85098f09fcff3e90bd9e111b68b840902acc6
Author: Kaleb S KEITHLEY <kkeit...@redhat.com>
Date:   Tue Dec 8 12:18:40 2015 -0500

 fsal_gluster: eliminate duplicate code in gluster2fsal_error()


commit fbc6f075d7c1b2a767169561df46cacc455b0c58
Author: Jiffin Tony Thottan <jthot...@gmail.com>
Date:   Wed Mar 30 11:50:05 2016 +0530

 FSAL_GLUSTER : adding logrotate file for ganesha-gfapi.log


commit ffedec79badd946cc0ca5483867b029f996ec17a
Author: Soumya Koduri <skod...@redhat.com>
Date:   Tue May 10 13:30:07 2016 +0530

 FSAL_GLUSTER: set default errno to EINVAL


commit 3a34bedef9dc3796ee7a55e3ba7d1966f2adb2a9
Author: Alexander Bersenev <b...@hackerdom.ru>
Date:   Wed Apr 27 16:09:07 2016 +0530

 RPCSEC_GSS: When using kerberos validate principals but not handles


Thanks,
Soumya



___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Cache inode entries marked 'LRU_ENTRY_CLEANUP'

2016-05-19 Thread Soumya Koduri


On 05/18/2016 08:24 PM, Daniel Gryniewicz wrote:
> On 05/18/2016 10:19 AM, Soumya Koduri wrote:
>> Hi Dan,
>>
>> Reviving this old thread.
>>
>> On 02/02/2016 07:44 PM, Daniel Gryniewicz wrote:
>>> On 02/02/2016 05:34 AM, Soumya Koduri wrote:
>>>> Thank you all for your inputs. I shall check if there is any easy
>>>> way of
>>>> re-producing it and then test with Daniel's patch added.
>>>
>>> That patch applies on top of mdcache, so it will need modification to
>>> work on top of stock ganesha.  I can do that, if you need it, once you
>>> have a reproducer.
>>>
>>>>
>>>> On 02/01/2016 10:10 PM, Frank Filz wrote:
>>>>> There might be a refcount leak. I think we do need to clean this stuff
>>>>> up,
>>>>> and maybe change how we handle killed cache inode entries. The
>>>>> process is
>>>>> now inline, it is supposed to clean up on the final unref.
>>>>
>>>> Okay. But in case if there are any special entries (like Matt had
>>>> mentioned below) which do not get cleaned up in normal code path,
>>>> shouldn't we forcefully clean them up as part of unexport?
>>>>
>>>
>>> An entry can be on multiple exports, so unconditionally removing it on
>>> unexport will break things.
>>>
>>> More likely, we should be removing it from the export anyway?  That way
>>> it will be cleaned up when the last ref is dropped.
>>>
>>> That code looks like it's assuming another thread is actively cleaning
>>> up the entry, but I think that may not be the case.
>>>
>>
>> We seem to have got definite steps to reproduce this issue (at least
>> with FSAL_GLUSTER) with the v2.3-stable branch [1]. Could you please
>> provide info on which thread should be cleaning up such entries?
>>
>> Steps to reproduce
>>
>> 1.) create and start a volume
>> 2.) export the volume
>> 3.) run the pynfs test on it (only a subset is enough, e.g., the "link"
>> test)
>> 4.) try to unexport the volume using dbus.
>>
>
> Is there a way to test on 2.4 after the next -dev?  I just closed the
> last refcount issue I've been able to find on there, and it'd be nice to
> know if that one is fixed too.  I assumed these issues were introduced
> by MDCACHE, but it's possible one or more of them were already extant in
> cache_inode, and I just found them.

We haven't tested it on 2.4 yet. We will test it out and let you know 
the results. At least when debugging this issue on V2.3-stable, we found 
that the particular entry which is marked "LRU_ENTRY_CLEANUP" has 
lru.refcnt = 1 and has an open fd associated, with openflags set to 
(FSAL_O_READ | FSAL_O_WRITE) (just in case it gives any hint).

Thanks,
Soumya

>
> In the mean time, I'll give a shot at 2.3.x and see if I can reproduce
> locally.
>
> Daniel

___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Cache inode entries marked 'LRU_ENTRY_CLEANUP'

2016-05-18 Thread Soumya Koduri
Hi Dan,

Reviving this old thread.

On 02/02/2016 07:44 PM, Daniel Gryniewicz wrote:
> On 02/02/2016 05:34 AM, Soumya Koduri wrote:
>> Thank you all for your inputs. I shall check if there is any easy way of
>> re-producing it and then test with Daniel's patch added.
>
> That patch applies on top of mdcache, so it will need modification to
> work on top of stock ganesha.  I can do that, if you need it, once you
> have a reproducer.
>
>>
>> On 02/01/2016 10:10 PM, Frank Filz wrote:
>>> There might be a refcount leak. I think we do need to clean this stuff
>>> up,
>>> and maybe change how we handle killed cache inode entries. The
>>> process is
>>> now inline, it is supposed to clean up on the final unref.
>>
>> Okay. But in case if there are any special entries (like Matt had
>> mentioned below) which do not get cleaned up in normal code path,
>> shouldn't we forcefully clean them up as part of unexport?
>>
>
> An entry can be on multiple exports, so unconditionally removing it on
> unexport will break things.
>
> More likely, we should be removing it from the export anyway?  That way
> it will be cleaned up when the last ref is dropped.
>
> That code looks like it's assuming another thread is actively cleaning
> up the entry, but I think that may not be the case.
>

We seem to have got definite steps to reproduce this issue (at least 
with FSAL_GLUSTER) with the v2.3-stable branch [1]. Could you please 
provide info on which thread should be cleaning up such entries?

Steps to reproduce

1.) create and start a volume
2.) export the volume
3.) run the pynfs test on it (only a subset is enough, e.g., the "link" test)
4.) try to unexport the volume using dbus (see the command sketch below).
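
For step 4, the unexport can be triggered with a dbus-send command along
these lines (Export_Id 2 is only an example; use the id the volume was
exported with):

    dbus-send --system --print-reply --dest=org.ganesha.nfsd \
        /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport \
        uint16:2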

Thanks,
Soumya

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1330365#c12

> Daniel

___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Ganesha 2.3.0 crash issue

2016-05-04 Thread Soumya Koduri


On 05/04/2016 08:30 PM, Soumya Koduri wrote:
> Hi Malahal,
>
> On 05/04/2016 07:37 PM, Malahal Naineni wrote:
>> Soumya Koduri [skod...@redhat.com] wrote:
>>> Hi Krishna,
>>>
>>> Yes. A similar issue was reported to us earlier and Frank submitted a
>>> patch to fix it [1]. I think the fix is available in V2.3.1 or later
>>> branches.
>>>
>>> Thanks,
>>> Soumya
>>
>> Soumya, is this zero length file handle case? If so, can anyone explain
>> why we got a zero length file handle? (maybe some NFS client bug???)
>>
>
> Yes. It is because of a zero-length file handle.

Sorry, a correction: we had suspected that the user could have hit the 
crash because of a zero-length file handle. We couldn't debug the core 
they produced because of missing debuginfos.

Krishna,
Can you confirm this if you happen to have a core?

Thanks,
Soumya

> Unfortunately, even we
> couldn't figure out why/how it happened. A user reported this to us, and
> from code inspection we found a possible bug in
> XDR_DECODE and could reproduce the issue by injecting errors via gdb.
>
> Krishna,
> Do you have any insights on when exactly (in what scenarios) you hit this
> crash?
>
> Thanks,
> Soumya
>
>> Regards, Malahal.
>>
>>>
>>> [1] https://review.gerrithub.io/#/c/263358/
>>>
>>> On 04/29/2016 07:09 AM, Krishna Harathi wrote:
>>>> Hi,
>>>>
>>>> We are using Ganesha 2.3.0, hitting this crash. I see that we have a
>>>> 2.3.2 as well as other fixes in the main branch. Before I start browsing
>>>> all the commits, does this stack trace look familiar? Is this issue
>>>> already fixed in 2.3.2 or later?
>>>>
>>>>   #0  0x0049abbd in xdr_nfs_fh3 (xdrs=0x7fe288a47bd8,
>>>>   objp=0x7fe288a483f0) at
>>>>   
>>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/Protocols/XDR/xdr_nfs23.c:465
>>>>   #1  0x0049b3f2 in xdr_diropargs3 (xdrs=0x7fe288a47bd8,
>>>>   objp=0x7fe288a483f0) at
>>>>   
>>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/Protocols/XDR/xdr_nfs23.c:835
>>>>   #2  0x0049b6fb in xdr_LOOKUP3args (xdrs=0x0,
>>>>   objp=0x7fe288a483f0) at
>>>>   
>>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/Protocols/XDR/xdr_nfs23.c:1014
>>>>   #3  0x7fe29ce88231 in svcauth_none_wrap (auth=0x7fe29d0af680,
>>>>   req=0x7fe288a48230, xdrs=0x7fe288a47bd8, xdr_func=0x49b6dd
>>>>   <xdr_LOOKUP3args>, xdr_ptr=0x7fe288a483f0 "")
>>>>at
>>>>   
>>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/libntirpc/src/svc_auth_none.c:66
>>>>   #4  0x7fe29ce8fe3c in svc_vc_getargs (xprt=0x7fe288a47f80,
>>>>   req=0x7fe288a48230, xdr_args=0x49b6dd <xdr_LOOKUP3args>,
>>>>   args_ptr=0x7fe288a483f0, u_data=0x7fe288a483e8)
>>>>at
>>>>   
>>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/libntirpc/src/svc_vc.c:947
>>>>   #5  0x0044e266 in nfs_rpc_get_args (reqnfs=0x7fe288a48228)
>>>>   at
>>>>   
>>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/MainNFSD/nfs_rpc_dispatcher_thread.c:2122
>>>>   #6  0x0044d6eb in thr_decode_rpc_request (context=0x0,
>>>>   xprt=0x7fe288a47f80) at
>>>>   
>>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1869
>>>>   #7  0x0044d891 in thr_decode_rpc_requests
>>>>   (thr_ctx=0x7fe28980e080) at
>>>>   
>>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1915
>>>>   #8  0x0050f338 in fridgethr_start_routine
>>>>   (arg=0x7fe28980e080) at
>>>>   
>>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/support/fridgethr.c:561
>>>>   #9  0x7fe29d2beb50 in start_thread () from
>>>>   /lib/x86_64-linux-gnu/libpthread.so.0
>>>>   #10 0x7fe29c99c7bd in clone () from 
>>>> /lib/x86_64-linux-gnu/libc.so.6
>>

Re: [Nfs-ganesha-devel] Ganesha 2.3.0 crash issue

2016-05-04 Thread Soumya Koduri
Hi Malahal,

On 05/04/2016 07:37 PM, Malahal Naineni wrote:
> Soumya Koduri [skod...@redhat.com] wrote:
>> Hi Krishna,
>>
>> Yes. A similar issue was reported to us earlier and Frank submitted a
>> patch to fix it [1]. I think the fix is available in V2.3.1 or later
>> branches.
>>
>> Thanks,
>> Soumya
>
> Soumya, is this zero length file handle case? If so, can anyone explain
> why we got a zero length file handle? (maybe some NFS client bug???)
>

Yes. It is because of a zero-length file handle. Unfortunately, even we 
couldn't figure out why/how it happened. A user reported this to us, and 
from code inspection we found a possible bug in XDR_DECODE and could 
reproduce the issue by injecting errors via gdb.
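
A guard of roughly this shape in the decode path (a sketch, not the
verbatim change in [1]) avoids dereferencing a zero-length handle:

    /* xdr_nfs_fh3 (sketch): reject a handle with no data before
     * touching objp->data.data_val. */
    if (objp->data.data_len == 0 || objp->data.data_val == NULL)
        return false;   /* malformed handle: fail the decode */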

Krishna,
Do you have any insights on when exactly (in what scenarios) you hit this 
crash?

Thanks,
Soumya

> Regards, Malahal.
>
>>
>> [1] https://review.gerrithub.io/#/c/263358/
>>
>> On 04/29/2016 07:09 AM, Krishna Harathi wrote:
>>> Hi,
>>>
>>> We are using Ganesha 2.3.0, hitting this crash. I see that we have a
>>> 2.3.2 as well as other fixes in the main branch. Before I start browsing
>>> all the commits, does this stack trace look familiar? Is this issue
>>> already fixed in 2.3.2 or later?
>>>
>>>  #0  0x0049abbd in xdr_nfs_fh3 (xdrs=0x7fe288a47bd8,
>>>  objp=0x7fe288a483f0) at
>>>  
>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/Protocols/XDR/xdr_nfs23.c:465
>>>  #1  0x0049b3f2 in xdr_diropargs3 (xdrs=0x7fe288a47bd8,
>>>  objp=0x7fe288a483f0) at
>>>  
>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/Protocols/XDR/xdr_nfs23.c:835
>>>  #2  0x0049b6fb in xdr_LOOKUP3args (xdrs=0x0,
>>>  objp=0x7fe288a483f0) at
>>>  
>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/Protocols/XDR/xdr_nfs23.c:1014
>>>  #3  0x7fe29ce88231 in svcauth_none_wrap (auth=0x7fe29d0af680,
>>>  req=0x7fe288a48230, xdrs=0x7fe288a47bd8, xdr_func=0x49b6dd
>>>  <xdr_LOOKUP3args>, xdr_ptr=0x7fe288a483f0 "")
>>>   at
>>>  
>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/libntirpc/src/svc_auth_none.c:66
>>>  #4  0x7fe29ce8fe3c in svc_vc_getargs (xprt=0x7fe288a47f80,
>>>  req=0x7fe288a48230, xdr_args=0x49b6dd <xdr_LOOKUP3args>,
>>>  args_ptr=0x7fe288a483f0, u_data=0x7fe288a483e8)
>>>   at
>>>  
>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/libntirpc/src/svc_vc.c:947
>>>  #5  0x0044e266 in nfs_rpc_get_args (reqnfs=0x7fe288a48228)
>>>  at
>>>  
>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/MainNFSD/nfs_rpc_dispatcher_thread.c:2122
>>>  #6  0x0044d6eb in thr_decode_rpc_request (context=0x0,
>>>  xprt=0x7fe288a47f80) at
>>>  
>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1869
>>>  #7  0x0044d891 in thr_decode_rpc_requests
>>>  (thr_ctx=0x7fe28980e080) at
>>>  
>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1915
>>>  #8  0x0050f338 in fridgethr_start_routine
>>>  (arg=0x7fe28980e080) at
>>>  
>>> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/support/fridgethr.c:561
>>>  #9  0x7fe29d2beb50 in start_thread () from
>>>  /lib/x86_64-linux-gnu/libpthread.so.0
>>>  #10 0x7fe29c99c7bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
>>>  #11 0x0000000000000000 in ?? ()
>>>  (gdb) print objp
>>>  $1 = (nfs_fh3 *) 0x7fe288a483f0
>>>  (gdb) print *objp
>>>  $2 = {data = {data_len = 0, data_val = 0x0}}
>>>  (gdb) print *xdrs
>>>  $3 = {x_op = XDR_DECODE, x_ops = 0x7fe29d0adfc0, x_public =
>>>  0x7fe288a483e8, x_private = 0x7fe28881c0e0, x_lib = {0x2,
>>>  0x7fe288a47f80}, x_base = 0x0, x_v = {vio_base = 0x0, vio_head =
>>>  0x0, vio_tail = 0x0, vio_wrap = 0x0},
>>> x_handy = 0, x_flags = 1}
>>>
>>>
>>>
>>> Thanks in advan

Re: [Nfs-ganesha-devel] Ganesha 2.3.0 crash issue

2016-04-28 Thread Soumya Koduri
Hi Krishna,

Yes. A similar issue was reported to us earlier and Frank submitted a 
patch to fix it [1]. I think the fix is available in V2.3.1 or later 
branches.

Thanks,
Soumya

[1] https://review.gerrithub.io/#/c/263358/

On 04/29/2016 07:09 AM, Krishna Harathi wrote:
> Hi,
>
> We are using Ganesha 2.3.0, hitting this crash. I see that we have a
> 2.3.2 as well as other fixes in the main branch. Before I start browsing
> all the commits, does this stack trace look familiar? Is this issue
> already fixed in 2.3.2 or later?
>
> #0  0x0049abbd in xdr_nfs_fh3 (xdrs=0x7fe288a47bd8,
> objp=0x7fe288a483f0) at
> 
> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/Protocols/XDR/xdr_nfs23.c:465
> #1  0x0049b3f2 in xdr_diropargs3 (xdrs=0x7fe288a47bd8,
> objp=0x7fe288a483f0) at
> 
> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/Protocols/XDR/xdr_nfs23.c:835
> #2  0x0049b6fb in xdr_LOOKUP3args (xdrs=0x0,
> objp=0x7fe288a483f0) at
> 
> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/Protocols/XDR/xdr_nfs23.c:1014
> #3  0x7fe29ce88231 in svcauth_none_wrap (auth=0x7fe29d0af680,
> req=0x7fe288a48230, xdrs=0x7fe288a47bd8, xdr_func=0x49b6dd
> , xdr_ptr=0x7fe288a483f0 "")
>  at
> 
> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/libntirpc/src/svc_auth_none.c:66
> #4  0x7fe29ce8fe3c in svc_vc_getargs (xprt=0x7fe288a47f80,
> req=0x7fe288a48230, xdr_args=0x49b6dd <xdr_LOOKUP3args>,
> args_ptr=0x7fe288a483f0, u_data=0x7fe288a483e8)
>  at
> 
> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/libntirpc/src/svc_vc.c:947
> #5  0x0044e266 in nfs_rpc_get_args (reqnfs=0x7fe288a48228)
> at
> 
> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/MainNFSD/nfs_rpc_dispatcher_thread.c:2122
> #6  0x0044d6eb in thr_decode_rpc_request (context=0x0,
> xprt=0x7fe288a47f80) at
> 
> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1869
> #7  0x0044d891 in thr_decode_rpc_requests
> (thr_ctx=0x7fe28980e080) at
> 
> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1915
> #8  0x0050f338 in fridgethr_start_routine
> (arg=0x7fe28980e080) at
> 
> /home/kharathi/source1/oneblox/repos/packaging/nfs-ganesha-2.3.0/nfs-ganesha-2.3.0/src/support/fridgethr.c:561
> #9  0x7fe29d2beb50 in start_thread () from
> /lib/x86_64-linux-gnu/libpthread.so.0
> #10 0x7fe29c99c7bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
> #11 0x0000000000000000 in ?? ()
> (gdb) print objp
> $1 = (nfs_fh3 *) 0x7fe288a483f0
> (gdb) print *objp
> $2 = {data = {data_len = 0, data_val = 0x0}}
> (gdb) print *xdrs
> $3 = {x_op = XDR_DECODE, x_ops = 0x7fe29d0adfc0, x_public =
> 0x7fe288a483e8, x_private = 0x7fe28881c0e0, x_lib = {0x2,
> 0x7fe288a47f80}, x_base = 0x0, x_v = {vio_base = 0x0, vio_head =
> 0x0, vio_tail = 0x0, vio_wrap = 0x0},
>x_handy = 0, x_flags = 1}
>
>
>
> Thanks in advance.
>
> Regards.
> Krishna Harathi
>
>
>
>
>
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


[Nfs-ganesha-devel] Any FSAL not using 'fsal_obj_handle->attrs->numlinks' ?

2016-04-21 Thread Soumya Koduri
Hi,

Patch [1] makes use of the 'numlinks' value stored in 
fsal_obj_handle->attrlist to determine the link count of the entry 
being removed. Following Dan's comment, I request you to let me know if 
there is any FSAL which doesn't use/update numlinks with the link count.

I see that its value is also being queried in 'cache_inode_close()', 
but I would still like to confirm it.
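
As a rough sketch of the idea (the discard helper below is hypothetical;
the exact mechanism is in the patch itself):

    /* cache_inode_remove (sketch): skip the post-unlink getattr that is
     * guaranteed to fail with ESTALE once the last link is gone. */
    if (to_remove_entry->obj_handle->attrs->numlinks > 1) {
        /* other links may remain: refresh the cached attributes */
        (void) cache_inode_refresh_attrs_locked(to_remove_entry, req_ctx);
    } else {
        /* link count was 1: the backend object no longer exists, so
         * discard the cache entry instead of restating it. */
        discard_entry(to_remove_entry);   /* hypothetical helper */
    }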

Thanks,
Soumya

[1] https://review.gerrithub.io/#/c/273611/3

On 04/20/2016 02:27 PM, Soumya Koduri wrote:
> Hi Frank/Matt,
>
> Reviving the below thread. Since we seem to agree that we should discard
> the entry if the link count is 1 instead of trying to refresh the
> attributes, I have made the relevant changes. Kindly review the same.
>
> Patch: https://review.gerrithub.io/#/c/273611/
>
> Thanks,
> Soumya
>
>
> On 07/07/2014 10:30 PM, Matt W. Benjamin wrote:
>> Hi,
>>
>> Adam and I think that we should be able to discard the cache entry
>> if the link count was 1, yes.
>>
>> Matt
>>
>> - "Frank Filz" <ffilz...@mindspring.com> wrote:
>>
>>>> I have a question on cache_inode_remove:
>>>>
>>>> >From what I understand, we call the following function on the
>>> removed
>>>>> entry,
>>>> (void)cache_inode_refresh_attrs_locked(to_remove_entry, req_ctx);
>>> And
>>>> this is required to update the atrributes of the removed entry.
>>>>
>>>> I work on FSAL_GLUSTER and I've observed 2 things,
>>>>
>>>> 1. In case of a regular file, with no links, we always
>>>>  see the error,
>>>>  cache_inode_refresh_attrs :INODE :DEBUG :Failed on entry
>>>> 0x7f84800043a0 CACHE_INODE_FSAL_ESTALE
>>>>  and we don't call cache_inode_fixup_md, the function that seems
>>> to
>>>> actually refresh the attributes.
>>>>  This error is seen as we call getattrs on a removed file and
>>> that
>>> returns an
>>>> error.
>>>>
>>>> 2. When the file has a hard link, function cache_inode_fixup_md is
>>> called
>>>> from cache_inode_refresh_attrs.
>>>>
>>>> Question
>>>> =
>>>> 1. In the case of a regular file which has no links, this call will
>>> invariably hit the
>>>> error block in
>>>>  cache_inode_refresh_attrs. We end up calling a stat on a file
>>> that
>>> doesn't
>>>> exist in the back end every
>>>>  time we remove a file. Is the second
>>> cache_inode_refresh_attrs_locked
>>> call
>>>> in the cache_inode_remove
>>>>  required only when the file that is removed has links? Or are
>>> there any
>>>> other reasons for this call?
>>>> 2. Does the error message "cache_inode_refresh_attrs :INODE :DEBUG
>>>> :Failed on entry 0x7f84800043a0 CACHE_INODE_FSAL_ESTALE"
>>>>  indicate a successful unlink?
>>>
>>> So we do the refresh attrs just in case the file has had its link
>>> count
>>> increased since we last fetched attributes for it. Of course most of
>>> the
>>> time, that has not happened, and most of the time, the file had a link
>>> count
>>> of 1, and therefore the refresh attrs fails with ESTALE.
>>>
>>> I wonder if we could actually look at our link count. Our link count
>>> SHOULD
>>> reflect any links to the file that are being used by our clients
>>> (since
>>> every time we do a LOOKUP we refresh attrs - if the file had a hard
>>> link
>>> added outside Ganesha, and one of our clients did a LOOKUP on that new
>>> name,
>>> our link count should be updated and now reflect both entries).
>>>
>>> Matt, Adam, what say you?
>>>
>>> Frank
>>
>
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] How to build v2.3.2

2016-04-19 Thread Soumya Koduri
Hi Serkan,

I see the V2.3.2 tag when I check out the V2.3-stable branch. Are you 
using the latest sources? Alternatively, you could check out a branch 
using the tag as well.

# git checkout -b V2.3.2 V2.3.2
M   src/libntirpc
Switched to a new branch 'V2.3.2'
# git describe
V2.3.2

# git checkout -b V2.3-stable origin/V2.3-stable
M   src/libntirpc
Branch V2.3-stable set up to track remote branch V2.3-stable from origin.
Switched to a new branch 'V2.3-stable'
# git describe
V2.3.2
#

Thanks,
Soumya


On 04/19/2016 10:52 AM, Serkan Çoban wrote:
> Hi,
>
> I am trying to build nfs-ganesha v2.3.2 from git, but when I check out
> the v2.3-stable branch I cannot get v2.3.2. Can you tell me what steps
> are needed to build v2.3.2?
>
> Thanks,
> Serkan
>
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Crash when running ls

2016-04-11 Thread Soumya Koduri


On 04/11/2016 06:41 AM, Varghese Devassy wrote:
> Soumya,
>
> Are there any plans of adding these missing fixes into 2.2?

I am not sure if the V2.2 branch is still being actively supported. I 
request Malahal/Kaleb to comment.

Thanks,
Soumya

>
> Varghese Devassy
> v_deva...@yahoo.com
>
> On 2016-04-10 06:40 AM, Soumya Koduri wrote:
>> Hi,
>>
>> I had run into a similar crash when cache inode entries got reaped.
>> The below patch fixed the issue.
>>
>> https://review.gerrithub.io/#/c/258687/
>>
>> I think this patch is not merged into V2.2. Please check the
>> nfs-ganesha-2.3 or current next branch packages.
>>
>> Thanks,
>> Soumya
>>
>> On 04/09/2016 05:54 AM, Varghese Devassy wrote:
>>> Hi,
>>>
>>> We have our own FSAL that runs in the following version of NFS ganesha
>>> on a CentOS system. I have been trying to locate a crash for the last
>>> few days but with no progress. Any help in this matter will be greatly
>>> appreciated.
>>>
>>> # rpm -qa | grep -i ganesha
>>> nfs-ganesha-2.2.0-6.el7.x86_64
>>> nfs-ganesha-utils-2.2.0-6.el7.x86_64
>>> nfs-ganesha-debuginfo-2.2.0-6.el7.x86_64
>>> #
>>>
>>> # uname -a
>>> Linux vcd-test1 3.10.0-327.3.1.el7.x86_64 #1 SMP Wed Dec 9 14:09:15 UTC
>>> 2015 x86_64 x86_64 x86_64 GNU/Linux
>>> #
>>>
>>> We have been having a crash when running ls -lR for long periods of
>>> time (over an hour). To make the crash happen earlier, we reduced the
>>> inode cache to 1 by introducing the following configuration parameter.
>>>
>>> CACHEINODE
>>> {
>>>   Entries_HWMark = 1;
>>> }
>>>
>>> With this change, I am able to get the crash in two ls commands with the
>>> following steps:
>>> 1. mount  /mnt
>>> 2. cd /mnt
>>> 3. ls
>>> 4. cd var
>>> 5. ls (ganesha crashes)
>>>
>>> It seems to me that somehow the entry obtained from the cache has
>>> the wrong file handle for the var directory. I checked the FH in the
>>> debugger and it has FH values that seem wrong.
>>>
>>> The problematic dir_entry is retrieved at the following line in
>>> src/Protocols/NFS/nfs3_readdirplus.c
>>>
>>>   188 dir_entry =
>>>       nfs3_FhandleToCache(&(arg->arg_readdirplus3.dir),
>>>   189                     &(res->res_readdirplus3.status),
>>>   190                     &rc);
>>>
>>> Here is the stack trace.
>>>
>>> (gdb) where
>>> #0  copy_ganesha_fh (dst=0x7fe602ffd700, src=0x7fe60a825708) at
>>> export.c:66
>>> #1  0x7fe62d5f2699 in create_handle (export_pub=0x7fe60a81f290,
>>>   fh_desc=0x7fe60a825710, pub_handle=0x7fe602ffd7b0) at export.c:136
>>> #2  0x7fe631d7b7ae in cache_inode_get_keyed (key=0x7fe60a825700,
>>>   flags=, status=0x7fe602ffd82c)
>>>   at
>>> /usr/src/debug/nfs-ganesha-2.2.0/src/cache_inode/cache_inode_get.c:312
>>> #3  0x7fe631d771c6 in cache_inode_lookupp_impl
>>> (entry=0x7fe60a825580,
>>>   parent=0x7fe602ffd908)
>>>   at
>>> /usr/src/debug/nfs-ganesha-2.2.0/src/cache_inode/cache_inode_lookupp.c:110
>>>
>>> #4  0x7fe631d77883 in cache_inode_lookupp (entry=0x7fe60a825580,
>>>   parent=0x7fe602ffd908)
>>>   at
>>> /usr/src/debug/nfs-ganesha-2.2.0/src/cache_inode/cache_inode_lookupp.c:172
>>>
>>> #5  0x7fe631d113a8 in nfs3_readdirplus (arg=,
>>>   worker=, req=, res=0x7fe601014140)
>>>   at
>>> /usr/src/debug/nfs-ganesha-2.2.0/src/Protocols/NFS/nfs3_readdirplus.c:268
>>>
>>> #6  0x7fe631d032a8 in nfs_rpc_execute (req=0x7fe60701c2c0,
>>>   worker_data=0x7fe60100e180)
>>>   at
>>> /usr/src/debug/nfs-ganesha-2.2.0/src/MainNFSD/nfs_worker_thread.c:1268
>>> #7  0x7fe631d04dac in worker_run (ctx=0x7fe62de7da80)
>>>   at
>>> /usr/src/debug/nfs-ganesha-2.2.0/src/MainNFSD/nfs_worker_thread.c:1535
>>> #8  0x7fe631da1a89 in fridgethr_start_routine (arg=0x7fe62de7da80)
>>>   at /usr/src/debug/nfs-ganesha-2.2.0/src/support/fridgethr.c:562
>>> #9  0x7fe630045df5 in start_thread () from /lib64/libpthread.so.0
>>> #10 0x7fe62f7151ad in clone () from /lib64/libc.so.6
>>> (gdb)
>>>
>>> Thank you
>>>
>

___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Issue with readdir cache in VFS

2016-04-10 Thread Soumya Koduri


On 04/10/2016 07:53 AM, steve landiss wrote:
> Thanks Matt!... Where can I see an example of how Gluster is
> sending the invalidate upcalls?

You could refer to "src/FSAL/FSAL_GLUSTER/fsal_up.c" for upcall routines 
used by Gluster.
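
The shape of such an upcall is roughly the following (a sketch only; the
exact fsal_up_vector signature differs between Ganesha versions):

    /* sketch: tell the cache layer that this handle's attributes are
     * stale, so the next access refetches them from the backend. */
    struct gsh_buffdesc key = { .addr = hdl_data, .len = hdl_len };
    up_ops->invalidate(fsal_module, &key, CACHE_INODE_INVALIDATE_ATTRS);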
>
> And how would I bound the time to restat to 0?  Where is that set?

To disable the caching (or configure the time), use the below config 
option in ganesha.conf file.

CACHEINODE { Attr_Expiration_Time = 0; }

Thanks,
Soumya

>
> Thanks
> Steve
>
>
> On Sunday, April 10, 2016 8:08 AM, Matt Benjamin 
> wrote:
>
>
> Hi Steve,
>
> 1. The underlying problem is lack of invalidates with VFS.  Strategies
> for providing them have been considered, but I don't know of any actual
> work going on to provide them.  FSALs like Gluster and GPFS implement
> Ganesha's invalidate upcalls and would avoid this.  Also, you -can-
> bound the time to restat (even make it 0, or no cache).
>
> 2. I'm not sure, other than disabling caching.  It's not intended
> behavior, obviously.
>
> Matt
>
> - Original Message -
>  > From: "steve landiss"  >
>  > To: nfs-ganesha-devel@lists.sourceforge.net
> ,
> nfs-ganesha-supp...@lists.sourceforge.net
> 
>  > Sent: Saturday, April 9, 2016 4:52:52 AM
>  > Subject: [Nfs-ganesha-devel] Issue with readdir cache in VFS
>  >
>  > I am using nfs-ganesha with the VFS backend.
>  >
>  > If I create a file under the actual path that is being exported, it
> takes a
>  > few seconds (maybe 30?) to show up in the export.
>  >
>  > The bigger problem is, if the client is continually doing a readdir
> on that
>  > export, it never shows up because the cache is continually being marked as
>  > up-to-date.
>  >
>  > The way to repro this is to simply export a directory, and have a
> client do a
>  > ls on that export in a tight loop. Any file you create in the
> original dir
>  > will never show up.
>  >
>  > How do I disable this cache?
>
>  >
>  >
>  > ___
>  > Nfs-ganesha-devel mailing list
>  > Nfs-ganesha-devel@lists.sourceforge.net
> 
>  > https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>  >
>
> --
> Matt Benjamin
> Red Hat, Inc.
> 315 West Huron Street, Suite 140A
> Ann Arbor, Michigan 48103
>
> http://www.redhat.com/en/technologies/storage
>
> tel.  734-707-0660
> fax.  734-769-8938
> cel.  734-216-5309
>
>
>
>
>
>
>
>
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] nfs-ganesha crash!

2016-04-06 Thread Soumya Koduri
Hi,

Can you please provide core (or at least bt)? Also please check 
'/var/log/ganesha.log' and '/var/log/ganesha-gfapi.log' for any errors.

Thanks,
Soumya

On 04/06/2016 05:48 PM, zhw bai wrote:
> @all
>  My nfs-ganesha crashed when I ran my test; can anyone help? Thank
> you in advance.
>
>  My test steps:
>    a. client: mount -t nfs data-node4:/ceshi /mnt    # mount with nfs
>    b. dd if=/dev/zero of=/mnt/test.img bs=1M count=6
>    c. nfs-ganesha crashes! Every time, the data written is not more than 5GB!
>
>  following is my test env:
>  1. nfs-ganesha version 2.3.0
> EXPORT
> {
>  # Export Id (mandatory, each EXPORT must have a unique Export_Id)
>  Export_Id = 77;
>
>  # Exported path (mandatory)
>  Path = /ceshi;
>
>  # Pseudo Path (required for NFS v4)
>  Pseudo = /ceshi;
>
>  # Required for access (default is None)
>  # Could use CLIENT blocks instead
>  Access_Type = RW;
>  Squash = No_root_squash;
>  #SecType = sys;
>
>
>  # Exporting FSAL
>  FSAL {
>  Name = GLUSTER;
>  Hostname = localhost;
>  Volume = ceshi;
>  }
> }
>
> 2. gluster version 3.7.8
> volume info
>
> Volume Name: ceshi
> Type: Stripe
> Volume ID: 6caa21f9-5b3f-4067-9605-0af6416f2044
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: data-node4:/brick/brick1/ceshi1
> Brick2: data-node5:/brick/brick1/ceshi1
> Options Reconfigured:
> nfs.disable: off
> nfs.register-with-portmap: on
> nfs.enable-ino32: yes
> nfs.export-volumes: on
>
>
> --
>
>
>
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] [ISSUE] nfs-mon resource errors

2016-02-07 Thread Soumya Koduri
Hi Christian,

On 02/06/2016 03:04 AM, Christian Petersen wrote:
> Hey guys,
>
> My setup is as follows.
>
> ·ESXi and vCenter with 3 virtual machines running CentOS 7
>
> osolid state VMFS volumes attached
>
> ·GlusterFS
>
> ·3 active nodes making up a replicated volume
>
> ·NFS-Ganesha
>
> ·PCS/Pacemaker/Corosync cluster
>
> ·3 NIC cards
>
> 1.mgmt
>
> 2.NFS share traffic
>
> 3.GlusterFS replication traffic
>
> I've followed every guide I could find online and have gotten everything
> to work the best I can except for an error I am getting with the
> cluster.  More specifically the error is with the nfs-grace resource
> after I simulate a single failure. I shut down node 1, which is the node
> that the share originates from, and I get the following.
>
> Failed Actions:
>
> * nfs-grace_monitor_5000 on file03 'unknown error' (1): call=51,
> status=Timed Out, exitreason='none',
>
>  last-rc-change='Tue Jan  5 14:15:21 2016', queued=0ms, exec=0ms
>
> * nfs-grace_monitor_5000 on file02 'unknown error' (1): call=49,
> status=Timed Out, exitreason='none',
>
> last-rc-change='Tue Jan  5 14:15:21 2016', queued=0ms, exec=0ms
>
The 'pcs status' command is run as part of these resource-agent scripts, 
which seem to take quite some time to finish if any of the nodes is down 
in the cluster. As mentioned by Kaleb in another mail thread, the below 
patch should address this issue as well.

  http://review.gluster.org/#/c/12964/

Until then, as a workaround, you could try increasing the timeout value 
of these resources using the below commands and check if that works:

pcs resource update nfs-mon timeout=90s
pcs resource update nfs-grace timeout=90s

HTH,
-Soumya

> I know this isn't exactly a problem with Ganesha specifically, but I
> know that your scripts when Ganesha is enabled in Gluster are what
> instantiate the cluster configuration.  At one point I had to re-ip
> everything to move from my DEV environment to TEST and this started
> occurring then, but I have configured all of the hosts entries,
> ganesha-ha and pcs virtual IPs to reflect the new network addresses.
>
> These errors are detailed a little bit more in the Corosync log but
> there is nothing to be found in the nfs-ganesha log.  The part of the
> Corosync log when the staged failure occurs is posted at the link below.
>
> http://ur1.ca/ohhfk
>
> Please help!
>
> *Christian Petersen*
> *Senior Systems Analyst, Team Lead*
>
>
>
> Direct Line: (780) 395 7781
> Mobile: (780) 221 5982
> cpeter...@contava.com
>
>
> *Calgary  Edmonton  Fort McMurray  Vancouver*
>
> #104, 4103 - 97 Street NW • Edmonton, Alberta • T6E 6E9
>
> Toll-Free: (800) 661 9821 • Phone: (780) 434 7564 • Fax: (780) 435 2109
> • www.contava.com 
>
> This communication may contain privileged or confidential information.
> If you are not the intended recipient or received
> this communication by error, please notify the sender and delete the
> message without copying or disclosing it.
>
>
>
>
>
>
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Cache inode entries marked 'LRU_ENTRY_CLEANUP'

2016-02-02 Thread Soumya Koduri
Thank you all for your inputs. I shall check if there is any easy way of 
reproducing it and then test with Daniel's patch added.

On 02/01/2016 10:10 PM, Frank Filz wrote:
> There might be a refcount leak. I think we do need to clean this stuff up,
> and maybe change how we handle killed cache inode entries. The process is
> now inline, it is supposed to clean up on the final unref.

Okay. But in case if there are any special entries (like Matt had 
mentioned below) which do not get cleaned up in normal code path, 
shouldn't we forcefully clean them up as part of unexport?

Thanks,
Soumya

>
> The whole process got a LOT more complex with the introduction of
> asynchronous unexport...
>
> Frank
>
>> -Original Message-
>> From: Daniel Gryniewicz [mailto:d...@redhat.com]
>> Sent: Monday, February 1, 2016 7:50 AM
>> To: nfs-ganesha-devel@lists.sourceforge.net
>> Subject: Re: [Nfs-ganesha-devel] Cache inode entries marked
>> 'LRU_ENTRY_CLEANUP'
>>
>> I've actually made come cleanups and fixes related to this on top of my
>> MDCACHE work.  At the time, I did an extensive run-through of the code
>> (and discussions with Matt) to be sure my changes were correct.  My guess
> is
>> that these changes fix this issue, but I haven't gone back today and
> worked
>> through those codepaths again to be sure.
>>
>> The actual commit with these changes is here, if people want to look at
> it:
>>
>> https://github.com/dang/nfs-
>> ganesha/commit/07eb482aa0efd19349a6904a85378733fe84a9a1
>>
>> Daniel
>>
>> On 02/01/2016 09:57 AM, Matt Benjamin wrote:
>>> Hi Soumya,
>>>
>>> Originally, cleanup was merely moving the erstwhile behavior of
> "kill-entry"
>> out-of-line.
>>>
>>> All entries everywhere should have a refcount thta is positive, starting
> at 1
>> for "not in use but resident in the cache entry lookup table"--or 0 when
> all
>> refs are returned, and only then briefly before being destroyed.
>>>
>>> We should probably review all the workflows that involve the cleanup
>> queue, and discuss whether they are fully coherent.  It seems possible
> that
>> entries which don't clean up are "special" in some way that evolved more
>> recently, like being at the root of an export?
>>>
>>> Matt
>>>
>>> - Original Message -
>>>> From: "Soumya Koduri" <skod...@redhat.com>
>>>> To: nfs-ganesha-devel@lists.sourceforge.net
>>>> Sent: Monday, February 1, 2016 4:15:32 AM
>>>> Subject: [Nfs-ganesha-devel] Cache inode entries marked
>> 'LRU_ENTRY_CLEANUP'
>>>>
>>>> Hi,
>>>>
>>>> In cache_inode_unexport(), if we encounter an entry marked
>>>> 'LRU_ENTRY_CLEANUP', we ignore it and continue.
>>>>
>>>>     status = cache_inode_lru_ref(entry, LRU_FLAG_NONE);
>>>>
>>>>     if (status != CACHE_INODE_SUCCESS) {
>>>>             /* This entry was going stale, skip it. */
>>>>             PTHREAD_RWLOCK_unlock(&export->lock);
>>>>             continue;
>>>>     }
>>>>
>>>> Who is responsible for cleaning up such entries? Shouldn't we do
>>>> force cleanup here?
>>>> We ran into a case where dbus thread is looping indefinitely on one
>>>> such entry and there were no threads involved in cleaning it up [1].
>>>> That particular entry even had a positive refcnt.
>>>>
>>>> Please provide your inputs.
>>>>
>>>> Thanks,
>>>> Soumya
>>>>
>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1301450#c3
>>>>
>>>> -
>>>> -
>>>> Site24x7 APM Insight: Get Deep Visibility into Application
>>>> Performance APM + Mobile APM + RUM: Monitor 3 App instances at just
>>>> $35/Month Monitor end-to-end web transactions and take corrective
>>>> actions now Troubleshoot faster and improve end-user experience.
>> Signup Now!
>>>> http://pubads.g.doubleclick.net/gampad/clk?id=267308311=/4140
>>>> ___
>>>> Nfs-ganesha-devel mailing list
>>>> Nfs-ganesha-devel@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>>>>
>>>
>>
>>
>>
> -

[Nfs-ganesha-devel] Cache inode entries marked 'LRU_ENTRY_CLEANUP'

2016-02-01 Thread Soumya Koduri
Hi,

In cache_inode_unexport(), if we encounter an entry marked 
'LRU_ENTRY_CLEANUP', we ignore it and continue.

    status = cache_inode_lru_ref(entry, LRU_FLAG_NONE);

    if (status != CACHE_INODE_SUCCESS) {
            /* This entry was going stale, skip it. */
            PTHREAD_RWLOCK_unlock(&export->lock);
            continue;
    }

Who is responsible for cleaning up such entries? Shouldn't we do a 
forced cleanup here?
We ran into a case where the dbus thread is looping indefinitely on one 
such entry and there are no threads involved in cleaning it up [1]. That 
particular entry even had a positive refcnt.

Please provide your inputs.

Thanks,
Soumya

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1301450#c3

___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Patches that need back port from V2.4 to V2.3

2016-01-19 Thread Soumya Koduri
Hi Malahal,

Sorry for the delay. Yes, they can be backported (including the 
FSAL_GLUSTER patches).

Thanks,
Soumya

On 01/15/2016 07:44 AM, Malahal Naineni wrote:
> Hi All,
>
> I came up with this list of commits that may need to be back ported to
> V2.3. If you are the author of a commit, please respond if the commit
> needs to be in V2.3 or not. Also, if I missed any commit that you want
> to see in V2.3, please let me know as well.
>
> Thank you in advance.
>
> Regards, Malahal.
>
> commit e7307c5146d8d78e7aed0068d250366c307ccb51
> Author: Frank S. Filz <ffilz...@mindspring.com>
>
>  Resolve race between get_state_owner and dec_state_owner_ref differently
>
>
> commit 50d69dd11e3f5210e21715b429d1df0ff3f36e1c
> Author: Frank S. Filz <ffilz...@mindspring.com>
>
>  Use a list of cached open owners and maintain a refcount
>
>
> commit 3054a3b41600a9f3e7bfed52aaa0f2f0a3d89cde
> Author: Matt Benjamin <mbenja...@redhat.com>
>
>  FSAL_CEPH: NUL-terminate symlink buffers
>
>
> commit e4d51915b3fbdc06212777819c879be77d5cfff7
> Author: Soumya Koduri <skod...@redhat.com>
>
>  service files: Helper service to pre-process config options
>
>
> commit 020eb9129da15a6a810135c5af6fabd5e091ed67
> Author: Soumya Koduri <skod...@redhat.com>
>
>  Reset the first_export pointer during cache_inode_entry cleanup
>
>
> commit 7321b97670866a77008b50782c217dc21441c83d
> Author: jiffin tony thottan <jthot...@redhat.com>
>
>  FSAL_GLUSTER : Handle ENOENT properly in getattrs
>
>
> commit 06b85fd8263ff9a8331c36821dad9ec6be366798
> Author: Soumya Koduri <skod...@redhat.com>
>
>  exports_init(): Unref the export if pNFS DS is enabled
>
>
> commit 50deb1d7f54da48fee7f9df1c46970811c84b169
> Author: Jeremy Bongio <jbon...@us.ibm.com>
>
>  Use request type instead of DRC type to decide what can be cached.
>
>
> commit d48faa673928eeeb858857a2369cadc24e145a5e
> Author: Malahal Naineni <mala...@us.ibm.com>
>
>  GPFS: Fix the zombie detection code.
>
>
> commit 3138420ec5bb4b19caad340349638396cbfed6f3
> Author: jiffin tony thottan <jthot...@redhat.com>
>
>  FSAL_GLUSTER : Populate ALLOW acl entries accordingly if only DENY is 
> present
>
>
> commit d669eb56d19f1d722e1919d0f39e7055caa88ddd
> Author: Krishna Harathi <khara...@exablox.com>
>
>  nfsv3 - fix malformed packet response in readdir when zero entries are
>  returned. Also in cache_inode_readdir.
>
>
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] pNFS documentation

2015-12-21 Thread Soumya Koduri


On 12/22/2015 06:48 AM, Gmail wrote:
> Hello All,
>
> Is there any documentation describing how pNFS works in NFS-Ganesha?
>
> I’ve done a simple setup with 3 Gluster storage nodes and two clients,
> NFS-Ganesha is running on the three storage nodes, the Gluster volume is
> configured in distributed configuration.
>
> I was able to get pNFS to work and I’ve traced the RPCs and I found that
> the client is able to send data directly to the storage nodes, so the
> functionality is working.
>
> But when I tried to mount the Gluster volume from two different
> NFS-Ganesha nodes on two different clients, I was able to mount
> successfully, and I was able to write from both clients, but what I’ve
> noticed when I use two different storage nodes for writes to the same
> file, I see the modified file from the other client immediately, but the
> file disappears from the client where I wrote it, then shows up after
> about 30 seconds!
>
> Is there anyone who knows how the MDS works in pNFS Ganesha, and
> how do the MDSes in the whole cluster keep their metadata consistent?
>
Gluster doesn't yet have support for multiple MDSes (steps and limitations 
have been documented in [1]), which is the reason you see delays or 
inconsistencies when using different servers to mount. However, there is 
an effort going on to support it in nfs-ganesha 2.4 [2].

[1] 
http://gluster.readthedocs.org/en/latest/Administrator%20Guide/NFS-Ganesha%20GlusterFS%20Intergration/
 
(section: Configuring Gluster volume for pNFS)
[2] http://review.gluster.org/#/c/12367/

Thanks,
Soumya

>
>
> *— Bishoy*
>
>
>
>
>
>
>
>
> --
>
>
>
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] FOPs to be blocked during grace

2015-11-16 Thread Soumya Koduri


On 11/10/2015 08:21 PM, J. Bruce Fields wrote:
> On Tue, Nov 10, 2015 at 05:13:53PM +0530, Soumya Koduri wrote:
>>
>>
>> On 11/10/2015 01:50 AM, J. Bruce Fields wrote:
>>> On Mon, Nov 09, 2015 at 09:36:03PM +0530, Soumya Koduri wrote:
>>>> Hi,
>>>>
>>>> From the code it looks like we block the following FOPs while the
>>>> NFS server is in grace (i.e., the ones which have the 'nfs_in_grace' check) -
>>>>
>>>> NFSv3 -
>>>> SETATTR
>>>>
>>>> NLM -
>>>> LOCK
>>>> UNLOCK
>>>>
>>>> NFSv4 -
>>>> OPEN
>>>> LOCK
>>>> REMOVE
>>>> RENAME
>>>> SETATTR
>>>>
>>>> Requesting clarification on the reasoning behind selecting these fops.
>>>> Dan confirmed that 'as per RFC 5661, RENAME and REMOVE should be denied
>>>> during grace to support volatile file handles' (which we don't support...).
>>>
>>> RENAME and REMOVE also conflict with delegations.  So I think you don't
>>> want to allow those till clients have recovered their delegations (or
>>> discovered that they can't).
>>>
>>> I think LINK belongs on that list for similar reasons.
>>>
>>
>> Thanks Bruce. What about NFSv3 fops?  I can see I/Os going on even
>> with kernel-NFS while the server is in grace (haven't checked with
>> delegations though)
>
> Yes, the grace period should really block v3 ops too.  It can't block
> opens, obviously (there aren't any v3 opens), but it should block
> specific operations that would conflict with recoverable v4 state.
>
> That said, Linux knfsd currently *doesn't* do most of that.  (I believe
> it's only blocking NLM lock/unlock).  I think that's a bug.  But that
> means I don't have much experience yet with what it would mean to turn
> this on.  People aren't used to v3 blocking on the grace period (unless
> they do a lot of file locking), so you'd want to be careful to minimize
> the impact, I think--e.g. make sure the server knows not to enforce
> these things if there are no NFSv4 clients to recover.

Alright. So if we wish to have this behavior, we can have a flag to check 
whether there is any earlier NFSv4 persistent state which can be reclaimed 
(not sure if we can specifically check for delegation state only) and then 
accordingly block/unblock NFSv3 fops - the list should be 
READ, WRITE, SETATTR, LINK, REMOVE, RENAME. Right?
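
As a rough illustration, the gate could look like the sketch below 
(nfs_in_grace() is the existing check mentioned in this thread; the flag 
and helper are hypothetical placeholders, not real ganesha symbols):

/* Hypothetical sketch: block NFSv3 fops during grace only when there is
 * v4 state left to reclaim.  nfs_in_grace() exists in ganesha; the flag
 * and helper below are placeholders. */
static bool v4_state_reclaimable;   /* set at startup from the recovery dir */

static bool nfs3_blocked_by_grace(void)
{
        return nfs_in_grace() && v4_state_reclaimable;
}

/* at the top of the v3 READ/WRITE/SETATTR/LINK/REMOVE/RENAME handlers: */
if (nfs3_blocked_by_grace())
        return NFS3ERR_JUKEBOX; /* v3 has no GRACE error; JUKEBOX asks the client to retry */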

Thanks,
Soumya

>
> --b.
>
>>
>> -Soumya
>>
>>> --b.
>>
>>
>>>
>>>>
>>>> And from RFC 3530 -
>>>> 
>>>> 8.6.2.  Server Failure and Recovery
>>>>
>>>>  If the server loses locking state (usually as a result of a restart
>>>>  or reboot), it must allow clients time to discover this fact and re-
>>>>  establish the lost locking state.  The client must be able to re-
>>>>  establish the locking state without having the server deny valid
>>>>  requests because the server has granted conflicting access to another
>>>>  client.  Likewise, if there is the possibility that clients have not
>>>>  yet re-established their locking state for a file, the server must
>>>>  disallow READ and WRITE operations for that file.  The duration of
>>>>  this recovery period is equal to the duration of the lease period.
>>>>
>>>> .
>>>> .
>>>> .
>>>>
>>>>  The period of special handling of locking and READs and WRITEs, equal
>>>>  in duration to the lease period, is referred to as the "grace
>>>>  period".  During the grace period, clients recover locks and the
>>>>  associated state by reclaim-type locking requests (i.e., LOCK
>>>>  requests with reclaim set to true and OPEN operations with a claim
>>>>  type of CLAIM_PREVIOUS).  During the grace period, the server must
>>>>  reject READ and WRITE operations and non-reclaim locking requests
>>>>  (i.e., other LOCK and OPEN operations) with an error of
>>>>  NFS4ERR_GRACE.
>>>>
>>>> 
>>>>
>>>> Does it mean that the NFS server needs to reject I/Os as well unless it
>>>> is sure that there can be no other reclaim-type LOCK/OPEN requests? Also,
>>>> why is SETATTR handled specially, unlike the WRITE fop?
>>>>
>>>> Thanks,
>>>> Soumya
>>>>

Re: [Nfs-ganesha-devel] TCP ACK not being sent

2015-11-10 Thread Soumya Koduri


On 11/10/2015 12:03 AM, Malahal Naineni wrote:
> Soumya Koduri [skod...@redhat.com] wrote:
>>
>>
>> On 11/06/2015 06:57 PM, Niels de Vos wrote:
>>> On Fri, Nov 06, 2015 at 07:47:35AM -0500, Soumya Koduri wrote:
>>>> Hi,
>>>>
>>>> In a 2-node nfs-ganesha cluster setup, we have noticed that after a
>>>> couple of iterations of failover & failback of the IP between those
>>>> nodes, client I/O gets stuck. We have observed this in RHEL 7.1
>>>> environments (not sure about RHEL 6). While debugging I see that the
>>>> node which takes over the Virtual IP (after a couple of iterations)
>>>> doesn't respond (acknowledge) to the client's TCP SYN packet.
>>>>
>>>> I found a couple of discussions around it in a few forums and tried
>>>> tuning certain TCP parameters (tcp_timestamps, tcp_window_scaling) as
>>>> suggested there, but it did not work. The current work-around we
>>>> are left with (to resume the I/Os) is either
>>>> * restart the nfs-ganesha service on the node which has taken over the
>>>> IP, to clear the existing established TCP connections, or
>>>> * fail the IP back by getting the original node back online to resume
>>>> the I/O.
>>>>
>>>> Any ideas on what could be the reason for a TCP ACK not being sent in
>>>> response to a TCP SYN packet arriving on an existing connection in
>>>> ESTABLISHED state? Any pointers on how to fix that?
>>>
>>> CTDB has a function to "tickle" connections. This facilitates a faster
>>> fail-over if the client does not detect it needs to re-connect. We
>>> possibly need to do something like this for pacemaker/ganesha too.
>
> We ran into this a few months back where I found a similar (same) issue.
> Our IP failover guys did implement something similar to what CTDB does.
>

Thanks Malahal. I tried out the pacemaker portblock resource agent [1] which 
Niels suggested. It seems to tickle a few invalid ACK packets from the 
server, which forces the client to reset the TCP connection and thus allows 
the I/O to make progress after a while.

[1] http://linux-ha.org/doc/man-pages/re-ra-portblock.html
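
For reference, wiring it into a pacemaker cluster might look like the sketch 
below (the resource name, VIP, port and tickle directory are illustrative, 
not taken from this setup):

# illustrative sketch -- adjust names/addresses to your cluster
pcs resource create nfs-tickle ocf:heartbeat:portblock \
    protocol=tcp portno=2049 ip=192.168.11.90 action=unblock \
    tickle_dir=/var/lib/nfs/tickle --group nfs-vip-group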

> Regards, Malahal.
>



[Nfs-ganesha-devel] FOPs to be blocked during grace

2015-11-09 Thread Soumya Koduri
Hi,

 From the code it looks like we block the following FOPs while the 
NFS server is in grace (i.e., the ones which have the 'nfs_in_grace' check) -

NFSv3 -
SETATTR

NLM -
LOCK
UNLOCK

NFSv4 -
OPEN
LOCK
REMOVE
RENAME
SETATTR

Requesting clarification on the reasoning behind selecting these fops. Dan 
confirmed that 'as per RFC 5661, RENAME and REMOVE should be denied during 
grace to support volatile file handles' (which we don't support...).

And from RFC 3530 -

8.6.2.  Server Failure and Recovery

If the server loses locking state (usually as a result of a restart
or reboot), it must allow clients time to discover this fact and re-
establish the lost locking state.  The client must be able to re-
establish the locking state without having the server deny valid
requests because the server has granted conflicting access to another
client.  Likewise, if there is the possibility that clients have not
yet re-established their locking state for a file, the server must
disallow READ and WRITE operations for that file.  The duration of
this recovery period is equal to the duration of the lease period.

.
.
.

The period of special handling of locking and READs and WRITEs, equal
in duration to the lease period, is referred to as the "grace
period".  During the grace period, clients recover locks and the
associated state by reclaim-type locking requests (i.e., LOCK
requests with reclaim set to true and OPEN operations with a claim
type of CLAIM_PREVIOUS).  During the grace period, the server must
reject READ and WRITE operations and non-reclaim locking requests
(i.e., other LOCK and OPEN operations) with an error of
NFS4ERR_GRACE.



Does it mean that the NFS server needs to reject I/Os as well unless it 
is sure that there can be no other reclaim-type LOCK/OPEN requests? Also, 
why is SETATTR handled specially, unlike the WRITE fop?

Thanks,
Soumya



Re: [Nfs-ganesha-devel] TCP ACK not being sent

2015-11-06 Thread Soumya Koduri
Hi Stijn,

On 11/06/2015 07:19 PM, Stijn De Weirdt wrote:
> hi soumya,
>
> can you give some details on the clients kernel and nfs protocol used?
>
> a colleague of mine did some similar tests, and only with NFS 4
> minorversion=1 do we get failover/failback working. With 4.0 on recent 7.1
> or 6.7 we always have I/O errors on the client side.

I am using a Fedora 22 NFS client (Linux version 4.2.3-200, protocol NFSv4). 
We do not see any I/O errors though (tried with both NFSv3 & NFSv4 mounts). 
As mentioned below, it all works well for the initial couple of iterations 
of failover/failback.

Thanks,
Soumya

>
> stijn
>
> On 11/06/2015 01:47 PM, Soumya Koduri wrote:
>> Hi,
>>
>> In a 2-node nfs-ganesha cluster setup, we have noticed that after a
>> couple of iterations of failover & failback of the IP between those
>> nodes, client I/O gets stuck. We have observed this in RHEL 7.1
>> environments (not sure about RHEL 6). While debugging I see that the
>> node which takes over the Virtual IP (after a couple of iterations)
>> doesn't respond (acknowledge) to the client's TCP SYN packet.
>>
>> I found a couple of discussions around it in a few forums and tried
>> tuning certain TCP parameters (tcp_timestamps, tcp_window_scaling) as
>> suggested there, but it did not work. The current work-around we are
>> left with (to resume the I/Os) is either * restart the nfs-ganesha
>> service on the node which has taken over the IP, to clear the existing
>> established TCP connections, or * fail the IP back by getting the
>> original node back online to resume the I/O.
>>
>> Any ideas on what could be the reason for a TCP ACK not being sent in
>> response to a TCP SYN packet arriving on an existing connection in
>> ESTABLISHED state? Any pointers on how to fix that?
>>
>> Thanks, Soumya


[Nfs-ganesha-devel] TCP ACK not being sent

2015-11-06 Thread Soumya Koduri
Hi,

In a 2-node nfs-ganesha cluster setup, we have noticed that after a couple of 
iterations of failover & failback of the IP between those nodes, client I/O 
gets stuck. We have observed this in RHEL 7.1 environments (not sure about 
RHEL 6). While debugging I see that the node which takes over the Virtual IP 
(after a couple of iterations) doesn't respond (acknowledge) to the client's 
TCP SYN packet.

I found a couple of discussions around it in a few forums and tried tuning 
certain TCP parameters (tcp_timestamps, tcp_window_scaling; shown below) as 
suggested there, but it did not work. The current work-around we are left 
with (to resume the I/Os) is either 
* restart the nfs-ganesha service on the node which has taken over the IP, 
to clear the existing established TCP connections, or 
* fail the IP back by getting the original node back online to resume the I/O.
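
For reference, those tunables can be toggled as below (they did not resolve 
the issue described here):

# TCP tunables experimented with (no effect on this problem)
sysctl -w net.ipv4.tcp_timestamps=0
sysctl -w net.ipv4.tcp_window_scaling=0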

Any ideas on what could be the reason for a TCP ACK not being sent in 
response to a TCP SYN packet arriving on an existing connection in 
ESTABLISHED state? Any pointers on how to fix that?

Thanks,
Soumya



Re: [Nfs-ganesha-devel] Need clarification on LAYOUTRECALL implementation

2015-10-13 Thread Soumya Koduri
Hi Dan/Matt,

At the recent LinuxCon Europe conference, Christoph Hellwig gave a talk 
on "A Simple, and Scalable pNFS Server For Linux" [1].

During that talk, he mentioned that for kernel-NFS, LayoutRecall 
semantics & logic are the same as for DelegRecall (at least for block & 
object-type layouts, and probably for file layouts too), i.e., for any 
conflicting access, we need to block that I/O until the layouts are 
recalled.

But I guess in NFS-Ganesha we do not block the I/Os at the moment while 
recalling layouts. We would like to know whether we need to follow kernel-NFS 
here or whether it is left to the NFS server implementation to decide when to 
recall layouts, and why we chose not to block the I/Os.

Also, we seem to be leaving it to the FSALs to handle/recall the layouts, 
unlike locks/delegations, whose conflicts are checked in the common SAL layer 
itself. Is there any particular reason behind that? Do we leave the decision 
of when to recall layouts and/or block the conflicting I/Os to the FSALs to 
handle?

[1] 
http://events.linuxfoundation.org/events/linuxcon-europe/program/schedule

Thanks,
Soumya

On 09/24/2015 06:21 PM, Daniel Gryniewicz wrote:
> A layout is a guarantee of ownership for the portion of the file
> covered.  No other conflicting file access can be done while the
> layout is granted.  So, if conflicting access is needed, the layout
> must be recalled.  In addition, if something happens in the cluster
> that would invalidate that access to that part of the file (such as
> loss of a node moving data to a backup node, or cluster optimization
> moving the location of the data, or storage nodes partitioning making
> the data unavailable, etc.), the layout must be recalled.
>
> It's probably best to recall layouts via an upcall.  VFS is not the
> best model to follow here, since it's not a clustered filesystem.
>
> Dan
>
> On Thu, Sep 24, 2015 at 5:07 AM, Jiffin Tony Thottan
>  wrote:
>> Hi all,
>>
>> Currently I am trying to add support for LAYOUTRECALL in FSAL_GLUSTER,
>> so I looked through the other FSAL implementations and RFC 5661 once
>> again. As far as I understand, it is a notification sent from the
>> M.D.S. to the client demanding back the layouts. First I tried to
>> figure out scenarios in which layoutrecall is useful, and the following
>> came to my mind (maybe I am wrong; please also help me find more):
>>
>> 1.) While I/O is being performed, the layout of the file changes due to
>> a gluster-internal process
>>
>> 2.) Two clients performing I/O on the same file based on layouts
>> provided by two different M.D.S.es [currently FSAL_GLUSTER provides a
>> layout for the entire file, because the entire file is located on one
>> Storage Device]
>>
>> 3.) When a brick is detached from the storage pool in gluster.
>>
>> But on second thought, is it necessary to have LAYOUTRECALL? A layout
>> grants a client permission to perform I/O, but it does not guarantee
>> that only `this client can perform I/O on that`. And LAYOUTRECALL being
>> commented out in FSAL_CEPH increases my doubt.
>>
>> And one more question: FSAL_GPFS introduced LAYOUTRECALL as part of the
>> UPCALL thread and FSAL_VFS as part of a callback thread. So which one is
>> better - should it be handled as part of the UPCALL thread or separately
>> using another thread?
>>
>> Please correct me if anything mentioned above is wrong.
>>
>> With Regards and Thanks,
>> Jiffin
>>


Re: [Nfs-ganesha-devel] IP based recovery

2015-09-18 Thread Soumya Koduri
Hi Jeremy,

On 09/19/2015 12:08 AM, Jeremy Bongio wrote:
> How do the recovery directories work?
>
> I see in nfs4_create_recov_dir(void) that we create the top level
> recovery directories and optionally a nodeid directory:
>
> err = mkdir(NFS_V4_RECOV_ROOT, 0755);
> ...
> snprintf(v4_recov_dir, sizeof(v4_recov_dir), "%s/%s", NFS_V4_RECOV_ROOT,
> NFS_V4_RECOV_DIR);
> err = mkdir(v4_recov_dir, 0755);
> ...
> snprintf(v4_old_dir, sizeof(v4_old_dir), "%s/%s", NFS_V4_RECOV_ROOT,
> NFS_V4_OLD_DIR);
> err = mkdir(v4_old_dir, 0755);
> ...
> if (nfs_param.core_param.clustered) {
> snprintf(v4_recov_dir, sizeof(v4_recov_dir), "%s/%s/node%d",
> NFS_V4_RECOV_ROOT, NFS_V4_RECOV_DIR, g_nodeid);
> err = mkdir(v4_recov_dir, 0755);
> ...
> snprintf(v4_old_dir, sizeof(v4_old_dir), "%s/%s/node%d",
> NFS_V4_RECOV_ROOT, NFS_V4_OLD_DIR, g_nodeid);
> err = mkdir(v4_old_dir, 0755);
> ...
> }
>
> However, we aren't creating recovery directories by IP. Then in
> nfs4_load_recov_clids_nolock() we look for a directory named by IP:
>
> if (gsp->event == EVENT_UPDATE_CLIENTS)
>  snprintf(path, sizeof(path), "%s", v4_recov_dir);
>
> else if (gsp->event == EVENT_TAKE_IP)
>  snprintf(path, sizeof(path), "%s/%s/%s", NFS_V4_RECOV_ROOT,
> gsp->ipaddr, NFS_V4_RECOV_DIR);
>
> else if (gsp->event == EVENT_TAKE_NODEID)
>  snprintf(path, sizeof(path), "%s/%s/node%d", NFS_V4_RECOV_ROOT,
> NFS_V4_RECOV_DIR, gsp->nodeid);
>
> else
>  return;
> ...
> dp = opendir(path);
> ...
> rc = nfs4_read_recov_clids(dp, path, NULL, v4_old_dir, 1);
>
>
> I don't see where that directory structure is created,
> NFS_V4_RECOV_ROOT/server_ipaddr/NFS_V4_RECOV_DIR. What am I missing?
>
you are right. The state gets created in 
NFS_V4_RECOV_ROOT/NFS_V4_RECOV_DIR. As mentioned in [1], we have 
scripts which create symlinks to this directory with the path 
'NFS_V4_RECOV_ROOT/server_ipaddr/NFS_V4_RECOV_DIR'. You could optionally 
choose to rsync the state to this new path periodically.

This is a stop-gap solution we chose to have until we get the 
'clustered' state corrected in ganesha.

[1] http://sourceforge.net/p/nfs-ganesha/mailman/message/34236611/
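
A rough sketch of that symlink arrangement, assuming the default recovery 
root /var/lib/nfs/ganesha and the v4recov directory name (substitute the 
virtual IP your node serves):

# sketch only -- default paths assumed
mkdir -p /var/lib/nfs/ganesha/192.168.11.90
ln -s /var/lib/nfs/ganesha/v4recov /var/lib/nfs/ganesha/192.168.11.90/v4recov
# ...or keep a periodic copy instead of a symlink:
# rsync -a /var/lib/nfs/ganesha/v4recov/ /var/lib/nfs/ganesha/192.168.11.90/v4recov/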

Thanks,
Soumya




Re: [Nfs-ganesha-devel] IP based recovery

2015-09-17 Thread Soumya Koduri


On 09/18/2015 10:44 AM, Soumya Koduri wrote:
> Hi,
>
> There is already an IP-based recovery solution provided by nfs-ganesha
> (except for one small issue with the 'clustered' config option which I
> shall mention below) and yes, we are currently using the same for our HA
> cluster solution.
>
> By default, if no event is provided in the input to the d-bus grace
> command, the event maps to EVENT_TAKE_IP and ganesha looks in the
> '/var/lib/nfs/ganesha/${IP}' directory for the clientIDs to be
> recovered.
>
>
> On 09/17/2015 10:56 PM, Malahal Naineni wrote:
>> IBM actually uses RELEASE_IP and TAKENODE (instead of TAKE_IP). Don't
>> know the reason for takenode event, but it doesn't sound right to me
>> in all cases.
>>
>> Soumya reported an issue with the recovery directory having nodeid
>> rather than ip based. Any progress there?
>
> The issue was with the 'clustered' config option. If this option is set,
> nfs-ganesha writes the clients' state to the '/var/lib/nfs/ganesha/node{id}'
> directory. Since we need an IP-based solution, we had to turn off this
> config option even though we have a cluster of nfs-ganesha nodes.
>
> If I remember the discussion we had correctly, you had agreed to de-couple
> this option from node-id so that it can in general be used by
> anyone planning to have a cluster (and not using node-IDs). And sorry, I
> haven't got time to check on that since, but I shall be able to look into
> it (in a couple of weeks' time) if you would like me to work on it.
>
And once we correct this option, we can provide another option, maybe 
'state_dir' (which could take node-id/IP etc. in 'string' format), to be 
used by ganesha to create client state in '/var/lib/nfs/ganesha/state_dir'.

We are currently making use of symlinks to create those paths.

Thanks,
Soumya

> Thanks,
> Soumya
>
>>
>> Regards, Malahal.
>>
>>
>> Frank Filz [ffilz...@mindspring.com] wrote:
>>>> We've previously used node-based recovery, which is basically what's
>>>> implemented right now in Ganesha. However, for a number of reasons we
>>>> need an IP-based recovery solution. Malahal told me that Redhat wants an
>>>> IP-based solution as well. Soumya (or anyone else), have you been working
>>>> on this? Do you have anything to show yet?
>>>
>>> I thought the recovery was IP based already...
>>>
>>> There are basically three scenarios of interest:
>>>
>>> 1. Node goes down and IPs are moved to other nodes
>>> 2. Interface on Node goes down and IP is moved to another node
>>> 3. Actually there's just two scenarios, because failback when a node or
>>> interface comes back online is just scenario 2...
>>>
>>> This means there are three main actions to manage the transfer of state:
>>>
>>> 1. RELEASE_IP, if an interface goes down, the node that is losing that IP
>>> needs to release state associated with that IP
>>>
>>> 2. TAKE_IP, whichever node is acquiring an IP address (whether from a failed
>>> node, failed interface, or failback) must notify v3 clients and then accept
>>> reclaims from the appropriate v3 and v4 clients.
>>>
>>> 3. Somewhere in there all nodes need to cooperate to enforce grace period.
>>>
>>> Frank
>>>
>>>


Re: [Nfs-ganesha-devel] Provide NLM/NFSv4 client identifier to FSAL

2015-09-01 Thread Soumya Koduri
Thanks for your inputs.

On 08/31/2015 09:24 PM, Frank Filz wrote:
> You could actually do this with the old interface if you support lock
> owners. While the lock owner is defined in the FSAL as a void *, it is
> in fact always a state_owner_t, which would get you to this client
> information.
This seems doable. We could extract client information and store it as 
lock/open owner.
>
> On the other hand, let’s have someone else play with the extended FSAL
> API… :)
>
> Now on the idea in general, one question I have is what you plan to do
> in the FSAL to avoid too much I/O in keeping track of this information.
>
The libraries intending to use this feature could provide a separate API 
to set the client identifier (like UID/GID) before performing a lock/open 
fop (in the case of gluster, maybe 'glfs_set_lockowner'). Alternatively, we 
can choose to provide new OPEN/LOCK APIs which accommodate this new field, 
to avoid multiple I/Os.
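
A sketch of the first option (glfs_set_lockowner() is the hypothetical API 
floated above and does not exist in gfapi today; glfs_posix_lock() is the 
existing call):

#include <glusterfs/api/glfs.h>
#include <fcntl.h>

/* hypothetical flow -- glfs_set_lockowner() is the proposed new call */
static int lock_with_client_id(glfs_fd_t *fd, const char *client_id,
                               size_t len)
{
        struct flock lk = { .l_type = F_WRLCK, .l_whence = SEEK_SET };

        glfs_set_lockowner(fd, client_id, len);    /* proposed: tag the owner */
        return glfs_posix_lock(fd, F_SETLK, &lk);  /* existing gfapi call */
}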

> With the extended API, since the FSAL allocates the state_t, you COULD
> track something there, but that’s per-file-per-owner, so that’s way more
> than per-client (since there could be multiple owners from a given client).
Will the NFS clients send these same lock/open owners during reclaim of 
their state? If yes, I guess we could store this lock-owner information 
itself instead of client identifiers.

Thanks,
Soumya

>
> Frank
>
> *From:*Daniel Gryniewicz [mailto:d...@redhat.com]
> *Sent:* Monday, August 31, 2015 6:03 AM
> *To:* nfs-ganesha-devel@lists.sourceforge.net
> *Subject:* Re: [Nfs-ganesha-devel] Provide NLM/NFSv4 client identifier
> to FSAL
>
> Since the state_owner_t is referenced in the state_t, this should just
> fall out of Frank's state rework patches, shouldn't it?  Just implement
> the *2() versions of the FSAL API calls, and you will get a state with
> each relevant call.
>
> Dan
>
> On Mon, Aug 31, 2015 at 2:34 AM, Soumya Koduri <skod...@redhat.com> wrote:
>
> Hi,
>
> As discussed over #ganesha, we are wondering if we can provide any
> identifier unique to each NFS client to FSAL so that it may/can help the
> backend filesystems to validate the reclaim lock requests.
>
> The use-case is to allow reclaim/replay of the lock requests by NFS
> clients after an NFS-Ganesha server reboot or failover (particularly in
> case the backend filesystem hasn't flushed the earlier locks by then).
>
> The unique identifier may be (assuming they remain same across the NFS
> servers in a cluster) -
> NLM caller name
> NFSv4 client identifier
>
> Request to share your inputs/comments.
>
> Thanks,
> Soumya
>
>


[Nfs-ganesha-devel] Provide NLM/NFSv4 client identifier to FSAL

2015-08-31 Thread Soumya Koduri
Hi,

As discussed over #ganesha, we are wondering if we can provide any 
identifier unique to each NFS client to FSAL so that it may/can help the 
backend filesystems to validate the reclaim lock requests.

The use-case is to allow reclaim/replay of the lock requests by NFS 
clients after an NFS-Ganesha server reboot or failover (particularly in 
case the backend filesystem hasn't flushed the earlier locks by then).

The unique identifier may be (assuming they remain same across the NFS 
servers in a cluster) -
NLM caller name
NFSv4 client identifier

Request to share your inputs/comments.

Thanks,
Soumya




Re: [Nfs-ganesha-devel] 'clustered' configuration parameter + use of nodeid

2015-06-30 Thread Soumya Koduri
Hi Frank/Malahal,

Re-opening this thread.

On 06/24/2015 04:47 AM, Frank Filz wrote:
 Note that the way Ganesha handles the epoch for clientids is structured
 in a way that Ganesha doesn't have to be party to the details. There is
 a command line option that allows an external actor (presumably the
 startup script) to pass Ganesha a unique clientid epoch for each
 instance. This epoch should include some kind of node identifier for a
 cluster, and some kind of time element (either a timestamp or a counter
 that is incremented each time Ganesha starts on the node) so that each
 time Ganesha is started on a given node, a new epoch is assigned. A
 cluster-global atomic counter would also work, issuing a new unique
 epoch to any Ganesha instance in the cluster. The epoch is 32 bits.


We have a situation wherein we need to support nfs-ganesha servers in a 
clustered environment with all the nodes NTP-configured, so we cannot 
change and set different epoch values for each of them.

The only work-around we are left with is to start the nfs-ganesha server 
with a delay on each node of the cluster. But that doesn't entirely 
guarantee that the nfs-ganesha servers' start_time will not collide, 
especially during failover and restart scenarios.

So we would like to know if there is any better way to handle and 
prevent this problem.

Maybe ServerEpoch could contain 'boot_time + node_id/ip_addr' (which can 
be provided via a config option); one possible shape is sketched below.
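
A startup-wrapper sketch only: the node id and epoch layout here are 
illustrative, and the epoch command-line option (-E in recent ganesha 
builds) should be verified against your version:

# sketch: pack an 8-bit node id and a 24-bit per-node boot counter
# into the 32-bit epoch (NODE_ID and the counter file are made up)
NODE_ID=2
COUNTER_FILE=/var/lib/nfs/ganesha/boot_counter
COUNT=$(( $(cat "$COUNTER_FILE" 2>/dev/null || echo 0) + 1 ))
echo "$COUNT" > "$COUNTER_FILE"
EPOCH=$(( (NODE_ID << 24) | (COUNT & 0xFFFFFF) ))
exec /usr/bin/ganesha.nfsd -E "$EPOCH" -f /etc/ganesha/ganesha.conf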

Please share your thoughts.

Thanks,
Soumya



Re: [Nfs-ganesha-devel] 'clustered' configuration parameter + use of nodeid

2015-06-24 Thread Soumya Koduri


On 06/24/2015 04:47 AM, Frank Filz wrote:
 As we were discussing over #ganesha, currently 'clustered' mode mandates
 that each of the NFS-Ganesha servers is associated with a nodeid which may
 not be applicable for all the clustering solutions. We may choose to use
 IP_ADDR or hostnames to store persistent state information for each of the
 nodes in the cluster, which would then require this option to be turned off.

 There could be cases where FSAL may like to know if its operating in
 'clustered' mode for any special handling (if needed). This cannot be
 achieved by using existing 'clustered' option (if not using nodeids).

 So we would like to know if we can de-couple 'clustered' option with the
 usage of nodeids and have different option if required to use nodeids.

 Please share your thoughts.

 I definitely agree that the clustered option is misnamed.

 Further, from my investigation, which is not complete, it looks like a pure 
 IP based clustering solution will not currently work with EVENT_TAKEIP for 
 NFS v4.

 For NFS v4, we persist information about each client-id so that we can 
 determine if a client attempting to reclaim state has the right to do so, in 
 particular, that it has not run afoul of the edge conditions documented in 
 Section 9.6.3.4 of RFC 7530.

 The code appears to look for a directory 
 NFS_V4_RECOV_ROOT/gsp-ipaddr/NFS_V4_RECOV_DIR when it receives an 
 EVENT_TAKEIP. But from what I can see, no other code creates or puts anything 
 in such a directory. Instead, it seems that clientid information is only 
 persisted in a directory: NFS_V4_RECOV_ROOT/NFS_V4_RECOV_DIR/nodeid, which is 
 what is searched for EVENT_TAKE_NODE.
But we could sync information from 'NFS_V4_RECOV_ROOT/NFS_V4_RECOV_DIR' 
to 'NFS_V4_RECOV_ROOT/gsp-ipaddr/NFS_V4_RECOV_DIR' before sending 
EVENT_TAKEIP. We are making use of symlinks to do so at the moment.


  From talking with Malahal on IRC, I think the intent of EVENT_TAKE_NODE is 
 that it is broadcast to all nodes that receive an IP address from another 
 node, rather than sending an EVENT_TAKE_IP for each IP address that is moved. 
 That may save some messages, but it's unclear that there aren’t some 
 pitfalls. We would expect the only clients that would attempt reclaim would 
 be those that actually moved (so a node that got an EVENT_RELEASE_IP to 
 failback an IP address would dump the state for those clients associated with 
 that IP address, and the node that we failed back to would get the 
 EVENT_TAKE_NODE). But under what conditions do we remove entries from the 
 directory? In the case of the failback, we should only remove the entries 
 belonging to the failed-back IP address; the node will have other entries in 
 that directory for the IP addresses that it retains.

 Because of this, it would seem clearer to me if we only had EVENT_TAKE_IP, 
 and that clientid persistent information was retained per IP address.

We can have all the functionality we get with EVENT_TAKE_NODE using 
EVENT_TAKEIP as well. As mentioned above, as long as we make sure 
'NFS_V4_RECOV_ROOT/gsp-ipaddr/NFS_V4_RECOV_DIR' has the relevant state 
information of the failed node (by using rsync/symlinks etc.), we can let 
the other nfs-ganesha servers read the state information of only that 
particular node by sending a D-bus signal with EVENT_TAKEIP. We got it 
working for our cluster.


 I think now is the time to make sure that Ganesha's clustering infrastructure 
 is flexible and meets multiple implementation needs.
Agree.


 Note that the way Ganesha handles the epoch for clientids is structured in a 
 way that Ganesha doesn't have to be party to the details. There is a command 
 line option that allows an external actor (presumably the startup script) to 
 pass Ganesha a unique clientid epoch for each instance. This epoch should 
 include some kind of node identifier for a cluster, and some kind of time 
 element (either a timestamp or a counter that is incremented each time 
 Ganesha starts on the node) so that each time Ganesha is started on a given 
 node, a new epoch is assigned. A cluster global atomic counter would also 
 work that would issue a new unique epoch to any Ganesha instance in the 
 cluster. The epoch is 32 bits.

Thanks for the note.

 Frank




Re: [Nfs-ganesha-devel] Problems in /usr/libexec/ganesha/dbus-send.sh and ganesha dbus interface when disabling exports from gluster

2015-06-18 Thread Soumya Koduri



 I have not enabled the FULL_DEBUG but if you need it I can do it.


 Thanks for your perseverance :)

 I have to thank you for the help! :-)
 Cheers,

   Alessandro


 Meghana

 - Original Message -
 From: Alessandro De Salvo alessandro.desa...@roma1.infn.it
 To: Meghana Madhusudhan mmadh...@redhat.com
 Cc: gluster-us...@gluster.org, nfs-ganesha-devel@lists.sourceforge.net, 
 Soumya Koduri skod...@redhat.com
 Sent: Thursday, June 18, 2015 7:24:55 PM
 Subject: Re: [Nfs-ganesha-devel] Problems in 
 /usr/libexec/ganesha/dbus-send.sh and ganesha dbus interface when disabling 
 exports from gluster

 Hi Meghana,

 On 18 Jun 2015, at 07:04, Meghana Madhusudhan
 mmadh...@redhat.com wrote:




 On 06/17/2015 10:57 PM, Alessandro De Salvo wrote:
 Hi,
 when disabling exports from gluster 3.7.1 by using "gluster vol set
 volume ganesha.enable off", I always get the following error:

 Error: Dynamic export addition/deletion failed. Please see log file for 
 details

 This message is produced by the failure of 
 /usr/libexec/ganesha/dbus-send.sh, and in fact if I manually perform the 
 command to remove the share I see:
 you got it wrong. '/usr/libexec/ganesha/dbus-send.sh' is used by the
 Gluster CLI to unexport the volume ("gluster volume set volname
 ganesha.enable off"), which rightly deletes the export file too while
 un-exporting the volume.


 # dbus-send --print-reply --system --dest=org.ganesha.nfsd 
 /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport 
 uint16:2
 Error org.freedesktop.DBus.Error.NoReply: Message did not receive a reply 
 (timeout by message bus)

 So, there is a timeout and it fails completely.
 Check if nfs-ganesha is still running. There was a bug in unexporting
 the volume. It's been fixed recently in V2.3-dev, but is yet to be
 back-ported to the V2.2-stable branch.
 https://review.gerrithub.io/#/c/236129/

 Thanks,
 Soumya

 In this case I think there is a bug in /usr/libexec/ganesha/dbus-send.sh,
 since it blindly deletes the share config if the RemoveExport fails
 (function check_cmd_status()), but leaves the %include inside ganesha.conf,
 as check_cmd_status() bails out early and the other removal statements are
 then not executed. I believe the logic should be fixed here, otherwise even
 a restart of the service will fail due to the bad configuration.
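
 A sketch of the reordering implied here (illustrative, not the actual
 patch; $VOL and $CONF follow the names used in the script, and $EXPORT_ID
 is a placeholder):

 # only clean up the config once RemoveExport has succeeded
 dbus-send --print-reply --system --dest=org.ganesha.nfsd \
     /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport \
     uint16:$EXPORT_ID || exit 1
 sed -i "/$VOL.conf/d" $CONF                   # drop the %include line
 rm -f /etc/ganesha/exports/export.$VOL.conf   # then remove the export file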

 Yes. I see that the sed -i /$VOL.conf/d $CONF is placed after the 
 check_cmd_status. I shall send a fix upstream in a related bug. But dynamic 
 export removal
 will fail in three cases,
 1. nfs-ganesha is not running.

 no, it was running

 2. The export file that is particular to that volume is somehow deleted 
 before you perform the removal. It does depend on that file to get the 
 export ID.

 I tried to comment out the rm in check_cmd_status to avoid this race
 condition, but it did not solve the problem.

 3. The bug that Soumya pointed out.

 This might well be the real cause!


 If it is failing consistently, there could be something that you are
 missing. If you can send the exact sequence of steps that you have
 executed, I can help you with it.

 Yes, it’s failing consistently, unless as I said I do a DisplayExport before 
 the RemoveExport, in which case it always works.


 Ideally after exporting a particular volume, you'll see an entry in the 
 /etc/ganesha/ganesha.conf file and the export file in 
 /etc/ganesha/exports dir.

 And this works perfectly, I see them correctly.

 If you have this in place and nfs-ganesha running, then dynamic export 
 removal should work just fine.

 But this is not, at least in my case.
 The commands I'm using are just the following:

 gluster vol set volume ganesha.enable on
 gluster vol set volume ganesha.enable off

 I normally wait a few seconds between the two commands, to give ganesha
 time to actually export the volume.
 The export removal always fails as described, unless I add the
 DisplayExport in dbus-send.sh before RemoveExport.
 Many thanks for the help,

  Alessandro



 Meghana



 What’s more worrying is the problem with the dbus. Issuing a DisplayExport 
 before the RemoveExport apparently fixes the problem, so something like 
 this always works:

 # dbus-send --print-reply --system --dest=org.ganesha.nfsd 
 /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.DisplayExport 
 uint16:2
 # dbus-send --print-reply --system --dest=org.ganesha.nfsd 
 /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport 
 uint16:2

 So, it's like the DisplayExport is somehow forcing a refresh that is
 needed by the RemoveExport. Any idea why?
 I’m using the latest version of ganesha 2.2.0, i.e. 2.2.0-3.
 Thanks,

Alessandro

 PS: sorry for reporting so many things in a few days :-)




Re: [Nfs-ganesha-devel] Problems in /usr/libexec/ganesha/dbus-send.sh and ganesha dbus interface when disabling exports from gluster

2015-06-18 Thread Soumya Koduri


On 06/18/2015 07:39 PM, Malahal Naineni wrote:
 I still have not looked at the log messages, but I see the dbus thread
 waiting for the upcall thread to complete when an export is removed. Is
 there a time limit on how long the upcall thread stays blocked?

A variable called 'destroy_mode' is used to achieve that. This variable 
is set to true during unexport. The upcall thread checks this value in a 
loop before polling for events and exits when it is set to true.
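
Roughly, the loop looks like the sketch below ('destroy_mode' is the real 
variable named here; the polling helper is a placeholder, not the actual 
FSAL_GLUSTER symbol):

static bool destroy_mode;   /* set to true during unexport */

static void *upcall_thread(void *arg)
{
        while (!destroy_mode)
                poll_upcall_events(arg);  /* placeholder for the polling step */
        return NULL;  /* lets the waiting thread's pthread_join() proceed */
}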

 In the GPFS FSAL, we actually send a command (THREAD_STOP) to make the
 upcall thread return, and then wait for the upcall thread.

Thanks for the pointer. We plan to optimize our upcall mechanism by 
using callbacks/cond_variables instead of polling in the near future. We 
shall also look at reducing the number of threads dedicated to it.

Thanks,
Soumya

 Regards, Malahal.
 PS: GPFS has only one upcall thread for the entire file system image
 (not for each export).

 Alessandro De Salvo [alessandro.desa...@roma1.infn.it] wrote:
 Hi Malahal,

 On 17 Jun 2015, at 19:51, Malahal Naineni mala...@us.ibm.com
 wrote:

 Alessandro De Salvo [alessandro.desa...@roma1.infn.it] wrote:
 What’s more worrying is the problem with the dbus. Issuing a DisplayExport 
 before the RemoveExport apparently fixes the problem, so something like 
 this always works:

 # dbus-send --print-reply --system --dest=org.ganesha.nfsd 
 /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.DisplayExport 
 uint16:2
 # dbus-send --print-reply --system --dest=org.ganesha.nfsd 
 /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport 
 uint16:2

 So, it's like the DisplayExport is somehow forcing a refresh that is
 needed by the RemoveExport. Any idea why?
 I’m using the latest version of ganesha 2.2.0, i.e. 2.2.0-3.

 I used the same exact command above (the second one that removes an
 export) after restarting ganesha, and it just worked fine. I use GPFS
 FSAL (neither gluster nor VFS).

 Not sure why you need to use DisplayExport before using RemoveExport.
 Try to trace 'DBUS' component at FULL_DEBUG (or maybe everything at
 FULL_DEBUG) and post the log. The error you reported means we are NOT
 responding to the dbus message which is very odd!

 Indeed! This is why I was worried :-)
 I'm attaching the ganesha.log with NIV_FULL_DEBUG; you can see the restart
 at the end, due to my RemoveExport attempt, which always gave me the same
 error as before.
 Thanks,

  Alessandro





 Regards, Malahal.









Re: [Nfs-ganesha-devel] Request to merge patches to V2.2-stable branch

2015-06-18 Thread Soumya Koduri


On 06/18/2015 10:33 PM, Malahal Naineni wrote:
 Soumya Koduri [skod...@redhat.com] wrote:

 I thought pthread_exit() always returns a pointer which gets assigned to
 the retval of pthread_join(). I assume this is the flow --
 pthread_exit() does take a void *; since it is a void *, it is up to
 you whether to really pass some pointer or some casted integer.
 In fact, PTHREAD_CANCELED is actually ((void *) -1).

 pthread_join (thread, void **retval_join)

 pthread_exit (void *retval_exit)

 On exit,
 *retval_join = retval_exit;

 I tried a sample program as per your suggestion.

 int ret1;
 pthread_join (thread, (void **)&ret1);

 int ret2 = 1;
 pthread_exit (&ret2);

 After pthread_join, ret1 has the address of ret2 instead of its value,
 i.e., this led to
 ret1 = &ret2;

 Please correct me if I am missing something.

 Had you used pthread_exit((void*)ret2) or better pthread_exit((void*)1),
 ret1 would have the integer 1. This seems to be the standard practice.

 If you really want to pass a pointer to pthread_exit(), make sure that
 the thread that called pthread_join() will be able to access such
 memory. Too much hassle, that is why most people just cast integers if
 they just need integer returns.

 It is up to you but I feel that the code gets unnecessarily complicated
 by returning a real pointer from the upcall thread.
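
For reference, the casted-integer convention as a complete standalone 
program (an illustration, not code from ganesha; build with -pthread):

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

static void *worker(void *arg)
{
        (void)arg;
        pthread_exit((void *)(intptr_t)1);  /* cast the integer itself, not its address */
}

int main(void)
{
        pthread_t t;
        void *ret;

        pthread_create(&t, NULL, worker, NULL);
        pthread_join(t, &ret);
        printf("worker returned %ld\n", (long)(intptr_t)ret);  /* prints 1 */
        return 0;
}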


Okay. Thanks for sharing. I shall fix it as part of the improvements 
which we expect to do soon.

Thanks,
Soumya

 Regards, Malahal.




[Nfs-ganesha-devel] Request to merge patches to V2.2-stable branch

2015-06-17 Thread Soumya Koduri
Hi Kaleb/Malahal,

Request you to merge the below FSAL_GLUSTER patches into the V2.2-stable branch -

366f71c - FSAL_GLUSTER: Fixed an issue with dereferencing a NULL pointer
c4f33d6 - FSAL_GLUSTER : Improvements in acl feature
b1df525 - FSAL_GLUSTER: Stop polling upcall events if not supported

Thanks,
Soumya
