Re: [Nfs-ganesha-devel] [NFS-Ganesha-Support] Nfs-ganesha ntirpc crash

2019-10-28 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
I haven't seen anything like this before.  The problem is that the 
resarray in the compound is corrupted somehow (so it's actually a 
Ganesha bug).  Is this reproducible?  Do you have logs?  Do you know the 
workload that triggered this?


Daniel

On 10/22/19 5:57 AM, David C wrote:

Hi All

I've hit a segfault I've not seen before, seems related to ntirpc, 
please see backtrace:


(gdb) bt
#0  xdr_putenum (enumv=address 0x0>, xdrs=0x7fd831761490) at 
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/ntirpc/rpc/xdr.h:584
#1  xdr_enum (xdrs=0x7fd831761490, ep=0x0) at 
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/ntirpc/rpc/xdr_inline.h:405
#2  0x00456749 in xdr_nfs_opnum4 (objp=0x0, xdrs=0x7fd831761490) 
at /usr/src/debug/nfs-ganesha-2.7.3/include/nfsv41.h:8065
#3  xdr_nfs_resop4 (xdrs=0x7fd831761490, objp=0x0) at 
/usr/src/debug/nfs-ganesha-2.7.3/include/nfsv41.h:8433
#4  0x00458afe in xdr_array_encode (cpp=<optimized out>, 
sizep=<optimized out>, xdr_elem=0x456730 <xdr_nfs_resop4>, selem=160, 
maxsize=1024, xdrs=0x7fd831761490) at 
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/ntirpc/rpc/xdr_inline.h:851
#5  xdr_array (xdr_elem=0x456730 <xdr_nfs_resop4>, selem=160, 
maxsize=1024, sizep=<optimized out>, cpp=<optimized out>, 
xdrs=0x7fd831761490) at 
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/ntirpc/rpc/xdr_inline.h:894
#6  xdr_COMPOUND4res (xdrs=0x7fd831761490, objp=<optimized out>) at 
/usr/src/debug/nfs-ganesha-2.7.3/include/nfsv41.h:8779
#7  0x7fdc0cd0f89b in svc_vc_reply (req=0x7fd831777d30) at 
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_vc.c:887
#8  0x00451337 in nfs_rpc_process_request 
(reqdata=0x7fd831777d30) at 
/usr/src/debug/nfs-ganesha-2.7.3/MainNFSD/nfs_worker_thread.c:1384
#9  0x00450766 in nfs_rpc_decode_request (xprt=0x7fdb1c00a0d0, 
xdrs=0x7fd831f6e190) at 
/usr/src/debug/nfs-ganesha-2.7.3/MainNFSD/nfs_rpc_dispatcher_thread.c:1345
#10 0x7fdc0cd0d07d in svc_rqst_xprt_task (wpe=0x7fdb1c00a2e8) at 
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:769
#11 0x7fdc0cd0d59a in svc_rqst_epoll_events (n_events=<optimized out>, sr_rec=0x53136a0) at 
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:941
#12 svc_rqst_epoll_loop (sr_rec=<optimized out>) at 
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:1014
#13 svc_rqst_run_task (wpe=0x53136a0) at 
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:1050
#14 0x7fdc0cd15123 in work_pool_thread (arg=0x7fd86000a960) at 
/usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/work_pool.c:181
#15 0x7fdc0b2cddd5 in start_thread (arg=0x7fdaf700) at 
pthread_create.c:307
#16 0x7fdc0a444ead in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:111


Mem usage on the server was quite high at the time, so I wonder if that's 
related?


nfs-ganesha-2.7.3-0.1.el7.x86_64
nfs-ganesha-ceph-2.7.3-0.1.el7.x86_64
libcephfs2-14.2.1-0.el7.x86_64
librados2-14.2.1-0.el7.x86_64

Thanks,


___
Support mailing list -- supp...@lists.nfs-ganesha.org
To unsubscribe send an email to support-le...@lists.nfs-ganesha.org





___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] DSESS9002 and DSESS9003 test cases result in NFS4ERR_GRACE

2019-08-12 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
Jeff revamped Grace about a year ago as part of his work on clustered 
FSAL_CEPH.  This may have fixed the issue.


Looking through, the only code that doesn't directly check the grace 
state is the NFS4_OP_OPEN code.  If the client sends a CLAIM_NULL but 
has not finished its reclaim, then we'll return NFS4ERR_GRACE.  My 
quick scan doesn't show us doing anything with that field when Grace 
expires, so it's possible it hangs around...
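
For illustration only, the behaviour described above boils down to a check 
like the following standalone sketch.  The struct and helper names are 
simplified stand-ins, not the actual Ganesha code; only 
cid_reclaim_complete corresponds to a real field:

#include <stdbool.h>
#include <stdio.h>

#define NFS4_OK 0
#define NFS4ERR_GRACE 10013

struct clientid_rec {
    bool cid_reclaim_complete;  /* set once the client sends RECLAIM_COMPLETE */
};

/* A CLAIM_NULL open is an ordinary (non-reclaim) open.  The check keys off
 * the per-client flag; if nothing clears or overrides that flag when the
 * grace period ends, such opens can keep getting NFS4ERR_GRACE. */
static int check_claim_null_open(const struct clientid_rec *clid)
{
    if (!clid->cid_reclaim_complete)
        return NFS4ERR_GRACE;
    return NFS4_OK;
}

int main(void)
{
    struct clientid_rec clid = { .cid_reclaim_complete = false };

    printf("open status: %d\n", check_claim_null_open(&clid));
    return 0;
}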


Daniel

On 8/10/19 12:32 PM, Sriram Patil via Nfs-ganesha-devel wrote:

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.


Hi,

Was this fixed in some changes later? I have seen this a couple of times 
in some of our test runs.


I have seen NFS4ERR_GRACE on open even when there are no NFS server 
restarts.


Thanks,
Sriram

*From:* Frank Filz 
*Sent:* Friday, March 16, 2018 5:30 AM
*To:* 'NFS Ganesha Developers' 
*Subject:* [Nfs-ganesha-devel] DSESS9002 and DSESS9003 test cases result 
in NFS4ERR_GRACE


Does anyone know why an open well after Ganesha comes up is returning 
NFS4ERR_GRACE?


I think it has something to do with the cid_reclaim_complete in the NFS 
v4.1 clientid.


Thanks

Frank



___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel





___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] [NFS-Ganesha-Support] Error messages

2019-05-17 Thread Daniel Gryniewicz
This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
Hi, David.

Answers inline.

On Fri, May 17, 2019 at 11:42 AM David C  wrote:
>
> Hi All
>
> I recently put an nfs-ganesha CEPH_FSAL deployment into production, so far so 
> good but I'm seeing some errors in the logs I didn't see when testing and was 
> hoping someone could shed some light on what they mean. I haven't had any 
> adverse behaviour reported from the clients (apart from a potential issue 
> with slow 'ls' operations which I'm investigating).
>
> Versions:
>
> libcephfs2-13.2.2-0.el7.x86_64
> nfs-ganesha-2.7.1-0.1.el7.x86_64
> nfs-ganesha-ceph-2.7.1-0.1.el7.x86_64
> Ceph cluster is 12.2.10
>
> Log errors:
>
>> "posix2fsal_error :FSAL :INFO :Mapping 11 to ERR_FSAL_DELAY"

This is an INFO, so it's not an error.  You might not want to run a
production deployment at INFO, because it will be a bit chatty.  The
default is to run at EVENT.
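
For reference, log levels are set in the LOG block of ganesha.conf; 
something along these lines should do it (written from memory, so 
double-check the option and component names against the 
ganesha-log-config documentation for your version):

LOG
{
    # Default severity for every component (EVENT is the shipped default)
    Default_Log_Level = EVENT;

    COMPONENTS
    {
        # Individual components can be overridden if needed
        FSAL = EVENT;
    }
}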

>
>
> I'm seeing this one frequently, although it seems to spam the log with 20 or so 
> occurrences in a second.
>
>> "15/05/2019 18:27:01 : epoch 5cd99ef1 : nfsserver : 
>> ganesha.nfsd-1990[svc_1653] posix2fsal_error :FSAL :INFO :Mapping 5 to 
>> ERR_FSAL_IO, rlim_cur=1048576 rlim_max=1048576
>> 15/05/2019 18:27:01 : epoch 5cd99ef1 : nfsserver : 
>> ganesha.nfsd-1990[svc_1653] nfs4_Errno_verbose :NFS4 :CRIT :Error I/O error 
>> in nfs4_mds_putfh converted to NFS4ERR_IO but was set non-retryable"

The IO error log is there because NFS4ERR_IO is a catchall error
that's used for lots and lots of situations.  When a client sees
NFS4ERR_IO, it can be really hard to know what caused the error.  This
one has traditionally been a CRIT message.  I think that's too high; I
would put it at WARN, but that's the way it is.  Something *did* go
wrong, but not in Ganesha: there was some error in the underlying FS,
and it was returned to the client.
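
For orientation, those two log lines are the errno-to-FSAL mapping at work: 
errno 11 (EAGAIN) becomes ERR_FSAL_DELAY and errno 5 (EIO) becomes 
ERR_FSAL_IO, which then surfaces to the client as NFS4ERR_IO.  A 
trimmed-down sketch of that idea (not the actual posix2fsal_error table):

#include <errno.h>
#include <stdio.h>

enum sketch_fsal_err { SK_FSAL_NO_ERROR, SK_FSAL_DELAY, SK_FSAL_IO, SK_FSAL_OTHER };

/* Map a POSIX errno from the underlying filesystem (libcephfs here) to a
 * coarse FSAL error; simplified illustration only. */
static enum sketch_fsal_err sketch_posix2fsal_error(int posix_errno)
{
    switch (posix_errno) {
    case 0:
        return SK_FSAL_NO_ERROR;
    case EAGAIN:    /* errno 11: transient, tell the client to retry */
        return SK_FSAL_DELAY;
    case EIO:       /* errno 5: real I/O failure, becomes NFS4ERR_IO */
        return SK_FSAL_IO;
    default:
        return SK_FSAL_OTHER;
    }
}

int main(void)
{
    printf("errno 11 -> %d, errno 5 -> %d\n",
           sketch_posix2fsal_error(EAGAIN), sketch_posix2fsal_error(EIO));
    return 0;
}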

>
>
> I've only seen a few occurrences of this one
>
>> 17/05/2019 15:34:24 : epoch 5cdd9df8 : nfsserver : 
>> ganesha.nfsd-4696[svc_258] xdr_encode_nfs4_princ :ID MAPPER :INFO 
>> :nfs4_gid_to_name failed with code -2.
>> 17/05/2019 15:34:24 : epoch 5cdd9df8 : nfsserver : 
>> ganesha.nfsd-4696[svc_258] xdr_encode_nfs4_princ :ID MAPPER :INFO :Lookup 
>> for 1664 failed, using numeric group

These, again, are INFO, and so don't indicate a true failure anywhere.

>
>
> This one doesn't seem too serious; my guess is there are accounts on the 
> clients with gids/uids that the server can't look up. The server is using 
> SSSD to bind to AD, if that helps.

Correct.
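
For context, the fallback in those xdr_encode_nfs4_princ messages amounts to 
something like the standalone sketch below (not the actual Ganesha code; the 
domain string is a placeholder): try to resolve the gid to a name and, if 
the lookup fails, encode the bare numeric id instead.

#include <sys/types.h>
#include <grp.h>
#include <stdio.h>

static void gid_to_nfs4_princ(gid_t gid, char *out, size_t outlen)
{
    struct group grp, *res = NULL;
    char buf[4096];

    if (getgrgid_r(gid, &grp, buf, sizeof(buf), &res) == 0 && res != NULL)
        /* "localdomain" stands in for the configured NFSv4 domain */
        snprintf(out, outlen, "%s@localdomain", grp.gr_name);
    else
        snprintf(out, outlen, "%u", (unsigned)gid);  /* numeric fallback */
}

int main(void)
{
    char princ[256];

    gid_to_nfs4_princ(1664, princ, sizeof(princ));
    printf("principal: %s\n", princ);
    return 0;
}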

Daniel


___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] libntirpc not available anymore?

2019-03-04 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
Over the weekend, the ntirpc repo was deleted.  I've restored it, and 
adjusted permissions so it shouldn't happen again.


Daniel

On 3/4/19 4:41 AM, Malahal Naineni wrote:

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.


Very strange, you can add my remote 
(https://github.com/malahal/ntirpc.git) and fetch it in src/libntirpc. 
It should have the latest commit needed, so "git submodule update --init 
--recursive" should pass as long as you fetch the remote.


Regards, Malahal.

On Mon, Mar 4, 2019 at 1:07 AM Sriram Patil via Nfs-ganesha-devel 
> wrote:


This list has been deprecated. Please subscribe to the new devel
list at lists.nfs-ganesha.org .

Hi,


I pulled in the latest ganesha code and tried to pull ntirpc with
"git submodule update --init --recursive". It is throwing the following
error:


Cloning into 'src/libntirpc'...

Username for 'https://github.com': srirampatil

Password for 'https://srirampa...@github.com': 

remote: Repository not found.

fatal: repository 'https://github.com/nfs-ganesha/ntirpc.git/' not
found

fatal: clone of 'https://github.com/nfs-ganesha/ntirpc.git' into
submodule path 'src/libntirpc' failed


The libntirpc repo does not exist anymore in nfs-ganesha account on
github.


Thanks,

Sriram

___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net

https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel



___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel





___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] crash in opr_rbtree_insert with nfs-ganesha 2.6.3

2018-10-11 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.

This is not a known case (and, in fact, none of that code has changed in 
about a year).  The issue is that nsm_clnt has been destroyed, but is 
being used.  I don't see how this can happen, since all accesses to 
nsm_clnt are protected by nsm_mutex.


Do you have a reproducer?

Daniel

On 10/10/2018 02:01 AM, Naresh Babu wrote:

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.



We are running nfs-ganesha 2.6.3 with custom FSAL and ran into this 
crash. Is this a known issue with libntirpc 1.6.3? Please advise.


(gdb) bt
#0  0x7f56e8004510 in ?? ()
#1  0x7f5769cdb2df in opr_rbtree_insert (head=0x7f56e80016c8, 
node=0x7f56a41c6270) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/libntirpc/src/rbtree.c:271
#2  0x7f5769cd5e9f in clnt_req_setup (cc=0x7f56a41c6240, 
timeout=...) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/libntirpc/src/clnt_generic.c:515
#3  0x00498c04 in nsm_unmonitor (host=0x7f56e8002940) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/Protocols/NLM/nsm.c:219
#4  0x004cdc16 in dec_nsm_client_ref (client=0x7f56e8002940) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/SAL/nlm_owner.c:857
#5  0x004ce574 in free_nlm_client (client=0x7f56e8000e10) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/SAL/nlm_owner.c:1039
#6  0x004ce8ea in dec_nlm_client_ref (client=0x7f56e8000e10) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/SAL/nlm_owner.c:1130
#7  0x004cf0cd in free_nlm_owner (owner=0x7f56e8002a30) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/SAL/nlm_owner.c:1314
#8  0x004aff66 in free_state_owner (owner=0x7f56e8002a30) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/SAL/state_misc.c:818
#9  0x004b04f7 in dec_state_owner_ref (owner=0x7f56e8002a30) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/SAL/state_misc.c:968
#10 0x0049387b in nlm4_Unlock (args=0x7f56a41c5108, 
req=0x7f56a41c4a00, res=0x7f56a41c6070) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/Protocols/NLM/nlm_Unlock.c:127
#11 0x00455e16 in nfs_rpc_process_request 
(reqdata=0x7f56a41c4a00) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/MainNFSD/nfs_worker_thread.c:1329
#12 0x004566e4 in nfs_rpc_valid_NLM (req=0x7f56a41c4a00) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/MainNFSD/nfs_worker_thread.c:1596
#13 0x7f5769cf1329 in svc_vc_decode (req=0x7f56a41c4a00) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/libntirpc/src/svc_vc.c:815
#14 0x00449570 in nfs_rpc_decode_request (xprt=0x7f570c008230, 
xdrs=0x7f56a41c54c0) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1341
#15 0x7f5769cf123a in svc_vc_recv (xprt=0x7f570c008230) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/libntirpc/src/svc_vc.c:788
#16 0x7f5769ceda13 in svc_rqst_xprt_task (wpe=0x7f570c008448) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/libntirpc/src/svc_rqst.c:751
#17 0x7f5769cede8d in svc_rqst_epoll_events (sr_rec=0xeb40c0, 
n_events=2) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/libntirpc/src/svc_rqst.c:923
#18 0x7f5769cee12f in svc_rqst_epoll_loop (sr_rec=0xeb40c0) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/libntirpc/src/svc_rqst.c:996
#19 0x7f5769cee1e2 in svc_rqst_run_task (wpe=0xeb40c0) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/libntirpc/src/svc_rqst.c:1032
#20 0x7f5769cf72f2 in work_pool_thread (arg=0x7f56fc003cd0) at 
/home/vinay/builds/2.6/0916/clfsrepo/external/nfs/src/libntirpc/src/work_pool.c:176
#21 0x7f576a121e25 in start_thread (arg=0x7f569b1f1700) at 
pthread_create.c:308
#22 0x7f57697e334d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:113


(gdb) p head->root
$6 = (struct opr_rbtree_node *) 0x1
(gdb) p node
$7 = (struct opr_rbtree_node *) 0x7f56a41c6270
(gdb) p *node
$8 = {left = 0x6d69546e6f697461, right = 0x30312f3930223d65, parent = 
0x373120383130322f, red = 976827450, gen = 539113778}





___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel





___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Ganesha 2.6.3 Segfault

2018-10-01 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.

I'm not seeing any easy way that cmpf could be corrupted.  The structure 
before it is fairly complex, with its last element being an integer, so 
it's unlikely that something wrote off the end of that.  That leaves a 
random memory corruption, which is almost impossible to detect.


David, can you rebuild your Ganesha?  If so, can you build with the 
Address Sanitizer on?  To do this, install libasan on your distro, and 
then pass -DSANITIZE_ADDRESS=ON to cmake.  With ASAN enabled, you may 
get a crash at the time of corruption, rather than at some future point.


Daniel

On 10/01/2018 09:20 AM, Malahal Naineni wrote:

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.



Looking at the code, head->cmpf should be the "clnt_req_xid_cmpf" function 
address. Your gdb didn't show that, but I don't know how that could 
happen with the V2.6.3 code. @Dan, any insights into this issue?


On Mon, Oct 1, 2018 at 2:22 PM David C > wrote:


Hi Malahal

Result of that command:

(gdb) p head->cmpf
$1 = (opr_rbtree_cmpf_t) 0x31fb0b405ba000b7

Thanks,

On Mon, Oct 1, 2018 at 5:55 AM Malahal Naineni mailto:mala...@gmail.com>> wrote:

Looks like the head is messed up. Run these in gdb and let us
know the second command's output: 1. "frame 0"   2.
"p head->cmpf".  I believe the head->cmpf function is NULL or bad,
leading to this segfault. I haven't seen this crash before and
have never used the Ganesha 2.6 version.

Regards, Malahal.

On Mon, Oct 1, 2018 at 1:25 AM David C mailto:dcsysengin...@gmail.com>> wrote:

Hi Malahal

I've set up ABRT so I'm now getting coredumps for the
crashes. I've installed debuginfo package for nfs-ganesha
and libntirpc.

I'd be really grateful if you could give me some guidance on
debugging this.

Some info on the latest crash:

The following was echoed to the kernel log:

traps: ganesha.nfsd[28589] general protection
ip:7fcf2421dded sp:7fcd9d4d03a0 error:0 in
libntirpc.so.1.6.3[7fcf2420d000+3d000]


Last lines of output from # gdb /usr/bin/ganesha.nfsd coredump:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/ganesha.nfsd -L
/var/log/ganesha/ganesha.log -f /etc/ganesha/ganesha.c'.
Program terminated with signal 11, Segmentation fault.
#0  0x7fcf2421dded in opr_rbtree_insert
(head=head@entry=0x7fcef800c528,
node=node@entry=0x7fce68004750) at
/usr/src/debug/ntirpc-1.6.3/src/rbtree.c:271
271 switch (head->cmpf(node, parent)) {
Missing separate debuginfos, use: debuginfo-install
bzip2-libs-1.0.6-13.el7.x86_64
dbus-libs-1.10.24-7.el7.x86_64
elfutils-libelf-0.170-4.el7.x86_64
elfutils-libs-0.170-4.el7.x86_64 glibc-2.17-222.el7.x86_64
gssproxy-0.7.0-17.el7.x86_64
keyutils-libs-1.5.8-3.el7.x86_64
krb5-libs-1.15.1-19.el7.x86_64 libattr-2.4.46-13.el7.x86_64
libblkid-2.23.2-52.el7.x86_64 libcap-2.22-9.el7.x86_64
libcom_err-1.42.9-12.el7_5.x86_64
libgcc-4.8.5-28.el7_5.1.x86_64 libgcrypt-1.5.3-14.el7.x86_64
libgpg-error-1.12-3.el7.x86_64
libnfsidmap-0.25-19.el7.x86_64 libselinux-2.5-12.el7.x86_64
libuuid-2.23.2-52.el7.x86_64 lz4-1.7.5-2.el7.x86_64
pcre-8.32-17.el7.x86_64 systemd-libs-219-57.el7.x86_64
xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64

Output from bt:

(gdb) bt
#0  0x7fcf2421dded in opr_rbtree_insert
(head=head@entry=0x7fcef800c528,
node=node@entry=0x7fce68004750) at
/usr/src/debug/ntirpc-1.6.3/src/rbtree.c:271
#1  0x7fcf24218eac in clnt_req_setup
(cc=cc@entry=0x7fce68004720, timeout=...) at
/usr/src/debug/ntirpc-1.6.3/src/clnt_generic.c:515
#2  0x55d62490347f in nsm_unmonitor
(host=host@entry=0x7fce00018ea0) at
/usr/src/debug/nfs-ganesha-2.6.3/src/Protocols/NLM/nsm.c:219
#3  0x55d6249425cf in dec_nsm_client_ref
(client=0x7fce00018ea0) at
/usr/src/debug/nfs-ganesha-2.6.3/src/SAL/nlm_owner.c:857
#4  0x55d624942f61 in free_nlm_client
(client=0x7fce00017500) at
/usr/src/debug/nfs-ganesha-2.6.3/src/SAL/nlm_owner.c:1039
#5  0x55d6249431d3 in dec_nlm_client_ref
(client=0x7fce00017500) at

Re: [Nfs-ganesha-devel] 2.6.3 Health status is unhealthy

2018-09-24 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
I think this is due to the low traffic.  What that check says is that we 
got new ops enqueued (1, in this case) but no ops dequeued.  However, 
since there was only 1 op enqueued, I suspect that the issue is that no 
ops came in during the sampling period, except for one right at the end, 
which hasn't been handled yet.


Does this message keep occurring?  Or does it happen only once?
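
In outline, the heartbeat check described above compares the 
enqueue/dequeue counters between two samples and complains when requests 
arrived but none completed, roughly like this simplified sketch (not the 
actual nfs_health() code):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct health_sample {
    uint64_t enqueued;
    uint64_t dequeued;
};

/* Unhealthy only if work arrived during the interval and none of it was
 * processed; a single request landing right at the end of the sampling
 * window is enough to trip this under very light load. */
static bool healthy(const struct health_sample *old, const struct health_sample *new)
{
    uint64_t new_enq = new->enqueued - old->enqueued;
    uint64_t new_deq = new->dequeued - old->dequeued;

    return !(new_enq > 0 && new_deq == 0);
}

int main(void)
{
    /* the counters from David's log message */
    struct health_sample old = { .enqueued = 11924, .dequeued = 11924 };
    struct health_sample cur = { .enqueued = 11925, .dequeued = 11924 };

    printf("healthy: %s\n", healthy(&old, &cur) ? "yes" : "no");
    return 0;
}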

Daniel

On 09/24/2018 12:28 PM, David C wrote:

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.



Hi All

CentOS 7.5
nfs-ganesha-vfs-2.6.3-1.el7.x86_64
nfs-ganesha-2.6.3-1.el7.x86_64
libntirpc-1.6.3-1.el7.x86_64

Exporting some directories with VFS FSAL

Nfsv3 only, currently very light traffic (a few clients connecting).

After starting Ganesha the following was logged after about 12 hours:

24/09/2018 12:11:00 : epoch 5ba8165e : fsrv01:
ganesha.nfsd-22835[dbus_heartbeat] nfs_health :DBUS :WARN :Health
status is unhealthy. enq new: 11925, old: 11924; deq new: 11924,
old: 11924


Nfs access still seems fine from the clients.

Could someone point me in the direction of how to diagnose this message 
please?


Thanks,
David




___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel





___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Issue with file locks after upgrade from 2.5.4 to 2.6.2

2018-09-12 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
Oh, it looks like rpc.statd is refusing the connection because it 
doesn't think ::1 (the IPv6 localhost address) is localhost.  If so, 
this would be either a bug or a misconfiguration in rpc.statd.


Daniel

On 09/11/2018 11:31 PM, Naresh Babu wrote:
Thanks for the response, Daniel. Ping to "::ffff:10.0.0.7" works fine. 
Do you suspect anything else?


Thanks,
Naresh

On Tue, Sep 11, 2018 at 5:48 AM Daniel Gryniewicz <mailto:d...@redhat.com>> wrote:


My guess is that this is related to IPv6.  IPv6 support in 2.5 was
spotty, but that's been fixed since.  It's clearly using the 4-in-6
address (pretty common on a v6 enabled machine), and I don't believe it
could have used that in 2.5, so that seems to be the smoking gun.

Does IPv6 work on your system?  If you ping ::ffff:10.0.0.7 on that
box,
does it work?  If the problem is IPv6, you may be able to work
around it
by preferring IPv4 to IPv6.  This is done by adding this line to
/etc/gai.conf:

precedence ::ffff:0:0/96  100

Daniel

On 09/11/2018 03:42 AM, Naresh Babu wrote:
 > This list has been deprecated. Please subscribe to the new devel
list at lists.nfs-ganesha.org <http://lists.nfs-ganesha.org>.
 >
 >
 >
 > We have developed a custom FSAL on top of nfs-ganesha 2.5.4
version and
 > lock tests ran fine with that. But, after upgrading nfs-ganesha to
 > 2.6.2, lock tests are failing with the following errors:
 >
 > ganesha.nfsd-1419[svc_70] nsm_monitor :NLM :CRIT :Monitor
 > :::10.0.0.7 SM_MON failed (1)
 >
 > /var/log/messages:
 > Sep 11 07:26:27 mbclvm3 rpc.statd[1439]: SM_MON/SM_UNMON call from
 > non-local host ::1
 > Sep 11 07:26:27 mbclvm3 rpc.statd[1439]: STAT_FAIL to mbclvm3 for
SM_MON
 > of :::10.0.0.7
 >
 > $ rpcinfo -p |grep "status\|lockmgr"
 >      100024    1   udp  46991  status
 >      100024    1   tcp  33715  status
 >      100021    4   udp  45075  nlockmgr
 >      100021    4   tcp  45075  nlockmgr
 >
 >
 >
 >
 > ___
 > Nfs-ganesha-devel mailing list
 > Nfs-ganesha-devel@lists.sourceforge.net
<mailto:Nfs-ganesha-devel@lists.sourceforge.net>
 > https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
 >





___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Issue with file locks after upgrade from 2.5.4 to 2.6.2

2018-09-11 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
My guess is that this is related to IPv6.  IPv6 support in 2.5 was 
spotty, but that's been fixed since.  It's clearly using the 4-in-6 
address (pretty common on a v6 enabled machine), and I don't believe it 
could have used that in 2.5, so that seems to be the smoking gun.


Does IPv6 work on your system?  If you ping ::ffff:10.0.0.7 on that box, 
does it work?  If the problem is IPv6, you may be able to work around it 
by preferring IPv4 to IPv6.  This is done by adding this line to 
/etc/gai.conf:


precedence ::ffff:0:0/96  100

Daniel

On 09/11/2018 03:42 AM, Naresh Babu wrote:

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.



We have developed a custom FSAL on top of nfs-ganesha 2.5.4 version and 
lock tests ran fine with that. But, after upgrading nfs-ganesha to 
2.6.2, lock tests are failing with the following errors:


ganesha.nfsd-1419[svc_70] nsm_monitor :NLM :CRIT :Monitor 
:::10.0.0.7 SM_MON failed (1)


/var/log/messages:
Sep 11 07:26:27 mbclvm3 rpc.statd[1439]: SM_MON/SM_UNMON call from 
non-local host ::1
Sep 11 07:26:27 mbclvm3 rpc.statd[1439]: STAT_FAIL to mbclvm3 for SM_MON 
of :::10.0.0.7


$ rpcinfo -p |grep "status\|lockmgr"
     100024    1   udp  46991  status
     100024    1   tcp  33715  status
     100021    4   udp  45075  nlockmgr
     100021    4   tcp  45075  nlockmgr




___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel





___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] 2.5 Ganesha with extended API crash

2018-08-20 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
Looking at the code, I believe it isn't possible for fh_desc->addr to be 
null.  Here's why:


It comes from mdcache_locate_host(), which gets it from mdc_lookup(). 
This gets it from mdcache_copy_fh(), which allocates it with 
gsh_malloc(), which asserts if malloc() fails, so if that field was 
NULL, it would have asserted earlier.  There must be some other cause of 
the crash.
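
For reference, the allocation behaviour being relied on above is roughly 
the following (a sketch of the idea, not the exact gsh_malloc/mdcache 
source):

#include <stdlib.h>
#include <string.h>

/* Ganesha's gsh_malloc-style wrappers treat allocation failure as fatal
 * rather than returning NULL to the caller. */
static void *sketch_gsh_malloc(size_t size)
{
    void *p = malloc(size);

    if (p == NULL)
        abort();    /* allocation failure aborts instead of returning NULL */
    return p;
}

/* Copying a file handle with such a wrapper can therefore never leave the
 * address NULL; either the copy succeeds or the process aborts first. */
static void sketch_copy_fh(void **addr_out, size_t *len_out,
                           const void *src, size_t len)
{
    *addr_out = sketch_gsh_malloc(len);
    memcpy(*addr_out, src, len);
    *len_out = len;
}

int main(void)
{
    const char handle[] = "example-handle-bytes";
    void *addr;
    size_t len;

    sketch_copy_fh(&addr, &len, handle, sizeof(handle));
    free(addr);
    return 0;
}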


Daniel

On 08/20/2018 10:11 AM, Sagar M D wrote:

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.



Hi,

We saw one crash with the stack trace below. A core dump was not 
enabled, so I couldn't pinpoint the issue. But the crash may be for the 
reason below; it is my hunch. Has this been seen before in any FSAL 
(Ganesha version is 2.5)?


fsal_status_t create_handle(struct fsal_export *export_pub,
                            struct gsh_buffdesc *fh_desc,
                            struct fsal_obj_handle **pub_handle,
                            struct attrlist *attrs_out)


*It looks like fh_desc->addr is NULL here;* it crashed while copying the 
FSAL file handle.


16609 4217 08/19 11:34:15 2497585   CvBacktracer() - [bt]: (1) 
/lib64/libc.so.6(+0x35250) [0x7f8a35890250]
16609 4217 08/19 11:34:15 2497585   CvBacktracer() - [bt]: (2) 
/usr/lib64/ganesha/libfsaltdfs.so(copy_ganesha_fh+0) [0x7f8a32d432e4]
16609 4217 08/19 11:34:15 2497585   CvBacktracer() - [bt]: (3) 
/usr/lib64/ganesha/libfsaltdfs.so(+0x97f9) 
[0x7f8a32d437f9]   --> create_handle frame.
16609 4217 08/19 11:34:15 2497585   CvBacktracer() - [bt]: (4) 
/usr/bin/ganesha.nfsd(mdcache_locate_host+0x233) [0x543fab]
16609 4217 08/19 11:34:15 2497585   CvBacktracer() - [bt]: (5) 
/usr/bin/ganesha.nfsd(mdc_lookup+0x206) [0x54483e]
16609 4217 08/19 11:34:15 2497585   CvBacktracer() - [bt]: (6) 
/usr/bin/ganesha.nfsd() [0x53770d]
16609 4217 08/19 11:34:15 2497585   CvBacktracer() - [bt]: (7) 
/usr/bin/ganesha.nfsd(fsal_lookupp+0xed) [0x4319bd]
16609 4217 08/19 11:34:15 2497585   CvBacktracer() - [bt]: (8) 
/usr/bin/ganesha.nfsd(nfs3_readdirplus+0x5ec) [0x494650]
16609 4217 08/19 11:34:15 2497585   CvBacktracer() - [bt]: (9) 
/usr/bin/ganesha.nfsd(nfs_rpc_execute+0x1d53) [0x44c20d]
16609 4217 08/19 11:34:15 2497585   CvBacktracer() - [bt]: (10) 
/usr/bin/ganesha.nfsd() [0x44ca17]
16609 4217 08/19 11:34:15 2497585   CvBacktracer() - [bt]: (11) 
/usr/bin/ganesha.nfsd() [0x508a7a]
16609 4217 08/19 11:34:15 2497585   CvBacktracer() - [bt]: (12) 
/lib64/libpthread.so.0(+0x7dc5) [0x7f8a36290dc5]
16609 4217 08/19 11:34:15 2497585   CvBacktracer() - [bt]: (13) 
/lib64/libc.so.6(clone+0x6d) [0x7f8a3595273d]


Thanks,
Sagar.


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot



___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel




--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Connection to V4 port gets closed soon after it is established.

2018-08-01 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
So, the issue appears to be that the xprt is created, but no packets are 
ready to be read on it, so the end of the epoll loop cleans up unused 
xprts.  Since the new xprt has not been used, its last receive time is 
0, so it is timed out and destroyed.  Something like this should fix it. 
 Can you test?


https://github.com/dang/ntirpc/commit/a78510eccf136ad9072ce0fa2bcc914f975d73d5
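
In outline, the problem and the shape of the fix are something like this 
simplified sketch (illustrative only, not the actual ntirpc patch; the 
timeout constant is a placeholder):

#include <stdbool.h>
#include <time.h>

struct sketch_xprt {
    time_t last_recv;   /* 0 until the first request is received */
};

#define IDLE_TIMEOUT 300    /* seconds; placeholder value */

/* buggy: a fresh xprt with last_recv == 0 always looks expired, so it is
 * destroyed before its first request is ever read */
static bool should_destroy_buggy(const struct sketch_xprt *xp, time_t now)
{
    return (now - xp->last_recv) > IDLE_TIMEOUT;
}

/* fixed (one possible shape): never-used transports are not treated as
 * idle; equivalently, the creation time could be stamped into last_recv */
static bool should_destroy_fixed(const struct sketch_xprt *xp, time_t now)
{
    if (xp->last_recv == 0)
        return false;
    return (now - xp->last_recv) > IDLE_TIMEOUT;
}

int main(void)
{
    struct sketch_xprt fresh = { .last_recv = 0 };
    time_t now = time(NULL);

    /* the buggy check destroys the brand-new xprt, the fixed one keeps it */
    return (should_destroy_buggy(&fresh, now) &&
            !should_destroy_fixed(&fresh, now)) ? 0 : 1;
}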

On 07/31/2018 07:51 PM, Pradeep wrote:

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.



Hi Bill,

I'm seeing a case where the client establishes a connection and sends a V4 
NULL request. But before ganesha processes it, the socket is closed by 
TIRPC. Here are the TIRPC logs I captured. FD 38, xprt 
0x7fbcd5b71400, is what gets destroyed soon after creation.
Any idea why? I'm using ganesha version 2.6.5 and TIRPC 1.6.1. I can 
see the same in a tcpdump. This seems to be happening with kerberos, 
where rpc.gssd from the client talks to ganesha for authentication.


31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_16] rpc :TIRPC :F_DBG :makefd_xprt() 
0x7fbcd5b71400 fd 38 xp_refs 1 af 0 port 4294967295 @ makefd_xprt:347
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_16] rpc :TIRPC :F_DBG :svc_vc_rendezvous() 
0x7fbcd5b71400 fd 38 xp_refs 1 af 10 port 44988 @ svc_vc_rendezvous:469
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_16] rpc :TIRPC :F_DBG :svc_ref_it() 0x7fbce2455800 
fd 15 xp_refs 4 af 0 port 4294967295 @ svc_vc_rendezvous:500
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_16] nfs_rpc_dispatch_tcp_NFS :DISP :F_DBG :NFS TCP 
request on SVCXPRT 0x7fbcd5b71400 fd 38
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_16] rpc :TIRPC :F_DBG :svc_rqst_evchan_reg:648 locking
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_16] rpc :TIRPC :F_DBG :svc_rqst_hook_events: 
0x7fbcd5b71400 fd 38 xp_refs 1 sr_rec 0x7fbce24eeb10 evchan 1 refcnt 4 
epoll_fd 27 control fd pair (25:26) hook
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_16] rpc :TIRPC :F_DBG :ev_sig: fd 25 sig 0
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_16] rpc :TIRPC :F_DBG :svc_rqst_evchan_reg:671 
unlocking @svc_rqst_evchan_reg:648
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_16] rpc :TIRPC :F_DBG :svc_release_it() 
0x7fbce2455800 fd 15 xp_refs 3 af 0 port 4294967295 @ svc_rqst_xprt_task:754
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_17] rpc :TIRPC :F_DBG :svc_rqst_epoll_event: fd 26 
wakeup (sr_rec 0x7fbce24eeb10)
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_16] rpc :TIRPC :F_DBG :svc_destroy_it() 
0x7fbcd5b71400 fd 38 xp_refs 1 af 10 port 44988 @ svc_rqst_clean_func:781
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_17] rpc :TIRPC :F_DBG :svc_rqst_epoll_event: fd 26 
after consume sig (sr_rec 0x7fbce24eeb10)
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_16] rpc :TIRPC :F_DBG 
:svc_rqst_xprt_unregister:723 locking
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_17] rpc :TIRPC :F_DBG :svc_rqst_epoll_loop: 
epoll_fd 27 before epoll_wait (29000)
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_16] rpc :TIRPC :F_DBG :svc_rqst_unhook_events: 
0x7fbcd5b71400 fd 38 xp_refs 0 sr_rec 0x7fbce24eeb10 evchan 1 refcnt 4 
epoll_fd 27 control fd pair (25:26) unhook
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_16] rpc :TIRPC :F_DBG 
:svc_rqst_xprt_unregister:733 unlocking @svc_rqst_xprt_unregister:723
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_16] rpc :TIRPC :F_DBG :svc_vc_destroy_it() 
0x7fbcd5b71400 fd 38 xp_refs 0 should actually destroy things @ 
svc_rqst_clean_func:781
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_16] rpc :TIRPC :F_DBG :work_pool_thread() svc_16 
waiting
31/07/2018 12:17:16 : epoch 5b5fa0dc : testvm : 
nfs-ganesha-11564[svc_22] rpc :TIRPC :F_DBG :work_pool_thread() svc_22 
task 0x7fbcd5b71618



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot



___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel




--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list

Re: [Nfs-ganesha-devel] Umask Syntax

2018-07-20 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
It looks like umask is just an integer.  As such, it should accept 
octal, hex, or decimal input.  There is residual code 
indicating that this once had a special parser, but none of that is 
left, and it's just parsed as an integer.
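
To make that concrete, here is a generic illustration (not Ganesha's config 
parser) of what base auto-detection means for the Umask value: a leading 0 
is octal, 0x is hex, anything else is decimal.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const char *inputs[] = { "0002", "0022", "18", "0x12" };

    for (size_t i = 0; i < sizeof(inputs) / sizeof(inputs[0]); i++) {
        /* base 0 lets strtoul pick octal/hex/decimal from the prefix */
        unsigned long mask = strtoul(inputs[i], NULL, 0);

        printf("Umask = %-5s -> %04lo (octal)\n", inputs[i], mask);
    }
    return 0;
}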


Daniel

On 07/20/2018 09:51 AM, Chris Dos wrote:

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
I'm trying to set the Umask config option for VFS.  Man page shows:
umask(mode, range 0 to 0777, default 0)

So I don't know what the mode should be.

My export looks like this:
EXPORT
{
 # Export Id (mandatory, each EXPORT must have a unique Export_Id)
 Export_Id = 10;

 # NFS Protocols
 Protocols = 3,4;
 #Protocols = 3;

 # Transport
 Transports = UDP,TCP;

 # Transfer Size
 MaxRead = 65536;
 MaxWrite = 65536;
 PrefRead = 65536;
 PrefWrite = 65536;

 # Exported path (mandatory)
 Path = /netshares/10gig_storage;

 # Pseudo Path (required for NFS v4)
 Pseudo = /export/10gig_storage;

 # Required for access (default is None)
 # Could use CLIENT blocks instead
 Access_Type = RW;
 Squash = All;
 Anonymous_Uid = 6000;
 Anonymous_Gid = 6000;

 # Exporting FSAL
 FSAL {
 Name = VFS;
 Umask = 0002;
 }

 CLIENT {
 Clients = 172.28.133.0/24;
 }
}


Also, can the init script still be included with the Debian packages, so that
Devuan and other sysvinit users can use the packages without having to find
the Debian 8 init script?

Chris

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel




--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] File create failing, after mounting from Mac OS

2018-07-13 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
It's trying to set the FATTR4_ARCHIVE attribute, which is a deprecated 
attribute that Ganesha doesn't support.  Since Ganesha doesn't advertise 
support for it, the client should not be setting it.  (Note that the 
Linux kernel NFS server also doesn't support this attribute.)


Daniel

On 07/13/2018 10:19 AM, Suresh kosuru wrote:

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.



Hi,

I mounted an NFSv4 file system on Mac OS. If I try to create a file after 
mounting, it fails with the error below in the nfs-ganesha logs. Can someone 
please help me debug the issue:


*nfs-ganesha logs:*

13/07/2018 T01:48:31.386718-0700 28560[:::172.30.9.55] [work-4] 3837 
:nfs4_Fattr_Supported_Bitmap :nfs4_Fattr_Supported  ==> 
FATTR4_ARCHIVE supported flag=1 |


13/07/2018 T01:48:31.386722-0700 28560[:::172.30.9.55] [work-4] 3837 
:nfs4_Fattr_Supported_Bitmap :nfs4_Fattr_Supported  ==> 
FATTR4_HIDDEN supported flag=1 |


13/07/2018 T01:48:31.386726-0700 28560[:::172.30.9.55] [work-4] 3837 
:nfs4_Fattr_Supported_Bitmap :nfs4_Fattr_Supported  ==> 
FATTR4_MODE supported flag=1 |


13/07/2018 T01:48:31.386731-0700 28560[:::172.30.9.55] [work-4] 4099 
:*Fattr4_To_FSAL_attr :Attr not supported 14 name=FATTR4_ARCHIVE*


13/07/2018 T01:48:31.386735-0700 28560[:::172.30.9.55] [work-4] 1342 
:nfs4_op_open :*general failure*


13/07/2018 T01:48:31.386739-0700 28560[:::172.30.9.55] [work-4] 1448 
:nfs4_op_open :*failed with status NFS4ERR_ATTRNOTSUPP*


13/07/2018 T01:48:31.386746-0700 28560[:::172.30.9.55] [work-4] 631 
:Copy_nfs4_state_req :OPEN: saving response 0x7fd8900024b0 so_seqid 0 
new seqid 21



*Errors from Mac Terminal:*


      skosurus-MacBook-Pro:tmp skosuru$

      skosurus-MacBook-Pro:tmp skosuru$ mount

      /dev/disk1 on / (hfs, local, journaled)

      devfs on /dev (devfs, local, nobrowse)

      map -hosts on /net (autofs, nosuid, automounted, nobrowse)

      map auto_home on /home (autofs, automounted, nobrowse)

      /dev/disk2s2 on /Volumes/Kyocera OS X 10.5+ Web build 2014.10.20 
(hfs,          local, nodev, nosuid, read-only, noowners, quarantine, 
mounted by skosuru)


      /dev/disk3s1 on /Volumes/Slack.app (hfs, local, nodev, nosuid, 
read-only, noowners, quarantine, mounted by skosuru)


      10.10.108.220:/mapr on /Users/skosuru/Desktop/1234 (nfs, nodev, 
nosuid, mounted by skosuru)


      skosurus-MacBook-Pro:tmp skosuru$ pwd

      /Users/skosuru/Desktop/1234/my.cluster.com/tmp 



*skosurus-MacBook-Pro:tmp skosuru$ touch 123456789*

*     touch: 123456789: Unknown error: 10032*

      skosurus-MacBook-Pro:tmp skosuru$ cd ..

      skosurus-MacBook-Pro:my.cluster.com  
skosuru$ ls -lrt


      total 2

      drwxr-xr-x  2 nobody  nobody  0 Jul 13 12:20 apps

      drwxr-xr-x  3 nobody  nobody  1 Jul 13 12:20 var

      drwxr-xr-x  2 nobody  nobody  0 Jul 13 12:21 user

      drwxr-xr-x  2 nobody  nobody  0 Jul 13 12:21 opt

      drwxrwxrwx  2 nobody  nobody  4 Jul 13 12:37 tmp

      skosurus-MacBook-Pro:my.cluster.com  
skosuru$ cd ..


      skosurus-MacBook-Pro:1234 skosuru$ ls -lrt

      total 1

      drwxrwxrwx  7 nobody  nobody  6 Jul 13 12:21 my.cluster.com 



      skosurus-MacBook-Pro:1234 skosuru$


Thanks & Regards

Suresh Kosuru



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot



___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel




--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] How to disable MDCACHE FSAL

2018-07-06 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
Hi.

The short answer is that you can't.  MDCACHE provides the handle cache, 
which ensures that obj_handle pointers are stable across calls, and 
implements refcounting.  Without it, Ganesha would crash immediately. 
If you really want to, you can implement stable handle caching in your 
FSAL, and then make a local modification to Ganesha to not stack, but 
then you'll have to handle all the refcounting issues yourself.


Daniel

On 07/06/2018 11:35 AM, Tushar Shinde wrote:

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
Hi,

How can I disable the default stacking of mdcache on the V2.6-stable branch?

Tushar.

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel




--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] cb_program or cb_callback_ident always the same

2018-06-15 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
I think you should just use the client_id.  If the client doesn't 
distinguish between them, you can't, and probably shouldn't.
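
One way to act on that advice, purely as a sketch: if the backend insists 
on a pid-like number, derive a stable token from the clientid plus the 
opaque owner bytes (the concatenation that, as noted in the earlier reply, 
is what is actually unique).  All names below are hypothetical, and FNV-1a 
is used only as a convenient dependency-free hash:

#include <stdint.h>
#include <stdio.h>

/* 32-bit FNV-1a over an arbitrary byte buffer, chained via the seed */
static uint32_t fnv1a(const void *data, size_t len, uint32_t seed)
{
    const unsigned char *p = data;
    uint32_t h = seed;

    while (len--) {
        h ^= *p++;
        h *= 16777619u;
    }
    return h;
}

/* Stable per-lock-owner token: hash the clientid, then fold in the opaque
 * owner bytes supplied by the client. */
static uint32_t owner_token(uint64_t clientid, const void *owner, size_t owner_len)
{
    uint32_t h = fnv1a(&clientid, sizeof(clientid), 2166136261u);

    return fnv1a(owner, owner_len, h);
}

int main(void)
{
    const char owner[] = "open owner bytes from the client";
    uint32_t token = owner_token(0x5b5fa0dc00000001ULL, owner, sizeof(owner) - 1);

    printf("pseudo-pid token: %u\n", token);
    return 0;
}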


Daniel

On 06/15/2018 09:36 AM, Tuan Viet Nguyen wrote:

Hi Daniel,

Thanks for your prompt reply. I'm trying to implement the lock_op2 function; 
we have a FS supporting lock owners based on the host IP & pid of the 
program on the host (and also the file & ranges). That's why, while 
converting from the Ganesha lock to the FSAL lock, I need to find something 
to simulate the PID. Can you advise?


Thank you

On Fri, Jun 15, 2018 at 3:09 PM, Daniel Gryniewicz <mailto:d...@redhat.com>> wrote:


I don't believe there's any necessity for a client to send different
client_id's for different processes, as long as it can tell locally
which lock is which.  So a server cannot depend on these being
different to do things.

What exactly are you trying to achieve here?  What's the problem
being solved?

Daniel

On 06/15/2018 05:24 AM, Tuan Viet Nguyen wrote:

Hi Daniel,

Thank you for your reply. I've also tried with the client_id but
it also has the same value for 2 different processes. So if the
client_id and the opaque always have the same value (for 2
different processes), how can we distinguish the client?

I've tried with this field

so_owner.so_nfs4_owner.so_clientid

Thank you.
Viet

On Mon, Apr 30, 2018 at 2:38 PM, Daniel Gryniewicz
mailto:d...@redhat.com>
<mailto:d...@redhat.com <mailto:d...@redhat.com>>> wrote:

     This list has been deprecated. Please subscribe to the new
devel
     list at lists.nfs-ganesha.org
<http://lists.nfs-ganesha.org> <http://lists.nfs-ganesha.org>.
     Hi.

     The client program ID in a lock owner is an opaque.  That
is, it's
     not defined in the spec, and the server can't use it for
anything
     other than a byte string.  The concatenation of the
client-ID and
     the opaque part of the lock owner is unique, but the opaque
part of
     the lock owner itself is not.

     That value only has meaning to the client.

     Daniel

     On 04/30/2018 08:18 AM, Tuan Viet Nguyen wrote:

         This list has been deprecated. Please subscribe to the
new devel
         list at lists.nfs-ganesha.org
<http://lists.nfs-ganesha.org> <http://lists.nfs-ganesha.org>.



         Hello,

         While trying to get more information related to the
lock owner,
         I'm trying to get the client program id and realize that it
         always takes the same value (easy to do with a test program
         forking another process, parent lock a file range then
the child
         locks another range). Is it something similar to the client
         process id that is stored in the client record
structure? or any
         other suggestions?

         Thank you



--

         Check out the vibrant tech community on one of the
world's most
         engaging tech sites, Slashdot.org! http://sdm.link/slashdot



         ___
         Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
<mailto:Nfs-ganesha-devel@lists.sourceforge.net>
         <mailto:Nfs-ganesha-devel@lists.sourceforge.net
<mailto:Nfs-ganesha-devel@lists.sourceforge.net>>
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
<https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel>

<https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel

<https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel>>




--

     Check out the vibrant tech community on one of the world's most
     engaging tech sites, Slashdot.org! http://sdm.link/slashdot
     ___
     Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
<mailto:Nfs-ganesha-devel@lists.sourceforge.net>
     <mailto:Nfs-ganesha-devel@lists.sourceforge.net
<mailto:Nfs-ganesha-devel@lists.sourceforge.net>>
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
<https://lists.sourceforge.net/lists/listinfo/nfs

Re: [Nfs-ganesha-devel] cb_program or cb_callback_ident always the same

2018-06-15 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
I don't believe there's any necessity for a client to send different 
client_id's for different processes, as long as it can tell locally 
which lock is which.  So a server cannot depend on these being different 
to do things.


What exactly are you trying to achieve here?  What's the problem being 
solved?


Daniel

On 06/15/2018 05:24 AM, Tuan Viet Nguyen wrote:

Hi Daniel,

Thank you for your reply. I've also tried with the client_id but it also 
has the same value for 2 different processes. So if the client_id and 
the opaque always have the same value (for 2 different processes), how 
can we distinguish the client?


I've tried with this field

so_owner.so_nfs4_owner.so_clientid

Thank you.
Viet

On Mon, Apr 30, 2018 at 2:38 PM, Daniel Gryniewicz <mailto:d...@redhat.com>> wrote:


This list has been deprecated. Please subscribe to the new devel
list at lists.nfs-ganesha.org <http://lists.nfs-ganesha.org>.
Hi.

The client program ID in a lock owner is an opaque.  That is, it's
not defined in the spec, and the server can't use it for anything
other than a byte string.  The concatenation of the client-ID and
the opaque part of the lock owner is unique, but the opaque part of
the lock owner itself is not.

That value only has meaning to the client.

Daniel

On 04/30/2018 08:18 AM, Tuan Viet Nguyen wrote:

This list has been deprecated. Please subscribe to the new devel
list at lists.nfs-ganesha.org <http://lists.nfs-ganesha.org>.



Hello,

While trying to get more information related to the lock owner,
I'm trying to get the client program id and realize that it
always takes the same value (easy to do with a test program
forking another process, parent lock a file range then the child
locks another range). Is it something similar to the client
process id that is stored in the client record structure? or any
other suggestions?

Thank you



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot



___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
<mailto:Nfs-ganesha-devel@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
<https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel>




--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
<mailto:Nfs-ganesha-devel@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
<https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel>





--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] ACE permission check

2018-05-29 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
If you take a look at code protected by ENABLE_VFS_DEBUG_ACL, you can 
see a test implementation of ACL enforcement I added to VFS a few years 
ago.  It should be as complete as can be done using mode.  It may have 
bit-rotted a bit by now.


Daniel

On 05/27/2018 07:40 AM, Sagar M D wrote:

Frank/Daniel,

Creds are correct.  The ACL implementation in our FSAL is new; basic RWX 
permission is enforced in our FSAL on this path.
Currently ACLs are not enforced in our filesystem. We are relying on 
ganesha to enforce ACLs.
Can our FSAL make a call to fsal_test_access instead (wherever we need to 
enforce)?


P.S.: Our filesystem is accessed either through nfs-ganesha or our own 
nfs server (only one will be active at a given time).

Thanks,
Sagar.


On Fri, May 25, 2018 at 10:14 PM, Frank Filz > wrote:


This list has been deprecated. Please subscribe to the new devel
list at lists.nfs-ganesha.org .
> This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-
> ganesha.org .
> So, the access check is, of course, advisory to the client.  It doesn't 
have to
> make one at all, but can just issue the rename, and expect it to succeed 
or fail
> based on permissions.  I'm not sure why the client does an access and 
then still
> does a rename, but it ultimately doesn't matter, I think.
> 
> We don't do an extra access check in the rename path, because it could race

> with a permissions change anyway.  Instead, we rely on the FSAL's
> rename() call to properly enforce permissions.  This is the way many 
calls work in
> the FSAL API, to avoid those races.
> 
> Does your rename() call not enforce permissions?  Or did it somehow succeed in

> spite of that?  Were the wrong creds passed in?

Yea, while Ganesha does some permission checking itself, it MUST
depend on the FSAL and its underlying filesystem to permission check
any directory operations. This is due to the issues of making a
permission check atomic with the operation.

What kind of ACLs does your FSAL and filesystem implement? Does the
filesystem have mechanisms to enforce the ACL?

Do you set/pass the user credentials for the operations? See what
FSAL_VFS and FSAL_GLUSTER do for examples.

Frank

> > By looking at nfs-Ganesha code, permission check (ACL) happens
> > access_check.c. Our FSAL (not in tree FSAL), storing and serving the
> > ACLs to Ganesha.
> >
> > I see an issue with rename:
> > Even though i set deny ACE for "delete child" on folder1 for user1.
> > user1 is able to rename file belongs to user2.
> >
> > I see below RPC:-
> > ACCESS request folder1
> > ACCESS denied (as expected.) (denied for DELETE_CHILD permission)
> > Rename request Rename succeed
> >
> > I'm not sure why client is sending rename even after receiving  ACCESS
> > Denied.
> >
> > Native nfs denies rename though.
> >
> > Any help is appreciated here.
> >
> > Thanks,
> > Sagar.
> >
> >
> >
> >
> > --
 > >  Check out the vibrant tech community on one of the world's
 > > most engaging tech sites, Slashdot.org! http://sdm.link/slashdot
 > >
 > >
 > >
 > > ___
 > > Nfs-ganesha-devel mailing list
 > > Nfs-ganesha-devel@lists.sourceforge.net

 > > https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel

 > >
 >
 >
 >

--
 > Check out the vibrant tech community on one of the world's most
engaging tech
 > sites, Slashdot.org! http://sdm.link/slashdot
 > ___
 > Nfs-ganesha-devel mailing list
 > Nfs-ganesha-devel@lists.sourceforge.net

 > https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel




--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net

https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel

Re: [Nfs-ganesha-devel] ACE permission check

2018-05-25 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
So, the access check is, of course, advisory to the client.  It doesn't 
have to make one at all, but can just issue the rename, and expect it to 
succeed or fail based on permissions.  I'm not sure why the client does 
an access and then still does a rename, but it ultimately doesn't 
matter, I think.


We don't do an extra access check in the rename path, because it could 
race with a permissions change anyway.  Instead, we rely on the FSAL's 
rename() call to properly enforce permissions.  This is the way many 
calls work in the FSAL API, to avoid those races.
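
As a sketch of what that means for an out-of-tree FSAL: the rename entry 
point itself evaluates the stored ACL under the operation's credentials and 
fails with an access error, instead of trusting an earlier ACCESS reply. 
Everything below is hypothetical placeholder code, not the real FSAL API:

#include <stdbool.h>
#include <stdio.h>

enum sketch_status { SKETCH_OK, SKETCH_ERR_ACCESS };

struct sketch_creds { unsigned uid, gid; };
struct sketch_dir { int id; };

/* stand-in for evaluating the stored ACL under the operation's credentials */
static bool acl_allows_delete_child(const struct sketch_dir *dir,
                                    const struct sketch_creds *creds)
{
    (void)dir;
    return creds->uid == 0;  /* toy policy: only root may remove entries */
}

static enum sketch_status sketch_rename(const struct sketch_creds *op_creds,
                                        struct sketch_dir *olddir, const char *oldname,
                                        struct sketch_dir *newdir, const char *newname)
{
    /* enforce here, at the time of the operation, not from a cached ACCESS */
    if (!acl_allows_delete_child(olddir, op_creds) ||
        !acl_allows_delete_child(newdir, op_creds))
        return SKETCH_ERR_ACCESS;

    printf("rename %s -> %s permitted\n", oldname, newname);
    return SKETCH_OK;  /* a real FSAL would now call its backend rename */
}

int main(void)
{
    struct sketch_creds user1 = { .uid = 1001, .gid = 1001 };
    struct sketch_dir folder1 = { .id = 1 };

    return sketch_rename(&user1, &folder1, "file-of-user2",
                         &folder1, "renamed") == SKETCH_ERR_ACCESS ? 0 : 1;
}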


Does your rename() call not enforce permissions?  Or did it somehow 
succeed in spite of that?  Were the wrong creds passed in?


Daniel

On 05/25/2018 07:36 AM, Sagar M D wrote:

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.



Hi,

Looking at the nfs-ganesha code, the permission check (ACL) happens in 
access_check.c. Our FSAL (not an in-tree FSAL) stores and serves the 
ACLs to Ganesha.


I see an issue with rename:
Even though I set a deny ACE for "delete child" on folder1 for user1, 
user1 is able to rename a file belonging to user2.


I see below RPC:-
ACCESS request folder1
ACCESS denied (as expected.) (denied for DELETE_CHILD permission)
Rename request
Rename succeed

I'm not sure why the client is sending rename even after receiving ACCESS 
Denied.


Native nfs denies rename though.

Any help is appreciated here.

Thanks,
Sagar.




--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot



___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel




--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Selinux denials with version 2.6.1-0.1.el7

2018-05-15 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
Hi, Muminul

Ganesha needed selinux policy added to allow it access to its log and 
recovery directories.  This was added in Fedora, but I don't know if it 
was added in Centos, or in what versions.


I suspect you're going to have to add exceptions for Ganesha to access 
/var/log/ganesha and /var/lib/nfs/ganesha (and maybe /etc/ganesha) to 
allow it to run when selinux is enforcing.


Daniel

On 05/11/2018 02:47 PM, Muminul Islam Russell wrote:

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
Hello All,

I am using nfs-ganesha version 2.6.1-0.1.el7 to mount with VFS and
CEPH. But I am unable to mount the FS with SELinux enabled. Mount works
fine with SELinux disabled.

I can see hundreds of SELinux denials in the audit log. Below are some
of the lines.

type=AVC msg=audit(1525811393.227:335): avc:  denied  {
dac_read_search } for  pid=2819 comm="master" capability=2
scontext=system_u:system_r:postfix_master_t:s0
tcontext=system_u:system_r:postfix_master_t:s0 tclass=capability
permissive=0

type=AVC msg=audit(1525808830.798:237): avc:  denied  { open } for
pid=3516 comm="ganesha.nfsd" path="/var/log/ganesha/ganesha.log"
dev="dm-0" ino=840795 scontext=system_u:system_r:ganesha_t:s0
tcontext=system_u:object_r:var_log_t:s0 tclass=file permissive=0


Could anyone help me resolve this issue?

Thanks,
Muminul

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel




--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] cb_program or cb_callback_ident always the same

2018-04-30 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
Hi.

The client program ID in a lock owner is an opaque.  That is, it's not 
defined in the spec, and the server can't use it for anything other than 
a byte string.  The concatenation of the client-ID and the opaque part 
of the lock owner is unique, but the opaque part of the lock owner 
itself is not.


That value only has meaning to the client.

Daniel

On 04/30/2018 08:18 AM, Tuan Viet Nguyen wrote:

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.



Hello,

While trying to get more information related to the lock owner, I'm 
trying to get the client program id and realize that it always takes the 
same value (easy to do with a test program forking another process, 
parent lock a file range then the child locks another range). Is it 
something similar to the client process id that is stored in the client 
record structure? or any other suggestions?


Thank you


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot



___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel




--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Is the mailing list dead? Or just quiet?

2018-04-20 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
On 04/20/2018 09:40 AM, TomK wrote:

On 4/20/2018 9:27 AM, Frank Filz wrote:
This list has been deprecated. Please subscribe to the new devel list 
at lists.nfs-ganesha.org.

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
All I've gotten this week has been gerrit notifications.  Is everyone 
just quiet?  Or did something happen to the list (or my subscription)?


I think it's just been quiet.

Frank


-- 


Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Not everyone flipped over I'm guessing.

However, I get this message when trying to send:

An error occurred while sending mail. The mail server responded:
Requested action not taken: mailbox unavailable
invalid DNS MX or A/ resource record.
  Please check the message recipient "de...@nfs-ganesha.org" and try again.



My fault, I typo'd the new list.  It's de...@lists.nfs-ganesha.org

Daniel

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


[Nfs-ganesha-devel] Is the mailing list dead? Or just quiet?

2018-04-20 Thread Daniel Gryniewicz

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
All I've gotten this week has been gerrit notifications.  Is everyone 
just quiet?  Or did something happen to the list (or my subscription)?


Daniel

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Ganesh 2.3 : NFSv4 client gets error NFS4ERR_OLD_STATEID

2018-04-09 Thread Daniel Gryniewicz
So, NFS4ERR_OLD_STATEID can only happen in one circumstance: when the 
State presented by the client doesn't match the State that Ganesha 
expects.  In this case, it's the sequence number that's off-by-one. 
This could be the result of a replay, but the code checks for this, and 
the owners must be different in this case.
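
For reference, the seqid comparison behind NFS4ERR_OLD_STATEID is roughly the 
following sketch (illustrative only, not the actual nfs4_Check_Stateid code; 
the error values are the ones defined by RFC 7530):

#include <stdint.h>

#define NFS4_OK             0
#define NFS4ERR_OLD_STATEID 10024   /* values per RFC 7530 */
#define NFS4ERR_BAD_STATEID 10025

/* presented = seqid in the stateid sent by the client,
 * current   = seqid of the state the server holds for that "other" value */
static int check_seqid(uint32_t presented, uint32_t current)
{
    if (presented < current)
        return NFS4ERR_OLD_STATEID;  /* stale (e.g. off-by-one) stateid */
    if (presented > current)
        return NFS4ERR_BAD_STATEID;  /* client is ahead of the server */
    return NFS4_OK;
}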


So, what seems to have happened is that the old state was destroyed, and 
a new owner got a new state, and then the client presented the old state 
again.


I'm not an expert in this code, so maybe I'm missing something?

(Note, the code is the same in 2.7, so no bugs have been fixed in this code)

Daniel

On 04/09/2018 05:09 AM, Sachin Punadikar wrote:

Hi All,
As reported by a customer, an NFSv4 client that is left with its session open 
overnight gets nothing back for the command "pwd". The customer is rebooting 
the client the next morning to get around this issue.
Is this expected to happen? If not, is it related to any configuration 
setting? A Ganesha bug, or a client bug?


This is related to Ganesha 2.3. The related logs indicate 
error NFS4ERR_OLD_STATEID. Below is the FULL_DEBUG log:

-
2018-03-13 14:37:28 : epoch 00050087 : nfs02 : 
ganesha.nfsd-10335[work-77] nfs4_Compound :NFS4 :DEBUG :Request 1: 
opcode 25 is OP_READ
2018-03-13 14:37:28 : epoch 00050087 : nfs02 : 
ganesha.nfsd-10335[work-77] nfs4_Compound :NFS4 :M_DBG :NFS4: MID DEBUG: 
Check export perms export = 00f0 req = 0010
2018-03-13 14:37:28 : epoch 00050087 : nfs02 : 
ganesha.nfsd-10335[work-77] nfs4_Is_Fh_Invalid :FH :F_DBG :NFS4 Handle 
(53:0x43000800303a00010028000a8026ca59d277870202d7b3f900f6a7110ade460100)
2018-03-13 14:37:28 : epoch 00050087 : nfs02 : 
ganesha.nfsd-10335[work-77] nfs4_Is_Fh_Invalid :FH :F_DBG :NFS4 Handle 
0x0 export id 8
2018-03-13 14:37:28 : epoch 00050087 : nfs02 : 
ganesha.nfsd-10335[work-77] nfs4_Check_Stateid :STATE :F_DBG :Check READ 
stateid flags ALL_0 ALL_1 CURRENT
2018-03-13 14:37:28 : epoch 00050087 : nfs02 : 
ganesha.nfsd-10335[work-77] state_id_value_hash_func :STATE :F_DBG :val = 4
2018-03-13 14:37:28 : epoch 00050087 : nfs02 : 
ganesha.nfsd-10335[work-77] state_id_rbt_hash_func :STATE :F_DBG :rbt = 
328053
2018-03-13 14:37:28 : epoch 00050087 : nfs02 : 
ganesha.nfsd-10335[work-77] hashtable_getlatch :RW LOCK :F_DBG :Got 
write lock on 0xaf11a8 (&(ht->partitions[index].lock)) at 
/home/ppsbld/ttn423.171127.111356/ttn423.ganesha-rpmdir/BUILD/nfs-ganesha-2.3.2-ibm53-0.1.1-Source/hashtable/hashtable.c:477
2018-03-13 14:37:28 : epoch 00050087 : nfs02 : 
ganesha.nfsd-10335[work-77] key_locate :HT CACHE :F_DBG :hash hit index 
4 slot 383
2018-03-13 14:37:28 : epoch 00050087 : nfs02 : 
ganesha.nfsd-10335[work-77] compare_state_id :STATE :F_DBG 
:{OTHER=0x010087000500f201 {{CLIENTID Epoch=0x00050087 
Counter=0x0001} StateIdCounter=0x01f2}} vs 
{OTHER=0x010087000500f201 {{CLIENTID Epoch=0x00050087 
Counter=0x0001} StateIdCounter=0x01f2}}
2018-03-13 14:37:28 : epoch 00050087 : nfs02 : 
ganesha.nfsd-10335[work-77] display_stateid :RW LOCK :F_DBG :Acquired 
mutex 0x7f33fc002330 (>state_mutex) at 
/home/ppsbld/ttn423.171127.111356/ttn423.ganesha-rpmdir/BUILD/nfs-ganesha-2.3.2-ibm53-0.1.1-Source/SAL/nfs4_state_id.c:190
2018-03-13 14:37:28 : epoch 00050087 : nfs02 : 
ganesha.nfsd-10335[work-77] display_stateid :RW LOCK :F_DBG :Released 
mutex 0x7f33fc002330 (>state_mutex) at 
/home/ppsbld/ttn423.171127.111356/ttn423.ganesha-rpmdir/BUILD/nfs-ganesha-2.3.2-ibm53-0.1.1-Source/SAL/nfs4_state_id.c:192
2018-03-13 14:37:28 : epoch 00050087 : nfs02 : 
ganesha.nfsd-10335[work-77] hashtable_getlatch :STATE :F_DBG :Get (null) 
returning Value=0x7f33fc002300 {STATE 0x7f33fc002300 
OTHER=0x010087000500f201 {{CLIENTID Epoch=0x00050087 
Counter=0x0001} StateIdCounter=0x01f2} entry=0x7f343c001280 
type=SHARE seqid=2 owner={STATE_OPEN_OWNER_NFSV4 0x7f34200012d0: 
clientid={0x7f3428000cb0 ClientID={Epoch=0x00050087 Counter=0x0001} 
CONFIRMED Client={0x7f3428000bb0 name=(44:Linux NFSv4.0 x.x.x.x/x.x.x.x 
tcp) refcount=1} t_delta=0 reservations=0 refcount=13 cb_prog=1073741824 
r_addr=x.x.x.x.x.x r_netid=tcp} 
owner=(24:0x6f70656e2069643a002e00028f32d4178918) 
confirmed=1 seqid=470159 refcount=37} refccount=1}
2018-03-13 14:37:28 : epoch 00050087 : nfs02 : 
ganesha.nfsd-10335[work-77] hashtable_releaselatched :RW LOCK :F_DBG 
:Unlocked 0xaf11a8 (>partitions[latch->index].lock) at 
/home/ppsbld/ttn423.171127.111356/ttn423.ganesha-rpmdir/BUILD/nfs-ganesha-2.3.2-ibm53-0.1.1-Source/hashtable/hashtable.c:543
2018-03-13 14:37:28 : epoch 00050087 : nfs02 : 
ganesha.nfsd-10335[work-77] get_state_entry_export_owner_refs :RW LOCK 
:F_DBG :Acquired mutex 0x7f33fc002330 (>state_mutex) at 
/home/ppsbld/ttn423.171127.111356/ttn423.ganesha-rpmdir/BUILD/nfs-ganesha-2.3.2-ibm53-0.1.1-Source/SAL/nfs4_state.c:538
2018-03-13 14:37:28 : epoch 00050087 : nfs02 : 
ganesha.nfsd-10335[work-77] get_state_entry_export_owner_refs 

Re: [Nfs-ganesha-devel] mdcache growing beyond limits.

2018-04-04 Thread Daniel Gryniewicz
Okay, thanks.  That confirms to me that we need to do something else. 
I'll start to look into this ASAP.


Daniel

On 04/04/2018 12:37 PM, Pradeep wrote:

Hi Daniel,

I tried increasing lanes to 1023. The usage looks better, but still over 
the limit:


$2 = {entries_hiwat = 10, entries_used = 299838, chunks_hiwat = 
10, chunks_used = 1235, fds_system_imposed = 1048576,
   fds_hard_limit = 1038090, fds_hiwat = 943718, fds_lowat = 524288, 
futility = 0, per_lane_work = 50, biggest_window = 419430,

   prev_fd_count = 39434, prev_time = 1522775283, caching_fds = true}

I'm trying to simulate a build workload by running the SpecFS SWBUILD 
workload. This is with Ganesha 2.7 and FSAL_VFS. The server has 
4 CPUs / 12 GB memory.


For build 8 (40 processes), the latency increased from 5ms (with 17 
lanes) to 22 ms (with 1023 lanes) and the test failed to achieve 
required IOPs.


Thanks,
Pradeep

On Tue, Apr 3, 2018 at 7:58 AM, Pradeep <pradeeptho...@gmail.com 
<mailto:pradeeptho...@gmail.com>> wrote:


Hi Daniel,

Sure I will try that.

One thing I tried is to not allocate new entries and return
NFS4ERR_DELAY in the hope that the increased refcnt at LRU is
temporary. This worked for some time; but then I hit a case where I
saw all the entries at the LRU end of L1 had a refcnt of 2 and the
subsequent entries had a refcnt of 1. All L2's were empty. I realized
that whenever a new entry is created, the refcnt is 2 and it is put at
the LRU. Also, promotions from L2 move entries to the LRU of L1. So it is
likely that many threads may end up finding no entries at the LRU and end
up allocating new entries.

Then I tried another experiment: Invoke lru_wake_thread() when the
number of entries is greater than entries_hiwat; but still allocate a
new entry for the current thread. This worked. I had to make a change
in lru_run() to allow demotion in case of 'entries > entries_hiwat' in
addition to max FD check. The side effect would be that it will close
FDs and demote to L2. Almost all of these FDs are opened in the
context of setattr/getattr; so attributes are already in cache and FDs
are probably useless until the cache expires.  I think your idea of
moving further down the lane may be a better approach.

I will try your suggestion next. With 1023 lanes, it is unlikely that
all lanes will have an active entry.

Thanks,
Pradeep

On 4/3/18, Daniel Gryniewicz <d...@redhat.com
<mailto:d...@redhat.com>> wrote:
 > So, the way this is supposed to work is that getting a ref when
the ref
 > is 1 is always an LRU_REQ_INITIAL ref, so that moves it to the
MRU.  At
 > that point, further refs don't move it around in the queue, just
 > increment the refcount.  This should be the case, because
 > mdcache_new_entry() and mdcache_find_keyed() both get an INITIAL ref,
 > and all other refs require you to already have a pointer to the entry
 > (and therefore a ref).
 >
 > Can you try something, since you have a reproducer?  It seems
that, with
 > 1.7 million files, 17 lanes may be a bit low.  Can you try with
 > something ridiculously large, like 1023, and see if that makes a
 > difference?
 >
 > I suspect we'll have to add logic to move further down the lanes if
 > futility hits.
 >
 > Daniel
 >
 > On 04/02/2018 12:30 PM, Pradeep wrote:
 >> We discussed this a while ago. I'm running into this again with
2.6.0.
 >> Here is a snapshot of the lru_state (I set the max entries to 10):
 >>
 >> {entries_hiwat = 20, entries_used = 1772870, chunks_hiwat =
10,
 >> chunks_used = 16371, lru_reap_l1 = 8116842,
 >>    lru_reap_l2 = 1637334, lru_reap_failed = 1637334,
attr_from_cache =
 >> 31917512, attr_from_cache_for_client = 5975849,
 >>    fds_system_imposed = 1048576, fds_hard_limit = 1038090,
fds_hiwat =
 >> 943718, fds_lowat = 524288, futility = 0, per_lane_work = 50,
 >>    biggest_window = 419430, prev_fd_count = 0, prev_time =
1522647830,
 >> caching_fds = true}
 >>
 >> As you can see it has grown well beyond the limlt set (1.7
million vs
 >> 200K max size). lru_reap_failed indicates number of times the reap
 >> failed from L1 and L2.
 >> I'm wondering what can cause the reap to fail once it reaches a
steady
 >> state. It appears to me that the entry at LRU (head of the queue) is
 >> actually being used (refcnt > 1) and there are entries in the
queue with
 >> refcnt == 1. But those are not being looked at. My understanding
is that
 >> if an entry is accessed, it must move to MRU (tail of the
queue). Any
 >> idea why the entry at LRU can have a refcnt > 1?

Re: [Nfs-ganesha-devel] mdcache growing beyond limits.

2018-04-03 Thread Daniel Gryniewicz
So, the way this is supposed to work is that getting a ref when the ref 
is 1 is always an LRU_REQ_INITIAL ref, so that moves it to the MRU.  At 
that point, further refs don't move it around in the queue, just 
increment the refcount.  This should be the case, because 
mdcache_new_entry() and mdcache_find_keyed() both get an INITIAL ref, 
and all other refs require you to already have a pointer to the entry 
(and therefore a ref).
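
A toy sketch of that behaviour (names, lanes and locking are simplified; this 
is not the real mdcache_lru code):

#include <pthread.h>
#include <stdint.h>

#define LRU_REQ_INITIAL 0x01   /* ref taken by mdcache_new_entry()/find_keyed() */

struct toy_entry {
    int32_t           refcnt;
    struct toy_entry *prev, *next;
};

struct toy_lane {
    pthread_mutex_t   qlock;
    struct toy_entry *lru;   /* head: reap candidates */
    struct toy_entry *mru;   /* tail: recently used */
};

static void toy_ref(struct toy_lane *q, struct toy_entry *e, uint32_t flags)
{
    pthread_mutex_lock(&q->qlock);
    e->refcnt++;
    if ((flags & LRU_REQ_INITIAL) && e != q->mru) {
        /* An INITIAL ref promotes the entry to the MRU end. */
        if (e->prev)
            e->prev->next = e->next;
        else
            q->lru = e->next;
        if (e->next)
            e->next->prev = e->prev;
        e->prev = q->mru;
        e->next = NULL;
        if (q->mru)
            q->mru->next = e;
        q->mru = e;
        if (!q->lru)
            q->lru = e;
    }
    /* Any other ref just bumps refcnt; the entry stays where it is. */
    pthread_mutex_unlock(&q->qlock);
}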


Can you try something, since you have a reproducer?  It seems that, with 
1.7 million files, 17 lanes may be a bit low.  Can you try with 
something ridiculously large, like 1023, and see if that makes a difference?


I suspect we'll have to add logic to move further down the lanes if 
futility hits.


Daniel

On 04/02/2018 12:30 PM, Pradeep wrote:
We discussed this a while ago. I'm running into this again with 2.6.0. 
Here is a snapshot of the lru_state (I set the max entries to 10):


{entries_hiwat = 20, entries_used = 1772870, chunks_hiwat = 10, 
chunks_used = 16371, lru_reap_l1 = 8116842,
   lru_reap_l2 = 1637334, lru_reap_failed = 1637334, attr_from_cache = 
31917512, attr_from_cache_for_client = 5975849,
   fds_system_imposed = 1048576, fds_hard_limit = 1038090, fds_hiwat = 
943718, fds_lowat = 524288, futility = 0, per_lane_work = 50,
   biggest_window = 419430, prev_fd_count = 0, prev_time = 1522647830, 
caching_fds = true}


As you can see it has grown well beyond the limit set (1.7 million vs 
the 200K max size). lru_reap_failed indicates the number of times the reap 
failed from L1 and L2.
I'm wondering what can cause the reap to fail once it reaches a steady 
state. It appears to me that the entry at LRU (head of the queue) is 
actually being used (refcnt > 1) and there are entries in the queue with 
refcnt == 1. But those are not being looked at. My understanding is that 
if an entry is accessed, it must move to MRU (tail of the queue). Any 
idea why the entry at LRU can have a refcnt > 1?


This can happen if the refcnt is incremented without the QLOCK while 
lru_reap_impl() is called at the same time from another thread; in that case 
it will skip the first entry and return NULL. This was done 
in _mdcache_lru_ref(), which could cause the refcnt on the head of the 
queue to be incremented while some other thread looks at it holding the 
QLOCK. I tried moving the increment/dequeue in _mdcache_lru_ref() inside the 
QLOCK, but that did not help.


Also if "get_ref()" is called for the entry at the LRU for some reason, 
it will just increment refcnt and return. I think the assumption is that 
by the time "get_ref() is called, the entry is supposed to be out of LRU.
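
Roughly, the head-only reap I am describing looks like this toy sketch 
(illustrative only; much simplified from what lru_reap_impl() actually does):

#include <pthread.h>
#include <stddef.h>

struct reap_entry { int refcnt; struct reap_entry *next; };
struct reap_lane  { pthread_mutex_t qlock; struct reap_entry *lru; };

static struct reap_entry *try_reap(struct reap_lane *lane)
{
    struct reap_entry *e;

    pthread_mutex_lock(&lane->qlock);
    e = lane->lru;
    /* Only the head of the lane is considered.  If another thread holds a
     * ref on it (refcnt > 1), the reap gives up, even if entries further
     * down the lane are idle with refcnt == 1, and the caller allocates a
     * brand new entry, letting the cache grow. */
    if (e != NULL && e->refcnt == 1) {
        e->refcnt++;            /* claim it for reuse */
        lane->lru = e->next;    /* dequeue */
        pthread_mutex_unlock(&lane->qlock);
        return e;
    }
    pthread_mutex_unlock(&lane->qlock);
    return NULL;
}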



Thanks,
Pradeep



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] NFSv4 recovery directory path

2018-04-03 Thread Daniel Gryniewicz
If you can handle PID and Cache being moved, you can override 
SYSSTATEDIR on the cmake line.  Other than that, not currently.


There's no good reason why it's not configurable, just that no one has 
needed it before.


Daniel

On 04/02/2018 05:33 AM, Sriram Patil wrote:

Hi,

Currently the NFSv4 recovery directory root path is as good as hard 
coded because it depends on the CMAKE_INSTALL_PREFIX. It can be found 
in config.h after running cmake.  I tried specifying 
-DNFS_V4_RECOV_ROOT= when compiling, but it does not work because 
config.h overwrites the variable. Writing a new recovery backend seems 
like too much of a hassle for changing just the path.


Is there any way (may be a conf param I am missing) to change the 
recovery root directory?


Thanks,

Sriram



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot



___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel




--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] nss_getpwnam: name 't...@my.dom@localdomain' does not map into domain 'nix.my.dom'

2018-03-23 Thread Daniel Gryniewicz

On 03/23/2018 09:58 AM, William Allen Simpson wrote:

On 3/23/18 7:59 AM, Daniel Gryniewicz wrote:

Thanks, Tomk.  PR is here: https://review.gerrithub.io/404945



Actually, it seems fairly elegant.

ntirpc and rdma also have the USE_ and _USE_ convention.  Both
require libraries, and would benefit from defaults with
enforcement checking for the cmake parameter line.

How hard would it be to convert?  Or would you prefer waiting
until these FSALs are pulled, and then try next week?


NTIRPC is a special case, since it's sometimes a submodule and sometimes 
a library from the system.  For this case, we have USE_SYSTEM_NTIRPC, 
which will always fail if the package isn't installed.


I skipped RDMA this time, since it actually has 2 triggers: USE_NFS_RDMA 
and USE_9P_RDMA.  I have a list of cleanup/fixes to work on after this 
bit drops, and RDMA is on that list to look at.


Daniel

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] nss_getpwnam: name 't...@my.dom@localdomain' does not map into domain 'nix.my.dom'

2018-03-23 Thread Daniel Gryniewicz

Thanks, Tomk.  PR is here: https://review.gerrithub.io/404945

Daniel

On 03/22/2018 05:39 PM, TomK wrote:

On 3/22/2018 12:50 PM, Daniel Gryniewicz wrote:
A side note, happy to test for you guy's once you have this done.

No, we can't.  I'm working on a set of macros that make this work (as 
far as I can tell) and aren't *too* ugly.  Hopefully, this will work out.


Daniel

On 03/22/2018 12:13 PM, Malahal Naineni wrote:
That could be a reason why I thought we need two symbols for a 
feature. For example, USE_GPFS_FSAL could be used at cmake command 
line and GPFS_FSAL could be used in the option().


Can't we use this option() inside conditionals?

regards, malahal.

On Thu, Mar 22, 2018 at 8:43 PM, Daniel Gryniewicz <d...@redhat.com 
<mailto:d...@redhat.com>> wrote:


    I don't think this works because of option().  This defines the
    value to it's default (OFF if no default is given), so the value is
    always defined.  We can skip using option, but this will break
    anyone using any tools to automate their cmake.

    What we need is for option() to distinguish between using the
    default value and having it configured on.

    I can play with this a bit and see if I can get something to work,
    but it will be ugly, since cmake doesn't natively support this.

    Daniel

    On 03/22/2018 10:50 AM, Malahal Naineni wrote:

    Here is what I wanted. Let us say there is a compilation feature
    called USE_FSAL_GPFS. I want these possibilities:

    1. If I enable this feature at the "cmake command line", enable
    this. If it can't be enabled due to missing packages, then
    please fail cmake!
    2. If I disable this feature at the "cmake command line", please
    disable it. This is easy.
    3. If I neither enable nor disable at the cmake command line,
    then it can be auto-enabled if sufficient packages are 
installed.


    I am not sure if the following works for what I am thinking of
    (I added braces for clarity):

    if (DEFINED USE_FSAL_GPFS) {
            if (USE_FSAL_GPFS) {
                  case A: admin wants it. Check headers and libs
    (or packages). If it can't be enabled, fail.
           else () {
                  case B:  admin doesn't want it
           }
    else () {# not defined by the admin
           case C: We want to enable this feature if required
    packages are installed.
           case D:  We don't care, just disable
    }

    I don't know if DEFINED keyword works the way I want it though.
    Note that case A is the only one that fails here.

    Regards, Malahal.







    On Thu, Mar 22, 2018 at 5:33 PM, Daniel Gryniewicz
    <d...@redhat.com <mailto:d...@redhat.com>
    <mailto:d...@redhat.com <mailto:d...@redhat.com>>> wrote:

     So, there is an option STRICT_PACKAGE that is supposed to
    enable
     this. It's not fully utilized, but it's mostly there.

     The problem is that we can't tell whether the default is
    being used
     (lots of options are on by default but disable themselves
    if the
     packages aren't installed) or if the user explicitly turned
    them on.
     CMake doesn't seem to give us that information, that I've
    found.     So, instead, we have STRICT_PACKAGE, and you'll have
    to explicitly
     turn off everything that's on by default but that you don't
    want.

     If you know of a better way of doing this, then I'm happy
    to listen
     and help implement it.

     Daniel

     On 03/22/2018 12:28 AM, Malahal Naineni wrote:

         If I specify an option on the cmake command line, I
    would like
         it to be honoured, if not, simply fail.
    Today,  cmake only gives
         a warning if it can't meet my option's requirements.
    Can some
         cmake guru fix this first?

             On Tue, Mar 20, 2018 at 8:38 PM, Daniel Gryniewicz
         <d...@redhat.com <mailto:d...@redhat.com>
    <mailto:d...@redhat.com <mailto:d...@redhat.com>>
         <mailto:d...@redhat.com <mailto:d...@redhat.com>
    <mailto:d...@redhat.com <mailto:d...@redhat.com>>>> wrote:

              It's probably a good idea to add the build options
    to --version
              output, or something.  That way we can ask for it
    in these
         types of
              situations.  I've added a card to the wishlist for
    this.

              Daniel

              On Tue, Mar 20, 2018 at 9:39 AM, TomK
    <tomk...@mdevsys.com <mailto:tomk...@mdevsys.com>
         <mailto:tomk..

Re: [Nfs-ganesha-devel] nss_getpwnam: name 't...@my.dom@localdomain' does not map into domain 'nix.my.dom'

2018-03-22 Thread Daniel Gryniewicz
No, we can't.  I'm working on a set of macros that make this work (as 
far as I can tell) and aren't *too* ugly.  Hopefully, this will work out.


Daniel

On 03/22/2018 12:13 PM, Malahal Naineni wrote:
That could be a reason why I thought we need two symbols for a feature. 
For example, USE_GPFS_FSAL could be used at cmake command line and 
GPFS_FSAL could be used in the option().


Can't we use this option() inside conditionals?

regards, malahal.

On Thu, Mar 22, 2018 at 8:43 PM, Daniel Gryniewicz <d...@redhat.com 
<mailto:d...@redhat.com>> wrote:


I don't think this works because of option().  This defines the
value to it's default (OFF if no default is given), so the value is
always defined.  We can skip using option, but this will break
anyone using any tools to automate their cmake.

What we need is for option() to distinguish between using the
default value and having it configured on.

I can play with this a bit and see if I can get something to work,
but it will be ugly, since cmake doesn't natively support this.

Daniel

On 03/22/2018 10:50 AM, Malahal Naineni wrote:

Here is what I wanted. Let us say there is a compilation feature
called USE_FSAL_GPFS. I want these possibilities:

1. If I enable this feature at the "cmake command line", enable
this. If it can't be enabled due to missing packages, then
please fail cmake!
2. If I disable this feature at the "cmake command line", please
disable it. This is easy.
3. If I neither enable nor disable at the cmake command line,
then it can be auto-enabled if sufficient packages are installed.

I am not sure if the following works for what I am thinking of
(I added braces for clarity):

if (DEFINED USE_FSAL_GPFS) {
            if (USE_FSAL_GPFS) {
                  case A: admin wants it. Check headers and libs
(or packages). If it can't be enabled, fail.
           else () {
                  case B:  admin doesn't want it
           }
else () {# not defined by the admin
           case C: We want to enable this feature if required
packages are installed.
           case D:  We don't care, just disable
}

I don't know if DEFINED keyword works the way I want it though.
Note that case A is the only one that fails here.

Regards, Malahal.







On Thu, Mar 22, 2018 at 5:33 PM, Daniel Gryniewicz
<d...@redhat.com <mailto:d...@redhat.com>
<mailto:d...@redhat.com <mailto:d...@redhat.com>>> wrote:

     So, there is an option STRICT_PACKAGE that is supposed to
enable
     this. It's not fully utilized, but it's mostly there.

     The problem is that we can't tell whether the default is
being used
     (lots of options are on by default but disable themselves
if the
     packages aren't installed) or if the user explicitly turned
them on.
     CMake doesn't seem to give us that information, that I've
found.     So, instead, we have STRICT_PACKAGE, and you'll have
to explicitly
     turn off everything that's on by default but that you don't
want.

     If you know of a better way of doing this, then I'm happy
to listen
     and help implement it.

     Daniel

     On 03/22/2018 12:28 AM, Malahal Naineni wrote:

         If I specify an option on the cmake command line, I
would like
         it to be honoured, if not, simply fail.
Today,  cmake only gives
         a warning if it can't meet my option's requirements.
Can some
         cmake guru fix this first?

         On Tue, Mar 20, 2018 at 8:38 PM, Daniel Gryniewicz
         <d...@redhat.com <mailto:d...@redhat.com>
<mailto:d...@redhat.com <mailto:d...@redhat.com>>
         <mailto:d...@redhat.com <mailto:d...@redhat.com>
<mailto:d...@redhat.com <mailto:d...@redhat.com>>>> wrote:

              It's probably a good idea to add the build options
to --version
              output, or something.  That way we can ask for it
in these
         types of
              situations.  I've added a card to the wishlist for
this.

              Daniel

              On Tue, Mar 20, 2018 at 9:39 AM, TomK
<tomk...@mdevsys.com <mailto:tomk...@mdevsys.com>
         <mailto:tomk...@mdevsys.com <mailto:tomk...@mdevsys.com>>
              <mailto:tomk...@mdevsys.com
<mailto:tomk...@mdevsys.com> <mailto:tomk...@mdevsys.com
<mailto:tomk...@mdevsys.com>>>>

Re: [Nfs-ganesha-devel] nss_getpwnam: name 't...@my.dom@localdomain' does not map into domain 'nix.my.dom'

2018-03-22 Thread Daniel Gryniewicz
I don't think this works because of option().  This defines the value to 
its default (OFF if no default is given), so the value is always 
defined.  We can skip using option(), but this will break anyone using any 
tools to automate their cmake.


What we need is for option() to distinguish between using the default 
value and having it configured on.


I can play with this a bit and see if I can get something to work, but 
it will be ugly, since cmake doesn't natively support this.


Daniel

On 03/22/2018 10:50 AM, Malahal Naineni wrote:
Here is what I wanted. Let us say there is a compilation feature called 
USE_FSAL_GPFS. I want these possibilities:


1. If I enable this feature at the "cmake command line", enable this. If 
it can't be enabled due to missing packages, then please fail cmake!
2. If I disable this feature at the "cmake command line", please disable 
it. This is easy.
3. If I neither enable nor disable at the cmake command line, then it 
can be auto-enabled if sufficient packages are installed.


I am not sure if the following works for what I am thinking of (I added 
braces for clarity):


if (DEFINED USE_FSAL_GPFS) {
           if (USE_FSAL_GPFS) {
                 case A: admin wants it. Check headers and libs (or 
packages). If it can't be enabled, fail.

          else () {
                 case B:  admin doesn't want it
          }
else () {# not defined by the admin
          case C: We want to enable this feature if required packages 
are installed.

          case D:  We don't care, just disable
}

I don't know if DEFINED keyword works the way I want it though. Note 
that case A is the only one that fails here.


Regards, Malahal.







On Thu, Mar 22, 2018 at 5:33 PM, Daniel Gryniewicz <d...@redhat.com 
<mailto:d...@redhat.com>> wrote:


So, there is an option STRICT_PACKAGE that is supposed to enable
this. It's not fully utilized, but it's mostly there.

The problem is that we can't tell whether the default is being used
(lots of options are on by default but disable themselves if the
packages aren't installed) or if the user explicitly turned them on.
CMake doesn't seem to give us that information, that I've found. 
So, instead, we have STRICT_PACKAGE, and you'll have to explicitly

turn off everything that's on by default but that you don't want.

If you know of a better way of doing this, then I'm happy to listen
and help implement it.

Daniel

On 03/22/2018 12:28 AM, Malahal Naineni wrote:

If I specify an option on the cmake command line, I would like
it to be honoured, if not, simply fail. Today,  cmake only gives
a warning if it can't meet my option's requirements. Can some
cmake guru fix this first?

On Tue, Mar 20, 2018 at 8:38 PM, Daniel Gryniewicz
<d...@redhat.com <mailto:d...@redhat.com>
<mailto:d...@redhat.com <mailto:d...@redhat.com>>> wrote:

     It's probably a good idea to add the build options to --version
     output, or something.  That way we can ask for it in these
types of
     situations.  I've added a card to the wishlist for this.

     Daniel

     On Tue, Mar 20, 2018 at 9:39 AM, TomK <tomk...@mdevsys.com
<mailto:tomk...@mdevsys.com>
     <mailto:tomk...@mdevsys.com <mailto:tomk...@mdevsys.com>>>
wrote:
      > On 3/19/2018 9:54 AM, Frank Filz wrote:
      >>>
      >>> Solved.
      >>>
      >>> Here's the solution in case it can help someone else.
      >>>
      >>> To get a certain feature in NFS Ganesha, I had to
compile the V2.6
      >>> release from source.  When configuring to compile, idmapd
     support got
      >>> disabled since packages were missing:
      >>>
      >>> libnfsidmap-devel-0.25-17.el7.x86_64
      >>>
      >>> Installed the above package and recompiled with nfsidmap
     support enabled
      >>> and this issue went away.  Users now show up properly
off the
     NFS mount
      >>> on clients.
      >>
      >>
      >> Oh, well that was a simple fix :-)
      >>
      >> I wonder if we could make changes in our cmake files to
make it
     easier to
      >> see when stuff got left out due to missing packages?
I've been
     caught out
      >> myself.
      >>
      >> Frank
      >>
      > Yep, sure was an easy fix.
 

Re: [Nfs-ganesha-devel] nss_getpwnam: name 't...@my.dom@localdomain' does not map into domain 'nix.my.dom'

2018-03-22 Thread Daniel Gryniewicz
So, there is an option STRICT_PACKAGE that is supposed to enable this. 
It's not fully utilized, but it's mostly there.


The problem is that we can't tell whether the default is being used 
(lots of options are on by default but disable themselves if the 
packages aren't installed) or if the user explicitly turned them on. 
CMake doesn't seem to give us that information, as far as I've found.  So, 
instead, we have STRICT_PACKAGE, and you'll have to explicitly turn off 
everything that's on by default but that you don't want.


If you know of a better way of doing this, then I'm happy to listen and 
help implement it.


Daniel

On 03/22/2018 12:28 AM, Malahal Naineni wrote:
If I specify an option on the cmake command line, I would like it to be 
honoured, if not, simply fail. Today,  cmake only gives a warning if it 
can't meet my option's requirements. Can some cmake guru fix this first?


On Tue, Mar 20, 2018 at 8:38 PM, Daniel Gryniewicz <d...@redhat.com 
<mailto:d...@redhat.com>> wrote:


It's probably a good idea to add the build options to --version
output, or something.  That way we can ask for it in these types of
situations.  I've added a card to the wishlist for this.

Daniel

On Tue, Mar 20, 2018 at 9:39 AM, TomK <tomk...@mdevsys.com
<mailto:tomk...@mdevsys.com>> wrote:
 > On 3/19/2018 9:54 AM, Frank Filz wrote:
 >>>
 >>> Solved.
 >>>
 >>> Here's the solution in case it can help someone else.
 >>>
 >>> To get a certain feature in NFS Ganesha, I had to compile the V2.6
 >>> release from source.  When configuring to compile, idmapd
support got
 >>> disabled since packages were missing:
 >>>
 >>> libnfsidmap-devel-0.25-17.el7.x86_64
 >>>
 >>> Installed the above package and recompiled with nfsidmap
support enabled
 >>> and this issue went away.  Users now show up properly off the
NFS mount
 >>> on clients.
 >>
 >>
 >> Oh, well that was a simple fix :-)
 >>
 >> I wonder if we could make changes in our cmake files to make it
easier to
 >> see when stuff got left out due to missing packages? I've been
caught out
 >> myself.
 >>
 >> Frank
 >>
 > Yep, sure was an easy fix.
 >
 > Wouldn't mind seeing that.  Maybe even a way to find out what
options went
 > into compiling packages for each distro.
 >
 >
 > --
 > Cheers,
 > Tom K.
 >

-
 >
 > Living on earth is expensive, but it includes a free trip around
the sun.
 >





--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] nss_getpwnam: name 't...@my.dom@localdomain' does not map into domain 'nix.my.dom'

2018-03-20 Thread Daniel Gryniewicz
It's probably a good idea to add the build options to --version
output, or something.  That way we can ask for it in these types of
situations.  I've added a card to the wishlist for this.

Daniel

On Tue, Mar 20, 2018 at 9:39 AM, TomK  wrote:
> On 3/19/2018 9:54 AM, Frank Filz wrote:
>>>
>>> Solved.
>>>
>>> Here's the solution in case it can help someone else.
>>>
>>> To get a certain feature in NFS Ganesha, I had to compile the V2.6
>>> release from source.  When configuring to compile, idmapd support got
>>> disabled since packages were missing:
>>>
>>> libnfsidmap-devel-0.25-17.el7.x86_64
>>>
>>> Installed the above package and recompiled with nfsidmap support enabled
>>> and this issue went away.  Users now show up properly off the NFS mount
>>> on clients.
>>
>>
>> Oh, well that was a simple fix :-)
>>
>> I wonder if we could make changes in our cmake files to make it easier to
>> see when stuff got left out due to missing packages? I've been caught out
>> myself.
>>
>> Frank
>>
> Yep, sure was an easy fix.
>
> Wouldn't mind seeing that.  Maybe even a way to find out what options went
> into compiling packages for each distro.
>
>
> --
> Cheers,
> Tom K.
> -
>
> Living on earth is expensive, but it includes a free trip around the sun.
>

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


[Nfs-ganesha-devel] Release V2.6.1

2018-03-20 Thread Daniel Gryniewicz
I'd like to announce the release of the latest stable version of Ganesha: 2.6.1

This is a bugfix release, and includes the following commits:

e2664ee61 (HEAD -> V2.6-stable, tag: V2.6.1) V2.6.1
cf0b642d0 Pullup to ntirpc 1.6.2
1a6daaf34 PROXY: add sample config file
9f4ef2c96 MDCACHE - Initialize dirent structs in entry early
e8aae805e systemd: use sighup, not dbus to reload config
739ebc85d RADOS_KV: do copy in rados_kv_get before releasing read op
fb5f859ba build: compile conf_lex.c with _GNU_SOURCE to get caddr_t definition
88803a26d specfile: fix libcephfs-devel and librgw-devel BuildRequires
1f3f98bb1 doc: fix typo in ganesha-config manpage

Daniel

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Pullup NTIRPC through #124

2018-03-20 Thread Daniel Gryniewicz
Yeah, except for drastic changes, we cannot take much away from
variations in Jenkins runs, I think.  Even if we know what hardware
it's running on, we can't know how loaded it is.

Daniel

On Tue, Mar 20, 2018 at 4:59 AM, Niels de Vos  wrote:
> On Tue, Mar 20, 2018 at 08:59:21AM +0100, Dominique Martinet wrote:
>> Hi,
>>
>> Girjesh Rajoria wrote on Mon, Mar 19, 2018 at 10:47:09PM +0530:
>> >> + tail -21 ../dbenchTestLog.txt
>> >>
>> >>  Operation                 Count    AvgLat    MaxLat
>> >>  ----------------------------------------------------
>> >>  Deltree                     102     9.799    27.590
>> >>  Flush                    284316     1.637   203.259
>> >>  Close                   2979801     0.007     0.330
>> >>  LockX                     13208     0.007     0.079
>> >>  Mkdir                        51     0.011     0.059
>> >>  Rename                   171774     0.073     0.463
>> >>  ReadX                   6358865     0.010    38.319
>> >>  WriteX                  2022375     0.048    40.888
>> >>  Unlink                   819204     0.090    38.363
>> >>  UnlockX                   13208     0.006     0.063
>> >>  FIND_FIRST              1421549     0.044    38.320
>> >>  SET_FILE_INFORMATION     330438     0.024     0.310
>> >>  QUERY_FILE_INFORMATION   644319     0.004     0.242
>> >>  QUERY_PATH_INFORMATION  3676827     0.015    40.851
>> >>  QUERY_FS_INFORMATION     674193     0.010    37.783
>> >>  NTCreateX               4056560     0.049   122.097
>> >>
>> >>
>> >> Where are the iozone results from ../ioZoneLog.txt?
>> >
>> > The iozone suite doesn't give output results the way dbench does. So the iozone
>> > test checks for successful completion and prints a success message in the log.
>> > In cases where the test fails, it'll print the error that caused the failure
>> > from ../ioZoneLog.txt.
>>
>> I think it's great to have this kind of dbench stats, and would be
>> awesome if we can have some raw figures from iozone as well (I think it
>> can output the results in csv format at least?)
>>
>>
>> jenkins can also take performance metrics from jobs and we could have
>> graphs of the performance over time if it keeps these metrics a bit
>> longer than the actual jobs (for example with the performance plugin[1],
>> but there might be other ways)
>>
>> On an individual basis as the tests are on VMs with various loads the
>> results will probably flicker a bit, but on a whole we should be able to
>> identify what week(s) introduced slowdowns/speedups after the fact quite
>> nicely if we can achieve that! :)
>
> Note that the tests in the CentOS CI run on different physical hosts.
> When a test is started, one or more machines are requested, and the
> scheduler (called Duffy) just returns a random system. This means that
> the performance results might differ quite a bit between runs, even for
> the same change-set.
>
> See https://wiki.centos.org/QaWiki/PubHardware for details about the
> hardware.
>
> So except for the performance results, it may be useful to gather some
> details about the hardware that was used.
>
> Niels
>
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] rpcping

2018-03-15 Thread Daniel Gryniewicz
100k is a much more accurate measurement.  I haven't gotten any
crashes since the fixes from yesterday, but I can keep trying.


On Thu, Mar 15, 2018 at 12:10 PM, William Allen Simpson
<william.allen.simp...@gmail.com> wrote:
> On 3/15/18 10:23 AM, Daniel Gryniewicz wrote:
>>
>> Can you try again with a larger count, like 100k?  500 is still quite
>> small for a loop benchmark like this.
>>
> In the code, I commented that 500 is minimal.  I've done a pile of
> 100, 200, 300, and they perform roughly the same as 500.
>
> rpcping tcp localhost count=100 threads=1 workers=5 (port=2049
> program=13 version=3 procedure=0): mean 46812.8194, total 46812.8194
> rpcping tcp localhost count=500 threads=1 workers=5 (port=2049
> program=13 version=3 procedure=0): mean 41285.4267, total 41285.4267
>
> 100k is a lot less (when it works).
>
> tests/rpcping tcp localhost -c 10
> rpcping tcp localhost count=10 threads=1 workers=5 (port=2049
> program=13 version=3 procedure=0): mean 15901.7190, total 15901.7190
> tests/rpcping tcp localhost -c 10
> rpcping tcp localhost count=10 threads=1 workers=5 (port=2049
> program=13 version=3 procedure=0): mean 15894.9971, total 15894.9971
>
> tests/rpcping tcp localhost -c 10 -t 2
> double free or corruption (out)
> Aborted (core dumped)
>
> tests/rpcping tcp localhost -c 10 -t 2
> double free or corruption (out)
> corrupted double-linked list (not small)
> Aborted (core dumped)
>
> Looks like we have a nice dump test case! ;)

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] rpcping

2018-03-15 Thread Daniel Gryniewicz
Can you try again with a larger count, like 100k?  500 is still quite
small for a loop benchmark like this.

Daniel

On Thu, Mar 15, 2018 at 9:02 AM, William Allen Simpson
 wrote:
> On 3/14/18 3:33 AM, William Allen Simpson wrote:
>>
>> rpcping tcp localhost threads=1 count=500 (port=2049 program=13
>> version=3 procedure=0): mean 51285.7754, total 51285.7754
>
>
> DanG pushed the latest code onto ntirpc this morning, and I'll submit a
> pullup for Ganesha later today.
>
> I've changed the calculations to be in the final loop, holding onto
> the hope that the original design of averaging each thread's result
> might introduce quantization errors.  But it didn't significantly
> change the results.
>
> I've improved the pretty print a bit, now including the worker pool.
> The default 5 worker threads are each handling the incoming replies
> concurrently, so they hopefully keep working without a thread switch.
>
> Another thing I've noted is that the best result is almost always the
> first result after an idle period.  That's the opposite of my expectations.
>
> Could it be that the Ganesha worker pool size of 200 (default)
> or 500 (configured) is much too large, causing thread-scheduler thrashing?
>
> rpcping tcp localhost count=500 threads=1 workers=5 (port=2049
> program=13 version=3 procedure=0): mean 50989.4139, total 50989.4139
> rpcping tcp localhost count=500 threads=1 workers=5 (port=2049
> program=13 version=3 procedure=0): mean 32562.0173, total 32562.0173
> rpcping tcp localhost count=500 threads=1 workers=5 (port=2049
> program=13 version=3 procedure=0): mean 34479.7577, total 34479.7577
> rpcping tcp localhost count=500 threads=1 workers=5 (port=2049
> program=13 version=3 procedure=0): mean 34070.8189, total 34070.8189
> rpcping tcp localhost count=500 threads=1 workers=5 (port=2049
> program=13 version=3 procedure=0): mean 33861.2689, total 33861.2689
> rpcping tcp localhost count=500 threads=1 workers=5 (port=2049
> program=13 version=3 procedure=0): mean 35843.8433, total 35843.8433
> rpcping tcp localhost count=500 threads=1 workers=5 (port=2049
> program=13 version=3 procedure=0): mean 35367.2721, total 35367.2721
> rpcping tcp localhost count=500 threads=1 workers=5 (port=2049
> program=13 version=3 procedure=0): mean 31642.2972, total 31642.2972
> rpcping tcp localhost count=500 threads=1 workers=5 (port=2049
> program=13 version=3 procedure=0): mean 34738.4166, total 34738.4166
> rpcping tcp localhost count=500 threads=1 workers=5 (port=2049
> program=13 version=3 procedure=0): mean 33211.7319, total 33211.7319
> rpcping tcp localhost count=500 threads=1 workers=5 (port=2049
> program=13 version=3 procedure=0): mean 35000.5520, total 35000.5520
> rpcping tcp localhost count=500 threads=1 workers=5 (port=2049
> program=13 version=3 procedure=0): mean 36557.6578, total 36557.6578

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] rpcping

2018-03-13 Thread Daniel Gryniewicz

rpcping was not thread safe.  I have fixes for it incoming.

Daniel

On 03/13/2018 12:13 PM, William Allen Simpson wrote:

On 3/13/18 2:38 AM, William Allen Simpson wrote:

In my measurements, using the new CLNT_CALL_BACK(), the client thread
starts sending a stream of pings.  In every case, it peaks at a
relatively stable rate.


DanG suggested that timing was dominated by the system time calls.

The previous numbers were switched to a finer grained timer than
the original code.  JeffL says that clock_gettime() should have had
negligible overhead.

But just to make sure, I've eliminated the per thread timers and
substituted one before and one after.  Unlike previously, this
will include the overhead of setting up the client, in addition to
completing all the callback returns.

Same result.  More calls ::= slower times.
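
For reference, the one-timer-around-the-whole-run measurement amounts to 
something like this sketch; it is illustrative only, not the rpcping source, 
and the loop body is a placeholder for the real ping callbacks:

#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

static double elapsed_s(const struct timespec *a, const struct timespec *b)
{
    return (double)(b->tv_sec - a->tv_sec) +
           (double)(b->tv_nsec - a->tv_nsec) / 1e9;
}

int main(void)
{
    struct timespec start, stop;
    long count = 100000, i;
    volatile long sink = 0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < count; i++)
        sink += i;               /* stand-in for one ping round trip */
    clock_gettime(CLOCK_MONOTONIC, &stop);

    printf("mean %.4f calls/s\n", (double)count / elapsed_s(&start, &stop));
    return 0;
}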

rpcping tcp localhost threads=1 count=1000 (port=2049 program=13 
version=3 procedure=0): average 36012.0254, total 36012.0254
rpcping tcp localhost threads=1 count=1500 (port=2049 program=13 
version=3 procedure=0): average 33720.9125, total 33720.9125
rpcping tcp localhost threads=1 count=2000 (port=2049 program=13 
version=3 procedure=0): average 25604.7542, total 25604.7542
rpcping tcp localhost threads=1 count=3000 (port=2049 program=13 
version=3 procedure=0): average 21170.0836, total 21170.0836
rpcping tcp localhost threads=1 count=5000 (port=2049 program=13 
version=3 procedure=0): average 18163.2451, total 18163.2451


Including the 3-way handshake time for setting up the clients does affect
the overall throughput numbers.

rpcping tcp localhost threads=2 count=1500 (port=2049 program=13 
version=3 procedure=0): average 10379.3976, total 20758.7951
rpcping tcp localhost threads=2 count=1500 (port=2049 program=13 
version=3 procedure=0): average 10746.9395, total 21493.8790


rpcping tcp localhost threads=3 count=1500 (port=2049 program=13 
version=3 procedure=0): average 5473.3780, total 16420.1339
rpcping tcp localhost threads=3 count=1500 (port=2049 program=13 
version=3 procedure=0): average 5886.5549, total 17659.6646


rpcping tcp localhost threads=5 count=1500 (port=2049 program=13 
version=3 procedure=0): average 3396.9438, total 16984.7190
rpcping tcp localhost threads=5 count=1500 (port=2049 program=13 
version=3 procedure=0): average 3455.3026, total 17276.5131



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Better late than never - US Daylight Savings Time has started and that means weekly conference call is an hour earlier

2018-03-13 Thread Daniel Gryniewicz

An hour later...

Daniel

On 03/13/2018 10:02 AM, Frank Filz wrote:



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot



___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel




--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] nfsv3 client writing file gets Invalid argument on glusterfs with quota on

2018-03-08 Thread Daniel Gryniewicz

On 03/07/2018 10:21 PM, Kinglong Mee wrote:

On 2018/3/7 21:10, Daniel Gryniewicz wrote:

On 03/06/2018 10:10 PM, Kinglong Mee wrote:

On 2018/3/7 10:59, Kinglong Mee wrote:

When using nfsv3 on glusterfs-3.13.1-1.el7.x86_64 and 
nfs-ganesha-2.6.0-0.2rc3.el7.centos.x86_64,
I get a strange "Invalid argument" error when writing a file.

1. With quota disabled;
nfs client mount nfs-ganesha share, and do 'll' in the testing directory.

2. Enable quota;
# getfattr -d -m . -e hex /root/rpmbuild/gvtest/nfs-ganesha/testfile92
getfattr: Removing leading '/' from absolute path names
# file: root/rpmbuild/gvtest/nfs-ganesha/testfile92
trusted.gfid=0xe2edaac0eca8420ebbbcba7e56bbd240
trusted.gfid2path.b3250af8fa558e66=0x3966313434352d653530332d343831352d396635312d3236633565366332633137642f7465737466696c653932
trusted.glusterfs.quota.9f1445ff-e503-4815-9f51-26c5e6c2c17d.contri.3=0x0201

Notice: testfile92 without trusted.pgfid xattr.


The trusted.pgfid will be created by the next name lookup; nameless lookup 
don't create it.


3. restart glusterfs volume by "gluster volume stop/start gvtest"


Restarting glusterfsd here cleanup all inode cache from memory;
after starting, inode of testfile92's parent is NULL.


4. echo somedata > testfile92


Because nfs-ganesha and the nfs client have testfile92 cached,
no name lookup happens before the write fops, so trusted.pgfid is not created for 
testfile92.

Quota_writev calls quota_build_ancestry to build the ancestry in 
quota_check_limit,
but testfile92 doesn't have trusted.pgfid, so the write fop fails with 
Invalid argument.

I have no idea how to fix this problem; any comments are welcome.



I think, ideally, Gluster would send an invalidate upcall under these 
circumstances, causing Ganesha to drop its cached entry.


It doesn't work.
I tried restarting nfs-ganesha and echoing data to testfile92; the problem still 
exists.

After nfs-ganesha restart,
1. A GETATTR is send from nfs-client for the testfile92, ganesha translates it 
to nameless lookup.
2. A ACCESS gets attributes from nfs-ganesha's cache (cached by #1).
3. A SETATTR sets the testfile92's size to 0, ganesha translates it to setattr 
fop.
4. A WRITE also get Invalid argument error.

If ganesha drops its cache, the nfs client may still write the file by filehandle;
ganesha looks it up with a nameless lookup from glusterfs,
so trusted.pgfid isn't created either.

I think a name lookup is needed for testfile92 after quota is enabled.

thanks,
Kinglong Mee



If a name lookup is needed, then gluster cannot provide NFS semantics in 
these circumstances.  NFS *requires* that access by handle continue to 
work once the client has a handle.  In fact, there's no way to get a 
name for this, since the name doesn't exist anywhere in the handle (by 
design, since a name is an attribute of a dirent, not of a handle/inode).


So, this likely is a bug in Gluster, and needs to be fixed there.  Would 
it be possible to enable quota globally at the start as a workaround?


Daniel



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] nfsv3 client writing file gets Invalid argument on glusterfs with quota on

2018-03-07 Thread Daniel Gryniewicz

On 03/06/2018 10:10 PM, Kinglong Mee wrote:

On 2018/3/7 10:59, Kinglong Mee wrote:

When using nfsv3 on glusterfs-3.13.1-1.el7.x86_64 and 
nfs-ganesha-2.6.0-0.2rc3.el7.centos.x86_64,
I get a strange "Invalid argument" error when writing a file.

1. With quota disabled;
nfs client mount nfs-ganesha share, and do 'll' in the testing directory.

2. Enable quota;
# getfattr -d -m . -e hex /root/rpmbuild/gvtest/nfs-ganesha/testfile92
getfattr: Removing leading '/' from absolute path names
# file: root/rpmbuild/gvtest/nfs-ganesha/testfile92
trusted.gfid=0xe2edaac0eca8420ebbbcba7e56bbd240
trusted.gfid2path.b3250af8fa558e66=0x3966313434352d653530332d343831352d396635312d3236633565366332633137642f7465737466696c653932
trusted.glusterfs.quota.9f1445ff-e503-4815-9f51-26c5e6c2c17d.contri.3=0x0201

Notice: testfile92 without trusted.pgfid xattr.


The trusted.pgfid will be created by the next name lookup; nameless lookup 
don't create it.


3. restart glusterfs volume by "gluster volume stop/start gvtest"


Restarting glusterfsd here cleanup all inode cache from memory;
after starting, inode of testfile92's parent is NULL.


4. echo somedata > testfile92


Because nfs-ganesha and the nfs client have testfile92 cached,
no name lookup happens before the write fops, so trusted.pgfid is not created for 
testfile92.

Quota_writev calls quota_build_ancestry to build the ancestry in 
quota_check_limit,
but testfile92 doesn't have trusted.pgfid, so the write fop fails with 
Invalid argument.

I have no idea how to fix this problem; any comments are welcome.



I think, ideally, Gluster would send an invalidate upcall under these 
circumstances, causing Ganesha to drop its cached entry.


Daniel

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Multiprotocol support in ganesha

2018-03-06 Thread Daniel Gryniewicz
So, in fact, both ganesha and samba can have the same file open at the 
same time (just as any 2 processes can).  This will, of course, cause 
issues if both are modifying the same sections of the file.  This is why 
file locking was invented.  NFSv3 (via NLM) and NFSv4 (built in) have 
locking modes that are compatible with SMB locking, so as long as the 
clients use those, it "should" work fine.  Of course, there's going to 
be issues, since this isn't tested well (or maybe at all).  I have some 
idea that the Gluster team at one point tested this; they support 
exporting Gluster from both ganesha and samba, although I don't believe 
they support exporting the same FS at the same time.
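
For reference, the sort of advisory byte-range lock that NLM and NFSv4 locking 
express looks roughly like this at the POSIX level (toy sketch; the path is a 
placeholder):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    struct flock fl = {
        .l_type   = F_WRLCK,    /* exclusive byte-range lock */
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 4096,       /* lock only bytes 0..4095 */
    };
    int fd = open("/tmp/shared-file", O_RDWR | O_CREAT, 0644);  /* placeholder */

    if (fd == -1 || fcntl(fd, F_SETLKW, &fl) == -1) {  /* wait for the lock */
        perror("lock");
        return 1;
    }
    /* ... modify the locked range ... */
    fl.l_type = F_UNLCK;
    fcntl(fd, F_SETLK, &fl);    /* release */
    close(fd);
    return 0;
}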


So the short answer is that it theoretically can work, and it actually 
may work, but it likely won't, without some work.


Daniel

On 03/06/2018 01:01 PM, Pradeep wrote:

Hi Daniel,

What I meant is a use case where someone needs to access the same 
export through the NFS protocol using the Ganesha server and the SMB protocol 
using the Samba server. Both Samba and Ganesha are running on the same server. 
Obviously, the file can't be open by both ganesha and samba; so we need to 
close the open FDs (if those are for caching). Linux provides oplocks 
(fcntl() with F_SETLEASE) for processes to get a notification when other 
processes try to open the file, and this can be used to synchronize with Samba.
Samba seems to support this already: 
https://github.com/samba-team/samba/blob/master/source3/smbd/oplock_linux.c
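
A minimal sketch of that F_SETLEASE mechanism (illustrative only; the path is 
a placeholder and the break handling is reduced to a flag):

#define _GNU_SOURCE
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t lease_broken;

static void on_lease_break(int sig)
{
    (void)sig;
    lease_broken = 1;   /* another process wants to open the file */
}

int main(void)
{
    int fd = open("/tmp/example", O_RDONLY | O_CREAT, 0644);  /* placeholder */

    signal(SIGIO, on_lease_break);           /* lease breaks arrive as SIGIO */
    if (fd == -1 || fcntl(fd, F_SETLEASE, F_RDLCK) == -1) {
        perror("F_SETLEASE");
        return 1;
    }
    while (!lease_broken)
        pause();                             /* wait for a break */
    /* flush/close any cached state, then release the lease */
    fcntl(fd, F_SETLEASE, F_UNLCK);
    close(fd);
    return 0;
}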


Thanks,

On Tue, Mar 6, 2018 at 9:29 AM, Daniel Gryniewicz <d...@redhat.com 
<mailto:d...@redhat.com>> wrote:


Ganesha has multi-protocol (NFS3, NFS4, and 9P).  There are no plans
to add CIFS, since that is an insanely complicated protocol, and has
a userspace daemon implementation already (in the form of Samba).  I
personally wouldn't reject such support if it was offered, but as
far as I know, no one is even thinking about working on it.

Daniel


On 03/06/2018 12:20 PM, Pradeep wrote:

Hello,

Is there plans to implement multiprotocol (NFS and CIFS
accessing same export/share) in ganesha? I believe current FD
cache will need changes to support that.

Thanks,
Pradeep





Re: [Nfs-ganesha-devel] Multiprotocol support in ganesha

2018-03-06 Thread Daniel Gryniewicz
Ganesha has multi-protocol (NFS3, NFS4, and 9P).  There are no plans to 
add CIFS, since that is an insanely complicated protocol, and has a 
userspace daemon implementation already (in the form of Samba).  I 
personally wouldn't reject such support if it was offered, but as far as 
I know, no one is even thinking about working on it.


Daniel

On 03/06/2018 12:20 PM, Pradeep wrote:

Hello,

Are there plans to implement multiprotocol (NFS and CIFS accessing the same
export/share) in ganesha? I believe the current FD cache will need changes
to support that.


Thanks,
Pradeep




Re: [Nfs-ganesha-devel] nss_getpwnam: name 't...@my.dom@localdomain' does not map into domain 'nix.my.dom'

2018-03-06 Thread Daniel Gryniewicz
Based on the error messages, your client is not sending t...@nix.my.dom
but is sending t...@my.dom@localdomain.  Something is mis-configured on
the client.  Have you tried having identical (including case)
idmapd.conf files on both the client and server?
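
(For example, the piece that has to match on both client and server is the
Domain setting in /etc/idmapd.conf; using the domain from the logs quoted
below, that would simply be:)

[General]
Domain = nix.my.dom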


Idmap configuration has historically been very picky and hard to set up,
and I'm far from an expert on it.


Daniel

On 03/06/2018 08:24 AM, TomK wrote:

Hey Guy's,

I'm getting the message below, which in turn means the proper UID / GID are
not shown on NFSv4 mounts from within an unprivileged account. All files show
up with owner and group as nobody / nobody when viewed from the client.


I'm wondering if anyone has seen this and what the solution could be.

If this isn't the right list, please let me know.

[root@client01 etc]# cat /etc/idmapd.conf|grep -v "#"| sed -e "/^$/d"
[General]
Verbosity = 7
Domain = nix.my.dom
[Mapping]
[Translation]
[Static]
[UMICH_SCHEMA]
LDAP_server = ldap-server.local.domain.edu
LDAP_base = dc=local,dc=domain,dc=edu
[root@client01 etc]#

Mount looks like this:

nfs-c01.nix.my.dom:/n/my.dom on /n/my.dom type nfs4 
(rw,relatime,vers=4.0,rsize=8192,wsize=8192,namlen=255,hard,proto=tcp,port=0,timeo=10,retrans=2,sec=sys,clientaddr=192.168.0.236,local_lock=none,addr=192.168.0.80) 



/var/log/messages

Mar  6 00:17:27 client01 nfsidmap[14396]: key: 0x3f2c257b type: uid 
value: t...@my.dom@localdomain timeout 600
Mar  6 00:17:27 client01 nfsidmap[14396]: nfs4_name_to_uid: calling 
nsswitch->name_to_uid
Mar  6 00:17:27 client01 nfsidmap[14396]: nss_getpwnam: name 
't...@my.dom@localdomain' domain 'nix.my.dom': resulting localname '(null)'
Mar  6 00:17:27 client01 nfsidmap[14396]: nss_getpwnam: name 
't...@my.dom@localdomain' does not map into domain 'nix.my.dom'
Mar  6 00:17:27 client01 nfsidmap[14396]: nfs4_name_to_uid: 
nsswitch->name_to_uid returned -22
Mar  6 00:17:27 client01 nfsidmap[14396]: nfs4_name_to_uid: final return 
value is -22
Mar  6 00:17:27 client01 nfsidmap[14396]: nfs4_name_to_uid: calling 
nsswitch->name_to_uid
Mar  6 00:17:27 client01 nfsidmap[14396]: nss_getpwnam: name 
'nob...@nix.my.dom' domain 'nix.my.dom': resulting localname 'nobody'
Mar  6 00:17:27 client01 nfsidmap[14396]: nfs4_name_to_uid: 
nsswitch->name_to_uid returned 0
Mar  6 00:17:27 client01 nfsidmap[14396]: nfs4_name_to_uid: final return 
value is 0
Mar  6 00:17:27 client01 nfsidmap[14398]: key: 0x324b0048 type: gid 
value: t...@my.dom@localdomain timeout 600
Mar  6 00:17:27 client01 nfsidmap[14398]: nfs4_name_to_gid: calling 
nsswitch->name_to_gid
Mar  6 00:17:27 client01 nfsidmap[14398]: nfs4_name_to_gid: 
nsswitch->name_to_gid returned -22
Mar  6 00:17:27 client01 nfsidmap[14398]: nfs4_name_to_gid: final return 
value is -22
Mar  6 00:17:27 client01 nfsidmap[14398]: nfs4_name_to_gid: calling 
nsswitch->name_to_gid
Mar  6 00:17:27 client01 nfsidmap[14398]: nfs4_name_to_gid: 
nsswitch->name_to_gid returned 0
Mar  6 00:17:27 client01 nfsidmap[14398]: nfs4_name_to_gid: final return 
value is 0

Mar  6 00:17:31 client01 systemd-logind: Removed session 23.




Result of:

systemctl restart rpcidmapd

/var/log/messages
---
Mar  5 23:46:12 client01 systemd: Stopping Automounts filesystems on 
demand...

Mar  5 23:46:13 client01 systemd: Stopped Automounts filesystems on demand.
Mar  5 23:48:51 client01 systemd: Stopping NFSv4 ID-name mapping service...
Mar  5 23:48:51 client01 systemd: Starting Preprocess NFS configuration...
Mar  5 23:48:51 client01 systemd: Started Preprocess NFS configuration.
Mar  5 23:48:51 client01 systemd: Starting NFSv4 ID-name mapping service...
Mar  5 23:48:51 client01 rpc.idmapd[14117]: libnfsidmap: using domain: 
nix.my.dom
Mar  5 23:48:51 client01 rpc.idmapd[14117]: libnfsidmap: Realms list: 
'NIX.MY.DOM'
Mar  5 23:48:51 client01 rpc.idmapd: rpc.idmapd: libnfsidmap: using 
domain: nix.my.dom
Mar  5 23:48:51 client01 rpc.idmapd: rpc.idmapd: libnfsidmap: Realms 
list: 'NIX.MY.DOM'
Mar  5 23:48:51 client01 rpc.idmapd: rpc.idmapd: libnfsidmap: loaded 
plugin /lib64/libnfsidmap/nsswitch.so for method nsswitch
Mar  5 23:48:51 client01 rpc.idmapd[14117]: libnfsidmap: loaded plugin 
/lib64/libnfsidmap/nsswitch.so for method nsswitch

Mar  5 23:48:51 client01 rpc.idmapd[14118]: Expiration time is 600 seconds.
Mar  5 23:48:51 client01 systemd: Started NFSv4 ID-name mapping service.
Mar  5 23:48:51 client01 rpc.idmapd[14118]: Opened 
/proc/net/rpc/nfs4.nametoid/channel
Mar  5 23:48:51 client01 rpc.idmapd[14118]: Opened 
/proc/net/rpc/nfs4.idtoname/channel







Re: [Nfs-ganesha-devel] Issue with large directories containing 100,000 files

2018-03-06 Thread Daniel Gryniewicz

Hi.

What version of Ganesha is this?

Daniel

On 03/05/2018 10:35 PM, Varghese Devassy via Nfs-ganesha-devel wrote:

Hello,

I am testing our own version of an FSAL and I am observing an issue with
directories containing 100,000 files. When I do ls on the directory, it
only lists about 21 files. This issue is repeatable in two directories
containing the same number of files, but there is no issue on another
directory with 10,000 files. There are no errors in ganesha.log, and
ganesha is running at the default log level (NIV_INFO).


Any help in this matter is greatly appreciated.

Thank you






Re: [Nfs-ganesha-devel] inconsistent '*' spacing

2018-02-26 Thread Daniel Gryniewicz
We check this because consistent, readable style is important for 
maintainability.  It's being checked now because commit hooks only 
operate on changed code.


Daniel

On 02/16/2018 01:01 PM, William Allen Simpson wrote:

As I'm trying to update nfs41.h, I've run into the problem
that the commit check is complaining that the pointer '*' on
parameters is sometimes " * v" and others " *v" -- usually
the same function definition.

Presumably the generator made these.  They are cosmetic.

Why oh why are we checking this now, after all these years?

Do I need to make a pass fixing all these header files
before doing real coding?

Or can we turn off this silly cosmetic check?



Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-15 Thread Daniel Gryniewicz
How many clients are you using?  Each client op can only (currently) be
handled in a single thread, and clients won't send more ops until the
current one is ack'd, so Ganesha can basically only parallelize on a
per-client basis at the moment.


I'm sure there are locking issues; so far we've mostly worked on 
correctness rather than performance.  2.6 has changed the threading 
model a fair amount, and 2.7 will have more improvements, but it's a 
slow process.


Daniel

On 02/13/2018 06:38 PM, Deepak Jagtap wrote:

Thanks Daniel!

Yeah, user-kernel context switching is definitely adding latency, but I
wonder if RPC or some locking overhead is also in the picture.


With the 70% read, 30% random workload, nfs-ganesha CPU usage was close to
170% while the remaining 2 cores were pretty much unused (~18K IOPS,
latency ~8ms).


With the 100% read, 30% random workload, nfs-ganesha CPU usage was ~250%
(~50K IOPS, latency ~2ms).



-Deepak


*From:* Daniel Gryniewicz <d...@redhat.com>
*Sent:* Tuesday, February 13, 2018 6:15:47 AM
*To:* nfs-ganesha-devel@lists.sourceforge.net
*Subject:* Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance
Also keep in mind that FSAL VFS can never, by its very nature, beat
knfsd, since it has to do everything knfsd does, but also has userspace
<-> kernelspace transitions.  Ganesha's strength is exporting
userspace-based cluster filesystems.

That said, we're always working to make Ganesha faster, and I'm sure
there's gains to be made, even in these circumstances.

Daniel

On 02/12/2018 07:01 PM, Deepak Jagtap wrote:

Hey Guys,


I ran a few performance tests to compare nfs-ganesha and the nfs kernel
server and noticed a significant difference.



Please find my test result:


SSD formatted with EXT3, exported using nfs-ganesha:   ~18K IOPS    Avg
latency: ~8ms       Throughput: ~60 MBPS


Same directory exported using the nfs kernel server:   ~75K IOPS    Avg
latency: ~0.8ms     Throughput: ~300 MBPS



Both the nfs kernel server and nfs-ganesha are configured with 128
worker threads. nfs-ganesha is configured with the VFS FSAL.



Am I missing something major in the nfs-ganesha config, or is this
expected behavior?


I'd appreciate any input on how the performance can be improved for
nfs-ganesha.




Please find following ganesha config file that I am using:


NFS_Core_Param
{
          Nb_Worker = 128 ;
}

EXPORT
{
      # Export Id (mandatory, each EXPORT must have a unique Export_Id)
     Export_Id = 77;
     # Exported path (mandatory)
     Path = /host/test;
     Protocols = 3;
     # Pseudo Path (required for NFS v4)
     Pseudo = /host/test;
     # Required for access (default is None)
     # Could use CLIENT blocks instead
     Access_Type = RW;
     # Exporting FSAL
     FSAL {
          Name = VFS;
     }
     CLIENT
     {
          Clients = *;
          Squash = None;
          Access_Type = RW;
     }
}



Thanks & Regards,

Deepak





Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-13 Thread Daniel Gryniewicz
Also keep in mind that FSAL VFS can never, by its very nature, beat
knfsd, since it has to do everything knfsd does, but also has userspace
<-> kernelspace transitions.  Ganesha's strength is exporting
userspace-based cluster filesystems.


That said, we're always working to make Ganesha faster, and I'm sure 
there's gains to be made, even in these circumstances.


Daniel

On 02/12/2018 07:01 PM, Deepak Jagtap wrote:

Hey Guys,


I ran a few performance tests to compare nfs-ganesha and the nfs kernel
server and noticed a significant difference.



Please find my test result:


SSD formatted with EXT3, exported using nfs-ganesha:   ~18K IOPS    Avg
latency: ~8ms       Throughput: ~60 MBPS


Same directory exported using the nfs kernel server:   ~75K IOPS    Avg
latency: ~0.8ms     Throughput: ~300 MBPS



Both the nfs kernel server and nfs-ganesha are configured with 128
worker threads. nfs-ganesha is configured with the VFS FSAL.



Am I missing something major in the nfs-ganesha config, or is this
expected behavior?


I'd appreciate any input on how the performance can be improved for
nfs-ganesha.




Please find following ganesha config file that I am using:


NFS_Core_Param
{
         Nb_Worker = 128 ;
}

EXPORT
{
     # Export Id (mandatory, each EXPORT must have a unique Export_Id)
    Export_Id = 77;
    # Exported path (mandatory)
    Path = /host/test;
    Protocols = 3;
    # Pseudo Path (required for NFS v4)
    Pseudo = /host/test;
    # Required for access (default is None)
    # Could use CLIENT blocks instead
    Access_Type = RW;
    # Exporting FSAL
    FSAL {
         Name = VFS;
    }
    CLIENT
    {
         Clients = *;
         Squash = None;
         Access_Type = RW;
    }
}



Thanks & Regards,

Deepak





Re: [Nfs-ganesha-devel] How do I compile nfs-ganesha-utils RPM

2018-02-12 Thread Daniel Gryniewicz
The Ceph project also maintains RPMs compatible with RHEL, if you'd 
prefer a Ceph-flavored version (also works without Ceph, of course, just 
like the Gluster versions).


http://download.ceph.com/nfs-ganesha/rpm-V2.5-stable/luminous/x86_64/

Daniel

On 02/09/2018 02:53 PM, Daniel Gryniewicz wrote:

If you didn't pass "-DUSE_SYSTEM_NTIRPC" then libntirpc is in your
nfs-ganesha rpm.  If you did, then rpmbuild would have complained that
no libntirpc rpm was available.

The way to build an ntirpc RPM is to check out ntirpc and run cmake.
That will generate a libntirpc.spec file.  You can then run "make
dist" to get a tarball, and run rpmbuild on the spec file.  Sorry it's
not as easy as ganesha; most people don't build separate ntirpc rpms,
and distros have their own systems.

Daniel

On Fri, Feb 9, 2018 at 2:26 PM, You Me <yourindian...@gmail.com> wrote:

Thank you. That worked.

How do I build libntirpc RPM?

On Fri, Feb 9, 2018 at 10:42 AM, Daniel Gryniewicz <d...@redhat.com> wrote:

pass -DUSE_ADMIN_TOOLS=YES to your cmake command.  I think you can
also pass --with utils to rpmbuild.

Daniel

On Fri, Feb 9, 2018 at 10:28 AM, You Me <yourindian...@gmail.com> wrote:

I compiled ganesha 2.5.1 from sources. 'make rpm' gave me all RPMs
except
ganesha-utils.
How do I build that one?

Thank you





Re: [Nfs-ganesha-devel] How do I compile nfs-ganesha-utils RPM

2018-02-09 Thread Daniel Gryniewicz
If you didn't pass "-DUSE_SYSTEM_NTIRPC" then libntirpc is in your
nfs-ganesha rpm.  If you did, then rpmbuild would have complained that
no libntirpc rpm was available.

The way to build an ntirpc RPM is to check out ntirpc and run cmake.
That will generate a libntirpc.spec file.  You can then run "make
dist" to get a tarball, and run rpmbuild on the spec file.  Sorry it's
not as easy as ganesha; most people don't build separate ntirpc rpms,
and distros have their own systems.
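
(Roughly, that sequence is the following; the repository URL is the upstream
ntirpc one, and depending on your rpmbuild setup you may need to copy the
generated tarball into your SOURCES directory first:)

git clone https://github.com/nfs-ganesha/ntirpc.git
cd ntirpc
cmake .                      # generates libntirpc.spec
make dist                    # produces the source tarball
rpmbuild -ba libntirpc.spec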

Daniel

On Fri, Feb 9, 2018 at 2:26 PM, You Me <yourindian...@gmail.com> wrote:
> Thank you. That worked.
>
> How do I build libntirpc RPM?
>
> On Fri, Feb 9, 2018 at 10:42 AM, Daniel Gryniewicz <d...@redhat.com> wrote:
>>
>> pass -DUSE_ADMIN_TOOLS=YES to your cmake command.  I think you can
>> also pass --with utils to rpmbuild.
>>
>> Daniel
>>
>> On Fri, Feb 9, 2018 at 10:28 AM, You Me <yourindian...@gmail.com> wrote:
>> > I compiled ganesha 2.5.1 from sources. 'make rpm' gave me all RPMs
>> > except
>> > ganesha-utils.
>> > How do I build that one?
>> >
>> > Thank you
>> >
>> >
>> >


Re: [Nfs-ganesha-devel] Using root privileges when using kerberos exports with Ganesha.

2018-02-09 Thread Daniel Gryniewicz
I haven't done it, but I think it works if you have idmapping set up
correctly.  That is, if the idmapper domain is correct, then the
client will send "root@DOMAIN" and the idmapd on the ganesha server
will convert that to UID 0.

Daniel

On Thu, Feb 8, 2018 at 6:45 PM, Pradeep  wrote:
> Hello,
>
> It looks like Ganesha converts certain principals to UID/GID 0
> (idmapper.c:principal2uid()). I noticed that when a client uses kerberos
> with AD, the default principal is @. So when NFS operations
> are tried with root on client, it sends the principal in @
> format which will not be mapped to UID/GID 0 on Ganesha side.
>
> Have anyone successfully used privileged access to NFS exports with
> Kerberos/AD with Ganesha server? If yes, could share how you were able to
> achieve that?
>
> Thanks,
> Pradeep
>


Re: [Nfs-ganesha-devel] owner and group issue.

2018-02-08 Thread Daniel Gryniewicz

This is not a known issue, as far as I'm aware.

I'm assuming you never changed 4.txt at all, it just fixed itself on
its own?  This means, to me, that the correct value is being set on the
file, but an incorrect one is returned to the user.  That's a getattr() 
issue, and it resolves itself because after 60 seconds, the attribute 
cache expires, and so the next getattr refreshes the cache, fixing the 
issue.


Is your FSAL correctly returning the attributes in attrs_out in its
open2() call?


Daniel

On 02/08/2018 05:15 AM, Sagar M D wrote:

Hi,

We are using nfs-ganesha 2.5 with our own fsal. I see that sometimes a created
file has -2 as owner and group even though root created the file and
no_root_squash is enabled.


I see this behavior only for a short duration, right after
ganesha restarts.


During file creation, our fsal gets the correct owner and group, but ls -ltr
shows -2. If I try again after a few minutes, everything works fine.


[root@BDC testPerm]# touch 4.txt
[root@BDC testPerm]# ls -ltr
total 0
-rw-r--r--. 1 root   root   0 Feb  7 20:10 1.txt
-rw-r--r--. 1 root   root   0 Feb  7 20:17 2.txt
-rw-r--r--. 1 root   root   0 Feb  7 20:17 3.txt
-rw-r--r--. 1 4294967294 4294967294 0 Feb  7 20:21 4.txt


ls tried after some time:-
[root@BDC testPerm]# touch 5.txt
[root@BDC testPerm]# ls -ltr
total 0
-rw-r--r--. 1 root root 0 Feb  7 20:10 1.txt
-rw-r--r--. 1 root root 0 Feb  7 20:17 2.txt
-rw-r--r--. 1 root root 0 Feb  7 20:17 3.txt
-rw-r--r--. 1 root root 0 Feb  7 20:21 4.txt
-rw-r--r--. 1 root root 0 Feb  7 20:33 5.txt

Are there any known issues?  Are we missing anything on our side?

Thanks,
Sagar.





Re: [Nfs-ganesha-devel] WIP example API for async/vector FSAL ops

2018-02-07 Thread Daniel Gryniewicz

On 02/07/2018 09:05 AM, William Allen Simpson wrote:

On 2/6/18 10:40 AM, Daniel Gryniewicz wrote:

On 02/06/2018 10:26 AM, William Allen Simpson wrote:

On 2/6/18 8:25 AM, Daniel Gryniewicz wrote:

Hi, all.

I've worked up a sample API for async/vector for FSAL ops.  The 
example op is read(), and I've "implemented" it for all FSALs, so 
that I can verify that it does, in fact, work for some definition of 
work. 


I'm a bit surprised it works, as the alloca needs the sizeof struct PLUS
the sizeof iov * iov_count.  Right now it's writing off the end.


I believe the empty array construct includes a size of 1.  If I'm 
wrong, then it's an easy fix (and this code will go away anyway, and 
never be committed).



No, it's zero.  Yes, an easy fix.


Already fixed.  I did my research after sending the last mail.
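
(For reference, the sizing issue with a true flexible array member looks like
this; the struct and field names are stand-ins for the example, not the
definitions from the branch:)

#include <alloca.h>
#include <string.h>
#include <sys/uio.h>

struct read_arg {
        size_t iov_count;
        struct iovec iov[];     /* flexible array member: contributes 0 bytes
                                 * to sizeof(struct read_arg)... */
};

void example(size_t iov_count)
{
        /* ...so the allocation must add room for the entries explicitly: */
        struct read_arg *arg =
                alloca(sizeof(*arg) + iov_count * sizeof(struct iovec));

        memset(arg, 0, sizeof(*arg) + iov_count * sizeof(struct iovec));
        arg->iov_count = iov_count;
        /* fill arg->iov[0 .. iov_count-1] and hand arg to the async call */
}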



I'm assuming this code will be committed sometime in the near future.


I wasn't planning on committing this as is, but rather waiting until it 
was more complete.






"asynchronous: has an 'h' in it.

"it's" means "it is".  Most places should be "its".

To be async, need to move the status return into the arg struct, and 
pass

NULL for the caller's parameter at the top level.


Return is its own argument to the callback.



I'd prefer to have the new struct contain all the common arguments.

Every level needs to be able to set the status, so putting the result in
the struct makes the code cleaner than copying stuff in every wrapper.



Why not move the other arguments into the struct?
   * bypass
   * state
   * offset


Because those are pure in arguments, and were unchanged, so minimal 
code changes.  The iov was put into the arg to avoid multiple mallocs, 
and I put iov_count with iov.  The rest are out arguments.



Obviously, all [in] arguments can be in the struct.  Set and forget once
at the top

Even [out] pointer arguments can be in the struct.

Removing long parameter lists makes the code cleaner (and faster).


It all can, but I prefer simple in parameters to be passed as 
parameters.  I don't like functions that take only a single giant 
struct, especially with optional entries.  I was not planning on putting 
anything in it that was not needed in the callback (and the contents of 
this struct are still evolving as I write code).




And some of this will need to be done to remove op_ctx dependencies.


This won't convince me.  I'm a fan of op_ctx.





Also it will be the same for write, so we can just name it
struct fsal_cb_arg -- and the function typedef fsal_cb to match.


It may be.  I didn't look at write, this is a proof-of-concept for 
read, and not in any way intended to be final.



Yeah, as we talked earlier, I was looking at the bigger picture.  This is a
nice clean proof-of-concept.  I like it.  Now I'm talking details.




Why not get rid of fsal_read2(), and call the function directly in
the 3 places it's used?


I'm considering it.  That was a good point to break this particular 
proof-of-concept, but much must change for async to be plumbed all the 
way back to the top of the protocols.



Yes, again I'm forward looking.  If we do it now, then we don't have to
undo anything later.  Makes the patches easier to understand.


Daniel



Re: [Nfs-ganesha-devel] WIP example API for async/vector FSAL ops

2018-02-06 Thread Daniel Gryniewicz

On 02/06/2018 10:26 AM, William Allen Simpson wrote:

On 2/6/18 8:25 AM, Daniel Gryniewicz wrote:

Hi, all.

I've worked up a sample API for async/vector for FSAL ops.  The 
example op is read(), and I've "implemented" it for all FSALs, so that 
I can verify that it does, in fact, work for some definition of work. 


I'm a bit surprised it works, as the alloca needs the sizeof struct PLUS
the sizeof iov * iov_count.  Right now it's writing off the end.


I believe the empty array construct includes a size of 1.  If I'm wrong, 
then it's an easy fix (and this code will go away anyway, and never be 
committed).




"asynchronous: has an 'h' in it.

"it's" means "it is".  Most places should be "its".

To be async, need to move the status return into the arg struct, and pass
NULL for the caller's parameter at the top level.


Return is its own argument to the callback.


Why not move the other arguments into the struct?
   * bypass
   * state
   * offset


Because those are pure in arguments, and were unchanged, so minimal code 
changes.  The iov was put into the arg to avoid multiple mallocs, and I 
put iov_count with iov.  The rest are out arguments.




Also it will be the same for write, so we can just name it
struct fsal_cb_arg -- and the function typedef fsal_cb to match.


It may be.  I didn't look at write, this is a proof-of-concept for read, 
and not in any way intended to be final.




Why not get rid of fsal_read2(), and call the function directly in
the 3 places it's used?


I'm considering it.  That was a good point to break this particular 
proof-of-concept, but much must change for async to be plumbed all the 
way back to the top of the protocols.




Anyway, a good effort.  I see how you've wrapped for stacking.  Thanks!





Re: [Nfs-ganesha-devel] reg. unbounded readdir cache

2018-02-06 Thread Daniel Gryniewicz

It's the dirent cache.  There's a bounded version in 2.5 and later.

Daniel

On 02/06/2018 07:18 AM, Suresh kosuru wrote:

Hi,

I am working on NFS Ganesha v2.3. From what I have heard, the readdir cache
is unbounded in Ganesha. Can you please elaborate on which cache is
unbounded? Is it the inode cache that doesn't get cleaned up, or the
dirent cache (implemented as an AVL tree)?


Thanks in advance.

Suresh.




[Nfs-ganesha-devel] WIP example API for async/vector FSAL ops

2018-02-06 Thread Daniel Gryniewicz

Hi, all.

I've worked up a sample API for async/vector for FSAL ops.  The example 
op is read(), and I've "implemented" it for all FSALs, so that I can 
verify that it does, in fact, work for some definition of work.  The 
commit is the top of this branch:


https://github.com/dang/nfs-ganesha/tree/async

Several notes on this:

This is an API example only. There is no actual async (and the stub in 
fsal_read2() depends on there being no async currently), and the vector 
size must currently be 1, until all FSALs are capable of handling larger 
vectors.  I've added vector processing to all FSALs that I could find 
implementations for (currently MEM, VFS, and GLUSTER), and I've added 
loops to all FSALs for which I could figure out how that should work.  This 
leaves GPFS and Proxy (which have complicated read implementations) only 
handling vector sizes of 1.


My primary goal here is to start discussion around what the API should 
look like, since this is a big change.  In parallel with this 
discussion, I'll be working on pushing the async up the stack to the 
point where we can actually handle async ops, and Bill will be working 
on vectorizing up the stack as well.


Daniel



Re: [Nfs-ganesha-devel] Crash in graceful shutdown.

2018-02-01 Thread Daniel Gryniewicz
I do this all the time (that is, I send SIGTERM after every workload I 
run) in order to catch these issues.  Is there some specific workload 
that triggers this for you?


Daniel

On 02/01/2018 01:16 PM, Pradeep wrote:
Running some NFS workload and sending SIGTERM to ganesha (sudo killall 
-TERM ganesha.nfsd) will reproduce it.


But you might hit a double-free problem before that - here is a patch 
that fixes it.


https://review.gerrithub.io/#/c/398092/

Feel free to rework the patch above.

On Thu, Feb 1, 2018 at 6:59 AM, Daniel Gryniewicz <d...@redhat.com 
<mailto:d...@redhat.com>> wrote:


I've been actually deliberately leaving that crash in.  It indicates
a refcount leak (and is usually the only indicator of a refcount
leak, other than slowly growing memory over a long time).

Can you get me a reproducer for this?  If so, I can track down the leak.

Daniel


On 02/01/2018 09:51 AM, Pradeep wrote:

Hello,

In graceful shutdown of ganesha (do_shutdown()), the way object
handles are released is first by calling unexport() and then
destroy_fsals(). One issue I'm seeing is unexport in MDCACHE
will not
release objects if refcnt is non-zero (which can happen if files are
open). When it comes to destroy_fsals() -> shutdown_handles() ->
mdcache_hdl_release() -> ...-> mdcache_lru_clean(), we don't have
op_ctx. So it crashes in mdcache_lru_clean().

A simple fix would be to create op_ctx if it is NULL in
mdcache_hdl_release(). But I'm wondering if unexport is supposed to
free all handles in MDCACHE?

This is with 2.6-rc2 in case you want to look at code.

Thanks,
Pradeep




Re: [Nfs-ganesha-devel] Crash in graceful shutdown.

2018-02-01 Thread Daniel Gryniewicz
I've been actually deliberately leaving that crash in.  It indicates a 
refcount leak (and is usually the only indicator of a refcount leak, 
other than slowly growing memory over a long time).


Can you get me a reproducer for this?  If so, I can track down the leak.

Daniel

On 02/01/2018 09:51 AM, Pradeep wrote:

Hello,

In graceful shutdown of ganesha (do_shutdown()), the way object
handles are released is first by calling unexport() and then
destroy_fsals(). One issue I'm seeing is unexport in MDCACHE will not
release objects if refcnt is non-zero (which can happen if files are
open). When it comes to destroy_fsals() -> shutdown_handles() ->
mdcache_hdl_release() -> ...-> mdcache_lru_clean(), we don't have
op_ctx. So it crashes in mdcache_lru_clean().

A simple fix would be to create op_ctx if it is NULL in
mdcache_hdl_release(). But I'm wondering if unexport is supposed to
free all handles in MDCACHE?

This is with 2.6-rc2 in case you want to look at code.

Thanks,
Pradeep



Re: [Nfs-ganesha-devel] Correct initialization sequence

2018-01-31 Thread Daniel Gryniewicz

On 01/31/2018 10:27 AM, William Allen Simpson wrote:

On 1/31/18 8:44 AM, Daniel Gryniewicz wrote:

Agreed.

Daniel

On 01/30/2018 11:46 PM, Malahal Naineni wrote:
Looking at the code, dupreq2_pkginit() only depends on Ganesha config
processing to initialize a few things, so it should be OK to call it
anytime after Ganesha config processing.


Regards, Malahal.

On Wed, Jan 31, 2018 at 8:00 AM, Pradeep <pradeeptho...@gmail.com 
<mailto:pradeeptho...@gmail.com>> wrote:


    Hi Bill,

    Is it ok to move dupreq2_pkginit() before nfs_Init_svc() so that we
    won't hit the crash below?


It seems OK to me.  The previous culprit was delegation callbacks
happening before nfs_Init_svc().  Anything that does output (or expects
input) has to come after initializing the ntirpc svc layer.
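
(In other words, the proposed ordering in the init path, abridged to just the
two calls under discussion:)

/* sketch of the proposed order, per this thread: */
dupreq2_pkginit();      /* needs only the parsed config, per Malahal */
/* ... other package inits ... */
nfs_Init_svc();         /* bring up ntirpc and start listening afterwards,
                         * so a request can no longer race against an
                         * uninitialized drc_st */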

DanG, could you add this move to your pullup?  That might trigger
another test, too.


Should probably be a different PR, since it's unrelated to ntirpc.  If 
Pradeep doesn't want to submit it, I can.


Daniel



Re: [Nfs-ganesha-devel] Features board list

2018-01-31 Thread Daniel Gryniewicz

On 01/31/2018 10:21 AM, William Allen Simpson wrote:

On 1/30/18 12:03 PM, Supriti Singh wrote:

Hello all,

As discussed in the community call, I am sharing the feature board list:
https://github.com/nfs-ganesha/nfs-ganesha/projects for nfs-ganesha
2.6 and 2.7. The aim is to use these boards to track the planned
features for each major release. The hope is that it will help community
members follow up on feature development and do more focused
testing. In the project board, ideally everyone who has write access
to the nfs-ganesha github should be able to modify the board.



It says "Click on the '+' sign to add new cards."  I don't see how to
add new cards under the '+' pull-down on the upper right, and I don't
have any other plus signs.



Hmm... I have a big + sign at the top of each card, that adds new 
entries to the card.  I'm guessing it's a github permissions thing.  I 
don't pretend to understand github permissions, however.


Daniel



Re: [Nfs-ganesha-devel] Correct initialization sequence

2018-01-31 Thread Daniel Gryniewicz

Agreed.

Daniel

On 01/30/2018 11:46 PM, Malahal Naineni wrote:
Looking at the code, dupreq2_pkginit() only depends on Ganesha config
processing to initialize a few things, so it should be OK to call it anytime
after Ganesha config processing.


Regards, Malahal.

On Wed, Jan 31, 2018 at 8:00 AM, Pradeep > wrote:


Hi Bill,

Is it ok to move dupreq2_pkginit() before nfs_Init_svc() so that we
won't hit the crash below?

#0  0x7fb54dd7923b in raise () from /lib64/libpthread.so.0
#1  0x00442ebd in crash_handler (signo=11,
info=0x7fb546efc430, ctx=0x7fb546efc300) at
/usr/src/debug/nfs-ganesha-2.6-rc2/MainNFSD/nfs_init.c:263
#2  
#3  0x004de670 in nfs_dupreq_get_drc (req=0x7fb546422800) at
/usr/src/debug/nfs-ganesha-2.6-rc2/RPCAL/nfs_dupreq.c:579
#4  0x004e00bf in nfs_dupreq_start (reqnfs=0x7fb546422800,
req=0x7fb546422800) at
/usr/src/debug/nfs-ganesha-2.6-rc2/RPCAL/nfs_dupreq.c:1011
#5  0x00457825 in nfs_rpc_process_request
(reqdata=0x7fb546422800) at
/usr/src/debug/nfs-ganesha-2.6-rc2/MainNFSD/nfs_worker_thread.c:852
#6  0x004599a7 in nfs_rpc_valid_NFS (req=0x7fb546422800) at
/usr/src/debug/nfs-ganesha-2.6-rc2/MainNFSD/nfs_worker_thread.c:1555

(gdb) print drc_st
$1 = (struct drc_st *) 0x0
(gdb) print nfs_init.init_complete
$2 = false

On Tue, Jan 30, 2018 at 1:39 PM, Matt Benjamin > wrote:

reordering, I hope

Matt

On Tue, Jan 30, 2018 at 1:40 PM, Pradeep
> wrote:
 > Hello,
 >
 > It is possible to receive requests anytime after
nfs_Init_svc() is
 > completed. We initialize several things in nfs_Init() after
this. This could
 > lead to processing of incoming requests racing with the rest of
 > initialization (ex: dupreq2_pkginit()). Is it possible to
re-order
 > nfs_Init_svc() so that rest of ganesha is ready to process
requests as soon
 > as we start listing on the NFS port? Another way is to return
NFS4ERR_DELAY
 > until 'nfs_init.init_complete' is true. Any thoughts?
 >
 >
 > Thanks,
 > Pradeep
 >
 >

--
 > Check out the vibrant tech community on one of the world's most
 > engaging tech sites, Slashdot.org! http://sdm.link/slashdot
 > ___
 > Nfs-ganesha-devel mailing list
 > Nfs-ganesha-devel@lists.sourceforge.net

 >
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel

 >



--

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage


tel. 734-821-5101 
fax. 734-769-8938 
cel. 734-216-5309 















Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Pullup ntirpc 1.6

2018-01-30 Thread Daniel Gryniewicz

On 01/30/2018 09:11 AM, William Allen Simpson wrote:

On 1/29/18 2:27 PM, Daniel Gryniewicz wrote:

On 01/29/2018 02:09 PM, William Allen Simpson wrote:

On 1/29/18 1:13 PM, GerritHub wrote:

Daniel Gryniewicz has uploaded this change for *review*.

View Change <https://review.gerrithub.io/397004>

Pullup ntirpc 1.6

(svc_vc) rearm after EAGAIN and EWOULDBLOCK

(Note, previous pullup was erroneously from 1.7)


All my weekend patches need to be backported to the 1.6 branch.  They
fix string errors and clnt_control errors.





I'm not sure I agree.  clnt_control() isn't called with unknown 
values, so a default return of false isn't important; it's never 
called with CLSET_XID, so that case isn't important.  And RDMA doesn't 
work, even with these fixes correct?  I can't be convinced otherwise, 
but it seemed the only important fix for 1.6 was the EAGAIN one.


Daniel


The error string commas are very important, as V2.6 does a lot more
error reporting now.

I think that clnt_control is important, especially given the bad error
returns and that this might be downstream for years.  OTOH, it would also
apply to V2.5, V2.4, et alia going back years, and nobody has cared.

I agree we can hold off on RDMA for now (until next week).

Sorry you cannot be convinced otherwise.


(Sorry, typo, I meant to say I *can* be convinced otherwise.)

I agree about the comma patch, and have backported that.  I'm not sure 
how I missed it the first time around.


Daniel



Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-29 Thread Daniel Gryniewicz

It's already in ntirpc, and we'll submit a pullup for this week.

Daniel

On 01/29/2018 12:53 PM, Pradeep wrote:

Hi Bill,

Are you planning to pull this into the next ganesha RC?

Thanks,
Pradeep

On Sun, Jan 28, 2018 at 7:13 AM, William Allen Simpson 
> wrote:


On 1/27/18 4:07 PM, Pradeep wrote:

​Here is what I see in the log (the '2' is what I added to
figure out which recv failed):
nfs-ganesha-199008[svc_948] rpc :TIRPC :WARN :svc_vc_recv:
0x7f91c0861400 fd 21 recv errno 11 (try again) 2 176​

The fix looks good. Thanks Bill.

Thanks for the excellent report.  I wish everybody did such well
researched reports!

Yeah, the 2 isn't really needed, because I used "svc_vc_wait" and
"svc_vc_recv" (__func__) to differentiate the 2 messages.

This is really puzzling, since it should never happen.  It's the
recv() with NO WAIT.  And we are level-triggered, so we shouldn't be
in this code without an event.

If it needed more data, it should be WOULD BLOCK, but it's giving
EAGAIN.  No idea what that means here.

Hope it's not happening often






Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-26 Thread Daniel Gryniewicz
I don't think you re-arm a FD in epoll.  You arm it once, and it fires 
until you disarm it, as far as I know.  You just call epoll_wait() to 
get new events.


The thread model is a bit odd: when the epoll fires, all the events are
found, and a thread is submitted for each one except one.  That one is
handled in the local thread (since it's expected that most epoll
triggers will have one event on them, thus using the current hot
thread).  In addition, a new thread is submitted to go back and wait for
events, so there's no delay handling new events.  So EAGAIN is handled
by just indicating this thread is done, and returning it to the thread
pool.  When the socket is ready again, it will trigger a new event on
the thread waiting on the epoll.
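
(As a generic illustration of that level-triggered pattern; this is ordinary
epoll usage, not ntirpc's actual svc_rqst code:)

#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

void event_loop(int epfd)
{
        struct epoll_event ev[16];
        char buf[4096];

        for (;;) {
                int n = epoll_wait(epfd, ev, 16, -1);

                if (n < 0)
                        continue;       /* e.g. EINTR */

                for (int i = 0; i < n; i++) {
                        ssize_t rc = read(ev[i].data.fd, buf, sizeof(buf));

                        if (rc < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
                                continue;       /* nothing to re-arm: with level
                                                 * triggering, the fd is reported
                                                 * again once data is available */
                        /* ... otherwise dispatch the rc bytes just read ... */
                }
        }
}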


Bill, please correct me if I'm wrong.

Daniel

On 01/25/2018 09:13 PM, Matt Benjamin wrote:

Hmm.  We used to handle that ;)

Matt

On Thu, Jan 25, 2018 at 9:11 PM, Pradeep  wrote:

If recv() returns EAGAIN, then svc_vc_recv() returns without rearming the
epoll_fd. How does it get back to svc_vc_recv() again?

On Wed, Jan 24, 2018 at 9:26 PM, Pradeep  wrote:


Hello,

I seem to be hitting a corner case where ganesha (2.6-rc2) does not
respond to a RENEW request from a 4.0 client. I enabled the debug logs and
noticed that the NFS layer has not seen the RENEW request (I can see it in
tcpdump).

I collected netstat output periodically and found that there is a time
window of ~60 sec where the receive buffer size remains the same. This means
the RPC layer somehow missed a 'recv' call. Now if I enable debug on TIRPC,
I can't reproduce the issue. Any pointers to potential races where I could
enable selective prints would be helpful.

svc_rqst_epoll_event() resets SVC_XPRT_FLAG_ADDED. Is it possible for
another thread to call svc_rqst_rearm_events()? In that case,
svc_rqst_epoll_event() could reset the flag set by svc_rqst_rearm_events()
and complete the current receive before the other thread gets to call
epoll_ctl(), right?

Thanks,
Pradeep






Re: [Nfs-ganesha-devel] ganesha crash when stopping with gluster handles exist

2018-01-24 Thread Daniel Gryniewicz

How about this as a band-aid, since non-chunked readdir is going away:

https://review.gerrithub.io/396194

Daniel
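
(For reference, the quoted configuration below forces the non-chunked path by
setting Dir_Chunk = 0; keeping Dir_Chunk non-zero stays on the chunked readdir
path, which is the one being kept going forward. The value here is only
illustrative:)

CACHEINODE {
        Dir_Chunk = 128;   # any non-zero value keeps readdir chunked
}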

On 01/24/2018 05:29 AM, Kinglong Mee wrote:

With the latest code (nfs-ganesha-2.6.0-0.2rc3.el7.centos.x86_64), my
ganesha.conf includes:

CACHEINODE {
 Dir_Max = 50;
 Dir_Chunk = 0;
}

That forces mdcache_readdir to enter mdcache_dirent_populate, but mdc_add_cache
returns ERR_FSAL_OVERFLOW when a directory contains more than 50 files
(I tested with 100 files).

After ERR_FSAL_OVERFLOW is returned, a handle is left in the gluster fsal,
but no entry is added to mdcache.

Right now, restarting nfs-ganesha crashes as follows:

#0  0x7f75bb9ea6e7 in __inode_unref () from /lib64/libglusterfs.so.0
#1  0x7f75bb9eaf41 in inode_unref () from /lib64/libglusterfs.so.0
#2  0x7f75bbccdad6 in glfs_h_close () from /lib64/libgfapi.so.0
#3  0x7f75bc0eb93f in handle_release ()
from /usr/lib64/ganesha/libfsalgluster.so
#4  0x7f75c12c6faf in destroy_fsals ()
#5  0x7f75c12e158f in admin_thread ()
#6  0x7f75bf849dc5 in start_thread () from /lib64/libpthread.so.0
#7  0x7f75bef1b73d in clone () from /lib64/libc.so.6

For gluster, those resources (e.g., handles, inodes) belong to a glfs;
the glfs is freed in remove_all_exports() before destroy_fsals(),
so glfs_h_close() ends up using already-freed memory.

I am not sure how to fix the problem; some options:
1. Add the gluster handle (the one that hits ERR_FSAL_OVERFLOW) to an mdcache entry?
2. Bind gluster handles to a glfs export, and release them before freeing the glfs?
3. Just move shutdown_handles() before remove_all_exports()?

valgrind shows many messages like:
==13796== Invalid read of size 8
==13796==at 0xA5E39C2: ??? (in /usr/lib64/libglusterfs.so.0.0.1)
==13796==by 0xA5E5D58: inode_forget (in /usr/lib64/libglusterfs.so.0.0.1)
==13796==by 0xA3A0ACD: glfs_h_close (in /usr/lib64/libgfapi.so.0.0.0)
==13796==by 0x9F6B93E: ??? (in /usr/lib64/ganesha/libfsalgluster.so.4.2.0)
==13796==by 0x147FAE: destroy_fsals (in /usr/bin/ganesha.nfsd)
==13796==by 0x16258E: admin_thread (in /usr/bin/ganesha.nfsd)
==13796==by 0x6441DC4: start_thread (in /usr/lib64/libpthread-2.17.so)
==13796==by 0x6DAA73C: clone (in /usr/lib64/libc-2.17.so)
==13796==  Address 0x1d3b6d58 is 104 bytes inside a block of size 256 free'd
==13796==at 0x4C28CDD: free (vg_replace_malloc.c:530)
==13796==by 0xA5FD9BB: free_obj_list (in /usr/lib64/libglusterfs.so.0.0.1)
==13796==by 0xA5FDDA7: mem_pools_fini (in /usr/lib64/libglusterfs.so.0.0.1)
==13796==by 0xA38A2D6: glfs_fini (in /usr/lib64/libgfapi.so.0.0.0)
==13796==by 0x9F6687A: glusterfs_free_fs (in
/usr/lib64/ganesha/libfsalgluster.so.4.2.0)
==13796==by 0x9F66C19: ??? (in /usr/lib64/ganesha/libfsalgluster.so.4.2.0)
==13796==by 0x223E1E: ??? (in /usr/bin/ganesha.nfsd)
==13796==by 0x200FFA: free_export_resources (in /usr/bin/ganesha.nfsd)
==13796==by 0x212288: free_export (in /usr/bin/ganesha.nfsd)
==13796==by 0x215329: remove_all_exports (in /usr/bin/ganesha.nfsd)
==13796==by 0x1624B8: admin_thread (in /usr/bin/ganesha.nfsd)
==13796==by 0x6441DC4: start_thread (in /usr/lib64/libpthread-2.17.so)
==13796==  Block was alloc'd at
==13796==at 0x4C27BE3: malloc (vg_replace_malloc.c:299)
==13796==by 0xA5FE3C4: mem_get (in /usr/lib64/libglusterfs.so.0.0.1)
==13796==by 0xA5FE4D2: mem_get0 (in /usr/lib64/libglusterfs.so.0.0.1)
==13796==by 0xA5E3BD3: ??? (in /usr/lib64/libglusterfs.so.0.0.1)
==13796==by 0xA5E510A: inode_new (in /usr/lib64/libglusterfs.so.0.0.1)
==13796==by 0x178B5779: ???
==13796==by 0x178EA503: ???
==13796==by 0x178C3E46: ???
==13796==by 0xA8B8E7F: rpc_clnt_handle_reply (in
/usr/lib64/libgfrpc.so.0.0.1)
==13796==by 0xA8B9202: rpc_clnt_notify (in /usr/lib64/libgfrpc.so.0.0.1)
==13796==by 0xA8B4F82: rpc_transport_notify (in
/usr/lib64/libgfrpc.so.0.0.1)
==13796==by 0x1741D595: ??? (in
/usr/lib64/glusterfs/3.13.1/rpc-transport/socket.so)



Re: [Nfs-ganesha-devel] NULL pointer deref in clnt_ncreate_timed()

2018-01-23 Thread Daniel Gryniewicz

Hi, Pradeep.

Can you try with this patch on ntirpc?

https://github.com/nfs-ganesha/ntirpc/pull/105

Thanks,
Daniel

On 01/22/2018 08:08 PM, Pradeep wrote:

Hello,

I'm running into a crash in libntirpc with rc2:

#2  
#3  0x7f9004de31f4 in clnt_ncreate_timed (hostname=0x57592e 
"localhost", prog=100024, vers=1,
     netclass=0x57592a "tcp", tp=0x0) at 
/usr/src/debug/nfs-ganesha-2.6-rc2/libntirpc/src/clnt_generic.c:197
#4  0x0049a21c in clnt_ncreate (hostname=0x57592e "localhost", 
prog=100024, vers=1,
     nettype=0x57592a "tcp") at 
/usr/src/debug/nfs-ganesha-2.6-rc2/libntirpc/ntirpc/rpc/clnt.h:395
#5  0x0049a4d2 in nsm_connect () at 
/usr/src/debug/nfs-ganesha-2.6-rc2/Protocols/NLM/nsm.c:58
#6  0x0049c10d in nsm_unmonitor_all () at 
/usr/src/debug/nfs-ganesha-2.6-rc2/Protocols/NLM/nsm.c:267
#7  0x004449d4 in nfs_start (p_start_info=0x7c8b28 
)

     at /usr/src/debug/nfs-ganesha-2.6-rc2/MainNFSD/nfs_init.c:963
#8  0x0041cd2e in main (argc=10, argv=0x7fff68b294d8)
     at /usr/src/debug/nfs-ganesha-2.6-rc2/MainNFSD/nfs_main.c:499
(gdb) f 3
#3  0x7f9004de31f4 in clnt_ncreate_timed (hostname=0x57592e 
"localhost", prog=100024, vers=1,
     netclass=0x57592a "tcp", tp=0x0) at 
/usr/src/debug/nfs-ganesha-2.6-rc2/libntirpc/src/clnt_generic.c:197

197                     if (CLNT_SUCCESS(clnt))
(gdb) print clnt
$1 = (CLIENT *) 0x0

Looked at dev.22 and we were handling this error case correctly there.




Re: [Nfs-ganesha-devel] Segfault in dupreq cache.

2018-01-15 Thread Daniel Gryniewicz

Hi, Pradeep.

There have been 6 sets of code changes to dupreq since dev-22, including
refcount bugfixes.  Can you try with 2.6-rc2?


Daniel

On 01/13/2018 08:50 PM, Pradeep wrote:

Hello,

I'm seeing a segfault in nfs_dupreq_finish() with 2.6.dev.22. This is
when using NFSv4.0 clients. The dupreq_entry seems to be malformed
(drc is still NULL)

#0  0x7f761902023b in raise () from /lib64/libpthread.so.0
#1  0x00442a67 in crash_handler (signo=11,
info=0x7f75ed745730, ctx=0x7f75ed745600)
 at /usr/src/debug/nfs-ganesha-2.6-dev.22/MainNFSD/nfs_init.c:263
#2  
#3  0x7f761901abd0 in pthread_mutex_lock () from /lib64/libpthread.so.0
#4  0x004e01d6 in nfs_dupreq_finish (req=0x7f75e1cab800,
res_nfs=0x7f75db135380)
 at /usr/src/debug/nfs-ganesha-2.6-dev.22/RPCAL/nfs_dupreq.c:1174
#5  0x00459064 in nfs_rpc_process_request (reqdata=0x7f75e1cab800)
 at /usr/src/debug/nfs-ganesha-2.6-dev.22/MainNFSD/nfs_worker_thread.c:1416
#6  0x00459493 in nfs_rpc_valid_NFS (req=0x7f75e1cab800)

(gdb) f 4
#4  0x004e01d6 in nfs_dupreq_finish (req=0x7f75e1cab800,
res_nfs=0x7f75db135380)
 at /usr/src/debug/nfs-ganesha-2.6-dev.22/RPCAL/nfs_dupreq.c:1174
1174            PTHREAD_MUTEX_lock(&dv->mtx);
(gdb) p *dv
$1 = {rbt_k = {left = 0x0, right = 0x0, parent = 0x0, red = 2, gen =
103478}, fifo_q = {tqe_next = 0x7f7601109a80,
 tqe_prev = 0x7f75e49a62e0}, mtx = {__data = {__lock = 0, __count =
0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0,
   __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size =
'\000' , __align = 0}, hin = {drc = 0x0,
 addr = {ss_family = 0, __ss_padding = '\000' ,
__ss_align = 0}, tcp = {rq_xid = 0, checksum = 0},
 rq_prog = 0, rq_vers = 0, rq_proc = 0}, hk = 0, state =
DUPREQ_COMPLETE, refcnt = 2, res = 0x7f75db135380,
   timestamp = 1515724505}

(gdb) p *req
$2 = {rq_xprt = 0x7f75df908400, rq_clntname = 0x0, rq_svcname = 0x0,
rq_xdrs = 0x7f75c8047400, rq_u1 = 0x7f75db135380,
   rq_u2 = 0x7f75db135380, rq_cksum = 8804038937737838967, rq_auth =
0x7f7618e07d70 , rq_ap1 = 0x0, rq_ap2 = 0x0,
   rq_msg = {rm_xid = 2828691980, rm_direction = REPLY, ru = {RM_cmb =
{cb_rpcvers = 2}, RM_rmb = {rp_stat = MSG_ACCEPTED, ru = {
   RP_ar = {ar_stat = SUCCESS, ru = {AR_versions = {low =
4549240, high = 0}, AR_results = {
 proc = 0x456a78 , where =
0x7f75db135380}}, ar_verf = {oa_flavor = 0, oa_length = 0,
   oa_body = '\000' }}, RP_dr = {rj_stat
= RPC_MISMATCH, ru = {RJ_versions = {low = 0,
 high = 4549240}, RJ_why = AUTH_OK}, rm_xdr = {proc
= 0x4569bf , where = 0x7f75e1cabf08},

I'm not sure how most of the fields in the dupreq_entry can be zeros when it
reaches nfs_dupreq_finish() - at least rq_xid, rq_prog, etc. should
have been filled.

If this is a known issue, please let me know.

Thanks,



Re: [Nfs-ganesha-devel] Possibility of converting FSAL_VFS and FSAL_XFS to use stackable FSALS

2018-01-12 Thread Daniel Gryniewicz

On 01/12/2018 10:08 AM, Frank Filz wrote:
  

So, the linux/freebsd split was not something I was considering for
stacking.  I agree, that doesn't match well.  But the VFS subfsal API
looks like it can just go away and be a stack instead.  And I certainly
would not like to see it extended.


Yea, I think the subfsal could go away.

The VFS/XFS split is more akin to the linux/freebsd split and probably
should stay.

As to Lustre, I'm pretty sure the stackable FSAL will work fine. A stacked
FSAL does have the pointer to the subfsal object, and knows which FSAL is
the subfsal, so in this case where FSAL_LUSTRE MUST stack on top of FSAL_VFS
(in fact FSAL_LUSTRE create export could do the stacking of FSAL_VFS
underneath), there would be no issue of FSAL_LUSTRE looking at the subfsal
obj_handle, or into the state_t and looking at FSAL_VFS structures.


Good point, I'd assumed LUSTRE would hard-code its stacking and not use 
configuration for it, but it's important to actually state it. 
In fact, there's no reason for a user to know there's a VFS in the stack 
at all.
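To make the hard-coded stacking concrete, it could look something like this.
It is a simplified standalone model with invented names, not the real
fsal_export/ops API: LUSTRE's create_export builds the VFS export itself and
keeps the pointer, so the user only ever configures LUSTRE.

#include <stdio.h>
#include <stdlib.h>

/* toy model of a stacked export; the real struct fsal_export and its
 * ops vector are far richer than this */
struct export_sketch {
    const char *fsal_name;
    struct export_sketch *sub_export;   /* NULL for a bottom FSAL */
};

static struct export_sketch *vfs_create_export_sketch(const char *path)
{
    struct export_sketch *exp = calloc(1, sizeof(*exp));

    if (exp)
        exp->fsal_name = "VFS";
    (void)path;     /* the real code would open/claim the filesystem here */
    return exp;
}

static struct export_sketch *lustre_create_export_sketch(const char *path)
{
    struct export_sketch *exp = calloc(1, sizeof(*exp));

    if (!exp)
        return NULL;
    exp->fsal_name = "LUSTRE";
    /* stack on VFS unconditionally; nothing about VFS ever shows up in
     * the export configuration */
    exp->sub_export = vfs_create_export_sketch(path);
    if (!exp->sub_export) {
        free(exp);
        return NULL;
    }
    return exp;
}

int main(void)
{
    struct export_sketch *exp = lustre_create_export_sketch("/lustre/fs0");

    if (!exp)
        return 1;
    printf("%s stacked on %s\n", exp->fsal_name, exp->sub_export->fsal_name);
    free(exp->sub_export);
    free(exp);
    return 0;
}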




As long as we properly document what we are doing, and make sure the
stacking only works as expected, I don't see anything wrong. It only has to
be a layering violation if we decide it should be... And if doing it
carefully as I've advocated keeps it safe, then it isn't a layering
violation...



Agreed.

Daniel



Re: [Nfs-ganesha-devel] NFS-Ganesha and compatible ceph version

2018-01-12 Thread Daniel Gryniewicz
All work necessary for containerized ceph/ganesha will be backported to 
2.6 as it becomes ready.  This means that some 2.6.x version (likely low, 
maybe 2 or 3) will be fully ready for containerized deployment.  It also 
means that 2.7.0 will work out-of-the-box, since the work will be done 
during the 2.7-dev cycle.


Daniel

On 01/12/2018 08:37 AM, Supriti Singh wrote:

Hi Daniel,

Thanks for the reply. Does it mean that 2.7 will already be ready to support 
containerized nfs-ganesha?
Are there more details on what exact features need to be implemented in 
nfs-ganesha and cephfs to make OpenShift/Kubernetes work? I am aware 
of the blog post by Jeff: 
https://jtlayton.wordpress.com/2017/11/07/active-active-nfs-over-cephfs/.


It would be really helpful if we can get more details.

Thanks,
Supriti


--
Supriti Singh
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nürnberg)


Daniel Gryniewicz <d...@redhat.com> 01/09/18 3:45 PM >>>

Yes, 2.7 will be compatible with mimic, and likely with luminous as
well, since features for Ganesha tend to be backported in Ceph.

The only major work I'm aware of coming for Ceph/Ganesha is HA on
OpenShift/Kubernetes for FSAL_CEPH, and full in-place read/write for
FSAL_RGW. I'm sure lots of little things will be added, as usual.

Daniel

On Tue, Jan 9, 2018 at 4:52 AM, Supriti Singh <supriti.si...@suse.com> 
wrote:

Hello,

As v2.6-rc1 was recently tagged, I assume the work will start very soon for
2.7. If yes, then is 2.7 targeted to be compatible with ceph mimic? Also,
what are the main features planned for 2.7?

Thanks,
Supriti

--
Supriti Singh
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nürnberg)










Re: [Nfs-ganesha-devel] Possibility of converting FSAL_VFS and FSAL_XFS to use stackable FSALS

2018-01-12 Thread Daniel Gryniewicz
So, the linux/freebsd split was not something I was considering for 
stacking.  I agree, that doesn't match well.  But the VFS subfsal API 
looks like it can just go away and be a stack instead.  And I certainly 
would not like to see it extended.


Daniel

On 01/09/2018 06:19 PM, Frank Filz wrote:

I was looking into using stackable FSALs instead of the current mechanism
for creating FSAL_VFS (Linux), FSAL_VFS (freebsd), and FSAL_XFS. The
following functions are implemented differently for FSAL_VFS and FSAL_XFS

int vfs_fd_to_handle(int fd, struct fsal_filesystem *fs,
  vfs_file_handle_t *fh);

This is only used by lookup_path, ultimately the best thing to do would
probably be to duplicate the lookup_path code in both under stacked FSALs.

int vfs_name_to_handle(int atfd,
struct fsal_filesystem *fs,
const char *name,
vfs_file_handle_t *fh);

This could be mapped to the FSAL lookup method.

int vfs_open_by_handle(struct vfs_filesystem *fs,
vfs_file_handle_t *fh, int openflags,
fsal_errors_t *fsal_error);

This could be mapped to the FSAL open2 method

int vfs_encode_dummy_handle(vfs_file_handle_t *fh,
 struct fsal_filesystem *fs);

This is used to fabricate an object handle when we cross a filesystem mount
point into a filesystem not handled by the FSAL. It would somehow have to be
mapped to lookup, but that overloads the use of lookup for
vfs_name_to_handle.

bool vfs_is_dummy_handle(vfs_file_handle_t *fh);

This one is really internal to the guts of FSAL_VFS/XFS and has no analog in
the FSAL API.

bool vfs_valid_handle(struct gsh_buffdesc *desc);

This is called by vfs_check_handle  and could maybe be mapped to the FSAL
wire_to_host method.

int vfs_readlink(struct vfs_fsal_obj_handle *myself,
  fsal_errors_t *fsal_error);

This could be mapped to FSAL readlink method.

int vfs_extract_fsid(vfs_file_handle_t *fh,
  enum fsid_type *fsid_type,
  struct fsal_fsid__ *fsid);

This is also called by vfs_check_handle and maybe more of that function
could be moved into the FSAL wire_to_host method.

int vfs_get_root_handle(struct vfs_filesystem *vfs_fs,
 struct vfs_fsal_export *exp);

int vfs_re_index(struct vfs_filesystem *vfs_fs,
  struct vfs_fsal_export *exp);

These two are used in the process of claiming filesystems during
create_export, so perhaps could be mapped to FSAL create_export method.

All in all, these functions are not a great fit for the FSAL API so I'm not
sure it would be a good solution. Forcing some of the functions into FSAL
methods would require some code duplication that loses some of the advantage
of the mechanism. There would also be a question of how things like the file
descriptors and fsal_filesystems are shared between the main FSAL_ VFS and
the underlying stacked FSAL.

Frank













Re: [Nfs-ganesha-devel] NFS-Ganesha and compatible ceph version

2018-01-09 Thread Daniel Gryniewicz
Yes, 2.7 will be compatible with mimic, and likely with luminous as
well, since features for Ganesha tend to be backported in Ceph.

The only major work I'm aware of coming for Ceph/Ganesha is HA on
OpenShift/Kubernetes for FSAL_CEPH, and full in-place read/write for
FSAL_RGW.  I'm sure lots of little things will be added, as usual.

Daniel

On Tue, Jan 9, 2018 at 4:52 AM, Supriti Singh  wrote:
> Hello,
>
> As v2.6-rc1 was recently tagged, I assume the work will start very soon for
> 2.7. If yes, then is 2.7 targeted to be compatible with ceph mimic? Also,
> what are the main features list aimed for 2.7?
>
> Thanks,
> Supriti
>
> --
> Supriti Singh
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nürnberg)
>
>
>



Re: [Nfs-ganesha-devel] [FSAL_VFS] Probable memory leak when deallocating handles

2018-01-09 Thread Daniel Gryniewicz
I would actually like to see the sub-FSAL removed, and xfs converted
to a stacked FSAL at some point, now that panfs is dead.  But it's not
high on my list (especially with testing xfs...)

Daniel

On Mon, Jan 8, 2018 at 3:15 PM, Frank Filz  wrote:
>> The problem is that the sub-FSAL interface was never intended to be a
>> general interface.  It was added specifically to allow certain features to be
>> added to panfs, so it only has the entrypoints needed for that work.
>> Realistically, if I were implementing it now, I'd use a stacked FSAL, but at 
>> the
>> time that wasn't possible.
>>
>> If you're working on a FSAL that's a slight variation on VFS, you might 
>> consider
>> a stacked FSAL instead.
>
> Yea, I think I'd like to see future work here use the stacked FSAL mechanism. 
> I'm curious how FSAL_ZFS would look as a stacked FSAL (it never even made it 
> to sub-FSAL status). On the other hand, that maybe produces more code 
> duplication than the current implementation of FSAL_ZFS...
>
> What might be good to do is actually lay out where a new variant of FSAL_VFS 
> has to be different and determine the best way to implement that with the 
> smallest amount of code duplication.
>
> Frank
>
>> On 01/08/2018 11:55 AM, sriram patil wrote:
>> > Hmm we can use the mdcache. But wanted to store some extra
>> information
>> > from sub fsal. I guess I can do it by allocating more memory in
>> > vfs_sub_alloc_handle with a single malloc call.
>> >
>> > Thanks,
>> > Sriram
>> >
>> > On 08-Jan-2018 7:43 PM, "Frank Filz" > > > wrote:
>> >
>> > Why did you want to have an additional cache? FSAL_MDCACHE already
>> > provides a cache of essentially every handle (though FSAL_MEM and
>> > FSAL_PSEUDO maintain some additional caching because those object
>> > handles cannot be evicted (we should actually beef up mdcache to
>> > genuinely prevent those handles from being evicted so they don’t
>> > have to do additional caching work…).
>> >
>> >
>> > Frank
>> >
>> >
>> > *From:*sriram patil [mailto:spsrirampa...@gmail.com
>> > ]
>> > *Sent:* Sunday, January 7, 2018 11:00 PM
>> > *To:* nfs-ganesha-devel@lists.sourceforge.net
>> > 
>> > *Cc:* kcha...@vmware.com ;
>> > sakt...@vmware.com 
>> > *Subject:* Re: [Nfs-ganesha-devel] [FSAL_VFS] Probable memory leak
>> > when deallocating handles
>> >
>> >
>> > Hi,
>> >
>> >
>> > Sorry for the confusion here. The free should work fine because it
>> > is contiguous memory allocated in a single malloc/calloc call.
>> >
>> > The problem I wanted to address is, there is no corresponding
>> > vfs_sub_free_handle for vfs_sub_alloc_handle. I wanted to maintain a
>> > cache for every handle, once we release the handle the cache entries
>> > should also be evicted. Because there is no support for
>> > vfs_sub_free_handle, the sub fsal does not know when is the handle
>> > released.
>> >
>> >
>> > Also, it is easier if we have a void pointer in vfs_fsal_obj_handle
>> > to keep sub fsal specific data. This way there is no need of having
>> > a separate cache in sub fsal.
>> >
>> >
>> > Thanks,
>> >
>> > Sriram
>> >
>> >
>> > On Mon, Jan 8, 2018 at 11:58 AM, sriram patil
>> > >
>> > wrote:
>> >
>> > Hi,
>> >
>> >
>> > I was going through the vfs_fsal_obj_handle workflow. As part of
>> > the function alloc_handle we allocate the handle with the help
>> > of vfs_sub_alloc_handle.
>> >
>> >
>> > vfs_sub_alloc_handle allocates vfs_fsal_obj_handle and
>> > vfs_file_handle_t back to back. And when releasing the handle
>> > (obj_ops->release), it calls free on the vfs_fsal_obj_handle.
>> > So, when is vfs_file_handle_t freed? Am I missing something
>> > here?
>> >
>> >
>> > Thanks,
>> >
>> > Sriram
>> >
>> >
>> >
>> >
>> > 
>> > 

Re: [Nfs-ganesha-devel] [FSAL_VFS] Probable memory leak when deallocating handles

2018-01-08 Thread Daniel Gryniewicz
The problem is that the sub-FSAL interface was never intended to be a 
general interface.  It was added specifically to allow certain features 
to be added to panfs, so it only has the entrypoints needed for that 
work.  Realistically, if I were implementing it now, I'd use a stacked 
FSAL, but at the time that wasn't possible.


If you're working on a FSAL that's a slight variation on VFS, you might 
consider a stacked FSAL instead.


Daniel

On 01/08/2018 11:55 AM, sriram patil wrote:
Hmm we can use the mdcache. But wanted to store some extra information 
from sub fsal. I guess I can do it by allocating more memory in 
vfs_sub_alloc_handle with a single malloc call.


Thanks,
Sriram

On 08-Jan-2018 7:43 PM, "Frank Filz" > wrote:


Why did you want to have an additional cache? FSAL_MDCACHE already
provides a cache of essentially every handle (though FSAL_MEM and
FSAL_PSEUDO maintain some additional caching because those object
handles cannot be evicted (we should actually beef up mdcache to
genuinely prevent those handles from being evicted so they don’t
have to do additional caching work…).


Frank


*From:*sriram patil [mailto:spsrirampa...@gmail.com
]
*Sent:* Sunday, January 7, 2018 11:00 PM
*To:* nfs-ganesha-devel@lists.sourceforge.net

*Cc:* kcha...@vmware.com ;
sakt...@vmware.com 
*Subject:* Re: [Nfs-ganesha-devel] [FSAL_VFS] Probable memory leak
when deallocating handles


Hi,


Sorry for the confusion here. The free should work fine because it
is contiguous memory allocated in a single malloc/calloc call.
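For reference, the pattern is roughly the following (a simplified standalone
sketch, not the actual VFS structures): one calloc covers both the object
handle and the trailing file handle, so the single free on release covers
both as well.

#include <stdlib.h>

/* simplified stand-ins for vfs_fsal_obj_handle / vfs_file_handle_t */
struct file_handle_sketch {
    unsigned int handle_len;
    unsigned char handle_data[48];
};

struct obj_handle_sketch {
    int fileid;
    struct file_handle_sketch *fh;  /* points into the same allocation */
};

static struct obj_handle_sketch *alloc_handle_sketch(void)
{
    /* one calloc for both structures, back to back */
    struct obj_handle_sketch *hdl =
        calloc(1, sizeof(*hdl) + sizeof(struct file_handle_sketch));

    if (hdl)
        hdl->fh = (struct file_handle_sketch *)(hdl + 1);
    return hdl;
}

int main(void)
{
    struct obj_handle_sketch *hdl = alloc_handle_sketch();

    if (!hdl)
        return 1;
    hdl->fh->handle_len = 8;
    /* a single free releases the file handle too, since it lives in
     * the same allocation as the object handle */
    free(hdl);
    return 0;
}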

The problem I wanted to address is that there is no corresponding
vfs_sub_free_handle for vfs_sub_alloc_handle. I wanted to maintain a
cache for every handle; once we release the handle, the cache entries
should also be evicted. Because there is no support for
vfs_sub_free_handle, the sub fsal does not know when the handle is
released.


Also, it is easier if we have a void pointer in vfs_fsal_obj_handle
to keep sub fsal specific data. This way there is no need of having
a separate cache in sub fsal.


Thanks,

Sriram


On Mon, Jan 8, 2018 at 11:58 AM, sriram patil
> wrote:

Hi,


I was going through the vfs_fsal_obj_handle workflow. As part of
the function alloc_handle we allocate the handle with the help
of vfs_sub_alloc_handle.


vfs_sub_alloc_handle allocates vfs_fsal_obj_handle and
vfs_file_handle_t back to back. And when releasing the handle
(obj_ops->release), it calls free on the vfs_fsal_obj_handle.
So, when is vfs_file_handle_t freed? Am I missing something
here?


Thanks,

Sriram














Re: [Nfs-ganesha-devel] Implement a FSAL for S3-compatible storage

2017-12-15 Thread Daniel Gryniewicz

On 12/15/2017 12:48 PM, Frank Filz wrote:

From: Aurelien RAINONE [mailto:aurelien.rain...@gmail.com]
Sent: Friday, December 15, 2017 9:30 AM
To: Nfs-ganesha-devel@lists.sourceforge.net
Subject: [Nfs-ganesha-devel] Implement a FSAL for S3-compatible storage

Hi,

I'm new to nfs-ganesha and to this mailing list. I'm currently reading as
much documentation and code as I can in order to understand where to start
the implementation of a new FSAL.

The main objectives of my development are:
  - connect one fsal export to exactly one S3 Bucket. Requests to the S3 host
will be performed with libs3 (github.com/bji/libs3).
  - make proper use of cache in order to not overload the network connection
to the S3 host.
  - no need for full ACL support, maybe also no ACL at all, at least at the
beginning.

My new FSAL would be based on v2.5.4 stable branch.

Here are my surely very basic questions:

1) I read that FSAL API has undergone major refactorings in the last
months/years. Is the 2.5.4 branch recent enough for my needs?


Should be reasonable, though you might want to develop against next. I strongly 
encourage development in tree.


2) I intend to start developing the new FSAL over FSAL_RGW. Is my choice
correct?


I'm not quite sure what you mean by that.


FSAL_RGW may or may not be a good place to start.  It provides a 
filesystem-like library API to Ganesha that translates internally to the 
objects view provided by RGW.  It doesn't actually use any S3 at all, it 
speaks RADOS directly to the Ceph cluster.


That said, there may not be any better FSAL to base on, since all FSALs 
in the tree currently provide a filesystem API, except for PROXY, which 
translates NFS directly to NFS.


In our work on FSAL_RGW, we've come to the conclusion that converting 
between NFS and S3 is very hard, so expect to hit many cases where it 
either doesn't work, or doesn't work well, for the first several 
iterations at least.


Daniel



Re: [Nfs-ganesha-devel] dev.20 segfault on shutdown

2017-12-13 Thread Daniel Gryniewicz
This is why I've never bothered to fix the crashes when
destroy_fsals() finds things; it's an indication to me that something
is leaking refs.
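The "check that everything was cleaned up" idea that comes up later in this
thread could be as simple as the sketch below: a standalone toy model, not
the real destroy_fsals() code, that walks the export list at shutdown and
loudly logs anything still on it instead of quietly releasing it.

#include <stdio.h>

/* toy export list; the real code keeps these per FSAL module */
struct export_sketch {
    int export_id;
    struct export_sketch *next;
};

static struct export_sketch *export_list;

/* shutdown-time sanity check: a correct run should find the list empty,
 * so anything left here is a leaked export reference worth logging */
static int check_exports_released(void)
{
    int leaked = 0;
    struct export_sketch *exp;

    for (exp = export_list; exp != NULL; exp = exp->next) {
        fprintf(stderr, "leaked export reference: id %d\n",
                exp->export_id);
        leaked++;
    }
    return leaked;
}

int main(void)
{
    struct export_sketch leftover = { 77, NULL };

    export_list = &leftover;    /* simulate a leaked ref */
    return check_exports_released() ? 1 : 0;
}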

On Tue, Dec 12, 2017 at 7:55 PM, Frank Filz  wrote:
>> I was testing code I'd written over the weekend, but it segfaulted on
>> shutdown after running pynfs (pynfs itself was successful.)  No problems
>> simply starting and pkilling without doing any work.
>>
>> Gradually backed things out, until I'm at the 1a75e52 V2.6-dev.20, but
> still
>> seeing the problem on shutdown.  Ran it twice to be sure.  Took quite a
> bit of
>> time to run pynfs over and over.
>
> Ok, so I've fixed the crash, but looking at some debug, the reason we are
> getting to where it could crash is that we are leaking export references.
> I'm doing some code examination and finding export and obj_handle reference
> leaks... So far they are all in NFS v4.
>
> I hope to post some patches early tomorrow.
>
> It would really help if things that expected everything to cleanup actually
> checked if everything was cleaned up...
>
> destroy_fsals should never find any exports to call shutdown_export on.
>
> Frank
>
>> Error: couldn't complete write to the log file
>> /home/bill/rdma/install/var/log/ganesha.log status=9 (Bad file descriptor)
>> message was:
>> 11/12/2017 19:13:01 : epoch 5a2f193a : simpson91 : ganesha.nfsd-
>> 13288[Admin] rpc :TIRPC :DEBUG :svc_destroy_it() 0x6198bb80 fd 19
>> xp_refs 1 af 0 port 4294967295 @ svc_xprt_shutdown:364
>>
>> Thread 271 "ganesha.nfsd" received signal SIGSEGV, Segmentation fault.
>> [Switching to Thread 0x7fff68053700 (LWP 31096)]
>> 0x7fffef8ca739 in release (exp_hdl=0x6130cec0)
>>  at /home/bill/rdma/nfs-ganesha/src/FSAL/FSAL_VFS/export.c:79
>> 79LogDebug(COMPONENT_FSAL, "Releasing VFS export for
>> %s",
>> (gdb) bt
>> #0  0x7fffef8ca739 in release (exp_hdl=0x6130cec0)
>>  at /home/bill/rdma/nfs-ganesha/src/FSAL/FSAL_VFS/export.c:79
>> #1  0x0044799d in shutdown_export (export=0x6130cec0)
>>  at /home/bill/rdma/nfs-ganesha/src/FSAL/fsal_destroyer.c:152
>> #2  0x00447d66 in destroy_fsals ()
>>  at /home/bill/rdma/nfs-ganesha/src/FSAL/fsal_destroyer.c:194
>> #3  0x0047d9c3 in do_shutdown ()
>>  at /home/bill/rdma/nfs-ganesha/src/MainNFSD/nfs_admin_thread.c:511
>> #4  0x0047de09 in admin_thread (UnusedArg=0x0)
>>  at /home/bill/rdma/nfs-ganesha/src/MainNFSD/nfs_admin_thread.c:531
>> #5  0x760b373a in start_thread (arg=0x7fff68053700)
>>  at pthread_create.c:333
>> #6  0x7598ae7f in clone ()
>>  at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
>> (gdb) quit
>> A debugging session is active.
>>
>>   Inferior 1 [process 30823] will be killed.
>>
>> Quit anyway? (y or n) y
>> [root@simpson91 install]#
>>
>>
>> Thread 270 "ganesha.nfsd" received signal SIGSEGV, Segmentation fault.
>> [Switching to Thread 0x7fff68087700 (LWP 6650)]
>> 0x7fffef8ca739 in release (exp_hdl=0x6130cec0)
>>  at /home/bill/rdma/nfs-ganesha/src/FSAL/FSAL_VFS/export.c:79
>> 79LogDebug(COMPONENT_FSAL, "Releasing VFS export for
>> %s",
>> (gdb) bt
>> #0  0x7fffef8ca739 in release (exp_hdl=0x6130cec0)
>>  at /home/bill/rdma/nfs-ganesha/src/FSAL/FSAL_VFS/export.c:79
>> #1  0x0044799d in shutdown_export (export=0x6130cec0)
>>  at /home/bill/rdma/nfs-ganesha/src/FSAL/fsal_destroyer.c:152
>> #2  0x00447d66 in destroy_fsals ()
>>  at /home/bill/rdma/nfs-ganesha/src/FSAL/fsal_destroyer.c:194
>> #3  0x0047d9c3 in do_shutdown ()
>>  at /home/bill/rdma/nfs-ganesha/src/MainNFSD/nfs_admin_thread.c:511
>> #4  0x0047de09 in admin_thread (UnusedArg=0x0)
>>  at /home/bill/rdma/nfs-ganesha/src/MainNFSD/nfs_admin_thread.c:531
>> #5  0x760b373a in start_thread (arg=0x7fff68087700)
>>  at pthread_create.c:333
>> #6  0x75989e7f in clone ()
>>  at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
>> (gdb) quit
>> A debugging session is active.
>>
>>   Inferior 1 [process 6378] will be killed.
>>
>> Quit anyway? (y or n) y
>> [root@simpson91 install]#
>>
>>
>>
>>
> 
>
>
>
>

Re: [Nfs-ganesha-devel] test over FSAl_NULL

2017-12-12 Thread Daniel Gryniewicz

Okay, with this fix, stacking NULL works for me:

 https://review.gerrithub.io/391463

Daniel

On 12/12/2017 11:52 AM, Daniel Gryniewicz wrote:
Okay, I'm able to reproduce.  I'm looking at this, but the problem is 
that the export being set before mdcache is called is NULL's export, not 
MDCACHE's export, so the double un-stack causes VFS to see a NULL 
export.  Somewhere, the top of the export stack is being lost.


Daniel

On 12/08/2017 10:59 AM, Daniel Gryniewicz wrote:
I run NULL semi-regularly.  The last time I ran it was a couple of 
months ago, so something may have crept in.  I'll try again.


That said, the code in that callpath looks correct.

Daniel

On 12/08/2017 05:46 AM, LUCAS Patrice wrote:

Hi,


Has anyone recently tested the FSAL_NULL stackable FSAL?


Before using it as an example of coding a stackable FSAL, I simply tried 
to use FSAL_NULL over FSAL_VFS, and I got the following segmentation 
fault when running cthon04 basic test7 ('link and rename').


Best regards,

Patrice


Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x72daf700 (LWP 22397)]
0x0041c23e in posix2fsal_attributes (buffstat=0x72dad590, 
fsalattr=0x72dad780) at /opt/nfs-ganesha/src/FSAL/fsal_convert.c:432
432 fsalattr->supported = 
op_ctx->fsal_export->exp_ops.fs_supported_attrs(
Missing separate debuginfos, use: debuginfo-install 
glibc-2.17-157.el7_3.5.x86_64 gssproxy-0.4.1-13.el7.x86_64 
keyutils-libs-1.5.8-3.el7.x86_64 kr
b5-libs-1.14.1-27.ocean1.el7.centos.x86_64 
libcom_err-1.42.13.wc6-8.ocean1.el7.centos.x86_64 
libselinux-2.5-6.el7.x86_64 pcre-8.32-15.el7_2.1.x86_

64
(gdb) where
#0  0x0041c23e in posix2fsal_attributes 
(buffstat=0x72dad590, fsalattr=0x72dad780)

 at /opt/nfs-ganesha/src/FSAL/fsal_convert.c:432
#1  0x0041c21c in posix2fsal_attributes_all 
(buffstat=0x72dad590, fsalattr=0x72dad780)

 at /opt/nfs-ganesha/src/FSAL/fsal_convert.c:422
#2  0x73f155dc in fetch_attrs (myself=0x7fffd801baf0, 
my_fd=35, attrs=0x72dad780) at 
/opt/nfs-ganesha/src/FSAL/FSAL_VFS/file.c:325
#3  0x73f1927a in vfs_getattr2 (obj_hdl=0x7fffd801baf0, 
attrs=0x72dad780) at /opt/nfs-ganesha/src/FSAL/FSAL_VFS/file.c:1595
#4  0x741255d3 in getattrs (obj_hdl=0x7fffd800fdd0, 
attrib_get=0x72dad780)

 at /opt/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_NULL/handle.c:503
#5  0x00531459 in mdcache_refresh_attrs 
(entry=0x7fffd80175e0, need_acl=false, invalidate=false)
 at 
/opt/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1048 

#6  0x0052d4a1 in mdcache_refresh_attrs_no_invalidate 
(entry=0x7fffd80175e0)
 at 
/opt/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_int.h:445
#7  0x005310be in mdcache_rename (obj_hdl=0x7fffe4037888, 
olddir_hdl=0x7fffd8017618, old_name=0x7fffd8002b80 "file.0",

 newdir_hdl=0x7fffd8017618, new_name=0x7fffd800f730 "newfile.0")
 at 
/opt/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:991 

#8  0x00431b26 in fsal_rename (dir_src=0x7fffd8017618, 
oldname=0x7fffd8002b80 "file.0", dir_dest=0x7fffd8017618,
 newname=0x7fffd800f730 "newfile.0") at 
/opt/nfs-ganesha/src/FSAL/fsal_helper.c:1412
#9  0x00475947 in nfs4_op_rename (op=0x7fffd80153f0, 
data=0x72dadae0, resp=0x7fffd8018220)

 at /opt/nfs-ganesha/src/Protocols/NFS/nfs4_op_rename.c:122
#10 0x00459b84 in nfs4_Compound (arg=0x7fffd800c538, 
req=0x7fffd800be30, res=0x7fffd800a950)

 at /opt/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:752
#11 0x0044ab75 in nfs_rpc_process_request 
(reqdata=0x7fffd800be30) at 
/opt/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1338
#12 0x0044b77a in nfs_rpc_valid_NFS (req=0x7fffd800be30) at 
/opt/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1736
#13 0x76c28546 in svc_vc_decode (req=0x7fffd800be30) at 
/opt/nfs-ganesha/src/libntirpc/src/svc_vc.c:812
#14 0x0044fb64 in nfs_rpc_decode_request 
(xprt=0x7fffe4000bc0, xdrs=0x7fffd8017be0)

 at /opt/nfs-ganesha/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1625
#15 0x76c28458 in svc_vc_recv (xprt=0x7fffe4000bc0) at 
/opt/nfs-ganesha/src/libntirpc/src/svc_vc.c:785
#16 0x76c24bce in svc_rqst_xprt_task (wpe=0x7fffe4000dd8) at 
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:753
#17 0x76c25048 in svc_rqst_epoll_events (sr_rec=0x7ef210, 
n_events=1) at /opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:925
#18 0x76c252ea in svc_rqst_epoll_loop (sr_rec=0x7ef210) at 
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:998
#19 0x76c2539d in svc_rqst_run_task (wpe=0x7ef210) at 
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1034
#20 0x76c2e9f1 in work_pool_thread (arg=0x7fffe8c0) at 
/opt/nfs-ganesha/src/libntirpc/src/work_pool.c:176

#21 0x77058dc5 in start_thread () from /lib64/libpthread.so.0
#22 0x7f

Re: [Nfs-ganesha-devel] test over FSAl_NULL

2017-12-12 Thread Daniel Gryniewicz
Okay, I'm able to reproduce.  I'm looking at this, but the problem is 
that the export being set before mdcache is called is NULL's export, not 
MDCACHE's export, so the double un-stack causes VFS to see a NULL 
export.  Somewhere, the top of the export stack is being lost.


Daniel

On 12/08/2017 10:59 AM, Daniel Gryniewicz wrote:
I run NULL semi-regularly.  The last time I ran it was a couple of 
months ago, so something may have crept in.  I'll try again.


That said, the code in that callpath looks correct.

Daniel

On 12/08/2017 05:46 AM, LUCAS Patrice wrote:

Hi,


Has anyone recently tested the FSAL_NULL stackable FSAL?


Before using it as an example of coding a stackable FSAL, I simply tried 
to use FSAL_NULL over FSAL_VFS, and I got the following segmentation 
fault when running cthon04 basic test7 ('link and rename').


Best regards,

Patrice


Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x72daf700 (LWP 22397)]
0x0041c23e in posix2fsal_attributes (buffstat=0x72dad590, 
fsalattr=0x72dad780) at /opt/nfs-ganesha/src/FSAL/fsal_convert.c:432
432 fsalattr->supported = 
op_ctx->fsal_export->exp_ops.fs_supported_attrs(
Missing separate debuginfos, use: debuginfo-install 
glibc-2.17-157.el7_3.5.x86_64 gssproxy-0.4.1-13.el7.x86_64 
keyutils-libs-1.5.8-3.el7.x86_64 kr
b5-libs-1.14.1-27.ocean1.el7.centos.x86_64 
libcom_err-1.42.13.wc6-8.ocean1.el7.centos.x86_64 
libselinux-2.5-6.el7.x86_64 pcre-8.32-15.el7_2.1.x86_

64
(gdb) where
#0  0x0041c23e in posix2fsal_attributes 
(buffstat=0x72dad590, fsalattr=0x72dad780)

 at /opt/nfs-ganesha/src/FSAL/fsal_convert.c:432
#1  0x0041c21c in posix2fsal_attributes_all 
(buffstat=0x72dad590, fsalattr=0x72dad780)

 at /opt/nfs-ganesha/src/FSAL/fsal_convert.c:422
#2  0x73f155dc in fetch_attrs (myself=0x7fffd801baf0, 
my_fd=35, attrs=0x72dad780) at 
/opt/nfs-ganesha/src/FSAL/FSAL_VFS/file.c:325
#3  0x73f1927a in vfs_getattr2 (obj_hdl=0x7fffd801baf0, 
attrs=0x72dad780) at /opt/nfs-ganesha/src/FSAL/FSAL_VFS/file.c:1595
#4  0x741255d3 in getattrs (obj_hdl=0x7fffd800fdd0, 
attrib_get=0x72dad780)

 at /opt/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_NULL/handle.c:503
#5  0x00531459 in mdcache_refresh_attrs (entry=0x7fffd80175e0, 
need_acl=false, invalidate=false)
 at 
/opt/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1048 

#6  0x0052d4a1 in mdcache_refresh_attrs_no_invalidate 
(entry=0x7fffd80175e0)
 at 
/opt/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_int.h:445
#7  0x005310be in mdcache_rename (obj_hdl=0x7fffe4037888, 
olddir_hdl=0x7fffd8017618, old_name=0x7fffd8002b80 "file.0",

 newdir_hdl=0x7fffd8017618, new_name=0x7fffd800f730 "newfile.0")
 at 
/opt/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:991 

#8  0x00431b26 in fsal_rename (dir_src=0x7fffd8017618, 
oldname=0x7fffd8002b80 "file.0", dir_dest=0x7fffd8017618,
 newname=0x7fffd800f730 "newfile.0") at 
/opt/nfs-ganesha/src/FSAL/fsal_helper.c:1412
#9  0x00475947 in nfs4_op_rename (op=0x7fffd80153f0, 
data=0x72dadae0, resp=0x7fffd8018220)

 at /opt/nfs-ganesha/src/Protocols/NFS/nfs4_op_rename.c:122
#10 0x00459b84 in nfs4_Compound (arg=0x7fffd800c538, 
req=0x7fffd800be30, res=0x7fffd800a950)

 at /opt/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:752
#11 0x0044ab75 in nfs_rpc_process_request 
(reqdata=0x7fffd800be30) at 
/opt/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1338
#12 0x0044b77a in nfs_rpc_valid_NFS (req=0x7fffd800be30) at 
/opt/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1736
#13 0x76c28546 in svc_vc_decode (req=0x7fffd800be30) at 
/opt/nfs-ganesha/src/libntirpc/src/svc_vc.c:812
#14 0x0044fb64 in nfs_rpc_decode_request (xprt=0x7fffe4000bc0, 
xdrs=0x7fffd8017be0)

 at /opt/nfs-ganesha/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1625
#15 0x76c28458 in svc_vc_recv (xprt=0x7fffe4000bc0) at 
/opt/nfs-ganesha/src/libntirpc/src/svc_vc.c:785
#16 0x76c24bce in svc_rqst_xprt_task (wpe=0x7fffe4000dd8) at 
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:753
#17 0x76c25048 in svc_rqst_epoll_events (sr_rec=0x7ef210, 
n_events=1) at /opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:925
#18 0x76c252ea in svc_rqst_epoll_loop (sr_rec=0x7ef210) at 
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:998
#19 0x76c2539d in svc_rqst_run_task (wpe=0x7ef210) at 
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1034
#20 0x76c2e9f1 in work_pool_thread (arg=0x7fffe8c0) at 
/opt/nfs-ganesha/src/libntirpc/src/work_pool.c:176

#21 0x77058dc5 in start_thread () from /lib64/libpthread.so.0
#22 0x7671a76d in clone () from /lib64/libc.so.6
(gdb)

-- 


Check out the 

Re: [Nfs-ganesha-devel] Stacked FSALs and fsal_export parameters and op_ctx

2017-12-08 Thread Daniel Gryniewicz

fsal_export should probably be set anywhere gsh_export is set.
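For the cross-export subcall case described below, the caller-side
save/restore would look roughly like this.  It is a hedged standalone
sketch: op_ctx is modelled as a bare thread-local struct and the names are
simplified, not the real req_op_context.

#include <stdio.h>

/* toy stand-ins for struct req_op_context and struct fsal_export */
struct fsal_export_sketch {
    const char *name;
};

struct op_context_sketch {
    struct fsal_export_sketch *fsal_export;
    struct fsal_export_sketch *ctx_export;
};

static __thread struct op_context_sketch op_ctx_sketch;

/* calling into a different export: save, switch, call, restore, so the
 * subcall macros in the callee cannot leave op_ctx pointing at the
 * wrong export when we return */
static void call_other_export(struct fsal_export_sketch *other,
                              void (*method)(void))
{
    struct fsal_export_sketch *saved = op_ctx_sketch.fsal_export;

    op_ctx_sketch.fsal_export = other;
    method();
    op_ctx_sketch.fsal_export = saved;
}

static void dummy_method(void)
{
    printf("running against %s\n", op_ctx_sketch.fsal_export->name);
}

int main(void)
{
    struct fsal_export_sketch main_exp = { "EXPORT_A" };
    struct fsal_export_sketch other_exp = { "EXPORT_B" };

    op_ctx_sketch.fsal_export = &main_exp;
    call_other_export(&other_exp, dummy_method);
    printf("back on %s\n", op_ctx_sketch.fsal_export->name);
    return 0;
}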

Daniel

On 12/07/2017 07:54 PM, Frank Filz wrote:

Stacked FSALs often depend on op_ctx->fsal_export being set.

We also have lots of FSAL methods that take the fsal_export as a parameter.

I wonder if we would be better off removing the fsal_export parameters in
almost all cases, and instead expecting op_ctx->fsal_export to be set?

One danger is that if an FSAL method is called for a different export than
op_ctx->fsal_export, the subcall macros will end up changing
op_ctx->fsal_export and break the caller... It would be better that the
caller assure that op_ctx is properly set up (and save the current
op_ctx->fsal_export if necessary).

In any case, we probably need to audit anywhere op_ctx->fsal_export is not
set. I see that RQUOTA does not set it...

One advantage of making sure op_ctx->fsal_export is always set in the upper
layers is that it should ALSO set op_ctx->ctx_export.

We should also check for any place where op_ctx is NULL. We have
subcall_shutdown_raw that assumes op_ctx might not be set, and the only
place it is used is in calling sub_export release, with the result that that
was not actually working right in the DBUS unexport case where there wasn't
a proper op_ctx (though I wonder if that's also why shutdown reports an
extra ref to FSAL_PSEUDO, is there some point in shutdown where we don't
have an op_ctx?).

Frank










Re: [Nfs-ganesha-devel] Stacked FSALs and fsal_export parameters and op_ctx

2017-12-08 Thread Daniel Gryniewicz
I'm on the opposite end.  I'm not in favor of passing op_ctx needlessly 
to every function in Ganesha.


Any useful threading system must have some form of TLS.

Daniel

On 12/08/2017 10:13 AM, Matt Benjamin wrote:

I'd like to see this use of TLS as a "hidden parameter" replaced
regardless.  It has been a source of bugs, and locks us into a
pthreads execution model I think needlessly.

Matt

On Fri, Dec 8, 2017 at 10:07 AM, Frank Filz  wrote:

On 12/7/17 7:54 PM, Frank Filz wrote:

Stacked FSALs often depend on op_ctx->fsal_export being set.

We also have lots of FSAL methods that take the fsal_export as a

parameter.



The latter sounds better.

Now that we know every single thread local storage access involves a hidden
lock/unlock sequence in glibc "magically" invoked by the linker, it would be
better to remove as many TLS references as possible!

After all, too many lock/unlock are a real performance issue.

Perhaps we should pass op_ctx as the parameter instead.


I thought the lock was only to create the TLS variable, and not on every 
reference.

Frank













Re: [Nfs-ganesha-devel] Proposal: new scripts repo

2017-11-27 Thread Daniel Gryniewicz

On 11/22/2017 08:04 AM, LUCAS Patrice wrote:

On 11/21/17 15:20, Daniel Gryniewicz wrote:

Hi, All.

I would like to propose a new repo on our github for helpful scripts. 
I'd like to have testing/performance scripts there, as a start, but 
maybe other useful things can go there too in the future.


The idea is that this would be a place to share things like setups for 
running dbench/iozone/etc., so that when issues arise, we can point to 
the script that triggers the issue.


My proposal is that the scripts would have a standard header at the 
top, that describes what needs to be installed, how many machines are 
necessary, and how they are connected on networks, and so on, so that 
someone could pick up a script and, with minimal interaction with 
anyone else, be able to run it.


What do people think about this?  Should we discuss it on the next call?

Daniel




Hi Daniel,

We already have a ganesha github repo dedicated to Continuous 
Integration tests: https://github.com/nfs-ganesha/ci-tests . Why not 
add your scripts to this repo instead of creating a new one?


Best regards,




I thought of that, but that repo is specifically for scripts run by CI, and 
this is intended to be scripts run by hand. I don't have a strong opinion, 
but I feel that ci-tests should be focused on CI.


Daniel



[Nfs-ganesha-devel] Proposal: new scripts repo

2017-11-21 Thread Daniel Gryniewicz

Hi, All.

I would like to propose a new repo on our github for helpful scripts. 
I'd like to have testing/performance scripts there, as a start, but 
maybe other useful things can go there too in the future.


The idea is that this would be a place to share things like setups for 
running dbench/iozone/etc., so that when issues arise, we can point to 
the script that triggers the issue.


My proposal is that the scripts would have a standard header at the top, 
that describes what needs to be installed, how many machines are 
necessary, and how they are connected on networks, and so on, so that 
someone could pick up a script and, with minimal interaction with anyone 
else, be able to run it.


What do people think about this?  Should we discuss it on the next call?

Daniel



Re: [Nfs-ganesha-devel] Exporting several cephfs filesystems

2017-11-20 Thread Daniel Gryniewicz

Hi, Alessandro

Currently, there's no way to address more than one ceph cluster in a 
single Ganesha.  The ceph configuration is in the global CEPH block, 
which is a singleton, shared between all cephfs exports.


Typically, the way you would do this would be to run a ganesha server 
per ceph cluster.  This won't allow, however, putting all the exports 
into a single pseudofs.  This would be possible, by using redirects on 
one server, but very complicated.


I'm not sure about the status of libcephfs allowing connections to 
multiple servers.  It may be possible, but there may be changes needed 
to libcephfs to allow this in Ganesha.  Once libcephfs allows it, it 
shouldn't be too difficult to add config to the CEPH FSAL block within 
an export.
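On the libcephfs side, the handle is already per-mount: each ceph_mount_info
is created from its own conf file, so in principle two exports could hold
mounts into two different clusters.  A sketch of just the libcephfs calls;
the conf paths are examples and error handling is abbreviated.

#include <stdio.h>
#include <cephfs/libcephfs.h>

/* open one cephfs mount from a given ceph.conf; each cluster gets its
 * own ceph_mount_info handle */
static struct ceph_mount_info *mount_cluster(const char *conf_path)
{
    struct ceph_mount_info *cmount = NULL;

    if (ceph_create(&cmount, NULL) != 0)
        return NULL;
    if (ceph_conf_read_file(cmount, conf_path) != 0 ||
        ceph_mount(cmount, "/") != 0) {
        ceph_release(cmount);
        return NULL;
    }
    return cmount;
}

int main(void)
{
    /* example paths, one conf per cluster */
    struct ceph_mount_info *a = mount_cluster("/etc/ceph/cluster-a.conf");
    struct ceph_mount_info *b = mount_cluster("/etc/ceph/cluster-b.conf");

    if (!a || !b) {
        fprintf(stderr, "mount failed\n");
        return 1;
    }
    /* ... each export would drive I/O through its own cmount ... */
    ceph_unmount(a);
    ceph_release(a);
    ceph_unmount(b);
    ceph_release(b);
    return 0;
}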


Daniel

On 11/20/2017 12:38 PM, Alessandro De Salvo wrote:

Hi,

I'm trying to export several cephfs filesystems, hosted in different 
clusters, using ganesha v2.5.3.


I saw from the code there is an option 'ceph_conf' that could be used to 
specify different ceph configs, but it does not really appear to be an 
option understood in the EXPORT block, probably it's only a general 
config. Is there any way to export several cephfs filesystems, hosted in 
different clusters, exposing them in appropriate pseudo paths?


Thanks,


 Alessandro







Re: [Nfs-ganesha-devel] Subfsal export

2017-11-10 Thread Daniel Gryniewicz
You can do this with a stackable FSAL, since it's on top, not 
underneath, so it can override anything you want.  You can even just put 
the symlink part in your FSAL, and defer everything else to VFS, if you 
want.
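A sketch of the "just the symlink part" idea: the stacked FSAL's lookup_path
resolves the configured path first and then hands the result to the layer
below.  The lower layer is modelled here as a plain function; the real code
would go through the sub-FSAL's ops vector.

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* stand-in for the sub-FSAL's (VFS) lookup_path, which refuses
 * symlinks; here it just reports what it was given */
static int vfs_lookup_path_sketch(const char *path)
{
    printf("lower layer asked to export: %s\n", path);
    return 0;
}

/* stacked FSAL: resolve symlinks in the export path up front, then
 * defer everything else to the layer below */
static int stacked_lookup_path_sketch(const char *path)
{
    char resolved[PATH_MAX];

    if (realpath(path, resolved) == NULL) {
        perror("realpath");
        return -1;
    }
    return vfs_lookup_path_sketch(resolved);
}

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "/tmp";

    return stacked_lookup_path_sketch(path) ? 1 : 0;
}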


Daniel

On 11/08/2017 03:04 PM, Frank Filz wrote:
Yea, in the case of wanting to allow symlink exports, you would need to 
replace the method. I’d be interested in why you want to allow them. 
That is something that actually could be handled by a config or compile 
option.


Frank

*From:*sriram patil [mailto:spsrirampa...@gmail.com]
*Sent:* Wednesday, November 8, 2017 10:20 AM
*To:* Frank Filz <ffilz...@mindspring.com>
*Cc:* d...@redhat.com; nfs-ganesha-devel@lists.sourceforge.net
*Subject:* Re: [Nfs-ganesha-devel] Subfsal export

Hi,

In the case of stackable FSALs, what if I want the underlying FSAL to 
behave differently?  For example, VFS does not allow symlink exports: 
lookup_path fails if we try to export a symlink pointing to a directory. 
There can be cases where I want to differ (in very small ways) from 
the underlying "sub fsal".  For bypassing some part of the underlying 
fsal function, is rewriting the function the only way?


Thanks,

Sriram

On Wed, Nov 8, 2017 at 11:22 PM, sriram patil <spsrirampa...@gmail.com 
<mailto:spsrirampa...@gmail.com>> wrote:


Hmm stackable FSAL sounds like a good option. Let me investigate on
that. I was mainly going with sub fsal because of xfs.

I agree with both of you about pushing FSAL changes to upstream. Let
me see if I can make that happen.

Thanks,

Sriram

On 08-Nov-2017 11:03 PM, "Frank Filz" <ffilz...@mindspring.com
<mailto:ffilz...@mindspring.com>> wrote:

 > You might want to consider a stackable FSAL instead. 
FSAL_NULL is a good

 > example to start from.  It's much more flexible than
sub_fsals for VFS.
If I
 > was implementing PanFS now, I'd use a stackable FSAL instead.

Yea, this is a good point. Probably even the XFS/VFS split
should be done
with stackable FSAL.

 > Ganesha is LGPL specifically to allow non-FOSS FSALs. 
However, if the

FSAL
 > itself does not require proprietary code, there are many
advantages to
open
 > sourcing it and getting it upstream.  Not the least is that
changes to
APIs and
 > the build system will be fixed by the community, so there
will be fewer
cases
 > of sudden breakage for you to fix.

On top of direct API impacts, it also helps to understand how
other FSALs
are using Ganesha so that not only do we not break the API, but
don't break
other assumptions that were made. It also is a better platform for
requesting FSAL specific API accommodation (for a recent
example, see
compute_readdir_cookie and whence_is_name support that was added for
FSAL_RGW, though added in a generic way so other FSALs might use
it).

Frank

 > Daniel
 >
 > On 11/08/2017 10:51 AM, sriram patil wrote:
 > > Yes, I am making a new sub fsal. May not push it to
upstream because
 > > it will not be useful without the whole framework/product.
As part of
 > > that, I wanted to allocate an export object which has some
extra
 > > fields, other than the ones in vfs_fsal_export.
 > >
 > > Also, I hope creating a sub fsal and not making the
implementation
 > > opensource does not violate any license terms.
 > >
 > > Thanks,
 > > Sriram
 > >
 > > On 08-Nov-2017 8:19 PM, "Daniel Gryniewicz"
<d...@redhat.com <mailto:d...@redhat.com>
 > > <mailto:d...@redhat.com <mailto:d...@redhat.com>>> wrote:
 > >
 > > On 11/08/2017 02:41 AM, sriram patil wrote:
 > >
 > > Hi,
 > >
 > > In the subfsal framework, I see that subfsals can
have their own
 > > fsal_obj_handles by implementing
vfs_sub_alloc_handle and then
 > > use subfsal specific variables using container_of.
 > >
 > > It does not provide same functionality for
fsal_export however.
 > > There is no vfs_sub_alloc_export. 
vfs_create_export just calls

 > > gsh_calloc to allocate vfs_fsal_export, giving no
flexibility.
 > >
 > > PANFS has its own struct panfs_fsal_export but it
is not
 > > allocated anywhere. It still uses conta

Re: [Nfs-ganesha-devel] crash in jemalloc leading to a deadlock.

2017-11-08 Thread Daniel Gryniewicz
Allocating in a backtrace seems like a very bad idea.  If there's ever a 
crash during an allocation, it is guaranteed to deadlock.
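For what it's worth, a crash handler can avoid allocating entirely: call
backtrace() once at startup so glibc's lazy libgcc load (and its malloc)
happens outside the handler, and use backtrace_symbols_fd() rather than
backtrace_symbols() so nothing is allocated at crash time.  A sketch of the
idea, not the actual gsh_backtrace() code:

#include <execinfo.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define BT_MAX 64

static void crash_handler_sketch(int signo)
{
    void *frames[BT_MAX];
    int n;

    /* backtrace_symbols_fd() writes straight to the fd and does not
     * call malloc(), unlike backtrace_symbols() */
    n = backtrace(frames, BT_MAX);
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    _exit(128 + signo);
}

static void install_crash_handler(void)
{
    struct sigaction sa;
    void *warmup[1];

    /* the first call to backtrace() makes glibc load libgcc, which
     * allocates; do it here, not inside the signal handler */
    backtrace(warmup, 1);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = crash_handler_sketch;
    sigaction(SIGSEGV, &sa, NULL);
    sigaction(SIGABRT, &sa, NULL);
}

int main(void)
{
    install_crash_handler();
    /* ... rest of the program ... */
    return 0;
}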


Daniel

On 11/08/2017 01:43 PM, Pradeep wrote:

I'm using Ganesha 2.6 dev.12 with jemalloc-3.6.0 and hitting a case
where jemalloc seems to be holding a lock and crashing. In Ganesha's
gsh_backtrace(), we try to allocate memory and that hangs (it ended up in
a deadlock). Have you seen this before? Perhaps it is a good idea not to
allocate memory in the backtrace path?


#0 0x7f49b51ff1bd in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x7f49b51fad02 in _L_lock_791 () from /lib64/libpthread.so.0
#2 0x7f49b51fac08 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x7f49b65d12dc in arena_bin_malloc_hard () from /lib64/libjemalloc.so.1
#4 0x7f49b65d1516 in je_arena_tcache_fill_small () from
/lib64/libjemalloc.so.1
#5 0x7f49b65ea6ff in je_tcache_alloc_small_hard () from
/lib64/libjemalloc.so.1
#6 0x7f49b65ca14f in malloc () from /lib64/libjemalloc.so.1
#7 0x7f49b6c5a785 in _dl_scope_free () from /lib64/ld-linux-x86-64.so.2
#8 0x7f49b6c55841 in _dl_map_object_deps () from /lib64/ld-linux-x86-64.so.2
#9 0x7f49b6c5ba4b in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
#10 0x7f49b6c57364 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#11 0x7f49b6c5b35b in _dl_open () from /lib64/ld-linux-x86-64.so.2
#12 0x7f49b48f5ff2 in do_dlopen () from /lib64/libc.so.6
#13 0x7f49b6c57364 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#14 0x7f49b48f60b2 in __libc_dlopen_mode () from /lib64/libc.so.6
#15 0x7f49b48cf595 in init () from /lib64/libc.so.6
#16 0x7f49b51fdbb0 in pthread_once () from /lib64/libpthread.so.0
#17 0x7f49b48cf6ac in backtrace () from /lib64/libc.so.6
#18 0x0045193d in gsh_backtrace () at
/usr/src/debug/nfs-ganesha-2.6-dev.12/MainNFSD/nfs_init.c:228
#19 0x004519fe in crash_handler (signo=11,
info=0x7f49b155db70, ctx=0x7f49b155da40) at
/usr/src/debug/nfs-ganesha-2.6-dev.12/MainNFSD/nfs_init.c:244
#20 
#21 0x7f49b65d0c61 in arena_purge () from /lib64/libjemalloc.so.1
#22 0x7f49b65d218d in je_arena_dalloc_large () from /lib64/libjemalloc.so.1







Re: [Nfs-ganesha-devel] Subfsal export

2017-11-08 Thread Daniel Gryniewicz

On 11/08/2017 02:41 AM, sriram patil wrote:

Hi,

In the subfsal framework, I see that subfsals can have their own 
fsal_obj_handles by implementing vfs_sub_alloc_handle and then use 
subfsal specific variables using container_of.


It does not provide same functionality for fsal_export however. There is 
no vfs_sub_alloc_export.  vfs_create_export just calls gsh_calloc to 
allocate vfs_fsal_export, giving no flexibility.


PANFS has its own struct panfs_fsal_export but it is not allocated 
anywhere. It still uses container_of on vfs_fsal_export. This looks like 
memory corruption. The last commit in PANFS, however, was about a year 
ago, so I'm not sure if it is actively developed.


PANFS is unused and unmaintained.  We keep it building, but that's it.

Considering the above scenario, it makes sense to have 
vfs_sub_alloc_export to allow allocating the wrapper export object. Any 
thoughts?


This seems like it was a bug in the original implementation for PanFS. 
It probably should be fixed, but the VFS sub_fsal doesn't need it, so it 
works for now.
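The pattern sriram is after would look roughly like the sketch below
(standalone, with simplified struct contents; a vfs_sub_alloc_export-style
hook is hypothetical).  container_of() on the embedded member is only valid
if the wrapper was actually the object allocated, which is exactly what the
PanFS code is missing.

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* stand-ins for the real Ganesha structures */
struct vfs_fsal_export_sketch {
    int export_id;
};

struct panfs_fsal_export_sketch {
    struct vfs_fsal_export_sketch vfs_export;   /* embedded member */
    int panfs_private;                          /* sub-FSAL specific data */
};

/* hypothetical vfs_sub_alloc_export(): the sub-FSAL allocates the
 * larger wrapper and hands back the embedded VFS export, so that a
 * later container_of() lands inside valid memory */
static struct vfs_fsal_export_sketch *sub_alloc_export(void)
{
    struct panfs_fsal_export_sketch *wrap = calloc(1, sizeof(*wrap));

    return wrap ? &wrap->vfs_export : NULL;
}

int main(void)
{
    struct vfs_fsal_export_sketch *exp = sub_alloc_export();
    struct panfs_fsal_export_sketch *wrap;

    if (!exp)
        return 1;
    wrap = container_of(exp, struct panfs_fsal_export_sketch, vfs_export);
    wrap->panfs_private = 42;   /* safe only because the wrapper was allocated */
    printf("private = %d\n", wrap->panfs_private);
    free(wrap);
    return 0;
}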


Are you making a new sub_fsal?

Daniel



Re: [Nfs-ganesha-devel] Backport list for 2.5.4

2017-11-02 Thread Daniel Gryniewicz
They're in use in downstream for Ceph, and have been tested by QA, so 
they should be safe.  If we decide we don't want them in 2.5, then 
downstream RHCS will just need to carry them as patches.


Daniel

On 11/02/2017 03:56 AM, Malahal Naineni wrote:
Dan, I remember that we waited for the recovery code (aka IP failover 
code) reorganization patches to go into V2.6 alone. Do they now have 
enough runtime to get merged into V2.5 stable branch?


Regards, Malahal.

On Tue, Oct 31, 2017 at 11:37 PM, Daniel Gryniewicz <d...@redhat.com 
<mailto:d...@redhat.com>> wrote:


Here's the set of commits that downstream Ceph needs.  Gluster can
also use the non-Ceph related ones.

Note, these are oldest first, not newest first.

Daniel


commit b862fe360b2a0f1b1d9d5d6a8b91f1550b66b269
Author: Gui Hecheng <guihech...@cmss.chinamobile.com
<mailto:guihech...@cmss.chinamobile.com>>
AuthorDate: Thu Mar 30 10:44:25 2017 +0800
Commit: Frank S. Filz <ffilz...@mindspring.com
<mailto:ffilz...@mindspring.com>>
CommitDate: Fri Aug 11 14:31:22 2017 -0700

 SAL: extract fs logic from nfs4_recovery

 This is a prepare patch for modulized recovery backends.
 - define recovery apis: struct nfs_recovery_backend
 - define hooks for recovery_fs module

 Change-Id: I45523ef9a0e6f9a801fc733b095ba2965dd8751b
 Signed-off-by: Gui Hecheng <guihech...@cmss.chinamobile.com
<mailto:guihech...@cmss.chinamobile.com>>
commit cb787a1cf4a4df4da672c6b00cb0724db5d99e4d
Author: Gui Hecheng <guihech...@cmss.chinamobile.com
<mailto:guihech...@cmss.chinamobile.com>>
AuthorDate: Thu Mar 30 10:50:18 2017 +0800
Commit: Frank S. Filz <ffilz...@mindspring.com
<mailto:ffilz...@mindspring.com>>
CommitDate: Fri Aug 11 14:31:23 2017 -0700

 SAL: introduce new recovery backend based on rados kv store

 Use rados OMAP API to implement a kv store for client tracking data

 Change-Id: I1aec1e110a2fba87ae39a1439818a363b6cfc822
 Signed-off-by: Gui Hecheng <guihech...@cmss.chinamobile.com
<mailto:guihech...@cmss.chinamobile.com>>
commit fbc905015d01a7f2548b81d84f35b76524543f13
Author: Gui Hecheng <guihech...@cmss.chinamobile.com
<mailto:guihech...@cmss.chinamobile.com>>
AuthorDate: Wed May 3 09:58:34 2017 +0800
Commit: Frank S. Filz <ffilz...@mindspring.com
<mailto:ffilz...@mindspring.com>>
CommitDate: Fri Aug 11 14:31:23 2017 -0700

 cmake: make modulized recovery backends compile as modules

 - add USE_RADOS_RECOV option for new rados kv backend
 - keep original fs backend as default

 Change-Id: I26c2c4f9a433e6cd70f113fa05194d6817b9377a
 Signed-off-by: Gui Hecheng <guihech...@cmss.chinamobile.com
<mailto:guihech...@cmss.chinamobile.com>>
commit eb4eea1343251f17fe39de48426bc4363eaef957
Author: Gui Hecheng <guihech...@cmss.chinamobile.com
<mailto:guihech...@cmss.chinamobile.com>>
AuthorDate: Thu May 4 22:43:17 2017 +0800
Commit: Frank S. Filz <ffilz...@mindspring.com
<mailto:ffilz...@mindspring.com>>
CommitDate: Fri Aug 11 14:31:23 2017 -0700

 config: add new config options for rados_kv recovery backend

 - new config block: RADOS_KV
 - new option: ceph_conf, userid, pool

 Change-Id: Id44afa70e8b5adb2cb2b9d48a807b0046f604f30
 Signed-off-by: Gui Hecheng <guihech...@cmss.chinamobile.com
<mailto:guihech...@cmss.chinamobile.com>>
commit f7a09d87851f64a68c2438fdc09372703bcbebec
Author: Matt Benjamin <mbenja...@redhat.com
<mailto:mbenja...@redhat.com>>
AuthorDate: Thu Jul 20 15:21:00 2017 -0400
Commit: Frank S. Filz <ffilz...@mindspring.com
<mailto:ffilz...@mindspring.com>>
CommitDate: Thu Aug 17 14:46:29 2017 -0700

 config: add config_url and RADOS url provider

 Provides a mechanism to to load nfs-ganesha config sections (e.g.,
 export blocks) from a generic URL.  Includes a URL provider
 which maps URLs to Ceph RADOS objects.

 Change-Id: I9067eaef2b38a78e9f1a877dfb9eb3c176239e71
 Signed-off-by: Matt Benjamin <mbenja...@redhat.com
<mailto:mbenja...@redhat.com>>
commit b6ce63479c965c12d2d3417abd1dd082cf0967b8
Author: Matt Benjamin <mbenja...@redhat.com
<mailto:mbenja...@redhat.com>>
AuthorDate: Fri Sep 22 14:21:46 2017 -0400
Commit: Frank S. Filz <ffilz...@mindspring.com
<mailto:ffilz...@mindspring.com>>
CommitDate: Fri Sep 22 14:06:12 2017 -0700

 rpm spec: add RADOS_URLS

 Change-Id: I60ebd4cb5bc3b3184704b8951a5392ed91846cdd
 Signed-off-by: Matt Benjamin <mbenja...@redhat.com
 

Re: [Nfs-ganesha-devel] CI failures

2017-11-02 Thread Daniel Gryniewicz

On 11/02/2017 11:46 AM, Frank Filz wrote:

Ok, so this patch: https://review.gerrithub.io/#/c/385433/ has a real
failure visible, however, it clearly has nothing to do with the patch at
hand.

How do we want to handle that for merge? The patch clearly is ready for
merge, but with a -1 Verify, if we're going to make this verification stuff
meaningful, we can't proceed.

Frank



It's a use-after-free on the state lock.  As such, it may be caused by 
the previous commit in the sequence:


https://review.gerrithub.io/#/c/385104/

Daniel



[Nfs-ganesha-devel] Backport list for 2.5.4

2017-10-31 Thread Daniel Gryniewicz
nection parameters in the RADOS_URLS config section
from taking effect.

Change-Id: Ied11473806a1b951c05f55b164936f0a1484cd0e
Signed-off-by: Matt Benjamin <mbenja...@redhat.com>
commit 824cea60caab1c37908d5f358da7570de76b43a9
Author: Matt Benjamin <mbenja...@redhat.com>
AuthorDate: Mon Oct 23 16:27:20 2017 -0400
Commit: Frank S. Filz <ffilz...@mindspring.com>
CommitDate: Fri Oct 27 13:27:07 2017 -0700

RGW: don't deref NULL at LogFullDebug after add_detached_dirent()

The fault happens only at LogFullDebug, as stated.

Change-Id: I64f8688114025237c6d2985ad359e148a0a55737
Signed-off-by: Matt Benjamin <mbenja...@redhat.com>
commit 2ebc97065a5474dfb689b2c7053e955a5e5a3595
Author: Matt Benjamin <mbenja...@redhat.com>
AuthorDate: Tue Oct 24 16:08:40 2017 -0400
Commit: Frank S. Filz <ffilz...@mindspring.com>
CommitDate: Fri Oct 27 13:27:44 2017 -0700

RGW: REALLY early init support

Provide a general mechanism to access already-identified
config blocks for parsing by later blocks--and use the
mechanism to permit the RADOS config url provider to parse
 config variables from its own RADOS_URLS block.

Change-Id: I1e81eae784de09abb3d1fbf715bb05eac6e8333e
Signed-off-by: Matt Benjamin <mbenja...@redhat.com>
commit c4a05dedc13de3bae73b17b040be147ca520
Author: Daniel Gryniewicz <d...@redhat.com>
AuthorDate: Tue Oct 24 13:07:40 2017 -0400
Commit: Daniel Gryniewicz <d...@redhat.com>
CommitDate: Tue Oct 24 13:07:40 2017 -0400

Plumb through Bind_Addr so that it works.

During the conversion to work with IPv6, the code to implement Bind_Addr
was lost.  Implement this code, so that an address can be given.  The
following formats are accepted:

Bind_Addr = aaa.bbb.ccc.ddd;  # Standard dotted quad IPv4 address
Bind_Addr = a:b:c:d:e:f:g:h;  # Standard IPv6 address (may have ::)
Bind_Addr = ::ffff:aaa.bbb.ccc.ddd; # IPv4 address mapped in IPv6
Bind_Addr = 0.0.0.0; # Listen on all addresses (IPv4 and IPv6)
Bind_Addr = ::; # Listen on all IPv6 addresses (no IPv4)

Note that none of these are quoted in any way.  Case of hex does not
matter.

Change-Id: Ie960e8325bdabb3c307dff03694f3f1dc92f4596
Signed-off-by: Daniel Gryniewicz <d...@redhat.com>
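
(A minimal usage sketch: the assumption here is that Bind_Addr sits in the
NFS_CORE_PARAM block, and the address is only an example; neither detail is
stated in the commit itself.)

    NFS_CORE_PARAM
    {
        # Listen only on this IPv4 address; the value is unquoted, per the
        # note above
        Bind_Addr = 192.168.1.10;
    }
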



Re: [Nfs-ganesha-devel] Ganesha 2.3 and 2.5 - crash in free_nfs_request

2017-10-31 Thread Daniel Gryniewicz
This sounds like a use-after-free.  That memory is either poisoned by 
the allocator, or re-allocated and used by something that memset() it.


You can try running under valgrind, if it's fast enough, or you can try 
backporting the ASAN additions from these commits:


1b48c0237c10c48d33840ab278d5cf0c2a8a8e4a
9c69aeb4c0f313650897d78590be4c07b41a8e40

Daniel

On 10/31/2017 10:45 AM, Sachin Punadikar wrote:

William,
You are right, gsh_calloc is getting invoked (even for 2.3 code).
Interestingly, the core we got in testing has almost all the fields
filled with 0xFF. So I am wondering whether it is something to do with the
underlying glibc or RHEL in general.

Here is the gdb o/p indicating the same.

(gdb) p reqdata->r_u.req.svc
$7 = {rq_prog = 4294967295, rq_vers = 4294967295, rq_proc = 4294967295, 
rq_cred = {oa_flavor = -1,
 oa_base = 0x bounds>, oa_length = 4294967295},

   rq_clntcred = 0x7f183c0a83e0, rq_xprt = 0x7f1932423830,
   rq_clntname = 0x bounds>,
   rq_svcname = 0x bounds>, rq_msg = 0x7f183c0a8020, rq_context = 0x0,
   rq_u1 = 0x, rq_u2 = 0x, rq_cksum = 
18446744073709551615, rq_xid = 4294967295, rq_verf = {
 oa_flavor = -1, oa_base = 0x 0x out of bounds>, oa_length = 4294967295},
   rq_auth = 0x, rq_ap1 = 0x, rq_ap2 = 
0x, rq_raddr = {ss_family = 65535,
 __ss_align = 18446744073709551615, __ss_padding = '\377' 112 times>}, rq_daddr = {ss_family = 65535,
 __ss_align = 18446744073709551615, __ss_padding = '\377' 112 times>}, rq_raddr_len = 0, rq_daddr_len = 0}

(gdb) p reqdata->r_u.req
$8 = {xprt = 0x7f1932423830, svc = {rq_prog = 4294967295, rq_vers = 
4294967295, rq_proc = 4294967295, rq_cred = {
   oa_flavor = -1, oa_base = 0x 0x out of bounds>, oa_length = 4294967295},

 rq_clntcred = 0x7f183c0a83e0, rq_xprt = 0x7f1932423830,
 rq_clntname = 0x bounds>,
 rq_svcname = 0x bounds>, rq_msg = 0x7f183c0a8020, rq_context = 0x0,
 rq_u1 = 0x, rq_u2 = 0x, rq_cksum = 
18446744073709551615, rq_xid = 4294967295,
 rq_verf = {oa_flavor = -1, oa_base = 0x 0x out of bounds>,
   oa_length = 4294967295}, rq_auth = 0x, rq_ap1 = 
0x, rq_ap2 = 0x,
 rq_raddr = {ss_family = 65535, __ss_align = 18446744073709551615, 
__ss_padding = '\377' },
 rq_daddr = {ss_family = 65535, __ss_align = 18446744073709551615, 
__ss_padding = '\377' },
 rq_raddr_len = 0, rq_daddr_len = 0}, lookahead = {flags = 
4294967295, read = 65535, write = 65535}, arg_nfs = {

 arg_getattr3 = {object = {data = {data_len = 4294967295,
   data_val = 0x of bounds>}}}, arg_setattr3 = {object = {data = {
   data_len = 4294967295, data_val = 0x 0x out of bounds>}},
   new_attributes = {mode = {set_it = -1, set_mode3_u = {mode = 
4294967295}}, uid = {set_it = -1, set_uid3_u = {
 uid = 4294967295}}, gid = {set_it = -1, set_gid3_u = {gid = 
4294967295}}, size = {set_it = -1, set_size3_u = {

 size = 18446744073709551615}}, atime = {
   set_it = (SET_TO_SERVER_TIME | SET_TO_CLIENT_TIME | unknown: 
4294967292), set_atime_u = {atime = {

   tv_sec = 4294967295, tv_nsec = 4294967295}}}, mtime = {
   set_it = (SET_TO_SERVER_TIME | SET_TO_CLIENT_TIME | unknown: 
4294967292), set_mtime_u = {mtime = {
   tv_sec = 4294967295, tv_nsec = 4294967295, guard = 
{check = -1, sattrguard3_u = {obj_ctime = {
 tv_sec = 4294967295, tv_nsec = 4294967295, arg_lookup3 
= {what = {dir = {data = {data_len = 4294967295,
 data_val = 0x out of bounds>}},
 name = 0x bounds>}}, arg_access3 = {object = {data = {
   data_len = 4294967295, data_val = 0x 0x out of bounds>}},
   access = 4294967295}, arg_readlink3 = {symlink = {data = 
{data_len = 4294967295,
   data_val = 0x of bounds>}}}, arg_read3 = {file = {data = {
   data_len = 4294967295, data_val = 0x 0x out of bounds>}},
   offset = 18446744073709551615, count = 4294967295}, arg_write3 = 
{file = {data = {data_len = 4294967295,
   data_val = 0x of bounds>}}, offset = 18446744073709551615,
   count = 4294967295, stable = (DATA_SYNC | FILE_SYNC | unknown: 
4294967292), data = {data_len = 4294967295,
 data_val = 0x of bounds>}}, arg_create3 = {where = {dir = {data = {
 data_len = 4294967295, data_val = 0x 
}},
 name = 0x bounds>}, how = {
 mode = (GUARDED | EXCLUSIVE | unknown: 4294967292), 
createhow3_u = {obj_attributes = {mode = 

Re: [Nfs-ganesha-devel] ABBA deadlock in 2.5 (likely in 2.6 as well)

2017-10-23 Thread Daniel Gryniewicz

Maybe something like this:

https://paste.fedoraproject.org/paste/CptGkmoRutBKYjno5FiSjg/

Daniel
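
(The linked paste is not reproduced in this archive.  Purely as a sketch of
the kind of approach being discussed, and not the actual patch: a trylock
with backoff, so the DOTDOT path never blocks on the parent's attr_lock
while still holding the child's content_lock.)

    #include <pthread.h>
    #include <sched.h>

    struct obj_locks {
            pthread_rwlock_t attr_lock;
            pthread_rwlock_t content_lock;
    };

    /* Take the parent's attr_lock while already holding the child's
     * content_lock.  Never block: on contention, drop the child lock,
     * yield, and retry, so the AX -> CX -> CY -> AX cycle cannot form. */
    static void lock_parent_attr(struct obj_locks *parent,
                                 struct obj_locks *child)
    {
            for (;;) {
                    if (pthread_rwlock_trywrlock(&parent->attr_lock) == 0)
                            return;     /* got it, child lock still held */
                    pthread_rwlock_unlock(&child->content_lock);
                    sched_yield();      /* let the competing thread finish */
                    pthread_rwlock_wrlock(&child->content_lock);
            }
    }
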

On 10/23/2017 10:13 AM, Malahal Naineni wrote:

Let us say we have an X/Y path in the file system. We have an attr_lock and
a content_lock on each object. The locks we are interested in here are the
attr_lock on X (hereafter referred to as AX) and the content_lock on X
(hereafter CX).  Similarly we have AY and CY for the object named Y.

1. Thread 50 (lookup called for X ) takes AX and waits for CX (attr_lock
followed by content_lock is the expected order)

2. Thread 251 (readdirplus on X) takes CX, AY and then waits for CY for
processing object Y

3. Thread 132 (readdirplus on Y) takes CY, and then waits for AX (this
is due to lookup of parent)

Classic philosopher's problem: 1 waits for 2, 2 waits for 3 and then 3
waits for 1. The lock ordering for attr_lock and content_lock for an
object is attr_lock followed by content_lock. We can assume that parent
locks should be acquired before the child locks, but DOTDOT appears as a
child in readdirplus/readdir. If we can handle parent differently, we
might be OK. Any help would be appreciated.

Regards, Malahal.

(gdb) thread 50
[Switching to thread 50 (Thread 0x3fff6cffe850 (LWP 37851))]
#0  0x3fff8a50089c in .__pthread_rwlock_wrlock () from 
/lib64/libpthread.so.0

(gdb) bt
#0  0x3fff8a50089c in .__pthread_rwlock_wrlock () from 
/lib64/libpthread.so.0
#1  0x101adaa4 in mdcache_refresh_attrs (entry=0x3ffccc0541f0, 
need_acl=false, invalidate=true)
 at 
/usr/src/debug/nfs-ganesha-2.5.3-ibm008.20M6-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1187
#2  0x101ae1b0 in mdcache_getattrs (obj_hdl=0x3ffccc054228, 
attrs_out=0x3fff6cffcfb8)
 at 
/usr/src/debug/nfs-ganesha-2.5.3-ibm008.20M6-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1228
#3  0x100b9fdc in nfs_SetPostOpAttr (obj=0x3ffccc054228, 
Fattr=0x3ffe64050dc8, attrs=0x0)
 at 
/usr/src/debug/nfs-ganesha-2.5.3-ibm008.20M6-0.1.1-Source/Protocols/NFS/nfs_proto_tools.c:91
#4  0x100c6ba8 in nfs3_lookup (arg=0x3ffa9ff04780, 
req=0x3ffa9ff03f78, res=0x3ffe64050d50)
 at 
/usr/src/debug/nfs-ganesha-2.5.3-ibm008.20M6-0.1.1-Source/Protocols/NFS/nfs3_lookup.c:131

#5  0x10065220 in nfs_rpc_execute (reqdata=0x3ffa9ff03f50)
 at 
/usr/src/debug/nfs-ganesha-2.5.3-ibm008.20M6-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1290

#6  0x10065c9c in worker_run (ctx=0x10013c1d3f0)
 at 
/usr/src/debug/nfs-ganesha-2.5.3-ibm008.20M6-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1562

#7  0x101670f4 in fridgethr_start_routine (arg=0x10013c1d3f0)
 at 
/usr/src/debug/nfs-ganesha-2.5.3-ibm008.20M6-0.1.1-Source/support/fridgethr.c:550

#8  0x3fff8a4fc2bc in .start_thread () from /lib64/libpthread.so.0
#9  0x3fff8a31b304 in .__clone () from /lib64/libc.so.6
(gdb) frame 1
#1  0x101adaa4 in mdcache_refresh_attrs (entry=0x3ffccc0541f0, 
need_acl=false, invalidate=true)
 at 
/usr/src/debug/nfs-ganesha-2.5.3-ibm008.20M6-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1187

1187            PTHREAD_RWLOCK_wrlock(&entry->content_lock);
(gdb) p entry
$1 = (mdcache_entry_t *) 0x3ffccc0541f0
(gdb) p entry->content_lock
$2 = {__data = {__lock = 0, __nr_readers = 1, __readers_wakeup = 2408, 
__writer_wakeup = 4494, __nr_readers_queued = 0,
 __nr_writers_queued = 6, __writer = 0, __shared = 0, __pad1 = 0, 
__pad2 = 0, __flags = 0},
   __size = 
"\000\000\000\000\000\000\000\001\000\000\th\000\000\021\216\000\000\000\000\000\000\000\006", 
'\000' , __align = 1}

(gdb) thread 251
[Switching to thread 251 (Thread 0x3fff2e7fe850 (LWP 37976))]
#0  0x3fff8a50089c in .__pthread_rwlock_wrlock () from 
/lib64/libpthread.so.0

(gdb) bt
#0  0x3fff8a50089c in .__pthread_rwlock_wrlock () from 
/lib64/libpthread.so.0
#1  0x101adaa4 in mdcache_refresh_attrs (entry=0x3ffc30041bd0, 
need_acl=false, invalidate=true)
 at 
/usr/src/debug/nfs-ganesha-2.5.3-ibm008.20M6-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1187
#2  0x101ae1b0 in mdcache_getattrs (obj_hdl=0x3ffc30041c08, 
attrs_out=0x3fff2e7fcb58)
 at 
/usr/src/debug/nfs-ganesha-2.5.3-ibm008.20M6-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1228
#3  0x101c0f50 in mdcache_readdir_chunked 
(directory=0x3ffccc0541f0, whence=1298220731,
 dir_state=0x3fff2e7fcee8, cb=@0x1024b040: 0x1003f524 
, attrmask=122830, eod_met=0x3fff2e7fd01c)
 at 
/usr/src/debug/nfs-ganesha-2.5.3-ibm008.20M6-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:3047
#4  0x101ab998 in mdcache_readdir (dir_hdl=0x3ffccc054228, 
whence=0x3fff2e7fcfa0, dir_state=0x3fff2e7fcee8,
 cb=@0x1024b040: 0x1003f524 , attrmask=122830, 
eod_met=0x3fff2e7fd01c)
 at 
/usr/src/debug/nfs-ganesha-2.5.3-ibm008.20M6-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:637
#5  0x10040090 in fsal_readdir 

Re: [Nfs-ganesha-devel] Crash in mdcache_alloc_handle() during unexport

2017-10-05 Thread Daniel Gryniewicz

This looks like a legit issue.  I'm working on a fix for it.

Daniel

On 10/05/2017 01:07 PM, Pradeep wrote:

Hello,

This issue is with 2.6-dev.11 (don't think it is specific to this version).

It appears that there is a race between mdcache_unexport() and 
mdcache_alloc_handle(). If a request comes after the MDC_UNEXPORT is 
set, mdcache tries to free the entry by calling these:


    /* Map the export before we put this entry into the LRU, but after it's
     * well enough set up to be able to be unrefed by unexport should there
     * be a race.
     */
    status = mdc_check_mapping(result);

    if (unlikely(FSAL_IS_ERROR(status))) {
            /* The current export is in process to be unexported, don't
             * create new mdcache entries.
             */
            LogDebug(COMPONENT_CACHE_INODE,
                     "Trying to allocate a new entry %p for export id %"
                     PRIi16" that is in the process of being unexported",
                     result, op_ctx->ctx_export->export_id);
            mdcache_put(result);
            mdcache_kill_entry(result);
            return NULL;
    }

At this point, the entry is neither in any LRU queue nor in any 
partition (AVL tree).
So _mdcache_kill_entry() will call mdcache_lru_cleanup_push() which will 
try to dequeue:


    if (!(lru->qid == LRU_ENTRY_CLEANUP)) {
            struct lru_q *q;

            /* out with the old queue */
            q = lru_queue_of(entry);  <--- NULL since we haven't inserted it.

            LRU_DQ_SAFE(lru, q); <-- crash here.


I think if we call mdcache_lru_unref() instead of mdcache_kill_entry(), 
it will correctly free the entry.


If the idea of calling mdcache_kill_entry() is to insert into the 
cleanup queue, then adding a check before LRU_DQ_SAFE() in 
mdcache_lru_cleanup_push() should fix it too.
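
(A rough sketch of that second suggestion, skipping the dequeue when the
entry was never queued.  The types, names, and the LRU_ENTRY_CLEANUP value
below are stand-ins for illustration, not Ganesha's real definitions.)

    struct lru_q { int unused; };
    struct lru_state { int qid; struct lru_q *q; };
    enum { LRU_ENTRY_CLEANUP = 3 };             /* placeholder value */

    static struct lru_q *lru_queue_of_sketch(struct lru_state *lru)
    {
            return lru->q;                      /* NULL if never inserted */
    }

    static void lru_dq_sketch(struct lru_state *lru, struct lru_q *q)
    {
            (void)lru; (void)q;                 /* real code unlinks the entry */
    }

    static void cleanup_push_sketch(struct lru_state *lru)
    {
            if (lru->qid != LRU_ENTRY_CLEANUP) {
                    struct lru_q *q = lru_queue_of_sketch(lru);

                    /* Guard added before the dequeue: a freshly allocated
                     * entry may never have been put on any queue. */
                    if (q != NULL)
                            lru_dq_sketch(lru, q);
                    lru->qid = LRU_ENTRY_CLEANUP;
            }
    }
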


Thanks,
Pradeep





Re: [Nfs-ganesha-devel] Multi instance Ganesha with DBus

2017-10-03 Thread Daniel Gryniewicz
So, from what I've seen, it's possible to run a dbus inside a container. 
 You need to start it from your entrypoint script.  I've done this in 
the past, but not for several years.


That said, this may be a legit reason to add multiple bus keys.  I can 
see wanting to control them from outside, but I can also see wanting to 
control them from inside, so we should probably support both modes.


Daniel
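
(To make the multiple-bus-name idea below concrete: this is only an
illustrative sketch using libdbus directly, not Ganesha's dbus code.  Each
instance would claim a per-instance name built from a configured suffix,
e.g. org.ganesha.nfsd.id1.)

    #include <dbus/dbus.h>
    #include <stdio.h>

    /* Claim "org.ganesha.nfsd.<suffix>" on the bus; returns 0 on success. */
    static int claim_instance_bus_name(DBusConnection *conn, const char *suffix)
    {
            char name[256];
            DBusError err;
            int ret;

            dbus_error_init(&err);
            snprintf(name, sizeof(name), "org.ganesha.nfsd.%s", suffix);

            ret = dbus_bus_request_name(conn, name,
                                        DBUS_NAME_FLAG_DO_NOT_QUEUE, &err);
            if (ret != DBUS_REQUEST_NAME_REPLY_PRIMARY_OWNER) {
                    fprintf(stderr, "could not own %s: %s\n", name,
                            dbus_error_is_set(&err) ? err.message
                                                    : "not primary owner");
                    dbus_error_free(&err);
                    return -1;
            }
            return 0;
    }
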

On 10/03/2017 01:27 PM, sriram patil wrote:
I am running ganesha inside docker containers. I have tried to run dbus 
inside docker, but that fails because system_bus_socket does not exist. 
This is the actual problem.


To solve this, I am using the host dbus server from containers by 
mounting the host system_bus_socket inside the containers. So, all 
ganesha servers will register with the host dbus. This is why I need 
different bus names for every ganesha server running inside containers.


Thanks,
Sriram

On 03-Oct-2017 5:35 PM, "Daniel Gryniewicz" <d...@redhat.com> wrote:


On 10/03/2017 02:31 AM, sriram patil wrote:

Hi,

AFAIK we can run only a single instance of nfs-ganesha on a given
machine which supports dbus signals. Running with different ports, the
nfs-ganesha service comes up, but the dbus signals work only for the
first (primary) instance. We cannot interact with the ganesha instances
(other than the primary) through dbus. This is a big deal, because dynamic
exports, runtime grace periods, stats, etc. are not available on
"secondary" ganesha instances.

I wanted to know if this is an issue at all? And is there anyone
working on this already?

Off the top of my head, I am thinking of handling this by adding an
identifier configuration which can be appended to the bus name. For
example, org.ganesha.nfsd.id1, org.ganesha.nfsd.id2, etc. Any
suggestions on this approach? Can we handle it in a better way?


Just out of curiosity, what's the use-case for multiple Ganesha's on
the same machine, as opposed to just putting all the exports on the
same Ganesha?

Daniel




Re: [Nfs-ganesha-devel] Pull up NTIRPC #80 & #81

2017-10-03 Thread Daniel Gryniewicz

On 10/03/2017 12:47 PM, William Allen Simpson wrote:

https://review.gerrithub.io/#/c/380970/

Begging for a mid-week dev release.

These patches attempt to fix a crash found early in Bake-a-thon.  DanG
and I couldn't reproduce, so we need this to determine whether it has
fixed the QE crash -- so they can move onward with more testing.

I'd hoped that #80 would have gone in on Friday/Saturday, but DanG was
flying on Friday and Matt didn't have time to review.

#81 was found during my weekend code review of related paths.  Solves
possible lock conflict during shutdown.



I don't think this is blocking QE, as they haven't hit it again either, 
so it can probably just go in in the normal merge this week.


Daniel



Re: [Nfs-ganesha-devel] [Nfs-ganesha-support] FSAL proxy error

2017-10-02 Thread Daniel Gryniewicz

On 09/29/2017 03:24 AM, Vincent Bosquier - UCit wrote:

Hi,

NFS-Ganesha version is 2.5-rc8.



Note, you should at least update to 2.5.2, as it has many many bug fixes 
over -rc8.


Daniel



Re: [Nfs-ganesha-devel] Proposal to manage global file descriptors

2017-09-22 Thread Daniel Gryniewicz

On 09/21/2017 07:45 PM, Frank Filz wrote:

Philippe discovered that recent Ganesha will no longer allow compiling the
linux kernel due to dangling open file descriptors.

I'm not sure if there is any true leak; the simple test of echo foo >
/mnt/foo does show a remaining open fd for /mnt/foo, however that is the
global fd opened in the course of doing a getattrs on FSAL_VFS.

We have been talking about how the current management of open file
descriptors doesn't really work, so I have a couple proposals:

1. We really should have a limit on the number of states we allow. Now that
NLM locks and shares also have a state_t, it would be simple to have a count
of how many are in use, and return a resource error if an operation requires
creating a new one past the limit. This can be a hard limit with no grace:
if the limit is hit, then alloc_state fails.
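
(A minimal sketch of what such a hard limit could look like; the function
names, the counter, and the limit value are illustrative, not Ganesha's
actual alloc_state() path.)

    #include <stdatomic.h>
    #include <stdint.h>
    #include <stdlib.h>

    static atomic_uint_fast32_t state_count;
    static uint32_t state_limit = 100000;       /* would come from config */

    /* Returns NULL when the hard limit is reached; the caller would map
     * that to a resource error (e.g. NFS4ERR_RESOURCE), with no grace. */
    void *alloc_state_sketch(size_t size)
    {
            if (atomic_fetch_add(&state_count, 1) >= state_limit) {
                    atomic_fetch_sub(&state_count, 1);
                    return NULL;
            }
            return calloc(1, size);
    }

    void free_state_sketch(void *state)
    {
            free(state);
            atomic_fetch_sub(&state_count, 1);
    }
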


This I agree with.



2. Management of the global fd is more complex, so here goes:

Part of the proposal is a way for the FSAL to indicate that an FSAL call
used the global fd in a way that consumes some kind of resource the FSAL
would like managed.

FSAL_PROXY should never indicate that (anonymous I/O should be done using a
special stateid, and a simple file create should result in the open stateid
immediately being closed); if that's not the case, then it's easy enough to
indicate use of a limited resource.

FSAL_VFS would indicate use of the resource any time it utilizes the global
fd. If it uses a temp fd that is closed after performing the operation, it
would not indicate use of the limited resource.

FSAL_GPFS, FSAL_GLUSTER, and FSAL_CEPH should all be similar to FSAL_VFS.

FSAL_RGW only has a global fd, and I don't quite understand how it is
managed.


If only PROXY doesn't set this, then maybe it's added complexity we 
don't need.  Just assume it's set.



The main part of the proposal is to actually create a new LRU queue for
objects that are using the limited resource.

If we are at the hard limit on the limited resource and an entry that is not
already in the LRU uses the resource, then we would reap an existing entry
and call fsal_close on it to release the resource. If an entry was not
available to be reaped, we would temporarily exceed the limit just like we
do with mdcache entries.

If an FSAL call resulted in use of the resource and the entry was already in
the resource LRU, then it would be bumped to MRU of L1.

The LRU run thread for the resource would demote objects from LRU L1 to MRU
of L2, and call fsal_close and remove objects from LRU of L2. I think it
should work to close any files that have not been used in a given amount of
time, really using the L1 and L2 to give a shorter life to objects for which
the resource is used once and then not used again, whereas a file that is
accessed multiple times would have more resistance to being closed. I think
the exact mechanics here may need some tuning, but that's the general idea.

The idea here is to be constantly closing files that have not been accessed
recently, and also to better manage a count of the files for which we are
actually using the resources, and not keep a file open just because for some
reason we do lots of lookups or stats of it (we might have to open it for
getattrs, but then we might serve a bunch of cached attrs, which doesn't go
to disk, might as well close the fd).
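
(To illustrate the demote-then-close idea only: this is a toy sketch, not
the proposed Ganesha code, and the structure and names are assumptions.)

    #include <stddef.h>

    enum fd_lru_level { FD_LRU_L1, FD_LRU_L2 };

    struct fd_lru_entry {
            struct fd_lru_entry *next;
            enum fd_lru_level level;
            int used_since_last_pass;   /* set whenever the global fd is used */
    };

    typedef void (*close_global_fd_fn)(struct fd_lru_entry *);

    /* One pass of the proposed background thread: recently used entries stay
     * in (or return to) L1, idle L1 entries are demoted to L2, and idle L2
     * entries have their global fd closed and leave the resource LRU. */
    static struct fd_lru_entry *fd_lru_pass(struct fd_lru_entry *head,
                                            close_global_fd_fn close_fd)
    {
            struct fd_lru_entry **pp = &head;

            while (*pp != NULL) {
                    struct fd_lru_entry *e = *pp;

                    if (e->used_since_last_pass) {
                            e->used_since_last_pass = 0;
                            e->level = FD_LRU_L1;
                            pp = &e->next;
                    } else if (e->level == FD_LRU_L1) {
                            e->level = FD_LRU_L2;       /* idle once: demote */
                            pp = &e->next;
                    } else {
                            *pp = e->next;              /* idle twice: close */
                            close_fd(e);
                    }
            }
            return head;
    }
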


This sounds almost exactly like the existing LRU thread, except that it 
ignores refcount.  If you remove global FD from the obj_handle, then the 
LRU as it currently exists becomes unnecessary for MDCACHE entries, as 
they only need a simple, single-level LRU based only on initial 
refcounts.  The current, multi-level LRU only exists to close the global 
FD when transitioning LRU levels.


So, what it sounds like to me is that you're splitting the LRU for 
entries from the LRU for global FDs.  Is this correct?  If so, I think 
this complicates the two sets of LRU transitions, but probably not 
insurmountably so.



I also propose making the limit for the resource configurable independent of
the ulimit for file descriptors, though if an FSAL is loaded that actually
uses file descriptors for open files, we should check that the ulimit is big
enough, and it should also include the limit on state_t. Of course it will
be impossible to account for file descriptors used for sockets, log files,
config files, or random libraries that like to open files...


Hmmm... I don't think we can do any kind of checking, if we're not going 
to use ulimit by default, since it depends on which FSALs are in use at 
any given time.  I say we either default the limits to ulimit, or just 
ignore ulimit entirely and log an appropriate error when EMFILE is returned.


Daniel


Re: [Nfs-ganesha-devel] Ganesha-Proxy: missing documentation of the used ports and how to change the configuration

2017-09-21 Thread Daniel Gryniewicz
The ports should be the same, I think, just outgoing instead of
incoming.  So, nfsv3 will need access to sunrpc (the portmapper on port 111)
and whatever ephemeral port it negotiates, and nfsv4 will need 2049.  (Note,
this is the nfs version of the backing server, not the one that ganesha is
providing to clients.)

Does your container block outgoing connections? I don't think
containers do that by default.

Daniel

On Thu, Sep 21, 2017 at 6:40 AM, Stephan Walter
 wrote:
> Hi,
>
>
>
> Over the last few days I tried to put an nfs-ganesha proxy server into a
> docker container. I was able to configure and deploy a ganesha container
> that exports a specific directory, and to mount it successfully on other VMs.
>
>
>
> But, when I tried to do the same with a ganesha PROXY, the log reports over
> and over again:
>
>
>
> 21/09/2017 08:19:31 : epoch 59c37613 : dd85a933ae92 :
> ganesha.nfsd-28[0x7f98c2d9e6f0] pxy_setclientid :FSAL :EVENT :Negotiating a
> new ClientId with the remote server
>
>
>
> Since I have no problems when I run exactly the same configuration outside
> of a container, I would assume that there is a problem with the container
> ports.
>
>
>
> I looked around to find what ports ganesha uses, and how I can select them.
> Nevertheless, I didn’t find anything useful.
>
>
>
> So it would be great if somebody could point me to the right documentation.
>
>
>
> Best regards,
>
>
>
> Stephan
>
>


Re: [Nfs-ganesha-devel] Continuing CI pain

2017-09-14 Thread Daniel Gryniewicz
As far as I know, this doesn't happen on Fedora, so it hasn't been
reported anywhere.

On Thu, Sep 14, 2017 at 6:58 AM, Niels de Vos  wrote:
> On Wed, Sep 13, 2017 at 10:39:52AM +0200, Niels de Vos wrote:
>> On Tue, Sep 12, 2017 at 06:41:49PM -0400, William Allen Simpson wrote:
>> > On 9/12/17 6:06 PM, Frank Filz wrote:
>> > > So this failure:
>> > >
>> > > https://ci.centos.org//job/nfs_ganesha_cthon04/1436/console
>> > >
>> > > Is an example of where we need some improvement. I looked at the top and
>> > > scrolled down to the end. I have no idea why it failed. This is a case of
>> > > too much information without a concise error report.
>> > >
>> > Installed:
>> >   libntirpc.x86_64 0:1.6.0-dev.7.el7.centos
>> >   nfs-ganesha.x86_64 0:2.6-dev.7.el7.centos
>> >   nfs-ganesha-gluster.x86_64 0:2.6-dev.7.el7.centos
>> >
>> > Complete!
>> > + systemctl start nfs-ganesha
>> > Job for nfs-ganesha.service failed because the control process exited with
>> > error code. See "systemctl status nfs-ganesha.service" and "journalctl -xe"
>> > for details.
>> > Build step 'Execute shell' marked build as failure
>> > Finished: FAILURE
>> >
>> > ===
>> >
>> > Why not print "systemctl status nfs-ganesha.service" and "journalctl -xe"?
>> >
>> > Originally I assumed that it was some obscure problem with my code, but
>> > then I looked around, and it seems to be all the submissions for
>> > nfs_ganesha_cthon04 at the moment
>>
>> The additional information will now be logged as well. This change in
>> the centos-ci branch does it:
>>
>>   https://github.com/nfs-ganesha/ci-tests/pull/14/files
>>
>> A (manually started) test run logs the errors more clearly:
>>
>>   https://ci.centos.org/job/nfs_ganesha_cthon04/1439/console
>>
>>
>> Sep 13 09:33:41 n9.pufty.ci.centos.org bash[20711]: 13/09/2017 09:33:41 : 
>> epoch 59b8ed65 : n9.pufty.ci.centos.org : ganesha.nfsd-20711[main] 
>> create_log_facility :LOG :CRIT :Cannot create new log file 
>> (/var/log/ganesha/ganesha.log), because: Permission denied
>> Sep 13 09:33:41 n9.pufty.ci.centos.org bash[20711]: 13/09/2017 09:33:41 : 
>> epoch 59b8ed65 : n9.pufty.ci.centos.org : ganesha.nfsd-20711[main] 
>> init_logging :LOG :FATAL :Create error (Permission denied) for FILE 
>> (/var/log/ganesha/ganesha.log) logging!
>> Sep 13 09:33:41 n9.pufty.ci.centos.org systemd[1]: nfs-ganesha.service: 
>> control process exited, code=exited status=2
>> Sep 13 09:33:41 n9.pufty.ci.centos.org systemd[1]: Failed to start 
>> NFS-Ganesha file server.
>> Sep 13 09:33:41 n9.pufty.ci.centos.org systemd[1]: Unit nfs-ganesha.service 
>> entered failed state.
>> Sep 13 09:33:41 n9.pufty.ci.centos.org systemd[1]: nfs-ganesha.service 
>> failed.
>>
>>
>> Why creating the logfile fails is not clear to me. Maybe something in the
>> packaging was changed and the /var/log/ganesha/ directory is not
>> writable for the ganesha.nfsd process anymore? Have changes for running
>> as non-root been merged, maybe?
>
> This seems to be a problem with the CentOS rebuild of RHEL-7.4. The
> CentOS CI gets the new packages before the release is made available for
> all users. I have run a test with SELinux in Permissive mode, and this
> passed just fine.
>
> https://ci.centos.org/job/nfs_ganesha_cthon04/1445/consoleFull
>
> As a temporary (hopefully!) solution, doing a 'setenforce 0' in the
> preparation script should help here:
>   https://github.com/nfs-ganesha/ci-tests/pull/15
>
> I would like to know if this problem has been reported against Fedora or
> RHEL already. Once the bug is fixed in selinux-policy for RHEL, the
> CentOS package will get an update soon after, and we can run our tests
> with SELinux in Enforcing mode again.
>
> Thanks,
> Niels
>


Re: [Nfs-ganesha-devel] Any plans to support callback API model for FSALs?

2017-09-14 Thread Daniel Gryniewicz
I don't have anything written down yet.  I'll post to the list as soon as I do.

Daniel
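
(Nothing was written down yet at this point; purely to illustrate the
callback model Satish describes further down in this thread, a completion
hook might look roughly like the sketch below.  Every name here is
hypothetical, not a Ganesha API.)

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical completion callback: invoked by the FSAL once the
     * (possibly archived) data has been restored and read. */
    typedef void (*fsal_read_done_cb)(void *req, int status, size_t bytes_read);

    struct fsal_async_read {
            void *req;                  /* opaque request handle */
            void *buffer;               /* where the data should land */
            uint64_t offset;
            size_t length;
            fsal_read_done_cb done;     /* called instead of blocking a worker */
    };
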

On Wed, Sep 13, 2017 at 10:11 PM, Kinglong Mee <kinglong...@gmail.com> wrote:
> Hello Daniel,
>
> I am interested in the ASYNC API for FSALs.
> Could you show me some information (plan, document, discussion) about it?
> If there is a draft of source about it, I am happy to help with the testing.
>
> thanks,
> Kinglong Mee
>
> On Thu, May 11, 2017 at 10:40 PM, Daniel Gryniewicz <d...@redhat.com> wrote:
>> Basically, yes.  The plan is to implement an ASYNC API for FSALs, which
>> would allow this.  It's high on my list for 2.6, so it should be done
>> fairly soon.
>>
>> Daniel
>>
>> On 05/11/2017 10:27 AM, Satish Chandra Kilaru wrote:
>>> Suppose an FSAL archives old files. When there is read request for those
>>> files, it can take long time to restore those files. Instead of blocking
>>> the worker thread for that long, it is good to allow FSAL to callback
>>> when data is available.
>>>
>>> Any plans to support that?
>>> --
>>> Please Donate to www.wikipedia.org
>>>
>>>


Re: [Nfs-ganesha-devel] yesterdays conf.-call

2017-09-13 Thread Daniel Gryniewicz

On 09/13/2017 09:41 AM, William Allen Simpson wrote:

On 9/13/17 9:02 AM, Daniel Gryniewicz wrote:
True.  I'd forgotten I had to raise FD limits in my environment (to 
99) to allow valgrind to pass. 


Probably excessive as there are only 65K ports in both TCP and UDP.



True, but valgrind uses a ton of FDs for its own use, so I just maxed 
it out.


Daniel



Re: [Nfs-ganesha-devel] yesterdays conf.-call

2017-09-13 Thread Daniel Gryniewicz
True.  I'd forgotten I had to raise FD limits in my environment (to 
99) to allow valgrind to pass.  The other thing I do is set 
--max-stackframe=3280592 on the valgrind command line, but with those 
two changes, pyNFS "all" passes for me under valgrind.


Daniel

On 09/13/2017 08:53 AM, Swen Schillig wrote:

In yesterdays conf.-call we spoke about pynfs-errors while running
valgrind.

I spent some time today to find out the reason and I'm afraid that was
a user error (so me).

A good few pynfs-tests go to numeric limits for a standard process
and if ganesha is executed under valgrind those limits are hit.
In my case it was the soft-limit for number of open files.
Once increased, pynfs succeeds.
At least the tests based on file creations.

There are other tests which might fail if executed as part of
the "all" test-suite but do succeed if executed in a smaller set.
E.g. MKLINK RDDR1 RDDR2 RDDR3 RDDR4 RDDR8 RDDR11 RDDR12 RENEW3 RLOWN1
RD10 RD11

Anyhow, just FYI.

Cheers Swen




Re: [Nfs-ganesha-devel] Intermittent test failures - manual tests and continuous integration

2017-09-08 Thread Daniel Gryniewicz



It would really help if we could have someone with better time zone
overlap with me who could manage the CI stuff, but that may not be
realistic.

We can sign up anyone in the NFS-Ganesha community to do this. It takes a
little time to get familiar with the scripts and tools that are used, but
once that is settled it is relatively straightforward.

Volunteers?


I hope we can have more folks on this. I've tried to understand CI stuff and
just get lost...



I'll join.  I have Centos logins already, although that may not extend 
to CI stuff.


Daniel
