On Mon, Oct 24, 2016 at 12:39 PM, Xavier Hernandez <xhernan...@datalab.es> wrote:
> Hi Soumya,
>
> On 21/10/16 16:15, Soumya Koduri wrote:
>>
>> On 10/21/2016 06:35 PM, Soumya Koduri wrote:
>>>
>>> Hi Xavi,
>>>
>>> On 10/21/2016 12:57 PM, Xavier Hernandez wrote:
>>>>
>>>> Looking at the code, I think that the added fd_unref() should only be
>>>> called if the fop preparation fails. Otherwise the callback already
>>>> unreferences the fd.
>>>>
>>>> Code flow:
>>>>
>>>> * glfs_fsync_async_common() takes an fd ref and calls STACK_WIND
>>>>   passing that fd.
>>>> * Just after that a ref is released.
>>>> * When glfs_io_async_cbk() is called another ref is released.
>>>>
>>>> Note that if fop preparation fails, a single fd_unref() is called, but
>>>> on success two fd_unref() are called.
>>>
>>> Sorry for the inconvenience caused. I think patch #15573 hasn't caused
>>> the problem itself but has highlighted another ref leak in the code.
>>>
>>> From the code I see that glfs_io_async_cbk() does fd_unref (glfd->fd)
>>> but not the fd passed in STACK_WIND_COOKIE() of the fop.
>>>
>>> If I take any fop, e.g. glfs_fsync_common():
>>>
>>>     fd = glfs_resolve_fd (glfd->fs, subvol, glfd);
>>>
>>> Here in glfs_resolve_fd ():
>>>
>>>     fd_t *
>>>     __glfs_resolve_fd (struct glfs *fs, xlator_t *subvol,
>>>                        struct glfs_fd *glfd)
>>>     {
>>>             fd_t *fd = NULL;
>>>
>>>             if (glfd->fd->inode->table->xl == subvol)
>>>                     return fd_ref (glfd->fd);
>>>
>>> Here we can see that we are taking an extra ref in addition to the ref
>>> already taken for glfd->fd. That means the caller of this function
>>> needs to fd_unref(fd) irrespective of the subsequent fd_unref (glfd->fd).
>>>
>>>             fd = __glfs_migrate_fd (fs, subvol, glfd);
>>>             if (!fd)
>>>                     return NULL;
>>>
>>>             if (subvol == fs->active_subvol) {
>>>                     fd_unref (glfd->fd);
>>>                     glfd->fd = fd_ref (fd);
>>>             }
>>>
>>> I think the issue is here during graph_switch(). You have mentioned as
>>> well that the crash happens post graph_switch.
>>> Maybe here we are missing an extra ref to be taken for fd, additional
>>> to glfd->fd. I need to look through __glfs_migrate_fd() to confirm
>>> that. But these are my initial thoughts.
>>
>> Looking into this, I think we should fix glfs_io_async_cbk() not to
>> fd_unref(glfd->fd). glfd->fd should be active throughout the lifetime
>> of glfd (i.e., until it is closed). Thoughts?
>
> I don't know gfapi internals in depth, but at first sight I think this
> would be the right thing to do. Assuming that glfd keeps a reference to
> the fd until it's destroyed, and that a glfd reference is taken during
> the lifetime of each request that needs it, the fd_unref() in
> glfs_io_async_cbk() seems unnecessary. I think it was there just to
> release the fd acquired in glfs_resolve_fd(), but it's better to place
> it where it is now.
>
> Another question is whether we really need to take an additional
> reference in glfs_resolve_fd()?

This answers the first question too: we don't need the additional ref in
glfs_resolve_fd() now that we have ref accounting in glfd. The confusion
arose because earlier there was no ref accounting for glfd and the only
refs were on fd_t. So no, we don't need this additional reference.

> Can an fd returned by this function live longer than the associated
> glfd in some circumstances?
>
>> Also could you please check if it is the second/subsequent
>> fsync_async() call which results in the crash?
>
> I'll try to test it as soon as possible, but this is on a server that
> we need to put in production very soon and we have decided to go with
> FUSE for now. We'll have a lot of work to do this week. Once I have
> some free time I'll build a test environment to check it, probably
> next week.

I have not been able to test this out completely. Theoretically, I don't
see any possibility where the fd can outlive the glfd that is pointing
to it.

I have sent a patch [1] that:
a. fixes the crash,
b. handles the unref in failure cases,
c. still has the duplicate refs, with some explanation in the commit
   message, and
d. adds a simple test for the same.

I request that we take this patch in and make a release soon, as it is
affecting many community users, and do the cleanup in another patch.

[1] http://review.gluster.org/#/c/15768/

Thanks,
Raghavendra Talur

> Xavi
>
>> Thanks,
>> Soumya
>>
>>> Please let me know your comments.
>>>
>>> Thanks,
>>> Soumya
>>>
>>>> Xavi
>>>>
>>>> On 21/10/16 09:03, Xavier Hernandez wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I've just tried Gluster 3.8.5 with Proxmox using gfapi and I
>>>>> consistently see a crash each time an attempt to connect to the
>>>>> volume is made.
>>>>>
>>>>> The backtrace of the crash shows this:
>>>>>
>>>>> #0  pthread_spin_lock () at ../nptl/sysdeps/x86_64/pthread_spin_lock.S:24
>>>>> #1  0x00007fe5345776a5 in fd_unref (fd=0x7fe523f7205c) at fd.c:553
>>>>> #2  0x00007fe53482ba18 in glfs_io_async_cbk (op_ret=<optimized out>,
>>>>>     op_errno=0, frame=<optimized out>, cookie=0x7fe526c67040,
>>>>>     iovec=iovec@entry=0x0, count=count@entry=0) at glfs-fops.c:839
>>>>> #3  0x00007fe53482beed in glfs_fsync_async_cbk (frame=<optimized out>,
>>>>>     cookie=<optimized out>, this=<optimized out>, op_ret=<optimized out>,
>>>>>     op_errno=<optimized out>, prebuf=<optimized out>,
>>>>>     postbuf=0x7fe5217fe890, xdata=0x0) at glfs-fops.c:1382
>>>>> #4  0x00007fe520be2eb7 in ??
>>>>>     () from /usr/lib/x86_64-linux-gnu/glusterfs/3.8.5/xlator/debug/io-stats.so
>>>>> #5  0x00007fe5345d118a in default_fsync_cbk (frame=0x7fe52ceef3ac,
>>>>>     cookie=0x560ef95398e8, this=0x8, op_ret=0, op_errno=0, prebuf=0x1,
>>>>>     postbuf=0x7fe5217fe890, xdata=0x0) at defaults.c:1508
>>>>> #6  0x00007fe5345d118a in default_fsync_cbk (frame=0x7fe52ceef204,
>>>>>     cookie=0x560ef95398e8, this=0x8, op_ret=0, op_errno=0, prebuf=0x1,
>>>>>     postbuf=0x7fe5217fe890, xdata=0x0) at defaults.c:1508
>>>>> #7  0x00007fe525f78219 in dht_fsync_cbk (frame=0x7fe52ceef2d8,
>>>>>     cookie=0x560ef95398e8, this=0x0, op_ret=0, op_errno=0,
>>>>>     prebuf=0x7fe5217fe820, postbuf=0x7fe5217fe890, xdata=0x0)
>>>>>     at dht-inode-read.c:873
>>>>> #8  0x00007fe5261bbc7f in client3_3_fsync_cbk (req=0x7fe525f78030
>>>>>     <dht_fsync_cbk>, iov=0x7fe526c61040, count=8,
>>>>>     myframe=0x7fe52ceef130) at client-rpc-fops.c:975
>>>>> #9  0x00007fe5343201f0 in rpc_clnt_handle_reply (clnt=0x18,
>>>>>     clnt@entry=0x7fe526fafac0, pollin=0x7fe526c3a1c0) at rpc-clnt.c:791
>>>>> #10 0x00007fe53432056c in rpc_clnt_notify (trans=<optimized out>,
>>>>>     mydata=0x7fe526fafaf0, event=<optimized out>, data=0x7fe526c3a1c0)
>>>>>     at rpc-clnt.c:962
>>>>> #11 0x00007fe53431c8a3 in rpc_transport_notify (this=<optimized out>,
>>>>>     event=<optimized out>, data=<optimized out>) at rpc-transport.c:541
>>>>> #12 0x00007fe5283e8d96 in socket_event_poll_in (this=0x7fe526c69440)
>>>>>     at socket.c:2267
>>>>> #13 0x00007fe5283eaf37 in socket_event_handler (fd=<optimized out>,
>>>>>     idx=5, data=0x7fe526c69440, poll_in=1, poll_out=0, poll_err=0)
>>>>>     at socket.c:2397
>>>>> #14 0x00007fe5345ab3f6 in event_dispatch_epoll_handler
>>>>>     (event=0x7fe5217fecc0, event_pool=0x7fe526ca2040) at event-epoll.c:571
>>>>> #15 event_dispatch_epoll_worker (data=0x7fe527c0f0c0) at event-epoll.c:674
>>>>> #16 0x00007fe5324140a4 in start_thread (arg=0x7fe5217ff700)
>>>>>     at pthread_create.c:309
>>>>> #17 0x00007fe53214962d in clone () at
>>>>>     ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>>>>>
>>>>> The fd being unreferenced contains this:
>>>>>
>>>>> (gdb) print *fd
>>>>> $6 = {
>>>>>   pid = 97649,
>>>>>   flags = 2,
>>>>>   refcount = 0,
>>>>>   inode_list = {
>>>>>     next = 0x7fe523f7206c,
>>>>>     prev = 0x7fe523f7206c
>>>>>   },
>>>>>   inode = 0x0,
>>>>>   lock = {
>>>>>     spinlock = 1,
>>>>>     mutex = {
>>>>>       __data = {
>>>>>         __lock = 1,
>>>>>         __count = 0,
>>>>>         __owner = 0,
>>>>>         __nusers = 0,
>>>>>         __kind = 0,
>>>>>         __spins = 0,
>>>>>         __elision = 0,
>>>>>         __list = {
>>>>>           __prev = 0x0,
>>>>>           __next = 0x0
>>>>>         }
>>>>>       },
>>>>>       __size = "\001", '\000' <repeats 38 times>,
>>>>>       __align = 1
>>>>>     }
>>>>>   },
>>>>>   _ctx = 0x7fe52ec31c40,
>>>>>   xl_count = 11,
>>>>>   lk_ctx = 0x7fe526c126a0,
>>>>>   anonymous = _gf_false
>>>>> }
>>>>>
>>>>> fd->inode is NULL, explaining the cause of the crash. We also see
>>>>> that fd->refcount is already 0. So I'm wondering if this couldn't be
>>>>> an extra fd_unref() introduced by that patch.
>>>>>
>>>>> The crash seems to happen immediately after a graph switch.
>>>>>
>>>>> Xavi
>>>>> _______________________________________________
>>>>> Gluster-devel mailing list
>>>>> Gluster-devel@gluster.org
>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
_______________________________________________ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel