Hi Ben,

Thanks for the quick reply. I will file an issue.

However, a follow-up question:

So, to counter such issues, can I use uv_ref() and uv_unref() for the "uv_fs_req" object and then make sure that it is unref()'ed when NFS is hung (as soon as I trigger my "timeout" callback)? Do you think that helps? Or do you think the unref() would not help because the actual I/O thread is already running?

Thanks and Best Regards,
Ramesh

On Thursday, May 8, 2014 4:00:37 PM UTC-7, Ben Noordhuis wrote:
>
> On Fri, May 9, 2014 at 12:46 AM, Ramesh Rayaprolu wrote:
> > Hi,
> >
> > I am using libuv for async file I/O operations. These file operations are
> > mostly on an NFS mount.
> >
> > So, if the NFS goes down while reading a file, it would hang (meaning, I
> > don't get the callback until the NFS is back running).
> >
> > (Also, it is the I/O that gets hung, so it looks like libuv thinks that the
> > I/O is in progress, and uv_cancel would not work on these requests).
> >
> > In this scenario, I have implemented a timeout of 30 sec, and if this is
> > triggered, I just clean up my objects (and my application runs forever doing
> > other stuff).
> >
> > But it looks like I need to keep the related "uv_fs_req" objects, because
> > when the NFS comes back, libuv will try to access these objects to make the
> > callback.
> >
> > I wish to know if there is any "cleaner" way of removing the pending
> > callback when NFS goes down?
> >
> > Thanks and Best Regards,
> > Ramesh
>
> The answer to your question is "not really."
>
> File I/O is done inside a thread pool. There is no magic:
> uv_fs_read() is just a read(2) system call that is executed in a
> separate userspace thread.
>
> When that NFS mount goes away, the kernel's counterpart to the
> userspace thread hangs until it comes back or a system-specific
> timeout expires.
>
> Depending on the kernel and the NFS implementation, it may be possible
> to unwedge the thread by sending it a signal. There are two gotchas,
> however:
>
> 1. Currently, system calls are retried when they fail with EINTR, the
> "interrupted by a signal" error code. Libuv would have to learn to
> distinguish between normal EINTR errors and ones that we instigated.
>
> 2. It's possible for the kernel thread to become non-interruptible, in
> which case nothing can unwedge it, not even a SIGKILL signal. That's
> rare on modern-day Linux, however, and non-existent (AFAIK) when you
> use the default NFS implementation.
>
> I would suggest filing an issue.
