Hi Ben,

Thanks for the quick reply. I will file an issue.

However, I have a follow-up question:

To work around this, can I call uv_ref() and uv_unref() on the
"uv_fs_req" object, and make sure it is unref()'ed as soon as my
"timeout" callback fires while NFS is hung?

Do you think that would help? Or would the unref() make no difference
because the actual I/O thread is already running?

Thanks and Best Regards,
Ramesh



On Thursday, May 8, 2014 4:00:37 PM UTC-7, Ben Noordhuis wrote:
>
> On Fri, May 9, 2014 at 12:46 AM, Ramesh Rayaprolu
> <[email protected]> wrote:
> > Hi, 
> > 
> > I am using libuv for async file I/O operations. These file operations
> > are mostly on an NFS mount.
> >
> > If the NFS mount goes down while a file is being read, the read hangs
> > (meaning I don't get the callback until NFS is back up).
> >
> > (Also, it is the I/O itself that hangs, so libuv considers the I/O to
> > be in progress, and uv_cancel() does not work on these requests.)
> >
> > To handle this, I have implemented a 30-second timeout; when it
> > fires, I just clean up my objects (and my application keeps running,
> > doing other work).
> >
> > But it looks like I need to keep the related "uv_fs_req" objects,
> > because when NFS comes back, libuv will try to access them to make
> > the callback.
> >
> > Is there any "cleaner" way of removing the pending callback when NFS
> > goes down?
> > 
> > Thanks and Best Regards, 
> > Ramesh 
>
> The answer to your question is "not really." 
>
> File I/O is done inside a thread pool.  There is no magic: 
> uv_fs_read() is just a read(2) system call that is executed in a 
> separate userspace thread. 
>
> When that NFS mount goes away, the kernel's counterpart to the 
> userspace thread hangs until it comes back or a system-specific 
> timeout expires. 
>
> Depending on the kernel and the NFS implementation, it may be possible 
> to unwedge the thread by sending it a signal.  There are two gotchas, 
> however: 
>
> 1.  Currently, system calls are retried when they fail with EINTR, the 
> "interrupted by a signal" error code.  Libuv would have to learn to 
> distinguish between normal EINTR errors and ones that we instigated. 
>
> 2. It's possible for the kernel thread to become non-interruptible, in 
> which case nothing can unwedge it, not even a SIGKILL signal.  That's 
> rare on modern-day Linux, however, and non-existent (AFAIK) when you 
> use the default NFS implementation. 
>
> I would suggest filing an issue. 
>
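
For reference, gotcha 1 above (telling an ordinary EINTR apart from one
caused by a deliberate signal) could look roughly like the sketch below.
This is plain POSIX C, not libuv code: read_retry_unless_cancelled() and
the "cancelled" flag are hypothetical names made up for illustration, and
per the reply above libuv did not do anything like this at the time.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libuv): retry read(2) on EINTR, as a
 * thread-pool worker normally would, unless the caller has set *cancelled.
 * In that case the EINTR is treated as one we instigated ourselves and is
 * reported as ECANCELED instead of being retried.
 */
ssize_t read_retry_unless_cancelled(int fd, void *buf, size_t len,
                                    const volatile sig_atomic_t *cancelled)
{
    for (;;) {
        ssize_t n = read(fd, buf, len);

        /* Success, EOF, or an error other than EINTR: return it as-is. */
        if (n >= 0 || errno != EINTR)
            return n;

        /* EINTR after a cancellation request: stop retrying. */
        if (*cancelled) {
            errno = ECANCELED;
            return -1;
        }

        /* Ordinary EINTR from an unrelated signal: retry the call. */
    }
}
```

In this sketch the timeout callback would set the flag first and then
signal the worker thread (for example with pthread_kill()). The signal
handler has to be installed without SA_RESTART, otherwise the kernel
restarts the read() on its own. Whether a read blocked on a dead NFS
mount returns EINTR at all still depends on the kernel and the NFS
implementation, as noted in gotcha 2.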
