> You could make a custom Dir wrapper that always caches in RAM, but
> that sounds a bit terrifying :)
This was exactly what I implemented :) A commit-thread runs periodically
every 30 seconds, while a RAM-monitor thread runs every 5 seconds and
commits data in case sizeInBytes >= 70% of maxCachedBytes. This is quite
dangerous, as you have said, especially since sync() can take an arbitrary
amount of time.

> Alternatively, maybe on an HDFS error you could block that one thread
> while you retry for some amount of time, until the write/read
> succeeds?  (Like an NFS hard mount).

Well, after your idea I started digging into HDFS for this problem. I
believe HDFS handles this internally without a hitch, as per this link:

https://www.inkling.com/read/hadoop-definitive-guide-tom-white-3rd/chapter-3/data-flow

I believe that in the case of a node failure while writing, not even an
IOException is thrown to the client; it is all handled internally. I think
I can rest easy on this. Maybe I will write a test case to verify this
behavior.

Sorry for the trouble. Should have done some digging beforehand.

--
Ravi

On Wed, Dec 18, 2013 at 11:55 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> On Wed, Dec 18, 2013 at 3:15 AM, Ravikumar Govindarajan
> <ravikumar.govindara...@gmail.com> wrote:
> > Thanks Mike for a great explanation on Flush IOException
>
> You're welcome!
>
> > I was thinking from the perspective of an HDFSDirectory. In addition to
> > all the causes of IOException during flush you have listed, an
> > HDFSDirectory also has to deal with network issues, which is not
> > Lucene's problem at all.
> >
> > But I would ideally like to handle momentary network blips, as these
> > are fully recoverable errors.
> >
> > Will NRTCachingDirectory help in the case of HDFSDirectory? If all goes
> > well, I should always flush to RAM, and sync to HDFS happens only during
> > commits. In such cases, I can have retry logic inside the sync() method
> > for handling momentary IOExceptions.
>
> I'm not sure it helps, because on merge, if the expected size of the
> merged segment is large enough, NRTCachingDir won't cache those files:
> it just delegates directly to the wrapped directory.
>
> Likewise, if too much RAM is already in use, flushing a new segment
> would go straight to the wrapped directory.
>
> You could make a custom Dir wrapper that always caches in RAM, but
> that sounds a bit terrifying :)
>
> Alternatively, maybe on an HDFS error you could block that one thread
> while you retry for some amount of time, until the write/read
> succeeds?  (Like an NFS hard mount).
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
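P.S. In case a sketch helps anyone else reading this thread, here is a
rough outline of the two background threads described above. The class and
interface names are made up purely for illustration and this is not the
actual implementation; only the 30s/5s intervals and the
70%-of-maxCachedBytes threshold are from what I described:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.lucene.index.IndexWriter;

public class CacheCommitMonitor {

  /** Reports how many bytes the RAM-caching Directory currently holds. */
  public interface CacheSizeSource {
    long sizeInBytes();
  }

  private final ScheduledExecutorService scheduler =
      Executors.newScheduledThreadPool(2);

  public CacheCommitMonitor(final IndexWriter writer,
                            final CacheSizeSource cache,
                            final long maxCachedBytes) {

    // Commit-thread: commit (and hence sync to HDFS) every 30 seconds.
    scheduler.scheduleAtFixedRate(new Runnable() {
      public void run() {
        safeCommit(writer);
      }
    }, 30, 30, TimeUnit.SECONDS);

    // RAM-monitor thread: every 5 seconds, commit early once the cache
    // reaches 70% of maxCachedBytes.
    scheduler.scheduleAtFixedRate(new Runnable() {
      public void run() {
        if (cache.sizeInBytes() >= 0.7 * maxCachedBytes) {
          safeCommit(writer);
        }
      }
    }, 5, 5, TimeUnit.SECONDS);
  }

  private void safeCommit(IndexWriter writer) {
    try {
      writer.commit();  // the underlying sync() can block for an arbitrary time
    } catch (Exception e) {
      // swallow and retry on the next tick; real code should log this
    }
  }

  public void close() {
    scheduler.shutdown();
  }
}

The danger Mike pointed out shows up in safeCommit(): if sync() blocks on a
network blip, both scheduled tasks can pile up behind it.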