> You could make a custom Dir wrapper that always caches in RAM, but
> that sounds a bit terrifying :)
This was exactly what I implemented :) A commit-thread runs periodically
every 30 seconds, while a RAM-monitor thread runs every 5 seconds and
commits data in case sizeInBytes >= 70% of maxCachedBytes. This is quite
dangerous, as you have said, especially since sync() can take an arbitrary
amount of time.

> Alternatively, maybe on an HDFS error you could block that one thread
> while you retry for some amount of time, until the write/read
> succeeds?  (Like an NFS hard mount).

Well, after your idea I started digging into HDFS for this problem. I
believe HDFS handles this internally without a hitch, as per this link:

https://www.inkling.com/read/hadoop-definitive-guide-tom-white-3rd/chapter-3/data-flow

I believe that in the case of a node failure while writing, not even an
IOException is thrown to the client; it is all handled internally. I think
I can rest easy on this. Maybe I will write a test case to verify this
behavior.

Sorry for the trouble. Should have done some digging beforehand.

--
Ravi

On Wed, Dec 18, 2013 at 11:55 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> On Wed, Dec 18, 2013 at 3:15 AM, Ravikumar Govindarajan
> <ravikumar.govindara...@gmail.com> wrote:
> > Thanks Mike for a great explanation on Flush IOException
>
> You're welcome!
>
> > I was thinking from the perspective of an HDFSDirectory. In addition to
> > all the causes of IOException during flush you have listed, an
> > HDFSDirectory also has to deal with network issues, which is not
> > Lucene's problem at all.
> >
> > But I would ideally like to handle momentary network blips, as these
> > are fully recoverable errors.
> >
> > Will NRTCachingDirectory help in the case of HDFSDirectory? If all goes
> > well, I should always flush to RAM, and sync to HDFS happens only during
> > commits. In such cases, I can have retry logic inside the sync() method
> > for handling momentary IOExceptions.
>
> I'm not sure it helps, because on merge, if the expected size of the
> merged segment is large enough, NRTCachingDir won't cache those files:
> it just delegates directly to the wrapped directory.
>
> Likewise, if too much RAM is already in use, flushing a new segment
> would go straight to the wrapped directory.
>
> You could make a custom Dir wrapper that always caches in RAM, but
> that sounds a bit terrifying :)
>
> Alternatively, maybe on an HDFS error you could block that one thread
> while you retry for some amount of time, until the write/read
> succeeds?  (Like an NFS hard mount).
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
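P.S. In case a sketch helps anyone else reading this thread, here is a
rough outline of the two background threads described above. The class and
interface names are made up purely for illustration and this is not the
actual implementation; only the 30s/5s intervals and the
70%-of-maxCachedBytes threshold are from what I described:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.lucene.index.IndexWriter;

public class CacheCommitMonitor {

  /** Reports how many bytes the RAM-caching Directory currently holds. */
  public interface CacheSizeSource {
    long sizeInBytes();
  }

  private final ScheduledExecutorService scheduler =
      Executors.newScheduledThreadPool(2);

  public CacheCommitMonitor(final IndexWriter writer,
                            final CacheSizeSource cache,
                            final long maxCachedBytes) {

    // Commit-thread: commit (and hence sync to HDFS) every 30 seconds.
    scheduler.scheduleAtFixedRate(new Runnable() {
      public void run() {
        safeCommit(writer);
      }
    }, 30, 30, TimeUnit.SECONDS);

    // RAM-monitor thread: every 5 seconds, commit early once the cache
    // reaches 70% of maxCachedBytes.
    scheduler.scheduleAtFixedRate(new Runnable() {
      public void run() {
        if (cache.sizeInBytes() >= 0.7 * maxCachedBytes) {
          safeCommit(writer);
        }
      }
    }, 5, 5, TimeUnit.SECONDS);
  }

  private void safeCommit(IndexWriter writer) {
    try {
      writer.commit();  // the underlying sync() can block for an arbitrary time
    } catch (Exception e) {
      // swallow and retry on the next tick; real code should log this
    }
  }

  public void close() {
    scheduler.shutdown();
  }
}

The danger Mike pointed out shows up in safeCommit(): if sync() blocks on a
network blip, both scheduled tasks can pile up behind it.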