On Wed, Dec 18, 2013 at 11:34 PM, Ravikumar Govindarajan
<ravikumar.govindara...@gmail.com> wrote:

>> You could make a custom Dir wrapper that always caches in RAM, but
>> that sounds a bit terrifying :)
>
> This was exactly what I implemented :)

I see :)

> A commit-thread runs periodically every 30 seconds, while a RAM-monitor
> thread runs every 5 seconds to commit data in case
> sizeInBytes >= 70%-of-maxCachedBytes. This is quite dangerous as you have
> said, especially when sync() can take an arbitrary amount of time.

Well, Lucene is able to produce bytes at a high rate during merging (if
it can read them at a high rate), so if you're not careful you can use
too much RAM.  Be sure to stall the byte-producing threads when that
happens, until HDFS catches up.
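Something along these lines could do the stalling -- just a rough,
untested sketch (RamBudget and its methods are made-up names, not real
Lucene or HDFS APIs): the byte-producing threads check the budget before
buffering more in RAM, and your commit/RAM-monitor thread credits it
back after sync() completes:

// Untested sketch: a shared RAM budget that stalls producer threads
// once too many un-flushed bytes are cached, until the flush to HDFS
// catches up.  All class and method names here are made up.
class RamBudget {
  private final long maxCachedBytes;
  private long cachedBytes;

  RamBudget(long maxCachedBytes) {
    this.maxCachedBytes = maxCachedBytes;
  }

  // Called by byte-producing threads (indexing/merging) before they
  // buffer more bytes in RAM; blocks while the budget is exhausted.
  synchronized void addBytes(long numBytes) throws InterruptedException {
    while (cachedBytes >= maxCachedBytes) {
      wait();
    }
    cachedBytes += numBytes;
  }

  // Called by the commit/RAM-monitor thread once bytes have been
  // sync()'d to HDFS, waking any stalled producers.
  synchronized void bytesFlushed(long numBytes) {
    cachedBytes -= numBytes;
    notifyAll();
  }
}

The important part is only that a producer which gets ahead of HDFS ends
up waiting, instead of growing the RAM cache without bound.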
>> Alternatively, maybe on an HDFS error you could block that one thread
>> while you retry for some amount of time, until the write/read
>> succeeds?  (Like an NFS hard mount.)
>
> Well, after your idea I started digging into HDFS for this problem. I
> believe HDFS handles this internally and silently, as per this link:
> https://www.inkling.com/read/hadoop-definitive-guide-tom-white-3rd/chapter-3/data-flow
>
> I believe that in the case of a node failure while writing, not even an
> IOException is thrown to the client; all of it is handled internally. I
> think I can rest easy on this.
> Maybe I will write a test case to verify this behavior.

Oh, that's good.

> Sorry for the trouble. Should have done some digging beforehand.

No problem!

Mike McCandless

http://blog.mikemccandless.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org