We have also made a patch that adds a high-water-mark level (15% of excess block-cache capacity), after which cache writes are stopped.
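
For illustration only, here is a rough Java sketch of how such a gate could look. It is not the patch itself. The class and method names (HighWaterMarkGate, tryReserve, release) are made up, the 15% is read as headroom above the configured capacity, and "capacity is reclaimed" is assumed to mean usage falling back to the configured capacity or below.

```java
/**
 * Illustrative only: a hypothetical write gate for a block cache.
 * Names and thresholds are assumptions, not taken from the Blur patch.
 */
final class HighWaterMarkGate {
  private final long capacityBytes;      // configured cache size
  private final long highWaterMarkBytes; // capacity plus the allowed excess
  private long usedBytes;
  private boolean writesPaused;

  HighWaterMarkGate(long capacityBytes, int excessPercent) {
    if (capacityBytes <= 0 || excessPercent < 0) {
      throw new IllegalArgumentException("capacity must be > 0 and excess >= 0");
    }
    this.capacityBytes = capacityBytes;
    // Assumption: "15% of excess" means 15% above the configured capacity.
    this.highWaterMarkBytes = capacityBytes + capacityBytes * excessPercent / 100;
  }

  /** Returns true if the caller may add {@code bytes} to the cache. */
  synchronized boolean tryReserve(long bytes) {
    if (writesPaused) {
      return false; // still waiting for the clean-up thread
    }
    if (usedBytes + bytes > highWaterMarkBytes) {
      writesPaused = true; // would cross the high-water mark: stop cache writes
      return false;
    }
    usedBytes += bytes;
    return true;
  }

  /** Called by the clean-up thread after it evicts {@code bytes}. */
  synchronized void release(long bytes) {
    usedBytes = Math.max(0, usedBytes - bytes);
    // Assumption: writes resume once usage is back within configured capacity.
    if (usedBytes <= capacityBytes) {
      writesPaused = false;
    }
  }

  synchronized boolean isPaused() { return writesPaused; }

  synchronized long usedBytes() { return usedBytes; }
}
```

In this sketch a writer would call tryReserve before adding a block and bypass the cache when it returns false, while the clean-up thread would call release after each eviction.
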
Once capacity is reclaimed by the clean-up thread, we resume adding to the cache.

On Mon, Jul 18, 2016 at 1:58 PM, Ravikumar Govindarajan <[email protected]> wrote:

> We had an issue with the block-cache growing beyond its configured size and
> shrinking only very rarely. The sequence of events:
>
>    1. Shard receives incoming mutations, adds them to the index & triggers
>    a background merge.
>    2. Merge produces a new set of files. We have write-thru cache enabled,
>    which adds the new files to the block-cache.
>    3. Shard goes silent & doesn't receive any mutation for many minutes
>    altogether.
>    4. Since we perform a commit only upon receiving mutations, the
>    older files are not evicted from the block-cache.
>    5. The problem is exacerbated with the KeepNLastCommit policy, where even
>    after a commit, unused files are not evicted from the block-cache.
>
> We are planning to patch up SharedMergeScheduler by refreshing the IndexReader
> when a merge completes & then deleting the merged files from the block-cache.
> This way, I believe the block-cache can be reined in whenever it exceeds
> capacity, irrespective of the commit policy used.
>
> Do let me know if this is fine...
>
> On Thu, Jun 16, 2016 at 4:33 PM, Ravikumar Govindarajan <[email protected]> wrote:
>
>>> I didn't fully understand the underlying Lucene reader, writer,
>>> open, close semantics
>>
>> I too don't know the correct behavior. Lucene code is incredibly hairy to
>> follow... :)
>>
>> Have pinged the Lucene mailing list. Hope someone replies...
>>
>> On Tue, Jun 7, 2016 at 4:46 PM, Aaron McCurry <[email protected]> wrote:
>>
>>> On Wed, Jun 1, 2016 at 7:34 AM, Ravikumar Govindarajan <[email protected]> wrote:
>>>
>>> > Just one more observation here...
>>> >
>>> > Even if readerPooling is set to true, Lucene has 2 readers (one for
>>> > search & one for updates/deletes).
>>> >
>>> > But the reader for updates/deletes is not opened/closed for every commit
>>> > call which is the default behavior as of today.
>>> > It is opened only once (during the first update/delete call).
>>> >
>>>
>>> I will take a closer look at the code for this one. Likely when I wrote
>>> this code I didn't fully understand the underlying Lucene reader, writer,
>>> open, close semantics. Thank you for pointing this out!
>>>
>>> Aaron
>>>
>>> > On Wed, Jun 1, 2016 at 3:10 PM, Ravikumar Govindarajan <[email protected]> wrote:
>>> >
>>> > >> In newer versions of the code there are multiple streams involved. One
>>> > >> for each open file handle plus if a sequential read is detected a new
>>> > >> stream is created for the instance for better performance
>>> > >
>>> > > Great. We just patched up our Blur version with this code.
>>> > >
>>> > > While I was digging into the reader-closed issue, I was quite surprised
>>> > > to observe the following behavior:
>>> > >
>>> > >    - Issue a commit
>>> > >    - Lucene opens a new reader via IndexWriter (doesn't re-use our
>>> > >    already opened DirectoryReader)
>>> > >    - Processes all updates/deletes/merges
>>> > >    - Closes the new reader
>>> > >    - Completes the commit
>>> > >
>>> > > For a big index & lots of commits, opening a new reader for every
>>> > > commit is prohibitively expensive.
>>> > >
>>> > > Here is the JIRA for it...
>>> > > https://issues.apache.org/jira/browse/LUCENE-2297
>>> > >
>>> > > All we need to do is set "readerPooling=true" in the
>>> > > IndexWriterConfig class.
>>> > >
>>> > > Please do explore this option when you find time.
>>> > >
>>> > > --
>>> > > Ravi
>>> > >
>>> > > On Tue, May 24, 2016 at 7:48 PM, Aaron McCurry <[email protected]> wrote:
>>> > >
>>> > >> On Tue, May 24, 2016 at 6:06 AM, Ravikumar Govindarajan <[email protected]> wrote:
>>> > >>
>>> > >> > We have solved it temporarily by using a KeepLastTwoCommits deletion
>>> > >> > policy. We don't get these exceptions now!!!
>>> > >> >
>>> > >>
>>> > >> Great!
>>> > >>
>>> > >> > Btw, I see that pread calls in FSDataInputStream.java are
>>> > >> > synchronized. Is it possible that merge DFS read calls could
>>> > >> > potentially block search DFS read calls?
>>> > >> >
>>> > >>
>>> > >> Yes.
>>> > >>
>>> > >> > Would it be a good idea to have 2 DFSInputStreams for every file,
>>> > >> > one for merge & another for search?
>>> > >> >
>>> > >>
>>> > >> In newer versions of the code there are multiple streams involved. One
>>> > >> for each open file handle plus if a sequential read is detected a new
>>> > >> stream is created for the instance for better performance. Check out the
>>> > >> HdfsDirectory class.
>>> > >>
>>> > >> Aaron
>>> > >>
>>> > >> > On Tue, May 10, 2016 at 7:43 PM, Ravikumar Govindarajan <[email protected]> wrote:
>>> > >> >
>>> > >> > > Sorry, I misunderstood the code.
>>> > >> > > I see that it has 2 locks, IndexRefreshWriteLock &
>>> > >> > > IndexRefreshReadLock. They look to be separate.
>>> > >> > >
>>> > >> > > On Tue, May 10, 2016 at 7:16 PM, Ravikumar Govindarajan <[email protected]> wrote:
>>> > >> > >
>>> > >> > >> Thanks a lot Aaron.
>>> > >> > >>
>>> > >> > >> I guess we took a commit of 0.2.2 that doesn't have the
>>> > >> > >> IndexRefreshWriteLock (IRWL). It looks like it co-ordinates between
>>> > >> > >> searches & incoming mutation commits. If so, then it will likely
>>> > >> > >> solve the first issue for us (AlreadyClosedException).
>>> > >> > >>
>>> > >> > >> Can you recollect if that was the reason IRWL was introduced?
>>> > >> > >>
>>> > >> > >> On Tue, May 10, 2016 at 6:40 PM, Aaron McCurry <[email protected]> wrote:
>>> > >> > >>
>>> > >> > >>> On Tue, May 10, 2016 at 2:30 AM, Ravikumar Govindarajan <[email protected]> wrote:
>>> > >> > >>>
>>> > >> > >>> > Actually there are 2 issues...
>>> > >> > >>> >
>>> > >> > >>> >    1. IndexReaderClosedException
>>> > >> > >>> >    2. HDFS Stream Closed
>>> > >> > >>> >
>>> > >> > >>>
>>> > >> > >>> Likely when the index is closed it closes the underlying
>>> > >> > >>> indexinputs as well, causing the HDFS Stream closed exception.
>>> > >> > >>>
>>> > >> > >>> > Merge completion results in file deletion & ultimately HDFS Stream
>>> > >> > >>> > Closed during search....
>>> > >> > >>> >
>>> > >> > >>> > I use IndexFileDeleter with KeepOnlyLastCommitDeletionPolicy. This
>>> > >> > >>> > blindly deletes the file, without bothering to cross-check
>>> > >> > >>> > IndexReader.RefCount > 0.
>>> > >> > >>> >
>>> > >> > >>>
>>> > >> > >>> Hmm. You can see here:
>>> > >> > >>>
>>> > >> > >>> https://github.com/apache/incubator-blur/blob/release-0.2.2-incubating/blur-core/src/main/java/org/apache/blur/manager/writer/BlurIndexSimpleWriter.java#L303
>>> > >> > >>>
>>> > >> > >>> That once the new index is available it is swapped into the index
>>> > >> > >>> ref object and the old one is sent to the index closer. Once the
>>> > >> > >>> refs to the index are low enough it closes the index. Or at least
>>> > >> > >>> it should.
>>> > >> > >>>
>>> > >> > >>> I will continue looking into the problem but I don't have a
>>> > >> > >>> solution for you yet.
>>> > >> > >>>
>>> > >> > >>> Aaron
>>> > >> > >>>
>>> > >> > >>> > *Exception(message:Unknown error during rewrite,
>>> > >> > >>> > stackTraceStr:java.io.IOException: Stream closed*
>>> > >> > >>> > at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1385)
>>> > >> > >>> > at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1374)
>>> > >> > >>> > at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:89)
>>> > >> > >>> > at org.apache.blur.store.hdfs.HdfsIndexInput.readInternal(HdfsIndexInput.java:62)
>>> > >> > >>> > at org.apache.blur.store.buffer.ReusedBufferedIndexInput.readBytes(ReusedBufferedIndexInput.java:167)
>>> > >> > >>> > at org.apache.blur.store.buffer.ReusedBufferedIndexInput.readBytes(ReusedBufferedIndexInput.java:122)
>>> > >> > >>> > at org.apache.blur.store.hdfs.MmapCacheIndexInput.readAndcache(MmapCacheIndexInput.java:24)
>>> > >> > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.fillNormally(CacheIndexInput.java:354)
>>> > >> > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.fill(CacheIndexInput.java:379)
>>> > >> > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.tryToFill(CacheIndexInput.java:297)
>>> > >> > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.readByte(CacheIndexInput.java:151)
>>> > >> > >>> > at org.apache.blur.lucene.warmup.TraceableIndexInput.readByte(TraceableIndexInput.java:62)
>>> > >> > >>> > at org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
>>> > >> > >>> > at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock(BlockTreeTermsReader.java:2366)
>>> > >> > >>> > at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.seekCeil(BlockTreeTermsReader.java:1949)
>>> > >> > >>> > at org.apache.blur.index.ExitableReader$ExitableTermsEnum.seekCeil(ExitableReader.java:250)
>>> > >> > >>> > at org.apache.lucene.index.FilteredTermsEnum.next(FilteredTermsEnum.java:225)
>>> > >> > >>> > at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:78)
>>> > >> > >>> > at org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
>>> > >> > >>> > at org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
>>> > >> > >>> > at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:288)
>>> > >> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
>>> > >> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
>>> > >> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
>>> > >> > >>> > at
>>> > >> > >>> >
>>> > >> > >>> > On Mon, May 9, 2016 at 4:42 PM, Ravikumar Govindarajan <[email protected]> wrote:
>>> > >> > >>> >
>>> > >> > >>> > > One extra piece of info we gleaned from the logs...
>>> > >> > >>> > >
>>> > >> > >>> > >    1. Merge starts & is about to complete
>>> > >> > >>> > >    2. Searcher is opened
>>> > >> > >>> > >    3. Merge completes
>>> > >> > >>> > >    4. Ref-count drops to 0 in IndexReader
>>> > >> > >>> > >    5. IndexReader closed while Searcher is still open
>>> > >> > >>> > >
>>> > >> > >>> > > This seems to be the main pattern causing the exception.
>>> > >> > >>> > >
>>> > >> > >>> > > --
>>> > >> > >>> > > Ravi
>>> > >> > >>> > >
>>> > >> > >>> > > On Mon, May 9, 2016 at 3:08 PM, Ravikumar Govindarajan <[email protected]> wrote:
>>> > >> > >>> > >
>>> > >> > >>> > >> Thanks Aaron...
>>> > >> > >>> > >>
>>> > >> > >>> > >> Just a quick question. Lucene itself has ref-counting to close
>>> > >> > >>> > >> its readers, no? Or does Blur have its own logic to handle it?
>>> > >> > >>> > >>
>>> > >> > >>> > >> --
>>> > >> > >>> > >> Ravi
>>> > >> > >>> > >>
>>> > >> > >>> > >> On Fri, May 6, 2016 at 7:56 PM, Aaron McCurry <[email protected]> wrote:
>>> > >> > >>> > >>
>>> > >> > >>> > >>> Likely yes. If I have a few minutes this weekend I can look
>>> > >> > >>> > >>> through that version and see if I can point you in the right
>>> > >> > >>> > >>> direction.
>>> > >> > >>> > >>>
>>> > >> > >>> > >>> On Fri, May 6, 2016 at 8:46 AM, Ravikumar Govindarajan <[email protected]> wrote:
>>> > >> > >>> > >>>
>>> > >> > >>> > >>> > Sometimes during an ongoing search we receive an
>>> > >> > >>> > >>> > IndexReaderClosedException...
>>> > >> > >>> > >>> >
>>> > >> > >>> > >>> > We are on an older version of Blur (0.2.2). Has this been
>>> > >> > >>> > >>> > fixed in newer versions or have we been using it wrongly?
>>> > >> > >>> > >>> >
>>> > >> > >>> > >>> > *stackTraceStr:org.apache.lucene.store.AlreadyClosedException:
>>> > >> > >>> > >>> > this IndexReader cannot be used anymore as one of its child
>>> > >> > >>> > >>> > readers was closed*
>>> > >> > >>> > >>> > at org.apache.lucene.index.IndexReader.ensureOpen(IndexReader.java:257)
>>> > >> > >>> > >>> > at org.apache.lucene.index.FilterAtomicReader.fields(FilterAtomicReader.java:380)
>>> > >> > >>> > >>> > at org.apache.blur.index.ExitableReader$ExitableFilterAtomicReader.fields(ExitableReader.java:81)
>>> > >> > >>> > >>> > at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:52)
>>> > >> > >>> > >>> > at org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
>>> > >> > >>> > >>> > at org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
>>> > >> > >>> > >>> > at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:288)
>>> > >> > >>> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
