Howard van Rooijen updated LUCENENET-600:
    Summary: Creating an IndexWriter with a RAMDirectory causes two exceptions 
to be thrown  (was: Creating an IndexWriter with a RamDirectory causes two 
exceptions to be thrown)

> Creating an IndexWriter with a RAMDirectory causes two exceptions to be thrown
> ------------------------------------------------------------------------------
>                 Key: LUCENENET-600
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-600
>             Project: Lucene.Net
>          Issue Type: Bug
>          Components: Lucene.Net Core
>    Affects Versions: Lucene.Net 4.8.0
>            Reporter: Howard van Rooijen
>            Priority: Major
> I have a document scoring algorithm built on top of Lucene. I've just 
> upgraded it to the 4.8.0-beta00005 packages (great job by the way).
> We essentially create an in-memory index for a single document in order to
> do some parsing / processing / scoring / classification.
> I noticed while running our test suite that the CPU was spiking, and that a
> large number of first-chance exceptions were being generated by these two
> lines of code:
> {{var directory = new RAMDirectory();}}
> {{var indexWriter = new IndexWriter(directory, new 
> IndexWriterConfig(LuceneVersion.LUCENE_48, new 
> ScorableDocumentAnalyzer(LuceneVersion.LUCENE_48)));}}
> The first exception is:
> {{'System.IO.FileNotFoundException' in Lucene.Net.dll ("segments.gen"). }}
> The second exception is:
> {{'Lucene.Net.Index.IndexNotFoundException' in Lucene.Net.dll ("no segments* 
> file found in RAMDirectory@21af1a5 
> lockFactory=Lucene.Net.Store.SingleInstanceLockFactory:}}
> Based on reading / research, I believe this is because the RAMDirectory is
> initialised as null, so when the IndexWriter is created and tries to query
> the RAMDirectory, a FileNotFoundException is thrown.
> Is it possible to initialise it as empty rather than null, i.e. so that
> reading the directory would not throw an exception? This might involve
> adding a "segments.gen" entry and a matching "segments_n" segmentinfo entry.
> Alternatively, is it possible not to throw an exception in this use case?
> Or do you have a suggestion for how it would be possible to manually
> initialise the RAMDirectory before passing it to the IndexWriter?
> Because these two lines are being called per request, we're seeing 2
> exceptions per request, which seems like an expensive way of initialising an
> IndexWriter. We've already had to replace QueryParser with SimpleQueryParser
> because QueryParser was throwing 50+ exceptions internally when being
> instantiated.
> If anyone can point me in the right direction, I'd be more than happy to try
> and create a fix / PR. But since RAMDirectory is often used in unit testing
> scenarios, I'm wondering: does anyone have any deep knowledge about why this
> is the default behaviour?
> Many Thanks,
> Howard

This message was sent by Atlassian JIRA
