Presence of uncommitted changes

2014-01-16 Thread Mindaugas Žakšauskas
Hi, I was wondering what the best approach would be for the situation where some documents are deleted and it is unclear whether the deletions have resulted in any pending commits. In a single-threaded scenario this seems to be as simple as indexWriter.deleteDocuments(query); // same for

Re: Presence of uncommitted changes

2014-01-16 Thread Michael McCandless
On Thu, Jan 16, 2014 at 6:30 AM, Mindaugas Žakšauskas min...@gmail.com wrote: Hi, I was wondering what the best approach would be for the situation where some documents are deleted and it is unclear whether the deletions have resulted in any pending commits. In a single-threaded scenario

Sample Data to Test Lucene

2014-01-16 Thread Deniz Atak
Hi, we are new to Lucene. We would like to use Lucene for our archive project. In this project we have to get some images of documents, get text out of them via OCR and index them using Lucene. In order to see if Lucene is suitable for our project we need to test Lucene with sample data. But we

RE: Sample Data to Test Lucene

2014-01-16 Thread Allison, Timothy B.
To confirm, Lucene does not perform OCR. (If you are looking for open-source Java OCR packages, you might take a look here for some ideas: https://issues.apache.org/jira/i#browse/TIKA-93). Are you trying to find a corpus of noisy OCR'd text to use as input into Lucene? If so, this looks

ArrayIndexOutOfBoundsException calling FacetFields.addFields()

2014-01-16 Thread Matthew D. Petersen
I’m having an issue with an index when adding category paths to a document. They seem to be added without issue for a long period of time, then for some unknown reason the addition fails with an ArrayIndexOutOfBoundsException. Subsequent attempts to add category paths fail with the same

Issue with FacetFields.addFields() throwing ArrayIndexOutOfBoundsException

2014-01-16 Thread Matthew Petersen
I’m having an issue with an index when adding category paths to a document. They seem to be added without issue for a long period of time, then for some unknown reason the addition fails with an ArrayIndexOutOfBoundsException. Subsequent attempts to add category paths fail with the same

FieldType.tokenized not the same after query

2014-01-16 Thread Phil Herold
The last line in the test program below fails. I'm trying to store a non-tokenized keyword field and get the same field type back after a query. But it doesn't work: the field comes back as tokenized. Is this a known problem, or am I missing something? Thanks. -- Phil import