Re: Index size for Same DataSet.

2014-03-25 Thread Erick Erickson
You're probably fine. Part of indexing is merging segments, and when segments are merged the data from deleted (or updated) documents is reclaimed. Any slight variance in the commit algorithm will potentially reclaim more or less space. What happens if you optimize (forceMerge) as a final step?

RE: Index size for Same DataSet.

2014-03-25 Thread Uwe Schindler
Hi, The reason for this is multithreaded merging. While indexing, Lucene merges segments in separate threads. As this runs multithreaded, there is no strict order of things. Depending on how fast the disk is or what other processes are running in parallel, the merging may proceed fast or

Re: Index size for Same DataSet.

2014-03-25 Thread Jose Carlos Canova
Hi, Thanks a lot for the clarification. I will do that (force merge) at the end, just to check that everything on my side (:-)) is working correctly. Regards. On Tue, Mar 25, 2014 at 5:41 AM, Uwe Schindler u...@thetaphi.de wrote: Hi, The reason for this is multithreaded merging. While indexing, Lucene

Lucene Wildcard for zero or one character

2014-03-25 Thread Sven Teichmann
Hello, does Lucene provide a zero or one character wildcard (like ? in Perl RegEx)? Example of what I mean: house% finds house and houses As far as I know in Lucene the ? wildcard is for exactly one character, but I need a zero or one character wildcard. Best regards, -- Sven Teichmann

RE: Lucene Wildcard for zero or one character

2014-03-25 Thread Uwe Schindler
The default WildcardQuery only supports: '*' (star) is the wildcard in WildcardQuery for zero or more chars. '?' is exactly one char. Zero or exactly one char can only be done with a RegexpQuery: https://lucene.apache.org/core/4_7_0/core/org/apache/lucene/search/RegexpQuery.html Here is the

Re: Lucene Wildcard for zero or one character

2014-03-25 Thread Jack Krupansky
/houses?/ -- Jack Krupansky -Original Message- From: Uwe Schindler Sent: Tuesday, March 25, 2014 11:34 AM To: java-user@lucene.apache.org Subject: RE: Lucene Wildcard for zero or one character The default WildcardQuery only supports: '*' (star) is the wildcard in WildcardQuery for
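The difference between the two answers above can be illustrated with a small sketch. It uses java.util.regex rather than Lucene itself, purely as a stand-in: Lucene's RegexpQuery has its own automaton-based regexp syntax, and the assumption here is only that this simple pattern behaves the same in both. Whole-string matching is used because a Lucene regexp has to match the entire term. The class name, method names, and the field name "body" in the comment are hypothetical.

```java
import java.util.regex.Pattern;

// Illustration only: java.util.regex standing in for Lucene's regexp matching.
// In Lucene 4.x the corresponding query would be built roughly as
//   new RegexpQuery(new Term("body", "houses?"))   // "body" is a made-up field name
// or written as /houses?/ in the classic query parser syntax.
class ZeroOrOneDemo {
    // Regexp "houses?": the trailing 's' may occur zero or one time.
    static final Pattern ZERO_OR_ONE = Pattern.compile("houses?");

    // WildcardQuery-style "house?" means exactly one extra character;
    // '.' is the regex equivalent of that wildcard.
    static final Pattern EXACTLY_ONE = Pattern.compile("house.");

    static boolean zeroOrOne(String term) {
        // matches() requires the whole term to match, like a Lucene term match
        return ZERO_OR_ONE.matcher(term).matches();
    }

    static boolean exactlyOne(String term) {
        return EXACTLY_ONE.matcher(term).matches();
    }

    public static void main(String[] args) {
        for (String term : new String[] {"house", "houses", "housing"}) {
            System.out.println(term
                    + " zeroOrOne=" + zeroOrOne(term)
                    + " exactlyOne=" + exactlyOne(term));
        }
    }
}
```

With the wildcard form, "house" itself is not matched because one more character is required; the regexp form accepts both "house" and "houses", which is what the original question asked for.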

A question concerning a NullPointerException in QueryParser's jj_add_error_token method

2014-03-25 Thread Turri, Albert (ELS-NYC)
Hi, I would like to seek assistance regarding the following issue I'm encountering. I'm running Tomcat and have deployed Jena (2.6.2) and Lucene Core (2.9.0), whereby Jena invokes the Lucene classes. Typically everything is fine, but I have recently encountered a NullPointerException,

RE: A question concerning a NullPointerException in QueryParser's jj_add_error_token method

2014-03-25 Thread Turri, Albert (ELS-NYC)
The following is the related stack trace, whereby jj_expentry (or oldentry?) is null. --- if (oldentry.length == jj_expentry.length) { --- SEVERE: Exception from