Re: Exceptions during batch indexing

2014-11-10 Thread Peter Keegan
Yeah, I realized this after getting no responses and sent it to solr-user - thanks. On Sat, Nov 8, 2014 at 11:45 PM, Jack Krupansky j...@basetechnology.com wrote: Oops... you sent this to the wrong list - this is the Lucene user list, send it to the Solr user list. -- Jack Krupansky

How to disable LowerCaseFilter when using SnowballAnalyzer in Lucene 3.0.2

2014-11-10 Thread Martin O'Shea
I realise that 3.0.2 is an old version of Lucene, but if I have Java code as follows: int nGramLength = 3; Set<String> stopWords = new HashSet<String>(); stopWords.add("the"); stopWords.add("and"); ... SnowballAnalyzer snowballAnalyzer = new SnowballAnalyzer(Version.LUCENE_30, "English", stopWords);

RE: How to disable LowerCaseFilter when using SnowballAnalyzer in Lucene 3.0.2

2014-11-10 Thread Uwe Schindler
Hi, In general, you cannot change Analyzers; they are examples and can be seen as best practice. If you want to modify one, write your own Analyzer subclass that uses the Tokenizers and TokenFilters you want. You can for example clone the source code of the original and remove

RE: How to disable LowerCaseFilter when using SnowballAnalyzer in Lucene 3.0.2

2014-11-10 Thread Martin O'Shea
Uwe, thanks for the reply. Given that SnowballAnalyzer is made up of a series of filters, I was thinking about something like this, where I 'pipe' output from one filter to the next: standardTokenizer = new StandardTokenizer(...); standardFilter = new StandardFilter(standardTokenizer, ...);

RE: How to disable LowerCaseFilter when using SnowballAnalyzer in Lucene 3.0.2

2014-11-10 Thread Uwe Schindler
Hi, Uwe Thanks for the reply. Given that SnowBallAnalyzer is made up of a series of filters, I was thinking about something like this where I 'pipe' output from one filter to the next: standardTokenizer =new StandardTokenizer (...); standardFilter = new

Re: How to disable LowerCaseFilter when using SnowballAnalyzer in Lucene 3.0.2

2014-11-10 Thread Ahmet Arslan
Hi, Regarding Uwe's warning, NOTE: SnowballFilter expects lowercased text. [1] [1] https://lucene.apache.org/core/4_3_0/analyzers-common/org/apache/lucene/analysis/snowball/SnowballFilter.html On Monday, November 10, 2014 4:43 PM, Uwe Schindler u...@thetaphi.de wrote: Hi, Uwe Thanks
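
To make the warning concrete, here is a minimal sketch of the 'pipe one filter into the next' idea using plain Java only. It is not Lucene code: FilterChainSketch, analyze and toyStem are hypothetical names, and toyStem is a deliberately crude stand-in for a stemmer that, like the note about SnowballFilter, assumes lowercased input. It only shows what can go wrong when the lowercasing step is dropped from the chain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

public class FilterChainSketch {

    // Toy stand-in for a stemmer (an assumption, not the Snowball algorithm):
    // strips a trailing lowercase "ing" from tokens longer than five characters.
    static String toyStem(String token) {
        return token.endsWith("ing") && token.length() > 5
                ? token.substring(0, token.length() - 3)
                : token;
    }

    // Splits on whitespace, then passes every token through each filter in order.
    // The lowercasing step is optional so both chains can be compared.
    static List<String> analyze(String text, boolean lowerCase) {
        List<UnaryOperator<String>> filters = new ArrayList<>();
        if (lowerCase) {
            filters.add(t -> t.toLowerCase(Locale.ROOT));
        }
        filters.add(FilterChainSketch::toyStem);

        List<String> out = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            for (UnaryOperator<String> filter : filters) {
                token = filter.apply(token);
            }
            out.add(token);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Indexing INDEXING", true));  // [index, index]
        System.out.println(analyze("Indexing INDEXING", false)); // [Index, INDEXING]
    }
}
```

Without the lowercasing step the upper-case token passes through unstemmed and the two spellings no longer map to the same term, which is the kind of mismatch the note warns about.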

RE: How to disable LowerCaseFilter when using SnowballAnalyzer in Lucene 3.0.2

2014-11-10 Thread Martin O'Shea
Thanks Uwe. -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: 10 Nov 2014 14:43 To: java-user@lucene.apache.org Subject: RE: How to disable LowerCaseFilter when using SnowballAnalyzer in Lucene 3.0.2 Hi, Uwe Thanks for the reply. Given that SnowBallAnalyzer is

Index keeps growing, then shrinks on restart

2014-11-10 Thread Rob Nikander
Hi, I have an index that's about 700 MB, and it grows over days until it causes problems with disk size, at about 5 GB. If the JVM process ends, the index shrinks back to about 700 MB. I'm calling IndexWriter.commit() all the time. What else do you call to get it to compact its use of space?

How to improve the performance in Lucene when query is long?

2014-11-10 Thread Harry Yu
Hi everyone, I have been using Lucene to build a POI search and geocoding system. After testing, I found that when the query is long (above 10 terms), searching is too slow, close to 1 s. I think the bottleneck is that I used OR to generate my BooleanQuery. It would get plenty of