Re: Recreating index lucene without stopping client applications

2018-07-18 Thread Michael McCandless
If you use IndexWriter.deleteAll, and not any of the other delete by Query, Term methods, it should be quite efficient to delete, as IndexWriter just drops all segments. That API is also transactional, so you could call IW.deleteAll, proceed to reindex all your documents, and if somehow that crash

Lucene Speed

2018-07-18 Thread Ehson Umrani
Hello, My name is Ehson Umrani and I am currently running some experiments using Lucene. For the experiments I am running I need Lucene to run as fast as possible. Do you have any suggestions on how to achieve the speeds listed on the nightly benchmark page? I am also using 1 KB Wikipedia files and

Re: Lucene Speed

2018-07-18 Thread Adrien Grand
Have you already checked https://wiki.apache.org/lucene-java/ImproveIndexingSpeed? Often when running such benchmarks, the bottleneck is not indexing but opening or parsing input files, so you should review that part as well. On Wed, Jul 18, 2018 at 16:12, Ehson Umrani wrote: > Hello, > > My

Re: Lucene Speed

2018-07-18 Thread Michael McCandless
Hi Ehson, Have you looked at the luceneutil source code that runs the benchmarks? https://github.com/mikemccand/luceneutil The sources are not super clean, but that's what's running the nightly benchmarks, starting from src/main/perf/Indexer.java. Mike McCandless http://blog.mikemccandless.com

Lucene Performance Tuning

2018-07-18 Thread Hicks, Matt
I am seeing serious performance differences with three slightly varied queries: https://gist.github.com/darkfrog26/de19959db854aaf30957d64d1730d07f Can anyone explain why this might be happening and any tips to optimize it? Most queries are lightning fast, but ones like "Smith Mark D" are taking

Lucene BooleanQuery with some TermQuery's having BooleanClause.Occur set MUST for all

2018-07-18 Thread baris . kazar
Hi, I have an indexed field containing "$word1 word2", and I want to find the docs having these two words first in my first query. I have another indexed field, but I am not searching on that second field for this first query, which is a BooleanQuery with two TermQuery clauses having BooleanClause.Occu

Re: Not getting desired result through TermQuery

2018-07-18 Thread Michael Sokolov
It's impossible to tell for sure from the info you provided -- attachments are not included in messages on this mailing list -- but my guess is that when you use the QueryParser API you are getting a query that has the benefit of text processing using an Analyzer (lower-casing and other text transfo
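
The mismatch described above can be illustrated with a toy model. This is only a sketch in plain Java, not Lucene code: the class name AnalysisMismatchSketch and the methods analyze, index, exactTermLookup and analyzedLookup are hypothetical, and the "analyzer" is a simplified stand-in (split on non-alphanumerics, lower-case, strip accents) rather than Lucene's actual analysis chain. It shows why a raw, unanalyzed term can miss while the same text passed through the index-time analysis finds a hit.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class AnalysisMismatchSketch {

    // Toy stand-in for an analysis chain (assumption, NOT Lucene's analyzer):
    // strip accents, split on anything that is not a letter or digit, lower-case.
    static List<String> analyze(String text) {
        String folded = Normalizer.normalize(text, Normalizer.Form.NFD)
                .replaceAll("\\p{M}", "");
        List<String> tokens = new ArrayList<>();
        for (String piece : folded.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // "Indexing" here just stores the analyzed tokens of one field value.
    static Set<String> index(String fieldValue) {
        return new HashSet<>(analyze(fieldValue));
    }

    // Models an unanalyzed term lookup: the term must match a stored token exactly.
    static boolean exactTermLookup(Set<String> terms, String term) {
        return terms.contains(term);
    }

    // Models a parsed query: the query text goes through the same analysis first.
    static boolean analyzedLookup(Set<String> terms, String queryText) {
        List<String> tokens = analyze(queryText);
        return !tokens.isEmpty() && terms.containsAll(tokens);
    }

    public static void main(String[] args) {
        Set<String> terms = index("$Smith M\u00e1rk D");
        System.out.println("stored tokens: " + terms);
        System.out.println("exact 'Smith':   " + exactTermLookup(terms, "Smith"));
        System.out.println("exact '$smith':  " + exactTermLookup(terms, "$smith"));
        System.out.println("exact 'smith':   " + exactTermLookup(terms, "smith"));
        System.out.println("analyzed '$Smith Mark': " + analyzedLookup(terms, "$Smith Mark"));
    }
}
```

In this toy model the exact lookup only succeeds when the caller supplies precisely the token that analysis produced at index time, which is the usual first thing to check when a hand-built term query finds nothing while a parsed query does.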

Re: Not getting desired result through TermQuery

2018-07-18 Thread baris . kazar
My problem seems similar to this one. I make sure the index is all lower-cased and the TermQuery search term is also lower-cased. I tokenize the search string, since the index uses StandardTokenizer, StandardFilter, LowerCaseFilter and ASCIIFoldingFilter. My index uses StandardTokenizer and

does $ mean something in Lucene index and MultiFieldQueryParser

2018-07-18 Thread baris . kazar
It seems that I can't see the $ in my query string when I print it out from MultiFieldQueryParser, but the search string has $ in it and it finds hits. On the other hand, the TermQuery-based BooleanQuery keeps the $ and gets no hits. I use $ for a starts-with effect. Best regards ---