Re: Lucene Optimization

2016-07-29 Thread Parit Bansal
On 07/13/2016 12:43 AM, Siraj Haider wrote: We currently use Lucene 2.9 and to keep the indexes running faster we optimize the indexes during the night. In our application the volume of new documents coming in is very high so most of our indexes have to merge segments during the day too, when the

Re: get enumeration of all terms starting at a given term after lucene 4

2016-07-29 Thread Parit Bansal
On 07/29/2016 08:27 AM, Mukul Ranjan wrote: lucene version from lucene 3.6 to lucene 5.5.2. After 3.6, indexReader terms api is removed which used to give list of terms. I have used below code to get the termEnum, but it has no option to pass the value of the field which is used to get the

Re: get enumeration of all terms starting at a given term after lucene 4

2016-07-29 Thread Parit Bansal
, PrefixTermsEnum is removed in lucene 5.1 so we can not use this now. Thanks, Mukul -Original Message- From: Parit Bansal [mailto:Parit.Bansal@sib.swiss] Sent: Friday, July 29, 2016 3:20 PM To: java-user@lucene.apache.org Subject: Re: get enumeration of all terms starting at a given term after

Re: Filter strategy in Lucene 6.0

2016-08-03 Thread Parit Bansal
Hi, Could you point to some resource where I can read about two-phase iterators in slightly more depth? I am still confused about how exactly they work. - Best Parit On 08/02/2016 07:07 PM, Andres de la Peña wrote: Thanks Adrien, this is very helpful. I have just read your

Re: Help regarding BM25Similarity

2018-01-05 Thread Parit Bansal
Hi Robert, passing b = 0 will influence the similarity across all the fields (no?). I wanted it to be field specific. I think Uwe's suggestion of not indexing norms for specific fields should work better. Thankx again. - Best Parit Bansal On 01/04/2018 08:34 PM, Robert Muir wrote: You
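For readers following the b = 0 discussion, here is a minimal sketch of the textbook BM25 term-frequency component, showing why b = 0 removes the influence of document length. This is not Lucene's BM25Similarity code (Lucene also stores field lengths in a lossy encoded form); the class name, method name, and sample numbers are made up for illustration.

```java
public class Bm25LengthNormSketch {

    // Textbook BM25 term-frequency component (an assumption for illustration,
    // not copied from Lucene):
    //   freq * (k1 + 1) / (freq + k1 * (1 - b + b * docLen / avgDocLen))
    public static double tfNorm(double freq, double k1, double b,
                                double docLen, double avgDocLen) {
        // With b = 0 this factor is always 1, so document length drops out.
        double lengthFactor = 1.0 - b + b * (docLen / avgDocLen);
        return freq * (k1 + 1.0) / (freq + k1 * lengthFactor);
    }

    public static void main(String[] args) {
        double k1 = 1.2;          // common default value
        double avgDocLen = 100.0; // hypothetical average field length
        for (double docLen : new double[] {10.0, 100.0, 1000.0}) {
            System.out.printf("docLen=%.0f  b=0.75: %.4f  b=0: %.4f%n",
                    docLen,
                    tfNorm(2.0, k1, 0.75, docLen, avgDocLen),
                    tfNorm(2.0, k1, 0.0, docLen, avgDocLen));
        }
    }
}
```

With b = 0.75 the value shrinks as the document gets longer; with b = 0 it is the same for every length. That length-independent behaviour is what the thread wants for individual fields only, not for all of them.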

Re: Help regarding BM25Similarity

2018-01-05 Thread Parit Bansal
Hi Robert, passing b = 0 will influence the similarity across all the fields (no?). I wanted it to be field specific. I think Uwe's suggestion of not indexing norms for specific fields should work better. - Best Parit Bansal On 01/04/2018 08:34 PM, Robert Muir wrote: You don't need to do

Re: Help regarding BM25Similarity

2018-01-05 Thread Parit Bansal
Hi Uwe, You are right. Thankx! :) - Best Parit Bansal On 01/04/2018 05:02 PM, Uwe Schindler wrote: How about just indexing the field without norms? Uwe Am January 4, 2018 3:58:27 PM UTC schrieb Parit Bansal <Parit.Bansal@sib.swiss>: Hi, I am trying to tweak BM25Similarity for my us

Re: Lucene with Database

2018-01-05 Thread Parit Bansal
lot more). This approach is helpful in our use case as we have write-once index/database. Hope this helps. - Best Parit Bansal On 12/28/2017 06:35 AM, Kumar, Santosh wrote: Hi Trejkaz, Evert, Riccardo, Thank you for your inputs. We have an application which we plan to migrate to Cloudfoundry a

Re: Help regarding BM25Similarity

2018-01-05 Thread Parit Bansal
Thankx Adrien. I'll try this approach too. - Best Parit Bansal On 01/05/2018 10:43 AM, Adrien Grand wrote: You can use PerFieldSimilarityWrapper to have different BM25 settings per field. Le ven. 5 janv. 2018 à 10:37, Parit Bansal <Parit.Bansal@sib.swiss> a écrit : Hi Robert, passing

Re: High CPU usage observed while searching with lucene 6.2.1

2018-01-05 Thread Parit Bansal
Hi Jay, I have used 6.2.1 previously and I didn't see any specific high CPU usage. It would be good if you could debug your indexing process via VisualVM or a similar tool to pinpoint where Lucene is spending most of the time. Hope this helps. - Best Parit Bansal On 01/04/2018 12:25 PM

Help regarding BM25Similarity

2018-01-04 Thread Parit Bansal
in BM25Similarity. In ClassicSimilarity, the same was possible by overriding the lengthNorm method. Is there a workaround in BM25Similarity? Is there a possibility of making these methods non-final in new releases? - Best Parit Bansal

Dubious tokenizing with WordDelimiterGraphFilter

2018-01-22 Thread Parit Bansal
token i.e. cg7582pa be 0 instead of 1? 3. Why is the last token i.e. pa given a position of 2 and not 1? Looking forward to your suggestions. - Best Parit Bansal

WordDelimiterIterator word splitting usecase

2017-12-22 Thread Parit Bansal
ssibility of a patch/refactoring to fix isBreak() to use some new configuration flags? - Best Parit Bansal (Developer www.uniprot.org)