date:20090909

Re: get all tokens from index

2009-09-09 Thread AHMET ARSLAN

> hello all, is there any way to get all > tokens from my index ? please anyone > suggest me The code below prints all terms of a field. String path = "E:\\ThesaurusSolrHome\\data\\index"; String field = "contents"; IndexReader indexReader = IndexReader.open(path);

Re: Filtering question/advice

2009-09-09 Thread Amin Mohammed-Coleman

Hi Thanks for your response. Here is the following testcase: public class UnderwriterReferenceTest { private Directory directory; private Analyzer analyzer; private IndexSearcher indexSearcher; private IndexWriter indexWriter; private Document layerDocumentA; @Before

Re: get all tokens from index

2009-09-09 Thread m.harig

Thanks Ahmet , i found the solution. thanks a lot Ahmet Arslan wrote: > > >> hello all, is there any way to get all >> tokens from my index ? please anyone >> suggest me > > The code below prints all terms of a field. > >String path = "E:\\ThesaurusSolrHome\\data\\index"; >St

RE: New "Stream closed" exception with Java 6

2009-09-09 Thread Chris Bamford

Thanks for your input Mark and Chris. I will take all into account Chris - Original Message - From: Mark Miller Sent: Tue, 8/9/2009 8:06pm To: java-user@lucene.apache.org Subject: Re: New "Stream closed" exception with Java 6 Chris Hostetter wrote: > : I'm coming to the same conclusio

Re: Lucene 2.9 RC2 now available for testing

2009-09-09 Thread Peter Keegan

I've been testing 2.9 RC2 lately and comparing query performance to 2.3.2. I'm seeing a huge increase in throughput (2x-10x) on an index that was built with 2.3.2. The queries have a lot of BoostingTermQuerys and boolean clauses containing a custom scorer. Using JProfiler, I observe that the improv

Re: Lucene 2.9 RC2 now available for testing

2009-09-09 Thread Yonik Seeley

On Wed, Sep 9, 2009 at 8:57 AM, Peter Keegan wrote: > Using JProfiler, I observe that the improvement > is due to a huge reduction in the number of calls to TermDocs.next and > TermDocs.skipTo (about 65% fewer calls). Indexes are searched per-segment now (i.e. MultiTermDocs isn't normally used). O

Re: Lucene 2.9 RC2 now available for testing

2009-09-09 Thread Yonik Seeley

On Wed, Sep 9, 2009 at 9:17 AM, Yonik Seeley wrote: > On Wed, Sep 9, 2009 at 8:57 AM, Peter Keegan wrote: >> Using JProfiler, I observe that the improvement >> is due to a huge reduction in the number of calls to TermDocs.next and >> TermDocs.skipTo (about 65% fewer calls). > > Indexes are searched

Re: Newbie: Luke and fields

2009-09-09 Thread Erick Erickson

It's all in the analyzers. Depending upon which analyzer you use many things happen to the input stream. Casing is one example, but that's just the simplest. Which is why it's so important to use the same analyzer when indexing and querying unless you have a *very* good reason not to. I'd really ad
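To illustrate the point about using the same analyzer for indexing and querying, here is a minimal sketch in plain Java. It does not use Lucene's Analyzer API: the two methods `lowercasing` and `caseSensitive` are hypothetical stand-ins for "an analyzer that lowercases" and "an analyzer that does not", and the set of tokens is a toy substitute for an index.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class AnalyzerMismatchExample {
    /** Toy "analyzer" (hypothetical): splits on whitespace and lowercases each token. */
    public static Set<String> lowercasing(String text) {
        Set<String> tokens = new HashSet<>();
        for (String t : text.trim().split("\\s+")) {
            tokens.add(t.toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    /** Toy "analyzer" (hypothetical): splits on whitespace and keeps the original case. */
    public static Set<String> caseSensitive(String text) {
        Set<String> tokens = new HashSet<>();
        for (String t : text.trim().split("\\s+")) {
            tokens.add(t);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "Index" the text with the lowercasing analyzer.
        Set<String> indexed = lowercasing("Lucene In Action");

        // Same analyzer at query time: the query token matches an indexed token.
        System.out.println(indexed.containsAll(lowercasing("LUCENE")));   // true

        // Different analyzer at query time: "LUCENE" was never indexed as such.
        System.out.println(indexed.containsAll(caseSensitive("LUCENE"))); // false
    }
}
```

A tool that shows the stored terms will display the lowercased tokens, which is why a query analyzed differently can fail to match text that looks identical in the original document.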

Re: Lucene 2.9 RC2 now available for testing

2009-09-09 Thread Peter Keegan

IndexSearcher.search is calling my custom scorer's 'next' and 'doc' methods 64% fewer times. I see no 'advance' method in any of the hot spots. I am getting the same number of hits from the custom scorer. Has the BooleanScorer2 logic changed? Peter On Wed, Sep 9, 2009 at 9:17 AM, Yonik Seeley <

Re: Lucene 2.9 RC2 now available for testing

2009-09-09 Thread Mark Miller

How about the new score inorder/out of order stuff? It was an option before, but I think now it uses what's best by default? And pairs with the collector? I didn't follow any of that closely though. - Mark Peter Keegan wrote: > IndexSearcher.search is calling my custom scorer's 'next' and 'doc' me

Re: Lucene 2.9 RC2 now available for testing

2009-09-09 Thread Michael McCandless

Right, BooleanQuery will now try to use BooleanScorer (does "out of order" collection, which does not use skipTo/advance at all, I think) when possible, instead of BooleanScorer2. This only applies for boolean queries that have only SHOULD clauses, and up to 32 MUST_NOT clauses (if there's even 1

Re: Lucene 2.9 RC2 now available for testing

2009-09-09 Thread Yonik Seeley

On Wed, Sep 9, 2009 at 9:40 AM, Peter Keegan wrote: > IndexSearcher.search is calling my custom scorer's 'next' and 'doc' methods > 64% fewer times. I see no 'advance' method in any of the hot spots'. I am > getting the same number of hits from the custom scorer. > Has the BooleanScorer2 logic chan

Re: Lucene 2.9 RC2 now available for testing

2009-09-09 Thread Peter Keegan

>Is it possible that skipTo is very costly with your custom scorer? It's no more expensive than 'next'. The scorer's 'skipTo' and 'next' methods call termdocs.skipTo or termdocs.next to get the next 'candidate' doc. This just checks a BitVector to find the next non-deleted doc. But the scorer mus

Re: Lucene 2.9 RC2 now available for testing

2009-09-09 Thread Peter Keegan

> http://svn.apache.org/viewvc?view=rev&revision=630698 This may be it. The scorer is sparse and usually in a conjunction with a dense scorer. Does the index format matter? I haven't yet built it with 2.9. Peter On Wed, Sep 9, 2009 at 10:17 AM, Yonik Seeley wrote: > On Wed, Sep 9, 2009 at 9:40 AM

How to calculate the DGaps value in *.del file?

2009-09-09 Thread 関磊

Hello, I want to know how to calculate the DGaps value in *.del file? For example, if there are 8000 bits and only bits 10,12,32 are set, DGaps would be used: (VInt) 1 , (byte) 20 , (VInt) 3 , (byte) 1 I do not understand why the DGaps is 1 and 3. Please tell m
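The numbers in the example above can be reproduced with a short plain-Java sketch. It assumes, as the example implies, that bit i is stored in byte i / 8 at position i % 8, and that each non-zero byte is written as a pair: the distance from the previous non-zero byte's index (starting from 0), then the byte's value. The class and method names are hypothetical and this is not Lucene's own code. Bits 10 and 12 both fall in byte 1 (values 4 + 16 = 20), and bit 32 falls in byte 4 (value 1), so the gaps are 1 - 0 = 1 and 4 - 1 = 3.

```java
import java.util.ArrayList;
import java.util.List;

class DGapsExample {
    /** Packs the given bit positions into bytes: bit i goes to byte i / 8, position i % 8. */
    public static byte[] toBytes(int sizeInBits, int... setBits) {
        byte[] bytes = new byte[(sizeInBits + 7) / 8];
        for (int bit : setBits) {
            bytes[bit >> 3] |= (byte) (1 << (bit & 7));
        }
        return bytes;
    }

    /** Returns alternating (gap, byte value) pairs, one pair per non-zero byte. */
    public static List<Integer> dgaps(byte[] bytes) {
        List<Integer> out = new ArrayList<>();
        int last = 0; // index of the previous non-zero byte (0 before the first one)
        for (int i = 0; i < bytes.length; i++) {
            if (bytes[i] != 0) {
                out.add(i - last);        // the gap, written as a VInt in the example
                out.add(bytes[i] & 0xFF); // the byte value itself
                last = i;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // 8000 bits with only bits 10, 12 and 32 set, as in the question.
        System.out.println(dgaps(toBytes(8000, 10, 12, 32))); // prints [1, 20, 3, 1]
    }
}
```

So the 1 and the 3 are differences between byte indexes, not bit positions, which is what makes the encoding compact when few bits are set.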

Lucene 2.9 RC3 now available for testing

2009-09-09 Thread Mark Miller

Hello Lucene users, On behalf of the Lucene dev community (a growing community far larger than just the committers) I would like to announce the third release candidate for Lucene 2.9. Please download and check it out – take it for a spin and kick th

support for PayloadTermQuery in MoreLikeThis

2009-09-09 Thread Bill Au

Has anyone done anything regarding the support of PayloadTermQuery in MoreLikeThis? I took a quick look at the code and it seems to be simply a matter of swapping TermQuery with PayloadTermQuery. I guess a generic solution would be to add an enable method to enable PayloadTermQuery, keeping TermQu

IndexReader.isCurrent for cached indexes

2009-09-09 Thread Nick Bailey

Looking for some help figuring out a problem with the IndexReader.isCurrent() method and cached indexes. We have a number of lucene indexes that we attempt to keep in memory after an initial query is performed. In order to prevent the indexes from becoming stale, we check for changes about e

NumberFormatException when creating field cache

2009-09-09 Thread Antony Bowesman

I'm using Lucene 2.3.2 and have a date field used for sorting, which is MMDDHHMM. I get an exception when the FieldCache is being generated as follows: java.lang.NumberFormatException: For input string: "190400-412317" java.lang.NumberFormatException.forInputString(NumberFormatException.jav

Re: NumberFormatException when creating field cache

2009-09-09 Thread Mark Miller

Antony Bowesman wrote: > I'm using Lucene 2.3.2 and have a date field used for sorting, which > is MMDDHHMM. I get an exception when the FieldCache is being > generated as follows: > > java.lang.NumberFormatException: For input string: "190400-412317" > java.lang.NumberFormatException.forInput

Re: How to avoid huge index files

2009-09-09 Thread Dvora

Hello again, Can someone please comment on that, whether what I'm looking for is possible or not? Dvora wrote: > > Hello, > > I'm using Lucene 2.4. I'm developing a web application that uses Lucene > (via compass) to do the searches. > I'm intending to deploy the application in Google App Engine
