Re: Norm Value of not existing Field

2009-12-04 Thread Benjamin Heilbrunn
Erick, I'm not sure if I understand you right. What do you mean by "spinning through all the terms on a field"? It would be an option to load all unique terms of a field by using TermEnum, then use TermDocs to get the docs for those terms. The rest of the docs don't contain a term, and so you know,
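The set logic behind this approach can be shown without Lucene at all: union the doc ids of every term in the field, then take the complement. A self-contained sketch in plain Java, where a toy postings map stands in for TermEnum/TermDocs (all names and numbers are invented for illustration):

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

public class MissingFieldDemo {
    // Toy stand-in for TermEnum/TermDocs: each term of the field maps to
    // the doc ids that contain it.
    static BitSet docsWithoutField(Map<String, int[]> postings, int maxDoc) {
        BitSet hasField = new BitSet(maxDoc);
        for (int[] docs : postings.values())   // "spin through all the terms"
            for (int d : docs)
                hasField.set(d);               // union of every term's docs
        hasField.flip(0, maxDoc);              // complement = docs lacking the field
        return hasField;
    }

    public static void main(String[] args) {
        Map<String, int[]> postings = new HashMap<>();
        postings.put("lucene", new int[] {0, 2});
        postings.put("search", new int[] {2, 3});
        // docs 1 and 4 never appear under any term of the field
        System.out.println(docsWithoutField(postings, 5)); // {1, 4}
    }
}
```

With real Lucene you would drive the same union from TermEnum/TermDocs over one field; the arithmetic is identical.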

How to include some more fields to be indexed in the file document class?

2009-12-04 Thread DHIVYA M
Hi all,   I am using Lucene 2.3.2. I would like to include some more fields to be indexed other than the available ones.   In the FileDocument class of the Lucene 2.3.2 demo there are only three fields added to the documents to be indexed.   Ex: doc.add(new Field(path..

Re: How to include some more fields to be indexed in the file document class?

2009-12-04 Thread Anshum
Hi Dhivya, So are you using the same demo code for your app? In case you are, you have to modify that code and continue. All said and done, you'd have to add fields in your java file and recompile (in case you are already using some code for that purpose). In case you would be starting to write an
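For reference, adding an extra field to the demo's FileDocument is one line per field. A hedged sketch against the Lucene 2.3 API (the "author" field and the getAuthor() helper are invented for illustration; note that Lucene 2.3 uses Field.Index.TOKENIZED/UN_TOKENIZED, which later releases renamed ANALYZED/NOT_ANALYZED):

```java
// Inside FileDocument.Document(File f), alongside the existing
// path/modified/contents fields. "author" and getAuthor() are
// hypothetical -- substitute whatever data you want indexed.
doc.add(new Field("author", getAuthor(f),
                  Field.Store.YES, Field.Index.TOKENIZED));
doc.add(new Field("filesize", String.valueOf(f.length()),
                  Field.Store.YES, Field.Index.UN_TOKENIZED));
```

After editing, recompile the demo sources as Anshum says; the jar itself doesn't need to be rebuilt if your modified class is first on the classpath.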

updating index

2009-12-04 Thread m.harig
hello all, how do I update my existing index to avoid duplicates? This is how I am doing my indexing: doc.add(new Field("id", "" + i, Field.Store.YES, Field.Index.NOT_ANALYZED)); doc.add(new Field("title", indexForm.getTitle(), Field.Store.YES,

Re: updating index

2009-12-04 Thread Ian Lea
writer.updateDocument(new Term("id", "" + i), doc); Read the javadocs! Haven't we been here before? -- Ian.
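Ian's one-liner in context. A hedged sketch of the de-duplicating loop, with the field names taken from the thread (indexForm is the questioner's own object, not a Lucene class); updateDocument atomically deletes any existing documents matching the given term and then adds the new one:

```java
// "id" must be indexed NOT_ANALYZED so the delete-by-term matches exactly.
Document doc = new Document();
doc.add(new Field("id", "" + i, Field.Store.YES, Field.Index.NOT_ANALYZED));
doc.add(new Field("title", indexForm.getTitle(),
                  Field.Store.YES, Field.Index.ANALYZED));
// Replaces any earlier document whose "id" term equals "" + i,
// instead of piling up duplicates the way addDocument() would.
writer.updateDocument(new Term("id", "" + i), doc);
```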

Re: IndexDivisor

2009-12-04 Thread Michael McCandless
I'm confused -- what are these attachments? Output from a memory profiler? Can you post the app you created? Mike On Fri, Dec 4, 2009 at 12:24 AM, Ganesh emailg...@yahoo.co.in wrote: Thanks mike.. Please find the attached file. I ran the testing for 1,100,1000,1 divisor value.  There

Re: Norm Value of not existing Field

2009-12-04 Thread Erick Erickson
The word "Filter" as part of a class name is overloaded in Lucene <G> See: http://lucene.apache.org/java/2_9_1/api/all/index.html The above filter is just a DocIdSet, one bit per document. So in your example, you're only talking 12M or so, even if you create one filter for every field and keep it

Re: How to do relevancy ranking in lucene

2009-12-04 Thread Erick Erickson
Hmmm, I don't know the underlying scoring code well enough to answer off the top of my head. But if you have the source code, I'd examine the junit tests (the class names should give you a strong hint) and start from there. Best Erick On Fri, Dec 4, 2009 at 12:15 AM, DHIVYA M

Re: IndexDivisor

2009-12-04 Thread Ganesh
I didn't run it with a profiler. I created a test app and ran that. I am opening multiple databases. IndexReader opened with IndexDivisor: 100 // Open the reader with the divisor value TermCount: 7046764 // Available unique terms in the db Warmup done:

searchWithFilter bug?

2009-12-04 Thread Peter Keegan
I'm having a problem with 'searchWithFilter' in Lucene 2.9.1. The Filter wraps a simple BitSet. When doing a 'MatchAllDocs' query with this filter, I get only a subset of the expected results, even accounting for deletes. The index has 10 segments. In IndexSearcher.searchWithFilter, it looks like

Re: searchWithFilter bug?

2009-12-04 Thread Michael McCandless
That doesn't sound good. Though, in searchWithFilter, we seem to ask for the Query's scorer and the Filter's docIdSetIterator using the same reader (which may be top-level, for the legacy case, or per-segment, for the normal case). So I'm not [yet] seeing where the issue is... Can you boil it

Re: searchWithFilter bug?

2009-12-04 Thread Peter Keegan
I think the Filter's docIdSetIterator is using the top-level reader for each segment, because the cardinality of the DocIdSet from which it's created is the same for all readers (and what I expect to see at the top level). Peter

Re: searchWithFilter bug?

2009-12-04 Thread Simon Willnauer
Peter, which filter do you use? Do you respect the IndexReader's maxDoc() and the docBase? simon

Re: searchWithFilter bug?

2009-12-04 Thread Peter Keegan
The filter is just a java.util.BitSet. I use the top-level reader to create the filter, and call IndexSearcher.search(Query, Filter, HitCollector). So there is no 'docBase' at this level of the API. Peter

Re: searchWithFilter bug?

2009-12-04 Thread Simon Willnauer
-- Forwarded message -- From: Simon Willnauer simon.willna...@googlemail.com Date: Fri, Dec 4, 2009 at 6:53 PM Subject: Re: searchWithFilter bug? To: Peter Keegan peterlkee...@gmail.com Peter, since search is per segment you need to use the segment reader passed in during search
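Simon's point can be shown with plain java.util.BitSet arithmetic, no Lucene required: in 2.9's per-segment search, a filter bit set built against the top-level reader must be shifted by each segment's docBase before its bits line up with per-segment doc ids. A self-contained sketch (segment sizes and doc ids below are invented):

```java
import java.util.BitSet;

public class DocBaseDemo {
    // Toy model of mapping a top-level filter onto one segment: segment
    // doc i corresponds to top-level doc (docBase + i).
    static BitSet segmentView(BitSet topLevel, int docBase, int segMaxDoc) {
        BitSet seg = new BitSet(segMaxDoc);
        for (int i = 0; i < segMaxDoc; i++) {
            if (topLevel.get(docBase + i))
                seg.set(i);                // shift by the segment's docBase
        }
        return seg;
    }

    public static void main(String[] args) {
        BitSet top = new BitSet();
        top.set(3);
        top.set(12);   // top-level ids 3 and 12 pass the filter
        // two segments of 10 docs each: docBase 0 and docBase 10
        System.out.println(segmentView(top, 0, 10));  // {3}
        System.out.println(segmentView(top, 10, 10)); // {2}  (top-level 12)
    }
}
```

Without the shift, segment 1 would test its local ids 0..9 against top-level bits 0..9 and silently drop every match past the first segment, which is exactly the "subset of expected results" Peter observes.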

Re: searchWithFilter bug?

2009-12-04 Thread Michael McCandless
On Fri, Dec 4, 2009 at 12:53 PM, Simon Willnauer simon.willna...@googlemail.com wrote: @Mike: maybe we should add a testcase / method in TestFilteredSearch that searches on more than one segment. I agree, we should -- wanna cough up a patch? Mike

Re: searchWithFilter bug?

2009-12-04 Thread Simon Willnauer
On Fri, Dec 4, 2009 at 7:09 PM, Michael McCandless luc...@mikemccandless.com wrote: I agree, we should -- wanna cough up a patch? Working on

Re: How to include some more fields to be indexed in the file document class?

2009-12-04 Thread DHIVYA M
Thanks for the suggestion, sir. But I wrote the Document method of the FileDocument class here in my indexing program, so that it will look at this method rather than referring to the one from the jar file.   Updating the jar by creating a class again seemed too time-consuming for me, so I did it this way.   Thanks,