Re: Ask about method QueryParser.parser

2005-11-09 Thread Karl Øie
Sounds very strange, have you debugged the input string s? Where does it come from? Karl On 9. nov. 2005, at 05.00, Hai Do Thanh wrote: Dear all, I really appreciate your work on Lucene. It is apparently a helpful API for my project on indexed Document searching. On the whole, it works

Re: RangeQuery over many indexed documents seems to be buggy

2005-11-09 Thread Erik Hatcher
On 9 Nov 2005, at 08:43, Joachim Rösener wrote: sex:female AND birthday:[19800101 TO 19810101] This gives the following results: 1980-1981: found 424 women. 1981-1982: found 329 women. 1982-1983: found 237 women. 1983-1984: found 232 women. 1984-1985: found 175 women. To prove that it works, a

Re: RangeQuery over many indexed documents seems to be buggy

2005-11-09 Thread Joachim Rösener
On Wednesday, 09.11.2005, at 08:53 -0500, Erik Hatcher wrote: On 9 Nov 2005, at 08:43, Joachim Rösener wrote: [...] Can you explain, maybe fix this? ah, the lure of young women ;) What else?! :-) Is it perhaps you're getting an exception and eating it somewhere along the way? How

Re: RangeQuery over many indexed documents seems to be buggy

2005-11-09 Thread Yonik Seeley
The limited number of terms in a range query should hopefully be addressed before Lucene 1.9 comes out. I'd give you a reference to the bug, but JIRA seems to be down at the moment. Search for ConstantScoreRangeQuery if interested. -Yonik Now hiring -- http://forms.cnet.com/slink?231706

going from Document -> IndexReader's docid

2005-11-09 Thread tlittell
If I have a Document object (doc), and I also have an IndexReader open, how can I find out IndexReader's docid corresponding to (doc)? IndexReader has a map from docid -> Document, but I don't see the reverse. thanks in advance, Todd

Re: going from Document -> IndexReader's docid

2005-11-09 Thread Yonik Seeley
There really isn't a generic way... you have to search for the document. If you have a unique id field in your document, you can find the document id quickly via IndexReader.termDocs(term). -Yonik On 11/9/05, [EMAIL PROTECTED] [EMAIL PROTECTED]

A lot of short documents, optimal query?

2005-11-09 Thread eks dev
Hi all, Can somebody please suggest a way to optimize the execution time of the query below (or to use some of the Trunk BooleanScorers)... I am probably missing something obvious. Use Case: Here I have names of people with query expansion for individual tokens (not using Fuzzy Query) that should be

Re: going from Document -> IndexReader's docid

2005-11-09 Thread Erik Hatcher
The question is, how did you get that Document? If you got it from Hits, you can get the document id from Hits.id(hit_num). Erik On 9 Nov 2005, at 11:13, Yonik Seeley wrote: There really isn't a generic way... you have to search for the document. If you have a unique id field in

Re: A lot of short documents, optimal query?

2005-11-09 Thread Chris Hostetter
: ( : +( : (+raimonds +marschan) : (+raimonds +marschol) : (+raimonds +marschel) : (+raimonds +marschalfr) : (+raimonds +marschalek) : (+raimonds +marscha) : ... : ) : +(ZIPS:22* ZIPS:21* ZIPS:20* ZIPS:23* ZIPS:245* : ZIPS:246* ZIPS:247* ZIPS:240* ZIPS:241* ZIPS:242* : ZIPS:243* ZIPS:254*

Re: going from Document -> IndexReader's docid

2005-11-09 Thread tlittell
Ahh, thank you very much. That's exactly what I needed, I just didn't see that in the API. cheers, Todd The question is, how did you get that Document? If you got it from Hits, you can get the document id from Hits.id(hit_num). Erik On 9 Nov 2005, at 11:13, Yonik Seeley wrote:

Encountered EOF using queryparser in XSP

2005-11-09 Thread Tricia Williams
Hi All, I'm using an HTML form to send a query to an XSP which uses Lucene to search and then returns the results as XML. Perhaps someone has experienced the problem that I'm currently experiencing. When the query is parsed, org.apache.lucene.queryParser.ParseException is thrown stating that

efficiently finding all terms used on a particular field within Documents matching a query

2005-11-09 Thread Matt Magoffin
I've used the example posted at http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-a801793d7479264e29157d92440199b35266dc18 to find all terms used in a complete index, but was wondering if there is an efficient way to find all terms used within only a set of Documents matching a query? For

Memory Usage

2005-11-09 Thread Daniel Noll
Hi. What is the expected memory usage of Lucene these days? I dug up an old email [1] from 2001 which gave the following summary of memory usage: An IndexReader requires: one byte per field per document in index (norms); one open file per file in index; 1/128 of the Terms in the index; a

Search Help

2005-11-09 Thread Daniel . Clark
Is there a way to limit the number of hits I want returned? Sometimes I just want one document. ~ Daniel Clark, Senior Consultant Sybase Federal Professional Services 6550 Rock Spring Drive, Suite 800 Bethesda, MD 20817 Office - (301) 896-1103 Office Fax

Re: Memory Usage

2005-11-09 Thread Marvin Humphrey
On Nov 9, 2005, at 4:48 PM, Daniel Noll wrote: My question is: is this 1/128 figure set in stone, or can it be changed without major consequences? You want indexInterval. Here's an excerpt from the docs in TermInfosWriter. // TODO: the default values for these two parameters //
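To make the 1/128 figure from this thread concrete, here is a minimal arithmetic sketch in plain Java. It does not call Lucene; the class and method names are hypothetical, and it assumes that roughly every indexInterval-th term is kept in memory, so treat the result as an approximation rather than what a given Lucene version actually allocates.

```java
// Rough estimate only: assumes about one in every `indexInterval` terms
// is held in memory by an IndexReader (the "1/128" figure in the thread).
// TermIndexEstimate and termsHeldInMemory are hypothetical names.
final class TermIndexEstimate {

    // Ceiling division: number of terms sampled at the given interval.
    static long termsHeldInMemory(long totalTerms, int indexInterval) {
        if (totalTerms < 0 || indexInterval <= 0) {
            throw new IllegalArgumentException("totalTerms >= 0 and indexInterval > 0 required");
        }
        return (totalTerms + indexInterval - 1) / indexInterval;
    }

    public static void main(String[] args) {
        long totalTerms = 1000000L; // illustrative figure, not from the thread
        System.out.println("interval 128: " + termsHeldInMemory(totalTerms, 128));
        System.out.println("interval 256: " + termsHeldInMemory(totalTerms, 256));
    }
}
```

Under this assumption, doubling the interval roughly halves the number of terms held in memory; the cost is that locating a term then means scanning further past the nearest in-memory entry.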

Sorting: string vs int

2005-11-09 Thread Monsur Hossain
Hi all. I have a question about sorting. Lucene in Action says: For numeric types, each field being sorted for each document in the index requires that four bytes be cached. For String types, each unique term is also cached for each document. I want to make sure I'm understanding this

Re: Search Help

2005-11-09 Thread Erik Hatcher
On 9 Nov 2005, at 19:54, [EMAIL PROTECTED] wrote: Is there a way to limit the number of hits I want returned? Sometimes I just want one document. Is there an issue with just accessing hits.doc(0) in this case? Erik

Re: Sorting: string vs int

2005-11-09 Thread Yonik Seeley
The FieldCache (which is used for sorting), uses arrays of size maxDoc() to cache field values. String sorting will involve caching a String[] (or StringIndex) and int sorting will involve caching an int[]. Unique string values are shared in the array, but the String values plus the String[]
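The sizes described in this thread can be put into a back-of-the-envelope calculation. The sketch below is plain Java and does not call Lucene; the class name, method names, and the per-reference and per-String overhead parameters are assumptions for illustration, since actual JVM object sizes vary.

```java
// Back-of-the-envelope sort cache sizes, following the thread:
// an int sort caches an int[maxDoc]; a String sort caches a String[maxDoc]
// plus the unique String values, which are shared between documents.
// SortCacheEstimate and its parameters are hypothetical names.
final class SortCacheEstimate {

    // int[] of size maxDoc: four bytes per document.
    static long intSortBytes(long maxDoc) {
        return 4L * maxDoc;
    }

    // String[] of size maxDoc (one reference per document) plus unique values.
    // refBytes and perStringOverheadBytes are JVM-dependent assumptions;
    // each char is counted as two bytes.
    static long stringSortBytes(long maxDoc, long uniqueTerms, int avgTermChars,
                                int refBytes, int perStringOverheadBytes) {
        long arrayBytes = maxDoc * refBytes;
        long valueBytes = uniqueTerms * (perStringOverheadBytes + 2L * avgTermChars);
        return arrayBytes + valueBytes;
    }

    public static void main(String[] args) {
        long maxDoc = 1000000L; // illustrative figures, not from the thread
        System.out.println("int sort bytes:    " + intSortBytes(maxDoc));
        System.out.println("String sort bytes: " + stringSortBytes(maxDoc, 50000L, 10, 4, 40));
    }
}
```

With these made-up numbers the String sort costs more than the int sort mainly because of the unique values; when a field has few unique terms, the two come out much closer.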

Re: Ask about method QueryParser.parser

2005-11-09 Thread Hai Do Thanh
Thanks for your reply :) I have already debugged the input string s. As I said before, s is a string which is sent by the client through the doPost() method of the servlet. At first, I thought that the analyzer was the cause of the problem and that it lowercased all letters. However, I have also

Re: efficiently finding all terms used on a particular field within Documents matching a query

2005-11-09 Thread Chris Hostetter
: For example I would like to find the set of terms used within a particular : date range, where all Documents have a date field on them. What I've done : to date is simply perform a query to find all Documents that match the : date range query, then iterate over each one and construct a Set of