Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-12 Thread Paul Elschot
On Friday 12 November 2004 07:57, Sanyi wrote: That's the point: there is no query optimizer in Lucene. Sorry, I'm not very much into Lucene's internal classes, I'm just telling you the viewpoint of a user. You know my users aren't technicians, so answers like yours won't make them happy.

Re: Phrase search for more than 4 words throws exception in QueryParser

2004-11-12 Thread Sanyi
It works for me too on Linux. Thanks for the test! --- Morus Walter [EMAIL PROTECTED] wrote: Sanyi writes: How to perform phrase searches for more than four words? This works well with 1.4.2: "aa bb cc dd" I pass the query as a command line parameter on XP: \"aa bb cc dd\"

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-12 Thread Sanyi
It is normally possible to reduce the number of such complaints a lot by imposing a minimum prefix length I've already limited it to a minimum of 5 characters (abcde*). I can still easily find (on the first try) situations where it starts to search for minutes. While another 5 char. partial
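The minimum-prefix rule discussed in this message can be sketched as a check applied to a term before it is handed to the query parser. This is only an illustration in plain Java, not part of Lucene: the class name, the method names, and the limit of 5 (taken from the message) are assumptions.

```java
final class WildcardPrefixCheck {
    // Hypothetical limit; the message above mentions a minimum of 5 characters.
    static final int MIN_PREFIX = 5;

    /** Returns the index of the first '*' or '?' in the term, or -1 if there is none. */
    static int firstWildcard(String term) {
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (c == '*' || c == '?') {
                return i;
            }
        }
        return -1;
    }

    /** Terms without wildcards always pass; wildcard terms need a long enough literal prefix. */
    static boolean isAllowed(String term) {
        int wildcardAt = firstWildcard(term);
        return wildcardAt < 0 || wildcardAt >= MIN_PREFIX;
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("abcde*")); // prints true
        System.out.println(isAllowed("abc*"));   // prints false
        System.out.println(isAllowed("abc"));    // prints true
    }
}
```

As the message points out, a long prefix does not bound the number of expanded terms by itself, so a check like this reduces the problem rather than removing it.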

RE: HTMLParser.getReader returning null

2004-11-12 Thread Vanlerberghe, Luc
If you use the Field.Text(String name, Reader value) version of the Field.Text constructor, the field is tokenized and indexed but *not* stored. This means you will be able to search and find that document, but to know the original contents you will have to store a copy of it elsewhere. The

Faster highlighting with TermPositionVectors PROBLEM

2004-11-12 Thread Miro Max
Hi all, I'm trying to implement this method and I need some help. I've downloaded the new lucene-cvs version, compiled it with ant and I get the following error message: The method getOffsets(int) is undefined for the type Object. Can anyone tell me why? thx miro

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-12 Thread Giulio Cesare Solaroli
Hi all, I am also cross-posting my reply to the developer list because I think some of my arguments belong there. I was thinking about somehow extending the PhraseQuery analyzer in order to better handle wildcard character expansion. Sanyi's idea to optimize the expansion of the terms to include just the

Re: getting error message

2004-11-12 Thread Erik Hatcher
On Nov 11, 2004, at 4:34 PM, Hetan Shah wrote: Does anyone know what the following error message means? Yeah, it means an object you're trying to access is null. :)) You'll have to look at the JSP and see what object that is to troubleshoot. Erik TIA. -H root cause

Re: HTMLParser.getReader returning null

2004-11-12 Thread Luke Shannon
Hi; I am using the HTMLParser that comes with the latest version of Lucene (in the demo). Here is the import line: import org.apache.lucene.demo.html.HTMLParser; If you have lucene-demos-1.4-final.jar in your class path the system will find the Parser Class. I am happy with the results. Let

Re: HTMLParser.getReader returning null

2004-11-12 Thread Luke Shannon
Ah. That would explain it. Thank you Luc. - Original Message - From: Vanlerberghe, Luc [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Friday, November 12, 2004 5:41 AM Subject: RE: HTMLParser.getReader returning null If you use the Field.Text(String name, Reader value)

Re: Lucene : avoiding locking

2004-11-12 Thread Luke Francl
Luke, I also integrated Lucene into a content management application with incremental updates and ran into the same problem you did. You need to make sure that only one process or thread ever writes to the index (which means no multiple copies of the application writing to the index simultaneously).
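The single-writer advice can be sketched inside one JVM by routing every update through one lock. This is a minimal illustration in plain Java, not Lucene code: SingleWriterIndex is a hypothetical class and the list stands in for the index. It does not protect against a second process writing to the same index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

final class SingleWriterIndex {
    private final ReentrantLock writeLock = new ReentrantLock();
    // Stand-in for the on-disk index; a real application would open and close a writer here.
    private final List<String> documents = new ArrayList<>();

    /** All updates go through this method, so only one thread writes at a time. */
    void update(String document) {
        writeLock.lock();
        try {
            documents.add(document);
        } finally {
            writeLock.unlock();
        }
    }

    int size() {
        writeLock.lock();
        try {
            return documents.size();
        } finally {
            writeLock.unlock();
        }
    }

    /** Starts several writer threads against one index and returns the final document count. */
    static int runConcurrently(int threadCount, int updatesPerThread) {
        SingleWriterIndex index = new SingleWriterIndex();
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            final int id = t;
            Thread thread = new Thread(() -> {
                for (int i = 0; i < updatesPerThread; i++) {
                    index.update("doc-" + id + "-" + i);
                }
            });
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for writers", e);
            }
        }
        return index.size();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(4, 100)); // prints 400
    }
}
```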

Re: Lucene : avoiding locking

2004-11-12 Thread Luke Shannon
Hi Luke; Currently I am experimenting with checking if the index is locked using IndexReader.locked before creating a writer. If this turns out to be the case I was thinking of just unlocking the file. Do you think this is a good strategy? Thanks, Luke - Original Message - From: Luke

Re: Lucene : avoiding locking

2004-11-12 Thread Otis Gospodnetic
Hello, --- Luke Shannon [EMAIL PROTECTED] wrote: Currently I am experimenting with checking if the index is locked using IndexReader.locked before creating a writer. If this turns out to be the case I was thinking of just unlocking the file. Do you think this is a good strategy? Only if

Re: Lucene : avoiding locking

2004-11-12 Thread Luke Francl
On Fri, 2004-11-12 at 09:51, Luke Shannon wrote: Hi Luke; Currently I am experimenting with checking if the index is locked using IndexReader.locked before creating a writer. If this turns out to be the case I was thinking of just unlocking the file. Do you think this is a good strategy?

Re: Lucene : avoiding locking

2004-11-12 Thread Otis Gospodnetic
I am curious, though, how many people on this list are using Lucene in the incremental update case. Most examples I've seen all assume batch indexing. I do both for Simpy (simpy.com). To ensure no duplicates, I try to delete (by some unique ID) before I add a new Document. Otis On Thu,

Number of documents to be optimized

2004-11-12 Thread Ravi
How do I know the number of documents to be optimized (If I have one large index, number of documents that are in other segments) at any time? Thanks in advance, Ravi. - To unsubscribe, e-mail: [EMAIL PROTECTED] For

QueryParser: [stopword] AND something throws Exception

2004-11-12 Thread Peter Pimley
[this is using lucene-1.4-final] Hello. I have just encountered a way to get the QueryParser to throw an ArrayIndexOutOfBoundsException. It can be recreated with the demo org.apache.lucene.demo.SearchFiles program. The way to trigger it is to parse a query of the form: a AND b ...where 'a'

RE: QueryParser: [stopword] AND something throws Exception

2004-11-12 Thread Will Allen
Holy cow! This does happen! -Original Message- From: Peter Pimley [mailto:[EMAIL PROTECTED] Sent: Friday, November 12, 2004 11:52 AM To: Lucene Users List Subject: QueryParser: [stopword] AND something throws Exception [this is using lucene-1.4-final] Hello. I have just encountered

Re: QueryParser: [stopword] AND something throws Exception

2004-11-12 Thread Justin Swanhart
Try using 1.4.2. The change file says that ArrayIndexOutOfBoundsExceptions have been fixed in the QueryParser. On Fri, 12 Nov 2004 12:04:31 -0500, Will Allen [EMAIL PROTECTED] wrote: Holy cow! This does happen! -Original Message- From: Peter Pimley [mailto:[EMAIL PROTECTED]

Re: QueryParser: [stopword] AND something throws Exception

2004-11-12 Thread Daniel Naber
On Friday 12 November 2004 17:52, Peter Pimley wrote: [this is using lucene-1.4-final] Please try 1.4.2. Regards Daniel -- http://www.danielnaber.de

Re: Lucene : avoiding locking

2004-11-12 Thread Luke Shannon
Hi All; I think I have resolved my locking issues, at least in my development environment (QA is next). I did the following: 1. Synchronized all the methods in my class (not sure if this was really necessary). 2. Whenever I created a writer or used the reader to delete I checked if the index is

Re: QueryParser: [stopword] AND something throws Exception

2004-11-12 Thread Peter Pimley
Thanks for pointing that out, and sorry for reporting a duplicate bug. I went here: http://jakarta.apache.org/site/binindex.cgi and the lucene link about halfway down the page links to 1.4-final. I didn't find my way to the page that announces 1.4.2. I'll install 1.4.2 on Monday morning,

lucene Scorers

2004-11-12 Thread Ken McCracken
Hi, I am looking at the Similarity class overview, and wondering if I can replace the SUM operator with a MAX operator, or any other operator (across the terms in a query). For example, if I search for car OR automobile, a BooleanScorer is used to add the values from each subexpression together.
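The difference between the two operators can be shown with plain arithmetic. The sketch below is not Lucene's BooleanScorer; the class name and the per-term scores are made up for illustration, and it only contrasts summing the scores of the matching subexpressions with taking their maximum.

```java
final class DisjunctionScores {
    /** SUM combination: a document matching both "car" and "automobile" gets both scores added. */
    static float sum(float[] termScores) {
        float total = 0f;
        for (float s : termScores) {
            total += s;
        }
        return total;
    }

    /** MAX combination: the document is scored by its best-matching subexpression only. */
    static float max(float[] termScores) {
        float best = 0f;
        for (float s : termScores) {
            best = Math.max(best, s);
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical scores for "car" and "automobile" in one document.
        float[] scores = {0.5f, 0.25f};
        System.out.println(sum(scores)); // prints 0.75
        System.out.println(max(scores)); // prints 0.5
    }
}
```

With SUM, a document that mentions both synonyms outranks one that mentions only the stronger term; with MAX, the two are scored the same.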

Re: lucene Scorers

2004-11-12 Thread Paul Elschot
On Friday 12 November 2004 20:48, Ken McCracken wrote: Hi, I am looking at the Similarity class overview, and wondering if I can replace the SUM operator with a MAX operator, or any other operator (across the terms in a query). For example, if I search for car OR automobile, a

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-12 Thread Luke Francl
On Thu, 2004-11-11 at 14:48, Daniel Naber wrote: On Thursday 11 November 2004 20:57, Sanyi wrote: What I'm saying is that there is no reason for the optimizer to expand wild* to more than 1024 variations That's the point: there is no query optimizer in Lucene. Would it be possible to

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-12 Thread Daniel Naber
On Friday 12 November 2004 21:28, Luke Francl wrote: That's the point: there is no query optimizer in Lucene. Would it be possible to write one? I would be very interested in this feature. There are two different issues: first, reorder the query so that those terms with fewer matches appear

RE: lucene Scorers

2004-11-12 Thread Chuck Williams
I had a similar need and wrote MaxDisjunctionQuery and MaxDisjunctionScorer. Unfortunately these are not available as a patch but I've included the original message below that has the code (modulo line breaks added by simple text email format). This code is functional -- I use it in my app. It

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-12 Thread Luke Francl
On Fri, 2004-11-12 at 14:52, Daniel Naber wrote: There are two different issues: first, reorder the query so that those terms with fewer matches appear first, because as soon as the first term with 0 matches occurs, search stops. There will probably be a not-so-difficult implementation for
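The reordering idea quoted here can be sketched as a sort by match count. This is a toy illustration in plain Java, not a Lucene feature: the class name, the method names, and the document frequencies are assumptions, and the "search" is reduced to counting how many terms are inspected before one with 0 matches is found.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TermReorder {
    /** Returns the terms sorted so that those with the fewest matching documents come first. */
    static List<String> byAscendingFrequency(Map<String, Integer> docFreq) {
        List<String> terms = new ArrayList<>(docFreq.keySet());
        terms.sort((a, b) -> {
            int byFreq = Integer.compare(docFreq.get(a), docFreq.get(b));
            return byFreq != 0 ? byFreq : a.compareTo(b); // ties broken alphabetically
        });
        return terms;
    }

    /** Counts the terms a conjunction inspects before it can stop at a term with 0 matches. */
    static int termsInspected(List<String> orderedTerms, Map<String, Integer> docFreq) {
        int inspected = 0;
        for (String term : orderedTerms) {
            inspected++;
            if (docFreq.get(term) == 0) {
                break; // no document can match all terms, so the search stops here
            }
        }
        return inspected;
    }

    public static void main(String[] args) {
        // Hypothetical document frequencies.
        Map<String, Integer> docFreq = new HashMap<>();
        docFreq.put("lucene", 1200);
        docFreq.put("query", 300);
        docFreq.put("optimizer", 0);

        List<String> asTyped = List.of("lucene", "query", "optimizer");
        List<String> reordered = byAscendingFrequency(docFreq);

        System.out.println(reordered);                          // prints [optimizer, query, lucene]
        System.out.println(termsInspected(asTyped, docFreq));   // prints 3
        System.out.println(termsInspected(reordered, docFreq)); // prints 1
    }
}
```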

Index File

2004-11-12 Thread Luke Shannon
Hi; Is there some way to determine if specific contents are in the index folder other than running a query against it? I see that my document is being indexed. But when I run a query against the index I get no results returned. The weird thing is if I restart Tomcat and run the search again

Re: Index File

2004-11-12 Thread Otis Gospodnetic
If you add a Document to the index after you've opened an IndexSearcher/Reader, your IndexSearcher/Reader will not see it. You have to open a new IS/R to see the newly added Documents. This is often covered on this list... I must have added this to the Lucene FAQ at jGuru, too. Otis --- Luke
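The behaviour described in this reply can be pictured with a toy model in which a searcher copies the state of the index at the moment it is opened. The sketch is plain Java and says nothing about how Lucene implements this; SnapshotDemo, Index, and Searcher are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class SnapshotDemo {
    /** Toy index: a growing list of documents. */
    static final class Index {
        private final List<String> docs = new ArrayList<>();

        void add(String doc) {
            docs.add(doc);
        }

        /** A searcher sees the documents that exist at the moment it is opened. */
        Searcher openSearcher() {
            return new Searcher(new ArrayList<>(docs));
        }
    }

    /** Toy searcher: holds a point-in-time copy and never sees later additions. */
    static final class Searcher {
        private final List<String> snapshot;

        Searcher(List<String> snapshot) {
            this.snapshot = snapshot;
        }

        boolean contains(String doc) {
            return snapshot.contains(doc);
        }
    }

    public static void main(String[] args) {
        Index index = new Index();
        index.add("first");

        Searcher old = index.openSearcher();
        index.add("second"); // added after the searcher was opened

        System.out.println(old.contains("second"));                  // prints false
        System.out.println(index.openSearcher().contains("second")); // prints true
    }
}
```

In this model, restarting the web application has the same effect as opening a new searcher, which would match the symptom reported in the original question.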

RE: Index File

2004-11-12 Thread Richard Greenane
You might want to look at LUKE @ http://www.getopt.org/luke/ A great tool for checking the index to make sure that everything is there Regards Richard -Original Message- From: Luke Shannon [mailto:[EMAIL PROTECTED] Sent: 12 November 2004 23:54 To: Lucene Users List Subject: Index File