Query validation in web app

2004-03-05 Thread Kelvin Tan
Lucene reacts pretty badly to non-wellformed queries, not throwing a checked/unchecked Exception but throwing an Error. The error message is also unintelligible to a user (non-developer). How are people checking/validating queries from a web-app? I have some checked-in code in sandbox that does

Re: Query: A ? B

2004-03-05 Thread Erik Hatcher
Actually a slop of 1 does guarantee order... it is either an exact match or 1 term off. It takes a slop of 2 or greater for reverse order matches. But it is not exactly 1 term off, which is what Jochen wants. *shrug* Erik On Mar 4, 2004, at 6:22 PM, Otis Gospodnetic wrote: Ah, sorry, I

Re: Query validation in web app

2004-03-05 Thread Erik Hatcher
Kelvin, In what scenarios does QueryParser fail without throwing a ParseException? I think we should fix those cases to ensure a ParseException is thrown. Erik On Mar 5, 2004, at 3:21 AM, Kelvin Tan wrote: Lucene reacts pretty badly to non-wellformed queries, not throwing a

Re: Query validation in web app

2004-03-05 Thread Kelvin Tan
On Fri, 5 Mar 2004 04:18:29 -0500, Erik Hatcher said: Kelvin, In what scenarios does QueryParser fail without throwing a ParseException? I think we should fix those cases to ensure a ParseException is thrown. Erik Sorry, my bad. Was it ever throwing Errors? Probably not, but somehow I

Re: Query validation in web app

2004-03-05 Thread Erik Hatcher
There was one condition we tightened up recently where a ParseException was not being thrown, but now it is. This was when there were too many boolean queries, and this exception is now converted to a ParseException. Erik On Mar 5, 2004, at 4:46 AM, Kelvin Tan wrote: On Fri, 5 Mar 2004

Storing numbers

2004-03-05 Thread lucene
Hi! I want to store numbers (id) in my index: long id = 1069421083284L; doc.add(Field.UnStored("in", String.valueOf(id))); But searching for id:1069421083284 doesn't return any hits. Well, did I misunderstand something? UnStored is the number is stored but not index

Re: Storing numbers

2004-03-05 Thread Morus Walter
[EMAIL PROTECTED] writes: Hi! I want to store numbers (id) in my index: long id = 1069421083284L; doc.add(Field.UnStored("in", String.valueOf(id))); But searching for id:1069421083284 doesn't return any hits. If your field is named 'in' you shouldn't search in 'id'.

Re: Did you mean for multiple terms

2004-03-05 Thread lucene
On Thursday 04 March 2004 17:55, [EMAIL PROTECTED] wrote: Consider the query +michael +jackson not to return any hits because there's no michael in index, but there is jackson (e.g. janet...). Is there any reasonable approach how to determine whether one or multiple terms of a query - and

Re: Storing numbers

2004-03-05 Thread lucene
On Friday 05 March 2004 12:27, Morus Walter wrote: doc.add(Field.UnStored("in", String.valueOf(id))); But searching for id:1069421083284 doesn't return any hits. If your field is named 'in' you shouldn't search in 'id'. Right? Well, indexing and analyzing are different things.

using lucene to search in a 1 huge file. (aka grep -n)

2004-03-05 Thread prasen
Hi guys, I am relatively new to Lucene. Can Lucene be used to speed up searching for a string in one huge file (~ terabytes) based on its line numbers? Something like grep -n pattern filename, where the indexing will be done only on one file and based on either line numbers or blocks. prasen

Re: using lucene to search in a 1 huge file. (aka grep -n)

2004-03-05 Thread Otis Gospodnetic
In order for this to make sense, you would have to split your huge file into either lines or blocks, whichever you want to be your indexing and search/hit unit, and convert those to Lucene Documents, which you would then index. Searching would then return the line/block where matches are found.
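The splitting step described above can be sketched in plain Java. This is only an illustration of the "one line = one unit" idea, not Lucene code: `LineUnits`, `LineUnit`, and `matchingLines` are hypothetical names, and the linear scan merely stands in for the index lookup that Lucene would perform once each unit has been turned into a Document.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Lucene: splits input into one unit per line.
class LineUnits {
    // One search/hit unit: the line text plus its 1-based number (as in grep -n).
    static final class LineUnit {
        final long lineNumber;
        final String text;

        LineUnit(long lineNumber, String text) {
            this.lineNumber = lineNumber;
            this.text = text;
        }
    }

    // Reads the input line by line. A real indexer would hand each unit to the
    // index as it is read instead of collecting everything in memory.
    static List<LineUnit> split(Reader input) {
        List<LineUnit> units = new ArrayList<>();
        Iterator<String> lines = new BufferedReader(input).lines().iterator();
        long n = 0;
        while (lines.hasNext()) {
            units.add(new LineUnit(++n, lines.next()));
        }
        return units;
    }

    // Stand-in for the index lookup: returns the numbers of the lines that
    // contain the pattern as a plain substring.
    static List<Long> matchingLines(List<LineUnit> units, String pattern) {
        List<Long> hits = new ArrayList<>();
        for (LineUnit unit : units) {
            if (unit.text.contains(pattern)) {
                hits.add(unit.lineNumber);
            }
        }
        return hits;
    }
}
```

Using blocks instead of lines would only change what `split` treats as a unit; the hit would then identify a block rather than a single line.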

Re: Storing numbers

2004-03-05 Thread Erik Hatcher
Terms in Lucene are text. If you want to deal with number ranges, you need to pad them. 0001 for example. Be sure all numbers have the same width and zero padded. Lucene uses lexicographical ordering, so you must be sure things collate in this way. Erik On Mar 5, 2004, at 11:46
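A minimal sketch of the padding idea in plain Java, with no Lucene calls. The class name `ZeroPad` and the width of 19 (the number of digits in Long.MAX_VALUE) are assumptions chosen for illustration; any fixed width works as long as indexing and searching agree on it.

```java
import java.util.Locale;

// Hypothetical helper, not part of Lucene: fixed-width, zero-padded numbers.
class ZeroPad {
    // Assumed width: enough digits for any non-negative long.
    static final int WIDTH = 19;

    // Left-pads a non-negative number with zeros so that string (lexicographic)
    // order agrees with numeric order.
    static String pad(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values need a different encoding");
        }
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", value);
    }
}
```

Without padding, "1000" sorts before "200" because the comparison goes character by character; `ZeroPad.pad(200)` sorts before `ZeroPad.pad(1000)`, as a numeric range needs.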

Re: Storing numbers

2004-03-05 Thread Stephane James Vaucher
Weird idea, how about transforming your long into a Date and using a DateFilter to use a ranged query? sv On Fri, 5 Mar 2004, Erik Hatcher wrote: Terms in Lucene are text. If you want to deal with number ranges, you need to pad them. 0001 for example. Be sure all numbers have the

Re: Storing numbers

2004-03-05 Thread lucene
On Friday 05 March 2004 18:01, Erik Hatcher wrote: 0001 for example. Be sure all numbers have the same width and zero padded. And what about a range like 100 TO 1000?

Re: Storing numbers

2004-03-05 Thread Stephane James Vaucher
On Fri, 5 Mar 2004 [EMAIL PROTECTED] wrote: On Friday 05 March 2004 18:01, Erik Hatcher wrote: 0001 for example. Be sure all numbers have the same width and zero padded. And what about a range like 100 TO 1000? You mean 0100 To 1000 or 100 to 0001000 ;) sv

Re: Query validation in web app

2004-03-05 Thread Dror Matalon
On Fri, Mar 05, 2004 at 04:21:07PM +0800, Kelvin Tan wrote: Lucene reacts pretty badly to non-wellformed queries, not throwing a checked/unchecked Exception but throwing an Error. The error message is also unintelligible to a user (non-developer). How are people checking/validating queries

Re: Query validation in web app

2004-03-05 Thread Otis Gospodnetic
Funny - Kelvin Tan is the author of that code :) Otis --- Dror Matalon [EMAIL PROTECTED] wrote: On Fri, Mar 05, 2004 at 04:21:07PM +0800, Kelvin Tan wrote: Lucene reacts pretty badly to non-wellformed queries, not throwing a checked/unchecked Exception but throwing an Error. The error

Re: Storing numbers

2004-03-05 Thread Erik Hatcher
Another quite cool option is to subclass QueryParser, and override getRangeQuery. Do the padding there. This will allow users to type in normal looking numbers, and the padding happens automatically. You'll need to be sure that numbers padded during indexing match what getRangeQuery does
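The endpoint handling such an override would need can be sketched without any Lucene dependency. `RangeEndpoints.normalize` is a hypothetical helper, and the width of 19 is an assumption that must equal whatever width was used at indexing time. The override itself is not shown because the getRangeQuery signature differs between Lucene versions; the idea is that it would pass both endpoints through a helper like this before delegating to the parent implementation.

```java
// Hypothetical helper, not part of Lucene: pads user-typed range endpoints.
class RangeEndpoints {
    // Assumed width; must match the padding applied when the field was indexed.
    static final int WIDTH = 19;

    // Pads an all-digit endpoint to WIDTH with leading zeros.
    // Anything that is not purely digits is returned unchanged, so ranges
    // over ordinary text terms keep working.
    static String normalize(String endpoint) {
        String trimmed = endpoint.trim();
        boolean allDigits = !trimmed.isEmpty()
                && trimmed.chars().allMatch(c -> c >= '0' && c <= '9');
        if (!allDigits) {
            return endpoint;
        }
        if (trimmed.length() > WIDTH) {
            throw new IllegalArgumentException("number wider than " + WIDTH + " digits");
        }
        StringBuilder padded = new StringBuilder(WIDTH);
        for (int i = trimmed.length(); i < WIDTH; i++) {
            padded.append('0');
        }
        return padded.append(trimmed).toString();
    }
}
```

With this in place a user can type a range such as 100 TO 1000, and both endpoints are widened to the indexed form before the range is built.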

Re: Query validation in web app

2004-03-05 Thread Dror Matalon
I was responding to "How are people checking/validating queries from a web-app?" So should I be embarrassed or should Kelvin be flattered :-)? On Fri, Mar 05, 2004 at 12:12:35PM -0800, Otis Gospodnetic wrote: Funny - Kelvin Tan is the author of that code :) Otis --- Dror Matalon [EMAIL

Re: Storing numbers

2004-03-05 Thread Erik Hatcher
On Mar 5, 2004, at 4:16 PM, Erik Hatcher wrote: Another quite cool option is to subclass QueryParser, and override getRangeQuery. Do the padding there. This will allow users to type in normal looking numbers, and the padding happens automatically. You'll need to be sure that numbers padded

Re: using lucene to search in a 1 huge file. (aka grep -n)

2004-03-05 Thread prasen
Any tutorial/samples on how to use indices, and use them in your search ? thanks-n-appreciate a lot, prasen Otis Gospodnetic wrote: In order for this to make sense, you would have to split your huge file into either lines or blocks, whichever you want to be your indexing and search/hit unit,

Re: Query validation in web app

2004-03-05 Thread Kelvin Tan
Neither! :-) I was just wondering if there were better ways to do it, that's all. I'm a regex newbie and I found it rather difficult to validate the entire Lucene query syntax (including escaping!) using regex. Anyway, I'm writing unit tests for the query validator right now courtesy of jsunit...