RE: Question about RangeQuery and strings...

2002-06-11 Thread Otis Gospodnetic

James,

I haven't used RangeQueries, but what you describe does sound confusing
to me.  I'll enter it as a bug, just so this information doesn't get
lost, because I am not certain that this is really a bug, even though
it sounds like one to me.

Thanks,
Otis


--- James Ricci [EMAIL PROTECTED] wrote:
 I'm replying to my own message because I think I now understand the
 problem,
 and part of it is, in my opinion, a bad implementation of
 RangedQuery.
 
 When you create a ranged query and omit the lower term, my
 expectation would
 be that I would find everything less than the upper term. Now if I
 pass
 false for the inclusive term, then I would expect that I would find
 all
 terms less than the upper term excluding the upper term itself.
 
 What is happening in the case of lower_term=null, upper_term=x,
 inclusive=false is that empty strings are being excluded because
 inclusive
 is set false, and the implementation of RangedQuery creates a default
 lower
 term of Term(fieldName, ). Since it's not inclusive, it excludes
 . This
 isn't what I intended, and I don't think it's what most people would
 imagine
 RangedQuery would do in the case I've mentioned.
 
 I equate lower=null, upper=x, inclusive=false to Field  x.
 lower=null,
 upper=x, inclusive=true would be Field = x. In both cases, the only
 difference should be whether or not Field = x is true for the query.
 
 I'm still quite new to Lucene, so maybe I'm wrong about all this
 because I
 just don't understand it well enough. If so, could someone tell me
 where
 I've gone astray?
 
 Thanks much,
 
 James
 
 PS: The rest of the problems I had below I was able to fix by
 changing how
 the fields were tokenized and indexed.
 
   -Original Message-
  From:   James Ricci  
  Sent:   Thursday, June 06, 2002 11:16 AM
  To: '[EMAIL PROTECTED]'
  Subject:Question about RangeQuery and strings...
  
  Hi all,
  
  I've been having some problems using RangeQuery. I have a simple
 Query
  which is essentially document.field  AB. Field values are:
  
   // Empty string
  A SPACE
  A123456
  ABC
  
  Now I expected to find the first three of the four values (and I do
 with
  another commercial search engine product I've worked with). With
 Lucene I
  get nothing. Part of the problem I think is that there are some
 issues
  with case here. Changing my query to document.field  ab returns:
  
  A123456
  
  Now I would have expected A SPACE to get returned, and I was
 really
  surprised that  wasn't returned. I'm guessing that  wasn't
 returned
  because no term in the field passed the query criteria, and empty
 string
  is not considered a term.
  
  How should I go about getting what I expect? What is going on here?
  
  Thanks much,
  
  James
  
  
  
  
 
 --
 To unsubscribe, e-mail:  
 mailto:[EMAIL PROTECTED]
 For additional commands, e-mail:
 mailto:[EMAIL PROTECTED]
 


__
Do You Yahoo!?
Yahoo! - Official partner of 2002 FIFA World Cup
http://fifaworldcup.yahoo.com

--
To unsubscribe, e-mail:   mailto:[EMAIL PROTECTED]
For additional commands, e-mail: mailto:[EMAIL PROTECTED]




RE: Question about RangeQuery and strings...

2002-06-06 Thread James Ricci

I'm replying to my own message because I think I now understand the problem,
and part of it is, in my opinion, a bad implementation of RangedQuery.

When you create a ranged query and omit the lower term, my expectation would
be that I would find everything less than the upper term. Now if I pass
false for the inclusive term, then I would expect that I would find all
terms less than the upper term excluding the upper term itself.

What is happening in the case of lower_term=null, upper_term=x,
inclusive=false is that empty strings are being excluded because inclusive
is set false, and the implementation of RangedQuery creates a default lower
term of Term(fieldName, ). Since it's not inclusive, it excludes . This
isn't what I intended, and I don't think it's what most people would imagine
RangedQuery would do in the case I've mentioned.

I equate lower=null, upper=x, inclusive=false to Field  x. lower=null,
upper=x, inclusive=true would be Field = x. In both cases, the only
difference should be whether or not Field = x is true for the query.

I'm still quite new to Lucene, so maybe I'm wrong about all this because I
just don't understand it well enough. If so, could someone tell me where
I've gone astray?

Thanks much,

James

PS: The rest of the problems I had below I was able to fix by changing how
the fields were tokenized and indexed.

  -Original Message-
 From: James Ricci  
 Sent: Thursday, June 06, 2002 11:16 AM
 To:   '[EMAIL PROTECTED]'
 Subject:  Question about RangeQuery and strings...
 
 Hi all,
 
 I've been having some problems using RangeQuery. I have a simple Query
 which is essentially document.field  AB. Field values are:
 
  // Empty string
 A SPACE
 A123456
 ABC
 
 Now I expected to find the first three of the four values (and I do with
 another commercial search engine product I've worked with). With Lucene I
 get nothing. Part of the problem I think is that there are some issues
 with case here. Changing my query to document.field  ab returns:
 
 A123456
 
 Now I would have expected A SPACE to get returned, and I was really
 surprised that  wasn't returned. I'm guessing that  wasn't returned
 because no term in the field passed the query criteria, and empty string
 is not considered a term.
 
 How should I go about getting what I expect? What is going on here?
 
 Thanks much,
 
 James
 
 
 
 

--
To unsubscribe, e-mail:   mailto:[EMAIL PROTECTED]
For additional commands, e-mail: mailto:[EMAIL PROTECTED]