Re: Sorting on a long string

2004-09-30 Thread Erik Hatcher
On Sep 28, 2004, at 9:46 PM, Daly, Pete wrote:
> I am new to Lucene, and trying to perform a sorted query on a list of
> people's names.  Lucene seems unable to properly sort on the name field
> of my indexed documents.  If I sort by the other (shorter) fields, it
> seems to work fine.  The name sort seems to be close, almost like the
> last few iterations through the sort loop are not being done.

How are you indexing the name field?  (code please :)

> The records are obviously not in the normally random order, but not
> fully sorted either.

Normally random order?!  The natural (not using a Sort) order is by
score (also called relevance).  Nothing random about it at all.  In
fact, this ordering is very special!  See the Javadocs on the
Similarity class for details of the formula.

> Are there known limitations in the sorting functionality that I am
> running into?  I can provide more details if needed.

No limitations that I know of.  Some bugs have been fixed, so be sure
you're using Lucene 1.4.1, and not just 1.4, but please report back
with more details if this issue still occurs in 1.4.1.

Erik
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


RE: Sorting on a long string

2004-09-30 Thread Daly, Pete
> How are you indexing the name field?  (code please :)

doc.add(Field.Text(name, name));

Based on Aviran's suggestion, if I index as a Keyword, the sort works fine.


> Normally random order?!  The natural (not using a Sort) order is by
> score (also called relevance).  Nothing random about it at all.  In
> fact, this ordering is very special!  See the Javadocs on the
> Similarity class for details of the formula.

Score makes sense then.  I didn't think about sort by score since I was
specifying a specific sort to use instead.

> No limitations that I know of.  Some bugs have been fixed, so be sure
> you're using Lucene 1.4.1, and not just 1.4, but please report back
> with more details if this issue still occurs in 1.4.1.

Looks like switching to Keyword is the solution.  Can someone describe what
abilities I would be losing by using Keyword instead of Text?  I am indexing
people's names, which are searched by partial name quite a bit.  Can a
Keyword consisting of more than one word be searched on just as well as a
Text field, or do I need to index both ways in order to keep search
functionality along with sorting ability?

Thanks all for your help,

-Pete
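
The trade-off raised in the question above can be pictured with a small
plain-Java model.  This is only a sketch and uses no Lucene classes: it
assumes a tokenized (Text-style) field is split into lowercase word terms,
while an untokenized (Keyword-style) field keeps the whole value as a single
term.  The class and method names are made up for illustration.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: plain Java, no Lucene classes.
class KeywordVsTextDemo {

    // Assumed analyzer behavior for a tokenized (Text-style) field:
    // lowercase the value and split it on anything that is not a letter.
    static Set<String> textTerms(String value) {
        Set<String> terms = new TreeSet<>();
        for (String t : value.toLowerCase().split("[^a-z]+")) {
            if (!t.isEmpty()) terms.add(t);
        }
        return terms;
    }

    // An untokenized (Keyword-style) field keeps the whole value as one term.
    static Set<String> keywordTerms(String value) {
        return Collections.singleton(value);
    }

    // Exact term match: the query must equal an indexed term.
    static boolean matchesTerm(Set<String> terms, String query) {
        return terms.contains(query);
    }

    // Prefix match: true if any indexed term starts with the given prefix.
    static boolean matchesPrefix(Set<String> terms, String prefix) {
        for (String t : terms) {
            if (t.startsWith(prefix)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        String name = "Daly, Pete";
        System.out.println(matchesTerm(textTerms(name), "pete"));      // true
        System.out.println(matchesTerm(keywordTerms(name), "pete"));   // false
        System.out.println(matchesTerm(keywordTerms(name), "Pete"));   // false
        System.out.println(matchesPrefix(keywordTerms(name), "Daly")); // true
    }
}
```

Under these assumptions a single-term field can only be found by its full
value or by a prefix of it, so searching on a first name alone would fail.
That suggests indexing both ways: a tokenized field for searching and a
separate untokenized field, under a different field name, used only for
sorting.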


RE: Sorting on a long string

2004-09-29 Thread Aviran
Currently Lucene can only sort properly on a Keyword field.
I guess your field is tokenized, in which case the sort does not work
properly.
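
To see why a tokenized field can give an "almost sorted" result, here is a
toy model in plain Java.  It is not Lucene code and uses no Lucene classes.
It assumes that sorting keeps exactly one term per document, filled in by
walking all terms of the field in sorted order, so that for a multi-term
field the last term visited wins.  The tokenizer and all names below are
illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: this is not Lucene code and uses no Lucene classes.
class TokenizedSortDemo {

    // Assumed analyzer behavior: lowercase and split on non-letters.
    static List<String> tokenize(String value) {
        List<String> tokens = new ArrayList<>();
        for (String t : value.toLowerCase().split("[^a-z]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Build one sort key per document by walking all terms in sorted order;
    // each term overwrites the key of every document that contains it.
    static String[] sortKeys(String[] names, boolean tokenized) {
        TreeMap<String, List<Integer>> termToDocs = new TreeMap<>();
        for (int doc = 0; doc < names.length; doc++) {
            List<String> terms = tokenized
                    ? tokenize(names[doc])
                    : Collections.singletonList(names[doc]);
            for (String term : terms) {
                termToDocs.computeIfAbsent(term, k -> new ArrayList<>()).add(doc);
            }
        }
        String[] keys = new String[names.length];
        for (Map.Entry<String, List<Integer>> e : termToDocs.entrySet()) {
            for (int doc : e.getValue()) keys[doc] = e.getKey();
        }
        return keys;
    }

    // Return the names ordered by their per-document sort key.
    static List<String> sortedNames(String[] names, boolean tokenized) {
        final String[] keys = sortKeys(names, tokenized);
        Integer[] order = new Integer[names.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> keys[a].compareTo(keys[b]));
        List<String> out = new ArrayList<>();
        for (int i : order) out.add(names[i]);
        return out;
    }

    public static void main(String[] args) {
        String[] names = {"Daly, Pete", "Adams, Zoe", "Clark, Bob", "Baker, Al"};
        // Tokenized: prints "Baker, Al | Clark, Bob | Daly, Pete | Adams, Zoe"
        System.out.println(String.join(" | ", sortedNames(names, true)));
        // Untokenized: prints "Adams, Zoe | Baker, Al | Clark, Bob | Daly, Pete"
        System.out.println(String.join(" | ", sortedNames(names, false)));
    }
}
```

In this model "Adams, Zoe" ends up keyed on "zoe" rather than "adams", so
the list looks mostly ordered but is not, while the untokenized variant
sorts on the whole name.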

A patch has been suggested to fix this problem (but has not been applied
yet):

http://issues.apache.org/bugzilla/show_bug.cgi?id=30382

Aviran




Sorting on a long string

2004-09-28 Thread Daly, Pete
I am new to Lucene, and trying to perform a sorted query on a list of
people's names.  Lucene seems unable to properly sort on the name field of my
indexed documents.  If I sort by the other (shorter) fields, it seems to
work fine.  The name sort seems to be close, almost like the last few
iterations through the sort loop are not being done.  The records are
obviously not in the normally random order, but not fully sorted either.  I
have tried different ways of sorting, including a SortField array/object
with the field cast as a string.

The index I am sorting has about 1.2 million documents.

Are there known limitations in the sorting functionality that I am running
into?  I can provide more details if needed.

Thanks for any help,

-Pete