Re: sorted search

2005-02-24 Thread Daniel Naber
On Thursday 24 February 2005 19:01, Yura Smolsky wrote:

       sort.setSort( new SortField[] { new SortField ("modified",
 SortField.STRING, true) } );

You should store the date as a number, e.g. days since 1970 (or weeks if
that is precise enough), and then tell the sort that it's an integer.
DateField always stores the date in milliseconds, which leads to a large
number of terms. It also turns the date into a string. Both make searching,
and especially sorting, slower.
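
As a rough illustration of the "days since 1970" idea, here is a minimal sketch in plain Java. The class and method names (DayStamp, toDays) are made up for this example and are not part of Lucene. It assumes UTC day boundaries are acceptable. The resulting number would be indexed as a plain string term and sorted with an integer sort type such as SortField.INT.

```java
import java.util.Date;

// Hypothetical helper (not part of Lucene): reduces a timestamp to whole
// days since 1970-01-01 UTC so the field holds far fewer distinct terms.
final class DayStamp {
    private static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Floor division keeps timestamps before 1970 on the correct day.
    static int toDays(long epochMillis) {
        return (int) Math.floorDiv(epochMillis, MILLIS_PER_DAY);
    }

    static int toDays(Date date) {
        return toDays(date.getTime());
    }

    public static void main(String[] args) {
        // The text to index in place of the DateField string.
        String term = Integer.toString(toDays(new Date()));
        System.out.println(term);
    }
}
```

With day resolution, ten years of documents produce only about 3,650 distinct values, whatever the number of documents.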

Regards
 Daniel

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: sorted search

2005-02-24 Thread Erik Hatcher
Sorting by String uses much more RAM than a numeric sort.  If you
use a numeric (yet lexicographically orderable) date format (e.g.
YYYYMMDD), you will most likely see better performance.
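
To stay orderable across years, such a key needs the year first. A small sketch of building a yyyyMMdd key in plain Java follows. DayKey is a made-up name for illustration, and the choice of UTC is an assumption. Because the key is fixed-width and all digits, comparing it as a string and comparing it as a number give the same order.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper (not part of Lucene): formats a date as an
// 8-digit yyyyMMdd key, e.g. "20050224".
final class DayKey {
    static String format(Date date) {
        // Locale.ROOT keeps the digits ASCII; UTC is an assumption.
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMdd", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(date);
    }

    // The same key as a number; eight digits fit in a 32-bit int.
    static int asInt(Date date) {
        return Integer.parseInt(format(date));
    }
}
```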

Erik

On Feb 24, 2005, at 1:01 PM, Yura Smolsky wrote:
Hello, lucene-user.
I have an index with many documents, more than 40 million.
Each document has a DateField (the document's timestamp).
I need only the most recent results. I use a single instance of
IndexSearcher.
When I perform a sorted search on this index:
  Sort sort = new Sort();
  sort.setSort( new SortField[] { new SortField ("modified",
      SortField.STRING, true) } );
  Hits hits = searcher.search(
      QueryParser.parse("good", "content", new StandardAnalyzer()), sort);

the search speed is poor.
Today I tried searching without the sort by modified, sorting by
relevance instead. Speed was much better!
I think that sorting by DateField is very slow. Maybe I am doing something
wrong with this kind of sorted search? Can you give me any advice?
Thanks.
Yura Smolsky.



Re[2]: sorted search

2005-02-24 Thread Yura Smolsky
Hello, Erik.

If I need to store the hour and minute, do I need to put the date into
the following integer format?
YYYYMMDDHHII
Will it be faster than the current solution?
And will I be able to do range queries (from date A to date B)?
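
One thing worth checking with a minute-resolution key: a 12-digit value such as 200502241901 no longer fits in a 32-bit Java int, so it would have to be treated as a long or as a string. The sketch below is plain Java with made-up names (MinuteKey and its methods), not Lucene API. It shows the fixed-width key, a string-comparison range check (assuming range queries compare terms as strings), and minutes since 1970 as an alternative that still fits in an int.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper (not part of Lucene) for minute-resolution date keys.
final class MinuteKey {
    // Fixed-width 12-digit key, e.g. "200502241901"; UTC is an assumption.
    static String format(Date date) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmm", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(date);
    }

    // Inclusive range check by plain string comparison; this is valid
    // only because every key has the same width.
    static boolean inRange(String key, String from, String to) {
        return key.compareTo(from) >= 0 && key.compareTo(to) <= 0;
    }

    // True when the 12-digit key is too large for a 32-bit int.
    static boolean exceedsInt(String key) {
        return Long.parseLong(key) > Integer.MAX_VALUE;
    }

    // Alternative: whole minutes since 1970-01-01 UTC, which fits in an int.
    static int minutesSinceEpoch(long epochMillis) {
        return (int) Math.floorDiv(epochMillis, 60_000L);
    }
}
```

Whether either form is faster than the current DateField depends mostly on how many distinct values end up in the index, so minute resolution gains less than day resolution.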



Yura Smolsky.






Re[2]: sorted search

2005-02-24 Thread Yura Smolsky
Hello, Erik.

About memory usage:
DateField stores each date as a 9-character string (e.g. '000ic64p7').
How much memory does this string take?

How much memory does an integer take?
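
A back-of-the-envelope estimate can be sketched as follows. Every figure here is an assumption, not a measurement: that a numeric sort keeps one 4-byte int per document, that a string sort keeps one 4-byte ordinal per document plus one Java String per distinct term, that each char takes 2 bytes, and that each String carries a fixed overhead (the 40 bytes used in the usage note is a guess that varies by JVM). SortMemoryEstimate is a made-up name.

```java
// Hypothetical estimator (not part of Lucene); all sizes are assumptions.
final class SortMemoryEstimate {
    // Numeric sort: assumed one 4-byte int per document.
    static long intSortBytes(long docs) {
        return docs * 4L;
    }

    // String sort: assumed one 4-byte ordinal per document plus one
    // String object per distinct term (2 bytes per char plus overhead).
    static long stringSortBytes(long docs, long distinctTerms,
                                int charsPerTerm, int overheadPerString) {
        long ordinals = docs * 4L;
        long strings = distinctTerms * (charsPerTerm * 2L + overheadPerString);
        return ordinals + strings;
    }
}
```

Under those assumptions, intSortBytes(40000000) gives 160,000,000 bytes. With millisecond timestamps nearly every document can have its own term, so stringSortBytes(40000000, 40000000, 9, 40) gives 2,480,000,000 bytes.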



Yura Smolsky.


