RE: Out of memory on Solr sorting

Fuad Efendi Tue, 05 Aug 2008 12:08:05 -0700

Best choice for sorting field:
    <!-- This is an example of using the KeywordTokenizer along
         With various TokenFilterFactories to produce a sortable field
         that does not include some properties of the source text
      -->

<fieldType name="alphaOnlySort" class="solr.TextField" sortMissingLast="true" omitNorms="true">


- case-insensitive, etc...
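
For reference, a complete definition along these lines might look like the sketch below. It is modeled on the alphaOnlySort example in the schema.xml shipped with Solr; the analyzer chain (lowercase, trim, strip non-letters) is an assumption and is not quoted from this message, so check it against your own schema.

```xml
<!-- Sketch only: the analyzer chain below is assumed, not taken from this thread. -->
<fieldType name="alphaOnlySort" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <!-- KeywordTokenizer keeps the whole field value as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercasing makes the sort case-insensitive -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Trim leading and trailing whitespace -->
    <filter class="solr.TrimFilterFactory"/>
    <!-- Drop every character that is not a-z -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="([^a-z])" replacement="" replace="all"/>
  </analyzer>
</fieldType>
```

Because the tokenizer emits exactly one token per document, the field stays usable for sorting even though it is a TextField.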

I might be partially wrong about the SOLR LRU Cache, but it is used somehow in your specific case... 'filterCache' is probably used for 'tokenized' sorting: it stores (token, DocList)...



Fuad Efendi
==============
http://www.tokenizer.org


Quoting Fuad Efendi <[EMAIL PROTECTED]>:

My understanding of Lucene sorting is that it will sort by 'tokens' and
not by 'full fields'... so for sorting you need a 'full-string'
(non-tokenized) field, and for searching you need another, tokenized one.

For instance, use 'string' for sorting and 'text_ws' for search, and
use 'copyField' to fill one from the other... (the copy costs some extra memory)
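
As a rough sketch of the 'string' plus 'text_ws' setup described above (the field names title and title_sort are hypothetical, and the attribute choices are assumptions rather than anything stated in this thread):

```xml
<!-- Hypothetical field names; adjust to your own schema.xml. -->

<!-- Tokenized field used for searching -->
<field name="title" type="text_ws" indexed="true" stored="true"/>

<!-- Non-tokenized, single-valued copy used only for sorting -->
<field name="title_sort" type="string" indexed="true" stored="false"/>

<!-- Populate the sort field from the search field at index time -->
<copyField source="title" dest="title_sort"/>
```

Queries would then search on title and sort on the copy, for example with sort=title_sort asc.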

Sorting on a tokenized field: with 100,000 documents, each 'Book Title'
consisting of 10 tokens on average, ... - a total of 1,000,000 (probably
unique) tokens in a hashtable; with a non-tokenized field - 100,000
entries, and the Lucene internal FieldCache is used instead of the SOLR LRU.


Also, with tokenized fields the 'sorting' is not in natural (alphabetical) order...


Fuad Efendi
==============
http://www.linkedin.com/in/liferay

Quoting sundar shankar <[EMAIL PROTECTED]>:

The field is of type "text_ws". Is this not recommended? Should I use text instead?
If increasing LRU cache helps you: - you are probably using a 'tokenized' field for sorting (could you confirm please?)... ...you should use a 'non-tokenized single-valued non-boolean' field for better performance of sorting...
