[jira] Updated: (LUCENE-1278) Add optional storing of document numbers in term dictionary

Jason Rutherglen (JIRA) Mon, 05 May 2008 10:32:27 -0700

     [ https://issues.apache.org/jira/browse/LUCENE-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Jason Rutherglen updated LUCENE-1278:
-------------------------------------

    Attachment: TestTermEnumDocs.java

I was going to write a Lucene test case, but I need an example to work from 
and svn is down.

The example test is extremely poor because the term and field saturation is 
nil.  Normal documents will have far more terms, and because the term docs 
data will be larger, less of it will sit in the file cache.  However, it does 
illustrate the speedup.  Please suggest other tests.

Laptop (Windows XP SP2, Java 6, Core 2 Duo), about the same on 3 separate runs:
3360 millis termenum loaddocs
25641 millis termdocs
7.6 times speedup
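
For reference, the speedup figure is just the ratio of the two timings 
above.  A minimal sketch of that arithmetic (the class and method names are 
made up for illustration and are not part of the attached test):

```java
import java.util.Locale;

public class Speedup {
    // Ratio of the baseline time to the faster time; both in milliseconds.
    static double ratio(long baselineMillis, long candidateMillis) {
        return (double) baselineMillis / candidateMillis;
    }

    public static void main(String[] args) {
        // 25641 ms (termdocs) versus 3360 ms (termenum loaddocs), as reported above.
        double r = ratio(25641, 3360);
        System.out.println(String.format(Locale.ROOT, "%.1f times speedup", r));
    }
}
```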

There have been previous discussions regarding the speed issue.  
http://www.gossamer-threads.com/lists/lucene/java-dev/53786
The conclusion there was to use payloads, which do not speed up StringIndex 
creation or range queries.
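
To make the StringIndex case concrete, here is a toy model of why having the 
doc numbers next to each term helps.  It uses no Lucene classes: the 
TreeMap stands in for a term dictionary that already carries each term's 
doc numbers, and the order/lookup arrays only imitate the shape of a string 
index (slot 0 reserved for "no term").  All names are hypothetical and this 
is a sketch under those assumptions, not the patch's implementation.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

public class TermDictSketch {
    /**
     * Fills lookup[] with the terms in sorted order (slot 0 stays null) and
     * returns order[], where order[doc] is the slot of the term in that doc.
     * Everything is read in one pass over the sorted dictionary.
     */
    static int[] buildOrder(TreeMap<String, int[]> termToDocs, int maxDoc, String[] lookup) {
        int[] order = new int[maxDoc];
        int slot = 1;
        for (Map.Entry<String, int[]> entry : termToDocs.entrySet()) {
            lookup[slot] = entry.getKey();
            for (int doc : entry.getValue()) {
                order[doc] = slot;
            }
            slot++;
        }
        return order;
    }

    public static void main(String[] args) {
        // Toy dictionary for one field: term -> doc numbers containing it.
        TreeMap<String, int[]> dict = new TreeMap<>();
        dict.put("dog", new int[] {0, 3});
        dict.put("cat", new int[] {1});
        dict.put("emu", new int[] {2});

        String[] lookup = new String[dict.size() + 1];
        int[] order = buildOrder(dict, 4, lookup);

        System.out.println(Arrays.toString(lookup)); // [null, cat, dog, emu]
        System.out.println(Arrays.toString(order));  // [2, 1, 3, 2]
    }
}
```

In this model the loop never has to go to a separate postings structure for 
each term, which is the cost the 25641 ms termdocs run above pays.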


> Add optional storing of document numbers in term dictionary
> -----------------------------------------------------------
>
>                 Key: LUCENE-1278
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1278
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>    Affects Versions: 2.3.1
>            Reporter: Jason Rutherglen
>            Priority: Minor
>         Attachments: lucene.1278.5.4.2008.patch, 
> lucene.1278.5.5.2008.2.patch, lucene.1278.5.5.2008.patch, 
> TestTermEnumDocs.java
>
>
> Add optional storing of document numbers in the term dictionary.  Creating 
> the StringIndex field cache and range filters will be faster.
> Example read code:
> {noformat}
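> // LOAD_DOCS is the proposed flag for also reading each term's doc numbers
> TermEnum termEnum = indexReader.terms(TermEnum.LOAD_DOCS);
> do {
>   Term term = termEnum.term();
>   // stop at the end of the enum or at the first term of another field;
>   // equals() avoids relying on the field name being interned
>   if (term == null || !term.field().equals(field)) break;
>   int[] docs = termEnum.docs();
> } while (termEnum.next());
> termEnum.close();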
> {noformat}
> Example write code:
> {noformat}
> Document document = new Document();
> document.add(new Field("tag", "dog", Field.Store.YES,
>     Field.Index.UN_TOKENIZED, Field.Term.STORE_DOCS));
> indexWriter.addDocument(document);
> {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


