[ https://issues.apache.org/jira/browse/LUCENE-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596762#action_12596762 ]
Paul Elschot commented on LUCENE-1278:
--------------------------------------

Some comments on the 5.7.2008 patch:

The test with a 7.6 times speedup for very few docs per term makes me wonder why this never showed up as a performance problem before. It certainly shows an advantage of flexible indexing for the case in which the within-document term frequencies are not needed (for example primary/foreign keys, which normally end up in a keyword field).

In the patch, DocIdSetIterator is used in the org.apache.lucene.index package, so it would be a good idea to move it from o.a.l.search to o.a.l.index or to o.a.l.util to avoid a circular dependency involving the index and search packages. As DocIdSetIterator is not yet released, this move should be no problem.

The DocIdSetReader class in the patch has so much code in common with SortedVIntList that it might be better to merge the two into a single class, and try to refactor the common code into new methods there. That would also be an easy way to get rid of the unsupported skipTo() operation.

> Add optional storing of document numbers in term dictionary
> -----------------------------------------------------------
>
>                 Key: LUCENE-1278
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1278
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>    Affects Versions: 2.3.1
>            Reporter: Jason Rutherglen
>            Priority: Minor
>         Attachments: lucene.1278.5.4.2008.patch, lucene.1278.5.5.2008.2.patch, lucene.1278.5.5.2008.patch, lucene.1278.5.7.2008.patch, lucene.1278.5.7.2008.test.patch, TestTermEnumDocs.java
>
>
> Add optional storing of document numbers in term dictionary. String index field cache and range filter creation will be faster.
> Example read code:
> {noformat}
> TermEnum termEnum = indexReader.terms(TermEnum.LOAD_DOCS);
> do {
>   Term term = termEnum.term();
>   if (term == null || term.field() != field) break;
>   int[] docs = termEnum.docs();
> } while (termEnum.next());
> {noformat}
> Example write code:
> {noformat}
> Document document = new Document();
> document.add(new Field("tag", "dog", Field.Store.YES, Field.Index.UN_TOKENIZED, Field.Term.STORE_DOCS));
> indexWriter.addDocument(document);
> {noformat}
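
To illustrate the remark in the comment above about merging DocIdSetReader with SortedVIntList and getting rid of the unsupported skipTo(): here is a minimal standalone sketch of a delta/VInt-encoded sorted doc id list whose iterator does support skipTo(). The class and method names (VIntDocIdList, DocIterator) are hypothetical and taken neither from the patch nor from Lucene. The skipTo() shown is only a linear scan built on next(), assuming the usual "advance to the first later doc >= target" contract; a merged class might well do something smarter.

```java
import java.io.ByteArrayOutputStream;

/**
 * Hypothetical sketch, not Lucene's SortedVIntList or the patch's
 * DocIdSetReader: stores ascending doc ids as VInt-encoded deltas.
 */
class VIntDocIdList {
    private final byte[] bytes; // each delta as 7-bit groups, low bits first
    private final int size;     // number of doc ids stored

    VIntDocIdList(int[] sortedDocs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int doc : sortedDocs) {
            if (doc < prev) {
                throw new IllegalArgumentException("doc ids must be non-negative and ascending");
            }
            int delta = doc - prev;
            // High bit set on a byte means "more bytes follow".
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
            prev = doc;
        }
        bytes = out.toByteArray();
        size = sortedDocs.length;
    }

    int size() {
        return size;
    }

    DocIterator iterator() {
        return new DocIterator();
    }

    /** Forward-only iterator; doc() is meaningful only after next() or skipTo() returned true. */
    final class DocIterator {
        private int pos = 0; // read position in bytes
        private int doc = 0; // running sum of decoded deltas

        boolean next() {
            if (pos >= bytes.length) {
                return false;
            }
            int delta = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                delta |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            doc += delta;
            return true;
        }

        /** Linear-scan skipTo: always advances at least once, stops at the first doc >= target. */
        boolean skipTo(int target) {
            do {
                if (!next()) {
                    return false;
                }
            } while (doc < target);
            return true;
        }

        int doc() {
            return doc;
        }
    }
}
```

With the doc ids {3, 10, 200, 70000}, calling next() and then skipTo(150) leaves the iterator positioned on 200. Since the deltas have to be decoded sequentially anyway, a scan like this costs no more than repeated next() calls, which is why supporting skipTo() in a merged class looks cheap.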