[
https://issues.apache.org/jira/browse/LUCENE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir resolved LUCENE-2492.
---------------------------------
Resolution: Duplicate
This is implemented over on LUCENE-4498 as an optimization: when docfreq=1, the
docid is stored directly instead of filepointer+docid, and the already-stored
totalTermFreq is used as the freq for that singleton document, since it's
redundant. Any positions, payloads, or anything else still go to their usual
place; it just saves the wasted seek and the useless file pointer, and optimizes
the primary key case.
As a default it doesn't have the potential traps of Pulsing (see the issue). If
someone wants more flexibility (e.g. they want to store positions and such in
the term dictionary), they can still use Pulsing.
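
To make the docfreq=1 case concrete, here is a minimal sketch of the idea in
Python. It is not Lucene's code: TermMeta, write_term, read_postings, and the
list standing in for the postings file are all hypothetical names chosen for
illustration, and real postings are delta/vInt encoded rather than stored as
tuples.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TermMeta:
    """Hypothetical per-term entry in the terms dictionary (not a Lucene class)."""
    doc_freq: int
    total_term_freq: int
    singleton_doc_id: Optional[int] = None  # set only when doc_freq == 1
    postings_fp: Optional[int] = None       # set only when doc_freq > 1


def write_term(postings: List[Tuple[int, int]], doc_file: list) -> TermMeta:
    """Write one term; postings is a list of (doc_id, freq) sorted by doc_id.

    doc_file is a plain list standing in for the postings file; an index into
    it plays the role of the file pointer.
    """
    doc_freq = len(postings)
    total_term_freq = sum(freq for _, freq in postings)
    if doc_freq == 1:
        # Singleton: keep the doc ID in the term entry and write nothing
        # to the postings file, so no file pointer is needed.
        return TermMeta(doc_freq, total_term_freq,
                        singleton_doc_id=postings[0][0])
    fp = len(doc_file)
    doc_file.append(list(postings))
    return TermMeta(doc_freq, total_term_freq, postings_fp=fp)


def read_postings(meta: TermMeta, doc_file: list) -> List[Tuple[int, int]]:
    """Return the (doc_id, freq) pairs for a term."""
    if meta.doc_freq == 1:
        # With a single document, freq equals total_term_freq, so it is
        # never stored separately and no seek into doc_file happens.
        return [(meta.singleton_doc_id, meta.total_term_freq)]
    return doc_file[meta.postings_fp]
```

In this sketch a primary-key term never touches doc_file at all, which is the
saved seek described above; positions and payloads are deliberately left out
because they would still go to their usual place.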
> Make PulsingCodec (wrapping StandardCodec) the default codec
> ------------------------------------------------------------
>
> Key: LUCENE-2492
> URL: https://issues.apache.org/jira/browse/LUCENE-2492
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/index
> Affects Versions: 4.0-ALPHA
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 4.1
>
>
> PulsingCodec can provide good gains by inlining the postings into the terms
> dict for rare terms. This is especially helpful for primary-key-like fields,
> since every term is rare and batch lookups are common (see
> http://chbits.blogspot.com/2010/06/lucenes-pulsingcodec-on-primary-key.html
> for a simple perf test), but it should also be a gain for ordinary fields,
> thanks to Zipf's law.
> I think we should make it the default....