[ https://issues.apache.org/jira/browse/LUCENE-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Toke Eskildsen updated LUCENE-1990:
-----------------------------------

    Attachment: LUCENE-1990-te20100210.patch

Changing the code to use bitsPerValue instead of maxValue for the constructors and the persistent format took a bit longer than anticipated. To get things flowing, I've attached the code as it is now. I've moved the classes to o.a.l.util.packed and performed some cleanup too. It still needs aligned32 and aligned64 implementations and more cleanup, which I'll work on for the next hour today and hopefully some hours tomorrow.

One current use-case for mutable packed ints would be StringOrdValComparator (using an auto-grow wrapper), although the gain might be small as the overhead of the Strings is so large. I understand the problem with making all packed ints mutable, but a compromise might be to have a Mutable interface and a new factory method that returns the same implementations as Mutable instead of Reader. That way the implementations can be used for things such as sorting instead of having to be re-implemented.

I've left the interface for Reader clean as you suggested, but kept the implementations of set in the classes for now, as the code has already been written.

> Add unsigned packed int impls in oal.util
> -----------------------------------------
>
>                 Key: LUCENE-1990
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1990
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael McCandless
>            Priority: Minor
>         Attachments: LUCENE-1990-te20100122.patch,
> LUCENE-1990-te20100210.patch, LUCENE-1990.patch,
> LUCENE-1990_PerformanceMeasurements20100104.zip
>
>
> There are various places in Lucene that could take advantage of an
> efficient packed unsigned int/long impl. EG the terms dict index in
> the standard codec in LUCENE-1458 could substantially reduce its RAM
> usage. FieldCache.StringIndex could as well. And I think "load into
> RAM" codecs like the one in TestExternalCodecs could use this too.
> I'm picturing something very basic like:
> {code}
> interface PackedUnsignedLongs {
>   long get(long index);
>   void set(long index, long value);
> }
> {code}
> Plus maybe an iterator for getting and maybe also for setting. If it
> helps, most of the usages of this inside Lucene will be "write once"
> so eg the set could make that an assumption/requirement.
> And a factory somewhere:
> {code}
>   PackedUnsignedLongs create(int count, long maxValue);
> {code}
> I think we should simply autogen the code (we can start from the
> autogen code in LUCENE-1410), or, if there is a good existing impl
> that has a compatible license that'd be great.
> I don't have time near-term to do this... so if anyone has the itch,
> please jump!
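
For illustration, the Reader/Mutable split and the bitsPerValue-based factory proposed in the comment above could look roughly like the following. This is a minimal sketch and not the code from the attached patch: the names PackedSketch, Packed64, bitsRequired and getMutable are hypothetical, int indexes are an assumption, and the layout (values packed back to back in a long[]) is only one possible choice. The aligned32/aligned64 variants mentioned above are not shown.

```java
/** Hypothetical sketch of a Reader/Mutable split; not taken from the patch. */
final class PackedSketch {
    /** Read-only view, kept free of set() as discussed in the comment. */
    interface Reader {
        long get(int index);
        int size();
        int getBitsPerValue();
    }

    /** Same implementations, exposed with set() for uses such as sorting. */
    interface Mutable extends Reader {
        void set(int index, long value);
    }

    /** Bits needed to store any value in [0, maxValue]; never less than 1. */
    static int bitsRequired(long maxValue) {
        if (maxValue < 0) {
            throw new IllegalArgumentException("maxValue must be >= 0");
        }
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }

    /** Factory keyed on bitsPerValue rather than maxValue. */
    static Mutable getMutable(int valueCount, int bitsPerValue) {
        return new Packed64(valueCount, bitsPerValue);
    }

    /** Values stored contiguously; a value may straddle two longs. */
    private static final class Packed64 implements Mutable {
        private final long[] blocks;
        private final int valueCount;
        private final int bitsPerValue;
        private final long mask;

        Packed64(int valueCount, int bitsPerValue) {
            if (valueCount < 0 || bitsPerValue < 1 || bitsPerValue > 64) {
                throw new IllegalArgumentException("bad valueCount or bitsPerValue");
            }
            this.valueCount = valueCount;
            this.bitsPerValue = bitsPerValue;
            this.mask = bitsPerValue == 64 ? -1L : (1L << bitsPerValue) - 1;
            long totalBits = (long) valueCount * bitsPerValue;
            this.blocks = new long[(int) ((totalBits + 63) >>> 6)];
        }

        public int size() { return valueCount; }

        public int getBitsPerValue() { return bitsPerValue; }

        public long get(int index) {
            checkIndex(index);
            long bitPos = (long) index * bitsPerValue;
            int block = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            long value = blocks[block] >>> shift;
            int bitsInFirst = 64 - shift;
            if (bitsInFirst < bitsPerValue) {
                // The high bits of the value live at the bottom of the next block.
                value |= blocks[block + 1] << bitsInFirst;
            }
            return value & mask;
        }

        public void set(int index, long value) {
            checkIndex(index);
            if ((value & ~mask) != 0) {
                throw new IllegalArgumentException("value does not fit in " + bitsPerValue + " bits");
            }
            long bitPos = (long) index * bitsPerValue;
            int block = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            // Clear the old bits, then write the low part of the new value.
            blocks[block] = (blocks[block] & ~(mask << shift)) | (value << shift);
            int bitsInFirst = 64 - shift;
            if (bitsInFirst < bitsPerValue) {
                // Write the part that spills into the next block.
                blocks[block + 1] = (blocks[block + 1] & ~(mask >>> bitsInFirst))
                        | (value >>> bitsInFirst);
            }
        }

        private void checkIndex(int index) {
            if (index < 0 || index >= valueCount) {
                throw new IndexOutOfBoundsException("index " + index);
            }
        }
    }
}
```

A caller that only knows the largest value it will store could still write PackedSketch.getMutable(count, PackedSketch.bitsRequired(maxValue)), and code that must not modify the data can hold the same object through the Reader type.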