[ https://issues.apache.org/jira/browse/LUCENE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555016#comment-13555016 ]
Michael McCandless commented on LUCENE-4620:
--------------------------------------------

+1

It's much faster than I had tested before (maybe because of the DV cutover!?):

{noformat}
    Task    QPS base   StdDev    QPS comp   StdDev    Pct diff
PKLookup      181.98   (1.2%)      182.20   (1.3%)     0.1% ( -2% -  2%)
 LowTerm       77.95   (2.0%)       83.59   (2.8%)     7.2% (  2% - 12%)
 MedTerm       26.60   (3.3%)       31.46   (1.4%)    18.3% ( 13% - 23%)
HighTerm       15.83   (3.9%)       19.35   (1.3%)    22.2% ( 16% - 28%)
{noformat}

> Explore IntEncoder/Decoder bulk API
> -----------------------------------
>
>                 Key: LUCENE-4620
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4620
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.1, 5.0
>
>         Attachments: LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch,
>                      LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch
>
>
> Today, IntEncoder/Decoder offer a streaming API, where you can encode(int)
> and decode(int). Originally, we believed that this layer could be useful for
> other scenarios, but in practice it's used only for writing/reading the
> category ordinals from payload/DV.
> Therefore, Mike and I would like to explore a bulk API, something like
> encode(IntsRef, BytesRef) and decode(BytesRef, IntsRef). Perhaps the Encoder
> can still be streaming (as we don't know in advance how many ints will be
> written), dunno. Will figure this out as we go.
> One thing to check is whether the bulk API can work w/ e.g. facet
> associations, which can write arbitrary byte[], and so maybe decoding to an
> IntsRef won't make sense. This too we'll figure out as we go. I don't rule
> out that associations will use a different bulk API.
> At the end of the day, the requirement is for someone to be able to configure
> how ordinals are written (i.e. different encoding schemes: VInt, PackedInts
> etc.) and later read, with as little overhead as possible.
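
For illustration, the bulk API floated in the issue description, encode(IntsRef, BytesRef) and decode(BytesRef, IntsRef), could be sketched roughly as below. This is a minimal sketch under stated assumptions, not the code in the attached patches: `Ints` and `Bytes` are hypothetical stand-ins for Lucene's IntsRef and BytesRef, the class and method names are invented for this example, VInt is just one of the encoding schemes the description mentions, and the decoder assumes well-formed input.

```java
import java.util.Arrays;

final class BulkVIntSketch {

    /** Hypothetical growable byte buffer (a stand-in, not Lucene's BytesRef). */
    static final class Bytes {
        byte[] bytes = new byte[16];
        int length;

        void append(byte b) {
            if (length == bytes.length) {
                bytes = Arrays.copyOf(bytes, bytes.length * 2);
            }
            bytes[length++] = b;
        }
    }

    /** Hypothetical growable int buffer (a stand-in, not Lucene's IntsRef). */
    static final class Ints {
        int[] ints = new int[16];
        int length;

        void append(int v) {
            if (length == ints.length) {
                ints = Arrays.copyOf(ints, ints.length * 2);
            }
            ints[length++] = v;
        }
    }

    /** Bulk encode: writes all values in one call, 7 payload bits per byte. */
    static void encode(Ints values, Bytes out) {
        out.length = 0; // reuse the caller's buffer
        for (int i = 0; i < values.length; i++) {
            int v = values.ints[i];
            // While more than 7 bits remain, emit the low 7 bits with the
            // continuation bit (0x80) set.
            while ((v & ~0x7F) != 0) {
                out.append((byte) ((v & 0x7F) | 0x80));
                v >>>= 7;
            }
            out.append((byte) v); // final byte, continuation bit clear
        }
    }

    /** Bulk decode: reads values until the input buffer is exhausted. */
    static void decode(Bytes in, Ints out) {
        out.length = 0; // reuse the caller's buffer
        int pos = 0;
        while (pos < in.length) {
            int value = 0;
            int shift = 0;
            byte b;
            do {
                b = in.bytes[pos++];
                value |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            out.append(value);
        }
    }

    public static void main(String[] args) {
        Ints ordinals = new Ints();
        for (int v : new int[] {3, 128, 300, 70000}) {
            ordinals.append(v);
        }
        Bytes buf = new Bytes();
        encode(ordinals, buf);

        Ints decoded = new Ints();
        decode(buf, decoded);
        System.out.println(Arrays.toString(Arrays.copyOf(decoded.ints, decoded.length)));
    }
}
```

Because both calls take caller-owned buffers and reset their length rather than allocating, the same `Ints` and `Bytes` instances can be reused across documents, which is the kind of overhead reduction the description asks for.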