[jira] [Commented] (LUCENE-7521) Simplify PackedInts
[ https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927578#comment-16927578 ]

ASF subversion and git services commented on LUCENE-7521:
---------------------------------------------------------

Commit c514b29b24138405bac8bd30ca33aa24980a998d in lucene-solr's branch refs/heads/master from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c514b29 ]

LUCENE-7521: Simplify PackedInts.

> Simplify PackedInts
> -------------------
>
> Key: LUCENE-7521
> URL: https://issues.apache.org/jira/browse/LUCENE-7521
> Project: Lucene - Core
> Issue Type: Task
> Reporter: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-7521.patch
>
>
> We have a lot of specialization in PackedInts about how to keep packed arrays
> of longs in memory. However, most use-cases have slowly moved to DirectWriter
> and DirectMonotonicWriter and most specializations we have are barely used
> for performance-sensitive operations, so I'd like to clean this up a bit.

--
This message was sent by Atlassian Jira (v8.3.2#803003)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7521) Simplify PackedInts
[ https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923212#comment-16923212 ]

Adrien Grand commented on LUCENE-7521:
--------------------------------------

I'd like to move forward on this. Hopefully, since we last discussed this cleanup, more users have taken the time to move from FieldCache to doc values, which has been our recommendation for a very long time now. I will only push this change to master.
[jira] [Commented] (LUCENE-7521) Simplify PackedInts
[ https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691118#comment-15691118 ]

Toke Eskildsen commented on LUCENE-7521:
----------------------------------------

Just to be clear, I was only talking about the bit structures optimal-packed vs. word-aligned. Whether or not the performance/complexity trade-off is good enough for keeping the Direct*- and Packed*ThreeBlocks-implementations (which are both optimal-packed and word-aligned by nature) is harder to judge from the old performance tests: They differ a lot more across CPU architectures.
[jira] [Commented] (LUCENE-7521) Simplify PackedInts
[ https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690275#comment-15690275 ]

Yonik Seeley commented on LUCENE-7521:
--------------------------------------

The rationale for this patch looked to be "most specializations we have are barely used for performance-sensitive operations". I would definitely categorize FieldCache usage in Solr as performance-sensitive, so I thought maybe you were only talking about usage in Lucene (since the FieldCache was recently moved from Lucene to Solr). Is this the case?

In the interests of being data-driven, it seemed like the approach should be to either:
1) benchmark Solr's FieldCache usage w/o Direct* and Packed*ThreeBlocks impls and see what the impact is
2) make a copy of PackedInts in Solr (and use it in the FieldCache) so Lucene can immediately simplify PackedInts w/o regard for FieldCache (#1 could still be done at a later time)
[jira] [Commented] (LUCENE-7521) Simplify PackedInts
[ https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690132#comment-15690132 ]

Adrien Grand commented on LUCENE-7521:
--------------------------------------

Thanks Toke for pointing to these benchmarks. I agree the gain is not worth the complexity.
[jira] [Commented] (LUCENE-7521) Simplify PackedInts
[ https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15689447#comment-15689447 ]

Toke Eskildsen commented on LUCENE-7521:
----------------------------------------

I was involved in the original PackedInts implementation, where I did quite a bit of performance testing of the two different approaches: Optimal memory packing (Packed64) and word-aligned packing (Packed64SingleBlock). They were named differently back then, but the principles and the performance-relevant code parts were about the same. The JIRA is LUCENE-1990. The conclusion then was that aligned won in a few cases but added quite a lot of complexity, so it was scrapped.

Two years later the aligned version was re-introduced in LUCENE-4062. Again there was some performance testing. Performance characteristics differed depending on CPU architecture and in-memory array size (cache utilization really). Overall it seemed that aligned packing was faster, but not by much on the i7 (desktop & Xeon).

One important observation from the JIRA is that only the BPVs (Bits Per Value) 3, 5, 6, 7, 9, 10, 12 and 21 differ in representation (and get/set algorithm) between packed and aligned. There are some poor graphs from an old comparison of those values on http://ekot.dk/misc/packedints/padding.html where contiguous=packed and padding=aligned. This was for a small (10M values, AFAIR) set. Note how the performance difference between the implementations varies a lot, depending on CPU type.

Long story longer, I still favour having only 1 underlying format ("optimal" packed): Too little gain in too few cases for a high code complexity cost with aligned.

On a related note, a high-quality micro-benchmark for structures like these would be great.
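To make the packed vs. aligned distinction in the comment above concrete, here is a minimal Java sketch of the two layouts. It is not the Lucene Packed64 or Packed64SingleBlock code: the class name `PackingSketch` and all method names are hypothetical, and values are assumed to fit in `bpv` bits with `bpv` between 1 and 32. The last method rests on an assumption that is not stated in the thread: that an aligned format is only worth using at the largest BPV for a given number of values per 64-bit block, and only differs from contiguous packing when the block has padding bits left over. Under that assumption the sketch yields the same list of BPVs quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; hypothetical names, not Lucene's implementation.
class PackingSketch {

    // Contiguous ("optimal") packing: value i starts at bit i * bpv
    // and may straddle two adjacent longs.
    static void setPacked(long[] blocks, int bpv, int index, long value) {
        long bitPos = (long) index * bpv;
        int block = (int) (bitPos >>> 6);
        int offset = (int) (bitPos & 63);
        long mask = (1L << bpv) - 1;
        blocks[block] = (blocks[block] & ~(mask << offset)) | (value << offset);
        int bitsInFirst = 64 - offset;
        if (bitsInFirst < bpv) { // remaining high bits go into the next long
            blocks[block + 1] = (blocks[block + 1] & ~(mask >>> bitsInFirst))
                | (value >>> bitsInFirst);
        }
    }

    static long getPacked(long[] blocks, int bpv, int index) {
        long bitPos = (long) index * bpv;
        int block = (int) (bitPos >>> 6);
        int offset = (int) (bitPos & 63);
        long result = blocks[block] >>> offset;
        int bitsInFirst = 64 - offset;
        if (bitsInFirst < bpv) {
            result |= blocks[block + 1] << bitsInFirst;
        }
        return result & ((1L << bpv) - 1);
    }

    // Word-aligned (padded) packing: each long holds floor(64 / bpv) values,
    // no value crosses a block boundary, leftover bits are padding.
    static void setAligned(long[] blocks, int bpv, int index, long value) {
        int valuesPerBlock = 64 / bpv;
        int block = index / valuesPerBlock;
        int shift = (index % valuesPerBlock) * bpv;
        long mask = (1L << bpv) - 1;
        blocks[block] = (blocks[block] & ~(mask << shift)) | (value << shift);
    }

    static long getAligned(long[] blocks, int bpv, int index) {
        int valuesPerBlock = 64 / bpv;
        int block = index / valuesPerBlock;
        int shift = (index % valuesPerBlock) * bpv;
        return (blocks[block] >>> shift) & ((1L << bpv) - 1);
    }

    // Assumption (see lead-in): keep a BPV only if it is the largest one for
    // its values-per-block count and the block has padding bits left over.
    static List<Integer> differingBitsPerValue() {
        List<Integer> result = new ArrayList<>();
        for (int bpv = 1; bpv <= 32; bpv++) {
            boolean largestForCount = 64 / (bpv + 1) < 64 / bpv;
            boolean hasPadding = 64 % bpv != 0;
            if (largestForCount && hasPadding) {
                result.add(bpv);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(differingBitsPerValue()); // [3, 5, 6, 7, 9, 10, 12, 21]
        long[] packed = new long[4];  // 10 values * 21 bits = 210 bits
        long[] aligned = new long[4]; // 3 values per long at 21 bits
        for (int i = 0; i < 10; i++) {
            setPacked(packed, 21, i, 1_000_000L + i);
            setAligned(aligned, 21, i, 1_000_000L + i);
        }
        System.out.println(getPacked(packed, 21, 9) + " " + getAligned(aligned, 21, 9)); // 1000009 1000009
    }
}
```

The aligned getter needs a division and a modulo but never reads a second long, while the contiguous getter uses only shifts but sometimes touches two longs; that is the kind of trade-off the old benchmarks were measuring.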
[jira] [Commented] (LUCENE-7521) Simplify PackedInts
[ https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15608318#comment-15608318 ]

Yonik Seeley commented on LUCENE-7521:
--------------------------------------

I assume these specializations were added because they made a difference (and hence still do make a difference in Solr when using the FieldCache). I'd certainly want to do a bunch of performance testing to try and quantify worst-case slowdowns before making any decision to ditch them or not. Seems easier just to keep them (in lucene or solr).
[jira] [Commented] (LUCENE-7521) Simplify PackedInts
[ https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606537#comment-15606537 ]

Adrien Grand commented on LUCENE-7521:
--------------------------------------

Option 2 sounds easier so I can look into it if you think it is necessary.
[jira] [Commented] (LUCENE-7521) Simplify PackedInts
[ https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605644#comment-15605644 ]

Yonik Seeley commented on LUCENE-7521:
--------------------------------------

OK, so to avoid slowdowns (or more memory usage) in the FieldCache, some options include:
- Make PackedInts extensible and move the unwanted-by-lucene implementations to Solr
- Since PackedInts is so tied to the FieldCache, simply copy the whole "packed" package to Solr, as was done with the "uninverted" package
[jira] [Commented] (LUCENE-7521) Simplify PackedInts
[ https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605482#comment-15605482 ]

Adrien Grand commented on LUCENE-7521:
--------------------------------------

Yes they are.
[jira] [Commented] (LUCENE-7521) Simplify PackedInts
[ https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605358#comment-15605358 ]

Yonik Seeley commented on LUCENE-7521:
--------------------------------------

Are any of these formats used by FieldCacheImpl (which was moved to Solr)? It's hard to tell at first blush, I may have to resort to prints...
[jira] [Commented] (LUCENE-7521) Simplify PackedInts
[ https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605337#comment-15605337 ]

Michael McCandless commented on LUCENE-7521:
--------------------------------------------

+1! Look at all that removed code :)