[jira] [Commented] (LUCENE-7521) Simplify PackedInts

2019-09-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927578#comment-16927578
 ] 

ASF subversion and git services commented on LUCENE-7521:
-

Commit c514b29b24138405bac8bd30ca33aa24980a998d in lucene-solr's branch 
refs/heads/master from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c514b29 ]

LUCENE-7521: Simplify PackedInts.


> Simplify PackedInts
> ---
>
> Key: LUCENE-7521
> URL: https://issues.apache.org/jira/browse/LUCENE-7521
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7521.patch
>
>
> We have a lot of specialization in PackedInts about how to keep packed arrays 
> of longs in memory. However, most use-cases have slowly moved to DirectWriter 
> and DirectMonotonicWriter and most specializations we have are barely used 
> for performance-sensitive operations, so I'd like to clean this up a bit.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7521) Simplify PackedInts

2019-09-05 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923212#comment-16923212
 ] 

Adrien Grand commented on LUCENE-7521:
--

I'd like to move forward on this. Hopefully, since we last discussed this 
cleanup, more users have taken the time to move from FieldCache to doc values, 
which has been our recommendation for a very long time now. I will only push 
this change to master.




[jira] [Commented] (LUCENE-7521) Simplify PackedInts

2016-11-23 Thread Toke Eskildsen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691118#comment-15691118
 ] 

Toke Eskildsen commented on LUCENE-7521:


Just to be clear, I was only talking about the bit structures optimal-packed 
vs. word-aligned. Whether or not the performance/complexity trade-off is good 
enough for keeping the Direct*- and Packed*ThreeBlocks-implementations (which 
are both optimal-packed and word-aligned by nature) is harder to judge from the 
old performance tests: They differ a lot more across CPU architectures.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Commented] (LUCENE-7521) Simplify PackedInts

2016-11-23 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690275#comment-15690275
 ] 

Yonik Seeley commented on LUCENE-7521:
--

The rationale for this patch looked to be "most specializations we have are 
barely used for performance-sensitive operations".
I would definitely categorize FieldCache usage in Solr as 
performance-sensitive, so I thought maybe you were only talking about usage in 
Lucene (since the FieldCache was recently moved from Lucene to Solr).  Is this 
the case?

In the interests of being data-driven, it seemed like the approach should be to 
either:
1) benchmark solr's fieldcache usage w/o Direct* and Packed*ThreeBlocks impls 
and see what the impact is
2) make a copy of PackedInts in Solr (and use it in the FieldCache) so Lucene 
can immediately simplify PackedInts w/o regard for FieldCache (#1 could still 
be done at a later time)






[jira] [Commented] (LUCENE-7521) Simplify PackedInts

2016-11-23 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690132#comment-15690132
 ] 

Adrien Grand commented on LUCENE-7521:
--

Thanks Toke for pointing to these benchmarks. I agree the gain is not worth the 
complexity.




[jira] [Commented] (LUCENE-7521) Simplify PackedInts

2016-11-23 Thread Toke Eskildsen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15689447#comment-15689447
 ] 

Toke Eskildsen commented on LUCENE-7521:


I was involved in the original PackedInts implementation, where I did quite a 
bit of performance testing of the two different approaches: Optimal memory 
packing (Packed64) and word-aligned packing (Packed64SingleBlock). They were 
named differently back then, but the principles and the performance-relevant 
code parts were about the same. The JIRA is LUCENE-1990. The conclusion then 
was that aligned won in a few cases but added quite a lot of complexity, so it 
was scrapped.
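
To make the two layouts concrete, here is a minimal sketch of contiguous 
("optimal") packing versus word-aligned packing. It is not the Packed64 or 
Packed64SingleBlock code from Lucene; the class names, the LSB-first bit order 
and the method shapes are assumptions chosen for illustration only.

```java
public class PackingSketch {

    /** Hypothetical contiguous layout: values sit back to back and may straddle two longs. */
    static final class Contiguous {
        final long[] blocks;
        final int bpv;
        final long mask;

        Contiguous(int valueCount, int bpv) {
            this.bpv = bpv;
            this.mask = bpv == 64 ? -1L : (1L << bpv) - 1;
            // ceil(valueCount * bpv / 64) longs, no padding between values
            this.blocks = new long[(int) (((long) valueCount * bpv + 63) >>> 6)];
        }

        void set(int index, long value) {
            long bitPos = (long) index * bpv;
            int block = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            blocks[block] = (blocks[block] & ~(mask << shift)) | ((value & mask) << shift);
            int spill = shift + bpv - 64; // bits that do not fit in this long
            if (spill > 0) {
                int kept = bpv - spill;   // bits already stored in the first long
                blocks[block + 1] = (blocks[block + 1] & ~(mask >>> kept)) | ((value & mask) >>> kept);
            }
        }

        long get(int index) {
            long bitPos = (long) index * bpv;
            int block = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            long v = blocks[block] >>> shift;
            int spill = shift + bpv - 64;
            if (spill > 0) {
                // the high bits of the value live in the low bits of the next long
                v |= blocks[block + 1] << (bpv - spill);
            }
            return v & mask;
        }
    }

    /** Hypothetical word-aligned layout: a value never crosses a long; leftover bits are padding. */
    static final class Aligned {
        final long[] blocks;
        final int bpv;
        final int valuesPerBlock;
        final long mask;

        Aligned(int valueCount, int bpv) {
            this.bpv = bpv;
            this.valuesPerBlock = 64 / bpv;
            this.mask = bpv == 64 ? -1L : (1L << bpv) - 1;
            this.blocks = new long[(valueCount + valuesPerBlock - 1) / valuesPerBlock];
        }

        void set(int index, long value) {
            int block = index / valuesPerBlock;
            int shift = (index % valuesPerBlock) * bpv;
            blocks[block] = (blocks[block] & ~(mask << shift)) | ((value & mask) << shift);
        }

        long get(int index) {
            int block = index / valuesPerBlock;
            int shift = (index % valuesPerBlock) * bpv;
            return (blocks[block] >>> shift) & mask;
        }
    }

    public static void main(String[] args) {
        int count = 1000, bpv = 21;
        Contiguous c = new Contiguous(count, bpv);
        Aligned a = new Aligned(count, bpv);
        for (int i = 0; i < count; i++) {
            c.set(i, i);
            a.set(i, i);
        }
        System.out.println("contiguous longs: " + c.blocks.length + ", aligned longs: " + a.blocks.length);
        System.out.println("value 999 reads back as " + c.get(999) + " / " + a.get(999));
    }
}
```

The trade-off shows up in `get`: the aligned variant has no branch for values 
that straddle a block boundary, at the price of a division and some unused 
bits per long. Whether that is faster in practice is exactly what the 
benchmarks in this thread were trying to measure.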

Two years later the aligned version was re-introduced in LUCENE-4062. Again 
there was some performance testing. Performance characteristics differed 
depending on CPU architecture and in-memory array size (cache utilization 
really). Overall it seemed that aligned packing was faster, but not by much on 
the i7 (desktop & Xeon). 

One important observation from the JIRA is that it is only the BPVs (Bits Per 
Value) 3, 5, 6, 7, 9, 10, 12 and 21 that differ in representation (and get/set 
algorithm) between packed and aligned. There are some poor graphs from an old 
comparison of those values on http://ekot.dk/misc/packedints/padding.html where 
contiguous=packed and padding=aligned. This was for a small (10M values, AFAIR) 
set. Note how the performance difference between the implementations varies a 
lot, depending on CPU type.
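
The arithmetic behind that list can be sketched as follows: none of the listed 
widths divides 64 evenly, so a word-aligned layout leaves padding bits in every 
long, whereas for widths that do divide 64 (such as 8 or 16) the two layouts 
are bit-for-bit identical. The helper names below are hypothetical and the 
64-bit block size is an assumption of this sketch.

```java
public class AlignedPaddingSketch {

    /** Number of whole values that fit in one 64-bit block without straddling. */
    static int valuesPerBlock(int bpv) {
        return 64 / bpv;
    }

    /** Bits left unused at the end of each 64-bit block in an aligned layout. */
    static int paddingBits(int bpv) {
        return 64 - valuesPerBlock(bpv) * bpv;
    }

    public static void main(String[] args) {
        // The bit widths listed in the comment above.
        int[] bpvs = {3, 5, 6, 7, 9, 10, 12, 21};
        for (int bpv : bpvs) {
            System.out.println("bpv=" + bpv
                + " values/block=" + valuesPerBlock(bpv)
                + " padding bits=" + paddingBits(bpv));
        }
    }
}
```

For example, at 21 bits per value three values occupy 63 bits and one bit per 
long is padding, while at 10 bits per value six values occupy 60 bits and four 
bits are padding.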

Long story longer, I still favour having only 1 underlying format ("optimal" 
packed): Too little gain in too few cases for a high code complexity cost with 
aligned. On a related note, a high-quality micro-benchmark for structures like 
these would be great.




[jira] [Commented] (LUCENE-7521) Simplify PackedInts

2016-10-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15608318#comment-15608318
 ] 

Yonik Seeley commented on LUCENE-7521:
--

I assume these specializations were added because they made a difference (and 
hence still do make a difference in Solr when using the FieldCache).  I'd 
certainly want to do a bunch of performance testing to try and quantify 
worst-case slowdowns before making any decision to ditch them or not.  Seems 
easier just to keep them (in lucene or solr).  




[jira] [Commented] (LUCENE-7521) Simplify PackedInts

2016-10-25 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606537#comment-15606537
 ] 

Adrien Grand commented on LUCENE-7521:
--

Option 2 sounds easier so I can look into it if you think it is necessary.




[jira] [Commented] (LUCENE-7521) Simplify PackedInts

2016-10-25 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605644#comment-15605644
 ] 

Yonik Seeley commented on LUCENE-7521:
--

OK, so to avoid slowdowns (or more memory usage) in the FieldCache, some 
options include:
- Make PackedInts extensible and move the unwanted-by-lucene implementations to 
Solr
- Since PackedInts is so tied to the FieldCache, simply copy the whole "packed" 
package to Solr, like was done with the "uninverted" package
 




[jira] [Commented] (LUCENE-7521) Simplify PackedInts

2016-10-25 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605482#comment-15605482
 ] 

Adrien Grand commented on LUCENE-7521:
--

Yes they are. 




[jira] [Commented] (LUCENE-7521) Simplify PackedInts

2016-10-25 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605358#comment-15605358
 ] 

Yonik Seeley commented on LUCENE-7521:
--

Are any of these formats used by FieldCacheImpl (which was moved to Solr)?
It's hard to tell at first blush, I may have to resort to prints...

> Simplify PackedInts
> ---
>
> Key: LUCENE-7521
> URL: https://issues.apache.org/jira/browse/LUCENE-7521
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7521.patch
>
>
> We have a lot of specialization in PackedInts about how to keep packed arrays 
> of longs in memory. However, most use-cases have slowly moved to DirectWriter 
> and DirectMonotonicWriter and most specializations we have are barely used 
> for performance-sensitive operations, so I'd like to clean this up a bit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7521) Simplify PackedInts

2016-10-25 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605337#comment-15605337
 ] 

Michael McCandless commented on LUCENE-7521:


+1!

Look at all that removed code :)
