[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-18 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14632379#comment-14632379
 ] 

Dawid Weiss commented on SOLR-7787:
---

Darn, my bad -- thanks [~steve_rowe]!

 Fork HyperLogLog and remove fastutil dependency
 ---

 Key: SOLR-7787
 URL: https://issues.apache.org/jira/browse/SOLR-7787
 Project: Solr
  Issue Type: Task
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
 Fix For: 5.3, Trunk

 Attachments: SOLR-7787.patch


 So fastutil is now part of Solr's distribution (because the stats component 
 uses hyperloglog library, which in turn requires fastutil). I looked at the 
 actual uses of fastutil and only java-hll uses it (and only a few classes).
 I've created a fork that uses HPPC instead (also randomized all tests, they 
 pass). Since it's a relatively simple package I think it could be forked and 
 imported into Solr's codebase entirely. I'd make a pull request but I see 
 Hoss also created a few comments/ PRs and none of them received any 
 attention; the project seems to be stale or dead?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631257#comment-14631257
 ] 

ASF subversion and git services commented on SOLR-7787:
---

Commit 1691541 from [~dawidweiss] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1691541 ]

SOLR-7787: Removed fastutil and java-hll dependency, integrated HyperLogLog 
from java-hll into Solr core. (committed from solr/ folder; note to Uwe: this 
is why I hate SVN...).




[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631254#comment-14631254
 ] 

ASF subversion and git services commented on SOLR-7787:
---

Commit 1691535 from [~dawidweiss] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1691535 ]

SOLR-7787: Removed fastutil and java-hll dependency, integrated HyperLogLog 
from java-hll into Solr core. (committed from solr/ folder; note to Uwe: this 
is why I hate SVN...).




[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631246#comment-14631246
 ] 

ASF subversion and git services commented on SOLR-7787:
---

Commit 1691531 from [~dawidweiss] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1691531 ]

SOLR-7787: Removed fastutil and java-hll dependency, integrated HyperLogLog 
from java-hll into Solr core.




[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631207#comment-14631207
 ] 

ASF subversion and git services commented on SOLR-7787:
---

Commit 1691518 from [~dawidweiss] in branch 'dev/trunk'
[ https://svn.apache.org/r1691518 ]

SOLR-7787: Removed fastutil and java-hll dependency, integrated HyperLogLog 
from java-hll into Solr core.




[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631558#comment-14631558
 ] 

ASF subversion and git services commented on SOLR-7787:
---

Commit 1691609 from [~steve_rowe] in branch 'dev/trunk'
[ https://svn.apache.org/r1691609 ]

SOLR-7787: Remove obsolete fastutil and hll files in solr/licenses/




[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631560#comment-14631560
 ] 

ASF subversion and git services commented on SOLR-7787:
---

Commit 1691610 from [~steve_rowe] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1691610 ]

SOLR-7787: Remove obsolete fastutil and hll files in solr/licenses/ (merged 
trunk r1691609)




[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-16 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629536#comment-14629536
 ] 

ASF subversion and git services commented on SOLR-7787:
---

Commit 1691350 from [~dawidweiss] in branch 'dev/branches/solr7787'
[ https://svn.apache.org/r1691350 ]

SOLR-7787 (jhll integration).




[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-16 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629832#comment-14629832
 ] 

Yonik Seeley commented on SOLR-7787:


When HLL was first added, I remember scanning the code quickly and noting that 
the explicit storage rehashed the given value (which is wasted effort given 
that the provided values already need to be good hashes to start with).  Is 
there a way to avoid that?  Prob not a big deal though.
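
To make the rehashing point concrete, here is a minimal sketch of an explicit (exact) storage that takes the incoming value as the hash it already is and derives the slot index from its low bits, with no second hash function. The class PrehashedLongSet and its methods are hypothetical names made up for this illustration; this is not the java-hll code and not the code that went into Solr, and it assumes callers really do pass well-mixed 64-bit hashes.

```java
// Hypothetical illustration only: an open-addressing set of already-hashed longs.
public class PrehashedLongSet {
    private long[] slots = new long[16];      // capacity is always a power of two
    private boolean[] used = new boolean[16]; // marks occupied slots (0 is a legal value)
    private int size;

    /** Adds a value that is assumed to already be a good 64-bit hash. */
    public boolean add(long hashed) {
        // Keep the load factor at or below 0.5 so linear probing always finds a free slot.
        if ((size + 1) * 2 > slots.length) {
            grow();
        }
        return insert(hashed);
    }

    public int size() {
        return size;
    }

    private boolean insert(long hashed) {
        int mask = slots.length - 1;
        // The low bits of the value are used directly as the slot index: no rehash.
        int i = ((int) hashed) & mask;
        while (used[i]) {
            if (slots[i] == hashed) {
                return false; // already present
            }
            i = (i + 1) & mask; // linear probing
        }
        used[i] = true;
        slots[i] = hashed;
        size++;
        return true;
    }

    private void grow() {
        long[] oldSlots = slots;
        boolean[] oldUsed = used;
        slots = new long[oldSlots.length * 2];
        used = new boolean[slots.length];
        size = 0;
        for (int i = 0; i < oldSlots.length; i++) {
            if (oldUsed[i]) {
                insert(oldSlots[i]);
            }
        }
    }

    public static void main(String[] args) {
        PrehashedLongSet set = new PrehashedLongSet();
        set.add(0x9E3779B97F4A7C15L);
        set.add(0x9E3779B97F4A7C15L); // duplicate, ignored
        set.add(0L);
        System.out.println(set.size());
    }
}
```

The trade-off is that inputs which are not well distributed in their low bits would cluster badly, which is presumably why a general-purpose collection rehashes by default.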




[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-16 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629892#comment-14629892
 ] 

Dawid Weiss commented on SOLR-7787:
---

Yeah, I honestly don't think those storage optimizations are worth the effort 
either, but I think the code can be tweaked after it's integrated -- I wouldn't 
want to tinker with it in this patch (can you file another issue?).




[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-15 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627784#comment-14627784
 ] 

Dawid Weiss commented on SOLR-7787:
---

This is the PR; let's see if it receives any attention. Otherwise I'll try to 
strip out the unnecessary stuff and import it directly into Solr's codebase.

https://github.com/aggregateknowledge/java-hll/pull/16




[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-14 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626656#comment-14626656
 ] 

Hoss Man commented on SOLR-7787:


I don't have any strong opinions about this issue as it currently stands (i.e. 
"Fork HyperLogLog and remove fastutil dependency") -- as you mentioned, the 
java-hll project, with its original goals of a cross-language binary format 
protocol, doesn't appear to be very active anymore -- but for the record, can 
you please clarify the context/objective of this issue?

It started out as "Promote fastutil to first-order dependency" -- and your goal 
seemed to be to remove HPPC. Now you seem to have flipped the objective so that 
the focus is on keeping HPPC and eliminating fastutil.

Having a clear understanding of the goal for either refactoring/forking/etc. 
would be helpful for folks to develop an informed opinion.




[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-14 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626799#comment-14626799
 ] 

Dawid Weiss commented on SOLR-7787:
---

Well, I thought lack of initial feedback was lazy consensus :)

And seriously -- it can go either way. My original intention was to update HPPC 
which is duplicated in Solr and the clustering contrib. These have to be 
consistent. From there I observed that:

1) Solr uses HPPC in a small number of classes,
2) fastutil is present in Solr's lib, but it's not used in any classes; it is 
just a transitive dependency from hll,
3) fastutil is much larger than HPPC (roughly 15x).

I am really ok with any option. I decided to cut fastutil out because it's just 
a much larger library (of which almost nothing is practically used in Solr). 
Also, hll's implementation uses fastutil iterators very inefficiently (causing 
intermediate autoboxing on every value) so I thought I'd take a stab at 
improving that while converting to HPPC.

I can also leave everything as-is, really, but the patch I have locally seems 
like a nice improvement on its own (and could be perhaps pruned further to get 
rid of unnecessary stuff like serialization, etc.).
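
To illustrate the autoboxing remark above: the thread does not show the offending code, so the following is only a sketch of the general pattern, written against the JDK's own PrimitiveIterator.OfLong rather than fastutil or HPPC. The assumption is that the inefficiency has this shape: a primitive-capable iterator consumed through its generic next() method, which wraps each long in a java.lang.Long that the caller immediately unboxes again. The class and method names (BoxingSketch, sumViaBoxedNext, sumViaPrimitiveNext) are made up for the example.

```java
import java.util.PrimitiveIterator;
import java.util.stream.LongStream;

// Hypothetical illustration only; not taken from java-hll or Solr.
public class BoxingSketch {

    /** Consumes the iterator through the generic Iterator<Long> contract. */
    public static long sumViaBoxedNext(long[] values) {
        PrimitiveIterator.OfLong it = LongStream.of(values).iterator();
        long sum = 0;
        while (it.hasNext()) {
            Long boxed = it.next(); // each element is wrapped in a java.lang.Long
            sum += boxed;           // and unboxed again right away
        }
        return sum;
    }

    /** Consumes the same iterator through its primitive accessor. */
    public static long sumViaPrimitiveNext(long[] values) {
        PrimitiveIterator.OfLong it = LongStream.of(values).iterator();
        long sum = 0;
        while (it.hasNext()) {
            sum += it.nextLong(); // stays a primitive long, no wrapper object
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] values = new long[1000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i * 1_000_003L;
        }
        // Both paths compute the same result; only the per-element cost differs.
        System.out.println(sumViaBoxedNext(values) == sumViaPrimitiveNext(values));
    }
}
```

Both methods return the same sum; the difference is the intermediate wrapper per element on the boxed path, which is the kind of overhead a primitive-collections library is meant to avoid.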





[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-14 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626975#comment-14626975
 ] 

Dawid Weiss commented on SOLR-7787:
---

bq. Got it – so the primary concern is/was dependency version consistency, and 
in the process of going down that rabbit hole, you realized forking HLL to 
eliminate the dep on fastutil seemed like the best win all around.

Yes, that's exactly what happened.

bq. I wonder though would hll accept a patch to remove the dependency (or make 
it optional)?

It's not a complete removal -- I simply replaced fastutil with HPPC (since Solr 
already uses it in a couple of places anyway). I also did a few other changes 
the author may not be too happy with (replaced testng with junit, replaced 
hardcoded randomization seeds with randomizedtesting, etc.) in preparation of 
importing the code into Solr's codebase. I will submit a PR too, but there 
seems to be a general lack of interest in this code from the original author, 
see Hoss's unaddressed question here:

https://github.com/aggregateknowledge/java-hll/issues/15

and Timon Karnezos doesn't seem to be too active recently: 
https://github.com/timonk?tab=contributions&period=monthly









[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-14 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626953#comment-14626953
 ] 

Hoss Man commented on SOLR-7787:


bq. My original intention was to update HPPC which is duplicated in Solr and 
the clustering contrib. These have to be consistent. From there I observed that:
...
bq. I am really ok with any option. I decided to cut fastutil out because it's 
just a much larger library (of which almost nothing is practically used in 
Solr). Also, hll's implementation uses fastutil iterators very inefficiently 
(causing intermediate autoboxing on every value) so I thought I'd take a stab 
at improving that while converting to HPPC.

Got it -- so the primary concern is/was dependency version consistency, and in 
the process of going down that rabbit hole, you realized forking HLL to 
eliminate the dep on fastutil seemed like the best win all around.

Sounds like a good plan of attack to me.




[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-14 Thread Upayavira (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626927#comment-14626927
 ] 

Upayavira commented on SOLR-7787:
-

I can see the rationale of forking. I wonder though would hll accept a patch to 
remove the dependency (or make it optional)? Thus removing the need for us to 
maintain a fork?




[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency

2015-07-14 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626920#comment-14626920
 ] 

David Smiley commented on SOLR-7787:


I think we should avoid very large dependencies that we barely use.  +1 to 
forking HLL so that we needn't use fastutil.  

Perhaps it would be useful to fork HLL publicly (e.g. on GitHub, published to 
Maven) as its own third-party dependency -- I imagine others aren't too keen 
on pulling in a 17MB library that they aren't already using.  Ideally this 
would be done with the original library owner's blessing, using their Maven 
groupId but a different artifactId, but that isn't necessary, of course.
