[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14632379#comment-14632379 ] Dawid Weiss commented on SOLR-7787: --- Darn, my bad -- thanks [~steve_rowe]! Fork HyperLogLog and remove fastutil dependency --- Key: SOLR-7787 URL: https://issues.apache.org/jira/browse/SOLR-7787 Project: Solr Issue Type: Task Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Trivial Fix For: 5.3, Trunk Attachments: SOLR-7787.patch So fastutil is now part of Solr's distribution (because the stats component uses hyperloglog library, which in turn requires fastutil). I looked at the actual uses of fastutil and only java-hll uses it (and only a few classes). I've created a fork that uses HPPC instead (also randomized all tests, they pass). Since it's a relatively simple package I think it could be forked and imported into Solr's codebase entirely. I'd make a pull request but I see Hoss also created a few comments/ PRs and none of them received any attention; the project seems to be stale or dead? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631257#comment-14631257 ]
ASF subversion and git services commented on SOLR-7787:
Commit 1691541 from [~dawidweiss] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1691541 ]
SOLR-7787: Removed fastutil and java-hll dependency, integrated HyperLogLog from java-hll into Solr core. (committed from solr/ folder; note to Uwe: this is why I hate SVN...).
[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631254#comment-14631254 ]
ASF subversion and git services commented on SOLR-7787:
Commit 1691535 from [~dawidweiss] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1691535 ]
SOLR-7787: Removed fastutil and java-hll dependency, integrated HyperLogLog from java-hll into Solr core. (committed from solr/ folder; note to Uwe: this is why I hate SVN...).
[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631246#comment-14631246 ]
ASF subversion and git services commented on SOLR-7787:
Commit 1691531 from [~dawidweiss] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1691531 ]
SOLR-7787: Removed fastutil and java-hll dependency, integrated HyperLogLog from java-hll into Solr core.
[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631207#comment-14631207 ]
ASF subversion and git services commented on SOLR-7787:
Commit 1691518 from [~dawidweiss] in branch 'dev/trunk' [ https://svn.apache.org/r1691518 ]
SOLR-7787: Removed fastutil and java-hll dependency, integrated HyperLogLog from java-hll into Solr core.
[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631558#comment-14631558 ]
ASF subversion and git services commented on SOLR-7787:
Commit 1691609 from [~steve_rowe] in branch 'dev/trunk' [ https://svn.apache.org/r1691609 ]
SOLR-7787: Remove obsolete fastutil and hll files in solr/licenses/
[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631560#comment-14631560 ]
ASF subversion and git services commented on SOLR-7787:
Commit 1691610 from [~steve_rowe] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1691610 ]
SOLR-7787: Remove obsolete fastutil and hll files in solr/licenses/ (merged trunk r1691609)
[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629536#comment-14629536 ]
ASF subversion and git services commented on SOLR-7787:
Commit 1691350 from [~dawidweiss] in branch 'dev/branches/solr7787' [ https://svn.apache.org/r1691350 ]
SOLR-7787 (jhll integration).
[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629832#comment-14629832 ]
Yonik Seeley commented on SOLR-7787:
When HLL was first added, I remember scanning the code quickly and noting that the explicit storage rehashed the given value (which is wasted effort given that the provided values already need to be good hashes to start with). Is there a way to avoid that? Prob not a big deal though.
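To illustrate the point about rehashing, here is a minimal sketch of a set that places already-hashed long values directly by their low bits instead of mixing them a second time. The class name PrehashedLongSet and its methods are hypothetical; this is not the java-hll or HPPC implementation, and it assumes (as the comment above does) that callers only pass well-distributed hash values. It uses fixed-capacity linear probing and does not resize.

```java
public class PrehashedLongSet {
    private final long[] slots;
    private final boolean[] used;
    private final int mask;
    private int size;

    // Hypothetical fixed-capacity set; capacity must be a positive power of two.
    public PrehashedLongSet(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        slots = new long[capacity];
        used = new boolean[capacity];
        mask = capacity - 1;
    }

    // Adds a value that is assumed to be a good hash already.
    // The slot index is taken from the low bits as-is; no second hash is applied.
    public boolean add(long hashedValue) {
        int i = (int) hashedValue & mask;
        for (int probes = 0; probes < slots.length; probes++) {
            if (!used[i]) {
                used[i] = true;
                slots[i] = hashedValue;
                size++;
                return true;
            }
            if (slots[i] == hashedValue) {
                return false; // already present
            }
            i = (i + 1) & mask; // linear probing on collision
        }
        throw new IllegalStateException("set is full");
    }

    public int size() {
        return size;
    }

    public static void main(String[] args) {
        PrehashedLongSet set = new PrehashedLongSet(8);
        System.out.println(set.add(42L));
        System.out.println(set.add(42L));
        System.out.println(set.size());
    }
}
```

The trade-off is the one noted in the comment: skipping the extra mix saves work per insert, but clustering gets worse if the inputs are not actually good hashes.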
[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629892#comment-14629892 ]
Dawid Weiss commented on SOLR-7787:
Yeah, I honestly don't think those storage optimizations are worth the effort either, but I think the code can be tweaked after it's integrated -- I wouldn't want to tinker with it in this patch (can you file another issue?).
[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627784#comment-14627784 ]
Dawid Weiss commented on SOLR-7787:
This is the PR, let's see if it receives any attention. Otherwise I'll try to clean it up from unnecessary stuff and import it directly into Solr's codebase.
https://github.com/aggregateknowledge/java-hll/pull/16
[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626656#comment-14626656 ]
Hoss Man commented on SOLR-7787:
I don't have any strong opinions about this issue as it currently stands (i.e., "Fork HyperLogLog and remove fastutil dependency") -- as you mentioned, the java-hll project, with its original goals of a cross-language binary format protocol, doesn't appear to be very active anymore -- but for the record, can you please clarify the context/objective of this issue?
It started out as "Promote fastutil to first-order dependency" -- and your goal seemed to be to remove HPPC. Now you seem to have flipped the objective so that the focus is on keeping HPPC and eliminating fastutil.
Having a clear understanding of the goal for either refactoring/forking/etc. would be helpful for folks to develop an informed opinion.
[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626799#comment-14626799 ]
Dawid Weiss commented on SOLR-7787:
Well, I thought lack of initial feedback was lazy consensus :) And seriously -- it can go either way. My original intention was to update HPPC, which is duplicated in Solr and the clustering contrib. These have to be consistent. From there I observed that:
1) Solr uses HPPC in a small number of classes,
2) fastutil is present in Solr's lib, but it's not used in any classes; it is just a transitive dependency from hll,
3) fastutil is much larger than HPPC (roughly 15x).
I am really ok with any option. I decided to cut fastutil out because it's just a much larger library (of which almost nothing is practically used in Solr). Also, hll's implementation uses fastutil iterators very inefficiently (causing intermediate autoboxing on every value), so I thought I'd take a stab at improving that while converting to HPPC.
I can also leave everything as-is, really, but the patch I have locally seems like a nice improvement on its own (and could perhaps be pruned further to get rid of unnecessary stuff like serialization, etc.).
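The autoboxing cost mentioned in this comment can be sketched with plain JDK types. This is only an illustration of the general pattern, not the java-hll, fastutil, or HPPC code: the class BoxingSketch and its methods are hypothetical. Iterating through a generic Iterator<Long> wraps each value in a java.lang.Long before it is read, while a primitive iterator hands back the raw long.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.PrimitiveIterator;

public class BoxingSketch {
    // Boxed path: each element is wrapped in a java.lang.Long by boxed(),
    // then unboxed again when added to the running sum.
    static long sumBoxed(long[] values) {
        Iterator<Long> it = Arrays.stream(values).boxed().iterator();
        long sum = 0;
        while (it.hasNext()) {
            sum += it.next();
        }
        return sum;
    }

    // Primitive path: nextLong() returns the raw long, so no wrapper
    // objects are created per element.
    static long sumPrimitive(long[] values) {
        PrimitiveIterator.OfLong it = Arrays.stream(values).iterator();
        long sum = 0;
        while (it.hasNext()) {
            sum += it.nextLong();
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] values = {3L, 1L, 4L, 1L, 5L};
        System.out.println(sumBoxed(values));
        System.out.println(sumPrimitive(values));
    }
}
```

Both methods return the same result; the difference is only in the intermediate allocations, which is the kind of overhead a primitive-collections library is normally used to avoid.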
[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626975#comment-14626975 ]
Dawid Weiss commented on SOLR-7787:
bq. Got it -- so the primary concern is/was dependency version consistency, and in the process of going down that rabbit hole, you realized forking HLL to eliminate the dep on fastutil seemed like the best win all around.
Yes, that's exactly what happened.
bq. I wonder though would hll accept a patch to remove the dependency (or make it optional)?
It's not a complete removal -- I simply replaced fastutil with HPPC (since Solr already uses it in a couple of places anyway). I also made a few other changes the author may not be too happy with (replaced testng with junit, replaced hardcoded randomization seeds with randomizedtesting, etc.) in preparation for importing the code into Solr's codebase.
I will submit a PR too, but there seems to be a general lack of interest in this code from the original author; see Hoss's unaddressed question here:
https://github.com/aggregateknowledge/java-hll/issues/15
and Timon Karnezos doesn't seem to be too active recently:
https://github.com/timonk?tab=contributions&period=monthly
[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626953#comment-14626953 ]
Hoss Man commented on SOLR-7787:
bq. My original intention was to update HPPC which is duplicated in Solr and the clustering contrib. These have to be consistent. From there I observed that: ...
bq. I am really ok with any option. I decided to cut fastutil out because it's just a much larger library (of which almost nothing is practically used in Solr). Also, hll's implementation uses fastutil iterators very inefficiently (causing intermediate autoboxing on every value) so I thought I'd take a stab at improving that while converting to HPPC.
Got it -- so the primary concern is/was dependency version consistency, and in the process of going down that rabbit hole, you realized forking HLL to eliminate the dep on fastutil seemed like the best win all around.
Sounds like a good plan of attack to me.
[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626927#comment-14626927 ]
Upayavira commented on SOLR-7787:
I can see the rationale of forking. I wonder, though, would hll accept a patch to remove the dependency (or make it optional), thus removing the need for us to maintain a fork?
[jira] [Commented] (SOLR-7787) Fork HyperLogLog and remove fastutil dependency
[ https://issues.apache.org/jira/browse/SOLR-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626920#comment-14626920 ]
David Smiley commented on SOLR-7787:
I think we should avoid very large dependencies that we so barely use. +1 to fork HLL so that we needn't use FastUtil. Perhaps it would be useful to fork HLL publicly (e.g. on GitHub, published to Maven) to be its own 3rd party dependency -- I imagine others aren't too keen on using a 17MB library that they aren't already using. Ideally this would be condoned by the original library owner, using their Maven groupId but differentiated in the artifactId, but it isn't necessary of course.