Right, Colt could likely be used, depending on which package the code
comes from and as long as it doesn't have deps on the other packages.
-Grant
On Aug 29, 2009, at 2:22 PM, Ted Dunning wrote:
Trove is LGPL so we can't lift code. Even linking can be tricky.
On Fri, Aug 28, 2009 at 10:06 AM, Shashikant Kore (JIRA) <j...@apache.org> wrote:
[ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748904#action_12748904 ]
Shashikant Kore commented on MAHOUT-165:
----------------------------------------
I'm fine with copying relevant classes from Colt or Trove.
Please let me know your library of choice. I will create the patch
and upload it.
Using better primitives hash for sparse vector for performance gains
--------------------------------------------------------------------
Key: MAHOUT-165
URL: https://issues.apache.org/jira/browse/MAHOUT-165
Project: Mahout
Issue Type: Improvement
Components: Matrix
Affects Versions: 0.2
Reporter: Shashikant Kore
Fix For: 0.2
Attachments: mahout-165.patch
In SparseVector, we need a primitive hash map from indices to values.
The present implementation of this hash map is not as efficient as
some of the implementations in non-Apache projects.
In an experiment, I found that, for get/set operations, the primitive
hash of Colt performs an order of magnitude better than
OrderedIntDoubleMapping. For iteration it is 2x slower, though.
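
To make the get/set versus iteration trade-off concrete, here is a
minimal sketch of an open-addressing int-to-double hash map. It is not
Colt's code and not Mahout's; the class name IntDoubleHashSketch, the
-1 sentinel, the hash constant and the 0.5 load factor are all
assumptions made for illustration. It assumes non-negative vector
indices and leaves out removal. Lookups and updates probe a few slots,
while iteration has to walk the whole table, empty slots included,
which is one plausible reason a hash can lose to a packed array there.

```java
import java.util.Arrays;

/** Illustrative sketch only: int -> double map using linear probing. */
final class IntDoubleHashSketch {
    private static final int FREE = -1; // sentinel; indices are assumed non-negative

    private int[] keys;
    private double[] values;
    private int size;

    IntDoubleHashSketch(int expectedSize) {
        int capacity = 8;
        while (capacity < expectedSize * 2) capacity <<= 1; // power of two
        keys = new int[capacity];
        values = new double[capacity];
        Arrays.fill(keys, FREE);
    }

    /** Returns the slot holding key, or the free slot where it would go. */
    private int slot(int key) {
        if (key < 0) throw new IllegalArgumentException("index must be non-negative");
        int mask = keys.length - 1;
        int h = key * 0x9E3779B9; // multiplicative scramble (assumed constant)
        h ^= h >>> 16;
        int i = h & mask;
        while (keys[i] != FREE && keys[i] != key) i = (i + 1) & mask;
        return i;
    }

    /** Absent entries read as zero, as in a sparse vector. */
    double get(int index) {
        int i = slot(index);
        return keys[i] == index ? values[i] : 0.0;
    }

    void set(int index, double value) {
        int i = slot(index);
        if (keys[i] == FREE) {
            keys[i] = index;
            size++;
        }
        values[i] = value;
        if (size * 2 > keys.length) grow(); // keep load factor at or below 0.5
    }

    int size() {
        return size;
    }

    /** Iteration scans every slot, including the empty ones. */
    double sum() {
        double total = 0.0;
        for (int i = 0; i < keys.length; i++) {
            if (keys[i] != FREE) total += values[i];
        }
        return total;
    }

    private void grow() {
        int[] oldKeys = keys;
        double[] oldValues = values;
        keys = new int[oldKeys.length * 2];
        values = new double[keys.length];
        Arrays.fill(keys, FREE);
        for (int j = 0; j < oldKeys.length; j++) {
            if (oldKeys[j] != FREE) {
                int i = slot(oldKeys[j]);
                keys[i] = oldKeys[j];
                values[i] = oldValues[j];
            }
        }
    }

    public static void main(String[] args) {
        IntDoubleHashSketch v = new IntDoubleHashSketch(4);
        v.set(7, 1.5);
        v.set(1000000, 2.5);
        v.set(7, 3.0); // overwrites the earlier value at index 7
        System.out.println(v.get(7) + " " + v.get(42) + " " + v.size() + " " + v.sum());
    }
}
```

If OrderedIntDoubleMapping keeps its indices in a sorted array (an
assumption here, not something stated in this thread), a random set
has to shift elements to make room, whereas the sketch above touches
only a handful of slots per call.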
Using Colt in SparseVector improved the performance of canopy
generation. For an experimental dataset, the current implementation
takes 50 minutes. Using Colt reduces this to 19-20 minutes. That's
about a 60% reduction in running time.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
--
Ted Dunning, CTO
DeepDyve
--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
using Solr/Lucene:
http://www.lucidimagination.com/search