Re: [jira] Commented: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

Ted Dunning Sat, 29 Aug 2009 11:23:20 -0700

Trove is LGPL, so we can't copy its code.  Even linking against it can be tricky.

On Fri, Aug 28, 2009 at 10:06 AM, Shashikant Kore (JIRA) <j...@apache.org> wrote:


>
>    [
> https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748904#action_12748904]
>
> Shashikant Kore commented on MAHOUT-165:
> ----------------------------------------
>
> I'm fine with copying relevant classes from Colt or Trove.
>
> Please let me know your library of choice. I will create the patch and
> upload.
>
>
>
> > Using better primitives hash for sparse vector for performance gains
> > --------------------------------------------------------------------
> >
> >                 Key: MAHOUT-165
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-165
> >             Project: Mahout
> >          Issue Type: Improvement
> >          Components: Matrix
> >    Affects Versions: 0.2
> >            Reporter: Shashikant Kore
> >             Fix For: 0.2
> >
> >         Attachments: mahout-165.patch
> >
> >
> > In SparseVector, we need a primitive hash map for indices and values. The
> > present implementation of this hash map is not as efficient as some of the
> > implementations in non-Apache projects.
> > In an experiment, I found that, for get/set operations, the primitive
> > hash of Colt performs an order of magnitude better than
> > OrderedIntDoubleMapping. For iteration it is 2x slower, though.
> > Using Colt in SparseVector improved the performance of canopy generation.
> > For an experimental dataset, the current implementation takes 50 minutes.
> > Using Colt reduces this duration to 19-20 minutes. That's a 60% reduction
> > in running time.
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>
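
For reference, here is a minimal sketch of the kind of primitive hash map
under discussion: an open-addressing int-to-double map with linear probing
that stores keys and values in plain arrays and so avoids boxing. It is
written from scratch for illustration only and is not code from Colt, Trove,
or Mahout. The class name IntDoubleHashMap and all of its methods are
hypothetical, and returning 0.0 for a missing index is an assumption that
suits a sparse vector.

```java
/** Hypothetical illustration; not the Colt, Trove, or Mahout implementation. */
final class IntDoubleHashMap {
    private int[] keys;
    private double[] values;
    private boolean[] used;   // true where a slot holds an entry
    private int size;

    IntDoubleHashMap(int initialCapacity) {
        int cap = 4;
        while (cap < initialCapacity) {
            cap <<= 1;        // keep capacity a power of two so we can mask
        }
        keys = new int[cap];
        values = new double[cap];
        used = new boolean[cap];
    }

    /** Scrambles the key bits so clustered indices spread over the table. */
    private static int mix(int k) {
        k *= 0x9E3779B9;
        return k ^ (k >>> 16);
    }

    /** Returns the slot holding key, or the free slot where it would go. */
    private int slot(int key) {
        int mask = keys.length - 1;
        int i = mix(key) & mask;
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask;   // linear probing
        }
        return i;
    }

    /** Returns the stored value, or 0.0 if the index is absent. */
    double get(int key) {
        int i = slot(key);
        return used[i] ? values[i] : 0.0;
    }

    void put(int key, double value) {
        if ((size + 1) * 2 > keys.length) {
            resize();             // keep the load factor at or below 0.5
        }
        int i = slot(key);
        if (!used[i]) {
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    /** Doubles the table and reinserts every existing entry. */
    private void resize() {
        int[] oldKeys = keys;
        double[] oldValues = values;
        boolean[] oldUsed = used;
        keys = new int[oldKeys.length * 2];
        values = new double[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        size = 0;
        for (int j = 0; j < oldKeys.length; j++) {
            if (oldUsed[j]) {
                put(oldKeys[j], oldValues[j]);
            }
        }
    }

    int size() {
        return size;
    }
}
```

get and put touch only a few array slots on average, which is where the
speedup for random access would come from. Iteration has to scan the whole
table, free slots included, which is one plausible reason for the slower
iteration reported above.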


-- 
Ted Dunning, CTO
DeepDyve
