I'm +1 on this.

-------- Original message --------
From: Suneel Marthi <smar...@apache.org>
Date: 03/07/2016 8:09 PM (GMT-05:00)
To: mahout <dev@mahout.apache.org>
Subject: Re: [jira] [Commented] (MAHOUT-1640) Better collections would significantly improve vector-operation speed

If @apalumbo, @pferrel et al. vote for it now, we should merge the patch
into 0.11.2 master and the 0.12.0 branch.

No need to wait for 3 days.

Again, +1 from me.

Thanks @vigna, and sorry about missing this; my focus has been on the 0.12.0
Flink integration.

On Mon, Mar 7, 2016 at 8:06 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:

> ok standard 3 days then.
>
> On Mon, Mar 7, 2016 at 5:04 PM, ASF GitHub Bot (JIRA) <j...@apache.org>
> wrote:
>
> >
> >     [ https://issues.apache.org/jira/browse/MAHOUT-1640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184122#comment-15184122 ]
> >
> > ASF GitHub Bot commented on MAHOUT-1640:
> > ----------------------------------------
> >
> > Github user smarthi commented on the pull request:
> >
> >     https://github.com/apache/mahout/pull/81#issuecomment-193536262
> >
> >     Seems like it's ASL 2.0 -
> >     https://github.com/vigna/fastutil/blob/master/LICENSE-2.0
> >
> >     +1 from me, good to go.
> >
> >     On Mon, Mar 7, 2016 at 7:21 PM, Dmitriy Lyubimov <notificati...@github.com> wrote:
> >
> >     > @vigna <https://github.com/vigna> is fastutil 0.7.2 still the best
> >     > version to use? I can't immediately find the license on it.
> >     > @smarthi <https://github.com/smarthi> et al.: need a few votes on
> >     > inclusion of fastutil as a dependency
> >     >
> >     > —
> >     > Reply to this email directly or view it on GitHub
> >     > <https://github.com/apache/mahout/pull/81#issuecomment-193522992>.
> >     >
> >
> >
> >
> > > Better collections would significantly improve vector-operation speed
> > > ---------------------------------------------------------------------
> > >
> > >                 Key: MAHOUT-1640
> > >                 URL: https://issues.apache.org/jira/browse/MAHOUT-1640
> > >             Project: Mahout
> > >          Issue Type: Improvement
> > >          Components: collections
> > > >         Environment: Darwin lithium.local 14.1.0 Darwin Kernel Version
> > > > 14.1.0: Mon Dec 22 23:10:38 PST 2014; root:xnu-2782.10.72~2/RELEASE_X86_64
> > > > x86_64 i386 MacBookPro10,1 Darwin
> > > java version "1.8.0_31"
> > > Java(TM) SE Runtime Environment (build 1.8.0_31-b13)
> > > Java HotSpot(TM) 64-Bit Server VM (build 25.31-b07, mixed mode)
> > >            Reporter: Sebastiano Vigna
> > >            Assignee: Suneel Marthi
> > >              Labels: legacy, math, scala
> > >         Attachments: fastutil.patch, speed-fastutil, speed-std
> > >
> > >
> > > > The collections currently used by Mahout to implement sparse vectors are
> > > > extremely slow. The proposed patch (localized to RandomAccessSparseVector)
> > > > uses fastutil's maps and the speed improvements in vector benchmarks are
> > > > very significant. It would be interesting to see whether these improvements
> > > > percolate to high-level classes using sparse vectors.
> > > > I had to patch two unit tests (an off-by-one bug and an overfitting bug;
> > > > both were exposed by the different order in which key/values were returned
> > > > by iterators).
> > > > The included files speed-std and speed-fastutil show the speed
> > > > improvement. Some more speed might be gained by using the standard
> > > > java.util.Map.Entry interface everywhere instead of Element.
> > > > DISCLAIMER: The "Times" set of tests has been run multiplying two
> > > > identical vectors. The standard tests multiply two random vectors, so in
> > > > fact they just test the speed of the underlying map remove() method, as
> > > > almost all products are zero. This is not very realistic and was heavily
> > > > penalizing fastutil's "true deletions". Better tests, with a typical
> > > > overlap of nonzero entries, would be more realistic.
> >
> >
> >
> > --
> > This message was sent by Atlassian JIRA
> > (v6.3.4#6332)
> >
>
