Re: Algebraic Classifiers

Dmitriy Lyubimov Tue, 29 Sep 2015 15:25:58 -0700

Congratulations, by the way!

On Tue, Sep 29, 2015 at 3:14 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:


> As far as I understand, the flexibility idea there is to use streaming
> processing, like what the author calls a foldable functor. Is that what you
> want to do? Do you want to replicate that functional API?
>
> On Tue, Sep 29, 2015 at 2:17 PM, alxsmac733 . <ajmoreno1...@gmail.com>
> wrote:
>
>> Hi Dmitriy,
>>
>> Apologies for not getting back to you sooner - I just got married and was
>> away on my honeymoon.
>>
>> Having taken a closer look at the direction you're trying to take Mahout,
>> while I agree that the approach extolled in the paper is not necessarily
>> completely in line with batch-algebraic problems, I believe it is in a
>> similar spirit. Additionally, I think having algebraic semantics for things
>> like models fits in well with the goal of making Mahout more of a
>> programming environment than a collection of black-box algorithms.
>>
>> In terms of what specific additions should be made, I'm open to
>> suggestions and I'd love to discuss the matter further. Per your point
>> about low-level speed-ups, unfortunately I'm not a JVM expert, so I probably
>> couldn't help much on that front.
>>
>> - Alex
>> On Sep 14, 2015 2:00 PM, "Dmitriy Lyubimov" <dlie...@gmail.com> wrote:
>>
>> > Also, as far as I understand, the author does a lot in terms of low-level
>> > speed-ups: using fast numeric libraries, and packing memory-fragmented
>> > object trees into contiguous cache-friendly representations (something I
>> > fought for years in Java, and then in part in Scala; this is my JVM rant
>> > #1). Mahout notoriously lacks these techniques. But without them, the
>> > speed-ups are probably not realistic from the monoid architecture alone
>> > (I may be wrong). What are your thoughts in these respects? All these
>> > problems are very welcome to be solved in Mahout, but I expect they'd
>> > require a significant time commitment, IMO.
>> >
>> > On Mon, Sep 14, 2015 at 10:41 AM, Dmitriy Lyubimov <dlie...@gmail.com>
>> > wrote:
>> >
>> > > Alex,
>> > >
>> > > so these papers seem to mainly show the adaptation of different
>> > > algorithms to a monoid architecture, i.e. online training (or parallel
>> > > online training). Although IMO this makes them not necessarily the
>> > > batch-algebraic problems we have been working towards recently (i.e. the
>> > > "distributed R" notion), I suppose they would make a fine architectural
>> > > addition on their own.
>> > >
>> > > Which parts of Mahout are you suggesting to reuse for these methods?
>> > > Also, the papers show adaptations for several classifiers; which ones do
>> > > you suggest starting with?
>> > >
>> > > Thank you for doing this.
>> > >
>> > > -D
>> > >
>> > > On Fri, Sep 4, 2015 at 2:41 PM, alxsmac733 . <ajmoreno1...@gmail.com>
>> > > wrote:
>> > >
>> > >> My pleasure!
>> > >>
>> > >> On Fri, Sep 4, 2015 at 4:03 PM, Andrew Musselman <andrew.mussel...@gmail.com> wrote:
>> > >>
>> > >> > Thanks Alex; grateful for the help.
>> > >> >
>> > >> > On Fri, Sep 4, 2015 at 12:59 PM, alxsmac733 . <ajmoreno1...@gmail.com> wrote:
>> > >> >
>> > >> > > Hi Dmitriy,
>> > >> > >
>> > >> > > That sounds more than reasonable - take as much time as you need.
>> > >> > > I'll be away for the next two weeks anyway, so I won't be able to
>> > >> > > start working on this until I get back, should you want me to move
>> > >> > > forward with the proposal.
>> > >> > >
>> > >> > > - Alex
>> > >> > > On Sep 4, 2015 1:46 PM, "Dmitriy Lyubimov" <dlie...@gmail.com> wrote:
>> > >> > >
>> > >> > > > Alex,
>> > >> > > >
>> > >> > > > can you give us a week or so to look it over?
>> > >> > > >
>> > >> > > > We have been discussing hyperparameter fitting approaches for a
>> > >> > > > while, and it is fairly high on our roadmap (cross-validation is
>> > >> > > > of course an important element of it). We need to figure out how
>> > >> > > > it may fit together; but don't get discouraged if we don't get
>> > >> > > > back to you immediately, we need time to digest your proposal.
>> > >> > > >
>> > >> > > > -d
>> > >> > > >
>> > >> > > > On Fri, Sep 4, 2015 at 10:26 AM, alxsmac733 . <ajmoreno1...@gmail.com> wrote:
>> > >> > > >
>> > >> > > > > The fast cross-validation algorithm might be a good place to
>> > >> > > > > start, as it may be the most broadly useful.
>> > >> > > > >
>> > >> > > > > Any advice on how to get started would be greatly appreciated -
>> > >> > > > > I want to make sure I do a good job and that it fits well with
>> > >> > > > > the overall aims of Mahout.
>> > >> > > > >
>> > >> > > > > On Fri, Sep 4, 2015 at 1:12 PM, Andrew Musselman <andrew.mussel...@gmail.com> wrote:
>> > >> > > > >
>> > >> > > > > > Sounds interesting; what part would you like to start with?
>> > >> > > > > >
>> > >> > > > > > If you need help getting started we're happy to point you in a
>> > >> > > > > > good direction.
>> > >> > > > > >
>> > >> > > > > > On Fri, Sep 4, 2015 at 9:55 AM, alxsmac733 . <ajmoreno1...@gmail.com> wrote:
>> > >> > > > > >
>> > >> > > > > > > Hi everyone,
>> > >> > > > > > >
>> > >> > > > > > > Would there be any interest in adding algebraic classification
>> > >> > > > > > > methods to Mahout?  It's an elegant approach that allows for easy
>> > >> > > > > > > online and parallel training as well as fast cross-validation.
>> > >> > > > > > > Below are some links describing the approach as well as an
>> > >> > > > > > > existing Haskell package implemented by the author.  The first
>> > >> > > > > > > paper does a very good job of explaining the basic concepts
>> > >> > > > > > > clearly and concisely.
>> > >> > > > > > >
>> > >> > > > > > > https://izbicki.me/public/papers/icml2013-algebraic-classifiers.pdf
>> > >> > > > > > > https://izbicki.me/public/papers/tfp2013-hlearn-a-machine-learning-library-for-haskell.pdf
>> > >> > > > > > > https://izbicki.me/
>> > >> > > > > > > https://github.com/mikeizbicki/HLearn
>> > >> > > > > > >
>> > >> > > > > > > The author saw a very large speed-up from implementing these
>> > >> > > > > > > techniques when compared with popular existing libraries such as
>> > >> > > > > > > Weka.  Aside from the potential performance gains to be had, I
>> > >> > > > > > > think imposing algebraic structure provides a nice layer of
>> > >> > > > > > > abstraction over the particular models being implemented.
>> > >> > > > > > >
>> > >> > > > > > > I'd love to hear everyone's feedback on this.  Thanks for your
>> > >> > > > > > > time and enjoy your weekends!
>> > >> > > > > > >
>> > >> > > > > > > Alex Moreno
>> > >> > > > > > >
>> > >> > > > > >
>> > >> > > > >
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> > >
>> > >
>> >
>>
>
>
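
For readers following the thread, the monoid idea discussed above (train on chunks independently, merge the results, and reuse the merged pieces for cross-validation) can be sketched in a few lines. This is only an illustration in Python, not code from Mahout, HLearn, or the linked papers. The `ClassMeans` toy model, its `train`/`merge`/`predict` methods, and `leave_one_fold_out_models` are hypothetical names, and the prefix/suffix merging is one simple way to exploit a monoid, assumed here rather than taken from the author's implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ClassMeans:
    """Toy nearest-class-mean model over a 1-D feature (hypothetical).

    The model is just per-label sufficient statistics, so two models
    combine by adding them. The empty model is the identity element.
    """
    stats: dict = field(default_factory=dict)  # label -> (count, sum of x)

    @staticmethod
    def train(points):
        """Batch-train from an iterable of (x, label) pairs."""
        stats = {}
        for x, label in points:
            n, s = stats.get(label, (0, 0.0))
            stats[label] = (n + 1, s + x)
        return ClassMeans(stats)

    def merge(self, other):
        """Monoid operation: add counts and sums label by label."""
        merged = dict(self.stats)
        for label, (n, s) in other.stats.items():
            n0, s0 = merged.get(label, (0, 0.0))
            merged[label] = (n0 + n, s0 + s)
        return ClassMeans(merged)

    def predict(self, x):
        """Return the label whose mean is closest to x."""
        return min(
            self.stats,
            key=lambda lab: abs(x - self.stats[lab][1] / self.stats[lab][0]),
        )


def leave_one_fold_out_models(folds):
    """Build the 'all folds except i' model for every i.

    Each fold is trained exactly once; the held-out models are then
    assembled from prefix and suffix merges, so the number of merges
    grows linearly with the number of folds.
    """
    parts = [ClassMeans.train(f) for f in folds]
    k = len(parts)

    prefix = [ClassMeans()]            # prefix[i] = parts[0] + ... + parts[i-1]
    for p in parts:
        prefix.append(prefix[-1].merge(p))

    suffix = [ClassMeans()]            # built back to front, then reversed
    for p in reversed(parts):
        suffix.append(p.merge(suffix[-1]))
    suffix.reverse()                   # suffix[i] = parts[i] + ... + parts[k-1]

    return [prefix[i].merge(suffix[i + 1]) for i in range(k)]


if __name__ == "__main__":
    folds = [
        [(1.0, "a"), (2.0, "a")],
        [(9.0, "b"), (3.0, "a")],
        [(10.0, "b"), (11.0, "b")],
    ]
    for i, model in enumerate(leave_one_fold_out_models(folds)):
        held_out = folds[i]
        hits = sum(model.predict(x) == label for x, label in held_out)
        print(f"fold {i}: {hits}/{len(held_out)} correct")
```

Because `merge` is associative, the same operation covers online training (merge a one-point model into the current one) and parallel training (merge models built on separate partitions). Whether this alone yields meaningful speed-ups without the low-level optimizations mentioned earlier in the thread remains an open question.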
