Re: Codebase refactoring proposal

Dmitriy Lyubimov Thu, 05 Feb 2015 13:33:59 -0800

On Thu, Feb 5, 2015 at 1:14 AM, Gokhan Capan <gkhn...@gmail.com> wrote:


> What I am saying is that for certain algorithms including both
> engine-specific (such as aggregation) and DSL stuff, what is the best way
> of handling them?
>
> i) should we add the distributed operations to Mahout codebase as it is
> proposed in #62?
>

Imo this can't go very well and very far (because of the engine specifics)
but i'd be willing to see an experiment with simple things like map and
reduce.

Bigger questions are, where exactly we'll have to stop (we can't abstract
all capabilities out there because of "common denominator" issues), and
what percentage of methods it will truly allow to migrate to full backend
portability.

And if, after doing all this, we still find ourselves writing
engine-specific mixes, why bother? Wouldn't it be better to find a good,
easy-to-replicate, incrementally-developed pattern to register and apply
engine-specific strategies for every method?
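
A minimal sketch of what such a register-and-apply pattern could look like,
written in Python purely for illustration (not Mahout code). Every name here
(register, apply, the "col_sums" method, the "spark" and "h2o" engine keys) is
a hypothetical placeholder, and the strategy bodies are plain-Python stand-ins
for whatever engine-specific aggregation a real backend would run.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry: (method name, engine name) -> strategy function.
_STRATEGIES: Dict[Tuple[str, str], Callable] = {}


def register(method: str, engine: str):
    """Decorator that registers an engine-specific strategy for a method."""
    def decorator(fn: Callable) -> Callable:
        _STRATEGIES[(method, engine)] = fn
        return fn
    return decorator


def apply(method: str, engine: str, *args, **kwargs):
    """Look up the strategy for (method, engine) and run it."""
    try:
        strategy = _STRATEGIES[(method, engine)]
    except KeyError:
        raise NotImplementedError(
            f"no '{method}' strategy registered for engine '{engine}'"
        ) from None
    return strategy(*args, **kwargs)


# Stand-in for an engine-specific aggregation (assumed "spark" backend).
@register("col_sums", "spark")
def _col_sums_spark(rows):
    return [sum(col) for col in zip(*rows)]


# A second stand-in with a different implementation (assumed "h2o" backend).
@register("col_sums", "h2o")
def _col_sums_h2o(rows):
    totals = [0] * len(rows[0])
    for row in rows:
        for i, value in enumerate(row):
            totals[i] += value
    return totals
```

The point of the sketch is that the engine-neutral part of a method only calls
apply(...), and each backend adds its own strategy incrementally, one method at
a time, without the common code needing to know which engines exist.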


>
> ii) should we have [engine]-ml modules (like spark-bindings and
> h2o-bindings) where we can mix the DSL and engine-specific stuff?
>

This is not quite what i am proposing. Rather, engine-ml modules holding
engine-specific _parts_ of algorithm.

However, this really needs a POC over a guinea pig (similarly to how we
POC'd algebra in the first place with ssvd and spca).


>
>
