Ok,

So, getting back: what do you think would be a good way to solve ALS-like
issues within the Mahout context?

I see just the following:

a) wait for YARN + whatever bulk parallel environment gets built for it?

b) introduce adapters to synchronous or dynamic bulk parallel distributed
environments -- if yes, which ones? Is it worth a try to step there? Is it a
good idea to collaborate with non-Apache projects here?

c) do nothing (no good ALS in Mahout)?

I would happily explore b) and open a discussion on it if the majority
supported it. I guess I am fundamentally fine with c) too :)  I feel a) is not
really an option and in a way is equivalent to c), since it involves an
unspecified amount of waiting for unspecified things.



On Mon, Mar 11, 2013 at 1:54 PM, Sebastian Schelter <[email protected]> wrote:

> I spent the last months working on the Stratosphere system, which is
> developed by my group. It's a research prototype, but it's got so many
> things that we would need.
>
> It extends the MapReduce model. For joins, e.g., there is a new operator
> called 'Match' which lets you apply your user code to the result of an
> equi-join. The nice thing is that the system automatically chooses an
> efficient execution strategy for the join. Having something like this
> production ready would save us so much code, as a lot of our
> implementations consist of hand-coded joins.
>
>
