There is MAHOUT-950, which cleans up some of the mixed-usage tricks for
using MultipleOutputs. It does not pass tests under 0.20.203, though.
Under various 0.23.x-SNAPSHOTs, -822 required -950 to pass all tests, but
I guess before 0.23.1 was released, the earlier behavior was duplicated.
Jeff did +1 the clustering changes in -822, and the HadoopUtil changes
are fixing the build under 0.23.1, not an old 0.20.x. Any objections to
this going in with the whitespace correction and comment removal
mentioned on reviewboard?
-tom
On 03/09/2012 01:52 PM, Dmitriy Lyubimov wrote:
On another note, I was primarily interested to see if this branch
tackles old/new API multiple outputs anywhere; it doesn't look like it
does. I think 0.20.203 already has support for new-API multiple
outputs, so I guess I can take care of it on some other branch. In fact,
the methods I care about had initially been doing it 100% with the new API
found in Cloudera distros and were hacked back to support 0.20.2, which
Mahout was shipped with. So I think I'll take a look at it in some
other issue separately.
On Fri, Mar 9, 2012 at 10:50 AM, Dmitriy Lyubimov <dlyubi...@apache.org> wrote:
On Thu, Mar 8, 2012 at 8:43 PM, tom pierce <t...@apache.org> wrote:
Is there someone you'd nominate as an additional reviewer?
This clustering stuff in Mahout is, I think, something Jeff is very
knowledgeable about. Unless he was already positive in the JIRA
discussion, maybe he can review?