It's this crazy thing where the new APIs call into the old APIs, and
checks fail as a result -- for example, try setting an InputFormat
class that implements the 'new' InputFormat
(org.apache.hadoop.mapreduce). Somewhere in the code Hadoop checks
whether you're implementing the *old* InputFormat
(org.apache.hadoop.mapred), and that check fails.

It may so happen that only my jobs hit this. I don't see it fixed in
any branch yet.

Actually, I failed to mention what I think is a far bigger reason not
to move to the new API just yet -- it won't run on Amazon Elastic
MapReduce.

I suppose the thinking is that the old APIs
- work with stuff like Amazon
- work with Hadoop's latest release
- work, period -- they don't have a bug that's stopping us

... so what's the actual use in upgrading yet? I also figured we'd
spend some time consolidating our own approach to Hadoop -- I've
refactored my 3 jobs onto one common pattern -- making the eventual
transition simpler.
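
For what it's worth, the consolidated shape is roughly this -- an
illustrative sketch with made-up names, not the actual code: one base
class owns the old-API boilerplate, and each job fills in only what
varies.

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public abstract class BaseJob {
  // Each concrete job supplies its own mapper, reducer and
  // key/value classes here.
  protected abstract void configure(JobConf conf);

  public void run(Path input, Path output) throws Exception {
    JobConf conf = new JobConf(getClass());
    FileInputFormat.setInputPaths(conf, input);
    FileOutputFormat.setOutputPath(conf, output);
    configure(conf);
    JobClient.runJob(conf); // blocks until the job finishes
  }
}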

And so I stopped thinking about it.

No harm in having new-API code alongside the old-API code, but I
still suggest we stick with the old APIs.
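
They can coexist because the two APIs live in different packages and
share no types -- made-up class names, just to illustrate:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

// Old API: an interface under org.apache.hadoop.mapred.
class OldStyleMapper extends org.apache.hadoop.mapred.MapReduceBase
    implements org.apache.hadoop.mapred.Mapper<LongWritable, Text,
        Text, LongWritable> {
  public void map(LongWritable key, Text value,
      org.apache.hadoop.mapred.OutputCollector<Text, LongWritable> out,
      org.apache.hadoop.mapred.Reporter reporter) throws IOException {
    out.collect(value, key);
  }
}

// New API: an abstract class under org.apache.hadoop.mapreduce.
class NewStyleMapper extends
    org.apache.hadoop.mapreduce.Mapper<LongWritable, Text,
        Text, LongWritable> {
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    context.write(value, key);
  }
}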

On Thu, Feb 4, 2010 at 5:01 PM, Drew Farris <drew.far...@gmail.com> wrote:
> Sean,
>
> What sort of problems have you run into? Are there Hadoop JIRA
> issues open for them?
>
> It would be nice to commit to the 0.20.x API in Mahout, but I agree,
> it's not very nice if we back users into a corner wrt what they can
> and can't do due to bugs in Hadoop.
>
> Drew
>
> On Thu, Feb 4, 2010 at 10:57 AM, Sean Owen <sro...@gmail.com> wrote:
>> Yeah, I'm still on the old API because of problems in Hadoop. I'm
>> still hoping they get fixed in 0.20.x. We may need two-track
>> support for a while.
>>
>> On Thu, Feb 4, 2010 at 3:48 PM, Robin Anil <robin.a...@gmail.com> wrote:
>>> One important question in my mind here is how this affects
>>> 0.20-based jobs and pre-0.20-based jobs. I had written pfpgrowth
>>> in the pure 0.20 API, and deneche is also maintaining two
>>> versions, it seems. I will check the AbstractJob and see.
