It's this crazy thing where the new APIs call into old APIs, and checks fail as a result -- for example, try setting an InputFormat class that implements the 'new' InputFormat. Somewhere in the code it checks whether you're implementing the *old* InputFormat.
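
For anyone who hasn't bumped into this, here's a rough sketch of the two APIs in question (standard Hadoop 0.20 class names; the class name ApiSketch is just for illustration, and this isn't the actual check that fails):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapreduce.Job;

  public class ApiSketch {

    // Old 'mapred' API: InputFormat is an interface, configured on a JobConf.
    static JobConf oldStyle(Configuration conf) {
      JobConf jobConf = new JobConf(conf);
      jobConf.setInputFormat(org.apache.hadoop.mapred.TextInputFormat.class);
      return jobConf;
    }

    // New 'mapreduce' API: InputFormat is an abstract class, configured on a Job.
    // A format written against this API is not an org.apache.hadoop.mapred.InputFormat,
    // so any code path that still checks for the old interface will reject it.
    static Job newStyle(Configuration conf) throws java.io.IOException {
      Job job = new Job(conf);
      job.setInputFormatClass(
          org.apache.hadoop.mapreduce.lib.input.TextInputFormat.class);
      return job;
    }
  }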
It may so happen that only my jobs hit this. I don't see it fixed in
any branch yet.

Actually, I failed to mention what I think is a far bigger reason not to
move to the new API just yet -- it won't run on Amazon Elastic MapReduce.

I suppose the thinking is that the old APIs
- work with stuff like Amazon
- work with Hadoop's latest release
- work -- don't have a bug that's stopping us
... so what's the actual use in upgrading yet?

I also figured we'd spend some time consolidating our own approach to
Hadoop -- I've refactored my 3 jobs into one approach -- making the
eventual transition simpler. And so I stopped thinking about it.

No harm in having new-API code alongside the old-API code, but I still
suggest we stick with the old APIs.

On Thu, Feb 4, 2010 at 5:01 PM, Drew Farris <drew.far...@gmail.com> wrote:
> Sean,
>
> What sort of problems have you run into? Are there Hadoop JIRA issues
> open for them?
>
> It would be nice to commit to the 0.20.x API in Mahout, but I agree,
> it's not very nice if we back users into a corner wrt what they can and
> can't do due to bugs in Hadoop.
>
> Drew
>
> On Thu, Feb 4, 2010 at 10:57 AM, Sean Owen <sro...@gmail.com> wrote:
>> Yeah, I'm still on the old API because of problems in Hadoop. I'm still
>> hoping they get fixed in 0.20.x. We may need two-track support for a
>> while.
>>
>> On Thu, Feb 4, 2010 at 3:48 PM, Robin Anil <robin.a...@gmail.com> wrote:
>>> One important question in my mind here is how this affects 0.20-based
>>> jobs and pre-0.20-based jobs. I had written pfpgrowth in the pure 0.20
>>> API, and deneche is also maintaining two versions, it seems. I will
>>> check the AbstractJob and see.