First snag I've hit -- the new APIs don't include an equivalent of MultipleInputs, which is the primary means of mapping several input paths. Does anyone (esp. Ankur) happen to know of an equivalent? This could be a bear to reproduce on our own.
Sean

On Tue, May 25, 2010 at 3:17 PM, Sean Owen <[email protected]> wrote:
> Just to state what seems to be in progress -- looks like we are agreed
> we should move to the new Hadoop APIs. Some code is already using it;
> most of the part that isn't is the recommender which was due to some
> strange bugs deep in Hadoop in prior versions. It's time to try it
> again. I'm going to work on porting everything forward now.
>
> The other argument against this was that Amazon EMR runs 0.18.3. I
> think Jeff already established that what we're doing has already
> broken compatibility with 0.18.x. We can point those users to release
> 0.3 and say they can try to back-port that code to 0.18.x
> compatibility. But 0.4 onwards is for 0.20.x+ and you can run your own
> cluster using AWS and hopefully EMR updates soon.
>
> Sean
>
