NVM, I think I've got DistanceMeasures sorted out. Still working on
creating proper MockContexts to feed to the mapper and reducer tests.
I'll post my patch in whatever state it's in at the end of today.
On 5/26/10 1:35 PM, Jeff Eastman wrote:
I've got most of Canopy converted but am exploding my brain trying to
figure out how best to coax DistanceMeasures to support a
configure(Configuration) method. There's some subtle inheritance in
the parameters package and I can't decide just where to touch it. I'm
running out of time before I head out tomorrow, so I was hoping
somebody could sort it out while I'm gone.
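For reference, the shape being asked for might look roughly like the
sketch below. It is only an illustration under stated assumptions:
SimpleConf stands in for Hadoop's Configuration so the example compiles
without Hadoop on the classpath, and ConfigurableMeasure,
MinkowskiMeasure and the property key are hypothetical names, not the
actual classes in the parameters package.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Hadoop's Configuration: plain string key/value pairs.
class SimpleConf {
  private final Map<String, String> values = new HashMap<>();

  void set(String key, String value) {
    values.put(key, value);
  }

  String get(String key, String defaultValue) {
    return values.getOrDefault(key, defaultValue);
  }
}

// Hypothetical interface: a distance measure that reads its own parameters
// from a configuration object instead of having them pushed in by the job.
interface ConfigurableMeasure {
  void configure(SimpleConf conf);

  double distance(double[] a, double[] b);
}

// Example measure with one tunable parameter (the Minkowski exponent).
class MinkowskiMeasure implements ConfigurableMeasure {
  // Hypothetical property key, chosen only for this sketch.
  static final String EXPONENT_KEY = "sketch.minkowski.exponent";

  private double exponent = 2.0; // defaults to Euclidean distance

  @Override
  public void configure(SimpleConf conf) {
    // Fall back to the default when the job did not set the property.
    exponent = Double.parseDouble(conf.get(EXPONENT_KEY, "2.0"));
  }

  @Override
  public double distance(double[] a, double[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("vectors must have the same length");
    }
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      sum += Math.pow(Math.abs(a[i] - b[i]), exponent);
    }
    return Math.pow(sum, 1.0 / exponent);
  }
}
```

With the new Hadoop API, a mapper's setup(Context) would presumably be
the place to call measure.configure(context.getConfiguration()), so each
task rebuilds the measure from the job settings.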
Jeff
PS: I'm going sailing for a few days with my cousin Peter and his son
Trout (http://www.coastalview.com/center.asp?article=12251). Should be
back to a computer late Tuesday.
On 5/25/10 7:17 AM, Sean Owen wrote:
Just to state what seems to be in progress -- looks like we are agreed
we should move to the new Hadoop APIs. Some code is already using them;
most of the code that isn't is in the recommender, which stayed on the
old APIs because of some strange bugs deep in Hadoop in prior versions.
It's time to try again. I'm going to work on porting everything forward now.
The other argument against this was that Amazon EMR runs Hadoop 0.18.3. I
think Jeff already established that what we're doing has already
broken compatibility with 0.18.x. We can point those users to release
0.3 and say they can try to back-port that code for 0.18.x
compatibility. But 0.4 onwards is for 0.20.x+; you can run your own
cluster on AWS, and hopefully EMR updates soon.
Sean