Hi all,

During our GSoC meeting this morning, Isabel and I were discussing a data
type flaw in my algorithm and its possible solutions: one involves rewriting
the data types manually and likely adding another M/R task, and the other
involves making use of a very recent Hadoop API commit (specifically, the
mapreduce.lib.join.* library, committed only two months ago and not
available in version 0.20.2).

Isabel mentioned that others may find the API in the latest Hadoop
commit useful for their own code; however, since there hasn't been an
official release since 0.20.2, integrating the new API would take a bit
more finesse. Hence I pose this question to the list: does anyone
else need or want to use what is effectively the beta version of Hadoop? If
so, I will make use of it as well (which would greatly simplify and reduce
the amount of code I have to rewrite); if not, I will write an intermediate
task that does all the processing I need.

Thanks for your input!

Regards,
Shannon
