Re: Hadoop MapReduce + MySQL

Arun C Murthy Sun, 06 Jan 2008 19:50:49 -0800

On Sun, Jan 06, 2008 at 04:08:33PM +0100, Fredrik Hedberg wrote:
>Hi,
>
>In order to simplify some data crunching for a client, I threw
>together some code that allows you to run MapReduce jobs over data in
>a MySQL table.
>
>The code is heavily inspired by the MapReduce layer for HBase and
>works much like it. In its current form it's mainly meant to be used
>for development, but it could potentially be of use for people who
>must keep their data in a relational database and cannot migrate to
>HBase for some reason (without all the benefits of HBase, of course).
>
>Needless to say, the code is a hack and has a lot of issues. Code is here [1].
>
>If people find it useful, I can clean it up somewhat and put it in JIRA.


Sure. The best bet is to open a JIRA issue and let potential users take a shot
at it. I'd think you might get more interesting requirements too. Feel free to
publicise the proposal on hadoop-user if you feel the need to get more
eyeballs than on hadoop-dev. Oh, and some documentation would help! *smile*
http://wiki.apache.org/lucene-hadoop/HowToContribute

Doug - should we put these in mapred.lib? Come to think of it, I'd say we
could move mapred.lib to contrib and let users go wild with their own
mappers/reducers/{input|output}formats etc., and encourage them to contribute
back. This could help build a nice ecosystem around map-reduce, while offering
weaker guarantees about its feasibility/usability etc. Thoughts? If that makes
sense I'll open a JIRA issue for this.

Arun

>
> - Fredrik
>
>
>[1] http://www.avafan.com/~fredrik/hadoop/
