[ 
https://issues.apache.org/jira/browse/HADOOP-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Klaas Bosteels updated HADOOP-4304:
-----------------------------------

    Status: In Progress  (was: Patch Available)

> Add Dumbo to contrib
> --------------------
>
>                 Key: HADOOP-4304
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4304
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: Klaas Bosteels
>            Priority: Minor
>         Attachments: hadoop-4304-v2.patch, hadoop-4304-v3.patch, 
> hadoop-4304.patch
>
>
> Originally, Dumbo was a simple Python module developed at Last.fm to make 
> writing and running Hadoop Streaming programs very easy, but now it also 
> includes some (so far unreleased) helper code in Java (although it 
> can still be used without the Java code). We propose to add Dumbo to 
> "src/contrib" so that the Java classes get built/installed together with 
> the rest of Hadoop, and the Python module can be installed separately at 
> will. A tar.gz of the directory that would have to be added to "src/contrib" 
> is available at
> http://static.last.fm/dumbo/dumbo-contrib.tar.gz
> and more info about Dumbo can be found here:
> * Basic documentation: http://github.com/klbostee/dumbo/wikis
> * Presentation at HUG (where it was first suggested to add Dumbo to contrib): 
> http://skillsmatter.com/podcast/home/dumbo-hadoop-streaming-made-elegant-and-easy
> * Initial announcement: 
> http://blog.last.fm/2008/05/29/python-hadoop-flying-circus-elephant
> For some of the more advanced features of Dumbo (in particular the ones for 
> which the Java classes are needed) there is no public documentation yet, but 
> we could easily fill that gap by moving some of the internal Last.fm 
> documentation to the Hadoop wiki.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
