Great! I'm interested in this. Could you please provide some more detailed documentation?

On Tue, Mar 11, 2008 at 12:17 PM, Milind A Bhandarkar <[EMAIL PROTECTED]> wrote:
> +1. Would love to see more such projects' integration with hadoop.
>
> -milind
>
> ----- Original Message -----
> From: Ian Holsman <[EMAIL PROTECTED]>
> To: [email protected] <[email protected]>
> Sent: Mon Mar 10 21:12:14 2008
> Subject: PROPOSAL: Summer of Code 2008 - Integrate Talend with Hadoop.
>
> I'd like to volunteer a proposal for the upcoming summer of code project.
>
> Talend is an open source (GPL) data integration tool used by companies to
> transform data from one format to another.
>
> For example I might get 2-3 XML input files that I need to feed into a
> database, or SOLR server. It works really well until you start bumping
> into memory limits or time concerns when you handle large files.
>
> Enter hadoop.
>
> I'd like to propose a project to write the necessary bits to make
> talend jobs run on a hadoop cluster, possibly using things like pig.
>
>
> While I understand this code will probably end up as a part of talend's
> code base, I think it would be a neat project to expand hadoop's
> presence in this space.
> I'm willing to act as a mentor for it. (I've been a mentor for HTTP, and
> lucene projects in the past)
>
>
> regards
> Ian
>

--
[EMAIL PROTECTED]
Institute of Computing Technology, Chinese Academy of Sciences, Beijing.
