+1. Would love to see more projects like this integrated with Hadoop. -milind
----- Original Message -----
From: Ian Holsman <[EMAIL PROTECTED]>
To: [email protected] <[email protected]>
Sent: Mon Mar 10 21:12:14 2008
Subject: PROPOSAL: Summer of Code 2008 - Integrate Talend with Hadoop.

I'd like to volunteer a proposal for the upcoming Summer of Code project.

Talend is an open source (GPL) data integration tool used by companies to transform data from one format to another. For example, I might get 2-3 XML input files that I need to feed into a database or a Solr server. It works really well until you start bumping into memory limits or time concerns when you handle large files.

Enter Hadoop.

I'd like to propose a project to write the necessary bits to make Talend jobs run on a Hadoop cluster, possibly using things like Pig.

While I understand this code will probably end up as part of Talend's code base, I think it would be a neat project to expand Hadoop's presence in this space.

I'm willing to act as a mentor for it. (I've been a mentor for HTTP and Lucene projects in the past.)

regards
Ian
