Thank you both, Aaron and Sonal, for your valuable comments and contributions.
I'll check both projects and try to make a design decision. I'm familiar with
Sqoop and have only just heard about hiho.

Sonal: I guess hiho is a single map/reduce job handling the MySQL-Hadoop
integration. Is it also possible to use it with other JDBC connectors?

Best Regards,
Utku

On Fri, Mar 19, 2010 at 5:07 AM, Sonal Goyal <[email protected]> wrote:

> Hi Utku,
>
> If MySQL is your target database, you may check Meghsoft's hiho:
>
> http://code.google.com/p/hiho/
>
> The current release supports transferring data from Hadoop to the MySQL
> database. We will be releasing the functionality of transfer from MySQL to
> Hadoop soon, sometime next week.
>
> Thanks and Regards,
> Sonal
> www.meghsoft.com
>
>
> On Thu, Mar 18, 2010 at 5:31 AM, Aaron Kimball <[email protected]> wrote:
>
> > Hi Utku,
> >
> > Apache Hadoop 0.20 cannot support Sqoop as-is. Sqoop makes use of the
> > DataDrivenDBInputFormat (among other APIs) which are not shipped with
> > Apache's 0.20 release. In order to get Sqoop working on 20, you'd need to
> > apply a lengthy list of patches from the project source repository to your
> > copy of Hadoop and recompile. Or you could just download it all from
> > Cloudera, where we've done that work for you :)
> >
> > So as it stands, Sqoop won't be able to run on 0.20 unless you choose to
> > use Cloudera's distribution. Do note that your use of the term "fork" is a
> > bit strong here; with the exception of (minor) modifications to make it
> > interact in a more compatible manner with the external Linux environment,
> > our distribution only includes code that's available to the project at
> > large. But some of that code has not been rolled into a binary release
> > from Apache yet. If you choose to go with Cloudera's distribution, it just
> > means that you get publicly-available features (like Sqoop, MRUnit, etc.)
> > a year or so ahead of what Apache has formally released, but our codebase
> > isn't radically diverging; CDH is just somewhere ahead of the Apache 0.20
> > release, but behind Apache's svn trunk. (All of Sqoop, MRUnit, etc. are
> > available in the Hadoop source repository on the trunk branch.)
> >
> > If you install our distribution, then Sqoop will be installed in
> > /usr/lib/hadoop-0.20/contrib/sqoop and /usr/bin/sqoop for you. There isn't
> > a separate package to install Sqoop independent of the rest of CDH; thus
> > no extra download link on our site.
> >
> > I hope this helps!
> >
> > Good luck,
> > - Aaron
> >
> >
> > On Wed, Mar 17, 2010 at 4:30 AM, Reik Schatz <[email protected]> wrote:
> >
> > > At least for MRUnit, I was not able to find it outside of the Cloudera
> > > distribution (CDH). What I did: installing CDH locally using apt
> > > (Ubuntu), searched for and copied the mrunit library into my local Maven
> > > repository, and removed CDH after. I guess the same is somehow possible
> > > for Sqoop.
> > >
> > > /Reik
> > >
> > >
> > > Utku Can Topçu wrote:
> > >
> > >> Dear All,
> > >>
> > >> I'm trying to run tests using MySQL as some kind of a datasource, so I
> > >> thought cloudera's sqoop would be a nice project to have in the
> > >> production.
> > >> However, I'm not using the cloudera's hadoop distribution right now,
> > >> and actually I'm not thinking of switching from a main project to a
> > >> fork.
> > >>
> > >> I read the documentation on sqoop at
> > >> http://www.cloudera.com/developers/downloads/sqoop/ but there are
> > >> actually no links for downloading the sqoop itself.
> > >>
> > >> Has anyone here know, and tried to use sqoop with the latest apache
> > >> hadoop?
> > >> If so can you give me some tips and tricks on it?
> > >>
> > >> Best Regards,
> > >> Utku
> > >>
> > >
> > > --
> > >
> > > *Reik Schatz*
> > > Technical Lead, Platform
> > > P: +46 8 562 470 00
> > > M: +46 76 25 29 872
> > > F: +46 8 562 470 01
> > > E: [email protected] <mailto:[email protected]>
> > > */bwin/* Games AB
> > > Klarabergsviadukten 82,
> > > 111 64 Stockholm, Sweden
> > >
> > > [This e-mail may contain confidential and/or privileged information. If
> > > you are not the intended recipient (or have received this e-mail in
> > > error) please notify the sender immediately and destroy this e-mail. Any
> > > unauthorised copying, disclosure or distribution of the material in this
> > > e-mail is strictly forbidden.]
