Hadoop is a great way to offload ETL jobs (especially aggregation) out of the DB. More than likely you would want to use Hadoop to produce a file you can then load into the database as a batch job (Data Pump or SQL*Loader with Oracle, for example), keeping the load step outside of Hadoop entirely. Establishing database connections inside map/reduce tasks is unlikely to be ideal.
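To illustrate the pattern Ryan describes, here is a minimal sketch of a Hadoop Streaming-style reducer in Python. It assumes tab-separated "key<TAB>value" input sorted by key (as Hadoop delivers it to reducers) and emits comma-delimited rows that a batch loader such as SQL*Loader could ingest; the function and field names are illustrative, not from the thread.

```python
#!/usr/bin/env python
# Sketch of a Hadoop Streaming reducer: reads sorted "key\tvalue" lines
# from stdin, sums the values per key, and writes comma-delimited rows
# suitable for a batch load (e.g. via a SQL*Loader control file).
# All names here are hypothetical.
import sys

def aggregate(lines):
    """Yield (key, total) pairs from sorted 'key\\tvalue' lines."""
    current_key, total = None, 0
    for line in lines:
        key, _, value = line.rstrip("\n").partition("\t")
        if key != current_key:
            if current_key is not None:
                yield current_key, total
            current_key, total = key, 0
        total += int(value)
    if current_key is not None:
        yield current_key, total

if __name__ == "__main__":
    for key, total in aggregate(sys.stdin):
        # Comma-delimited output, ready for a batch loader.
        print("%s,%d" % (key, total))
```

You would plug a script like this into `hadoop jar hadoop-streaming.jar -reducer ...`, pull the result out of HDFS (e.g. with `hadoop fs -getmerge`), and hand the file to SQL*Loader entirely outside Hadoop, so no database connections are opened from inside the cluster.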
Regards,
Ryan

-----Original Message-----
From: Chris K Wensel [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, July 02, 2008 11:31 AM
To: core-user@hadoop.apache.org
Subject: Re: hadoop in the ETL process

If you're referring to loading an RDBMS with data on Hadoop, this is doable, but you will need to write your own JDBC adapters to your tables. You might also review what you are using the RDBMS for and see whether those jobs would be better off running on Hadoop entirely, if not for most of the processing.

ckw

On Jul 2, 2008, at 10:51 AM, David J. O'Dell wrote:

> Is anyone using hadoop for any part of the ETL process?
>
> Given its ability to process large amounts of log files this seems
> like a good fit.
>
> --
> David O'Dell
> Director, Operations
> e: [EMAIL PROTECTED]
> t: (415) 738-5152
> 180 Townsend St., Third Floor
> San Francisco, CA 94107

--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/