Ankur,

In this case, there is no data on the grid a priori; the data has to come into the grid from a DB. So what would the C/M mappers run on? Is there a way to run, say, 5 mappers without having 5 blocks of data on HDFS?
Just trying to wrap my head around this; please excuse me if I'm missing something obvious.

Thanks,
Jai

On 11/3/10 7:48 PM, "Ankur C. Goel" <[email protected]> wrote:

Hitting the database from multiple mappers is not such a great idea if there are hundreds/thousands of mappers involved, processing hundreds of GBs of data. This could easily saturate the I/O bandwidth of the database server, creating a bottleneck in the processing. Export and dump to HDFS is a better option.

-...@nkur

On 11/3/10 5:02 PM, "Anze" <[email protected]> wrote:

Sonal,

Thanks for answering! Hiho sounds nice, but from what I gathered, it is more of a low-level interface for efficient loading from and storing to SQL DBs? (In other words, there is no loader and storage for Pig yet.)

I wrote a batch to export the DB to local files and then copy them to HDFS, so there is no gain for me in using another type of export (unless it can be used directly from Pig and/or keeps the schema intact), but it's nice to know it exists.

It just seems weird that there is no DB loader for Pig yet. I tried writing one, but it would take more time than I have at the moment... I have a problem to solve ASAP. :)

Thanks,
Anze

On Wednesday 03 November 2010, Sonal Goyal wrote:
> Anze,
>
> You can check hiho as well:
>
> http://code.google.com/p/hiho/wiki/DatabaseImportFAQ
>
> Let me know if you need any help.
>
> Thanks and Regards,
> Sonal
>
> Sonal Goyal | Founder and CEO | Nube Technologies LLP
> http://www.nubetech.co | http://in.linkedin.com/in/sonalgoyal
>
> 2010/11/3 Anze <[email protected]>
>
> > Alejandro, thanks for answering!
> >
> > I was hoping it could be done directly from Pig, but... :)
> >
> > I'll take a look at Sqoop then, and if that doesn't help, I'll just
> > write a simple batch to export data to TXT/CSV. Thanks for the pointer!
> >
> > Anze
> >
> > On Wednesday 03 November 2010, Alejandro Abdelnur wrote:
> > > Not a 100% Pig solution, but you could use Sqoop to get the data in
> > > as a pre-processing step. And if you want to handle it all as a
> > > single job, you could use Oozie to create a workflow that does Sqoop
> > > and then your Pig processing.
> > >
> > > Alejandro
> > >
> > > On Wed, Nov 3, 2010 at 3:22 PM, Anze <[email protected]> wrote:
> > > > Hi!
> > > >
> > > > Part of the data I have resides in MySQL. Is there a loader that
> > > > would allow loading directly from it?
> > > >
> > > > I can't find anything on the net, but it seems to me this must be
> > > > quite a common problem.
> > > > I checked piggybank, but there is only DBStorage (and no DBLoader).
> > > >
> > > > Is some DBLoader out there too?
> > > >
> > > > Thanks,
> > > >
> > > > Anze
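[Editor's note] The export-then-load route discussed in this thread (Sqoop as a pre-processing step, per Alejandro, or a plain dump to HDFS, per Ankur) might look roughly like the sketch below. The connect string, credentials, table, column names, and paths are all hypothetical, and the Pig schema must be declared by hand since the MySQL schema is not carried over automatically:

```shell
# Hypothetical sketch: pull a MySQL table onto HDFS with Sqoop,
# then load the export in Pig. All names and paths are made up.
sqoop import \
  --connect jdbc:mysql://dbhost/mydb \
  --username etl -P \
  --table orders \
  --split-by id \
  -m 5 \
  --fields-terminated-by ',' \
  --target-dir /user/etl/orders

# Load the comma-separated export in Pig, re-declaring the schema:
pig -e "orders = LOAD '/user/etl/orders' USING PigStorage(',') \
        AS (id:int, customer:chararray, total:double); DUMP orders;"
```

The `-m 5` flag caps the import at 5 parallel tasks, which also limits the concurrent load on the database server that Ankur warns about.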
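[Editor's note] On Jai's question upthread about running 5 mappers without 5 blocks on HDFS: Sqoop (via Hadoop's DBInputFormat-style machinery) derives its input splits from ranges of a numeric key column in the table, not from HDFS blocks, so no data needs to exist on the grid beforehand. A minimal Python sketch of that splitting idea follows; the key bounds and even-division arithmetic are illustrative, not Sqoop's exact implementation:

```python
# Sketch: how N mappers can each get work with no HDFS blocks at all.
# The bounds (1, 1000) stand in for SELECT MIN(id), MAX(id) FROM table.

def make_splits(lo, hi, num_mappers):
    """Divide the inclusive key range [lo, hi] into num_mappers
    (start, end) ranges, one per mapper. Each mapper then runs its own
    bounded query: SELECT ... WHERE id >= start AND id <= end."""
    total = hi - lo + 1
    base, extra = divmod(total, num_mappers)
    splits, start = [], lo
    for i in range(num_mappers):
        size = base + (1 if i < extra else 0)
        splits.append((start, start + size - 1))
        start += size
    return splits

print(make_splits(1, 1000, 5))
# Five disjoint key ranges covering every row, one per mapper.
```

This is also why the thread recommends bounding the mapper count: each range still turns into a live query against the one database server.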
