Hitting the database from multiple mappers is not a great idea if hundreds
or thousands of mappers are involved, processing hundreds of GB of data.
That load could easily saturate the I/O bandwidth of the database server and
create a bottleneck in the processing. Exporting the data and dumping it to
HDFS is a better option.
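
For illustration, here is a minimal sketch of the export-then-load route, under
a few assumptions: `export_table_to_tsv` and `hdfs_put_command` are
hypothetical helper names, the connection is any Python DB-API connection
(a MySQL driver in practice; the standard-library `sqlite3` module works as a
stand-in for trying it out), and the file is copied with the stock
`hadoop fs -put` command. One sequential reader pulls the table once, instead
of every mapper opening its own connection.

```python
import csv
import sqlite3  # stand-in only; a MySQL DB-API driver would be used in practice


def export_table_to_tsv(conn, table, out_path, batch_size=10000):
    """Dump every row of `table` to a tab-separated file; return the row count."""
    cur = conn.cursor()
    # Table names cannot be bound as query parameters, so `table` must be
    # trusted input (hypothetical helper, not part of any library).
    cur.execute("SELECT * FROM " + table)
    count = 0
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        while True:
            # Fetch in batches so a large table is never held in memory at once.
            rows = cur.fetchmany(batch_size)
            if not rows:
                break
            writer.writerows(rows)
            count += len(rows)
    return count


def hdfs_put_command(local_path, hdfs_dir):
    """Build the argument list that copies the exported file into HDFS."""
    return ["hadoop", "fs", "-put", local_path, hdfs_dir]
```

The command list can be handed to `subprocess.run(...)` on a machine with the
Hadoop client installed; the resulting file can then be read from Pig as
tab-separated text.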

-...@nkur

On 11/3/10 5:02 PM, "Anze" <[email protected]> wrote:

Sonal,

Thanks for answering!

Hiho sounds nice, but from what I gathered, it is more a low-level interface
for efficient loading from and storing to SQL DBs?
(in other words, there is no loader and storage for Pig yet)

I wrote a batch to export DB to local files and then copy them to HDFS, so
there is no gain for me in using another type of export (unless it can be used
directly from Pig and/or keeps the schema intact), but it's nice to know it
exists.

It just seems weird that there is no DB loader for Pig yet. I tried writing it
but it would take more time than I have at the moment... I have a problem to
solve ASAP. :)

Thanks,

Anze



On Wednesday 03 November 2010, Sonal Goyal wrote:
> Anze,
>
> You can check hiho as well:
>
> http://code.google.com/p/hiho/wiki/DatabaseImportFAQ
>
> Let me know if you need any help.
>
> Thanks and Regards,
> Sonal
>
> Sonal Goyal | Founder and CEO | Nube Technologies LLP
> http://www.nubetech.co | http://in.linkedin.com/in/sonalgoyal
>
>
>
>
>
> 2010/11/3 Anze <[email protected]>
>
> > Alejandro, thanks for answering!
> >
> > I was hoping it could be done directly from Pig, but... :)
> >
> > I'll take a look at Sqoop then, and if that doesn't help, I'll just write
> > a simple batch to export data to TXT/CSV. Thanks for the pointer!
> >
> > Anze
> >
> > On Wednesday 03 November 2010, Alejandro Abdelnur wrote:
> > > Not a 100% Pig solution, but you could use Sqoop to get the data in as
> > > a pre-processing step. And if you want to handle all as single job,
> > > you could use Oozie to create a workflow that does Sqoop and then your
> > > Pig processing.
> > >
> > > Alejandro
> > >
> > > On Wed, Nov 3, 2010 at 3:22 PM, Anze <[email protected]> wrote:
> > > > Hi!
> > > >
> > > > Part of data I have resides in MySQL. Is there a loader that would
> > > > allow loading directly from it?
> > > >
> > > > I can't find anything on the net, but it seems to me this must be a
> > > > quite common problem.
> > > > I checked piggybank but there is only DBStorage (and no DBLoader).
> > > >
> > > > Is some DBLoader out there too?
> > > >
> > > > Thanks,
> > > >
> > > > Anze

