Eric,
   Thanks for the details. I took a quick look at the link and it seems like
a tool that would help me here. Do I need to download all of Cloudera's
Distribution for Hadoop <http://www.cloudera.com/hadoop> just to get Sqoop?
I already have Hadoop, Hive, and Pig set up.
I appreciate your input,
Shiva

On Fri, Jan 22, 2010 at 1:53 PM, Eric Sammer <[email protected]> wrote:

> On 1/22/10 3:09 PM, Shiva wrote:
> > I can try that. Here is what I am trying to do.
> >
> > Load some fact data from a file (say, weblogs moved to HDFS after some
> > cleanup and transformation) and then summarize it at a daily or weekly
> > level. For that, I would like to create one fact table that gets loaded
> > with daily data, and bring in dimensional data from MySQL to perform
> > the summarization.
> >
> > I'd appreciate any input on this technique, its performance, and how I
> > can get dimensional data into Hive (from MySQL -> file -> HDFS -> Hive).
> > Thanks,
> > Shiva
>
> Shiva:
>
> This is very common. I use Hive to do something very similar.
>
> Cloudera has a tool called Sqoop that will "export" MySQL tables to
> files on HDFS that Hive can understand. Once there, you can easily join
> the data in your Hive queries.
>
> http://www.cloudera.com/hadoop-sqoop
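>
> For example, a minimal import might look something like the sketch below.
> Treat it as a rough outline: the exact invocation varies by Sqoop version,
> and the connect string, username, and table name are placeholders for
> whatever your MySQL setup actually uses.
>
>   sqoop import \
>     --connect jdbc:mysql://db-host/warehouse \
>     --username etl_user -P \
>     --table dim_product \
>     --hive-import
>
> The --hive-import flag tells Sqoop to create a matching Hive table and
> load the imported data into it, rather than leaving raw files on HDFS
> for you to wire up yourself.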
>
> Sqoop is smarter than just exporting to a local file system and then
> copying to HDFS, and it should save you a fair amount of time and
> effort. Check out the link.
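>
> Once the dimension tables are in Hive, the daily rollup you described is
> just a join plus a group-by. Here is a rough sketch (all table and column
> names are made up for illustration):
>
>   SELECT d.category,
>          f.log_date,
>          COUNT(*)     AS hits,
>          SUM(f.bytes) AS total_bytes
>   FROM weblog_facts f
>   JOIN dim_product d ON (f.product_id = d.product_id)
>   GROUP BY d.category, f.log_date;
>
> You could also INSERT OVERWRITE the result into a daily summary table so
> that weekly rollups read from the smaller aggregate instead of the raw
> facts.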
>
> Hope this helps.
> --
> Eric Sammer
> [email protected]
> http://esammer.blogspot.com
>
