Thanks a lot, Bejoy, that makes sense :)

Suppose I have a MySQL database on some other node (not in the Hadoop
cluster). Can I import its tables into my HDFS using Sqoop?
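
To make the question concrete, here is a minimal sketch of the kind of import
I have in mind, written as a small Python helper that only builds the
`sqoop import` command line and does not run it. The host dbhost.example.com,
database sales, table orders, user manu and target directory
/user/manu/orders are all placeholders I made up. As far as I understand, the
MySQL node would also have to be reachable over the network from the cluster
nodes, since the map tasks open the JDBC connections.

```python
import shlex


def build_sqoop_import(host, database, table, username, target_dir,
                       port=3306, num_mappers=1):
    """Return the argv list for a `sqoop import` of one MySQL table into HDFS.

    All argument values are supplied by the caller; nothing is executed here.
    """
    # JDBC URL pointing at the MySQL server that lives outside the cluster.
    jdbc_url = f"jdbc:mysql://{host}:{port}/{database}"
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "-P",                          # prompt for the password on the console
        "--table", table,
        "--target-dir", target_dir,    # HDFS directory for the imported files
        "--num-mappers", str(num_mappers),
    ]


# Hypothetical values, for illustration only.
cmd = build_sqoop_import("dbhost.example.com", "sales", "orders",
                         "manu", "/user/manu/orders")
print(shlex.join(cmd))
```

The printed line is what I would then run from the client node.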


On Thu, Mar 15, 2012 at 6:27 PM, Bejoy Ks <bejoy.had...@gmail.com> wrote:

> Hi Manu
>      Please find my responses inline
>
> >I had read that we can install Pig, Hive & Sqoop on the client node, with
> >no need to install them in the cluster. What is the client node actually?
> >Can I use my management-node as a client?
>
> On larger clusters there is a separate node, outside the Hadoop cluster,
> where these tools are installed, and user programs are triggered from it.
> This is the node referred to as the client node, edge node, etc. For your
> cluster, the management node and the client node can be the same.
>
> >What is the best practice to install Pig, Hive, & Sqoop?
>
> On a client node
>
> >For the fully distributed cluster, do we need to install Pig, Hive & Sqoop
> >on each node?
>
> No, they can be on a client node or on any one of the nodes.
>
> >MySQL is needed for Hive as a metastore, and Sqoop can import a MySQL
> >database to HDFS, Hive or Pig, so can we make use of MySQL DBs residing on
> >another node?
>
> Regarding your first point, a Sqoop import serves a different purpose:
> getting data from an RDBMS into HDFS. The metastore, on the other hand, is
> used by Hive when framing the MapReduce jobs that correspond to your Hive
> query, so Sqoop can't help you much here.
> I recommend keeping Hive's metastore DB on the same node where Hive is
> installed, because executing Hive queries requires frequent metadata
> lookups, especially when a table has a large number of partitions.
>
> Regards
> Bejoy.K.S
>
> On Thu, Mar 15, 2012 at 5:34 PM, Manu S <manupk...@gmail.com> wrote:
>
> > Greetings All !!!
> >
> > I am using Cloudera CDH3 for Hadoop deployment. We have 7 nodes, of which
> > 5 are used for a fully distributed cluster, 1 for pseudo-distributed & 1
> > as management-node.
> >
> > Fully distributed cluster: HDFS, MapReduce & HBase cluster
> > Pseudo distributed mode: All
> >
> > I had read that we can install Pig, Hive & Sqoop on the client node, with
> > no need to install them in the cluster. What is the client node actually?
> > Can I use my management-node as a client?
> >
> > What is the best practice to install Pig, Hive, & Sqoop?
> > For the fully distributed cluster, do we need to install Pig, Hive & Sqoop
> > on each node?
> >
> > MySQL is needed for Hive as a metastore, and Sqoop can import a MySQL
> > database to HDFS, Hive or Pig, so can we make use of MySQL DBs residing on
> > another node?
> >
> > --
> > Thanks & Regards
> > ----
> > Manu S
> > SI Engineer - OpenSource & HPC
> > Wipro Infotech
> > Mob: +91 8861302855                Skype: manuspkd
> > www.opensourcetalk.co.in
> >
>


