Re: Hadoop install

2012-02-18 Thread Keith Wiley
I always use the Cloudera packages, CDH3 I think it's called...but it isn't the 
latest by a long shot.  It's still .20.  I think Hadoop is nearly to .23, although 
I'm not up on those kinds of details.  I mentioned Cloudera's 
distribution because it falls into place pretty smoothly.  For example, a few 
weeks ago I downloaded and installed it on a Mac in a few hours (and ran an 
example), and then installed it on a Tier 3 Linux VM and had it running examples 
there too.

On Feb 18, 2012, at 06:24, Mohit Anchlia wrote:

 What's the best way or guide to install the latest Hadoop? Is the latest Hadoop
 still .20, which is what comes up in a Google search? Could someone point me to
 the latest Hadoop distribution? I also need Pig and Mahout's XmlInputFormat.



Keith Wiley kwi...@keithwiley.com keithwiley.com music.keithwiley.com

It's a fine line between meticulous and obsessive-compulsive and a slippery
rope between obsessive-compulsive and debilitatingly slow.
   --  Keith Wiley




Re: Hadoop install

2012-02-18 Thread Mohit Anchlia
Thanks. Do I have to do anything special to get the Mahout XmlInputFormat and
Pig working with the new release of Hadoop?

On Sat, Feb 18, 2012 at 6:42 AM, Tom Deutsch tdeut...@us.ibm.com wrote:

 Mohit - one place to start is here:

 http://hadoop.apache.org/common/releases.html#Download

 The release notes, as always, are well worth reading.

 
 Tom Deutsch
 Program Director
 Information Management
 Big Data Technologies
 IBM
 3565 Harbor Blvd
 Costa Mesa, CA 92626-1420
 tdeut...@us.ibm.com




 Mohit Anchlia mohitanch...@gmail.com
 02/18/2012 06:24 AM
 Please respond to: common-user@hadoop.apache.org
 To: common-user@hadoop.apache.org
 cc:
 Subject: Hadoop install

 What's the best way or guide to install the latest Hadoop? Is the latest Hadoop
 still .20, which is what comes up in a Google search? Could someone point me to
 the latest Hadoop distribution? I also need Pig and Mahout's XmlInputFormat.




Hi, I'm a graduate student and I have one question: multiple Hadoop install.

2011-03-01 Thread Sungho Jeon
Hi, I'm a graduate student and my major is computer science, data mining.
Is it possible to install multiple Hadoop instances on one node?


I mean, I want to install several Hadoop instances that have different confs.
Specifically, one Hadoop has 5 datanodes and the other Hadoop has 10 datanodes.


Of course I can control the number of datanodes by changing the conf and restarting.
But without changing the conf, is it possible to install multiple Hadoop instances on one node?

Thanks


Re: Hi, I'm a graduate student and I have one question: multiple Hadoop install.

2011-03-01 Thread Harsh J
Hello,

On Tue, Mar 1, 2011 at 5:49 PM, Sungho Jeon sdev...@gmail.com wrote:
 Hi, I'm a graduate student and my major is computer science, data mining.
 Is it possible to install multiple Hadoop instances on one node?

It is possible, but what would you gain from this? One DN can handle
several disks in parallel just fine, I think.

 I mean, I want to install several Hadoop instances that have different confs.
 Specifically, one Hadoop has 5 datanodes and the other Hadoop has 10 datanodes.

This is possible with a bunch of datanode configuration tweaks (data
directories, IPC and HTTP ports, logging directories, etc.).

 Of course I can control the number of datanodes by changing the conf and restarting.
 But without changing the conf, is it possible to install multiple Hadoop instances on one node?

Not possible without a certain set of configuration changes per
instance. You can, however, keep a set of separate conf directories and use
`hadoop-daemon.sh --config conf_dir_for_this_instance start datanode`
to start each of the DNs.
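
To make that concrete, here is a hedged sketch (the conf directory names,
ports, and log locations are invented for illustration): give each instance a
conf dir whose hdfs-site.xml overrides dfs.data.dir, dfs.datanode.address,
dfs.datanode.http.address, and dfs.datanode.ipc.address, then start them one
at a time:

# start two local datanodes, each with its own conf, logs, and pid files
for i in 1 2; do
  export HADOOP_LOG_DIR=/var/log/hadoop/dn$i   # hadoop-daemon.sh honors these,
  export HADOOP_PID_DIR=/var/run/hadoop/dn$i   # keeping the daemons apart
  bin/hadoop-daemon.sh --config /etc/hadoop/conf.dn$i start datanode
done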

-- 
Harsh J
www.harshj.com


Re: Hi, I'm a graduate student and I have one question: multiple Hadoop install.

2011-03-01 Thread Matthew Foley
Hi Sungho,
Here is a recipe for how to run multiple nodes on a single server, posted to 
this list on Sept. 15:
http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201009.mbox/%3c8a898c33-dc4e-418c-adc0-5689d434b...@yahoo-inc.com%3E

For v22 and later, the world has been split into three parts; where there was 
formerly HADOOP_HOME, there are now HADOOP_COMMON_HOME, HADOOP_HDFS_HOME, and 
HADOOP_MAPRED_HOME, and in the default configuration each of them has its own 
conf/ subdirectory.  However, it is acceptable to pile all the contents of 
the three conf directories into a single conf directory somewhere else (the 
only name conflict is configuration.xsl, which can be shared), set an 
environment variable $HADOOP_CONF_DIR to point to it, and pass that value in 
with the --config option whenever you launch processes with bin/hadoop or 
bin/hdfs.
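
The merge itself is only a few lines of shell; here is a hedged sketch (the
directory names are illustrative, not prescribed):

# collect the three conf/ directories into one merged conf dir
mkdir -p ~/hadoop-conf
cp $HADOOP_COMMON_HOME/conf/* ~/hadoop-conf/
cp $HADOOP_HDFS_HOME/conf/* ~/hadoop-conf/    # configuration.xsl collides; one shared copy is fine
cp $HADOOP_MAPRED_HOME/conf/* ~/hadoop-conf/
export HADOOP_CONF_DIR=~/hadoop-conf
# then pass it explicitly on every launch, e.g.
#   bin/hdfs --config $HADOOP_CONF_DIR ...
#   bin/hadoop --config $HADOOP_CONF_DIR ...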

Now, the above recipe assumes you want multiple nodes from ONE cluster running 
on a single server.  I suggest you start with that and get it working, so you 
understand the hdfs-site.xml file and how it is used.

You seem to be asking to run multiple CLUSTERS on a single server.  I believe 
the same mechanism will work (pointing different node invocations at 
different config directories), but you will need to make several more changes 
in the $HADOOP_CONF_DIR/hdfs-site.xml files, to create different namenode 
configurations as well as the different datanode configurations addressed in 
the recipe.  Please look at the documentation for which parameters to change.

A couple comments:
-  You probably can't run two namenodes simultaneously on the same server, 
unless it has a huge amount of memory and you don't care about performance.  
But you can keep two different configurations stored, and run them at different 
times.
-  If the ONLY difference between the two clusters is the number of datanodes, you 
actually don't have to have different namenode configurations.  You can just 
configure 10 datanodes, and then sometimes run only 5 of them (clearing storage 
in between test runs, of course, so it doesn't look like you lost half your 
stored blocks!).  This is because a namenode has no configuration for which or 
how many datanodes to expect; it simply accepts registration from any 
datanode that initiates communication with it.
-  Your statement "I can control the number of datanodes by changing the conf and 
restarting" is therefore not entirely correct.  Each datanode launched has to be 
pointed at its own config, but there is no place in the config to define how many 
datanodes to launch.  (This is partly because running multiple nodes on a single 
server is not considered normal for a production environment, even though it is 
useful for a test environment.)  You may be thinking of the slaves file, which 
is used by some launch scripts, but that is a tool to assist users in launching 
clusters, not part of the namenode configuration, and it is also not really 
oriented toward launching multiple nodes on a single server, if you read the scripts.

If you want launch scripts to help you locally launch different numbers of 
nodes with different configs, you'll have to write them yourself, but they're 
really easy.  They just consist of multiple lines that look like

$HADOOP_COMMON_HOME/bin/hadoop-daemon.sh --config $HADOOP_CONF_DIR \
    --script $HADOOP_HDFS_HOME/bin/hdfs start datanode|namenode [args]

with different values of $HADOOP_CONF_DIR for each line.
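
For instance, here is a hedged sketch of a start script for five local
datanodes (the conf-directory layout is invented):

#!/bin/sh
# launch five datanodes, each pointed at its own conf directory
for i in 1 2 3 4 5; do
  $HADOOP_COMMON_HOME/bin/hadoop-daemon.sh --config /etc/hadoop/conf.dn$i \
      --script $HADOOP_HDFS_HOME/bin/hdfs start datanode
done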

The same lines with stop instead of start will give you a well-behaved kill 
script.
As always, you have to start and stop each node as the appropriate userId so the 
daemons have the read/write and I/O permissions they need.

Hope this helps,
--Matt


On Mar 1, 2011, at 4:19 AM, Sungho Jeon wrote:

Hi, I'm a graduate student and my major is computer science, data mining.
Is it possible to install multiple Hadoop instances on one node?


I mean, I want to install several Hadoop instances that have different confs.
Specifically, one Hadoop has 5 datanodes and the other Hadoop has 10 datanodes.


Of course I can control the number of datanodes by changing the conf and restarting.
But without changing the conf, is it possible to install multiple Hadoop instances on one node?

Thanks