Question about upgrading

2009-04-02 Thread Usman Waheed
Hello, I have a 5-node cluster with one master node. I am upgrading from 16.4 to 18.3 but am a little confused about whether I am doing it the right way. I read up on the documentation and how to use the -upgrade switch but want to make sure I haven't missed any step. First I took down the cluster by
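
For reference, a rough sketch of the upgrade sequence from the HDFS documentation, assuming a tarball install with the stock scripts (paths are examples):

    # 1. stop the old cluster from the master node
    bin/stop-all.sh
    # 2. install the new release, pointing it at the existing conf/ and dfs.name.dir
    # 3. bring HDFS up with -upgrade so the namenode converts its on-disk metadata
    bin/start-dfs.sh -upgrade
    # 4. watch progress, and finalize only after verifying the cluster is healthy
    bin/hadoop dfsadmin -upgradeProgress status
    bin/hadoop dfsadmin -finalizeUpgrade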

Hadoop Reduce Job errors, job gets killed.

2009-04-06 Thread Usman Waheed
Hi, My Hadoop Map/Reduce job is giving the following error message right about when it is 95% complete with the reduce step on one node. The process gets killed. The error message from the logs is noted below. *java.io.IOException: Filesystem closed*, any ideas please? 2009-04-06 10:41:07,202

One of my data nodes is underused: Datanode 4

2009-04-26 Thread Usman Waheed
Hi, One of my data nodes is practically underutilized in the cluster of 4 datanodes and one namenode. I executed hadoop dfs -setrep -w 2, changing it from replication factor 3 for all files in HDFS. All the nodes are balanced except one. Any clues? Shall I run hadoop balancer after the rep
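
A sketch of the two steps involved, assuming the whole filesystem is the target (the path / and the bin/ prefix are examples):

    # drop replication to 2 recursively and wait for it to take effect
    bin/hadoop dfs -setrep -R -w 2 /
    # then redistribute the existing blocks across the datanodes
    bin/hadoop balancer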

Re: One of my data nodes is underused: Datanode 4

2009-04-26 Thread Usman Waheed
hadoop dfsadmin -report On Sun, Apr 26, 2009 at 9:18 AM, Mithila Nagendra mnage...@asu.edu wrote: How do I list all the datanodes like you've shown in your example? Thanks Mithila On Sun, Apr 26, 2009 at 6:54 PM, Usman Waheed usm...@opera.com wrote: Hi, One of my data nodes
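
The command quoted above is all it takes; for reference:

    # print per-datanode capacity, usage, and remaining space
    bin/hadoop dfsadmin -report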

Balancing datanodes - Running hadoop 0.18.3

2009-04-27 Thread Usman Waheed
Hi, I had sent out an email yesterday asking how to balance the cluster after setting the replication level to 2. I have 4 datanodes and one namenode in my setup. Using the -R switch with -setrep did the trick but one of my nodes became underutilized. I then ran hadoop balancer and it

Re: Balancing datanodes - Running hadoop 0.18.3

2009-04-27 Thread Usman Waheed
is less than 4%. You can use a different threshold than the default 10% (hadoop balancer -threshold 5). Read more here: http://hadoop.apache.org/core/docs/current/hdfs_user_guide.html#Rebalancer Tamir On Mon, Apr 27, 2009 at 11:36 AM, Usman Waheed usm...@opera.com wrote: Hi, I had sent out
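
For reference, a threshold of 5 tells the balancer to keep each datanode's utilization within 5 percentage points of the cluster-wide average:

    # rebalance until every node is within 5% of the cluster average
    bin/hadoop balancer -threshold 5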

Can I make a node just an HDFS client to put/get data into hadoop

2009-04-29 Thread Usman Waheed
Hi All, Is it possible to make a node just a hadoop client so that it can put/get files into HDFS but not act as a namenode or datanode? I already have a master node and 3 datanodes but need to execute puts/gets into hadoop in parallel using more than just one machine other than the master.

Re: Can I make a node just an HDFS client to put/get data into hadoop

2009-04-29 Thread Usman Waheed
can use) are on the same LAN. This will give us the ability to put a multitude of files into HDFS quickly. Usman Waheed wrote: Hi All, Is it possible to make a node just a hadoop client so that it can put/get files into HDFS but not act as a namenode or datanode? I already have a master node

Re: Can I make a node just an HDFS client to put/get data into hadoop

2009-04-29 Thread Usman Waheed
and it worked. All our machines (potential clients we can use) are on the same LAN. This will give us the ability to put a multitude of files into HDFS quickly. Usman Waheed wrote: Hi All, Is it possible to make a node just a hadoop client so that it can put/get files into HDFS

Multiple HDFS clients

2009-05-01 Thread Usman Waheed
Hi, I just wanted to share a test we conducted on our small cluster of 3 datanodes and one namenode. Basically we have lots of data to process, and we run a parsing script outside hadoop that creates the key/value pairs. This output, which is plain text files, is then imported into hadoop
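
For context, the import itself is just plain puts run from several client machines at once; a sketch, with hypothetical paths:

    # run concurrently on each HDFS client machine, one slice of the parser output each
    bin/hadoop dfs -put /data/parsed/client1/ /user/stats/incoming/client1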

Re: Multiple HDFS clients

2009-05-01 Thread Usman Waheed
then also gauge performance by running the MAP/REDUCE as well. Thanks, Usman On Fri, May 1, 2009 at 4:22 AM, Usman Waheed usm...@opera.com wrote: Hi, I just wanted to share a test we conducted in our small cluster of 3 datanodes and one namenode. Basically we have lots of data to process and we

Opera Software AS - Job Opening: Hadoop Engineer

2009-06-03 Thread Usman Waheed
Greetings All, Opera Software AS (www.opera.com) in Oslo/Norway is looking for an experienced Hadoop Engineer to join the Statistics Team in order to provide business intelligence metrics both internally and to our customers. If you have the experience and are willing to relocate to beautiful

Re: How to place data into HDFS!

2009-06-05 Thread Usman Waheed
I have set up machines to act just as Hadoop clients which are not part of the actual cluster (master/slave config). The only thing is that these machines acting as hadoop clients were all internal to our network and I have not tested with remote machines outside our internal LAN. My assumption

Re: Placing data into HDFS..!

2009-06-08 Thread Usman Waheed
If you are going to be using this 8th machine as a client only, then ensure that it is running the same version of hadoop as your cluster. In the config file hadoop-site.xml, point fs.default.name to the namenode. -Usman Hello! I have a 7 node cluster. But there is one remote node (8th machine)
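
A minimal hadoop-site.xml for such a client-only machine might look like this (hostname and port are placeholders for your namenode's actual address):

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://namenode.example.com:9000/</value>
      </property>
    </configuration>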

Re: HDFS out of space

2009-06-22 Thread Usman Waheed
I have used the balancer to balance the data in the cluster with the -threshold option. The bandwidth transfer was set to 1MB/sec (I think that's the default setting) in one of the config files and I had to move 500GB of data around. It did take some time but eventually the data got spread
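
The config setting in question is dfs.balance.bandwidthPerSec in hadoop-site.xml, and 1048576 bytes (1 MB/s) is indeed the default. A sketch that raises it, assuming you can restart the datanodes (the value is an example):

    <property>
      <name>dfs.balance.bandwidthPerSec</name>
      <value>10485760</value> <!-- 10 MB/s; example value, default is 1048576 -->
    </property>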

Are .bz2 extensions supported in Hadoop 18.3

2009-06-24 Thread Usman Waheed
Hi All, Can I map/reduce logs that have the .bz2 extension in Hadoop 18.3? I tried, but interestingly the output was not what I expected compared with what I got when my data was in uncompressed format. Thanks, Usman
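
Per the follow-ups below, stock 0.18.3 ships no bzip2 codec (it arrived in 0.19, and Cloudera's 0.18.3 backported it), so .bz2 files get read as raw bytes. On a build that does include the codec, registering it in hadoop-site.xml looks roughly like:

    <property>
      <name>io.compression.codecs</name>
      <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
    </property>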

Re: Are .bz2 extensions supported in Hadoop 18.3

2009-06-24 Thread Usman Waheed
Gross -Original Message- From: Usman Waheed [mailto:usm...@opera.com] Sent: Wednesday, June 24, 2009 10:09 AM To: core-user@hadoop.apache.org Subject: Re: Are .bz2 extensions supported in Hadoop 18.3 The version (18.3) I am running in my cluster is the tarball I got from

Re: Are .bz2 extensions supported in Hadoop 18.3

2009-06-24 Thread Usman Waheed
might try Pig (this is such a cool platform!) Hope it helps. Best regards, Danny -Original Message- From: Usman Waheed [mailto:usm...@opera.com] Sent: Wednesday, June 24, 2009 10:32 AM To: core-user@hadoop.apache.org Subject: Re: Are .bz2 extensions supported in Hadoop 18.3 Hi Danny

Re: Are .bz2 extensions supported in Hadoop 18.3

2009-06-24 Thread Usman Waheed
download RPMs and Ubuntu packages as well as preconfigured EC2 images from: http://www.cloudera.com/hadoop Cheers, Christophe On Wed, Jun 24, 2009 at 6:47 AM, jason hadoop jason.had...@gmail.com wrote: I believe the Cloudera 18.3 supports bzip2 On Wed, Jun 24, 2009 at 3:45 AM, Usman Waheed usm

Re: Rebalancing Hadoop Cluster running 15.3

2009-06-25 Thread Usman Waheed
Hi Tom, Thanks for the trick :). I tried setting the replication to 3 in hadoop-default.xml but then the namenode log file in /var/log/hadoop started filling up with messages like the following: 2009-06-24 14:39:06,338 INFO org.apache.hadoop.dfs.StateChange: STATE*

Re: Rebalancing Hadoop Cluster running 15.3

2009-06-25 Thread Usman Waheed
Thanks much, Cheers, Usman You can change the value of hadoop.root.logger in conf/log4j.properties to change the log level globally. See also the section Custom Logging levels in the same file to set levels on a per-component basis. You can also use hadoop daemonlog to set log levels on a
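
For the log noise described above, hadoop daemonlog can change a daemon's log level at runtime; a minimal sketch (the hostname is a placeholder, 50070 is the default namenode HTTP port):

    # quiet the StateChange logger on the namenode without a restart
    bin/hadoop daemonlog -setlevel namenode.example.com:50070 org.apache.hadoop.dfs.StateChange WARN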

Error while trying to run map/reduce job

2009-06-26 Thread Usman Waheed
Hi All, On one of the test clusters, when I try to launch a map/reduce job it fails. I am getting the following error in my jobtracker.log on the namenode: 2009-06-26 15:20:12,811 INFO org.apache.hadoop.mapred.JobTracker: Adding task

Map/Reduce Errors

2009-06-26 Thread Usman Waheed
Hi All, I had posted a question earlier regarding some not-so-intuitive error messages I was getting on one of the clusters when trying to map/reduce. After many hours of googling :) I found a post that solved my problem.