Re: Error in Cluster Startup: NameNode is not formatted

2009-06-26 Thread Matt Massie

Boyu-

You didn't do anything stupid. I've forgotten to format a NameNode myself, too.


If you check the QuickStart guide at http://hadoop.apache.org/core/docs/current/quickstart.html you'll see that formatting the NameNode is the first step of the Execution section (near the bottom of the page).


The command to format the NameNode is:

hadoop namenode -format

A warning though: you should only format your NameNode once. Just like formatting any filesystem, you can lose data if you (re)format.


Good luck.

-Matt

On Jun 26, 2009, at 1:25 PM, Boyu Zhang wrote:


Hi all,

I am a student and I am trying to install Hadoop on a cluster. I have
one machine running the namenode, one running the jobtracker, and two slaves.

When I run bin/start-dfs.sh, there is something wrong with my
namenode: it won't start. Here is the error message in the log file:

ERROR org.apache.hadoop.fs.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException: NameNode is not formatted.
   at org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:243)
   at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:80)
   at org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:294)
   at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:273)
   at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:148)
   at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:193)
   at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:179)
   at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:830)
   at org.apache.hadoop.dfs.NameNode.main(NameNode.java:839)


I think it is something stupid I did; could somebody help me out? Thanks a lot!


Sincerely,

Boyu Zhang




Re: Error in Cluster Startup: NameNode is not formatted

2009-06-26 Thread Matt Massie
The property dfs.name.dir allows you to control where Hadoop writes  
NameNode metadata.


You should have a property like

<property>
  <name>dfs.name.dir</name>
  <value>/data/zhang/hadoop/name/data</value>
</property>

to make sure the NameNode data isn't being deleted when you delete the  
files in /tmp.
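
As a quick sanity check (assuming dfs.name.dir is set as above; the exact file names can vary a bit by Hadoop version), you can format and then confirm that the NameNode metadata was actually written:

% bin/hadoop namenode -format
% ls /data/zhang/hadoop/name/data/current
(this should list files such as VERSION, fsimage, edits and fstime)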


-Matt


On Jun 26, 2009, at 2:33 PM, Boyu Zhang wrote:


Matt,

Thanks a lot for your reply! I did format the namenode, but I got the
same error again. Actually, I successfully ran the example jar file once,
but after that one time I couldn't get it to run again. I clean the /tmp dir
every time before I format the namenode again (I am just testing it, so I
don't worry about losing data :). Still, I get the same error when I execute
bin/start-dfs.sh. I checked my conf, and I can't figure out why. Here is my
conf file:

I would really appreciate it if you could take a look at it. Thanks a lot.


<configuration>

<property>
  <name>fs.default.name</name>
  <value>hdfs://hostname1:9000</value>
</property>

<property>
  <name>mapred.job.tracker</name>
  <value>hostname2:9001</value>
</property>

<property>
  <name>dfs.data.dir</name>
  <value>/data/zhang/hadoop/dfs/data</value>
  <description>Determines where on the local filesystem a DFS data node
  should store its blocks.  If this is a comma-delimited
  list of directories, then data will be stored in all named
  directories, typically on different devices.
  Directories that do not exist are ignored.
  </description>
</property>

<property>
  <name>mapred.local.dir</name>
  <value>/data/zhang/hadoop/mapred/local</value>
  <description>The local directory where MapReduce stores intermediate
  data files.  May be a comma-separated list of
  directories on different devices in order to spread disk i/o.
  Directories that do not exist are ignored.
  </description>
</property>

</configuration>






Re: UnknownHostException

2009-06-23 Thread Matt Massie
fs.default.name in your hadoop-site.xml needs to be set to a fully-qualified domain name (instead of an IP address).
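
For example, something like this in hadoop-site.xml (the hostname and port below are only placeholders for your NameNode's real DNS name and IPC port):

<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode.example.com:9000</value>
</property>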


-Matt

On Jun 23, 2009, at 6:42 AM, bharath vissapragada wrote:

When I try to execute the command bin/start-dfs.sh, I get the following
error. I have checked the hadoop-site.xml file on all the nodes, and they
are fine. Can someone help me out?

10.2.24.21: Exception in thread "main" java.net.UnknownHostException: unknown host: 10.2.24.21
10.2.24.21:     at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:195)
10.2.24.21:     at org.apache.hadoop.ipc.Client.getConnection(Client.java:779)
10.2.24.21:     at org.apache.hadoop.ipc.Client.call(Client.java:704)
10.2.24.21:     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
10.2.24.21:     at org.apache.hadoop.dfs.$Proxy4.getProtocolVersion(Unknown Source)
10.2.24.21:     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
10.2.24.21:     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:306)
10.2.24.21:     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:343)
10.2.24.21:     at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:288)




Re: HDFS out of space

2009-06-22 Thread Matt Massie

Pankil-

I'd be interested to know the size of the /mnt and /mnt2 partitions.   
Are they the same?  Can you run the following and report the output...


% df -h /mnt /mnt2

Thanks.

-Matt

On Jun 22, 2009, at 1:32 PM, Pankil Doshi wrote:


Hey Alex,

Will Hadoop balancer utility work in this case?

Pankil

On Mon, Jun 22, 2009 at 4:30 PM, Alex Loddengaard  
a...@cloudera.com wrote:


Are you seeing any exceptions because of the disk being at 99% capacity?

Hadoop should do something sane here and write new data to the disk with more capacity. That said, it is ideal to be balanced. As far as I know, there is no way to balance an individual DataNode's hard drives (Hadoop does round-robin scheduling when writing data).

Alex

On Mon, Jun 22, 2009 at 10:12 AM, Kris Jirapinyo kjirapi...@biz360.com

wrote:



Hi all,
   How does one handle a mount running out of space for HDFS? We have two
disks mounted on /mnt and /mnt2 respectively on one of the machines that
are used for HDFS, and /mnt is at 99% while /mnt2 is at 30%. Is there a way
to tell the machine to balance itself out? I know for the cluster you can
balance it using start-balancer.sh, but I don't think that it will tell the
individual machine to balance itself out. Our hack right now would be just
to delete the data on /mnt; since we have replication of 3x, we should
be OK. But I'd prefer not to do that. Any thoughts?
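
(For reference, the cluster-wide balancer mentioned above is started as shown below; the threshold, a percentage of allowed disk-usage deviation, is only illustrative. Note that it rebalances blocks across DataNodes, not across the disks inside a single DataNode.)

% bin/start-balancer.sh -threshold 10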







Re: Need help

2009-06-18 Thread Matt Massie
Hadoop can be run on a hardware-heterogeneous cluster. Currently,
Hadoop clusters really only run well on Linux, although you can run a
Hadoop client on non-Linux machines.


You will need to have a special configuration for each of the machines
in your cluster based on its hardware profile. Ideally, you'll be
able to group the machines in your cluster into classes of machines
(e.g. machines with 1GB of RAM and 2 cores versus 4GB of RAM and 4
cores) to reduce the burden of managing multiple configurations. If
you are talking about a Hadoop cluster that is completely
heterogeneous (each machine is completely different), the management
overhead could be high.


Configuration variables like mapred.tasktracker.map.tasks.maximum  
and mapred.tasktracker.reduce.tasks.maximum should be set based on  
the number of cores/memory in each machine.  Variables like  
mapred.child.java.opts need to be set differently based on the  
amount of memory the machine has (e.g. -Xmx250m).  You should have  
at least 250MB of memory dedicated to each task although more is  
better.  It's also wise to make sure that each task has the same  
amount of memory regardless of the machine it's scheduled on;  
otherwise, tasks might succeed or fail based on which machine gets the  
task.  This asymmetry will make debugging harder.
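
As an illustration only (the numbers below are made up for a hypothetical class of 4-core, 4GB machines, not a recommendation), the relevant hadoop-site.xml entries look like:

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
</property>

<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
</property>

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>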


You can use our online configurator (http://www.cloudera.com/configurator/)
to generate optimized configurations for each class of machines in
your cluster. It will ask simple questions about your configuration
and then produce a hadoop-site.xml file.


Good luck!
-Matt

On Jun 18, 2009, at 8:33 AM, ashish pareek wrote:

Can you tell me a few of the challenges in configuring a heterogeneous
cluster... or pass on some link where I could get some information regarding
the challenges of running Hadoop on heterogeneous hardware?

One more thing: how about running different applications on the same
Hadoop cluster, and what challenges are involved in it?

Thanks,
Regards,
Ashish


On Thu, Jun 18, 2009 at 8:53 PM, jason hadoop  
jason.had...@gmail.comwrote:



I don't know anyone who has a completely homogeneous cluster.

So hadoop is scalable across heterogeneous environments.

I stated that configuration is simpler if the machines are similar. (There
are optimizations in configuration for near-homogeneous machines.)

On Thu, Jun 18, 2009 at 8:10 AM, ashish pareek pareek...@gmail.com
wrote:

Does that mean Hadoop is not scalable with respect to heterogeneous
environments? And one more question: can we run different applications
on the same Hadoop cluster?

Thanks.
Regards,
Ashish

On Thu, Jun 18, 2009 at 8:30 PM, jason hadoop  
jason.had...@gmail.com

wrote:



Hadoop has always been reasonably agnostic with respect to hardware and
homogeneity. There are optimizations in configuration for near-homogeneous
machines.




On Thu, Jun 18, 2009 at 7:46 AM, ashish pareek  
pareek...@gmail.com

wrote:


Hello,
   I am doing my master's, and my final-year project is on Hadoop... so I
would like to know something about Hadoop clusters, i.e., are new versions
of Hadoop able to handle heterogeneous hardware? If you have any
information regarding this, please mail me, as my project is in a
heterogeneous environment.


Thanks!

Reagrds,
Ashish Pareek





--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals







--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals





Re: Small Issues..!

2009-06-15 Thread Matt Massie

On Jun 14, 2009, at 11:01 PM, Sugandha Naolekar wrote:


Hello!

I have a 4-node cluster of Hadoop running. Now, there is a 5th machine which
is acting as a client of Hadoop. It's not a part of the Hadoop
cluster (master/slave config files). Now I have to write Java code that
gets executed on this client, which will simply put the client system's data
into HDFS (and get it replicated over 2 datanodes), and, as per my requirement,
I can simply fetch it back on the client machine itself.

For this, I have done the following things as of now:

***
- Among the 4 nodes, 2 are datanodes and the other 2 are the namenode and
jobtracker respectively.
***

***
- Now, to make that code work on the client machine, I have designed a UI.
Here on the client m/c, do I need to install Hadoop?
***


You will need to have the same version of Hadoop installed on any
client that needs to communicate with the Hadoop cluster.




***
- I have installed Hadoop on it, and in its config file, I have specified
only 2 tags:
  1) fs.default.name - value = namenode's address
  2) dfs.http.address (namenode's address)
***


I'm assuming you mean that you have Hadoop installed on the client  
with a hadoop-site.xml (or core-site.xml) with the correct  
fs.default.name.  Correct?




***
Thus, if there is a file at /home/hadoop/test.java on the client machine, I
will have to get an instance of the HDFS filesystem via FileSystem.get, right?
***


Before you begin writing special FileSystem Java code, I would do a  
quick sanity check of the client configuration.


Can you run the command...

% bin/hadoop fs -ls

...without error?

Can you -put files onto HDFS from the client...

% bin/hadoop fs -put src dst

...without error?

* You should also check your firewall rules between the client and NameNode.
* Make sure that the TCP port you specified in fs.default.name is open for connection from the client.
* Run netstat -t -l to make sure that the NameNode is running and listening on the TCP port you specified.


Only when you've ensured that the hadoop commandline works would I  
begin writing custom client code based on the FileSystem class.




***
Then, by using Filesystem.util, I will have to simply specify the local fs
as src and HDFS as destination, with the src path as /home/hadoop/test.java
and the destination as /user/hadoop/, right? So it should work...!

***
- But it gives me an error saying it is not able to find the src path
/home/hadoop/test.java.

- Will I have to use RPC classes and methods under the Hadoop API to do this?

***


You should be able to just use the FileSystem class, without needing to
use any RPC classes.


FileSystem documentation:
http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/fs/FileSystem.html
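
For what it's worth, here is a minimal sketch of that kind of client code (the class name, paths and lack of error handling are purely illustrative; it assumes fs.default.name is picked up from the client's hadoop-site.xml on the classpath):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsCopyExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();  // reads hadoop-site.xml from the classpath
    FileSystem fs = FileSystem.get(conf);      // the filesystem named by fs.default.name

    Path localSrc = new Path("/home/hadoop/test.java");
    Path hdfsDst = new Path("/user/hadoop/test.java");

    fs.copyFromLocalFile(localSrc, hdfsDst);   // put the local file into HDFS
    fs.copyToLocalFile(hdfsDst, new Path("/tmp/test.java.copy"));  // fetch it back
  }
}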




***
Things don't seem to be working in any of these ways. Please help me out.

***

Thanks!




Re: Multiple NIC Cards

2009-06-10 Thread Matt Massie
If you look at the documentation for the getCanonicalHostName()  
function (thanks, Steve)...


http://java.sun.com/javase/6/docs/api/java/net/InetAddress.html#getCanonicalHostName()

you'll see two Java security properties (networkaddress.cache.ttl and  
networkaddress.cache.negative.ttl).


You might take a look at your /etc/nsswitch.conf configuration as well  
to learn how hosts are resolved on your machine, e.g...


$ grep hosts /etc/nsswitch.conf
hosts:  files dns

and lastly, you may want to check if you are running nscd (the  
NameService cache daemon).  If you are, take a look at /etc/nscd.conf  
for the caching policy it's using.
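
For example (duey is just the stale hostname mentioned in the quoted message below; substitute your own), you can check what the resolver and nscd are actually handing back:

$ getent hosts duey           # resolves through the nsswitch.conf order above
$ nscd -g | grep -A 3 hosts   # hosts-cache statistics, if nscd is running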


Good luck.

-Matt



On Jun 10, 2009, at 1:09 PM, John Martyniak wrote:

That is what I thought also: it needs to keep that information somewhere,
because it needs to be able to communicate with all of the servers.

So I deleted the /tmp/had* and /tmp/hs* directories, removed the log
files, and grepped for the duey name in all files in config. And the
problem still exists. Originally I thought that it might have had something
to do with multiple entries in the .ssh/authorized_keys file, but I removed
everything there. And the problem still existed.

So I think that I am going to grab a new install of hadoop 0.19.1,
delete the existing one, and start out fresh to see if that changes
anything.

Wish me luck :)

-John

On Jun 10, 2009, at 12:30 PM, Steve Loughran wrote:


John Martyniak wrote:
Does hadoop cache the server names anywhere? Because I changed to using DNS
for name resolution, but when I go to the nodes view, it is trying to view
with the old name. And I changed the hadoop-site.xml file so that it no
longer has any of those values.


in SVN head, we try and get Java to tell us what is going on
http://svn.apache.org/viewvc/hadoop/core/trunk/src/core/org/apache/hadoop/net/DNS.java

This uses InetAddress.getLocalHost().getCanonicalHostName() to get
the value, which is cached for the life of the process. I don't know of
anything else, but wouldn't be surprised - the NameNode has to
remember the machines where stuff was stored.





John Martyniak
President/CEO
Before Dawn Solutions, Inc.
9457 S. University Blvd #266
Highlands Ranch, CO 80126
o: 877-499-1562
c: 303-522-1756
e: j...@beforedawnsoutions.com
w: http://www.beforedawnsolutions.com





Re: Monitoring hadoop?

2009-06-05 Thread Matt Massie

Anthony-

The ganglia web site is at http://ganglia.info/ with documentation in a wiki
at http://ganglia.wiki.sourceforge.net/. There is also a good wiki page at
IBM as well: http://www.ibm.com/developerworks/wikis/display/WikiPtype/ganglia.
Ganglia packages are available for most distributions to help with
installation, so make sure to grep for ganglia with your favorite package
manager (e.g. aptitude, yum, etc). Ganglia will give you more information
about your cluster than just Hadoop metrics. You'll get CPU, load, memory,
disk and network monitoring as well for free.
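
To actually emit Hadoop metrics to ganglia, conf/hadoop-metrics.properties needs to point at your gmond; here is a minimal sketch (the multicast address and port are the common ganglia defaults, so adjust them to your setup):

dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
dfs.period=10
dfs.servers=239.2.11.71:8649
mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
mapred.period=10
mapred.servers=239.2.11.71:8649
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
jvm.period=10
jvm.servers=239.2.11.71:8649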


You can see live demos of ganglia at http://ganglia.info/?page_id=69.

Good luck.

-Matt

On Jun 5, 2009, at 7:10 AM, Brian Bockelman wrote:


Hey Anthony,

Look into hooking your Hadoop system into Ganglia; this produces  
about 20 real-time statistics per node.


Hadoop also does JMX, which hooks into more enterprise-y  
monitoring systems.


Brian

On Jun 5, 2009, at 8:55 AM, Anthony McCulley wrote:


Hey all,
I'm currently tasked to come up with a web/flex-based
visualization/monitoring system for a cloud system using Hadoop, as part of
a university research project. I was wondering if I could elicit some
feedback from all of you with regards to:

 - If you were an engineer of a cloud system running Hadoop, what
   information would you be interested in capturing, viewing, monitoring, etc?
 - Is there any sort of real-time stats or monitoring currently available
   for Hadoop? If so, is it in a web-friendly format?

Thanks in advance,

- Anthony






Re: Fastlz coming?

2009-06-04 Thread Matt Massie

Kris-

You might take a look at some of the previous lzo threads on this list  
for help.


See: http://www.mail-archive.com/search?q=lzo&l=core-user%40hadoop.apache.org
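
In short, once the hadoop-gpl-compression jar and native libraries are installed, the codec is registered in hadoop-site.xml roughly along these lines (a sketch based on that project's instructions, so double-check the class names against the version you install):

<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,com.hadoop.compression.lzo.LzoCodec</value>
</property>

<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>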

-Matt

On Jun 4, 2009, at 10:29 AM, Kris Jirapinyo wrote:

Is there any documentation on that site on how we can use lzo? I don't see
any entries on the wiki page of the project. I see an entry on the Hadoop
wiki (http://wiki.apache.org/hadoop/UsingLzoCompression), but it seems like
that's more oriented towards HBase. I am on hadoop 0.19.1.

Thanks,
Kris J.

On Thu, Jun 4, 2009 at 3:02 AM, Johan Oskarsson jo...@oskarsson.nu  
wrote:



We're using Lzo still, works great for those big log files:
http://code.google.com/p/hadoop-gpl-compression/

/Johan

Kris Jirapinyo wrote:

Hi all,
   In the remove-lzo JIRA ticket
(https://issues.apache.org/jira/browse/HADOOP-4874), Tatu mentioned he was
going to port fastlz from C to Java and provide a patch. Have there been
any updates on that? Or is anyone working on any additional custom
compression codecs?

Thanks,
Kris J.








Re: No route to host prevents from storing files to HDFS

2009-04-22 Thread Matt Massie
Stas-

Is it possible to paste the output from the following command on both your
DataNode and NameNode?

% route -v -n

-Matt


On Wed, Apr 22, 2009 at 4:36 PM, Stas Oskin stas.os...@gmail.com wrote:

 Hi.

 The way to diagnose this explicitly is:
  1) On the server machine that should be accepting connections on the port,
  telnet localhost PORT and telnet IP PORT; you should get a connection, and
  if not then the server is not binding the port.
  2) On the remote machine, verify that you can communicate to the server
  machine via normal tools such as ssh and/or ping and/or traceroute, using
  the IP address from the error message in your log file.
  3) On the remote machine, run telnet IP PORT. If (1) and (2) succeeded and
  (3) does not, then there is something blocking packets for the port range
  in question. If (3) does succeed then there is probably some interesting
  problem.
 

 I tried in step 3 to telnet both the 50010 and the 8010 ports of the
 problematic datanode - both worked.

 I agree there is indeed an interesting problem :). The question is how it
 can be solved.

 Thanks.



Re: No route to host prevents from storing files to HDFS

2009-04-22 Thread Matt Massie
Just for clarity: are you using any type of virtualization (e.g. vmware,
xen) or just running the DataNode java process on the same machine?

What is fs.default.name set to in your hadoop-site.xml?

-Matt


On Wed, Apr 22, 2009 at 5:22 PM, Stas Oskin stas.os...@gmail.com wrote:

 Hi.

 Is it possible to paste the output from the following command on both your
  DataNode and NameNode?
 
  % route -v -n
 

 Sure, here it is:

 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
 192.168.253.0   0.0.0.0         255.255.255.0   U     0      0        0 eth0
 169.254.0.0     0.0.0.0         255.255.0.0     U     0      0        0 eth0
 0.0.0.0         192.168.253.1   0.0.0.0         UG    0      0        0 eth0


 As you might recall, the problematic datanode runs on the same server as
 the NameNode.

 Regards.