Adam, here is the link: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
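
For reference, the pseudo-distributed section of that page boils down to just a couple of properties, something like this (a sketch from memory, not a verbatim copy of the page or of my exact files):

  core-site.xml:
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>

  hdfs-site.xml:
    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>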
Then, since it didn't work, I tried a number of things, but my configuration files are really skinny and there isn't much in them.

-----------------
Daniel Savard


2013/12/3 Adam Kawa <[email protected]>

> Could you please send me a link to the documentation that you followed to setup your single-node cluster?
> I will go through it and do it step by step, so hopefully at the end your issue will be solved and the documentation will be improved.
>
> If you have any non-standard settings in core-site.xml, hdfs-site.xml and hadoop-env.sh (that were not suggested by the documentation that you followed), then please share them.
>
>
> 2013/12/3 Daniel Savard <[email protected]>
>
>> Adam,
>>
>> that's not the issue, I did substitute the name in the first report. The actual hostname is feynman.cids.ca.
>>
>> -----------------
>> Daniel Savard
>>
>>
>> 2013/12/3 Adam Kawa <[email protected]>
>>
>>> Daniel,
>>>
>>> I see that in the previous hdfs report you had hosta.subdom1.tld1, but now you have feynman.cids.ca. What is the content of your /etc/hosts file, and the output of the $hostname command?
>>>
>>>
>>> 2013/12/3 Daniel Savard <[email protected]>
>>>
>>>> I did that more than once; I just retried it from the beginning. I zapped the directories and recreated them with hdfs namenode -format and restarted HDFS, and I am still getting the very same error.
>>>>
>>>> I have posted the report previously. Is there anything in this report that indicates I do not have enough free space somewhere? That's the only thing I can see that may cause this problem after everything I have read on the subject. I am new to Hadoop and I just want to set up a standalone node to experiment with for a while before going ahead with a complete cluster.
>>>>
>>>> I repost the report for convenience:
>>>>
>>>> Configured Capacity: 2939899904 (2.74 GB)
>>>> Present Capacity: 534421504 (509.66 MB)
>>>> DFS Remaining: 534417408 (509.66 MB)
>>>> DFS Used: 4096 (4 KB)
>>>> DFS Used%: 0.00%
>>>> Under replicated blocks: 0
>>>> Blocks with corrupt replicas: 0
>>>> Missing blocks: 0
>>>>
>>>> -------------------------------------------------
>>>> Datanodes available: 1 (1 total, 0 dead)
>>>>
>>>> Live datanodes:
>>>> Name: 127.0.0.1:50010 (feynman.cids.ca)
>>>> Hostname: feynman.cids.ca
>>>> Decommission Status : Normal
>>>> Configured Capacity: 2939899904 (2.74 GB)
>>>> DFS Used: 4096 (4 KB)
>>>> Non DFS Used: 2405478400 (2.24 GB)
>>>> DFS Remaining: 534417408 (509.66 MB)
>>>> DFS Used%: 0.00%
>>>> DFS Remaining%: 18.18%
>>>> Last contact: Tue Dec 03 13:37:02 EST 2013
>>>>
>>>>
>>>> -----------------
>>>> Daniel Savard
>>>>
>>>>
>>>> 2013/12/3 Adam Kawa <[email protected]>
>>>>
>>>>> Daniel,
>>>>>
>>>>> It looks like you can only communicate with the NameNode to do "metadata-only" operations (e.g. listing, creating a dir, an empty file)...
>>>>>
>>>>> Did you format the NameNode correctly?
>>>>> A quite similar issue is described here: http://www.manning-sandbox.com/thread.jspa?messageID=126741. The last reply says: "The most common is that you have reformatted the namenode leaving it in an inconsistent state. The most common solution is to stop dfs, remove the contents of the dfs directories on all the machines, run “hadoop namenode -format” on the controller, then restart dfs. That consistently fixes the problem for me. This may be serious overkill but it works."
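>>>>>
>>>>> On a single-node setup that sequence would be roughly the following (just a sketch; the paths are placeholders, so substitute whatever directories dfs.namenode.name.dir and dfs.datanode.data.dir point to in your hdfs-site.xml, or the dfs subdirectories under hadoop.tmp.dir if you left them unset):
>>>>>
>>>>>   $ stop-dfs.sh
>>>>>   $ rm -rf /path/to/dfs/name/* /path/to/dfs/data/*   # your dfs directories, not literal paths
>>>>>   $ hdfs namenode -format
>>>>>   $ start-dfs.sh
>>>>>   $ hdfs dfsadmin -report   # check that the datanode registers again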
>>>>>
>>>>>
>>>>> 2013/12/3 Daniel Savard <[email protected]>
>>>>>
>>>>>> Thanks Arun,
>>>>>>
>>>>>> I already read and did everything recommended at the referred URL. There isn't any error message in the logfiles. The only error message appears when I try to put a non-zero file on the HDFS, as posted above. Besides that, absolutely nothing in the logs is telling me something is wrong with the configuration so far.
>>>>>>
>>>>>> Is there some sort of diagnostic tool that can query/ping each server to make sure it responds properly to requests? When trying to put my file, I see nothing in the datanode log; the message appears in the namenode log. Is this the expected behavior, or should I see at least some kind of request message in the datanode logfile?
>>>>>>
>>>>>>
>>>>>> -----------------
>>>>>> Daniel Savard
>>>>>>
>>>>>>
>>>>>> 2013/12/2 Arun C Murthy <[email protected]>
>>>>>>
>>>>>>> Daniel,
>>>>>>>
>>>>>>> Apologies if you had a bad experience. If you can point the problems out to us, we'd be more than happy to fix them - alternately, we'd *love* it if you could help us improve the docs too.
>>>>>>>
>>>>>>> Now, for the problem at hand: http://wiki.apache.org/hadoop/CouldOnlyBeReplicatedTo is one place to look. Basically the NN cannot find any datanodes. Anything in your NN logs to indicate trouble?
>>>>>>>
>>>>>>> Also, pls feel free to open jiras with issues you find and we'll help.
>>>>>>>
>>>>>>> thanks,
>>>>>>> Arun
>>>>>>>
>>>>>>> On Dec 2, 2013, at 8:44 AM, Daniel Savard <[email protected]> wrote:
>>>>>>>
>>>>>>> André,
>>>>>>>
>>>>>>> good for you that the terse instructions on the reference page were enough to set up your cluster. However, read them again and see how many assumptions are made in them about what you are supposed to already know, as if it all went without saying.
>>>>>>>
>>>>>>> I did try the single node setup; it is worse than the cluster setup as far as the instructions go. As far as I understand them, you are supposed to already have a near-working system: it is assumed that HDFS is already set up and working properly. Try to find the instructions to set up HDFS for version 2.2.0 and you will end up with a lot of inappropriate instructions about previous versions (some properties were renamed).
>>>>>>>
>>>>>>> It may seem harsh to say this is toxic, but it is. The first place a newcomer will go is the single node setup. This will be his starting point and he will be left with a bunch of unstated assumptions and no clue.
>>>>>>>
>>>>>>> To go back to my very problem at this point:
>>>>>>>
>>>>>>> 13/12/02 11:34:07 WARN hdfs.DFSClient: DataStreamer Exception
>>>>>>> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /test._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
>>>>>>>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
>>>>>>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
>>>>>>>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
>>>>>>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
>>>>>>>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
>>>>>>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>>>>>>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>>>>>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
>>>>>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>>>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>>>>>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
>>>>>>>
>>>>>>>         at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>>>>>>>         at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>>>>>>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>>>>>>         at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>>>>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>>>>>>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>>>>>>         at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>>>>>>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
>>>>>>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
>>>>>>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
>>>>>>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
>>>>>>>
>>>>>>> I can copy an empty file, but as soon as its content is non-zero I am getting this message. Searching on the message is of no help so far.
>>>>>>>
>>>>>>> And I skimmed through the cluster instructions and found nothing there that could help in any way either.
>>>>>>>
>>>>>>>
>>>>>>> -----------------
>>>>>>> Daniel Savard
>>>>>>>
>>>>>>>
>>>>>>> 2013/12/2 Andre Kelpe <[email protected]>
>>>>>>>
>>>>>>>> Hi Daniel,
>>>>>>>>
>>>>>>>> first of all, before posting to a mailing list, take a deep breath and let your frustrations out. Then write the email. Using words like "crappy", "toxicware", "nightmare" is not going to help you get useful responses.
>>>>>>>>
>>>>>>>> While I agree that the docs can be confusing, we should try to stay constructive.
>>>>>>>> You haven't mentioned which documentation you are using. I found the cluster tutorial sufficient to get me started:
>>>>>>>>
>>>>>>>> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/ClusterSetup.html
>>>>>>>>
>>>>>>>> If you are looking for an easy way to spin up a small cluster with hadoop 2.2, try the hadoop2 branch of this vagrant setup:
>>>>>>>>
>>>>>>>> https://github.com/fs111/vagrant-hadoop-cluster/tree/hadoop2
>>>>>>>>
>>>>>>>> - André
>>>>>>>>
>>>>>>>> On Mon, Dec 2, 2013 at 5:34 AM, Daniel Savard <[email protected]> wrote:
>>>>>>>> > I am trying to configure hadoop 2.2.0 from source code and I found the instructions really crappy and incomplete. It is as if they were written to prevent someone from doing the job himself, so that he must contract someone else to do it or buy a packaged version.
>>>>>>>> >
>>>>>>>> > I have been struggling with this stuff for about three days, with partial success. The documentation is less than clear, and most of the material out there applies to earlier versions and hasn't been updated for version 2.2.0.
>>>>>>>> >
>>>>>>>> > I was able to set up HDFS, however I am still unable to use it. I am doing a single-node installation and the instruction page doesn't explain anything besides telling you to do this and that, without documenting what each thing does, what choices are available and what guidelines you should follow. There are even environment variables you are told to set, but nothing is said about what they mean or what values they should be set to. It seems to assume prior knowledge of everything about hadoop.
>>>>>>>> >
>>>>>>>> > Does anyone know a site with proper documentation about hadoop, or is it hopeless and this whole thing just a piece of toxicware?
>>>>>>>> >
>>>>>>>> > I am already looking for alternate solutions to hadoop, which for sure will be a nightmare to manage and install each time a new version or release becomes available.
>>>>>>>> >
>>>>>>>> > TIA
>>>>>>>> > -----------------
>>>>>>>> > Daniel Savard
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> André Kelpe
>>>>>>>> [email protected]
>>>>>>>> http://concurrentinc.com
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Arun C. Murthy
>>>>>>> Hortonworks Inc.
>>>>>>> http://hortonworks.com/
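
P.S. Regarding my earlier question about a diagnostic tool: so far the closest I have found is hdfs dfsadmin -report plus the daemon web UIs (namenode on port 50070 and datanode on port 50075 by default), e.g.:

  $ hdfs dfsadmin -report              # namenode's view of registered datanodes
  $ curl -s http://localhost:50070/    # namenode web UI should respond
  $ curl -s http://localhost:50075/    # datanode web UI should respond if the daemon is up

If someone knows a better health-check command, I am all ears.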
