Here is additional information about the HDFS:

$ hdfs dfsadmin -report
Configured Capacity: 3208335360 (2.99 GB)
Present Capacity: 534454272 (509.70 MB)
DFS Remaining: 534450176 (509.69 MB)
DFS Used: 4096 (4 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Live datanodes:
Name: 127.0.0.1:50010 (hosta.subdom1.tld1)
Hostname: hosta.subdom1.tld1
Decommission Status : Normal
Configured Capacity: 3208335360 (2.99 GB)
DFS Used: 4096 (4 KB)
Non DFS Used: 2673881088 (2.49 GB)
DFS Remaining: 534450176 (509.69 MB)
DFS Used%: 0.00%
DFS Remaining%: 16.66%
Last contact: Mon Dec 02 12:07:28 EST 2013

I see nothing that could explain the error. I can mkdir, put empty files,
and list content.

-----------------
Daniel Savard


2013/12/2 Daniel Savard <[email protected]>

> André,
>
> Good for you that the sparse instructions on the reference page were
> enough to set up your cluster. However, read them again and see how many
> assumptions they make about what you are supposed to already know, and
> how much is left unsaid.
>
> I did try the single-node setup; it is worse than the cluster setup where
> the instructions are concerned. As far as I understand them, you are
> supposed to already have a nearly working system. It is assumed that HDFS
> is already set up and working properly. Try to find instructions to set
> up HDFS for version 2.2.0 and you will end up with a lot of inappropriate
> instructions written for previous versions (some properties were renamed).
>
> It may seem harsh to say this is toxic, but it is. The first place a
> newcomer will go is the single-node setup. That will be his starting
> point, and he will be left with a bunch of a priori assumptions and no
> clue.
>
> To go back to my actual problem at this point:
>
> 13/12/02 11:34:07 WARN hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
> /test._COPYING_ could only be replicated to 0 nodes instead of
> minReplication (=1). There are 1 datanode(s) running and no node(s) are
> excluded in this operation.
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
>
> I can copy an empty file, but as soon as its content is non-zero I get
> this message. Searching on the message has been of no help so far.
>
> I also skimmed through the cluster instructions and found nothing there
> that could help in any way either.
>
> -----------------
> Daniel Savard
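The error above ("could only be replicated to 0 nodes instead of
minReplication (=1)") with one live datanode means the namenode rejected
that datanode as a block target, most often because of insufficient usable
disk space or because the datanode registered under an address the client
cannot reach (note it registered as 127.0.0.1 in the report). The checks
below are a diagnostic sketch only; the port, log path, and property names
assume Hadoop 2.2 defaults and may differ in your installation.

# 1. Make sure the namenode is not stuck in safe mode.
$ hdfs dfsadmin -safemode get

# 2. Confirm a DataNode JVM is actually running.
$ jps | grep DataNode

# 3. Clients write blocks directly to the datanode's transfer port
#    (50010 by default in Hadoop 2.x); it must be reachable.
$ nc -z 127.0.0.1 50010 && echo "datanode port reachable"

# 4. Space the datanode holds in reserve is not usable for blocks; if
#    dfs.datanode.du.reserved approaches the free space on the volume,
#    allocation can fail even though the report shows space remaining.
$ hdfs getconf -confKey dfs.datanode.du.reserved
$ hdfs getconf -confKey dfs.blocksize

# 5. The datanode log usually states the real reason the write failed.
$ tail -n 50 $HADOOP_HOME/logs/hadoop-*-datanode-*.log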
>
> 2013/12/2 Andre Kelpe <[email protected]>
>
>> Hi Daniel,
>>
>> First of all, before posting to a mailing list, take a deep breath and
>> let your frustration out. Then write the email. Using words like
>> "crappy", "toxicware", and "nightmare" is not going to help you get
>> useful responses.
>>
>> While I agree that the docs can be confusing, we should try to stay
>> constructive. You haven't mentioned which documentation you are using.
>> I found the cluster tutorial sufficient to get me started:
>>
>> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/ClusterSetup.html
>>
>> If you are looking for an easy way to spin up a small cluster with
>> hadoop 2.2, try the hadoop2 branch of this vagrant setup:
>>
>> https://github.com/fs111/vagrant-hadoop-cluster/tree/hadoop2
>>
>> - André
>>
>> On Mon, Dec 2, 2013 at 5:34 AM, Daniel Savard <[email protected]>
>> wrote:
>> > I am trying to configure hadoop 2.2.0 from source code and I found the
>> > instructions really crappy and incomplete. It is as if they were
>> > written to prevent anyone from doing the job himself, so that you must
>> > contract someone else to do it or buy a packaged version.
>> >
>> > I have been struggling with this stuff for about three days, with only
>> > partial success. The documentation is less than clear, and most of the
>> > material out there applies to earlier versions and hasn't been updated
>> > for version 2.2.0.
>> >
>> > I was able to set up HDFS; however, I am still unable to use it. I am
>> > doing a single-node installation, and the instruction page doesn't
>> > explain anything besides telling you to do this and that, without
>> > documenting what each step does, what choices are available, or what
>> > guidelines you should follow. There are even environment variables you
>> > are told to set, but nothing is said about what they mean or to which
>> > values they should be set. It seems to assume prior knowledge of
>> > everything about hadoop.
>> >
>> > Does anyone know a site with proper documentation about hadoop, or is
>> > it hopeless and this whole thing just a piece of toxicware?
>> >
>> > I am already looking for alternatives to hadoop, which for sure will
>> > be a nightmare to manage and reinstall each time a new version or
>> > release becomes available.
>> >
>> > TIA
>> > -----------------
>> > Daniel Savard
>>
>> --
>> André Kelpe
>> [email protected]
>> http://concurrentinc.com
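For anyone reaching this thread with the same symptom: the single-node
setup the discussion keeps circling around reduces to a few steps. The
sketch below assumes Hadoop 2.2.0 with fs.defaultFS (e.g.
hdfs://localhost:9000) set in core-site.xml and dfs.replication=1 in
hdfs-site.xml; it is an illustration, not a replacement for the official
instructions.

# Hadoop 2.x renamed several 1.x properties; guides written for older
# versions use the left-hand names:
#   fs.default.name -> fs.defaultFS
#   dfs.name.dir    -> dfs.namenode.name.dir
#   dfs.data.dir    -> dfs.datanode.data.dir
$ hdfs getconf -confKey fs.defaultFS
$ hdfs getconf -confKey dfs.replication

# Format the namenode once, then start HDFS.
$ hdfs namenode -format
$ $HADOOP_HOME/sbin/start-dfs.sh

# Smoke test with a non-empty file: it exercises real block allocation,
# which is exactly where the "replicated to 0 nodes" error surfaces.
$ echo "hello hdfs" > /tmp/test.txt
$ hdfs dfs -put /tmp/test.txt /test
$ hdfs dfs -cat /test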
