Hi Daniel,
I agree with you that the 2.2 documents are very unfriendly.
For many documents, the only change from 1.x to 2.2 was the format; the
content itself was not updated. There are also still many documents
waiting to be converted (e.g. Hadoop Streaming).
Furthermore, there are a lot of dead links in the documents.
I've been trying to fix dead links, convert 1.x documents, and update
deprecated instructions.
https://issues.apache.org/jira/browse/HADOOP-9982
https://issues.apache.org/jira/browse/MAPREDUCE-5636
I'll file a JIRA and try to update the Single Node Setup document.
Thanks,
Akira
(2013/12/03 1:44), Daniel Savard wrote:
André,
good for you that the terse instructions on the reference page were
enough to set up your cluster. However, read them again and see how many
assumptions they make about what you are supposed to already know,
things that supposedly go without saying.
I did try the single node setup; it is worse than the cluster setup as
far as the instructions go. As far as I understand them, you are
supposed to already have a nearly working system: it is assumed that
HDFS is already set up and working properly. Try to find the
instructions to set up HDFS for version 2.2.0 and you will end up with a
lot of inapplicable instructions for previous versions (some properties
were renamed).
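For example, as far as I can tell, the old fs.default.name property is
now called fs.defaultFS in 2.x, and dfs.name.dir / dfs.data.dir became
dfs.namenode.name.dir / dfs.datanode.data.dir, so most of the 1.x
examples you find out there are subtly wrong. If I understand the 2.2.0
setup correctly, a minimal single-node configuration would be something
like this (the port and file names are my reading of the docs, not
gospel):

    <!-- etc/hadoop/core-site.xml -->
    <configuration>
      <property>
        <!-- was fs.default.name in 1.x -->
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>

    <!-- etc/hadoop/hdfs-site.xml -->
    <configuration>
      <property>
        <!-- single node: one replica is all you can have -->
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>

But nothing on the page tells you whether this is the complete list or
what else you might need.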
It may seem harsh to say this is toxic, but it is. The first place a
newcomer will go is the single node setup. This will be his starting
point, and he will be left with a pile of unstated assumptions and no
clue. To come back to my actual problem at this point:
13/12/02 11:34:07 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /test._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
        at org.apache.hadoop.ipc.Client.call(Client.java:1347)
        at org.apache.hadoop.ipc.Client.call(Client.java:1300)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
I can copy an empty file, but as soon as its content is non-zero I get
this message. Searching on the message has been of no help so far, and I
skimmed through the cluster instructions and found nothing there that
could help in any way either.
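From what I gather, this error usually means the namenode cannot
actually place a block on any datanode, either because the datanode
never registered properly or because it has no usable space, but that is
just my guess. If nobody has a better idea, I suppose the things to
check would be something like (the log path is a guess based on my
layout):

    # does the namenode actually see a live datanode with free capacity?
    hdfs dfsadmin -report

    # is there free space on the volume backing the datanode data dir?
    # (defaults live under /tmp unless hadoop.tmp.dir was changed)
    df -h /tmp

    # any errors on the datanode side? (adjust the path to your install)
    tail -n 100 $HADOOP_HOME/logs/hadoop-*-datanode-*.log

If someone can confirm this is the right direction, I will post what I
find.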
-----------------
Daniel Savard
2013/12/2 Andre Kelpe <[email protected]>
Hi Daniel,
first of all, before posting to a mailing list, take a deep breath and
let your frustrations out. Then write the email. Using words like
"crappy", "toxicware" and "nightmare" is not going to help you get
useful responses.
While I agree that the docs can be confusing, we should try to stay
constructive. You haven't mentioned which documentation you are
using. I found the cluster tutorial sufficient to get me started:
http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/ClusterSetup.html
If you are looking for an easy way to spin up a small cluster with
hadoop 2.2, try the hadoop2 branch of this vagrant setup:
https://github.com/fs111/vagrant-hadoop-cluster/tree/hadoop2
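(If I remember the repo layout correctly, it should be roughly:

    git clone https://github.com/fs111/vagrant-hadoop-cluster.git
    cd vagrant-hadoop-cluster
    git checkout hadoop2
    vagrant up

but check the README in the repo, it may have changed.)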
- André
On Mon, Dec 2, 2013 at 5:34 AM, Daniel Savard <[email protected]> wrote:
> I am trying to configure hadoop 2.2.0 from source code and I found the
> instructions really crappy and incomplete. It is as if they were
> written to prevent anyone from doing the job himself, so that you have
> to contract someone else to do it or buy a packaged version.
>
> I have been struggling with this stuff for about three days, with
> partial success. The documentation is less than clear, and most of the
> material out there applies to earlier versions and hasn't been updated
> for version 2.2.0.
>
> I was able to set up HDFS, however I am still unable to use it. I am
> doing a single node installation, and the instruction page doesn't
> explain anything besides telling you to do this and that, without
> documenting what each step does, what choices are available, or what
> guidelines you should follow. There are even environment variables you
> are told to set, but nothing is said about what they mean or what
> values they should be given. It seems to assume prior knowledge of
> everything about hadoop.
>
> Does anyone know a site with proper documentation about hadoop, or is
> it hopeless and this whole thing just a piece of toxicware?
>
> I am already looking for alternative solutions to hadoop, which for
> sure will be a nightmare to manage and install each time a new version
> or release becomes available.
>
> TIA
> -----------------
> Daniel Savard
--
André Kelpe
[email protected]
http://concurrentinc.com