DataNode was not starting due to this error:

    java.io.IOException: Incompatible clusterIDs in /usr/local/hadoop/hadoop_store/hdfs/datanode: namenode clusterID = CID-b788c93b-a1d7-4351-bd91-28fdd134e9ba; datanode clusterID = CID-862f3fad-175e-442d-a06b-d65ac57d64b2

I can't imagine how this happened; anyway, I issued this command:

    bin/hdfs namenode -format -clusterId CID-862f3fad-175e-442d-a06b-d65ac57d64b2

and that got it started. The file is now written correctly. Thank you very much.
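Before reformatting in a situation like this, it can help to confirm the mismatch by comparing the clusterID recorded on each side. The sketch below uses the dfs.namenode.name.dir and dfs.datanode.data.dir paths from the hdfs-site.xml quoted later in this thread; adjust them to your own configuration.

    # Sketch: compare the clusterID stored by the NameNode and the DataNode.
    # Paths are the name/data dirs from the hdfs-site.xml below; yours may differ.
    grep clusterID /usr/local/hadoop/hadoop_data/hdfs/namenode/current/VERSION
    grep clusterID /usr/local/hadoop/hadoop_store/hdfs/datanode/current/VERSION
    # The two values must match, or the DataNode will refuse to start.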
On Thu, Mar 5, 2015 at 2:03 PM, Alexandru Calin <[email protected]> wrote:

> After putting the CLASSPATH initialization in .bashrc it creates the file,
> but it has 0 size and I also get this warning:
>
> file opened
>
> Wrote 14 bytes
> 15/03/05 14:00:55 WARN hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/testfile.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:1468)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>         at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1532)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1349)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
> FSDataOutputStream#close error:
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/testfile.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
>         [same stack trace as above]
> 15/03/05 14:00:55 ERROR hdfs.DFSClient: Failed to close inode 16393
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/testfile.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
>         [same stack trace as above]
>
> *bin/hdfs dfs -ls /tmp*:
> Found 1 items
> -rw-r--r--   1 userpc supergroup          0 2015-03-05 14:00 /tmp/testfile.txt
>
> On Thu, Mar 5, 2015 at 11:10 AM, Azuryy Yu <[email protected]> wrote:
>
>> You don't need to start YARN if you only want to write to HDFS using the
>> C API, and you also don't need to restart HDFS.
>>
>> On Thu, Mar 5, 2015 at 4:58 PM, Alexandru Calin <[email protected]> wrote:
>>
>>> Now I've also started YARN (just for the sake of trying anything); the
>>> config for mapred-site.xml and yarn-site.xml are those on the Apache
>>> website. A *jps* command shows:
>>>
>>> 11257 NodeManager
>>> 11129 ResourceManager
>>> 11815 Jps
>>> 10620 NameNode
>>> 10966 SecondaryNameNode
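Note that there is no DataNode in the jps listing above, which is consistent with the "0 datanode(s) running" error. A quick way to see why a DataNode did not come up is to look at its log and at the NameNode's view of the cluster; the sketch below assumes the default log location under the Hadoop install directory.

    # Sketch: the log directory and file-name pattern are assumptions;
    # adjust them to your installation.
    tail -n 50 /usr/local/hadoop/logs/hadoop-*-datanode-*.log

    # Number of live DataNodes as seen by the NameNode:
    bin/hdfs dfsadmin -report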
>>>
>>> On Thu, Mar 5, 2015 at 10:48 AM, Azuryy Yu <[email protected]> wrote:
>>>
>>>> Can you share your core-site.xml here?
>>>>
>>>> On Thu, Mar 5, 2015 at 4:32 PM, Alexandru Calin <[email protected]> wrote:
>>>>
>>>>> No change at all. I've added them at the start and end of the
>>>>> CLASSPATH; either way it still writes the file on the local fs. I've
>>>>> also restarted Hadoop.
>>>>>
>>>>> On Thu, Mar 5, 2015 at 10:22 AM, Azuryy Yu <[email protected]> wrote:
>>>>>
>>>>>> Yes, you should do it :)
>>>>>>
>>>>>> On Thu, Mar 5, 2015 at 4:17 PM, Alexandru Calin <[email protected]> wrote:
>>>>>>
>>>>>>> Wow, you are so right! It's on the local filesystem! Do I have to
>>>>>>> manually specify hdfs-site.xml and core-site.xml in the CLASSPATH
>>>>>>> variable? Like this:
>>>>>>> CLASSPATH=$CLASSPATH:/usr/local/hadoop/etc/hadoop/core-site.xml ?
>>>>>>>
>>>>>>> On Thu, Mar 5, 2015 at 10:04 AM, Azuryy Yu <[email protected]> wrote:
>>>>>>>
>>>>>>>> You need to include core-site.xml as well, and I think you can find
>>>>>>>> '/tmp/testfile.txt' on your local disk instead of HDFS.
>>>>>>>>
>>>>>>>> If so, my guess is right: because you don't include core-site.xml,
>>>>>>>> your filesystem scheme is file:// by default, not hdfs://.
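For reference, the setting that determines that scheme is fs.defaultFS in core-site.xml. A minimal sketch for a single-node setup is below; the hdfs://localhost:9000 address is an assumption, not something quoted in this thread, so substitute your NameNode's actual address.

    <!-- Minimal core-site.xml sketch; localhost:9000 is an assumed
         single-node NameNode address. -->
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>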
>>>>>>>>
>>>>>>>> On Thu, Mar 5, 2015 at 3:52 PM, Alexandru Calin <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I am trying to run the basic libhdfs example. It compiles OK, and
>>>>>>>>> actually runs OK and executes the whole program, but I cannot see
>>>>>>>>> the file on HDFS.
>>>>>>>>>
>>>>>>>>> It is said here <http://hadoop.apache.org/docs/r1.2.1/libhdfs.html>
>>>>>>>>> that you have to include *the right configuration directory
>>>>>>>>> containing hdfs-site.xml*.
>>>>>>>>>
>>>>>>>>> My hdfs-site.xml:
>>>>>>>>>
>>>>>>>>> <configuration>
>>>>>>>>>   <property>
>>>>>>>>>     <name>dfs.replication</name>
>>>>>>>>>     <value>1</value>
>>>>>>>>>   </property>
>>>>>>>>>   <property>
>>>>>>>>>     <name>dfs.namenode.name.dir</name>
>>>>>>>>>     <value>file:///usr/local/hadoop/hadoop_data/hdfs/namenode</value>
>>>>>>>>>   </property>
>>>>>>>>>   <property>
>>>>>>>>>     <name>dfs.datanode.data.dir</name>
>>>>>>>>>     <value>file:///usr/local/hadoop/hadoop_store/hdfs/datanode</value>
>>>>>>>>>   </property>
>>>>>>>>> </configuration>
>>>>>>>>>
>>>>>>>>> I generate my classpath with this:
>>>>>>>>>
>>>>>>>>> #!/bin/bash
>>>>>>>>> export CLASSPATH=/usr/local/hadoop/
>>>>>>>>> declare -a subdirs=("hdfs" "tools" "common" "yarn" "mapreduce")
>>>>>>>>> for subdir in "${subdirs[@]}"
>>>>>>>>> do
>>>>>>>>>     for file in $(find /usr/local/hadoop/share/hadoop/$subdir -name '*.jar')
>>>>>>>>>     do
>>>>>>>>>         export CLASSPATH=$CLASSPATH:$file
>>>>>>>>>     done
>>>>>>>>> done
>>>>>>>>>
>>>>>>>>> and I also add export CLASSPATH=$CLASSPATH:/usr/local/hadoop/etc/hadoop,
>>>>>>>>> where my *hdfs-site.xml* resides.
>>>>>>>>>
>>>>>>>>> My LD_LIBRARY_PATH =
>>>>>>>>> /usr/local/hadoop/lib/native:/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/amd64/server
>>>>>>>>>
>>>>>>>>> Code:
>>>>>>>>>
>>>>>>>>> #include "hdfs.h"
>>>>>>>>> #include <stdio.h>
>>>>>>>>> #include <string.h>
>>>>>>>>> #include <stdlib.h>
>>>>>>>>>
>>>>>>>>> int main(int argc, char **argv) {
>>>>>>>>>
>>>>>>>>>     hdfsFS fs = hdfsConnect("default", 0);
>>>>>>>>>     const char* writePath = "/tmp/testfile.txt";
>>>>>>>>>     hdfsFile writeFile = hdfsOpenFile(fs, writePath, O_WRONLY|O_CREAT, 0, 0, 0);
>>>>>>>>>     if (!writeFile) {
>>>>>>>>>         printf("Failed to open %s for writing!\n", writePath);
>>>>>>>>>         exit(-1);
>>>>>>>>>     }
>>>>>>>>>     printf("\nfile opened\n");
>>>>>>>>>     char* buffer = "Hello, World!";
>>>>>>>>>     tSize num_written_bytes = hdfsWrite(fs, writeFile, (void*)buffer, strlen(buffer)+1);
>>>>>>>>>     printf("\nWrote %d bytes\n", (int)num_written_bytes);
>>>>>>>>>     if (hdfsFlush(fs, writeFile)) {
>>>>>>>>>         printf("Failed to 'flush' %s\n", writePath);
>>>>>>>>>         exit(-1);
>>>>>>>>>     }
>>>>>>>>>     hdfsCloseFile(fs, writeFile);
>>>>>>>>>     hdfsDisconnect(fs);
>>>>>>>>>     return 0;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> It compiles and runs without error, but I cannot see the file on HDFS.
>>>>>>>>>
>>>>>>>>> I have Hadoop 2.6.0 on Ubuntu 14.04 64-bit.
>>>>>>>>>
>>>>>>>>> Any ideas on this?
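Given the root cause in this thread (the client falling back to file:// when core-site.xml is not on the CLASSPATH), a defensive variant of the example is to connect to an explicit NameNode address instead of "default" and to verify the write afterwards with hdfsExists. The sketch below is not from the thread: the localhost:9000 address is an assumption (use your own fs.defaultFS value), and it otherwise uses only standard libhdfs calls. The CLASSPATH and LD_LIBRARY_PATH setup are the same as for the original example.

    /* Sketch: write via an explicit NameNode address instead of "default",
     * so the client cannot silently fall back to the local file:// filesystem.
     * localhost:9000 is an assumed fs.defaultFS value; adjust to your setup. */
    #include "hdfs.h"
    #include <fcntl.h>   /* O_WRONLY, O_CREAT */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void) {
        hdfsFS fs = hdfsConnect("localhost", 9000);   /* assumed NameNode address */
        if (!fs) {
            fprintf(stderr, "Failed to connect to HDFS\n");
            exit(1);
        }

        const char *writePath = "/tmp/testfile.txt";
        hdfsFile writeFile = hdfsOpenFile(fs, writePath, O_WRONLY | O_CREAT, 0, 0, 0);
        if (!writeFile) {
            fprintf(stderr, "Failed to open %s for writing\n", writePath);
            exit(1);
        }

        const char *buffer = "Hello, World!";
        tSize written = hdfsWrite(fs, writeFile, (void *)buffer, strlen(buffer) + 1);
        printf("Wrote %d bytes\n", (int)written);

        if (hdfsFlush(fs, writeFile)) {
            fprintf(stderr, "Failed to flush %s\n", writePath);
            exit(1);
        }
        hdfsCloseFile(fs, writeFile);

        /* hdfsExists returns 0 when the path is visible on the connected
         * filesystem -- a quick check that the file really landed on HDFS. */
        if (hdfsExists(fs, writePath) == 0) {
            printf("%s exists on HDFS\n", writePath);
        } else {
            printf("%s NOT found on HDFS\n", writePath);
        }

        hdfsDisconnect(fs);
        return 0;
    }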
