[
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964499#comment-15964499
]
Leif Walsh commented on ARROW-805:
----------------------------------
Okay, I got it set up but got a failure; I think it might be failing a
permission check? Not sure yet.
{noformat}
ARROW_HDFS_TEST_PORT=9000 ARROW_HDFS_TEST_USER=hdfs ARROW_HDFS_TEST_HOST=impala
debug/io-hdfs-test
[==========] Running 18 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 9 tests from TestHdfsClient/0, where TypeParam =
arrow::io::JNIDriver
[ RUN ] TestHdfsClient/0.ConnectsAgain
2017-04-11 11:13:25,511 WARN [main] util.NativeCodeLoader
(NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for
your platform... using builtin-java classes where applicable
[ OK ] TestHdfsClient/0.ConnectsAgain (1381 ms)
[ RUN ] TestHdfsClient/0.CreateDirectory
/home/leif/git/arrow/cpp/src/arrow/io/io-hdfs-test.cc:174: Failure
Value of: s.ok()
Actual: false
Expected: true
[ FAILED ] TestHdfsClient/0.CreateDirectory, where TypeParam =
arrow::io::JNIDriver (201 ms)
[ RUN ] TestHdfsClient/0.GetCapacityUsed
[ OK ] TestHdfsClient/0.GetCapacityUsed (137 ms)
[ RUN ] TestHdfsClient/0.GetPathInfo
[ OK ] TestHdfsClient/0.GetPathInfo (799 ms)
[ RUN ] TestHdfsClient/0.AppendToFile
2017-04-11 11:13:27,536 WARN [Thread-15] hdfs.DFSClient
(DFSOutputStream.java:run(557)) - DataStreamer Exception
java.io.IOException: Failed to replace a bad datanode on the existing pipeline
due to no more good datanodes being available to try. (Nodes:
current=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]],
original=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]]).
The current failed datanode replacement policy is DEFAULT, and a client may
configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy'
in its configuration.
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:914)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
FSDataOutputStream#close error:
java.io.IOException: Failed to replace a bad datanode on the existing pipeline
due to no more good datanodes being available to try. (Nodes:
current=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]],
original=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]]).
The current failed datanode replacement policy is DEFAULT, and a client may
configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy'
in its configuration.
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:914)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
/home/leif/git/arrow/cpp/src/arrow/io/io-hdfs-test.cc:236: Failure
Failed
IOError: HDFS: CloseFile failed
2017-04-11 11:13:27,550 ERROR [main] hdfs.DFSClient
(DFSClient.java:closeAllFilesBeingWritten(940)) - Failed to close inode 16419
java.io.IOException: Failed to replace a bad datanode on the existing pipeline
due to no more good datanodes being available to try. (Nodes:
current=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]],
original=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]]).
The current failed datanode replacement policy is DEFAULT, and a client may
configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy'
in its configuration.
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:914)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
[ FAILED ] TestHdfsClient/0.AppendToFile, where TypeParam =
arrow::io::JNIDriver (207 ms)
{noformat}
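The AppendToFile failure looks like the usual single-datanode pipeline problem the log itself points at: with only one datanode, the DEFAULT replace-datanode-on-failure policy can never find a replacement node, so the append pipeline dies. For a one-node test cluster, relaxing that policy in the client's hdfs-site.xml should let appends proceed. Sketch (these are the standard Hadoop properties for single-node setups; I haven't verified them against this particular cluster yet):
{code}
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>NEVER</value>
</property>
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>false</value>
</property>
{code}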
[~cpcloud] maybe we can look at this together this afternoon?
> listing empty HDFS directory returns an error instead of returning empty list
> -----------------------------------------------------------------------------
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.2.0, 0.3.0
> Reporter: Leif Walsh
> Assignee: Leif Walsh
> Fix For: 0.3.0
>
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) { num_entries = 0; }
>   return Status::IOError("HDFS: list directory failed");
> }
> {code}
> I think that should have an else:
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) {
>     num_entries = 0;
>   } else {
>     return Status::IOError("HDFS: list directory failed");
>   }
> }
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)