[
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964499#comment-15964499
]
Leif Walsh commented on ARROW-805:
----------------------------------
Okay, I got it set up but got a failure; I think it might be failing a
permission check? Not sure yet.
{noformat}
ARROW_HDFS_TEST_PORT=9000 ARROW_HDFS_TEST_USER=hdfs ARROW_HDFS_TEST_HOST=impala
debug/io-hdfs-test
[==========] Running 18 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 9 tests from TestHdfsClient/0, where TypeParam =
arrow::io::JNIDriver
[ RUN ] TestHdfsClient/0.ConnectsAgain
2017-04-11 11:13:25,511 WARN [main] util.NativeCodeLoader
(NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for
your platform... using builtin-java classes where applicable
[ OK ] TestHdfsClient/0.ConnectsAgain (1381 ms)
[ RUN ] TestHdfsClient/0.CreateDirectory
/home/leif/git/arrow/cpp/src/arrow/io/io-hdfs-test.cc:174: Failure
Value of: s.ok()
Actual: false
Expected: true
[ FAILED ] TestHdfsClient/0.CreateDirectory, where TypeParam =
arrow::io::JNIDriver (201 ms)
[ RUN ] TestHdfsClient/0.GetCapacityUsed
[ OK ] TestHdfsClient/0.GetCapacityUsed (137 ms)
[ RUN ] TestHdfsClient/0.GetPathInfo
[ OK ] TestHdfsClient/0.GetPathInfo (799 ms)
[ RUN ] TestHdfsClient/0.AppendToFile
2017-04-11 11:13:27,536 WARN [Thread-15] hdfs.DFSClient
(DFSOutputStream.java:run(557)) - DataStreamer Exception
java.io.IOException: Failed to replace a bad datanode on the existing pipeline
due to no more good datanodes being available to try. (Nodes:
current=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]],
original=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]]).
The current failed datanode replacement policy is DEFAULT, and a client may
configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy'
in its configuration.
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:914)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
FSDataOutputStream#close error:
java.io.IOException: Failed to replace a bad datanode on the existing pipeline
due to no more good datanodes being available to try. (Nodes:
current=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]],
original=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]]).
The current failed datanode replacement policy is DEFAULT, and a client may
configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy'
in its configuration.
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:914)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
/home/leif/git/arrow/cpp/src/arrow/io/io-hdfs-test.cc:236: Failure
Failed
IOError: HDFS: CloseFile failed
2017-04-11 11:13:27,550 ERROR [main] hdfs.DFSClient
(DFSClient.java:closeAllFilesBeingWritten(940)) - Failed to close inode 16419
java.io.IOException: Failed to replace a bad datanode on the existing pipeline
due to no more good datanodes being available to try. (Nodes:
current=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]],
original=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]]).
The current failed datanode replacement policy is DEFAULT, and a client may
configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy'
in its configuration.
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:914)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
[ FAILED ] TestHdfsClient/0.AppendToFile, where TypeParam =
arrow::io::JNIDriver (207 ms)
{noformat}
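The AppendToFile failure looks like the usual single-datanode pipeline problem the log itself points at: with only one datanode, the DEFAULT replace-datanode-on-failure policy can never find a replacement node, so the append pipeline dies. For a one-node test cluster, relaxing that policy in the client's hdfs-site.xml should let appends proceed. Sketch (these are the standard Hadoop properties for single-node setups; I haven't verified them against this particular cluster yet):
{code}
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>NEVER</value>
</property>
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>false</value>
</property>
{code}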
[~cpcloud] maybe we can look at this together this afternoon?
> listing empty HDFS directory returns an error instead of returning empty list
> -----------------------------------------------------------------------------
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.2.0, 0.3.0
> Reporter: Leif Walsh
> Assignee: Leif Walsh
> Fix For: 0.3.0
>
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) { num_entries = 0; }
>   return Status::IOError("HDFS: list directory failed");
> }
> {code}
> I think that should have an else:
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) {
>     num_entries = 0;
>   } else {
>     return Status::IOError("HDFS: list directory failed");
>   }
> }
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)