Hi,

I am aware of at least two Flink users who were facing various issues
with HDFS when using Flink.

*Issues observed:*
- HDFS client trying to connect to the standby Namenode
"org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
Operation category READ is not supported in state standby"
- java.io.IOException: Bad response ERROR for block
BP-1335380477-172.22.5.37-1424696786673:blk_1107843111_34301064 from
datanode 172.22.5.81:50010
at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:732)

- Caused by:
org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException):
0
        at
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:478)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipelineInternal(FSNamesystem.java:6039)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipeline(FSNamesystem.java:6002)


I've added the exceptions to this email so that users facing these issues
can find a solution for them.
I suspect that all these issues are caused by the Hadoop 2.2.0 client we
are packaging into the binary releases.

Upgrading the HDFS client to the same version as the HDFS installation
(say, 2.4.1) resolved all of these issues.
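For users who want to try this before updated binaries are available, the
rough idea is to build Flink from source against the Hadoop version running
on their cluster. A sketch (assuming the Maven build's hadoop.version
property is used to select the Hadoop dependency; adjust the version to
match your HDFS installation):

```shell
# Sketch: build Flink against the cluster's Hadoop version
# (here 2.4.1, matching the HDFS installation in the example above).
mvn clean install -DskipTests -Dhadoop.version=2.4.1
```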

Therefore, I propose to provide Hadoop 2.4.0 and Hadoop 2.6.0 binaries on
the Flink download page.
For the 0.9.0 release, I would start another VOTE on providing these two
binaries.

I've also filed a JIRA to provide a Flink build which doesn't include
Hadoop at all (relying on the version provided by the user through the
classpath): https://issues.apache.org/jira/browse/FLINK-2268
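With such a Hadoop-free build, users would point Flink at the Hadoop
installation already on their machines. How exactly this would look depends
on the outcome of FLINK-2268, but a sketch could be:

```shell
# Sketch: expose the locally installed Hadoop to Flink via the classpath.
# 'hadoop classpath' prints the classpath of the local Hadoop installation.
export HADOOP_CLASSPATH=$(hadoop classpath)
./bin/start-cluster.sh
```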


Let me know what you think!

Robert
