Roksolana Diachuk created HADOOP-16301:
------------------------------------------

             Summary: No enum constant Operation.GET_BLOCK_LOCATIONS 
                 Key: HADOOP-16301
                 URL: https://issues.apache.org/jira/browse/HADOOP-16301
             Project: Hadoop Common
          Issue Type: Bug
          Components: common, fs
    Affects Versions: 2.8.5, 2.7.7, 2.9.2, 2.7.6, 2.8.4, 2.9.1, 3.0.0, 2.7.5, 
2.8.3, 2.8.2, 2.8.1, 2.7.4, 2.9.0, 2.7.3, 2.7.2, 2.7.1, 2.8.0, 2.7.0, 2.7.8, 
2.8.6
         Environment: Running on Ubuntu 16.04

Hadoop v2.7.4

Minikube v1.0.1

Scala v2.11

Spark v2.4.2

 
            Reporter: Roksolana Diachuk


I was trying to read the contents of Avro files from HDFS with a Spark application, going through HttpFS configured in Minikube (to run Kubernetes locally). Each time I try to read the files I get this exception:
{code:java}
Exception in thread "main" org.apache.hadoop.ipc.RemoteException(com.sun.jersey.api.ParamException$QueryParamException): java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.fs.http.client.HttpFSFileSystem.Operation.GET_BLOCK_LOCATIONS
 at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:118)
 at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:367)
 at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:98)
 at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:625)
 at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:472)
 at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:502)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
 at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:498)
 at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileBlockLocations(WebHdfsFileSystem.java:1420)
 at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileBlockLocations(WebHdfsFileSystem.java:1404)
 at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:343)
 at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:204)
 at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
 at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
 at scala.Option.getOrElse(Option.scala:121)
 at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
 at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
 at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
 at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
 at scala.Option.getOrElse(Option.scala:121)
 at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
 at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126)
 at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:945)
 at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
 at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
 at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
 at org.apache.spark.rdd.RDD.collect(RDD.scala:944)
 at spark_test.TestSparkJob$.main(TestSparkJob.scala:48)
 at spark_test.TestSparkJob.main(TestSparkJob.scala){code}
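
For reference, here is a minimal sketch of the kind of read that hits this code path. It is in Java with a placeholder host, NodePort and path; my job reads Avro files, but any read that goes through the old mapred FileInputFormat ends up in the same getSplits() call from the trace, so a plain text read is enough to illustrate it:
{code:java}
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class HttpFsReadDemo {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("httpfs-read-demo").setMaster("local[*]");
    try (JavaSparkContext sc = new JavaSparkContext(conf)) {
      // webhdfs:// URI pointed at the HttpFS NodePort (placeholder address).
      // FileInputFormat.getSplits() calls FileSystem.getFileBlockLocations(),
      // which makes WebHdfsFileSystem send the GET_BLOCK_LOCATIONS request
      // that HttpFS rejects with the exception above.
      long count = sc.textFile("webhdfs://192.168.99.100:30080/user/test/data.avro").count();
      System.out.println(count);
    }
  }
}
{code}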
 

I access HDFS through HttpFS set up in Kubernetes. My Spark application runs outside of the K8s cluster, so all the services are reached through NodePorts. When I launch the Spark app inside the K8s cluster and use only the HDFS client or WebHDFS, I can read all the file contents. The error occurs only when I run the app outside of the cluster, which is when HDFS is accessed through HttpFS.
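
The failing call can also be reproduced without Spark by sending the same WebHDFS REST request straight to the HttpFS NodePort. A minimal sketch with plain JDK classes (host, port, path and user name are placeholders): the client-side WebHdfsFileSystem puts op=GET_BLOCK_LOCATIONS in the query string, and HttpFS answers with the RemoteException shown above as a JSON error payload:
{code:java}
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class HttpFsOpProbe {
  public static void main(String[] args) throws Exception {
    // Placeholder HttpFS NodePort address, file path and user name.
    URL url = new URL("http://192.168.99.100:30080/webhdfs/v1/user/test/data.avro"
        + "?op=GET_BLOCK_LOCATIONS&user.name=hdfs");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    int status = conn.getResponseCode();
    InputStream body = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
    System.out.println("HTTP " + status);
    // HttpFS returns the "No enum constant ... GET_BLOCK_LOCATIONS" error as JSON.
    try (BufferedReader in = new BufferedReader(new InputStreamReader(body))) {
      in.lines().forEach(System.out::println);
    }
  }
}
{code}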

So I checked the Hadoop sources and found that there is no enum constant named GET_BLOCK_LOCATIONS. In the HttpFSFileSystem Operation enum the constant is named GETFILEBLOCKLOCATIONS (see [this link|https://github.com/apache/hadoop/blob/release-2.7.4-RC0/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java]). The same applies to all the Hadoop versions I have checked (2.7.4 and higher).
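
Judging from the QueryParamException, the HttpFS server maps the "op" query parameter onto that Operation enum, so the lookup fails exactly the way Enum.valueOf fails for an undefined name. A minimal, Hadoop-free sketch with a stand-in enum (only the relevant constant is shown):
{code:java}
public class OpMismatchDemo {
  // Stand-in for org.apache.hadoop.fs.http.client.HttpFSFileSystem.Operation;
  // the real enum defines GETFILEBLOCKLOCATIONS but no GET_BLOCK_LOCATIONS.
  enum Operation { GETFILEBLOCKLOCATIONS }

  public static void main(String[] args) {
    // Works: the name HttpFS itself defines.
    System.out.println(Operation.valueOf("GETFILEBLOCKLOCATIONS"));
    // Fails: the name the WebHDFS client sends in the "op" query parameter.
    // Throws java.lang.IllegalArgumentException: No enum constant
    // OpMismatchDemo.Operation.GET_BLOCK_LOCATIONS
    System.out.println(Operation.valueOf("GET_BLOCK_LOCATIONS"));
  }
}
{code}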



The conclusion would be that the WebHDFS client and HttpFS are not compatible in their operation names, and the same may be true for other operations. So it is currently not possible to read data from HDFS through HttpFS this way.
Is it possible to fix this error somehow?


