[
https://issues.apache.org/jira/browse/HDFS-14477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16965030#comment-16965030
]
hemanthboyina commented on HDFS-14477:
--------------------------------------
the operation GETFILEBLOCKLOCATIONS is present in HttpFS, but there is no
implementation of it
{quote}
no such enum as GET_BLOCK_LOCATIONS. It is named GETFILEBLOCKLOCATIONS
{quote}
yes, it should be GET_BLOCK_LOCATIONS.
will work on this.
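For context, a minimal, self-contained sketch (not the actual HttpFS code; the enum values below are placeholders) of how a server-side Enum.valueOf lookup on the op query parameter fails when the client sends GET_BLOCK_LOCATIONS but the server-side enum only defines GETFILEBLOCKLOCATIONS:
{code:java}
// Illustrative sketch only: mimics the server-side enum lookup of the "op"
// query parameter when client and server disagree on the operation name.
public class OpLookupSketch {
    // Hypothetical stand-in for HttpFSFileSystem.Operation, which (per this
    // report) defines GETFILEBLOCKLOCATIONS but not GET_BLOCK_LOCATIONS.
    enum Operation { OPEN, GETFILESTATUS, LISTSTATUS, GETFILEBLOCKLOCATIONS }

    public static void main(String[] args) {
        String opFromClient = "GET_BLOCK_LOCATIONS"; // name sent by the WebHDFS client
        try {
            Operation op = Enum.valueOf(Operation.class, opFromClient);
            System.out.println("Resolved operation: " + op);
        } catch (IllegalArgumentException e) {
            // Mirrors the "No enum constant ...Operation.GET_BLOCK_LOCATIONS"
            // failure reported below.
            System.out.println("Lookup failed: " + e.getMessage());
        }
    }
}
{code}
Accepting the operation name the client actually sends (and wiring up a handler for it) on the HttpFS side would be one way to resolve this particular lookup failure.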
> No enum constant Operation.GET_BLOCK_LOCATIONS
> -----------------------------------------------
>
> Key: HDFS-14477
> URL: https://issues.apache.org/jira/browse/HDFS-14477
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: fs
> Affects Versions: 2.7.0, 2.8.0, 2.7.1, 2.7.2, 2.7.3, 2.9.0, 2.7.4, 2.8.1,
> 2.8.2, 2.8.3, 2.7.5, 3.0.0, 2.9.1, 2.8.4, 2.7.6, 2.9.2, 2.8.5, 2.7.7, 2.7.8,
> 2.8.6
> Environment: Running on Ubuntu 16.04
> Hadoop v2.7.4
> Minikube v1.0.1
> Scala v2.11
> Spark v2.4.2
>
> Reporter: Roksolana Diachuk
> Priority: Major
>
> I was trying to read the contents of Avro files from HDFS using a Spark
> application and HttpFS configured in Minikube (for running Kubernetes
> locally). Each time I try to read the files, I get this exception:
> {code:java}
> Exception in thread "main" org.apache.hadoop.ipc.RemoteException(com.sun.jersey.api.ParamException$QueryParamException): java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.fs.http.client.HttpFSFileSystem.Operation.GET_BLOCK_LOCATIONS
> at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:118)
> at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:367)
> at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:98)
> at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:625)
> at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:472)
> at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:502)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
> at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:498)
> at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileBlockLocations(WebHdfsFileSystem.java:1420)
> at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileBlockLocations(WebHdfsFileSystem.java:1404)
> at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:343)
> at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:204)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
> at scala.Option.getOrElse(Option.scala:121)
> at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
> at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
> at scala.Option.getOrElse(Option.scala:121)
> at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126)
> at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:945)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
> at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
> at org.apache.spark.rdd.RDD.collect(RDD.scala:944)
> at spark_test.TestSparkJob$.main(TestSparkJob.scala:48)
> at spark_test.TestSparkJob.main(TestSparkJob.scala){code}
>
> I access HDFS through an HttpFS gateway set up in Kubernetes. My Spark
> application runs outside of the K8s cluster, so all services are reached
> through NodePorts. When I launch the Spark app inside the K8s cluster and use
> only the HDFS client or WebHDFS, I can read all the files' contents. The error
> occurs only when I run the app outside of the cluster, which is when I access
> HDFS through HttpFS.
> So I checked the Hadoop sources and found that there is no such enum constant
> as GET_BLOCK_LOCATIONS. It is named GETFILEBLOCKLOCATIONS in the Operation
> enum, see [this
> link|https://github.com/apache/hadoop/blob/release-2.7.4-RC0/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java].
> The same applies to all the Hadoop versions I have checked (2.7.4 and higher).
> The conclusion would be that HDFS and HttpFS do not agree on operation names,
> and the same may be true for other operations. As a result, it is not yet
> possible to read data from HDFS through HttpFS this way (a minimal
> reproduction sketch follows this description).
> Is it possible to fix this error somehow?
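A minimal, Spark-free reproduction sketch of the scenario above, assuming an HttpFS endpoint reachable at httpfs-host:14000 (hostname, port, and path below are placeholders), is a plain getFileBlockLocations call through the webhdfs:// FileSystem client:
{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Reproduction sketch: call getFileBlockLocations through the WebHDFS client
// against an HttpFS gateway. Host, port, and file path are placeholders.
public class GetBlockLocationsRepro {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("webhdfs://httpfs-host:14000"), conf);
        Path file = new Path("/tmp/example.avro");
        FileStatus status = fs.getFileStatus(file);
        // Against HttpFS this call fails with
        // "No enum constant ...Operation.GET_BLOCK_LOCATIONS".
        BlockLocation[] locations = fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation location : locations) {
            System.out.println(location);
        }
    }
}
{code}
Pointing the same code at a NameNode's WebHDFS endpoint succeeds, which matches the observation above that only the HttpFS path fails.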