1ambda edited a comment on issue #3295:
URL: https://github.com/apache/iceberg/issues/3295#issuecomment-988384976
Tested with Spark 3.1.2 (Iceberg runtime 0.12.1), but I still get the same error.
In Presto we usually don't need much setup: prestodb 0.266 embeds the required Iceberg
runtime (0.12.1). Since the Iceberg catalog setup is quite simple, I don't think this is a
Presto problem (even though the table was created in Presto). The Spark Iceberg runtime
doesn't seem to be able to read this file properly.
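For context, the catalog configuration on the Spark side is roughly equivalent to the sketch below (only the catalog name `hive_prod` and the Iceberg classes match the log; the metastore URI and app name are placeholders, not the real values):
```
// Sketch of the assumed Spark catalog configuration for `hive_prod`
// (values other than the catalog name and Iceberg catalog class are placeholders).
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("iceberg-read-test")
  .config("spark.sql.catalog.hive_prod", "org.apache.iceberg.spark.SparkCatalog")
  .config("spark.sql.catalog.hive_prod.type", "hive")
  .config("spark.sql.catalog.hive_prod.uri", "thrift://hive-metastore:9083")
  .getOrCreate()
```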
```
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.1.2
      /_/
Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_242)
Type in expressions to have them evaluated.
Type :help for more information.
scala> spark.sql("SELECT * FROM hive_prod.test_db.test_table ...").show()
res0: org.apache.spark.sql.DataFrame = [placeno: col_a, col_b: string ... 1 more field]
scala> spark.sql("SELECT * FROM hive_prod.test_db.test_table ...").show()
21/12/08 09:42:22 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.ClassCastException: [B cannot be cast to org.apache.spark.unsafe.types.UTF8String
    at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getUTF8String(rows.scala:46)
    at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getUTF8String$(rows.scala:46)
    at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getUTF8String(rows.scala:195)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:755)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:345)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:131)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
21/12/08 09:42:22 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) (hive-metastore-2-66d6c89cf-jxbcg executor driver): java.lang.ClassCastException: [B cannot be cast to org.apache.spark.unsafe.types.UTF8String
```
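If it helps to narrow this down, one way to check whether the data file written by Presto physically stores the column as binary rather than string is to read it directly, bypassing the Iceberg catalog (this assumes the table's data files are Parquet; the path below is a placeholder, not one from the actual table):
```
// Hypothetical check: read one Presto-written data file directly and
// inspect its physical schema (the path is a placeholder).
val raw = spark.read.parquet("s3://warehouse/test_db/test_table/data/00000-0-data.parquet")
raw.printSchema()
```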