bersprockets commented on PR #48767:
URL: https://github.com/apache/spark/pull/48767#issuecomment-2519009803
Starting with this commit (800faf0abfa), I get an error with the following
commands:
```
val testDf = spark.range(200000).selectExpr("id as a", "concat('x', string(id % 2)) as b")
testDf.write.mode("overwrite").partitionBy("b").format("parquet").save("test1")
spark.read.parquet("test1").createOrReplaceTempView("test1")
sql("select * from test1 limit 12 offset 20000").collect
```
The `collect` results in this error:
```
Exception in task 0.0 in stage 2.0 (TID 17)
java.lang.NullPointerException: Cannot invoke "org.apache.spark.unsafe.types.UTF8String.getBaseObject()" because "input" is null
    at org.apache.spark.sql.catalyst.expressions.codegen.UnsafeWriter.write(UnsafeWriter.java:111) ~[spark-catalyst_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) ~[?:?]
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) ~[spark-sql_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:50) ~[spark-sql_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
```
If I build with the commit immediately before 800faf0abfa, the query returns
rows rather than an error.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.