Johan Lasperas created SPARK-46092:
--------------------------------------
Summary: Overflow in Parquet row group filter creation causes incorrect results
Key: SPARK-46092
URL: https://issues.apache.org/jira/browse/SPARK-46092
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.5.0
Reporter: Johan Lasperas
While the Parquet readers don't support reading Parquet values into larger
Spark types, it's possible to trigger an overflow when creating a Parquet row
group filter. The overflowed filter then incorrectly skips row groups, which
bypasses the exception in the reader.
Repro:
```
// `path` is any writable location; outside spark-shell, also import spark.implicits._
Seq(0).toDF("a").write.parquet(path)
spark.read.schema("a LONG").parquet(path).where(s"a < ${Long.MaxValue}").collect()
```
This succeeds and returns no results. It should either fail, if the Parquet
reader doesn't support the upcast from int to long, or return `[0]` if it
does.
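
To make the failure mode concrete, here is a minimal stand-alone sketch of how such an overflow could lead to a skipped row group. It assumes the filter literal is narrowed to the file's 32-bit physical type without a range check; the ticket doesn't say exactly where the overflow happens, and every name below (`IntStats`, `narrowUnchecked`, `canSkipLessThan`, `canSkipLessThanWide`) is hypothetical and not part of Spark or Parquet.

```scala
// Illustrative sketch only: models row group skipping with plain Scala,
// without depending on Spark or Parquet. All names are hypothetical.
object RowGroupFilterOverflow {
  // Min/max statistics of an INT32 column in a single row group.
  final case class IntStats(min: Int, max: Int)

  // Assumed buggy step: narrow the LONG literal to Int with no range check.
  // Long.MaxValue keeps only its low 32 bits, which read as -1.
  def narrowUnchecked(literal: Long): Int = literal.toInt

  // A row group can be skipped for `col < bound` when no value can match,
  // i.e. when even the smallest value is >= bound.
  def canSkipLessThan(stats: IntStats, bound: Int): Boolean =
    stats.min >= bound

  // One possible guard: compare in the wider type so nothing overflows.
  def canSkipLessThanWide(stats: IntStats, literal: Long): Boolean =
    stats.min.toLong >= literal

  def main(args: Array[String]): Unit = {
    val stats = IntStats(min = 0, max = 0)           // the file written by Seq(0)
    val bound = narrowUnchecked(Long.MaxValue)       // overflows to -1
    println(s"narrowed bound: $bound")
    println(s"skip (unchecked): ${canSkipLessThan(stats, bound)}")
    println(s"skip (wide compare): ${canSkipLessThanWide(stats, Long.MaxValue)}")
  }
}
```

Under this assumption the predicate `a < 9223372036854775807` is evaluated against the statistics as `a < -1`, so the only row group is dropped before the reader has a chance to reject the int-to-long mismatch.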
--
This message was sent by Atlassian Jira
(v8.20.10#820010)