Johan Lasperas created SPARK-46092:
--------------------------------------

             Summary: Overflow in Parquet row group filter creation causes incorrect results
                 Key: SPARK-46092
                 URL: https://issues.apache.org/jira/browse/SPARK-46092
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.5.0
            Reporter: Johan Lasperas

While the Parquet readers don't support reading Parquet values into larger Spark types, it's possible to trigger an overflow when creating a Parquet row group filter. The overflowed filter then incorrectly skips row groups, so the exception in the reader is never reached.

Repro:
```
Seq(0).toDF("a").write.parquet(path)
spark.read.schema("a LONG").parquet(path).where(s"a < ${Long.MaxValue}").collect()
```
This succeeds and returns no results. It should either fail, if the Parquet reader doesn't support the upcast from int to long, or produce the result `[0]` if it does.
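
For illustration, here is a minimal sketch of how such an overflow could lead to a skipped row group. It assumes the filter literal is narrowed to the file's INT32 column type with `Number.intValue`; that mechanism is my assumption, not something confirmed in this report. The object and helper names (`RowGroupFilterOverflowSketch`, `narrowToInt`, `canSkipLessThan`, `canSkipLessThanSafe`) are hypothetical and are not Spark or Parquet APIs.

```scala
object RowGroupFilterOverflowSketch {
  // Hypothetical stand-in for converting a filter literal to the type of an
  // INT32 Parquet column. Narrowing a Long keeps only the low 32 bits.
  def narrowToInt(literal: Any): Int =
    literal.asInstanceOf[Number].intValue

  // Hypothetical min/max pruning rule for the predicate `col < bound`:
  // if even the smallest value in the row group is >= bound, no row can match.
  def canSkipLessThan(rowGroupMin: Int, bound: Int): Boolean =
    rowGroupMin >= bound

  // Same rule, but the comparison is widened to 64 bits so the bound is never truncated.
  def canSkipLessThanSafe(rowGroupMin: Int, bound: Long): Boolean =
    rowGroupMin.toLong >= bound

  def main(args: Array[String]): Unit = {
    // `a < Long.MaxValue` evaluated against an INT32 column with min = max = 0.
    val overflowed = narrowToInt(Long.MaxValue)
    println(s"narrowed bound: $overflowed")
    println(s"skip with narrowed bound: ${canSkipLessThan(0, overflowed)}")
    println(s"skip with 64-bit bound: ${canSkipLessThanSafe(0, Long.MaxValue)}")
  }
}
```

Under that assumption the predicate effectively becomes `a < -1`, which the row group statistics (min = 0) rule out, so the row group is dropped before the reader can either upcast the value or throw.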