wangyum commented on pull request #29642:
URL: https://github.com/apache/spark/pull/29642#issuecomment-803815504
This patch pushes down the data column when the number of `InSet` values
exceeds `spark.sql.parquet.pushdown.inFilterThreshold`. Here are the benchmark
and its result:
wangyum commented on pull request #29642:
URL: https://github.com/apache/spark/pull/29642#issuecomment-743135912
@cloud-fan @HyukjinKwon @gengliangwang Do you have more comments?
This is an automated message from the Apache
wangyum commented on pull request #29642:
URL: https://github.com/apache/spark/pull/29642#issuecomment-743109098
Test on a real production case:
Before this PR | After this PR
--- | ---
wangyum commented on pull request #29642:
URL: https://github.com/apache/spark/pull/29642#issuecomment-738804876
retest this please.
wangyum commented on pull request #29642:
URL: https://github.com/apache/spark/pull/29642#issuecomment-738663385
retest this please.
wangyum commented on pull request #29642:
URL: https://github.com/apache/spark/pull/29642#issuecomment-738661573
```scala
package org.apache.spark.sql.execution.benchmark
import java.io.File
import scala.util.Random
import org.apache.spark.SparkConf
import