Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21603#discussion_r202255923
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -386,6 +386,17 @@ object SQLConf {
.booleanConf
.createWithDefault(true)
+  val PARQUET_FILTER_PUSHDOWN_INFILTERTHRESHOLD =
+    buildConf("spark.sql.parquet.pushdown.inFilterThreshold")
+      .doc("The maximum number of values to filter push-down optimization for IN predicate. " +
+        "Large threshold won't necessarily provide much better performance. " +
+        "The experiment argued that 300 is the limit threshold. " +
+        "This configuration only has an effect when 'spark.sql.parquet.filterPushdown' is enabled.")
+      .internal()
+      .intConf
+      .checkValue(threshold => threshold > 0, "The threshold must be greater than 0.")
--- End diff ---
Shall we allow, for example, `-1` here to disable this?
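
If `-1` were allowed, the validation and the call site would presumably need to treat a non-positive threshold as "disabled". A minimal, self-contained sketch of that gating logic follows; the object and method names here are hypothetical and not part of the PR:

```scala
// Hypothetical sketch of the gating the reviewer is suggesting:
// a non-positive threshold (e.g. -1) disables IN-predicate push-down entirely,
// while a positive threshold caps how many IN values may be pushed down.
object InPushdownGate {
  // `threshold` stands in for spark.sql.parquet.pushdown.inFilterThreshold.
  def shouldPushDownIn(numValues: Int, threshold: Int): Boolean =
    threshold > 0 && numValues <= threshold

  def main(args: Array[String]): Unit = {
    println(shouldPushDownIn(numValues = 5, threshold = 10))   // true: under the cap
    println(shouldPushDownIn(numValues = 500, threshold = 10)) // false: over the cap
    println(shouldPushDownIn(numValues = 5, threshold = -1))   // false: -1 disables push-down
  }
}
```

Under this reading, the `.checkValue` above would also have to accept `-1` (or any non-positive value) rather than rejecting it.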