viirya commented on a change in pull request #29304:
URL: https://github.com/apache/spark/pull/29304#discussion_r463156335
##########
File path:
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java
##########
@@ -591,6 +591,15 @@ public boolean anyNull() {
return BitSetMethods.anySet(baseObject, baseOffset, bitSetWidthInBytes / 8);
}
+ public boolean allNull() {
Review comment:
Can you add a doc comment for this method?
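For illustration, a documented version could look something like the sketch below. It is not the PR's code: the class `NullBitSetSketch`, its `long[]` layout, and `setNullAt` as written here are hypothetical stand-ins for UnsafeRow's null-tracking bitset, and only the method names `anyNull` and `allNull` come from the diff. Treating a zero-field row as "all null" is an assumption of this sketch.

```java
/** Hypothetical stand-in for a row that tracks null fields in a bitset. */
public class NullBitSetSketch {
  // One bit per field; a set bit means the field is null.
  private final long[] words;
  private final int numFields;

  public NullBitSetSketch(int numFields) {
    this.numFields = numFields;
    this.words = new long[(numFields + 63) / 64];
  }

  /** Marks field {@code i} as null. */
  public void setNullAt(int i) {
    words[i >> 6] |= 1L << (i & 63);
  }

  /** Returns true if at least one field in this row is null. */
  public boolean anyNull() {
    for (long w : words) {
      if (w != 0) {
        return true;
      }
    }
    return false;
  }

  /**
   * Returns true if every field in this row is null.
   * A row with zero fields returns true (an assumption of this sketch).
   */
  public boolean allNull() {
    for (int i = 0; i < numFields; i++) {
      if ((words[i >> 6] & (1L << (i & 63))) == 0) {
        return false;
      }
    }
    return true;
  }
}
```

The point is only the shape of the doc comment: one sentence stating the return condition, plus a note on any edge case the caller could trip over.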
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -2684,11 +2684,17 @@ object SQLConf {
.doc("When true, NULL-aware anti join execution will be planed into " +
"BroadcastHashJoinExec with flag isNullAwareAntiJoin enabled, " +
"optimized from O(M*N) calculation into O(M) calculation " +
- "using Hash lookup instead of Looping lookup." +
- "Only support for singleColumn NAAJ for now.")
+ "using Hash lookup instead of Looping lookup.")
Review comment:
Add a few words to link this with
OPTIMIZE_NULL_AWARE_ANTI_JOIN_MAX_NUM_KEYS, e.g., "The number of keys supported
for NAAJ is configured by spark.sql.optimizeNullAwareAntiJoin.maxNumKeys".
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -2684,11 +2684,17 @@ object SQLConf {
.doc("When true, NULL-aware anti join execution will be planed into " +
"BroadcastHashJoinExec with flag isNullAwareAntiJoin enabled, " +
"optimized from O(M*N) calculation into O(M) calculation " +
- "using Hash lookup instead of Looping lookup." +
- "Only support for singleColumn NAAJ for now.")
+ "using Hash lookup instead of Looping lookup.")
.booleanConf
.createWithDefault(true)
+ val OPTIMIZE_NULL_AWARE_ANTI_JOIN_MAX_NUM_KEYS =
+ buildConf("spark.sql.optimizeNullAwareAntiJoin.maxNumKeys")
+ .internal()
+ .doc("The maximum number of keys that will be supported to use NAAJ optimize.")
Review comment:
What is the cost of increasing this maximum number? What are the benefits
and the drawbacks? We should state this clearly in the doc.