shardulm94 commented on a change in pull request #2248:
URL: https://github.com/apache/iceberg/pull/2248#discussion_r580536744
##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -474,15 +475,31 @@ public static boolean isLocalityEnabled(FileIO io, String location, CaseInsensit
return false;
}
- public static boolean isVectorizationEnabled(Map<String, String> properties, CaseInsensitiveStringMap readOptions) {
+ public static boolean isVectorizationEnabled(FileFormat fileFormat,
+ Map<String, String> properties,
+ CaseInsensitiveStringMap readOptions) {
String batchReadsSessionConf = SparkSession.active().conf()
.get("spark.sql.iceberg.vectorization.enabled", null);
if (batchReadsSessionConf != null) {
return Boolean.valueOf(batchReadsSessionConf);
}
- return readOptions.getBoolean(SparkReadOptions.VECTORIZATION_ENABLED,
Review comment:
I see tradeoffs either way. I agree that the most specific setting, the read
option passed explicitly to the table read, should ideally win. But giving the
session conf higher precedence is also convenient in production: it lets us
turn off vectorization for an application with a pure config change, without
any code changes.
Another option is to take the boolean `AND` of the session conf and the read
option. This is what is done in
https://github.com/apache/iceberg/blob/91ac42174e4c535ece4e36db2cb587a23babced9/spark2/src/main/java/org/apache/iceberg/spark/source/IcebergSource.java#L182
It can be a little confusing if the default of the session conf (true) differs
from the default of the read option (false), but it is worth considering. Or
maybe a three-state boolean is more appropriate here, but that gets
complicated quickly.
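
To make the alternatives concrete, here is a rough sketch of the three
resolution strategies side by side. It is plain Java with no Spark or Iceberg
dependencies; the class name, the method names, and the `fallback` parameter
are hypothetical and only stand in for whatever default the real method ends
up using. A `null` string models "not set", which is effectively the
three-state boolean mentioned above.

```java
import java.util.Optional;

// Hypothetical illustration only; not the code in this PR.
final class VectorizationPrecedence {

  // null means "not set", so each setting has three states: unset, true, false.
  static Optional<Boolean> parse(String value) {
    return value == null ? Optional.empty() : Optional.of(Boolean.parseBoolean(value));
  }

  // Session conf wins when set; otherwise the read option; otherwise the fallback.
  static boolean sessionConfFirst(String sessionConf, String readOption, boolean fallback) {
    return parse(sessionConf).orElseGet(() -> parse(readOption).orElse(fallback));
  }

  // Read option wins when set; otherwise the session conf; otherwise the fallback.
  static boolean readOptionFirst(String sessionConf, String readOption, boolean fallback) {
    return parse(readOption).orElseGet(() -> parse(sessionConf).orElse(fallback));
  }

  // Boolean AND of both settings, each using its own default when unset.
  static boolean both(String sessionConf, boolean sessionDefault,
                      String readOption, boolean readDefault) {
    return parse(sessionConf).orElse(sessionDefault) && parse(readOption).orElse(readDefault);
  }

  public static void main(String[] args) {
    // Session conf says "false", read option says "true".
    System.out.println(sessionConfFirst("false", "true", true));
    System.out.println(readOptionFirst("false", "true", true));
    // Nothing set, defaults differ (session true, read option false).
    System.out.println(both(null, true, null, false));
  }
}
```

With the `AND` variant and mismatched defaults (session true, read option
false), leaving both unset resolves to false, and setting only the session
conf to true still resolves to false. That is the confusing case I had in
mind.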