rdblue commented on a change in pull request #2248:
URL: https://github.com/apache/iceberg/pull/2248#discussion_r588813154
##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -474,15 +476,33 @@ public static boolean isLocalityEnabled(FileIO io, String location, CaseInsensit
return false;
}
- public static boolean isVectorizationEnabled(Map<String, String> properties, CaseInsensitiveStringMap readOptions) {
- String batchReadsSessionConf = SparkSession.active().conf()
- .get("spark.sql.iceberg.vectorization.enabled", null);
- if (batchReadsSessionConf != null) {
- return Boolean.valueOf(batchReadsSessionConf);
+ public static boolean isVectorizationEnabled(FileFormat fileFormat,
+ Map<String, String> properties,
+ RuntimeConfig sessionConf,
+ CaseInsensitiveStringMap readOptions) {
+
+ String readOptionValue = readOptions.get(SparkReadOptions.VECTORIZATION_ENABLED);
+ if (readOptionValue != null) {
+ return Boolean.parseBoolean(readOptionValue);
Review comment:
I think this is safe: a job is unlikely to actually set this read option
unless it is testing vectorized reads or has already tested them. It does
rule out the use case of finding a problem and disabling vectorized reads
everywhere to be safe, because the read option is checked first, but the
option is unlikely to make it into final compiled artifacts. Let's go with
what is here, since that appears to be the more common way of thinking
about it.
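
To make the precedence question concrete, here is a minimal sketch of the lookup order under discussion. It is not the code in this PR: the class name, the plain `Map` stand-ins for `RuntimeConfig` and `CaseInsensitiveStringMap`, the `"vectorization-enabled"` key string, and the table-level default parameter are all assumptions for illustration. The diff above only shows that the read option is checked first; the steps after it are assumed.

```java
import java.util.Map;

// Hypothetical stand-in for the precedence logic; not the Spark3Util implementation.
final class VectorizationPrecedence {
  // Assumed key for the read option; the diff only names SparkReadOptions.VECTORIZATION_ENABLED.
  static final String READ_OPTION = "vectorization-enabled";
  // Session conf key taken from the removed lines of the diff.
  static final String SESSION_CONF = "spark.sql.iceberg.vectorization.enabled";

  private VectorizationPrecedence() {
  }

  static boolean resolve(Map<String, String> readOptions,
                         Map<String, String> sessionConf,
                         boolean tableDefault) {
    // 1. A per-read option wins over everything else, as in the diff.
    String readOptionValue = readOptions.get(READ_OPTION);
    if (readOptionValue != null) {
      return Boolean.parseBoolean(readOptionValue);
    }

    // 2. Assumed next step: fall back to the session-wide setting.
    String sessionValue = sessionConf.get(SESSION_CONF);
    if (sessionValue != null) {
      return Boolean.parseBoolean(sessionValue);
    }

    // 3. Assumed last step: use the table-level default.
    return tableDefault;
  }
}
```

With this ordering, `resolve(Map.of("vectorization-enabled", "true"), Map.of("spark.sql.iceberg.vectorization.enabled", "false"), false)` returns `true`: a session-wide "off" switch cannot override a job that hard-codes the read option, which is the trade-off described above.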