rdblue commented on a change in pull request #1778:
URL: https://github.com/apache/iceberg/pull/1778#discussion_r525628233



##########
File path: spark2/src/main/java/org/apache/iceberg/spark/source/Reader.java
##########
@@ -156,10 +156,13 @@
     this.io = io;
     this.encryptionManager = encryptionManager;
     this.caseSensitive = caseSensitive;
-
-    this.batchReadsEnabled = 
options.get("vectorization-enabled").map(Boolean::parseBoolean).orElseGet(() ->
-        PropertyUtil.propertyAsBoolean(table.properties(),
-            TableProperties.PARQUET_VECTORIZATION_ENABLED, 
TableProperties.PARQUET_VECTORIZATION_ENABLED_DEFAULT));
+    boolean batchReadsSparkSessionConf = 
SparkSession.active().sparkContext().getConf()
+            .getBoolean("read.parquet.vectorization.enabled", true);
+    boolean batchReadsEnabledTableProp = 
options.get("vectorization-enabled").map(Boolean::parseBoolean).orElseGet(() ->
+            PropertyUtil.propertyAsBoolean(table.properties(),
+                    TableProperties.PARQUET_VECTORIZATION_ENABLED,
+                    TableProperties.PARQUET_VECTORIZATION_ENABLED_DEFAULT));
+    this.batchReadsEnabled = !batchReadsSparkSessionConf ? false : 
batchReadsEnabledTableProp;

Review comment:
       Looks like the logic here is too complicated. I think what you're trying 
to do is disable vectorized reads if the session conf disables them, and 
otherwise delegate to the table property. In other words, the session conf and 
the table property must be enabled to use vectorization. If that's the case, 
then this could be `(batchReadsSparkSessionConf && batchReadsEnabledTableProp)`.
   
   I'm not sure that logic is a good idea, either. This may have surprising 
behavior because a user may expect that `true` enables vectorization for all 
tables when used as a session property.
   
   I think a better way to configure is to choose an order of priority and 
delegate to the next config if the option is not set. Here, I think session 
should override table. If session is set, return whatever it is. Otherwise, use 
the table property.
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to