aokolnychyi commented on a change in pull request #2248:
URL: https://github.com/apache/iceberg/pull/2248#discussion_r578036747



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -474,15 +475,28 @@ public static boolean isLocalityEnabled(FileIO io, String location, CaseInsensit
     return false;
   }
 
-  public static boolean isVectorizationEnabled(Map<String, String> properties, CaseInsensitiveStringMap readOptions) {
+  public static boolean isVectorizationEnabled(FileFormat fileFormat,
+                                               Map<String, String> properties,
+                                               CaseInsensitiveStringMap readOptions) {
     String batchReadsSessionConf = SparkSession.active().conf()
         .get("spark.sql.iceberg.vectorization.enabled", null);
     if (batchReadsSessionConf != null) {
       return Boolean.valueOf(batchReadsSessionConf);
     }
-    return readOptions.getBoolean(SparkReadOptions.VECTORIZATION_ENABLED,
-        PropertyUtil.propertyAsBoolean(properties,
-            TableProperties.PARQUET_VECTORIZATION_ENABLED, TableProperties.PARQUET_VECTORIZATION_ENABLED_DEFAULT));
+
+    switch (fileFormat) {
+      case PARQUET:
+        boolean defaultValue = PropertyUtil.propertyAsBoolean(
+            properties,
+            TableProperties.PARQUET_VECTORIZATION_ENABLED,
+            TableProperties.PARQUET_VECTORIZATION_ENABLED_DEFAULT);
+        return readOptions.getBoolean(SparkReadOptions.VECTORIZATION_ENABLED, defaultValue);
+      case ORC:
+        // TODO: support a table property to enable/disable vectorized reads in ORC
+        return readOptions.getBoolean(SparkReadOptions.VECTORIZATION_ENABLED, true);

Review comment:
   I think it would be handy to add more ORC properties at the table level to
   control various aspects, much as we control the row group size in Parquet,
   for example. Table properties are quite flexible: we can change a value in
   one place and have all jobs pick it up.

   I agree about disabling this by default. Let me submit a separate PR for
   the table property and then consume it here.
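
   To make the proposal concrete, here is a minimal sketch of the lookup order
   once an ORC table property exists: session conf first, then the read option,
   then the table property, then a default of false. It is not the code from
   this PR. It uses plain maps instead of CaseInsensitiveStringMap (so key
   lookup is case-sensitive here), and the class name, the property keys, the
   default values, and the handling of other file formats are all assumptions
   made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Standalone sketch; plain maps stand in for Spark and Iceberg types.
final class VectorizationSketch {
  // All keys and defaults below are placeholders chosen for this example.
  static final String READ_OPTION = "vectorization-enabled";
  static final String PARQUET_TABLE_PROP = "read.parquet.vectorization.enabled";
  static final String ORC_TABLE_PROP = "read.orc.vectorization.enabled"; // hypothetical
  static final boolean PARQUET_DEFAULT = false; // assumed default
  static final boolean ORC_DEFAULT = false;     // disabled by default, as agreed above

  private VectorizationSketch() {
  }

  static boolean isVectorizationEnabled(String fileFormat,
                                        String sessionConf,
                                        Map<String, String> tableProps,
                                        Map<String, String> readOptions) {
    // 1. A session-level setting overrides everything else.
    if (sessionConf != null) {
      return Boolean.parseBoolean(sessionConf);
    }

    // 2. The table property supplies the per-format default.
    boolean tableDefault;
    switch (fileFormat) {
      case "PARQUET":
        tableDefault = asBoolean(tableProps, PARQUET_TABLE_PROP, PARQUET_DEFAULT);
        break;
      case "ORC":
        tableDefault = asBoolean(tableProps, ORC_TABLE_PROP, ORC_DEFAULT);
        break;
      default:
        // Assumption: other formats have no vectorized reader.
        return false;
    }

    // 3. A per-read option overrides the table-level default.
    return asBoolean(readOptions, READ_OPTION, tableDefault);
  }

  // Returns the boolean stored under key, or defaultValue if the key is absent.
  static boolean asBoolean(Map<String, String> map, String key, boolean defaultValue) {
    String value = map.get(key);
    return value == null ? defaultValue : Boolean.parseBoolean(value);
  }

  public static void main(String[] args) {
    Map<String, String> tableProps = new HashMap<>();
    tableProps.put(ORC_TABLE_PROP, "true");
    Map<String, String> noReadOptions = Collections.emptyMap();
    System.out.println(isVectorizationEnabled("ORC", null, tableProps, noReadOptions));
  }
}
```

   With this shape, the table property can be flipped once and every job that
   does not set the read option or the session conf picks up the new value.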






