RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r784094937
##########
File path:
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +284,67 @@ public void deleteWhere(Filter[] filters) {
}
}
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();
+    List<PartitionField> fields = icebergTable.spec().fields();
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(fields.size());
+    fields.forEach(f -> {
+      Type resultType = Types.StringType.get();
+      Type sourceType = schema.findType(f.sourceId());
+      if (!f.name().endsWith("hour") && !f.name().endsWith("month")) {
+        resultType = f.transform().getResultType(sourceType);
+      }
+      structFields.add(Types.NestedField.optional(f.fieldId(), f.name(), resultType));
+    });
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    // use Iceberg SQL extensions
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    // use Iceberg SQL extensions
+    return false;
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+      throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support [show partitions] syntax
+    List<InternalRow> rows = Lists.newArrayList();
+    StructType schema = partitionSchema();
+    StructField[] fields = schema.fields();
+    Map<String, String> partitionFilter = Maps.newHashMap();
+    if (names.length > 0) {
+      int idx = 0;
+      while (idx < names.length) {
+        DataType dataType = schema.apply(names[idx]).dataType();
+        partitionFilter.put(names[idx], String.valueOf(ident.get(idx, dataType)));
+        idx += 1;
+      }
+    }
+    String fileFormat = icebergTable.properties()
+        .getOrDefault(TableProperties.DEFAULT_FILE_FORMAT, TableProperties.DEFAULT_FILE_FORMAT_DEFAULT);
+    List<SparkTableUtil.SparkPartition> partitions = Spark3Util.getPartitions(sparkSession(),
Review comment:
Review comment:
This method is only meant for getting the "partitions" of a file-based table. Iceberg should be using its own internal "partitions" metadata table for this info; see
https://github.com/apache/iceberg/blob/f932c55fd8c07c875984889ecfbb1fd7c219f726/core/src/main/java/org/apache/iceberg/PartitionsTable.java#L35
Since this is Spark, we should be using
https://github.com/RussellSpitzer/iceberg/blob/56839c8c07b83edcfd11765ffb7303da811a65fc/spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/SparkTableUtil.java#L610
so that we use the distributed versions of this code.
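For reference, Iceberg already exposes partition-level statistics through its built-in "partitions" metadata table, which Spark SQL can query directly. A minimal sketch (the catalog/namespace/table name `prod.db.sample` is illustrative, not from this PR):

```sql
-- Query Iceberg's "partitions" metadata table instead of listing files:
-- each row describes one partition with its record and file counts.
SELECT partition, record_count, file_count
FROM prod.db.sample.partitions;
```

This keeps the partition listing driven by Iceberg metadata rather than by a file-system scan.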
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]