RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r784094937
##########
File path:
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +284,67 @@ public void deleteWhere(Filter[] filters) {
}
}
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();
+    List<PartitionField> fields = icebergTable.spec().fields();
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(fields.size());
+    fields.forEach(f -> {
+      Type resultType = Types.StringType.get();
+      Type sourceType = schema.findType(f.sourceId());
+      if (!f.name().endsWith("hour") && !f.name().endsWith("month")) {
+        resultType = f.transform().getResultType(sourceType);
+      }
+      structFields.add(Types.NestedField.optional(f.fieldId(), f.name(), resultType));
+    });
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    // use Iceberg SQL extensions
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    // use Iceberg SQL extensions
+    return false;
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+      throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support [show partitions] syntax
+    List<InternalRow> rows = Lists.newArrayList();
+    StructType schema = partitionSchema();
+    StructField[] fields = schema.fields();
+    Map<String, String> partitionFilter = Maps.newHashMap();
+    if (names.length > 0) {
+      int idx = 0;
+      while (idx < names.length) {
+        DataType dataType = schema.apply(names[idx]).dataType();
+        partitionFilter.put(names[idx], String.valueOf(ident.get(idx, dataType)));
+        idx += 1;
+      }
+    }
+    String fileFormat = icebergTable.properties()
+        .getOrDefault(TableProperties.DEFAULT_FILE_FORMAT, TableProperties.DEFAULT_FILE_FORMAT_DEFAULT);
+    List<SparkTableUtil.SparkPartition> partitions = Spark3Util.getPartitions(sparkSession(),
Review comment:
Review comment:
This method is only meant for getting the "partitions" of a file-based table. Iceberg should be using its own internal "partitions" metadata table for this info; see
https://github.com/apache/iceberg/blob/f932c55fd8c07c875984889ecfbb1fd7c219f726/core/src/main/java/org/apache/iceberg/PartitionsTable.java#L35
Since this is Spark, we should be using
https://github.com/RussellSpitzer/iceberg/blob/56839c8c07b83edcfd11765ffb7303da811a65fc/spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/SparkTableUtil.java#L610
so that we use the distributed versions of this code.
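For reference, Iceberg already exposes partition-level statistics through its built-in "partitions" metadata table, which Spark SQL can query directly. A minimal sketch (the catalog/namespace/table name `prod.db.sample` is illustrative, not from this PR):

```sql
-- Query Iceberg's "partitions" metadata table instead of listing files:
-- each row describes one partition with its record and file counts.
SELECT partition, record_count, file_count
FROM prod.db.sample.partitions;
```

This keeps the partition listing driven by Iceberg metadata rather than by a file-system scan.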
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]