[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #207: Add external schema mappings for files written with name-based schemas #40

GitBox Wed, 09 Oct 2019 10:40:51 -0700

rdblue commented on a change in pull request #207: Add external schema mappings 
for files written with name-based schemas #40
URL: https://github.com/apache/incubator-iceberg/pull/207#discussion_r333144120


 ##########
 File path: core/src/main/java/org/apache/iceberg/avro/ProjectionDatumReader.java
 ##########
 @@ -26,29 +26,33 @@
 import org.apache.avro.Schema;
 import org.apache.avro.io.DatumReader;
 import org.apache.avro.io.Decoder;
+import org.apache.iceberg.mapping.NameMapping;
 import org.apache.iceberg.types.TypeUtil;
 
 public class ProjectionDatumReader<D> implements DatumReader<D> {
   private final Function<Schema, DatumReader<?>> getReader;
   private final org.apache.iceberg.Schema expectedSchema;
   private final Map<String, String> renames;
+  private final NameMapping nameMapping;
   private Schema readSchema = null;
   private Schema fileSchema = null;
   private DatumReader<D> wrapped = null;
 
   public ProjectionDatumReader(Function<Schema, DatumReader<?>> getReader,
                                org.apache.iceberg.Schema expectedSchema,
-                               Map<String, String> renames) {
+                               Map<String, String> renames,
+                               NameMapping nameMapping) {
     this.getReader = getReader;
     this.expectedSchema = expectedSchema;
     this.renames = renames;
+    this.nameMapping = nameMapping;
   }
 
   @Override
   public void setSchema(Schema newFileSchema) {
     this.fileSchema = newFileSchema;
     Set<Integer> projectedIds = TypeUtil.getProjectedIds(expectedSchema);
-    Schema prunedSchema = AvroSchemaUtil.pruneColumns(newFileSchema, projectedIds);
+    Schema prunedSchema = AvroSchemaUtil.pruneColumns(newFileSchema, projectedIds, nameMapping);
 
 Review comment:
   I think I mentioned this elsewhere, but the behavior should be:
   
   1. If there is a name mapping in table metadata, pass it in.
   2. Otherwise, if the incoming schema has no field IDs, infer a mapping from the current schema and use it.
   3. Otherwise, the schema has at least one field ID, so no inferred mapping should be used.
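
   The three cases above could be sketched roughly as follows. This is only an illustration, not code from the PR: `NameMappingChoice`, `choose`, and the `fileSchemaHasFieldIds` flag are hypothetical names, and the type parameter `M` stands in for `NameMapping` so the snippet compiles without Iceberg on the classpath.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical helper for illustration only; not part of the Iceberg API.
final class NameMappingChoice {
  private NameMappingChoice() {}

  // M stands in for org.apache.iceberg.mapping.NameMapping.
  static <M> Optional<M> choose(Optional<M> tableMapping,
                                boolean fileSchemaHasFieldIds,
                                Supplier<M> inferFromCurrentSchema) {
    // 1. A mapping stored in table metadata always wins.
    if (tableMapping.isPresent()) {
      return tableMapping;
    }
    // 2. No field IDs in the file schema: fall back to a mapping
    //    inferred from the current table schema.
    if (!fileSchemaHasFieldIds) {
      return Optional.of(inferFromCurrentSchema.get());
    }
    // 3. The file schema has at least one field ID: use no mapping.
    return Optional.empty();
  }
}
```

   Passing the inference step as a `Supplier` means the mapping is only computed in case 2, when it is actually needed.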
