jackye1995 commented on a change in pull request #1790:
URL: https://github.com/apache/iceberg/pull/1790#discussion_r555368951
##########
File path: spark/src/main/java/org/apache/iceberg/spark/data/SparkOrcWriter.java
##########
@@ -98,9 +106,9 @@ public SparkOrcValueWriter primitive(Type.PrimitiveType iPrimitive, TypeDescript
case LONG:
return SparkOrcValueWriters.longs();
case FLOAT:
- return SparkOrcValueWriters.floats();
+ return SparkOrcValueWriters.floats(getFieldId(primitive));
Review comment:
nit: `getFieldId` is not used anywhere else, so why not call
`ORCSchemaUtil.fieldId` directly?
##########
File path: flink/src/main/java/org/apache/iceberg/flink/data/FlinkSchemaVisitor.java
##########
@@ -44,17 +44,39 @@
case MAP:
MapType mapType = (MapType) flinkType;
Types.MapType iMapType = iType.asMapType();
-
- T key = visit(mapType.getKeyType(), iMapType.keyType(), visitor);
- T value = visit(mapType.getValueType(), iMapType.valueType(), visitor);
+ T key;
+ T value;
+
+ Types.NestedField keyField = iMapType.field(iMapType.keyId());
+ visitor.beforeMapKey(keyField);
+ try {
+ key = visit(mapType.getKeyType(), iMapType.keyType(), visitor);
+ } finally {
+ visitor.afterMapKey(keyField);
+ }
+
+ Types.NestedField valueField = iMapType.field(iMapType.valueId());
+ visitor.beforeMapValue(valueField);
+ try {
+ value = visit(mapType.getValueType(), iMapType.valueType(), visitor);
+ } finally {
+ visitor.afterMapValue(valueField);
+ }
return visitor.map(iMapType, key, value, mapType.getKeyType(), mapType.getValueType());
case LIST:
ArrayType listType = (ArrayType) flinkType;
Types.ListType iListType = iType.asListType();
+ T element;
- T element = visit(listType.getElementType(), iListType.elementType(), visitor);
+ Types.NestedField elementField = iListType.field(iListType.elementId());
+ visitor.beforeListElement(elementField);
+ try {
+ element = visit(listType.getElementType(), iListType.elementType(), visitor);
+ } finally {
Review comment:
An error should be logged if we catch anything. The same applies to the
try-finally block above.
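
To illustrate the suggestion, here is a minimal sketch of wrapping a visit in before/after hooks while logging and rethrowing any failure. Everything in it is a hypothetical stand-in: `LoggingVisitSketch`, `visitWithHooks`, and the `events` list are not part of the PR or of Iceberg, and `java.util.logging` is used only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the before/after hooks on the visitor in the diff.
class LoggingVisitSketch {
  private static final Logger LOG = Logger.getLogger(LoggingVisitSketch.class.getName());

  // Records hook calls so their ordering can be inspected.
  static final List<String> events = new ArrayList<>();

  // Runs `visit` between the before/after hooks; logs and rethrows any failure.
  static <T> T visitWithHooks(String fieldName, Supplier<T> visit) {
    events.add("before:" + fieldName);
    try {
      return visit.get();
    } catch (RuntimeException e) {
      LOG.log(Level.SEVERE, "Failed to visit field " + fieldName, e);
      throw e; // rethrow so the caller still sees the failure
    } finally {
      events.add("after:" + fieldName); // always runs, like the finally block in the diff
    }
  }

  public static void main(String[] args) {
    String key = visitWithHooks("key", () -> "string");
    System.out.println(key + " " + events);
    try {
      visitWithHooks("value", () -> { throw new IllegalStateException("boom"); });
    } catch (IllegalStateException e) {
      System.out.println("caught " + e.getMessage() + " " + events);
    }
  }
}
```

In this sketch the `after` hook still fires when the visit throws, and the exception is logged once before it propagates to the caller.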
##########
File path: orc/src/main/java/org/apache/iceberg/orc/OrcRowWriter.java
##########
@@ -35,4 +37,9 @@
* @throws IOException if there's any IO error while writing the data value.
*/
void write(T row, VectorizedRowBatch output) throws IOException;
+
+ /**
+ * Returns a stream of {@link FieldMetrics} that this OrcRowWriter keeps track of.
+ */
+ Stream<FieldMetrics> metrics();
Review comment:
Why do some of the `metrics` method signatures have a `default`
implementation, while others below do not?
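
For context on the trade-off, here is a minimal sketch of what a `default` implementation changes for implementers. The types `FieldMetricsSketch`, `RowWriterSketch`, and `CountingWriter` are simplified, hypothetical stand-ins and not the Iceberg interfaces from the PR.

```java
import java.util.stream.Stream;

// Hypothetical, simplified stand-in for a per-field metrics record.
class FieldMetricsSketch {
  final int fieldId;
  final long valueCount;

  FieldMetricsSketch(int fieldId, long valueCount) {
    this.fieldId = fieldId;
    this.valueCount = valueCount;
  }
}

// Hypothetical writer interface: `write` is abstract, `metrics` has a default.
interface RowWriterSketch<T> {
  void write(T row);

  // With a default, existing implementations keep compiling and report no metrics.
  default Stream<FieldMetricsSketch> metrics() {
    return Stream.empty();
  }
}

// An implementation that opts in to metrics by overriding the default.
class CountingWriter implements RowWriterSketch<String> {
  private long count = 0;

  @Override
  public void write(String row) {
    count++;
  }

  @Override
  public Stream<FieldMetricsSketch> metrics() {
    return Stream.of(new FieldMetricsSketch(1, count));
  }
}

class DefaultMetricsDemo {
  public static void main(String[] args) {
    RowWriterSketch<String> plain = row -> { }; // relies on the default metrics()
    CountingWriter counting = new CountingWriter();
    counting.write("a");
    counting.write("b");
    System.out.println(plain.metrics().count());
    System.out.println(counting.metrics().findFirst().get().valueCount);
  }
}
```

Without the `default`, every implementer is forced to provide `metrics()`, which makes missing metrics a compile error rather than a silent empty stream; mixing both styles across sibling interfaces is the inconsistency the comment asks about.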