openinx commented on a change in pull request #3250:
URL: https://github.com/apache/iceberg/pull/3250#discussion_r725779961
##########
File path: orc/src/main/java/org/apache/iceberg/data/orc/GenericOrcWriter.java
##########
@@ -115,53 +116,77 @@ private WriteBuilder() {
}
@Override
- @SuppressWarnings("unchecked")
public void write(Record value, VectorizedRowBatch output) {
- Preconditions.checkArgument(writer instanceof RecordWriter, "writer must
be a RecordWriter.");
+ Preconditions.checkArgument(value != null, "value must not be null");
int row = output.size;
output.size += 1;
- List<OrcValueWriter<?>> writers = ((RecordWriter) writer).writers();
- for (int c = 0; c < writers.size(); ++c) {
- OrcValueWriter child = writers.get(c);
- child.write(row, value.get(c, child.getJavaClass()), output.cols[c]);
- }
+
+ writer.rootNonNullWrite(row, value, output);
+ }
+
+ @Override
+ public List<OrcValueWriter<?>> writers() {
+ return this.writer.writers();
}
@Override
public Stream<FieldMetrics<?>> metrics() {
return writer.metrics();
}
- private static class RecordWriter implements OrcValueWriter<Record> {
+ public abstract static class StructWriter<S> implements OrcValueWriter<S> {
private final List<OrcValueWriter<?>> writers;
- RecordWriter(List<OrcValueWriter<?>> writers) {
+ protected StructWriter(List<OrcValueWriter<?>> writers) {
this.writers = writers;
}
- List<OrcValueWriter<?>> writers() {
- return writers;
+ public List<OrcValueWriter<?>> writers() {
+ return this.writers;
}
@Override
- public Class<Record> getJavaClass() {
- return Record.class;
+ public Stream<FieldMetrics<?>> metrics() {
+ return writers.stream().flatMap(OrcValueWriter::metrics);
}
@Override
- @SuppressWarnings("unchecked")
- public void nonNullWrite(int rowId, Record data, ColumnVector output) {
+ public void nonNullWrite(int rowId, S value, ColumnVector output) {
StructColumnVector cv = (StructColumnVector) output;
+ write(rowId, value, c -> cv.fields[c]);
+ }
+
+ // Special case of writing the root struct
+ public void rootNonNullWrite(int rowId, S value, VectorizedRowBatch
output) {
Review comment:
Why do we need to expose the `rowId` in this `rootNonNullWrite` method ?
The root `OrcRowWriter` always provide a write method like :
```java
public void write(RowData row, VectorizedRowBatch output)
```
So why not just hide the rowId inside the `rootNonNullWrite` like:
```java
public void write(RowData row, VectorizedRowBatch output){
Preconditions.checkArgument(row != null, "value must not be null");
writer.rootNonNullWrite(row, output);
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]