rdsr commented on a change in pull request #112: [Refactor][Minor] Refactor
ORCFileAppender to be consistent with Avro/Parquet
URL: https://github.com/apache/incubator-iceberg/pull/112#discussion_r263687567
##########
File path: orc/src/main/java/com/netflix/iceberg/orc/ORC.java
##########
@@ -74,7 +72,7 @@ public WriteBuilder schema(Schema schema) {
return this;
}
- public OrcFileAppender build() {
Review comment:
There's `OrcFileAppender#getSchema` which returns the underlying ORC schema
as a `TypeDescription` .
Also, now that I think about it. This could be problematic .
`OrcFileAppender` uses `VectorizedRowBatch` as a concrete type when
implementing `FileAppender<T>` . Just returning a `FileAppender`, as I've
done here, may not be a valid change in this specific scenario.
In that regard, ORC's `FileAppender` is deviating from Avro and Parquet
where `FileAppender`'s record type is parameterized. Which implies we cannot
use Iceberg's own record type when writing ORC data.
ORCFileAppender seems to be breaking an abstraction boundary here when it is
exposing an internal class [VectorizedRowBatch] to Iceberg's main APIs.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]