[
https://issues.apache.org/jira/browse/DRILL-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17045485#comment-17045485
]
ASF GitHub Bot commented on DRILL-7601:
---------------------------------------
arina-ielchiieva commented on pull request #1993: DRILL-7601: Shift column
conversion to reader from scan framework
URL: https://github.com/apache/drill/pull/1993#discussion_r384448178
##########
File path:
exec/java-exec/src/main/java/org/apache/drill/exec/store/avro/AvroBatchReader.java
##########
@@ -69,11 +69,15 @@ public boolean open(FileScanFramework.FileSchemaNegotiator negotiator) {
negotiator.userName(),
negotiator.context().getFragmentContext().getQueryUserName());
logger.debug("Avro file schema: {}", reader.getSchema());
- TupleMetadata schema = AvroSchemaUtil.convert(reader.getSchema());
- logger.debug("Avro file converted schema: {}", schema);
- negotiator.setTableSchema(schema, true);
+ TupleMetadata readerSchema = AvroSchemaUtil.convert(reader.getSchema());
+ logger.debug("Avro file converted schema: {}", readerSchema);
+ TupleMetadata providedSchema = negotiator.providedSchema();
+ TupleMetadata tableSchema = StandardConversions.mergeSchemas(providedSchema, readerSchema);
Review comment:
Why should the reader merge the schemas? Why not do this implicitly in the
negotiator?
My concern is that the need to merge the schemas can easily be forgotten
when implementing a reader.
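To illustrate the suggestion, here is a minimal, self-contained sketch of a
negotiator that merges the schemas itself, so a reader only hands over its own
schema. Everything in it is hypothetical and is not Drill's API: schemas are
modeled as ordered maps of column name to type name, and the merge rule (a
provided type overrides the reader type for columns the reader actually has)
is an assumption, not necessarily what StandardConversions.mergeSchemas does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; none of these types are Drill classes.
final class SchemaMergeSketch {

  // Assumed merge rule: keep the reader's columns and their order; where the
  // provided schema names the same column, its type replaces the reader's type.
  static Map<String, String> mergeSchemas(Map<String, String> provided,
                                          Map<String, String> reader) {
    Map<String, String> merged = new LinkedHashMap<>(reader);
    if (provided != null) {
      for (Map.Entry<String, String> col : provided.entrySet()) {
        // Provided columns that the reader does not have are ignored here.
        if (merged.containsKey(col.getKey())) {
          merged.put(col.getKey(), col.getValue());
        }
      }
    }
    return merged;
  }

  // Hypothetical negotiator that merges implicitly, so no reader can forget to.
  static final class Negotiator {
    private final Map<String, String> providedSchema;
    private Map<String, String> tableSchema;

    Negotiator(Map<String, String> providedSchema) {
      this.providedSchema = providedSchema;
    }

    // The reader passes only its own schema; the merge happens here.
    void setTableSchema(Map<String, String> readerSchema) {
      this.tableSchema = mergeSchemas(providedSchema, readerSchema);
    }

    Map<String, String> tableSchema() {
      return tableSchema;
    }
  }
}
```

With this shape, a reader would call only setTableSchema(readerSchema), and
the provided schema would never appear in reader code.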
> Shift column conversion to reader from scan framework
> -----------------------------------------------------
>
> Key: DRILL-7601
> URL: https://issues.apache.org/jira/browse/DRILL-7601
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.17.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Major
> Fix For: 1.18.0
>
>
> At the time we implemented provided schemas with the text reader, the best
> path forward appeared to be to perform column type conversions within the
> scan framework, including deep in the column writer structure.
> Experience with other readers has shown that the text reader is a special
> case: it always writes strings, which Drill-provided converters can parse
> into other types. Other readers, however, are not so simple: they often have
> their own source structures which must be mated to a column reader, and so
> conversion is generally best done in the reader where it can be specific to
> the nuances of each reader.
> This ticket asks to restructure the conversion code to fit the
> reader-does-conversion pattern.