paul-rogers commented on a change in pull request #1696: DRILL-7095: Expose
table schema (TupleMetadata) to physical operator (EasySubScan)
URL: https://github.com/apache/drill/pull/1696#discussion_r266213186
##########
File path:
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/easy/EasyGroupScan.java
##########
@@ -78,30 +80,41 @@ public EasyGroupScan(
@JsonProperty("format") FormatPluginConfig formatConfig,
@JacksonInject StoragePluginRegistry engineRegistry,
@JsonProperty("columns") List<SchemaPath> columns,
- @JsonProperty("selectionRoot") Path selectionRoot
+ @JsonProperty("selectionRoot") Path selectionRoot,
+ @JsonProperty("schema") TupleSchema schema
) throws IOException, ExecutionSetupException {
this(ImpersonationUtil.resolveUserName(userName),
FileSelection.create(null, files, selectionRoot),
(EasyFormatPlugin<?>)engineRegistry.getFormatPlugin(storageConfig,
formatConfig),
columns,
- selectionRoot);
+ selectionRoot,
+ schema);
}
public EasyGroupScan(String userName, FileSelection selection,
EasyFormatPlugin<?> formatPlugin, Path selectionRoot)
throws IOException {
- this(userName, selection, formatPlugin, ALL_COLUMNS, selectionRoot);
+ this(userName, selection, formatPlugin, ALL_COLUMNS, selectionRoot, null);
+ }
+
+ public EasyGroupScan(String userName, FileSelection selection,
EasyFormatPlugin<?> formatPlugin,
+ List<SchemaPath> columns, Path selectionRoot) throws
IOException {
+ this(userName, selection, formatPlugin, columns, selectionRoot, null);
}
public EasyGroupScan(
String userName,
FileSelection selection,
EasyFormatPlugin<?> formatPlugin,
List<SchemaPath> columns,
- Path selectionRoot
+ Path selectionRoot,
+ TupleMetadata schema
) throws IOException{
super(userName);
this.selection = Preconditions.checkNotNull(selection);
this.formatPlugin = Preconditions.checkNotNull(formatPlugin, "Unable to
load format plugin for provided format config.");
+ if (schema != null) {
+ this.formatPlugin.setSchema(schema);
Review comment:
Not sure this makes sense. The format plugin is shared across multiple
scans; I don't believe it is copied anew for each scan. A schema set on the
plugin would therefore be visible to every scan that uses it. So, I think the
schema should be a property of the group scan, not the plugin.
Also, the plugin is not serialized; it is created anew (IIRC) on each node, so
a schema stored on it would not be carried along with the serialized scan.
Note that once the schema is an attribute of the group scan, it also needs to
be set in the copy constructor. Since we don't expect the schema to change, the
copy can simply reuse the same schema instance as the original; there is no
need to copy the schema itself. Please add a comment to explain this.
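
A rough sketch of what I mean, using hypothetical stand-in classes
(SketchSchema, SketchFormatPlugin, SketchGroupScan) rather than the real Drill
types, and assuming the schema is treated as read-only once the scan is built.
It is only meant to show where the field lives and how the copy constructor
shares it, not the actual change:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for the table schema; read-only after construction.
final class SketchSchema {
  private final List<String> columnNames;

  SketchSchema(List<String> columnNames) {
    this.columnNames = Collections.unmodifiableList(new ArrayList<>(columnNames));
  }

  List<String> columnNames() {
    return columnNames;
  }
}

// Hypothetical stand-in for the format plugin. It is shared across scans,
// so it holds no per-scan state such as a schema.
final class SketchFormatPlugin {
}

// Hypothetical stand-in for the group scan, which owns the schema.
final class SketchGroupScan {
  private final SketchFormatPlugin formatPlugin;
  private final SketchSchema schema; // may be null when no schema is provided

  SketchGroupScan(SketchFormatPlugin formatPlugin, SketchSchema schema) {
    this.formatPlugin = Objects.requireNonNull(formatPlugin, "format plugin is required");
    this.schema = schema;
  }

  // Copy constructor.
  SketchGroupScan(SketchGroupScan that) {
    this.formatPlugin = that.formatPlugin;
    // The schema is not expected to change, so the copy reuses the
    // original's instance instead of making a deep copy.
    this.schema = that.schema;
  }

  SketchFormatPlugin getFormatPlugin() {
    return formatPlugin;
  }

  SketchSchema getSchema() {
    return schema;
  }
}
```

With this shape, two scans built on the same plugin can carry different
schemas, and the plugin itself never needs a setSchema() call.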