[ 
https://issues.apache.org/jira/browse/DRILL-6167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16369599#comment-16369599
 ] 

ASF GitHub Bot commented on DRILL-6167:
---------------------------------------

Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/1114
  
    See [this 
example](https://github.com/paul-rogers/drill/tree/regex-plugin/exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/regex),
 and [this 
test](https://github.com/paul-rogers/drill/blob/regex-plugin/exec/java-exec/src/test/java/org/apache/drill/exec/store/easy/regex/TestRegexReader.java)
 for examples of one way to address some of the comments made in the code 
review. That example handles projection, which was mentioned in one of the 
review comments.
    
    Since the regex is file-specific, the regex format plugin is most useful if 
it can be configured per-file using table functions. See 
[DRILL-6167](https://issues.apache.org/jira/browse/DRILL-6167), 
[DRILL-6168](https://issues.apache.org/jira/browse/DRILL-6168) and 
[DRILL-6169](https://issues.apache.org/jira/browse/DRILL-6169) for the problems 
you will encounter when doing so. See the example and tests above for how to work 
around these bugs.
    
    Although I suggested looking at the `ResultSetLoader`, it is premature to 
do so: that mechanism depends on supporting code that has not yet been 
committed to master. So, we need to work with what we have now. See the example 
reader above for how to cache the per-column mutator needed to write to vectors, 
which avoids the switch statements in the code in this PR.
    
    To generalize, the example has an object per column that holds per-column 
info. The example uses only `VarChar` columns. To add additional types, create 
a base column state class with subclasses for each type. Then, simply call a 
`save(String value)` method to write a column. That method can handle nulls 
(for projected non-existent columns) and type conversions (where needed).
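
    As a rough illustration of the column state idea above, here is a minimal, 
self-contained sketch. It is not the code from the linked example: the class 
names (`ColumnState`, `VarCharColumnState`, `IntColumnState`, 
`NullColumnState`, `RegexRowWriter`) are hypothetical, and a plain Java list 
stands in for the Drill vector and its mutator so that the sketch runs without 
Drill on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: one instance per projected column.
// The 'values' list is a stand-in for a Drill value vector and its mutator.
abstract class ColumnState {
  final String name;
  final List<Object> values = new ArrayList<>();

  ColumnState(String name) {
    this.name = name;
  }

  // Writes one value for the current row. A null input is stored as null.
  final void save(String value) {
    values.add(value == null ? null : convert(value));
  }

  // Type-specific conversion from the regex group text.
  abstract Object convert(String value);
}

// Stores the matched text as-is.
class VarCharColumnState extends ColumnState {
  VarCharColumnState(String name) { super(name); }

  @Override
  Object convert(String value) { return value; }
}

// Converts the matched text to an integer.
class IntColumnState extends ColumnState {
  IntColumnState(String name) { super(name); }

  @Override
  Object convert(String value) { return Integer.valueOf(value.trim()); }
}

// A projected column that does not exist in the file: always null.
class NullColumnState extends ColumnState {
  NullColumnState(String name) { super(name); }

  @Override
  Object convert(String value) { return null; }
}

// Hypothetical per-row loop: no switch on type, just one save() per column.
class RegexRowWriter {
  static void writeRow(List<ColumnState> columns, String[] groups) {
    for (int i = 0; i < columns.size(); i++) {
      // Columns beyond the number of regex groups receive null.
      String value = i < groups.length ? groups[i] : null;
      columns.get(i).save(value);
    }
  }
}
```

    The reader would build the list of column states once, at setup time, from 
the projection list, and then call `writeRow()` for each matched line. In a real 
reader each subclass would hold the cached mutator for its vector rather than 
a list.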
    
    Finally, feel free to borrow liberally from the example. (The example was 
created for our Drill book, so is fair game to reuse.)


> Table functions give error without hidden type field
> ----------------------------------------------------
>
>                 Key: DRILL-6167
>                 URL: https://issues.apache.org/jira/browse/DRILL-6167
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.12.0
>            Reporter: Paul Rogers
>            Priority: Major
>
> Drill provides [table 
> functions|https://drill.apache.org/docs/plugin-configuration-basics/] (see 
> _Using the Formats Attributes as Table Function Parameters_) which allow 
> queries to specify properties of format plugins.
> All format plugin configs, which table functions populate, implement the {{FormatPluginConfig}} interface:
> {code}
> @JsonTypeInfo(use = JsonTypeInfo.Id.NAME, 
>       include = JsonTypeInfo.As.PROPERTY,
>       property="type")
> public interface FormatPluginConfig { }
> {code}
> The annotation above appears to define a property called {{type}} to identify 
> the subtype of the base class, and is used when deserializing JSON for the 
> config object.
> Suppose we define a "regex plugin" to let us read a log file using a regex. 
> We define a plugin config for this plugin:
> {code}
> @JsonTypeName("regex")
> @JsonInclude(Include.NON_DEFAULT)
> public class RegexFormatConfig implements FormatPluginConfig {
>   public String regex;
> ...
> {code}
> For the above, everything works just fine if we use the config in the normal 
> way (defined in the Drill web console or programmatically in a test).
> Suppose we want to change the regex in a table function:
> {code}
> SELECT * FROM table(cp.`regex/simple.log2`
>   (regex => 'some pattern'))
> {code}
> When run (in the debugger, as a test), we get the following error:
> {code}
> org.apache.calcite.runtime.CalciteContextException: From line 1,
>     column 24 to line 2, column 40:
>     DEFAULT is only allowed for optional parameters
> {code}
> The error is thrown in {{SqlOperator.checkOperandTypes()}} which calls 
> {{FamilyOperandTypeChecker.isOptional()}} which calls 
> {{WorkspaceSchemaFactory.WithOptionsTableMacro.getParameters()}} which has 
> somehow decided that the {{type}} "parameter" is required.
> OK, if it is required, let's provide it:
> {code}
> SELECT * FROM table(cp.`regex/simple.log2`
>   (type => 'regex', regex => 'some pattern'))
> {code}
> The above SQL works as expected, producing the proper output.
> Since the type is hidden, and is known only to Java developers, the code 
> should not require that the user specify it, especially since there is 
> exactly one correct value.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
