[ 
https://issues.apache.org/jira/browse/DRILL-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16872907#comment-16872907
 ] 

ASF GitHub Bot commented on DRILL-7293:
---------------------------------------

paul-rogers commented on issue #1807: DRILL-7293: Convert the regex ("log") 
plugin to use EVF
URL: https://github.com/apache/drill/pull/1807#issuecomment-505714454
 
 
   @arina-ielchiieva, I was able to get the plugin to work for this query:
   
   ```
   SELECT * FROM table(dfs.tf.table1(
     type => 'logRegex',
     regex => '(\\d\\d\\d\\d)-(\\d\\d)-(\\d\\d) .*',
     maxErrors => 10))
   ```
   
   To do this, I had to fix some of the issues described in DRILL-7298. In 
particular, DRILL-6672 notes that table functions cannot call 
{{setFoo()}} methods the way Jackson can, so table functions work only if the 
format plugin config fields are {{public}}. They were not public for the log 
format plugin, so I changed them to {{public}} to get the above query to work.
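
   As a rough illustration of why field visibility matters, here is a minimal 
sketch that assumes the binding is done by direct reflective field access. The 
`DemoFormatConfig` and `PublicFieldBinder` classes are hypothetical and are not 
Drill's actual code; the sketch only shows that `Class.getField()` finds public 
fields and never calls a setter.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for a format plugin config; not Drill's actual class.
class DemoFormatConfig {
  public String regex;     // public: visible to Class.getField()
  private int maxErrors;   // private: reachable only through the setter

  public void setMaxErrors(int maxErrors) { this.maxErrors = maxErrors; }
  public int getMaxErrors() { return maxErrors; }
}

// Hypothetical binder that assumes public-field access only (an assumption,
// not a copy of the table function implementation).
class PublicFieldBinder {
  /** Sets the named public field; returns false if no such public field exists. */
  static boolean trySet(Object config, String name, Object value) {
    try {
      Field field = config.getClass().getField(name); // public fields only
      field.set(config, value);
      return true;
    } catch (NoSuchFieldException | IllegalAccessException e) {
      return false; // private field: the setter is never tried
    }
  }

  public static void main(String[] args) {
    DemoFormatConfig config = new DemoFormatConfig();
    System.out.println(trySet(config, "regex", "(\\d+) .*")); // true
    System.out.println(trySet(config, "maxErrors", 10));      // false
  }
}
```

   Under that assumption, a private field with a setter is simply invisible to 
the binder, while a Jackson-style deserializer would call `setMaxErrors()`.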
   
   If we look at the code in 
[`FormatPluginOptionsDescriptor.createConfigForTable()`](https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FormatPluginOptionsDescriptor.java#L123),
 we'll see that there is nothing that would handle the `values` syntax 
suggested in your note. The only supported types are Java primitives.
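
   To make the limitation concrete, here is a hedged sketch of a converter 
that handles only simple scalar types. `SimpleParamConverter` is a hypothetical 
helper, not the code in `FormatPluginOptionsDescriptor`; treating `String` as 
supported alongside the primitives is my assumption, based on the `regex` 
argument working in the query above.

```java
import java.util.List;

// Hypothetical helper for illustration; not Drill's actual implementation.
class SimpleParamConverter {
  /** Converts a raw argument string to the declared field type, scalars only. */
  static Object convert(Class<?> fieldType, String raw) {
    if (fieldType == String.class) {
      return raw;
    } else if (fieldType == int.class) {
      return Integer.parseInt(raw);
    } else if (fieldType == long.class) {
      return Long.parseLong(raw);
    } else if (fieldType == boolean.class) {
      return Boolean.parseBoolean(raw);
    } else if (fieldType == double.class) {
      return Double.parseDouble(raw);
    }
    // Anything structured, such as a List-typed field, has no conversion path.
    throw new IllegalArgumentException(
        "Unsupported parameter type: " + fieldType.getName());
  }

  public static void main(String[] args) {
    System.out.println(convert(int.class, "10")); // 10
    try {
      convert(List.class, "values('month', 'VARCHAR')");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

   With a converter shaped like this, a list-typed field such as `schema` has 
nowhere to go, whatever syntax the SQL parser accepts.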
   
   When I tried this query:
   
   ```
   SELECT * FROM table(dfs.tf.noGroups(
     type => 'logRegex',
     regex => '(\\d\\d\\d\\d)-(\\d\\d)-(\\d\\d) .*',
     `schema`=>values('month', 'VARCHAR')))
   ```
   
   I got this result:
   
   ```
   PARSE ERROR: Encountered "values" at line 1, column 115.
   
   SQL Query: SELECT * FROM table(dfs.tf.noGroups(type => 'logRegex', regex => '(\\d\\d\\d\\d)-(\\d\\d)-(\\d\\d) .*', `schema`=>values('month', 'VARCHAR')))
                                                                                                                                ^
   ```
   
   So it looks like the {{values}} trick does not work. Even if it did, the 
code that produces the values argument would use some kind of Java collection 
that would not match the {{List<LogFormatField>}} type of the {{schema}} field.
 


> Convert the regex ("log") plugin to use EVF
> -------------------------------------------
>
>                 Key: DRILL-7293
>                 URL: https://issues.apache.org/jira/browse/DRILL-7293
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.16.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Major
>             Fix For: 1.17.0
>
>
> The "log" plugin (which uses a regex to define the row format) is the subject 
> of Chapter 12 of the Learning Apache Drill book (though the version in the 
> book is simpler than the one in the master branch).
> The recently-completed "Enhanced Vector Framework" (EVF, AKA the "row set 
> framework") gives Drill control over the size of batches created by readers, 
> and allows readers to use the recently-added provided schema mechanism.
> We wish to use the log reader as an example of how to convert a Drill format 
> plugin to use the EVF so that other developers can convert their own plugins.
> This PR provides the first set of log plugin changes to enable us to publish 
> a tutorial on the EVF.


