cgivre commented on code in PR #2836:
URL: https://github.com/apache/drill/pull/2836#discussion_r1376780198


##########
contrib/format-daffodil/src/main/java/org/apache/drill/exec/store/daffodil/DaffodilBatchReader.java:
##########
@@ -64,64 +69,97 @@ public DaffodilBatchReader (DaffodilReaderConfig readerConfig, EasySubScan scan,
     this.validationMode = formatConfig.getValidationMode();
 
     //
-    // FIXME: Next, a MIRACLE occurs.
+    // FIXME: Where is this config file to be found? And, what is its syntax?

Review Comment:
   I think we're talking past each other a bit, so let me back up and explain 
how Drill handles configurations.  When I was talking about configs, I meant 
the parameters that the format plugin needs.  
   
   ## Format Configurations
   When you created the format plugin, the first file you likely wrote was 
`DaffodilFormatPlugin`, which extends the `EasyFormatPlugin` class.  As the 
type parameter of that generic class, you supplied `DaffodilFormatConfig`. 
   
   ```java
   public class DaffodilFormatPlugin extends EasyFormatPlugin<DaffodilFormatConfig>
   ```
   
   By doing this, you've created the format plugin and associated it with a 
configuration object: `DaffodilFormatConfig`.   By convention we name these 
configs `XXXFormatConfig` or `XXXStorageConfig`, but you could really call it 
whatever you want as long as the class implements the `FormatPluginConfig` 
interface.
   
   Let's say that we have a format called `foo`, and we've defined one 
parameter called `bar` in the `FooFormatConfig` class.   Whenever you create a 
new instance of a file system connection (like HDFS, CP, dfs, etc.), that file 
system configuration has a list of formats, which looks something like this:
   
   ```json
   "formats" : {
     "psv" : {
       "type" : "text",
       "extensions" : [ "tbl" ],
       "fieldDelimiter" : "|"
     },
     "csv" : {
       "type" : "text",
       "extensions" : [ "csv" ],
       "fieldDelimiter" : ",",
       "extractHeader" : true
     }
   }
   ```
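
   To make the `foo`/`bar` example concrete, here is a minimal sketch of what 
such a config class might look like.  Everything in it is illustrative: the 
nested `FormatPluginConfig` is only a local stand-in so the snippet compiles 
without Drill on the classpath, and `FooFormatConfig` and `bar` are the 
hypothetical names from above.  A real Drill config would typically also carry 
Jackson annotations so Drill can serialize it; those are omitted here.

   ```java
   import java.util.Objects;

   // Sketch only: none of these types are Drill's real classes.
   class FooConfigSketch {

     // Local stand-in for Drill's FormatPluginConfig interface (assumption:
     // used here purely as a marker so the example is self-contained).
     interface FormatPluginConfig {}

     // Hypothetical config for a hypothetical "foo" format with one
     // user-settable parameter, "bar".
     static final class FooFormatConfig implements FormatPluginConfig {
       private final String bar;

       FooFormatConfig(String bar) {
         this.bar = bar;
       }

       String getBar() {
         return bar;
       }

       // Value-based equality, so two configs with the same parameters
       // compare as equal.
       @Override
       public boolean equals(Object o) {
         if (this == o) {
           return true;
         }
         if (!(o instanceof FooFormatConfig)) {
           return false;
         }
         return Objects.equals(bar, ((FooFormatConfig) o).bar);
       }

       @Override
       public int hashCode() {
         return Objects.hash(bar);
       }
     }

     public static void main(String[] args) {
       FooFormatConfig config = new FooFormatConfig("baz");
       System.out.println("bar = " + config.getBar());
     }
   }
   ```

   The point of the sketch is only that the config is a plain value object: 
one field per parameter, a getter, and value-based `equals`/`hashCode`.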
   
   Using the Drill UI, the user can configure these parameters.  Drill stores 
the actual values as a JSON file in ZooKeeper; however, that process is handled 
by Drill's internals and isn't something that the format plugin has to manage.  
A user should never edit these files manually.  They should change them only 
via Drill's UI, and there is a checksum to enforce that.
   
   As mentioned earlier, the user can override these parameters at query time 
by using the `table()` functions.  
   
   Drill will handle loading the configuration from this JSON file for you, and 
the Daffodil format plugin does not need to do anything for that. 
   
   The bottom line is that any parameters you define in your config class 
should be available via `plugin.getConfig()`.  Note that there is also a 
`readerConfig` object.  It is meant for more complex plugins, and I don't think 
we need to use it for anything in this example.
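
   As a rough illustration of that last point, here is a sketch of a reader 
picking up a parameter through `plugin.getConfig()`.  All three classes are 
hypothetical stand-ins written for this example (they are not Drill's real 
plugin or reader types), and `bar` is the made-up parameter from earlier.

   ```java
   // Sketch only: stand-in types that mirror the plugin -> config -> reader flow.
   class ReaderConfigSketch {

     // Hypothetical config holding one parameter.
     static final class FooFormatConfig {
       private final String bar;

       FooFormatConfig(String bar) {
         this.bar = bar;
       }

       String getBar() {
         return bar;
       }
     }

     // Hypothetical plugin: it is handed an already-populated config and
     // simply exposes it.  In Drill, loading the stored values is done for you.
     static final class FooFormatPlugin {
       private final FooFormatConfig config;

       FooFormatPlugin(FooFormatConfig config) {
         this.config = config;
       }

       FooFormatConfig getConfig() {
         return config;
       }
     }

     // Hypothetical reader: it copies the parameters it needs from the
     // plugin's config at construction time and never touches any file itself.
     static final class FooBatchReader {
       private final String bar;

       FooBatchReader(FooFormatPlugin plugin) {
         this.bar = plugin.getConfig().getBar();
       }

       String describe() {
         return "bar=" + bar;
       }
     }

     public static void main(String[] args) {
       FooFormatPlugin plugin = new FooFormatPlugin(new FooFormatConfig("baz"));
       FooBatchReader reader = new FooBatchReader(plugin);
       System.out.println(reader.describe());
     }
   }
   ```

   The reader never needs to know where the values were stored or how they 
were parsed; it only asks the plugin for its config.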
   
   Does this help?  I hope this answers your questions.


