cgivre opened a new pull request #2494:
URL: https://github.com/apache/drill/pull/2494


   # [DRILL-8167](https://issues.apache.org/jira/browse/DRILL-8167): Add JSON Config Options to Format Config
   
   ## Description
   Almost all Drill format plugins allow the user to configure plugin options 
as part of the format config.  The one glaring exception is the JSON reader, 
which has several configuration options that can only be set globally.  This PR 
adds these options to the format config so that users can set them when they 
configure a storage plugin.  
   
   This PR does not eliminate the global settings for JSON.  It simply adds 
another place where a user can set them.  If a setting is not defined (`null`) 
in the format config, Drill uses the corresponding global setting.
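
The fallback rule described above can be sketched as follows.  This is an illustrative Python model, not Drill's Java implementation: the function `resolve_option` and the `GLOBAL_OPTIONS` dictionary are hypothetical names, and the global values shown are placeholders rather than Drill's documented defaults.

```python
from typing import Optional

# Hypothetical global (system/session) settings; values are illustrative only.
GLOBAL_OPTIONS = {
    "allTextMode": False,
    "readNumbersAsDouble": False,
    "skipMalformedJSONRecords": False,
    "escapeAnyChar": False,
    "nanInf": True,
}


def resolve_option(name: str, format_config: dict) -> bool:
    """Return the format-config value if it is set, else the global value."""
    value: Optional[bool] = format_config.get(name)
    # A missing key and an explicit null (None) both count as "not defined".
    return GLOBAL_OPTIONS[name] if value is None else value


# A format config that overrides one option and leaves the rest undefined.
format_config = {"type": "json", "allTextMode": True, "nanInf": None}
resolved = {name: resolve_option(name, format_config) for name in GLOBAL_OPTIONS}
```

In this sketch only `allTextMode` comes from the format config; every other option resolves to its global value, which is the behavior the description promises for existing configurations.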
   
   The config is set to only include these values when they are not `null`, so 
there are no breaking changes.
   
   ## Documentation
   Drill's JSON reader can be configured with various global configuration 
variables.  These variables can also be overridden in an individual storage 
plugin configuration.  The parameters are:
   
   * `allTextMode`:  When `true`, Drill will read all fields in a given JSON 
file as text.
   * `readNumbersAsDouble`:  When `true`, Drill will read all numbers as 
doubles.  This is useful if a field contains a mix of integers and 
floating-point numbers, which commonly happens when a floating-point value is 
written as `0`.
   * `skipMalformedJSONRecords`:  When set to `true`, Drill will attempt to 
skip malformed JSON records.  When `false`, Drill will throw an exception for 
bad records.
   * `escapeAnyChar`:  Allows escaping of any character when set to `true`. 
   * `nanInf`:  Allows `NaN` and `Infinity` in JSON data when set to `true`. 
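
To illustrate why `readNumbersAsDouble` helps with mixed numeric data, here is a small analogy using Python's standard `json` module.  It is not Drill's reader; the field name `price` and the sample records are made up for the example.

```python
import json

# Two records for the same field: one written as an integer, one as a float.
records = ['{"price": 0}', '{"price": 1.5}']

# Default parsing: the inferred type changes between records.
inferred = [type(json.loads(r)["price"]).__name__ for r in records]

# Analogue of readNumbersAsDouble: parse every integer literal as a float.
as_double = [
    type(json.loads(r, parse_int=float)["price"]).__name__ for r in records
]
```

With default parsing the field is an integer in the first record and a float in the second; forcing integers to floats gives the field a single consistent type.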
   
   A JSON config could look like this:
   
   ```json
   ...
   "json": {
      "type": "json",
      "extensions": ["json"],
      "allTextMode": true,
      "readNumbersAsDouble": true,
      "skipMalformedJSONRecords": true,
      "escapeAnyChar": false,
      "nanInf": true
   }
   ...
   ```
   
   You can also include these values at query time:
   
   ```sql
   SELECT `integer`, `float`
   FROM table(cp.`jsoninput/input2.json` (type => 'json', allTextMode => true))
   ```
   
   ## Testing
   Added unit tests.

