[
https://issues.apache.org/jira/browse/DRILL-8167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17505913#comment-17505913
]
ASF GitHub Bot commented on DRILL-8167:
---------------------------------------
cgivre opened a new pull request #2494:
URL: https://github.com/apache/drill/pull/2494
# [DRILL-8167](https://issues.apache.org/jira/browse/DRILL-8167): Add JSON Config Options to Format Config
## Description
Almost all Drill format plugins let the user configure options for that
plugin as part of the format config. The one glaring exception is the JSON
reader, whose configuration options can only be set globally. This PR adds
these options to the format config so that users can set them when they
configure a storage plugin.
This PR does not eliminate the global settings for JSON. It simply adds
another place where a user can update the settings. If a setting in the
format config is not defined (`null`), Drill will use the global setting.
The config is set to include these values only when they are not `null`, so
there are no breaking changes.
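The fallback rule described above can be sketched in a few lines of Java. This is only an illustration of the "format config wins, otherwise use the global value" behavior; the class `JsonOptionResolver` and the method `resolve` are hypothetical names and are not taken from the PR or from Drill's code base.

```java
/**
 * Hypothetical sketch of the override rule: a value set in the format
 * config takes precedence; a null (unset) value falls back to the global
 * setting. This is not Drill's actual implementation.
 */
public class JsonOptionResolver {

  /** Returns the format-config value when it is set, otherwise the global value. */
  static boolean resolve(Boolean formatConfigValue, boolean globalValue) {
    return formatConfigValue != null ? formatConfigValue : globalValue;
  }

  public static void main(String[] args) {
    boolean globalAllTextMode = false; // assumed global setting for this example

    // Option not present in the format config: the global value is used.
    System.out.println(resolve(null, globalAllTextMode));         // prints "false"

    // Option set in the format config: it overrides the global value.
    System.out.println(resolve(Boolean.TRUE, globalAllTextMode)); // prints "true"
  }
}
```

Using the boxed `Boolean` type for the format-config value is what makes "not defined" representable as `null`, which is presumably why existing configs without these fields keep their current behavior.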
## Documentation
Drill's JSON reader can be configured with various global configuration
variables. These variables can also be overridden in an individual storage
plugin configuration. The parameters are:
* `allTextMode`: When `true`, Drill will read all fields in a given JSON
file as text.
* `readNumbersAsDouble`: When `true`, Drill will read all numbers as
Doubles. This is useful if your data contains fields with a mix of integers
and floating point numbers. A very common place this happens is when the
record contains `0`.
* `skipMalformedJSONRecords`: When set to `true`, Drill will attempt to
skip malformed JSON records. When `false`, Drill will throw an exception for
bad records.
* `escapeAnyChar`: Allows escaping of any character when set to `true`.
* `nanInf`: Allows `NaN` and `Infinity` in JSON data when set to `true`.
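To illustrate why `readNumbersAsDouble` helps with mixed numeric data, here is a minimal Java sketch of a reader that picks a type per number token. It is a simplified model and not Drill's actual type inference: the class `NumberTypeSketch`, its methods, and the type labels `INTEGER` and `DOUBLE` are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical model of per-token number typing. A value such as 0 looks
 * like an integer while 1.5 looks like a double, so one field can end up
 * with two types unless every number is read as a double.
 */
public class NumberTypeSketch {

  /** Names the type a simple reader might pick for one JSON number token. */
  static String inferType(String token, boolean readNumbersAsDouble) {
    boolean looksFractional =
        token.contains(".") || token.contains("e") || token.contains("E");
    return (readNumbersAsDouble || looksFractional) ? "DOUBLE" : "INTEGER";
  }

  /** Applies inferType to every token of a single field, in record order. */
  static List<String> inferTypes(List<String> tokens, boolean readNumbersAsDouble) {
    List<String> types = new ArrayList<>();
    for (String token : tokens) {
      types.add(inferType(token, readNumbersAsDouble));
    }
    return types;
  }

  public static void main(String[] args) {
    // Values of the same field across three records.
    List<String> tokens = Arrays.asList("0", "1.5", "2");

    System.out.println(inferTypes(tokens, false)); // prints "[INTEGER, DOUBLE, INTEGER]"
    System.out.println(inferTypes(tokens, true));  // prints "[DOUBLE, DOUBLE, DOUBLE]"
  }
}
```

With the option off, the field's type changes from record to record; with it on, every value gets the same type.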
A JSON config could look like this:
```json
...
"json": {
"type": "json",
"extensions": ["json"],
"allTextMode": true,
"readNumbersAsDouble": true,
"skipMalformedJSONRecords": true,
"escapeAnyChar": false,
"nanInf": true
}
...
```
You can also include these values at query time:
```sql
SELECT `integer`, `float`
FROM table(cp.`jsoninput/input2.json` (type => 'json', allTextMode => True))
```
## Testing
Added unit tests.
> Add JSON Config Options to Format Config
> ----------------------------------------
>
> Key: DRILL-8167
> URL: https://issues.apache.org/jira/browse/DRILL-8167
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - JSON
> Affects Versions: 1.20.0
> Reporter: Charles Givre
> Assignee: Charles Givre
> Priority: Major
> Fix For: Future
>
>