cgivre commented on code in PR #2526:
URL: https://github.com/apache/drill/pull/2526#discussion_r864097045
##########
contrib/storage-http/JSON_Options.md:
##########
@@ -0,0 +1,125 @@
+# JSON Options and Configuration
+
+Drill has a collection of JSON configuration options that control how Drill interprets JSON files. These are set at the global level; however, the HTTP plugin
+allows you to configure these options individually per connection and override the Drill defaults. The options are:
+
+* `allowNanInf`: Configures the connection to accept `NaN` and `Inf` as numeric values when reading JSON.
+* `allTextMode`: By default, Drill attempts to infer data types from JSON data. If the data is malformed, Drill may throw schema change exceptions. If your data is
+ inconsistent, you can enable `allTextMode`. When it is `true`, Drill reads all JSON values as strings rather than trying to infer the data type.
+* `readNumbersAsDouble`: By default, Drill attempts to distinguish between integers, floating point numbers, and strings. When the data is inconsistent, Drill may
+ throw schema change exceptions. In addition to `allTextMode`, you can make Drill less sensitive by setting `readNumbersAsDouble` to `true`, which causes Drill to read all
+ numeric fields in JSON data as the `double` data type rather than trying to distinguish between ints and doubles.
+* `enableEscapeAnyChar`: Allows any character to be escaped with a backslash (`\`).
+* `skipMalformedRecords`: Allows Drill to skip malformed records and recover
without throwing exceptions.
+* `skipMalformedDocument`: Allows Drill to skip entire malformed documents
without throwing errors.
+
+All of these can be set by adding a `jsonOptions` block to your connection configuration, as shown below:
+
+```json
+
+"jsonOptions": {
+ "allTextMode": true,
+ "readNumbersAsDouble": true
+}
+
+```
+
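+For context, here is a rough sketch of where the `jsonOptions` block might sit inside a complete HTTP storage plugin configuration. The connection name `example` and the URL are hypothetical placeholders, and the surrounding fields are assumed from the plugin's general configuration layout rather than taken from this section, so check them against your own configuration.
+
+```json
+{
+  "type": "http",
+  "connections": {
+    "example": {
+      "url": "https://api.example.com/data",
+      "method": "GET",
+      "jsonOptions": {
+        "allTextMode": true,
+        "readNumbersAsDouble": true
+      }
+    }
+  },
+  "enabled": true
+}
+```
+
+Because the options live on the connection, other connections in the same plugin would presumably keep using the global Drill defaults.
+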
+## Schema Provisioning
+One of the challenges of querying APIs is inconsistent data. Drill allows you
to provide a schema for individual endpoints. You can do this in one of three
ways:
+
+1. By providing a schema inline. See [Specifying Schema as Table Function Parameter](https://drill.apache.org/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter).
+2. By providing a schema in the configuration for the endpoint.
+3. By providing a serialized TupleMetadata of the desired schema. This is advanced functionality and should only be used by experienced Drill users.
+
+Schema provisioning currently supports the complex types Array and Map at any nesting level.
+
+### Example Schema Provisioning:
+```json
+"jsonOptions": {
+ "providedSchema": [
Review Comment:
Fixed