This is an automated email from the ASF dual-hosted git repository.

karan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git


The following commit(s) were added to refs/heads/master by this push:
     new 03175a2b8d Add missing MSQ error code fields to docs (#13308)
03175a2b8d is described below

commit 03175a2b8d50e1121d7b1937ec61ceabec2ea43f
Author: Andreas Maechler <[email protected]>
AuthorDate: Thu Nov 10 08:33:04 2022 -0700

    Add missing MSQ error code fields to docs (#13308)
    
    * Fix typo
    
    * Fix some spacing
    
    * Add missing fields
    
    * Cleanup table spacing
    
    * Remove durable storage docs again
    
    Thanks Brian for pointing out previous discussions.
    
    * Update docs/multi-stage-query/reference.md
    
    Co-authored-by: Charles Smith <[email protected]>
    
    * Mark codes as code
    
    * And even more codes as code
    
    * Another set of spaces
    
    * Combine `ColumnTypeNotSupported`
    
    Thanks Karan.
    
    * More whitespaces and typos
    
    * Add spelling and fix links
    
    Co-authored-by: Charles Smith <[email protected]>
---
 CONTRIBUTING.md                                    |  56 +++++------
 .../extensions-core/datasketches-hll.md            |  61 ++++++------
 docs/ingestion/data-formats.md                     | 110 ++++++++++++---------
 docs/ingestion/index.md                            |   4 +-
 docs/ingestion/partitioning.md                     |   6 +-
 docs/ingestion/rollup.md                           |  17 +++-
 docs/ingestion/schema-design.md                    |  68 ++++++-------
 docs/ingestion/tasks.md                            |  20 ++--
 docs/multi-stage-query/api.md                      |  77 +++++++--------
 docs/multi-stage-query/concepts.md                 |  14 +--
 docs/multi-stage-query/examples.md                 |   3 -
 docs/multi-stage-query/index.md                    |   6 +-
 docs/multi-stage-query/reference.md                |  97 +++++++++---------
 docs/operations/rule-configuration.md              |  18 +++-
 docs/querying/virtual-columns.md                   |   7 +-
 website/.spelling                                  |   1 +
 16 files changed, 290 insertions(+), 275 deletions(-)

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 545d546964..916d38e1f4 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -46,7 +46,7 @@ You can find more developers' resources in [`dev/`](dev) directory.
 
 1. Fork the apache/druid repository into your GitHub account
 
-    https://github.com/apache/druid/fork
+    <https://github.com/apache/druid/fork>
 
 2. Clone your fork of the GitHub repository
 
@@ -58,32 +58,32 @@ You can find more developers' resources in [`dev/`](dev) directory.
 
 3. Add a remote to keep up with upstream changes
 
-    ```
+    ```sh
     git remote add upstream https://github.com/apache/druid.git
     ```
 
     If you already have a copy, fetch upstream changes
 
-    ```
+    ```sh
     git fetch upstream master
     ```
 
 4. Create a feature branch to work in
 
-    ```
+    ```sh
     git checkout -b feature-xxx remotes/upstream/master
     ```
 
 5. _Before submitting a pull request_ periodically rebase your changes
     (but don't do it when a pull request is already submitted)
 
-    ```
+    ```sh
     git pull --rebase upstream master
     ```
 
 6. Before submitting a pull request, combine ("squash") related commits into a single one
 
-    ```
+    ```sh
     git rebase -i upstream/master
     ```
 
@@ -93,13 +93,13 @@ You can find more developers' resources in [`dev/`](dev) directory.
 
 7. Submit a pull-request
 
-    ```
+    ```sh
     git push origin feature-xxx
     ```
 
     Go to your Druid fork main page
 
-    ```
+    ```txt
     https://github.com/<username>/druid
     ```
 
@@ -116,21 +116,21 @@ You can find more developers' resources in [`dev/`](dev) directory.
     Address code review comments by committing changes and pushing them to your feature
    branch.
 
-    ```
+    ```sh
     git push origin feature-xxx
     ```
 
 ### If your pull request shows conflicts with master
+
  If your pull request shows conflicts with master, merge master into your feature branch:
   
-
-  ```
+  ```sh
   git merge upstream/master
   ```
   
   and resolve the conflicts. After resolving conflicts, push your branch again:
   
-  ```
+  ```sh
   git push origin feature-xxx
   ```
 
@@ -143,14 +143,14 @@ You can find more developers' resources in [`dev/`](dev) directory.
 
 Release notes are the way that Druid users will learn about your fix or improvement. What does a user need to know? The key is to identify the user impact. Give it your best shot! Druid committers will review and edit your notes.
 
-Here's a non-exhaustive list of the type of changes that have user impact and need release notes: 
+Here's a non-exhaustive list of the type of changes that have user impact and need release notes:
 
-* Changes what the user sees in the UI.
-* Changes any action that the user takes (in the UI, in the API, in configuration, in install, etc.)
-* Changes the results of any query, ingestion, or task.
-* Changes performance (preferably making things faster).
-* Adds, deprecates, or removes features.
-* Anything that changes install, configuration, or operation of Druid.
+- Changes what the user sees in the UI.
+- Changes any action that the user takes (in the UI, in the API, in configuration, in install, etc.)
+- Changes the results of any query, ingestion, or task.
+- Changes performance (preferably making things faster).
+- Adds, deprecates, or removes features.
+- Anything that changes install, configuration, or operation of Druid.
 
 An example of a change that doesn't need a release note is fixing a flakey test.
 
@@ -164,46 +164,46 @@ Use these tips when writing your release note:
 
 **Give enough context.** Make sure there's enough detail for users to do something with the information if they need to. For example, include the property they need to set and link to the documentation when possible.
 
-**You don't need to be formal and impersonal.** Speak directly to the user ("You can..."). Avoid passive voice (“X has been added”). 
+**You don't need to be formal and impersonal.** Speak directly to the user ("You can..."). Avoid passive voice (“X has been added”).
 
 ### Example release notes
 
 | Template                                        | Examples |
 |-------------------------------------------------|----------|
-| New: You can now…                           | New: You can now upload CSV files with a single header row for batch ingestion. Set the `infrerSchemaFromHeader` property of your ingestion spec to `true` to enable this feature. For more information, see [TITLE](/path/to/doc-file.md#anchor).|
+| New: You can now…                           | New: You can now upload CSV files with a single header row for batch ingestion. Set the `inferSchemaFromHeader` property of your ingestion spec to `true` to enable this feature. For more information, see [TITLE](/path/to/doc-file.md#anchor).|
 | Fixed: X no longer does Y when Z.               | Fixed: Multi-value string array expressions no longer give flattened results when used in  groupBy queries.         |
 | Changed: X now does Y. Previously, X did Z. | Changed: The first/last string aggregator now only compares based on values. Previously, the first/last string aggregator’s values were compared based on the `_time` column first and then on values.         |
 | Improved: Better / Increased / Updated etc. | Improved: You can now perform Kinesis ingestion even if there are empty shards. Previously, all shards had to have at least one record. |
 | Improved: Better / Increased / Updated etc. | Improved: Java 11 is fully supported and is no longer experimental. Java 17 support is improved.         |
 | Deprecated: Removed / Will remove X. | Deprecated: Support for ZooKeeper X.Y will be removed in the next release, so consider upgrading ZooKeeper. For information about how to upgrade ZooKeeper, see the ZooKeeper documentation.       |
 
-
 ## FAQ
 
-### Help! I merged changes from upstream and cannot figure out how to resolve conflicts when rebasing!
+### Help! I merged changes from upstream and cannot figure out how to resolve conflicts when rebasing
 
 Never fear! If you occasionally merged upstream/master, here is another way to squash your changes into a single commit:
 
 1. First, rename your existing branch to something else, e.g. `feature-xxx-unclean`
-    ```
+
+    ```sh
     git branch -m feature-xxx-unclean
     ```
 
-2.  Checkout a new branch with the original name `feature-xxx` from upstream. This branch will supercede our old one.
+2. Checkout a new branch with the original name `feature-xxx` from upstream. This branch will supersede our old one.
 
-    ```
+    ```sh
     git checkout -b feature-xxx upstream/master
     ```
 
 3. Then merge your changes in your original feature branch `feature-xxx-unclean` and create a single commit.
 
-    ```
+    ```sh
     git merge --squash feature-xxx-unclean
     git commit
     ```
 
 4. You can now submit this new branch and create or replace your existing pull request.
 
-    ```
+    ```sh
     git push origin [--force] feature-xxx:feature-xxx
     ```
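
The squash flow from the FAQ hunks above can be exercised end-to-end in a throwaway repository. This is only an illustrative sketch: the repository, file names, and commit messages below are placeholders invented for the demo, not anything from this PR.

```shell
# Build a scratch repo with a base commit on master.
set -e
demo="$(mktemp -d)" && cd "$demo"
git init -q
git config user.email "dev@example.com" && git config user.name "Dev"
echo base > base.txt && git add base.txt && git commit -qm "base"
git branch -M master                       # normalize the default branch name

# A "messy" branch with several commits (stand-in for feature-xxx-unclean).
git checkout -qb feature-xxx-unclean
echo one > one.txt && git add one.txt && git commit -qm "wip 1"
echo two > two.txt && git add two.txt && git commit -qm "wip 2"

# Steps 2-3 from the FAQ: fresh branch from the base, then squash-merge.
git checkout -qb feature-xxx master
git merge --squash feature-xxx-unclean     # stages the combined change, no commit yet
git commit -qm "feature: one squashed commit"

git rev-list --count master..feature-xxx   # prints 1
```

After the squash merge, `master..feature-xxx` contains exactly one commit carrying both file changes, which is what the final `git push` step then publishes.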
diff --git a/docs/development/extensions-core/datasketches-hll.md b/docs/development/extensions-core/datasketches-hll.md
index 07cc7da8b2..291fd58813 100644
--- a/docs/development/extensions-core/datasketches-hll.md
+++ b/docs/development/extensions-core/datasketches-hll.md
@@ -43,18 +43,17 @@ druid.extensions.loadList=["druid-datasketches"]
 |`tgtHllType`|The type of the target HLL sketch. Must be `HLL_4`, `HLL_6` or `HLL_8` |no, defaults to `HLL_4`|
 |`round`|Round off values to whole numbers. Only affects query-time behavior and is ignored at ingestion-time.|no, defaults to `false`|
 
-
 > The default `lgK` value has proven to be sufficient for most use cases; 
 > expect only very negligible improvements in accuracy with `lgK` values over 
 > `16` in normal circumstances.
 
 #### HLLSketchBuild Aggregator
 
 ```
 {
-  "type" : "HLLSketchBuild",
-  "name" : <output name>,
-  "fieldName" : <metric name>,
-  "lgK" : <size and accuracy parameter>,
-  "tgtHllType" : <target HLL type>,
+  "type": "HLLSketchBuild",
+  "name": <output name>,
+  "fieldName": <metric name>,
+  "lgK": <size and accuracy parameter>,
+  "tgtHllType": <target HLL type>,
   "round": <false | true>
  }
 ```
@@ -65,17 +64,15 @@ When applied at query time on an existing dimension, you can use the resulting c
> It is very common to use `HLLSketchBuild` in combination with [rollup](../../ingestion/rollup.md) to create a [metric](../../ingestion/ingestion-spec.html#metricsspec) on high-cardinality columns.  In this example, a metric called `userid_hll` is included in the `metricsSpec`.  This will perform a HLL sketch on the `userid` field at ingestion time, allowing for highly-performant approximate `COUNT DISTINCT` query operations and improving roll-up ratios when `userid` is then left out of [...]
 >
 > ```
-> :
 > "metricsSpec": [
->  {
->    "type" : "HLLSketchBuild",
->    "name" : "userid_hll",
->    "fieldName" : "userid",
->    "lgK" : 12,
->    "tgtHllType" : "HLL_4"
->  }
+>   {
+>     "type": "HLLSketchBuild",
+>     "name": "userid_hll",
+>     "fieldName": "userid",
+>     "lgK": 12,
+>     "tgtHllType": "HLL_4"
+>   }
 > ]
-> :
 > ```
 >
 
@@ -83,13 +80,13 @@ When applied at query time on an existing dimension, you can use the resulting c
 
 ```
 {
-  "type" : "HLLSketchMerge",
-  "name" : <output name>,
-  "fieldName" : <metric name>,
-  "lgK" : <size and accuracy parameter>,
-  "tgtHllType" : <target HLL type>,
+  "type": "HLLSketchMerge",
+  "name": <output name>,
+  "fieldName": <metric name>,
+  "lgK": <size and accuracy parameter>,
+  "tgtHllType": <target HLL type>,
   "round": <false | true>
- }
+}
 ```
 
 You can use the `HLLSketchMerge` aggregator to ingest pre-generated sketches from an input dataset. For example, you can set up a batch processing job to generate the sketches before sending the data to Druid. You must serialize the sketches in the input dataset to Base64-encoded bytes. Then, specify `HLLSketchMerge` for the input column in the native ingestion `metricsSpec`.
@@ -102,10 +99,10 @@ Returns the distinct count estimate as a double.
 
 ```
 {
-  "type"  : "HLLSketchEstimate",
+  "type": "HLLSketchEstimate",
   "name": <output name>,
-  "field"  : <post aggregator that returns an HLL Sketch>,
-  "round" : <if true, round the estimate. Default is false>
+  "field": <post aggregator that returns an HLL Sketch>,
+  "round": <if true, round the estimate. Default is false>
 }
 ```
 
@@ -118,10 +115,10 @@ This must be an integer value of 1, 2 or 3 corresponding to approximately 68.3%,
 
 ```
 {
-  "type"  : "HLLSketchEstimateWithBounds",
+  "type": "HLLSketchEstimateWithBounds",
   "name": <output name>,
-  "field"  : <post aggregator that returns an HLL Sketch>,
-  "numStdDev" : <number of standard deviations: 1 (default), 2 or 3>
+  "field": <post aggregator that returns an HLL Sketch>,
+  "numStdDev": <number of standard deviations: 1 (default), 2 or 3>
 }
 ```
 
@@ -129,11 +126,11 @@ This must be an integer value of 1, 2 or 3 corresponding to approximately 68.3%,
 
 ```
 {
-  "type"  : "HLLSketchUnion",
+  "type": "HLLSketchUnion",
   "name": <output name>,
-  "fields"  : <array of post aggregators that return HLL sketches>,
+  "fields": <array of post aggregators that return HLL sketches>,
   "lgK": <log2 of K for the target sketch>,
-  "tgtHllType" : <target HLL type>
+  "tgtHllType": <target HLL type>
 }
 ```
 
@@ -143,8 +140,8 @@ Human-readable sketch summary for debugging.
 
 ```
 {
-  "type"  : "HLLSketchToString",
+  "type": "HLLSketchToString",
   "name": <output name>,
-  "field"  : <post aggregator that returns an HLL Sketch>
+  "field": <post aggregator that returns an HLL Sketch>
 }
 ```
diff --git a/docs/ingestion/data-formats.md b/docs/ingestion/data-formats.md
index f484d65006..eb08df0cf7 100644
--- a/docs/ingestion/data-formats.md
+++ b/docs/ingestion/data-formats.md
@@ -101,6 +101,7 @@ The following properties are specialized properties that only apply when the JSO
 | useJsonNodeReader | Boolean | When ingesting multi-line JSON events, enabling this option will enable the use of a JSON parser which will retain any valid JSON events encountered within a streaming record prior to when a parsing exception occurred. | no (Default false) |
 
 For example:
+
 ```json
 "ioConfig": {
   "inputFormat": {
@@ -123,6 +124,7 @@ Configure the CSV `inputFormat` to load CSV data as follows:
 | skipHeaderRows | Integer | If this is set, the task will skip the first `skipHeaderRows` rows. | no (default = 0) |
 
 For example:
+
 ```json
 "ioConfig": {
   "inputFormat": {
@@ -149,6 +151,7 @@ Configure the TSV `inputFormat` to load TSV data as follows:
 Be sure to change the `delimiter` to the appropriate delimiter for your data. Like CSV, you must specify the columns and which subset of the columns you want indexed.
 
  For example:
+
 ```json
 "ioConfig": {
   "inputFormat": {
@@ -162,19 +165,20 @@ Be sure to change the `delimiter` to the appropriate delimiter for your data. Li
 
 ### Kafka
 
-Configure the Kafka `inputFormat` to load complete kafka records including header, key, and value. 
+Configure the Kafka `inputFormat` to load complete kafka records including header, key, and value.
 
 | Field | Type | Description | Required |
 |-------|------|-------------|----------|
-| type | String | Set value to `kafka`. | yes |
-| headerLabelPrefix | String | Custom label prefix for all the header columns. | no (default = "kafka.header.") |
-| timestampColumnName | String | Name of the column for the kafka record's timestamp.| no (default = "kafka.timestamp") |
-| keyColumnName | String | Name of the column for the kafka record's key.| no (default = "kafka.key") |
-| headerFormat | Object | `headerFormat` specifies how to parse the Kafka headers. Supports String types. Because Kafka header values are bytes, the parser decodes them as UTF-8 encoded strings. To change this behavior, implement your own parser based on the encoding style. Change the 'encoding' type in `KafkaStringHeaderFormat` to match your custom implementation. | no |
-| keyFormat | [InputFormat](#input-format) | Any existing `inputFormat` used to parse the Kafka key. It only processes the first entry of the input format. For details, see [Specifying data format](../development/extensions-core/kafka-supervisor-reference.md#specifying-data-format). | no |
-| valueFormat | [InputFormat](#input-format) | `valueFormat` can be any existing `inputFormat` to parse the Kafka value payload. For details about specifying the input format, see [Specifying data format](../development/extensions-core/kafka-supervisor-reference.md#specifying-data-format). | yes |
+| `type` | String | Set value to `kafka`. | yes |
+| `headerLabelPrefix` | String | Custom label prefix for all the header columns. | no (default = "kafka.header.") |
+| `timestampColumnName` | String | Name of the column for the kafka record's timestamp.| no (default = "kafka.timestamp") |
+| `keyColumnName` | String | Name of the column for the kafka record's key.| no (default = "kafka.key") |
+| `headerFormat` | Object | `headerFormat` specifies how to parse the Kafka headers. Supports String types. Because Kafka header values are bytes, the parser decodes them as UTF-8 encoded strings. To change this behavior, implement your own parser based on the encoding style. Change the 'encoding' type in `KafkaStringHeaderFormat` to match your custom implementation. | no |
+| `keyFormat` | [InputFormat](#input-format) | Any existing `inputFormat` used to parse the Kafka key. It only processes the first entry of the input format. For details, see [Specifying data format](../development/extensions-core/kafka-supervisor-reference.md#specifying-data-format). | no |
+| `valueFormat` | [InputFormat](#input-format) | `valueFormat` can be any existing `inputFormat` to parse the Kafka value payload. For details about specifying the input format, see [Specifying data format](../development/extensions-core/kafka-supervisor-reference.md#specifying-data-format). | yes |
 
 For example:
+
 ```
 "ioConfig": {
   "inputFormat": {
@@ -200,30 +204,34 @@ For example:
 ```
 
 Note the following behaviors:
+
 - If there are conflicts between column names, Druid uses the column names from the payload and ignores the column name from the header or key. This behavior makes it easier to migrate to the Kafka `inputFormat` from another Kafka ingestion spec without losing data.
 - The Kafka input format fundamentally blends information from the header, key, and value objects from a Kafka record to create a row in Druid. It extracts individual records from the value. Then it augments each value with the corresponding key or header columns.
 - The Kafka input format by default exposes Kafka timestamp `timestampColumnName` to make it available for use as the primary timestamp column. Alternatively you can choose timestamp column from either the key or value payload.
 
 For example, the following `timestampSpec` uses the default Kafka timestamp from the Kafka record:
+
+```json
+"timestampSpec":
+{
+    "column": "kafka.timestamp",
+    "format": "millis"
+}
 ```
-    "timestampSpec":
-    {
-        "column": "kafka.timestamp",
-        "format": "millis"
-    }
-```
-    
+
 If you are using "kafka.header." as the prefix for Kafka header columns and there is a timestamp field in the header, the header timestamp serves as the primary timestamp column. For example:
+
+```json
+"timestampSpec":
+{
+    "column": "kafka.header.timestamp",
+    "format": "millis"
+}
 ```
-    "timestampSpec":
-    {
-        "column": "kafka.header.timestamp",
-        "format": "millis"
-    }
-```
+
 ### ORC
 
-To use the ORC input format, load the Druid Orc extension ( [`druid-orc-extensions`](../development/extensions-core/orc.md)). 
+To use the ORC input format, load the Druid Orc extension ( [`druid-orc-extensions`](../development/extensions-core/orc.md)).
> To upgrade from versions earlier than 0.15.0 to 0.15.0 or new, read [Migration from 'contrib' extension](../development/extensions-core/orc.md#migration-from-contrib-extension).
 
 Configure the ORC `inputFormat` to load ORC data as follows:
@@ -235,6 +243,7 @@ Configure the ORC `inputFormat` to load ORC data as follows:
 | binaryAsString | Boolean | Specifies if the binary orc column which is not logically marked as a string should be treated as a UTF-8 encoded string. | no (default = false) |
 
 For example:
+
 ```json
 "ioConfig": {
   "inputFormat": {
@@ -261,13 +270,14 @@ To use the Parquet input format load the Druid Parquet extension ([`druid-parque
 
 Configure the Parquet `inputFormat` to load Parquet data as follows:
 
-| Field | Type | Description                                                                                                                                                   | Required |
-|-------|------|---------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|
-|type| String| Set value to `parquet`.                                                                                                                                       | yes |
-|flattenSpec| JSON Object | Define a [`flattenSpec`](#flattenspec) to extract nested values from a Parquet file. Only 'path' expressions are supported ('jq' and 'tree' are unavailable). | no (default will auto-discover 'root' level properties) |
-| binaryAsString | Boolean | Specifies if the bytes parquet column which is not logically marked as a string or enum type should be treated as a UTF-8 encoded string.                     | no (default = false) |
+| Field | Type | Description | Required |
+|---|---|---|---|
+| `type` | String | Set value to `parquet`. | yes |
+| `flattenSpec` | JSON Object | Define a [`flattenSpec`](#flattenspec) to extract nested values from a Parquet file. Only 'path' expressions are supported ('jq' and 'tree' are unavailable). | no (default will auto-discover 'root' level properties) |
+| `binaryAsString` | Boolean | Specifies if the bytes parquet column which is not logically marked as a string or enum type should be treated as a UTF-8 encoded string. | no (default = false) |
 
 For example:
+
 ```json
 "ioConfig": {
   "inputFormat": {
@@ -304,6 +314,7 @@ Configure the Avro `inputFormat` to load Avro data as follows:
 | binaryAsString | Boolean | Specifies if the bytes Avro column which is not logically marked as a string or enum type should be treated as a UTF-8 encoded string. | no (default = false) |
 
 For example:
+
 ```json
 "ioConfig": {
   "inputFormat": {
@@ -426,7 +437,6 @@ This section describes the format of the `subjectAndIdConverter` object for the
 | type | String | Set value to `avro_1124`. | no |
 | topic | String | Specifies the topic of your Kafka stream. | yes |
 
-
 ###### Avro-1124 Schema Repository
 
 This section describes the format of the `schemaRepository` object for the `schema_repo` Avro bytes decoder.
@@ -453,6 +463,7 @@ For details, see the Schema Registry [documentation](http://docs.confluent.io/cu
 For a single schema registry instance, use Field `url` or `urls` for multi instances.
 
 Single Instance:
+
 ```json
 ...
 "avroBytesDecoder" : {
@@ -463,6 +474,7 @@ Single Instance:
 ```
 
 Multiple Instances:
+
 ```json
 ...
 "avroBytesDecoder" : {
@@ -498,6 +510,7 @@ Multiple Instances:
 ###### Parse exceptions
 
 The following errors when reading records will be considered parse exceptions, which can be limited and logged with ingestion task configurations such as `maxParseExceptions` and `maxSavedParseExceptions`:
+
 - Failure to retrieve a schema due to misconfiguration or corrupt records (invalid schema IDs)
 - Failure to decode an Avro message
 
@@ -517,6 +530,7 @@ Configure the Avro OCF `inputFormat` to load Avro OCF data as follows:
 | binaryAsString | Boolean | Specifies if the bytes parquet column which is not logically marked as a string or enum type should be treated as a UTF-8 encoded string.                   | no (default = false) |
 
 For example:
+
 ```json
 "ioConfig": {
   "inputFormat": {
@@ -558,13 +572,14 @@ For example:
 
 Configure the Protobuf `inputFormat` to load Protobuf data as follows:
 
-| Field | Type | Description                                                                                                                                                              | Required |
-|-------|------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|
-|type| String| Set value to `protobuf`.                                                                                                                                                  | yes |
-|flattenSpec| JSON Object | Define a [`flattenSpec`](#flattenspec) to extract nested values from a Protobuf record. Note that only 'path' expression are supported ('jq' and 'tree' is unavailable). | no (default will auto-discover 'root' level properties) |
-|`protoBytesDecoder`| JSON Object | Specifies how to decode bytes to Protobuf record.                                                                                                         | yes |
+| Field | Type | Description | Required |
+|---|---|---|---|
+| `type` | String | Set value to `protobuf`. | yes |
+| `flattenSpec` | JSON Object | Define a [`flattenSpec`](#flattenspec) to extract nested values from a Protobuf record. Note that only 'path' expression are supported ('jq' and 'tree' is unavailable). | no (default will auto-discover 'root' level properties) |
+| `protoBytesDecoder` | JSON Object | Specifies how to decode bytes to Protobuf record. | yes |
 
 For example:
+
 ```json
 "ioConfig": {
   "inputFormat": {
@@ -603,6 +618,7 @@ Configure your `flattenSpec` as follows:
 | fields | Specifies the fields of interest and how they are accessed. See [Field flattening specifications](#field-flattening-specifications) for more detail. | `[]` |
 
 For example:
+
 ```json
 "flattenSpec": {
   "useFieldDiscovery": true,
@@ -614,6 +630,7 @@ For example:
   ]
 }
 ```
+
 After Druid reads the input data records, it applies the flattenSpec before applying any other specs such as [`timestampSpec`](./ingestion-spec.md#timestampspec), [`transformSpec`](./ingestion-spec.md#transformspec), [`dimensionsSpec`](./ingestion-spec.md#dimensionsspec), or [`metricsSpec`](./ingestion-spec.md#metricsspec).  This makes it possible to extract timestamps from flattened data, for example, and to refer to flattened data in transformations, in your dimension list, and when ge [...]
 
 Flattening is only supported for [data formats](data-formats.md) that support nesting, including `avro`, `json`, `orc`, and `parquet`.
@@ -631,13 +648,13 @@ Each entry in the `fields` list can have the following components:
 
 #### Notes on flattening
 
-* For convenience, when defining a root-level field, it is possible to define only the field name, as a string, instead of a JSON object. For example, `{"name": "baz", "type": "root"}` is equivalent to `"baz"`.
-* Enabling `useFieldDiscovery` will only automatically detect "simple" fields at the root level that correspond to data types that Druid supports. This includes strings, numbers, and lists of strings or numbers. Other types will not be automatically detected, and must be specified explicitly in the `fields` list.
-* Duplicate field `name`s are not allowed. An exception will be thrown.
-* If `useFieldDiscovery` is enabled, any discovered field with the same name as one already defined in the `fields` list will be skipped, rather than added twice.
-* [http://jsonpath.herokuapp.com/](http://jsonpath.herokuapp.com/) is useful for testing `path`-type expressions.
-* jackson-jq supports a subset of the full [jq](https://stedolan.github.io/jq/) syntax.  Please refer to the [jackson-jq documentation](https://github.com/eiiches/jackson-jq) for details.
-* [JsonPath](https://github.com/jayway/JsonPath) supports a bunch of functions, but not all of these functions are supported by Druid now. Following matrix shows the current supported JsonPath functions and corresponding data formats. Please also note the output data type of these functions.
+- For convenience, when defining a root-level field, it is possible to define only the field name, as a string, instead of a JSON object. For example, `{"name": "baz", "type": "root"}` is equivalent to `"baz"`.
+- Enabling `useFieldDiscovery` will only automatically detect "simple" fields at the root level that correspond to data types that Druid supports. This includes strings, numbers, and lists of strings or numbers. Other types will not be automatically detected, and must be specified explicitly in the `fields` list.
+- Duplicate field `name`s are not allowed. An exception will be thrown.
+- If `useFieldDiscovery` is enabled, any discovered field with the same name as one already defined in the `fields` list will be skipped, rather than added twice.
+- [http://jsonpath.herokuapp.com/](http://jsonpath.herokuapp.com/) is useful for testing `path`-type expressions.
+- jackson-jq supports a subset of the full [jq](https://stedolan.github.io/jq/) syntax.  Please refer to the [jackson-jq documentation](https://github.com/eiiches/jackson-jq) for details.
+- [JsonPath](https://github.com/jayway/JsonPath) supports a bunch of functions, but not all of these functions are supported by Druid now. Following matrix shows the current supported JsonPath functions and corresponding data formats. Please also note the output data type of these functions.
   
  | Function   | Description                                                        | Output type | json | orc | avro | parquet |
  | :----------| :------------------------------------------------------------------ |:----------- |:-----|:----|:-----|:-----|
@@ -651,7 +668,6 @@ Each entry in the `fields` list can have the following components:
  | append(X)  | add an item to the json path output array                          | like input  | &#10003;  |  &#10007;   |   &#10007;   | &#10007;   |
  | keys()     | Provides the property keys (An alternative for terminal tilde ~)    | Set<E\>      | &#10007;  |  &#10007;   |   &#10007;   | &#10007;   |
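
As an aside to the note above about testing `path`-type expressions: a simple path can also be sanity-checked locally without any JsonPath library. This hypothetical one-liner (assuming `python3` is on the PATH) evaluates the equivalent of `$.a.b[1]` with the stdlib `json` module:

```shell
# Equivalent of the JsonPath expression $.a.b[1], using only Python's stdlib.
echo '{"a": {"b": [1, 2, 3]}}' | python3 -c 'import json, sys; print(json.load(sys.stdin)["a"]["b"][1])'
# prints 2
```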
 
-
 ## Parser
 
> The Parser is deprecated for [native batch tasks](./native-batch.md), [Kafka indexing service](../development/extensions-core/kafka-ingestion.md),
@@ -761,6 +777,7 @@ Auto field discovery will automatically create a string dimension for every (non
 primitives, as well as any flatten expressions defined in the `flattenSpec`.
 
 #### Hadoop job properties
+
 Like most Hadoop jobs, the best outcomes will add `"mapreduce.job.user.classpath.first": "true"` or
 `"mapreduce.job.classloader": "true"` to the `jobProperties` section of `tuningConfig`. Note that it is likely if using
 `"mapreduce.job.classloader": "true"` that you will need to set `mapreduce.job.classloader.system.classes` to include
@@ -926,6 +943,7 @@ setting `"mapreduce.job.user.classpath.first": "true"`, then this will not be an
 ```
 
 ##### `orc` parser, `timeAndDims` parseSpec
+
 ```json
 {
   "type": "index_hadoop",
@@ -996,13 +1014,13 @@ a format should not be supplied. When the format is UTF8 (String), either `auto`
 Both parsers read from Parquet files, but slightly differently. The main
 differences are:
 
-* The Parquet Hadoop Parser uses a simple conversion while the Parquet Avro 
Hadoop Parser
+- The Parquet Hadoop Parser uses a simple conversion while the Parquet Avro 
Hadoop Parser
 converts Parquet data into avro records first with the `parquet-avro` library 
and then
 parses avro data using the `druid-avro-extensions` module to ingest into Druid.
-* The Parquet Hadoop Parser sets a hadoop job property
+- The Parquet Hadoop Parser sets a hadoop job property
 `parquet.avro.add-list-element-records` to `false` (which normally defaults to 
`true`), in order to 'unwrap' primitive
 list elements into multi-value dimensions.
-* The Parquet Hadoop Parser supports `int96` Parquet values, while the Parquet 
Avro Hadoop Parser does not.
+- The Parquet Hadoop Parser supports `int96` Parquet values, while the Parquet 
Avro Hadoop Parser does not.
 There may also be some subtle differences in the behavior of JSON path 
expression evaluation of `flattenSpec`.
 
 Based on those differences, we suggest using the Parquet Hadoop Parser over 
the Parquet Avro Hadoop Parser
@@ -1012,6 +1030,7 @@ However, the Parquet Avro Hadoop Parser was the original 
basis for supporting th
 #### Examples
 
 ##### `parquet` parser, `parquet` parseSpec
+
 ```json
 {
   "type": "index_hadoop",
@@ -1066,6 +1085,7 @@ However, the Parquet Avro Hadoop Parser was the original 
basis for supporting th
 ```
 
 ##### `parquet` parser, `timeAndDims` parseSpec
+
 ```json
 {
   "type": "index_hadoop",
@@ -1359,6 +1379,7 @@ Single Instance:
 ```
 
 Multiple Instances:
+
 ```json
 ...
 "protoBytesDecoder": {
@@ -1534,7 +1555,6 @@ tasks will fail with an exception.
 
 The `columns` field must be included, and the order of the fields must match 
the order of the columns in your input data.
 
-
 ### Regex ParseSpec
 
 ```json
diff --git a/docs/ingestion/index.md b/docs/ingestion/index.md
index c4a2961b88..d152e75cd6 100644
--- a/docs/ingestion/index.md
+++ b/docs/ingestion/index.md
@@ -30,9 +30,10 @@ For most ingestion methods, the Druid 
[MiddleManager](../design/middlemanager.md
 [Indexer](../design/indexer.md) processes load your source data. The sole 
exception is Hadoop-based ingestion, which
 uses a Hadoop MapReduce job on YARN.
 
-During ingestion Druid creates segments and stores them in [deep 
storage](../dependencies/deep-storage.md). Historical nodes load the segments 
into memory to respond to queries. For streaming ingestion, the Middle Managers 
and indexers can respond to queries in real-time with arriving data. See the 
[Storage design](../design/architecture.md#storage-design) section of the Druid 
design documentation for more information.
+During ingestion, Druid creates segments and stores them in [deep 
storage](../dependencies/deep-storage.md). Historical nodes load the segments 
into memory to respond to queries. For streaming ingestion, the Middle Managers 
and indexers can respond to queries in real-time with arriving data. See the 
[Storage design](../design/architecture.md#storage-design) section of the Druid 
design documentation for more information.
 
 This topic introduces streaming and batch ingestion methods. The following 
topics describe ingestion concepts and information that apply to all [ingestion 
methods](#ingestion-methods):
+
 - [Druid data model](./data-model.md) introduces concepts of datasources, 
primary timestamp, dimensions, and metrics.
 - [Data rollup](./rollup.md) describes rollup as a concept and provides 
suggestions to maximize the benefits of rollup.
 - [Partitioning](./partitioning.md) describes time chunk and secondary 
partitioning in Druid.
@@ -77,4 +78,3 @@ runs for the duration of the job.
 | **Input formats** | Any [`inputFormat`](./data-formats.md#input-format). | 
Any [`inputFormat`](./data-formats.md#input-format). | Any Hadoop InputFormat. |
 | **Secondary partitioning options** | Dynamic, hash-based, and range-based 
partitioning methods are available. See 
[partitionsSpec](./native-batch.md#partitionsspec) for details.| Range 
partitioning ([CLUSTERED BY](../multi-stage-query/concepts.md#clustering)). |  
Hash-based or range-based partitioning via 
[`partitionsSpec`](hadoop.md#partitionsspec). |
 | **[Rollup modes](./rollup.md#perfect-rollup-vs-best-effort-rollup)** | 
Perfect if `forceGuaranteedRollup` = true in the 
[`tuningConfig`](native-batch.md#tuningconfig).  | Always perfect. | Always 
perfect. |
-
diff --git a/docs/ingestion/partitioning.md b/docs/ingestion/partitioning.md
index 0ffd015600..422c07de80 100644
--- a/docs/ingestion/partitioning.md
+++ b/docs/ingestion/partitioning.md
@@ -26,7 +26,7 @@ description: Describes time chunk and secondary partitioning 
in Druid. Provides
 
 You can use segment partitioning and sorting within your Druid datasources to 
reduce the size of your data and increase performance.
 
-One way to partition is to load data into separate datasources. This is a 
perfectly viable approach that works very well when the number of datasources 
does not lead to excessive per-datasource overheads. 
+One way to partition is to load data into separate datasources. This is a 
perfectly viable approach that works very well when the number of datasources 
does not lead to excessive per-datasource overheads.
 
 This topic describes how to set up partitions within a single datasource. It 
does not cover how to use multiple datasources. See [Multitenancy 
considerations](../querying/multitenancy.md) for more details on splitting data 
into separate datasources and potential operational considerations.
 
@@ -72,9 +72,9 @@ The following table shows how each ingestion method handles 
partitioning:
 |[Kafka indexing 
service](../development/extensions-core/kafka-ingestion.md)|Kafka topic 
partitioning defines how Druid partitions the datasource. You can also 
[reindex](../data-management/update.md#reindex) or 
[compact](../data-management/compaction.md) to repartition after initial 
ingestion.|
 |[Kinesis indexing 
service](../development/extensions-core/kinesis-ingestion.md)|Kinesis stream 
sharding defines how Druid partitions the datasource. You can also 
[reindex](../data-management/update.md#reindex) or 
[compact](../data-management/compaction.md) to repartition after initial 
ingestion.|
 
-
 ## Learn more
+
 See the following topics for more information:
+
 * [`partitionsSpec`](native-batch.md#partitionsspec) for more detail on 
partitioning with Native Batch ingestion.
 * [Reindexing](../data-management/update.md#reindex) and 
[Compaction](../data-management/compaction.md) for information on how to 
repartition existing data in Druid.
-
diff --git a/docs/ingestion/rollup.md b/docs/ingestion/rollup.md
index 411dd5ead9..08cdfba378 100644
--- a/docs/ingestion/rollup.md
+++ b/docs/ingestion/rollup.md
@@ -31,10 +31,12 @@ At ingestion time, you control rollup with the `rollup` 
setting in the [`granula
 When you disable rollup, Druid loads each row as-is without doing any form of 
pre-aggregation. This mode is similar to databases that do not support a rollup 
feature. Set `rollup` to `false` if you want Druid to store each record as-is, 
without any rollup summarization.
 
 Use roll-up when creating a table datasource if both:
+
 - You want optimal performance or you have strict space constraints.
 - You don't need raw values from [high-cardinality 
dimensions](schema-design.md#sketches).
 
 Conversely, disable roll-up if either:
+
 - You need results for individual rows.
 - You need to execute `GROUP BY` or `WHERE` queries on _any_ column.
 
@@ -47,6 +49,7 @@ To measure the rollup ratio of a datasource, compare the 
number of rows in Druid
 ```sql
 SELECT SUM("num_rows") / (COUNT(*) * 1.0) FROM datasource
 ```
+
 The higher the result, the greater the benefit you gain from rollup. See 
[Counting the number of ingested events](schema-design.md#counting) for more 
details about how counting works when rollup is enabled.
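
As a worked illustration of the ratio above (the row counts here are assumed, not from a real datasource):

```python
# Illustrative arithmetic only. SUM("num_rows") counts the ingested events
# summarized into each stored Druid row, while COUNT(*) counts the stored rows.
ingested_events = 1_000_000   # hypothetical SUM("num_rows")
stored_rows = 50_000          # hypothetical COUNT(*)
rollup_ratio = ingested_events / (stored_rows * 1.0)
print(rollup_ratio)  # 20.0: on average, each stored row summarizes 20 events
```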
 
 Tips for maximizing rollup:
@@ -55,20 +58,22 @@ Tips for maximizing rollup:
 - Use [sketches](schema-design.md#sketches) to avoid storing high cardinality 
dimensions, which decrease rollup ratios.
 - Adjust your `queryGranularity` at ingestion time to increase the chances 
that multiple rows in Druid have matching timestamps. For example, use 
five-minute query granularity (`PT5M`) instead of one minute (`PT1M`).
 - You can optionally load the same data into more than one Druid datasource. 
For example:
-    - Create a "full" datasource that has rollup disabled, or enabled, but 
with a minimal rollup ratio.
-    - Create a second "abbreviated" datasource with fewer dimensions and a 
higher rollup ratio.
+  - Create a "full" datasource that has rollup disabled, or enabled, but with 
a minimal rollup ratio.
+  - Create a second "abbreviated" datasource with fewer dimensions and a 
higher rollup ratio.
      When queries only involve dimensions in the "abbreviated" set, use the 
second datasource to reduce query times. Often, this method only requires a 
small increase in storage footprint because abbreviated datasources tend to be 
substantially smaller.
 - If you use a [best-effort rollup](#perfect-rollup-vs-best-effort-rollup) 
ingestion configuration that does not guarantee perfect rollup, try one of the 
following:
-    - Switch to a guaranteed perfect rollup option.
-    - [Reindex](../data-management/update.md#reindex) or 
[compact](../data-management/compaction.md) your data in the background after 
initial ingestion.
+  - Switch to a guaranteed perfect rollup option.
+  - [Reindex](../data-management/update.md#reindex) or 
[compact](../data-management/compaction.md) your data in the background after 
initial ingestion.
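
As a minimal sketch of the `queryGranularity` tip above, a `granularitySpec` with rollup enabled and five-minute query granularity might look like the following (the granularity values are illustrative):

```json
"granularitySpec": {
  "type": "uniform",
  "segmentGranularity": "day",
  "queryGranularity": "PT5M",
  "rollup": true
}
```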
 
 ## Perfect rollup vs best-effort rollup
 
 Depending on the ingestion method, Druid has the following rollup options:
+
 - Guaranteed _perfect rollup_: Druid perfectly aggregates input data at 
ingestion time.
 - _Best-effort rollup_: Druid may not perfectly aggregate input data. 
Therefore, multiple segments might contain rows with the same timestamp and 
dimension values.
 
 In general, ingestion methods that offer best-effort rollup do this for one of 
the following reasons:
+
 - The ingestion method parallelizes ingestion without a shuffling step 
required for perfect rollup.
 - The ingestion method uses _incremental publishing_, which means it finalizes 
and publishes segments before all data for a time chunk has been received.
 In both of these cases, records that could theoretically be rolled up may end 
up in different segments. All types of streaming ingestion run in this mode.
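
The effect can be sketched as follows (this is an illustration, not Druid code; the rows and dimension values are assumed):

```python
from collections import defaultdict

# Under best-effort rollup, two segments can each hold a row with the same
# timestamp and dimension values.
segment_a = [{"ts": "2022-01-01T00:00:00Z", "page": "home", "count": 3}]
segment_b = [{"ts": "2022-01-01T00:00:00Z", "page": "home", "count": 2}]

# Perfect rollup (or a later compaction) would combine them into a single row:
combined = defaultdict(int)
for row in segment_a + segment_b:
    combined[(row["ts"], row["page"])] += row["count"]

print(dict(combined))  # one (timestamp, page) row with count 5 instead of two rows
```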
@@ -86,5 +91,7 @@ The following table shows how each method handles rollup:
 |[Kinesis indexing 
service](../development/extensions-core/kinesis-ingestion.md)|Always 
best-effort.|
 
 ## Learn more
+
 See the following topic for more information:
-* [Rollup tutorial](../tutorials/tutorial-rollup.md) for an example of how to 
configure rollup, and of how the feature modifies your data.
+
+- [Rollup tutorial](../tutorials/tutorial-rollup.md) for an example of how to 
configure rollup, and of how the feature modifies your data.
diff --git a/docs/ingestion/schema-design.md b/docs/ingestion/schema-design.md
index 2d3cc0215b..10e6ea82cd 100644
--- a/docs/ingestion/schema-design.md
+++ b/docs/ingestion/schema-design.md
@@ -36,8 +36,7 @@ general tips and common practices.
 * [Dimension columns](./data-model.md#dimensions) are stored as-is, so they 
can be filtered on, grouped by, or aggregated at query time. They are always 
single Strings, [arrays of Strings](../querying/multi-value-dimensions.md), 
single Longs, single Doubles, or single Floats.
 * [Metric columns](./data-model.md#metrics) are stored 
[pre-aggregated](../querying/aggregations.md), so they can only be aggregated 
at query time (not filtered or grouped by). They are often stored as numbers 
(integers or floats) but can also be stored as complex objects like 
[HyperLogLog sketches or approximate quantile 
sketches](../querying/aggregations.md#approximate-aggregations). Metrics can be 
configured at ingestion time even when rollup is disabled, but are most useful 
when roll [...]
 
-
-## If you're coming from a...
+## If you're coming from a
 
 ### Relational model
 
@@ -71,13 +70,13 @@ reflected immediately for already-ingested rows in your 
main table.
 
 Tips for modeling relational data in Druid:
 
-- Druid datasources do not have primary or unique keys, so skip those.
-- Denormalize if possible. If you need to be able to update dimension / lookup 
tables periodically and have those
+* Druid datasources do not have primary or unique keys, so skip those.
+* Denormalize if possible. If you need to be able to update dimension / lookup 
tables periodically and have those
 changes reflected in already-ingested data, consider partial normalization 
with [lookups](../querying/lookups.md).
-- If you need to join two large distributed tables with each other, you must 
do this before loading the data into Druid.
+* If you need to join two large distributed tables with each other, you must 
do this before loading the data into Druid.
 Druid does not support query-time joins of two datasources. Lookups do not 
help here, since a full copy of each lookup
 table is stored on each Druid server, so they are not a good choice for large 
tables.
-- Consider whether you want to enable [rollup](#rollup) for pre-aggregation, 
or whether you want to disable
+* Consider whether you want to enable [rollup](#rollup) for pre-aggregation, 
or whether you want to disable
 rollup and load your existing data as-is. Rollup in Druid is similar to 
creating a summary table in a relational model.
 
 ### Time series model
@@ -93,21 +92,21 @@ sort by metric name, like timeseries databases often do. 
See [Partitioning and s
 
 Tips for modeling timeseries data in Druid:
 
-- Druid does not think of data points as being part of a "time series". 
Instead, Druid treats each point separately
+* Druid does not think of data points as being part of a "time series". 
Instead, Druid treats each point separately
 for ingestion and aggregation.
-- Create a dimension that indicates the name of the series that a data point 
belongs to. This dimension is often called
+* Create a dimension that indicates the name of the series that a data point 
belongs to. This dimension is often called
 "metric" or "name". Do not get the dimension named "metric" confused with the 
concept of Druid metrics. Place this
 first in the list of dimensions in your "dimensionsSpec" for best performance 
(this helps because it improves locality;
 see [partitioning and sorting](./partitioning.md) below for details).
-- Create other dimensions for attributes attached to your data points. These 
are often called "tags" in timeseries
+* Create other dimensions for attributes attached to your data points. These 
are often called "tags" in timeseries
 database systems.
-- Create [metrics](../querying/aggregations.md) corresponding to the types of 
aggregations that you want to be able
+* Create [metrics](../querying/aggregations.md) corresponding to the types of 
aggregations that you want to be able
 to query. Typically this includes "sum", "min", and "max" (in one of the long, 
float, or double flavors). If you want the ability
 to compute percentiles or quantiles, use Druid's [approximate 
aggregators](../querying/aggregations.md#approximate-aggregations).
-- Consider enabling [rollup](./rollup.md), which will allow Druid to 
potentially combine multiple points into one
+* Consider enabling [rollup](./rollup.md), which will allow Druid to 
potentially combine multiple points into one
 row in your Druid datasource. This can be useful if you want to store data at 
a different time granularity than it is
 naturally emitted. It is also useful if you want to combine timeseries and 
non-timeseries data in the same datasource.
-- If you don't know ahead of time what columns you'll want to ingest, use an 
empty dimensions list to trigger
+* If you don't know ahead of time what columns you'll want to ingest, use an 
empty dimensions list to trigger
 [automatic detection of dimension columns](#schema-less-dimensions).
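
Following the tips above, a `dimensionsSpec` for timeseries data might place the series-name dimension first (the dimension names here are illustrative):

```json
"dimensionsSpec": {
  "dimensions": [
    "metric",
    "host",
    "datacenter"
  ]
}
```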
 
 ### Log aggregation model
@@ -122,10 +121,10 @@ nested data.
 
 Tips for modeling log data in Druid:
 
-- If you don't know ahead of time what columns you'll want to ingest, use an 
empty dimensions list to trigger
+* If you don't know ahead of time what columns you'll want to ingest, use an 
empty dimensions list to trigger
 [automatic detection of dimension columns](#schema-less-dimensions).
-- If you have nested data, flatten it using a 
[`flattenSpec`](./ingestion-spec.md#flattenspec).
-- Consider enabling [rollup](./rollup.md) if you have mainly analytical use 
cases for your log data. This will
+* If you have nested data, flatten it using a 
[`flattenSpec`](./ingestion-spec.md#flattenspec).
+* Consider enabling [rollup](./rollup.md) if you have mainly analytical use 
cases for your log data. This will
 mean you lose the ability to retrieve individual events from Druid, but you 
potentially gain substantial compression and
 query performance boosts.
 
@@ -182,8 +181,6 @@ You may want to experiment to find the optimal choice for 
your use case.
 
 For details about how to configure numeric dimensions, see the 
[`dimensionsSpec`](./ingestion-spec.md#dimensionsspec) documentation.
 
-
-
 ### Secondary timestamps
 
 Druid schemas must always include a primary timestamp. The primary timestamp 
is used for
@@ -205,14 +202,14 @@ You can ingest and store nested JSON in a Druid column as 
a `COMPLEX<json>` data
 
 If you want to ingest nested data in a format other than JSON&mdash;for 
example Avro, ORC, and Parquet&mdash;you must use the `flattenSpec` object to 
flatten it. For example, if you have data of the following form:
 
-```
-{"foo":{"bar": 3}}
+```json
+{ "foo": { "bar": 3 } }
 ```
 
 then before indexing it, you should transform it to:
 
-```
-{"foo_bar": 3}
+```json
+{ "foo_bar": 3 }
 ```
 
 See the [`flattenSpec`](./ingestion-spec.md#flattenspec) documentation for 
more details.
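
As a sketch of the transformation above, a `flattenSpec` producing the `foo_bar` column might look like the following (see the linked documentation for the full set of options):

```json
"flattenSpec": {
  "useFieldDiscovery": true,
  "fields": [
    { "type": "path", "name": "foo_bar", "expr": "$.foo.bar" }
  ]
}
```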
@@ -231,27 +228,20 @@ the number of Druid rows for the time interval, which can 
be used to determine w
 
 To clarify with an example, if your ingestion spec contains:
 
-```
-...
-"metricsSpec" : [
-      {
-        "type" : "count",
-        "name" : "count"
-      },
-...
+```json
+"metricsSpec": [
+    { "type": "count", "name": "count" }
+]
 ```
 
 You should query for the number of ingested rows with:
 
-```
-...
+```json
 "aggregations": [
-    { "type": "longSum", "name": "numIngestedEvents", "fieldName": "count" },
-...
+    { "type": "longSum", "name": "numIngestedEvents", "fieldName": "count" }
+]
 ```
 
-
-
 ### Schema-less dimensions
 
 If the `dimensions` field is left empty in your ingestion spec, Druid will 
treat every column that is not the timestamp column,
@@ -268,14 +258,14 @@ some work at ETL time.
 
 As an example, for schema-less dimensions, repeat the same column:
 
-```
-{"device_id_dim":123, "device_id_met":123}
+```json
+{ "device_id_dim": 123, "device_id_met": 123 }
 ```
 
 and in your `metricsSpec`, include:
 
-```
-{ "type" : "hyperUnique", "name" : "devices", "fieldName" : "device_id_met" }
+```json
+{ "type": "hyperUnique", "name": "devices", "fieldName": "device_id_met" }
 ```
 
 `device_id_dim` should automatically get picked up as a dimension.
diff --git a/docs/ingestion/tasks.md b/docs/ingestion/tasks.md
index 1a6d58d063..c8a2e915d4 100644
--- a/docs/ingestion/tasks.md
+++ b/docs/ingestion/tasks.md
@@ -157,20 +157,20 @@ A description of the fields:
 The `ingestionStatsAndErrors` report provides information about row counts and 
errors.
 
 The `ingestionState` shows what step of ingestion the task reached. Possible 
states include:
-* `NOT_STARTED`: The task has not begun reading any rows
-* `DETERMINE_PARTITIONS`: The task is processing rows to determine partitioning
-* `BUILD_SEGMENTS`: The task is processing rows to construct segments
-* `COMPLETED`: The task has finished its work.
+- `NOT_STARTED`: The task has not begun reading any rows.
+- `DETERMINE_PARTITIONS`: The task is processing rows to determine partitioning.
+- `BUILD_SEGMENTS`: The task is processing rows to construct segments.
+- `COMPLETED`: The task has finished its work.
 
 Only batch tasks have the `DETERMINE_PARTITIONS` phase. Realtime tasks, such as 
those created by the Kafka indexing service, do not have a `DETERMINE_PARTITIONS` 
phase.
 
 `unparseableEvents` contains lists of exception messages that were caused by 
unparseable inputs. This can help with identifying problematic input rows. 
There will be one list each for the DETERMINE_PARTITIONS and BUILD_SEGMENTS 
phases. Note that the Hadoop batch task does not support saving of unparseable 
events.
 
 The `rowStats` map contains information about row counts. There is one entry 
for each ingestion phase. The definitions of the different row counts are shown 
below:
-* `processed`: Number of rows successfully ingested without parsing errors
-* `processedWithError`: Number of rows that were ingested, but contained a 
parsing error within one or more columns. This typically occurs where input 
rows have a parseable structure but invalid types for columns, such as passing 
in a non-numeric String value for a numeric column.
-* `thrownAway`: Number of rows skipped. This includes rows with timestamps 
that were outside of the ingestion task's defined time interval and rows that 
were filtered out with a [`transformSpec`](./ingestion-spec.md#transformspec), 
but doesn't include the rows skipped by explicit user configurations. For 
example, the rows skipped by `skipHeaderRows` or `hasHeaderRow` in the CSV 
format are not counted.
-* `unparseable`: Number of rows that could not be parsed at all and were 
discarded. This tracks input rows without a parseable structure, such as 
passing in non-JSON data when using a JSON parser.
+- `processed`: Number of rows successfully ingested without parsing errors
+- `processedWithError`: Number of rows that were ingested, but contained a 
parsing error within one or more columns. This typically occurs when input 
rows have a parseable structure but invalid types for columns, such as passing 
in a non-numeric String value for a numeric column.
+- `thrownAway`: Number of rows skipped. This includes rows with timestamps 
that were outside of the ingestion task's defined time interval and rows that 
were filtered out with a [`transformSpec`](./ingestion-spec.md#transformspec), 
but doesn't include the rows skipped by explicit user configurations. For 
example, the rows skipped by `skipHeaderRows` or `hasHeaderRow` in the CSV 
format are not counted.
+- `unparseable`: Number of rows that could not be parsed at all and were 
discarded. This tracks input rows without a parseable structure, such as 
passing in non-JSON data when using a JSON parser.
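
The per-phase counts above can be totaled from a report's `rowStats` map, sketched here with illustrative numbers and phase keys (the report structure is assumed):

```python
# Hypothetical rowStats map, one entry per ingestion phase as described above.
row_stats = {
    "determinePartitions": {"processed": 0, "processedWithError": 0,
                            "thrownAway": 0, "unparseable": 0},
    "buildSegments": {"processed": 5390324, "processedWithError": 4,
                      "thrownAway": 1, "unparseable": 5},
}

build = row_stats["buildSegments"]
total_rows_seen = sum(build.values())  # all rows the phase read, good or bad
print(total_rows_seen)  # 5390334
```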
 
 The `errorMsg` field shows a message describing the error that caused a task 
to fail. It will be null if the task was successful.
 
@@ -257,7 +257,7 @@ These overshadowed segments are not considered in query 
processing to filter out
 Each segment has a _major_ version and a _minor_ version. The major version is
 represented as a timestamp in the format of 
[`"yyyy-MM-dd'T'hh:mm:ss"`](https://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat)
 while the minor version is an integer number. These major and minor versions
-are used to determine the overshadow relation between segments as seen below. 
+are used to determine the overshadow relation between segments as seen below.
 
 A segment `s1` overshadows another `s2` if
 
@@ -363,7 +363,6 @@ The following parameters apply to all task types.
 |`useLineageBasedSegmentAllocation`|false in 0.21 or earlier, true in 0.22 or 
later|Enable the new lineage-based segment allocation protocol for the native 
Parallel task with dynamic partitioning. This option should be off during a 
rolling upgrade from a Druid version between 0.19 and 0.21 to Druid 0.22 or 
higher. Once the upgrade is done, it must be set to true to ensure data correctness.|
 |`storeEmptyColumns`|true|Boolean value for whether or not to store empty 
columns during ingestion. When set to true, Druid stores every column specified 
in the [`dimensionsSpec`](ingestion-spec.md#dimensionsspec). If you use 
schemaless ingestion and don't specify any dimensions to ingest, you must also 
set [`includeAllDimensions`](ingestion-spec.md#dimensionsspec) for Druid to 
store empty columns.<br/><br/>If you set `storeEmptyColumns` to false, Druid 
SQL queries referencing empty colu [...]
 
-
 ## Task logs
 
 Logs are created by ingestion tasks as they run. You can configure Druid to 
push these into a repository for long-term storage after they complete.
@@ -389,7 +388,6 @@ You can configure retention periods for logs in 
milliseconds by setting `druid.i
 
 > Automatic log file deletion typically works based on the log file's 
 > 'modified' timestamp in the back-end store.  Large clock skews between Druid 
 > processes and the long-term store might result in unintended behavior.
 
-
 ## All task types
 
 ### `index_parallel`
diff --git a/docs/multi-stage-query/api.md b/docs/multi-stage-query/api.md
index b81370c42a..81ffecfefd 100644
--- a/docs/multi-stage-query/api.md
+++ b/docs/multi-stage-query/api.md
@@ -110,7 +110,6 @@ print(response.text)
 
 <!--END_DOCUSAURUS_CODE_TABS-->
 
-
 #### Response
 
 ```json
@@ -122,15 +121,14 @@ print(response.text)
 
 **Response fields**
 
-|Field|Description|
-|-----|-----------|
-| taskId | Controller task ID. You can use Druid's standard [task 
APIs](../operations/api-reference.md#overlord) to interact with this controller 
task.|
-| state | Initial state for the query, which is "RUNNING".|
-
+| Field | Description |
+|---|---|
+| `taskId` | Controller task ID. You can use Druid's standard [task 
APIs](../operations/api-reference.md#overlord) to interact with this controller 
task. |
+| `state` | Initial state for the query, which is "RUNNING". |
 
 ## Get the status for a query task
 
-You can retrieve status of a query to see if it is still running, completed 
successfully, failed, or got canceled. 
+You can retrieve the status of a query to see if it is still running, completed 
successfully, failed, or was canceled.
 
 #### Request
 
@@ -238,7 +236,6 @@ response = requests.request("GET", url, headers=headers)
 print(response.text)
 ```
 
-
 <!--END_DOCUSAURUS_CODE_TABS-->
 
 #### Response
@@ -512,7 +509,7 @@ The response shows an example report for a query.
                 "0": 1,
                 "1": 1,
                 "2": 1
-              }, 
+              },
               "totalMergersForUltimateLevel": 1,
               "progressDigest": 1
             }
@@ -546,37 +543,37 @@ The response shows an example report for a query.
 
 The following table describes the response fields when you retrieve a report 
for a MSQ task engine using the `/druid/indexer/v1/task/<taskId>/reports` 
endpoint:
 
-|Field|Description|
-|-----|-----------|
-|multiStageQuery.taskId|Controller task ID.|
-|multiStageQuery.payload.status|Query status container.|
-|multiStageQuery.payload.status.status|RUNNING, SUCCESS, or FAILED.|
-|multiStageQuery.payload.status.startTime|Start time of the query in ISO 
format. Only present if the query has started running.|
-|multiStageQuery.payload.status.durationMs|Milliseconds elapsed after the 
query has started running. -1 denotes that the query hasn't started running 
yet.|
-|multiStageQuery.payload.status.pendingTasks|Number of tasks that are not 
fully started. -1 denotes that the number is currently unknown.|
-|multiStageQuery.payload.status.runningTasks|Number of currently running 
tasks. Should be at least 1 since the controller is included.|
-|multiStageQuery.payload.status.errorReport|Error object. Only present if 
there was an error.|
-|multiStageQuery.payload.status.errorReport.taskId|The task that reported the 
error, if known. May be a controller task or a worker task.|
-|multiStageQuery.payload.status.errorReport.host|The hostname and port of the 
task that reported the error, if known.|
-|multiStageQuery.payload.status.errorReport.stageNumber|The stage number that 
reported the error, if it happened during execution of a specific stage.|
-|multiStageQuery.payload.status.errorReport.error|Error object. Contains 
`errorCode` at a minimum, and may contain other fields as described in the 
[error code table](./reference.md#error-codes). Always present if there is an 
error.|
-|multiStageQuery.payload.status.errorReport.error.errorCode|One of the error 
codes from the [error code table](./reference.md#error-codes). Always present 
if there is an error.|
-|multiStageQuery.payload.status.errorReport.error.errorMessage|User-friendly 
error message. Not always present, even if there is an error.|
-|multiStageQuery.payload.status.errorReport.exceptionStackTrace|Java stack 
trace in string form, if the error was due to a server-side exception.|
-|multiStageQuery.payload.stages|Array of query stages.|
-|multiStageQuery.payload.stages[].stageNumber|Each stage has a number that 
differentiates it from other stages.|
-|multiStageQuery.payload.stages[].phase|Either NEW, READING_INPUT, 
POST_READING, RESULTS_COMPLETE, or FAILED. Only present if the stage has 
started.|
-|multiStageQuery.payload.stages[].workerCount|Number of parallel tasks that 
this stage is running on. Only present if the stage has started.|
-|multiStageQuery.payload.stages[].partitionCount|Number of output partitions 
generated by this stage. Only present if the stage has started and has computed 
its number of output partitions.|
-|multiStageQuery.payload.stages[].startTime|Start time of this stage. Only 
present if the stage has started.|
-|multiStageQuery.payload.stages[].duration|The number of milliseconds that the 
stage has been running. Only present if the stage has started.|
-|multiStageQuery.payload.stages[].sort|A boolean that is set to `true` if the 
stage does a sort as part of its execution.|
-|multiStageQuery.payload.stages[].definition|The object defining what the 
stage does.|
-|multiStageQuery.payload.stages[].definition.id|The unique identifier of the 
stage.|
-|multiStageQuery.payload.stages[].definition.input|Array of inputs that the 
stage has.|
-|multiStageQuery.payload.stages[].definition.broadcast|Array of input indexes 
that get broadcasted. Only present if there are inputs that get broadcasted.|
-|multiStageQuery.payload.stages[].definition.processor|An object defining the 
processor logic.|
-|multiStageQuery.payload.stages[].definition.signature|The output signature of 
the stage.|
+| Field | Description |
+|---|---|
+| `multiStageQuery.taskId` | Controller task ID. |
+| `multiStageQuery.payload.status` | Query status container. |
+| `multiStageQuery.payload.status.status` | `RUNNING`, `SUCCESS`, or `FAILED`. |
+| `multiStageQuery.payload.status.startTime` | Start time of the query in ISO 
format. Only present if the query has started running. |
+| `multiStageQuery.payload.status.durationMs` | Milliseconds elapsed after the 
query has started running. -1 denotes that the query hasn't started running 
yet. |
+| `multiStageQuery.payload.status.pendingTasks` | Number of tasks that are not 
fully started. -1 denotes that the number is currently unknown. |
+| `multiStageQuery.payload.status.runningTasks` | Number of currently running 
tasks. Should be at least 1 since the controller is included. |
+| `multiStageQuery.payload.status.errorReport` | Error object. Only present if 
there was an error. |
+| `multiStageQuery.payload.status.errorReport.taskId` | The task that reported 
the error, if known. May be a controller task or a worker task. |
+| `multiStageQuery.payload.status.errorReport.host` | The hostname and port of 
the task that reported the error, if known. |
+| `multiStageQuery.payload.status.errorReport.stageNumber` | The stage number 
that reported the error, if it happened during execution of a specific stage. |
+| `multiStageQuery.payload.status.errorReport.error` | Error object. Contains 
`errorCode` at a minimum, and may contain other fields as described in the 
[error code table](./reference.md#error-codes). Always present if there is an 
error. |
+| `multiStageQuery.payload.status.errorReport.error.errorCode` | One of the 
error codes from the [error code table](./reference.md#error-codes). Always 
present if there is an error. |
+| `multiStageQuery.payload.status.errorReport.error.errorMessage` | 
User-friendly error message. Not always present, even if there is an error. |
+| `multiStageQuery.payload.status.errorReport.exceptionStackTrace` | Java 
stack trace in string form, if the error was due to a server-side exception. |
+| `multiStageQuery.payload.stages` | Array of query stages. |
+| `multiStageQuery.payload.stages[].stageNumber` | Each stage has a number 
that differentiates it from other stages. |
+| `multiStageQuery.payload.stages[].phase` | Either NEW, READING_INPUT, 
POST_READING, RESULTS_COMPLETE, or FAILED. Only present if the stage has 
started. |
+| `multiStageQuery.payload.stages[].workerCount` | Number of parallel tasks 
that this stage is running on. Only present if the stage has started. |
+| `multiStageQuery.payload.stages[].partitionCount` | Number of output 
partitions generated by this stage. Only present if the stage has started and 
has computed its number of output partitions. |
+| `multiStageQuery.payload.stages[].startTime` | Start time of this stage. 
Only present if the stage has started. |
+| `multiStageQuery.payload.stages[].duration` | The number of milliseconds 
that the stage has been running. Only present if the stage has started. |
+| `multiStageQuery.payload.stages[].sort` | A boolean that is set to `true` if 
the stage does a sort as part of its execution. |
+| `multiStageQuery.payload.stages[].definition` | The object defining what the 
stage does. |
+| `multiStageQuery.payload.stages[].definition.id` | The unique identifier of 
the stage. |
+| `multiStageQuery.payload.stages[].definition.input` | Array of inputs that 
the stage has. |
+| `multiStageQuery.payload.stages[].definition.broadcast` | Array of input 
indexes that get broadcasted. Only present if there are inputs that get 
broadcasted. |
+| `multiStageQuery.payload.stages[].definition.processor` | An object defining 
the processor logic. |
+| `multiStageQuery.payload.stages[].definition.signature` | The output 
signature of the stage. |
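The report fields above can be consumed programmatically. Below is a hedged sketch in Python that pulls the overall status and error code out of a report shaped like these fields; the sample payload is illustrative, not a real controller response.

```python
import json

# Illustrative task report, shaped like the fields documented above.
sample_report = json.loads("""
{
  "multiStageQuery": {
    "taskId": "query-abc123",
    "payload": {
      "status": {
        "status": "FAILED",
        "startTime": "2022-11-10T15:33:04.000Z",
        "durationMs": 12345,
        "errorReport": {
          "taskId": "query-abc123-worker0",
          "error": {
            "errorCode": "TooManyWarnings",
            "errorMessage": "Too many warnings of type CannotParseExternalData"
          }
        }
      }
    }
  }
}
""")

def summarize(report):
    """Return (status, errorCode or None) from a task report dict."""
    status = report["multiStageQuery"]["payload"]["status"]
    error = status.get("errorReport", {}).get("error", {})
    return status["status"], error.get("errorCode")

state, code = summarize(sample_report)
print(state, code)  # FAILED TooManyWarnings
```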
 
 ## Cancel a query task
 
diff --git a/docs/multi-stage-query/concepts.md 
b/docs/multi-stage-query/concepts.md
index ae64478243..44e5ea43d4 100644
--- a/docs/multi-stage-query/concepts.md
+++ b/docs/multi-stage-query/concepts.md
@@ -79,7 +79,7 @@ publishes them at the end of its run. For this reason, it is 
best suited to load
 INSERT statements to load data in a sequence of microbatches; for that, use 
[streaming
 ingestion](../ingestion/index.md#streaming) instead.
 
-When deciding whether to use REPLACE or INSERT, keep in mind that segments 
generated with REPLACE can be pruned with dimension-based pruning but those 
generated with INSERT cannot. For more information about the requirements for 
dimension-based pruning, see [Clustering](#clustering). 
+When deciding whether to use REPLACE or INSERT, keep in mind that segments 
generated with REPLACE can be pruned with dimension-based pruning but those 
generated with INSERT cannot. For more information about the requirements for 
dimension-based pruning, see [Clustering](#clustering).
 
 For more information about the syntax, see [INSERT](./reference.md#insert).
 
@@ -211,19 +211,19 @@ For an example, see [INSERT with rollup 
example](examples.md#insert-with-rollup)
 When you execute a SQL statement using the task endpoint 
[`/druid/v2/sql/task`](api.md#submit-a-query), the following
 happens:
 
-1.  The Broker plans your SQL query into a native query, as usual.
+1. The Broker plans your SQL query into a native query, as usual.
 
-2.  The Broker wraps the native query into a task of type `query_controller`
+2. The Broker wraps the native query into a task of type `query_controller`
     and submits it to the indexing service.
 
 3. The Broker returns the task ID to you and exits.
 
-4.  The controller task launches some number of worker tasks determined by
+4. The controller task launches some number of worker tasks determined by
     the `maxNumTasks` and `taskAssignment` [context 
parameters](./reference.md#context-parameters). You can set these settings 
individually for each query.
 
-5.  Worker tasks of type `query_worker` execute the query.
+5. Worker tasks of type `query_worker` execute the query.
 
-6.  If the query is a SELECT query, the worker tasks send the results
+6. If the query is a SELECT query, the worker tasks send the results
     back to the controller task, which writes them into its task report.
     If the query is an INSERT or REPLACE query, the worker tasks generate and
     publish new Druid segments to the provided datasource.
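A hedged sketch of the request body such a submission might carry (the datasource, query text, and context values are illustrative, not defaults):

```python
import json

# Request body you might POST to the Broker's /druid/v2/sql/task endpoint.
request_body = {
    "query": "INSERT INTO w000 SELECT * FROM my_source PARTITIONED BY DAY",
    "context": {
        "maxNumTasks": 3,         # one controller plus up to two workers
        "taskAssignment": "auto"  # size task counts from the input, up to the max
    },
}

payload = json.dumps(request_body)
print(payload)
# A successful submission returns the controller task ID, for example:
# {"taskId": "query-...", "state": "RUNNING"}
```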
@@ -287,4 +287,4 @@ Worker tasks use local disk for four purposes:
 
 Workers use the task working directory, given by
 
[`druid.indexer.task.baseDir`](../configuration/index.md#additional-peon-configuration),
 for these items. It is
-important that this directory has enough space available for these purposes. 
+important that this directory has enough space available for these purposes.
diff --git a/docs/multi-stage-query/examples.md 
b/docs/multi-stage-query/examples.md
index eed42fec4b..ba05c4e362 100644
--- a/docs/multi-stage-query/examples.md
+++ b/docs/multi-stage-query/examples.md
@@ -145,7 +145,6 @@ CLUSTERED BY page
 
 </details>
 
-
 ## INSERT with JOIN
 
 This example inserts data into a table named `w003` and joins data from two 
sources:
@@ -302,10 +301,8 @@ CLUSTERED BY page
 
 ## SELECT with EXTERN and JOIN
 
-
 <details><summary>Show the query</summary>
 
-
 ```sql
 WITH flights AS (
   SELECT * FROM TABLE(
diff --git a/docs/multi-stage-query/index.md b/docs/multi-stage-query/index.md
index f3fe703510..81d40bd8df 100644
--- a/docs/multi-stage-query/index.md
+++ b/docs/multi-stage-query/index.md
@@ -69,6 +69,6 @@ To use [EXTERN](reference.md#extern), you need READ 
permission on the resource n
 
 ## Next steps
 
-* [Read about key concepts](./concepts.md) to learn more about how SQL-based 
ingestion and multi-stage queries work.
-* [Check out the examples](./examples.md) to see SQL-based ingestion in action.
-* [Explore the Query view](../operations/web-console.md) to get started in the 
web console.
+- [Read about key concepts](./concepts.md) to learn more about how SQL-based 
ingestion and multi-stage queries work.
+- [Check out the examples](./examples.md) to see SQL-based ingestion in action.
+- [Explore the Query view](../operations/web-console.md) to get started in the 
web console.
diff --git a/docs/multi-stage-query/reference.md 
b/docs/multi-stage-query/reference.md
index d500329f7a..a85a8e785b 100644
--- a/docs/multi-stage-query/reference.md
+++ b/docs/multi-stage-query/reference.md
@@ -52,9 +52,9 @@ FROM TABLE(
 
 EXTERN consists of the following parts:
 
-1.  Any [Druid input source](../ingestion/native-batch-input-source.md) as a 
JSON-encoded string.
-2.  Any [Druid input format](../ingestion/data-formats.md) as a JSON-encoded 
string.
-3.  A row signature, as a JSON-encoded array of column descriptors. Each 
column descriptor must have a `name` and a `type`. The type can be `string`, 
`long`, `double`, or `float`. This row signature is used to map the external 
data into the SQL layer.
+1. Any [Druid input source](../ingestion/native-batch-input-source.md) as a 
JSON-encoded string.
+2. Any [Druid input format](../ingestion/data-formats.md) as a JSON-encoded 
string.
+3. A row signature, as a JSON-encoded array of column descriptors. Each column 
descriptor must have a `name` and a `type`. The type can be `string`, `long`, 
`double`, or `float`. This row signature is used to map the external data into 
the SQL layer.
 
 For more information, see [Read external data with EXTERN](concepts.md#extern).
 
@@ -122,10 +122,10 @@ REPLACE consists of the following parts:
 1. Optional [context parameters](./reference.md#context-parameters).
 2. A `REPLACE INTO <dataSource>` clause at the start of your query, such as 
`REPLACE INTO "your-table".`
 3. An OVERWRITE clause after the datasource, either OVERWRITE ALL or OVERWRITE 
WHERE:
-  - OVERWRITE ALL replaces the entire existing datasource with the results of 
the query.
-  - OVERWRITE WHERE drops the time segments that match the condition you set. 
Conditions are based on the `__time`
-    column and use the format `__time [< > = <= >=] TIMESTAMP`. Use them with 
AND, OR, and NOT between them, inclusive
-    of the timestamps specified. No other expressions or functions are valid 
in OVERWRITE.
+    - OVERWRITE ALL replaces the entire existing datasource with the results 
of the query.
+    - OVERWRITE WHERE drops the time segments that match the condition you 
set. Conditions are based on the `__time`
+        column and use the format `__time [< > = <= >=] TIMESTAMP`. Use them 
with AND, OR, and NOT between them, inclusive
+        of the timestamps specified. No other expressions or functions are 
valid in OVERWRITE.
 4. A clause for the actual data you want to use for the replacement.
 5. A [PARTITIONED BY](#partitioned-by) clause, such as `PARTITIONED BY DAY`.
 6. An optional [CLUSTERED BY](#clustered-by) clause.
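Putting these parts together, a minimal sketch of a REPLACE statement using OVERWRITE WHERE might look like the following (the datasource and column names are placeholders):

```sql
REPLACE INTO "your-table"
OVERWRITE WHERE __time >= TIMESTAMP '2022-01-01' AND __time < TIMESTAMP '2022-02-01'
SELECT __time, page, added
FROM "your-table"
PARTITIONED BY DAY
```

This drops the January 2022 time chunks from `your-table` and replaces them with the query results.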
@@ -175,7 +175,7 @@ For more information about clustering, see 
[Clustering](concepts.md#clustering).
 
 ## Context parameters
 
-In addition to the Druid SQL [context 
parameters](../querying/sql-query-context.md), the multi-stage query task 
engine accepts certain context parameters that are specific to it. 
+In addition to the Druid SQL [context 
parameters](../querying/sql-query-context.md), the multi-stage query task 
engine accepts certain context parameters that are specific to it.
 
 Use context parameters alongside your queries to customize the behavior of the 
query. If you're using the API, include the context parameters in the query 
context when you submit a query:
 
@@ -193,16 +193,16 @@ If you're using the web console, you can specify the 
context parameters through
 
 The following table lists the context parameters for the MSQ task engine:
 
-|Parameter|Description|Default value|
-|---------|-----------|-------------|
-| maxNumTasks | SELECT, INSERT, REPLACE<br /><br />The maximum total number of 
tasks to launch, including the controller task. The lowest possible value for 
this setting is 2: one controller and one worker. All tasks must be able to 
launch simultaneously. If they cannot, the query returns a `TaskStartTimeout` 
error code after approximately 10 minutes.<br /><br />May also be provided as 
`numTasks`. If both are present, `maxNumTasks` takes priority.| 2 |
-| taskAssignment | SELECT, INSERT, REPLACE<br /><br />Determines how many 
tasks to use. Possible values include: <ul><li>`max`: Uses as many tasks as 
possible, up to `maxNumTasks`.</li><li>`auto`: When file sizes can be 
determined through directory listing (for example: local files, S3, GCS, HDFS) 
uses as few tasks as possible without exceeding 10 GiB or 10,000 files per 
task, unless exceeding these limits is necessary to stay within `maxNumTasks`. 
When file sizes cannot be determined th [...]
-| finalizeAggregations | SELECT, INSERT, REPLACE<br /><br />Determines the 
type of aggregation to return. If true, Druid finalizes the results of complex 
aggregations that directly appear in query results. If false, Druid returns the 
aggregation's intermediate type rather than finalized type. This parameter is 
useful during ingestion, where it enables storing sketches directly in Druid 
tables. For more information about aggregations, see [SQL aggregation 
functions](../querying/sql-aggreg [...]
-| rowsInMemory | INSERT or REPLACE<br /><br />Maximum number of rows to store 
in memory at once before flushing to disk during the segment generation 
process. Ignored for non-INSERT queries. In most cases, use the default value. 
You may need to override the default if you run into one of the [known 
issues](./known-issues.md) around memory usage. | 100,000 |
-| segmentSortOrder | INSERT or REPLACE<br /><br />Normally, Druid sorts rows 
in individual segments using `__time` first, followed by the [CLUSTERED 
BY](#clustered-by) clause. When you set `segmentSortOrder`, Druid sorts rows in 
segments using this column list first, followed by the CLUSTERED BY order.<br 
/><br />You provide the column list as comma-separated values or as a JSON 
array in string form. If your query includes `__time`, then this list must 
begin with `__time`. For example, c [...]
-| maxParseExceptions| SELECT, INSERT, REPLACE<br /><br />Maximum number of 
parse exceptions that are ignored while executing the query before it stops 
with `TooManyWarningsFault`. To ignore all the parse exceptions, set the value 
to -1.| 0 |
-| rowsPerSegment | INSERT or REPLACE<br /><br />The number of rows per segment 
to target. The actual number of rows per segment may be somewhat higher or 
lower than this number. In most cases, use the default. For general information 
about sizing rows per segment, see [Segment Size 
Optimization](../operations/segment-optimization.md). | 3,000,000 |
-| indexSpec | INSERT or REPLACE<br /><br />An 
[`indexSpec`](../ingestion/ingestion-spec.md#indexspec) to use when generating 
segments. May be a JSON string or object. | See 
[`indexSpec`](../ingestion/ingestion-spec.md#indexspec). |
+| Parameter | Description | Default value |
+|---|---|---|
+| `maxNumTasks` | SELECT, INSERT, REPLACE<br /><br />The maximum total number 
of tasks to launch, including the controller task. The lowest possible value 
for this setting is 2: one controller and one worker. All tasks must be able to 
launch simultaneously. If they cannot, the query returns a `TaskStartTimeout` 
error code after approximately 10 minutes.<br /><br />May also be provided as 
`numTasks`. If both are present, `maxNumTasks` takes priority.| 2 |
+| `taskAssignment` | SELECT, INSERT, REPLACE<br /><br />Determines how many 
tasks to use. Possible values include: <ul><li>`max`: Uses as many tasks as 
possible, up to `maxNumTasks`.</li><li>`auto`: When file sizes can be 
determined through directory listing (for example: local files, S3, GCS, HDFS) 
uses as few tasks as possible without exceeding 10 GiB or 10,000 files per 
task, unless exceeding these limits is necessary to stay within `maxNumTasks`. 
When file sizes cannot be determined  [...]
+| `finalizeAggregations` | SELECT, INSERT, REPLACE<br /><br />Determines the 
type of aggregation to return. If true, Druid finalizes the results of complex 
aggregations that directly appear in query results. If false, Druid returns the 
aggregation's intermediate type rather than finalized type. This parameter is 
useful during ingestion, where it enables storing sketches directly in Druid 
tables. For more information about aggregations, see [SQL aggregation 
functions](../querying/sql-aggr [...]
+| `rowsInMemory` | INSERT or REPLACE<br /><br />Maximum number of rows to 
store in memory at once before flushing to disk during the segment generation 
process. Ignored for non-INSERT queries. In most cases, use the default value. 
You may need to override the default if you run into one of the [known 
issues](./known-issues.md) around memory usage. | 100,000 |
+| `segmentSortOrder` | INSERT or REPLACE<br /><br />Normally, Druid sorts rows 
in individual segments using `__time` first, followed by the [CLUSTERED 
BY](#clustered-by) clause. When you set `segmentSortOrder`, Druid sorts rows in 
segments using this column list first, followed by the CLUSTERED BY order.<br 
/><br />You provide the column list as comma-separated values or as a JSON 
array in string form. If your query includes `__time`, then this list must 
begin with `__time`. For example, [...]
+| `maxParseExceptions`| SELECT, INSERT, REPLACE<br /><br />Maximum number of 
parse exceptions that are ignored while executing the query before it stops 
with `TooManyWarningsFault`. To ignore all the parse exceptions, set the value 
to -1.| 0 |
+| `rowsPerSegment` | INSERT or REPLACE<br /><br />The number of rows per 
segment to target. The actual number of rows per segment may be somewhat higher 
or lower than this number. In most cases, use the default. For general 
information about sizing rows per segment, see [Segment Size 
Optimization](../operations/segment-optimization.md). | 3,000,000 |
+| `indexSpec` | INSERT or REPLACE<br /><br />An 
[`indexSpec`](../ingestion/ingestion-spec.md#indexspec) to use when generating 
segments. May be a JSON string or object. | See 
[`indexSpec`](../ingestion/ingestion-spec.md#indexspec). |
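For example, a query context that sets several of these parameters might look like the following (the values are illustrative, not recommendations):

```json
{
  "maxNumTasks": 4,
  "finalizeAggregations": false,
  "segmentSortOrder": "__time,page",
  "rowsPerSegment": 3000000
}
```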
 
 ## Durable storage
 
 This section enumerates the advantages and performance implications of enabling durable storage while executing MSQ tasks.
@@ -225,8 +225,8 @@ Knowing the limits for the MSQ task engine can help you 
troubleshoot any [errors
 
 The following table lists query limits:
 
-|Limit|Value|Error if exceeded|
-|-----|-----|-----------------|
+| Limit | Value | Error if exceeded |
+|---|---|---|
 | Size of an individual row written to a frame. Row size when written to a 
frame may differ from the original row size. | 1 MB | `RowTooLarge` |
 | Number of segment-granular time chunks encountered during ingestion. | 5,000 
| `TooManyBuckets` |
 | Number of input files/segments per worker. | 10,000 | `TooManyInputFiles` |
@@ -241,32 +241,31 @@ The following table lists query limits:
 
 The following table describes error codes you may encounter in the 
`multiStageQuery.payload.status.errorReport.error.errorCode` field:
 
-|Code|Meaning|Additional fields|
-|----|-----------|----|
-|  BroadcastTablesTooLarge  | The size of the broadcast tables, used in right 
hand side of the joins, exceeded the memory reserved for them in a worker 
task.<br /><br />Try increasing the peon memory or reducing the size of the 
broadcast tables. | `maxBroadcastTablesSize`: Memory reserved for the broadcast 
tables, measured in bytes. |
-|  Canceled  |  The query was canceled. Common reasons for cancellation:<br 
/><br /><ul><li>User-initiated shutdown of the controller task via the 
`/druid/indexer/v1/task/{taskId}/shutdown` API.</li><li>Restart or failure of 
the server process that was running the controller task.</li></ul>|    |
-|  CannotParseExternalData |  A worker task could not parse data from an 
external datasource.  |    |
-|  ColumnNameRestricted|  The query uses a restricted column name.  |    |
-|  ColumnTypeNotSupported|  Support for writing or reading from a particular 
column type is not supported. |    |
-|  ColumnTypeNotSupported | The query attempted to use a column type that is 
not supported by the frame format. This occurs with ARRAY types, which are not 
yet implemented for frames.  | `columnName`<br /> <br />`columnType`   |
-|  InsertCannotAllocateSegment |  The controller task could not allocate a new 
segment ID due to conflict with existing segments or pending segments. Common 
reasons for such conflicts:<br /> <br /><ul><li>Attempting to mix different 
granularities in the same intervals of the same datasource.</li><li>Prior 
ingestions that used non-extendable shard specs.</li></ul>| `dataSource`<br /> 
<br />`interval`: The interval for the attempted new segment allocation.  |
-|  InsertCannotBeEmpty |  An INSERT or REPLACE query did not generate any 
output rows in a situation where output rows are required for success. This can 
happen for INSERT or REPLACE queries with `PARTITIONED BY` set to something 
other than `ALL` or `ALL TIME`.  |  `dataSource`  |
-|  InsertCannotOrderByDescending  |  An INSERT query contained a `CLUSTERED 
BY` expression in descending order. Druid's segment generation code only 
supports ascending order.  |   `columnName` |
-|  InsertCannotReplaceExistingSegment |  A REPLACE query cannot proceed 
because an existing segment partially overlaps those bounds, and the portion 
within the bounds is not fully overshadowed by query results. <br /> <br 
/>There are two ways to address this without modifying your 
query:<ul><li>Shrink the OVERLAP filter to match the query 
results.</li><li>Expand the OVERLAP filter to fully contain the existing 
segment.</li></ul>| `segmentId`: The existing segment <br /> 
-|  InsertLockPreempted  | An INSERT or REPLACE query was canceled by a 
higher-priority ingestion job, such as a real-time ingestion task.  | |
-|  InsertTimeNull  | An INSERT or REPLACE query encountered a null timestamp 
in the `__time` field.<br /><br />This can happen due to using an expression 
like `TIME_PARSE(timestamp) AS __time` with a timestamp that cannot be parsed. 
(TIME_PARSE returns null when it cannot parse a timestamp.) In this case, try 
parsing your timestamps using a different function or pattern.<br /><br />If 
your timestamps may genuinely be null, consider using COALESCE to provide a 
default value. One option is [...]
-| InsertTimeOutOfBounds  |  A REPLACE query generated a timestamp outside the 
bounds of the TIMESTAMP parameter for your OVERWRITE WHERE clause.<br /> <br 
/>To avoid this error, verify that the   you specified is valid.  |  
`interval`: time chunk interval corresponding to the out-of-bounds timestamp  |
-|  InvalidNullByte  | A string column included a null byte. Null bytes in 
strings are not permitted. |  `column`: The column that included the null byte |
-| QueryNotSupported   | QueryKit could not translate the provided native query 
to a multi-stage query.<br /> <br />This can happen if the query uses features 
that aren't supported, like GROUPING SETS. |    |
-|  RowTooLarge  |  The query tried to process a row that was too large to 
write to a single frame. See the [Limits](#limits) table for the specific limit 
on frame size. Note that the effective maximum row size is smaller than the 
maximum frame size due to alignment considerations during frame writing.  |   
`maxFrameSize`: The limit on the frame size. |
-|  TaskStartTimeout  | Unable to launch all the worker tasks in time. <br /> 
<br />There might be insufficient available slots to start all the worker tasks 
simultaneously.<br /> <br /> Try splitting up the query into smaller chunks 
with lesser `maxNumTasks` number. Another option is to increase capacity.  | |
-|  TooManyBuckets  |  Exceeded the number of partition buckets for a stage. 
Partition buckets are only used for `segmentGranularity` during INSERT queries. 
The most common reason for this error is that your `segmentGranularity` is too 
narrow relative to the data. See the [Limits](#limits) table for the specific 
limit.  |  `maxBuckets`: The limit on buckets.  |
-| TooManyInputFiles | Exceeded the number of input files/segments per worker. 
See the [Limits](#limits) table for the specific limit. | `umInputFiles`: The 
total number of input files/segments for the stage.<br /><br />`maxInputFiles`: 
The maximum number of input files/segments per worker per stage.<br /><br 
/>`minNumWorker`: The minimum number of workers required for a successful run. |
-|  TooManyPartitions   |  Exceeded the number of partitions for a stage. The 
most common reason for this is that the final stage of an INSERT or REPLACE 
query generated too many segments. See the [Limits](#limits) table for the 
specific limit.  | `maxPartitions`: The limit on partitions which was exceeded  
  |
-|  TooManyColumns |  Exceeded the number of columns for a stage. See the 
[Limits](#limits) table for the specific limit.  | `maxColumns`: The limit on 
columns which was exceeded.  |
-|  TooManyWarnings |  Exceeded the allowed number of warnings of a particular 
type. | `rootErrorCode`: The error code corresponding to the exception that 
exceeded the required limit. <br /><br />`maxWarnings`: Maximum number of 
warnings that are allowed for the corresponding `rootErrorCode`.   |
-|  TooManyWorkers |  Exceeded the supported number of workers running 
simultaneously. See the [Limits](#limits) table for the specific limit.  | 
`workers`: The number of simultaneously running workers that exceeded a hard or 
soft limit. This may be larger than the number of workers in any one stage if 
multiple stages are running simultaneously. <br /><br />`maxWorkers`: The hard 
or soft limit on workers that was exceeded.  |
-|  NotEnoughMemory  |  Insufficient memory to launch a stage.  |  
`serverMemory`: The amount of memory available to a single process.<br /><br 
/>`usableMemory`: The amount of server memory usable by query-related work. 
Excludes space taken by lookups plus an additional 25% overhead.<br /><br 
/>`serverWorkers`: The number of workers running in a single process.<br /><br 
/>`serverThreads`: The number of threads in a single process.  |
-|  WorkerFailed  |  A worker task failed unexpectedly.  |  `workerTaskId`: The 
ID of the worker task.  |
-|  WorkerRpcFailed  |  A remote procedure call to a worker task failed and 
could not recover.  |  `workerTaskId`: the id of the worker task  |
-|  UnknownError   |  All other errors.  |    |
+| Code | Meaning | Additional fields |
+|---|---|---|
+| `BroadcastTablesTooLarge` | The size of the broadcast tables used on the right-hand side of the join exceeded the memory reserved for them in a worker task.<br /><br />Try increasing the peon memory or reducing the size of the broadcast tables. | `maxBroadcastTablesSize`: Memory reserved for the broadcast tables, measured in bytes. |
+| `Canceled` | The query was canceled. Common reasons for cancellation:<br 
/><br /><ul><li>User-initiated shutdown of the controller task via the 
`/druid/indexer/v1/task/{taskId}/shutdown` API.</li><li>Restart or failure of 
the server process that was running the controller task.</li></ul>| |
+| `CannotParseExternalData` | A worker task could not parse data from an 
external datasource. | `errorMessage`: More details on why parsing failed. |
+| `ColumnNameRestricted` | The query uses a restricted column name. | 
`columnName`: The restricted column name. |
+| `ColumnTypeNotSupported` | The column type is not supported. This can be 
because:<br /> <br /><ul><li>Support for writing or reading from a particular 
column type is not supported.</li><li>The query attempted to use a column type 
that is not supported by the frame format. This occurs with ARRAY types, which 
are not yet implemented for frames.</li></ul> | `columnName`: The column name 
with an unsupported type.<br /> <br />`columnType`: The unknown column type. |
+| `InsertCannotAllocateSegment` | The controller task could not allocate a new 
segment ID due to conflict with existing segments or pending segments. Common 
reasons for such conflicts:<br /> <br /><ul><li>Attempting to mix different 
granularities in the same intervals of the same datasource.</li><li>Prior 
ingestions that used non-extendable shard specs.</li></ul>| `dataSource`<br /> 
<br />`interval`: The interval for the attempted new segment allocation. |
+| `InsertCannotBeEmpty` | An INSERT or REPLACE query did not generate any 
output rows in a situation where output rows are required for success. This can 
happen for INSERT or REPLACE queries with `PARTITIONED BY` set to something 
other than `ALL` or `ALL TIME`. | `dataSource` |
+| `InsertCannotOrderByDescending` | An INSERT query contained a `CLUSTERED BY` 
expression in descending order. Druid's segment generation code only supports 
ascending order. | `columnName` |
+| `InsertCannotReplaceExistingSegment` | A REPLACE query cannot proceed because an existing segment partially overlaps those bounds, and the portion within the bounds is not fully overshadowed by query results.<br /><br />There are two ways to address this without modifying your query:<ul><li>Shrink the OVERWRITE filter to match the query results.</li><li>Expand the OVERWRITE filter to fully contain the existing segment.</li></ul> | `segmentId`: The existing segment. |
+| `InsertLockPreempted` | An INSERT or REPLACE query was canceled by a 
higher-priority ingestion job, such as a real-time ingestion task. | |
+| `InsertTimeNull` | An INSERT or REPLACE query encountered a null timestamp 
in the `__time` field.<br /><br />This can happen due to using an expression 
like `TIME_PARSE(timestamp) AS __time` with a timestamp that cannot be parsed. 
(TIME_PARSE returns null when it cannot parse a timestamp.) In this case, try 
parsing your timestamps using a different function or pattern.<br /><br />If 
your timestamps may genuinely be null, consider using COALESCE to provide a 
default value. One option is [...]
+| `InsertTimeOutOfBounds` | A REPLACE query generated a timestamp outside the bounds of the TIMESTAMP parameter for your OVERWRITE WHERE clause.<br /><br />To avoid this error, verify that the interval you specified is valid. | `interval`: The time chunk interval corresponding to the out-of-bounds timestamp. |
+| `InvalidNullByte` | A string column included a null byte. Null bytes in 
strings are not permitted. | `column`: The column that included the null byte |
+| `QueryNotSupported` | QueryKit could not translate the provided native query 
to a multi-stage query.<br /> <br />This can happen if the query uses features 
that aren't supported, like GROUPING SETS. | |
+| `RowTooLarge` | The query tried to process a row that was too large to write 
to a single frame. See the [Limits](#limits) table for the specific limit on 
frame size. Note that the effective maximum row size is smaller than the 
maximum frame size due to alignment considerations during frame writing. | 
`maxFrameSize`: The limit on the frame size. |
+| `TaskStartTimeout` | Unable to launch all the worker tasks in time.<br /><br />There might be insufficient available slots to start all the worker tasks simultaneously.<br /><br />Try splitting the query into smaller chunks with a lower `maxNumTasks` value. Another option is to increase capacity. | `numTasks`: The number of tasks the query tried to launch. |
+| `TooManyBuckets` | Exceeded the number of partition buckets for a stage. 
Partition buckets are only used for `segmentGranularity` during INSERT queries. 
The most common reason for this error is that your `segmentGranularity` is too 
narrow relative to the data. See the [Limits](#limits) table for the specific 
limit. | `maxBuckets`: The limit on buckets. |
+| `TooManyInputFiles` | Exceeded the number of input files/segments per 
worker. See the [Limits](#limits) table for the specific limit. | 
`numInputFiles`: The total number of input files/segments for the stage.<br 
/><br />`maxInputFiles`: The maximum number of input files/segments per worker 
per stage.<br /><br />`minNumWorker`: The minimum number of workers required 
for a successful run. |
+| `TooManyPartitions` | Exceeded the number of partitions for a stage. The 
most common reason for this is that the final stage of an INSERT or REPLACE 
query generated too many segments. See the [Limits](#limits) table for the 
specific limit. | `maxPartitions`: The limit on partitions which was exceeded |
+| `TooManyColumns` | Exceeded the number of columns for a stage. See the 
[Limits](#limits) table for the specific limit. | `numColumns`: The number of 
columns requested.<br /><br />`maxColumns`: The limit on columns which was 
exceeded. |
+| `TooManyWarnings` | Exceeded the allowed number of warnings of a particular 
type. | `rootErrorCode`: The error code corresponding to the exception that 
exceeded the required limit. <br /><br />`maxWarnings`: Maximum number of 
warnings that are allowed for the corresponding `rootErrorCode`. |
+| `TooManyWorkers` | Exceeded the supported number of workers running 
simultaneously. See the [Limits](#limits) table for the specific limit. | 
`workers`: The number of simultaneously running workers that exceeded a hard or 
soft limit. This may be larger than the number of workers in any one stage if 
multiple stages are running simultaneously. <br /><br />`maxWorkers`: The hard 
or soft limit on workers that was exceeded. |
+| `NotEnoughMemory` | Insufficient memory to launch a stage. | `serverMemory`: 
The amount of memory available to a single process.<br /><br />`serverWorkers`: 
The number of workers running in a single process.<br /><br />`serverThreads`: 
The number of threads in a single process. |
+| `WorkerFailed` | A worker task failed unexpectedly. | `errorMsg`: The error 
message.<br /><br />`workerTaskId`: The ID of the worker task. |
+| `WorkerRpcFailed` | A remote procedure call to a worker task failed and 
could not be recovered. | `workerTaskId`: The ID of the worker task. |
+| `UnknownError` | All other errors. | `message` |
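
To make the table above concrete, here is an illustrative sketch of how one of 
these codes might surface in a fault object. Only `errorCode` and the field 
names from the table are taken from the docs; the surrounding report structure 
is abridged and the message text and values are hypothetical:

```json
{
  "error": {
    "errorCode": "TooManyWorkers",
    "errorMessage": "Too many workers (current = 150; max = 100)",
    "workers": 150,
    "maxWorkers": 100
  }
}
```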
diff --git a/docs/operations/rule-configuration.md 
b/docs/operations/rule-configuration.md
index 14df071c6d..687fb83e6f 100644
--- a/docs/operations/rule-configuration.md
+++ b/docs/operations/rule-configuration.md
@@ -44,7 +44,7 @@ You can use the Druid [web console](./web-console.md) or the 
[Coordinator API](.
 
 To set retention rules in the Druid web console:
 
-1. On the console home page, click **Datasources**. 
+1. On the console home page, click **Datasources**.
 2. Click the name of your datasource to open the data window.
 3. Select **Actions > Edit retention rules**.
 4. Click **+New rule**.
@@ -84,6 +84,7 @@ curl --location --request POST 
'http://localhost:8888/druid/coordinator/v1/rules
     "includeFuture": true
    }]'
 ```
+
 To retrieve all rules for all datasources, send a GET request to 
`/druid/coordinator/v1/rules`&mdash;for example:
 
 ```bash
@@ -112,7 +113,7 @@ If you have a single tier, Druid automatically names the 
tier `_default` and loa
 
 ### Forever load rule
 
-The forever load rule assigns all datasource segments to specified tiers. It 
is the default rule Druid applies to datasources. Forever load rules have type 
`loadForever`. 
+The forever load rule assigns all datasource segments to specified tiers. It 
is the default rule Druid applies to datasources. Forever load rules have type 
`loadForever`.
 
 The following example places one replica of each segment on a custom tier 
named `hot`, and another single replica on the default tier.
 
@@ -125,7 +126,9 @@ The following example places one replica of each segment on 
a custom tier named
   }
 }
 ```
+
 Set the following property:
+
 - `tieredReplicants`: a map of tier names to the number of segment replicas 
for that tier.
 
 ### Period load rule
@@ -147,6 +150,7 @@ Period load rules have type `loadByPeriod`. The following 
example places one rep
 ```
 
 Set the following properties:
+
 - `period`: a JSON object representing 
[ISO-8601](https://en.wikipedia.org/wiki/ISO_8601) periods. The period is from 
some time in the past to the present, or into the future if `includeFuture` is 
set to `true`.
 - `includeFuture`: a boolean flag to instruct Druid to match a segment if:
 <br>- the segment interval overlaps the rule interval, or
@@ -172,6 +176,7 @@ Interval load rules have type `loadByInterval`. The 
following example places one
 ```
 
 Set the following properties:
+
 - `interval`: the load interval specified as an 
[ISO-8601](https://en.wikipedia.org/wiki/ISO_8601) range encoded as a string.
 - `tieredReplicants`: a map of tier names to the number of segment replicas 
for that tier.
 
@@ -208,6 +213,7 @@ Period drop rules have type `dropByPeriod` and the 
following JSON structure:
 ```
 
 Set the following properties:
+
 - `period`: a JSON object representing 
[ISO-8601](https://en.wikipedia.org/wiki/ISO_8601) periods. The period is from 
some time in the past to the future or to the current time, depending on the 
`includeFuture` flag.
 - `includeFuture`: a boolean flag to instruct Druid to match a segment if:
 <br>- the segment interval overlaps the rule interval, or
@@ -216,7 +222,7 @@ Set the following properties:
 
 ### Period drop before rule
 
-Druid compares a segment's interval to the period you specify in the rule and 
drops the matching data. The rule matches if the segment interval is before the 
specified period. 
+Druid compares a segment's interval to the period you specify in the rule and 
drops the matching data. The rule matches if the segment interval is before the 
specified period.
 
 If you only want to retain recent data, you can use this rule to drop old data 
before a specified period, and add a `loadForever` rule to retain the data that 
follows it. Note that the rule combination `dropBeforeByPeriod` + `loadForever` 
is equivalent to `loadByPeriod(includeFuture = true)` + `dropForever`.
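
As a sketch of the combination described above, a rule list that retains only 
the most recent month of data might look like the following. The period, tier 
name, and replica count are placeholders; rules are evaluated in order, so the 
drop rule must come first:

```json
[
  {
    "type": "dropBeforeByPeriod",
    "period": "P1M"
  },
  {
    "type": "loadForever",
    "tieredReplicants": {
      "_default_tier": 2
    }
  }
]
```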
 
@@ -230,6 +236,7 @@ Period drop rules have type `dropBeforeByPeriod` and the 
following JSON structur
 ```
 
 Set the following property:
+
 - `period`: a JSON object representing 
[ISO-8601](https://en.wikipedia.org/wiki/ISO_8601) periods.
 
 ### Interval drop rule
@@ -246,6 +253,7 @@ Interval drop rules have type `dropByInterval` and the 
following JSON structure:
 ```
 
 Set the following property:
+
 - `interval`: the drop interval specified as an 
[ISO-8601](https://en.wikipedia.org/wiki/ISO_8601) range encoded as a string.
 
 ## Broadcast rules
@@ -254,7 +262,7 @@ Druid extensions use broadcast rules to load segment data 
onto all brokers in th
 
 ### Forever broadcast rule
 
-The forever broadcast rule loads all segment data in your datasources onto all 
brokers in the cluster. 
+The forever broadcast rule loads all segment data in your datasources onto all 
brokers in the cluster.
 
 Forever broadcast rules have type `broadcastForever`:
 
@@ -262,7 +270,7 @@ Forever broadcast rules have type `broadcastForever`:
 {
   "type": "broadcastForever",
 }
-``` 
+```
 
 ### Period broadcast rule
 
diff --git a/docs/querying/virtual-columns.md b/docs/querying/virtual-columns.md
index 6229d83326..b5ccf80f42 100644
--- a/docs/querying/virtual-columns.md
+++ b/docs/querying/virtual-columns.md
@@ -64,12 +64,13 @@ Each Apache Druid query can accept a list of virtual 
columns as a parameter. The
 ## Virtual column types
 
 ### Expression virtual column
+
 Expression virtual columns use Druid's native 
[expression](../misc/math-expr.md) system to allow defining query time
 transforms of inputs from one or more columns.
 
 The expression virtual column has the following syntax:
 
-```
+```json
 {
   "type": "expression",
   "name": <name of the virtual column>,
@@ -85,7 +86,6 @@ The expression virtual column has the following syntax:
 |expression|An [expression](../misc/math-expr.md) that takes a row as input 
and outputs a value for the virtual column.|yes|
 |outputType|The expression's output will be coerced to this type. Can be LONG, 
FLOAT, DOUBLE, STRING, ARRAY types, or COMPLEX types.|no, default is FLOAT|
 
-
 ### Nested field virtual column
 
 The nested field virtual column is an optimized virtual column that can 
provide direct access into various paths of
@@ -155,6 +155,7 @@ is using JSONPath syntax `path`, the second with a jq 
`path`, and the third uses
 |useJqSyntax|If true, parse `path` using 'jq' syntax instead of 
'JSONPath'.|no, default is false|
 
 #### Nested path part
+
 Specify `pathParts` as an array of objects that describe each component of the 
path to traverse. Each object can take the following properties:
 
 |property|description|required?|
@@ -166,11 +167,11 @@ Specify `pathParts` as an array of objects that describe 
each component of the p
 See [Nested columns](./nested-columns.md) for more information on ingesting 
and storing nested data.
 
 ### List filtered virtual column
+
 This virtual column provides an alternative way to use
 ['list filtered' dimension spec](./dimensionspecs.md#filtered-dimensionspecs) 
as a virtual column. It has optimized
 access to the underlying column value indexes that can provide a small 
performance improvement in some cases.
 
-
 ```json
     {
       "type": "mv-filtered",
diff --git a/website/.spelling b/website/.spelling
index fc177de24f..30300bd6a4 100644
--- a/website/.spelling
+++ b/website/.spelling
@@ -27,6 +27,7 @@ ACLs
 APIs
 AvroStorage
 ARN
+autokill
 AWS
 AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
 AWS_CONTAINER_CREDENTIALS_FULL_URI


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
