vtlim commented on code in PR #15049:
URL: https://github.com/apache/druid/pull/15049#discussion_r1445363848


##########
docs/development/extensions-contrib/ddsketch-quantiles.md:
##########
@@ -0,0 +1,136 @@
+---
+id: ddsketch-quantiles
+title: "DDSketches for Approximate Quantiles module"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+
+This module provides aggregators for approximate quantile queries using the 
[DDSketch](https://github.com/datadog/sketches-java) library. The DDSketch 
library provides a fast, and fully-mergeable quantile sketch with relative 
error. If the true quantile is 100, a sketch with relative error of 1% 
guarantees a quantile value between 101 and 99. This is important and highly 
valuable behavior for long tail distributions. The best use case for these 
sketches is for accurately describing the upper quantiles of long tailed 
distributions such as network latencies.
+
+To use this Apache Druid extension, 
[include](../../configuration/extensions.md#loading-extensions) in the 
extensions load list.
+
+```
+druid.extensions.loadList=["druid-ddsketch", ...]
+```
+
+### Aggregator
+
+The result of the aggregation is a DDSketch that is the union of all sketches 
either built from raw data or read from the segments. The single number that is 
returned represents the total number of included data points. The default 
aggregator type of `ddSketch` uses the collapsingLowestDense strategy for 
storing and merging sketch. This means that in favor of keeping the highest 
values represented at the highest accuracy, the sketch will collapse and merge 
lower, smaller values in the sketch. Collapsed bins will lose accuracy 
guarantees. The default number of bins is 1000. Sketches can only be merged 
when using the same relativeError values.
+
+The `ddSketch` aggregator operates over raw data and precomputed sketches.
+
+```json
+{
+  "type" : "ddSketch",
+  "name" : <output_name>,
+  "fieldName" : <input_name>,
+  "relativeError" : <double(0, 1)>,
+  "numBins": <int>
+ }
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "ddSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|fieldName|A String for the name of the input field (can contain sketches or 
raw numeric values).|yes|
+|relativeError|Describes the precision in which to store the sketch. Must be a 
number between 0 and 1.|no, defaults to 0.01 (1% error)|
+|numBins|Total number of bins the sketch is allowed to use to describe the 
distribution. This has a direct impact on max memory used. The more total bins 
available, the larger the range of accurate quantiles. With relative accuracy 
of 2%, only 275 bins are required to cover values between 1 millisecond and 1 
minute. 800 bins are required to cover values between 1 nanosecond and 1 
day.|no, defaults to 1000|
+
+
+### Post Aggregators
+
+Users can query for a set of quantiles using the `quantilesFromDDSketch` 
post-aggregator on the sketches created by the `ddSketch` aggregators.
+
+#### quantilesFromDDSketch
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "quantilesFromDDSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|field|A computed ddSketch.|yes|
+|fractions|list of doubles from 0 to 1 of the quantiles to compute|yes|
+
+#### quantileFromDDSketch
+

Review Comment:
   ```suggestion
   
   Use `quantileFromDDSketch` to fetch a single quantile.
   
   ```



##########
docs/development/extensions-contrib/ddsketch-quantiles.md:
##########
@@ -0,0 +1,136 @@
+---
+id: ddsketch-quantiles
+title: "DDSketches for Approximate Quantiles module"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+
+This module provides aggregators for approximate quantile queries using the 
[DDSketch](https://github.com/datadog/sketches-java) library. The DDSketch 
library provides a fast, and fully-mergeable quantile sketch with relative 
error. If the true quantile is 100, a sketch with relative error of 1% 
guarantees a quantile value between 101 and 99. This is important and highly 
valuable behavior for long tail distributions. The best use case for these 
sketches is for accurately describing the upper quantiles of long tailed 
distributions such as network latencies.
+
+To use this Apache Druid extension, 
[include](../../configuration/extensions.md#loading-extensions) in the 
extensions load list.
+
+```
+druid.extensions.loadList=["druid-ddsketch", ...]
+```
+
+### Aggregator
+
+The result of the aggregation is a DDSketch that is the union of all sketches 
either built from raw data or read from the segments. The single number that is 
returned represents the total number of included data points. The default 
aggregator type of `ddSketch` uses the collapsingLowestDense strategy for 
storing and merging sketch. This means that in favor of keeping the highest 
values represented at the highest accuracy, the sketch will collapse and merge 
lower, smaller values in the sketch. Collapsed bins will lose accuracy 
guarantees. The default number of bins is 1000. Sketches can only be merged 
when using the same relativeError values.
+
+The `ddSketch` aggregator operates over raw data and precomputed sketches.
+
+```json
+{
+  "type" : "ddSketch",
+  "name" : <output_name>,
+  "fieldName" : <input_name>,
+  "relativeError" : <double(0, 1)>,
+  "numBins": <int>
+ }
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "ddSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|fieldName|A String for the name of the input field (can contain sketches or 
raw numeric values).|yes|
+|relativeError|Describes the precision in which to store the sketch. Must be a 
number between 0 and 1.|no, defaults to 0.01 (1% error)|
+|numBins|Total number of bins the sketch is allowed to use to describe the 
distribution. This has a direct impact on max memory used. The more total bins 
available, the larger the range of accurate quantiles. With relative accuracy 
of 2%, only 275 bins are required to cover values between 1 millisecond and 1 
minute. 800 bins are required to cover values between 1 nanosecond and 1 
day.|no, defaults to 1000|
+
+
+### Post Aggregators
+
+Users can query for a set of quantiles using the `quantilesFromDDSketch` 
post-aggregator on the sketches created by the `ddSketch` aggregators.
+
+#### quantilesFromDDSketch
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "quantilesFromDDSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|field|A computed ddSketch.|yes|
+|fractions|list of doubles from 0 to 1 of the quantiles to compute|yes|
+
+#### quantileFromDDSketch
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "quantileFromDDSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|field|A computed ddSketch.|yes|
+|fraction|double from 0 to 1 of the quantile to compute|yes|
+
+
+
+```json
+{
+  "type"  : "quantilesFromDDSketch",
+  "name" : <output_name>,
+  "field" : <reference to DDSketch>,
+  "fractions" : <array of doubles in [0,1]>
+}
+```
+
+Single quantiles may be fetched as well.

Review Comment:
   ```suggestion
   ```
   Remove this line here and move it to the single-quantile section, where it 
can serve as the introduction.



##########
docs/development/extensions-contrib/ddsketch-quantiles.md:
##########
@@ -0,0 +1,136 @@
+---
+id: ddsketch-quantiles
+title: "DDSketches for Approximate Quantiles module"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+
+This module provides aggregators for approximate quantile queries using the 
[DDSketch](https://github.com/datadog/sketches-java) library. The DDSketch 
library provides a fast, and fully-mergeable quantile sketch with relative 
error. If the true quantile is 100, a sketch with relative error of 1% 
guarantees a quantile value between 101 and 99. This is important and highly 
valuable behavior for long tail distributions. The best use case for these 
sketches is for accurately describing the upper quantiles of long tailed 
distributions such as network latencies.
+
+To use this Apache Druid extension, 
[include](../../configuration/extensions.md#loading-extensions) in the 
extensions load list.
+
+```
+druid.extensions.loadList=["druid-ddsketch", ...]
+```
+
+### Aggregator
+
+The result of the aggregation is a DDSketch that is the union of all sketches 
either built from raw data or read from the segments. The single number that is 
returned represents the total number of included data points. The default 
aggregator type of `ddSketch` uses the collapsingLowestDense strategy for 
storing and merging sketch. This means that in favor of keeping the highest 
values represented at the highest accuracy, the sketch will collapse and merge 
lower, smaller values in the sketch. Collapsed bins will lose accuracy 
guarantees. The default number of bins is 1000. Sketches can only be merged 
when using the same relativeError values.
+
+The `ddSketch` aggregator operates over raw data and precomputed sketches.
+
+```json
+{
+  "type" : "ddSketch",
+  "name" : <output_name>,
+  "fieldName" : <input_name>,
+  "relativeError" : <double(0, 1)>,
+  "numBins": <int>
+ }
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "ddSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|fieldName|A String for the name of the input field (can contain sketches or 
raw numeric values).|yes|
+|relativeError|Describes the precision in which to store the sketch. Must be a 
number between 0 and 1.|no, defaults to 0.01 (1% error)|
+|numBins|Total number of bins the sketch is allowed to use to describe the 
distribution. This has a direct impact on max memory used. The more total bins 
available, the larger the range of accurate quantiles. With relative accuracy 
of 2%, only 275 bins are required to cover values between 1 millisecond and 1 
minute. 800 bins are required to cover values between 1 nanosecond and 1 
day.|no, defaults to 1000|
+
+
+### Post Aggregators
+
+Users can query for a set of quantiles using the `quantilesFromDDSketch` 
post-aggregator on the sketches created by the `ddSketch` aggregators.
+
+#### quantilesFromDDSketch
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "quantilesFromDDSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|field|A computed ddSketch.|yes|
+|fractions|list of doubles from 0 to 1 of the quantiles to compute|yes|

Review Comment:
   ```suggestion
   |fractions|Array of doubles from 0 to 1 of the quantiles to compute|yes|
   ```



##########
docs/development/extensions-contrib/ddsketch-quantiles.md:
##########
@@ -0,0 +1,136 @@
+---
+id: ddsketch-quantiles
+title: "DDSketches for Approximate Quantiles module"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+
+This module provides aggregators for approximate quantile queries using the 
[DDSketch](https://github.com/datadog/sketches-java) library. The DDSketch 
library provides a fast, and fully-mergeable quantile sketch with relative 
error. If the true quantile is 100, a sketch with relative error of 1% 
guarantees a quantile value between 101 and 99. This is important and highly 
valuable behavior for long tail distributions. The best use case for these 
sketches is for accurately describing the upper quantiles of long tailed 
distributions such as network latencies.
+
+To use this Apache Druid extension, 
[include](../../configuration/extensions.md#loading-extensions) in the 
extensions load list.
+
+```
+druid.extensions.loadList=["druid-ddsketch", ...]
+```
+
+### Aggregator
+
+The result of the aggregation is a DDSketch that is the union of all sketches 
either built from raw data or read from the segments. The single number that is 
returned represents the total number of included data points. The default 
aggregator type of `ddSketch` uses the collapsingLowestDense strategy for 
storing and merging sketch. This means that in favor of keeping the highest 
values represented at the highest accuracy, the sketch will collapse and merge 
lower, smaller values in the sketch. Collapsed bins will lose accuracy 
guarantees. The default number of bins is 1000. Sketches can only be merged 
when using the same relativeError values.
+
+The `ddSketch` aggregator operates over raw data and precomputed sketches.
+
+```json
+{
+  "type" : "ddSketch",
+  "name" : <output_name>,
+  "fieldName" : <input_name>,
+  "relativeError" : <double(0, 1)>,
+  "numBins": <int>
+ }
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "ddSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|fieldName|A String for the name of the input field (can contain sketches or 
raw numeric values).|yes|
+|relativeError|Describes the precision in which to store the sketch. Must be a 
number between 0 and 1.|no, defaults to 0.01 (1% error)|
+|numBins|Total number of bins the sketch is allowed to use to describe the 
distribution. This has a direct impact on max memory used. The more total bins 
available, the larger the range of accurate quantiles. With relative accuracy 
of 2%, only 275 bins are required to cover values between 1 millisecond and 1 
minute. 800 bins are required to cover values between 1 nanosecond and 1 
day.|no, defaults to 1000|
+
+
+### Post Aggregators
+
+Users can query for a set of quantiles using the `quantilesFromDDSketch` 
post-aggregator on the sketches created by the `ddSketch` aggregators.
+
+#### quantilesFromDDSketch
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "quantilesFromDDSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|field|A computed ddSketch.|yes|
+|fractions|list of doubles from 0 to 1 of the quantiles to compute|yes|
+
+#### quantileFromDDSketch
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "quantileFromDDSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|field|A computed ddSketch.|yes|
+|fraction|double from 0 to 1 of the quantile to compute|yes|

Review Comment:
   ```suggestion
   |fraction|A double from 0 to 1 of the quantile to compute|yes|
   ```



##########
docs/development/extensions-contrib/ddsketch-quantiles.md:
##########
@@ -0,0 +1,136 @@
+---
+id: ddsketch-quantiles
+title: "DDSketches for Approximate Quantiles module"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+
+This module provides aggregators for approximate quantile queries using the 
[DDSketch](https://github.com/datadog/sketches-java) library. The DDSketch 
library provides a fast, and fully-mergeable quantile sketch with relative 
error. If the true quantile is 100, a sketch with relative error of 1% 
guarantees a quantile value between 101 and 99. This is important and highly 
valuable behavior for long tail distributions. The best use case for these 
sketches is for accurately describing the upper quantiles of long tailed 
distributions such as network latencies.
+
+To use this Apache Druid extension, 
[include](../../configuration/extensions.md#loading-extensions) in the 
extensions load list.
+
+```
+druid.extensions.loadList=["druid-ddsketch", ...]
+```
+
+### Aggregator
+
+The result of the aggregation is a DDSketch that is the union of all sketches 
either built from raw data or read from the segments. The single number that is 
returned represents the total number of included data points. The default 
aggregator type of `ddSketch` uses the collapsingLowestDense strategy for 
storing and merging sketch. This means that in favor of keeping the highest 
values represented at the highest accuracy, the sketch will collapse and merge 
lower, smaller values in the sketch. Collapsed bins will lose accuracy 
guarantees. The default number of bins is 1000. Sketches can only be merged 
when using the same relativeError values.
+
+The `ddSketch` aggregator operates over raw data and precomputed sketches.
+
+```json
+{
+  "type" : "ddSketch",
+  "name" : <output_name>,
+  "fieldName" : <input_name>,
+  "relativeError" : <double(0, 1)>,
+  "numBins": <int>
+ }
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "ddSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|fieldName|A String for the name of the input field (can contain sketches or 
raw numeric values).|yes|
+|relativeError|Describes the precision in which to store the sketch. Must be a 
number between 0 and 1.|no, defaults to 0.01 (1% error)|
+|numBins|Total number of bins the sketch is allowed to use to describe the 
distribution. This has a direct impact on max memory used. The more total bins 
available, the larger the range of accurate quantiles. With relative accuracy 
of 2%, only 275 bins are required to cover values between 1 millisecond and 1 
minute. 800 bins are required to cover values between 1 nanosecond and 1 
day.|no, defaults to 1000|
+
+
+### Post Aggregators
+
+Users can query for a set of quantiles using the `quantilesFromDDSketch` 
post-aggregator on the sketches created by the `ddSketch` aggregators.
+
+#### quantilesFromDDSketch
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "quantilesFromDDSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|field|A computed ddSketch.|yes|
+|fractions|list of doubles from 0 to 1 of the quantiles to compute|yes|
+
+#### quantileFromDDSketch
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "quantileFromDDSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|field|A computed ddSketch.|yes|
+|fraction|double from 0 to 1 of the quantile to compute|yes|
+
+
+
+```json
+{
+  "type"  : "quantilesFromDDSketch",
+  "name" : <output_name>,
+  "field" : <reference to DDSketch>,
+  "fractions" : <array of doubles in [0,1]>
+}
+```

Review Comment:
   Move this into the previous section so all `quantilesFromDDSketch` content 
stays together



##########
docs/development/extensions-contrib/ddsketch-quantiles.md:
##########
@@ -0,0 +1,133 @@
+---
+id: ddsketch-quantiles
+title: "DDSketches for Approximate Quantiles module"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+
+This module provides aggregators for approximate quantile queries using the 
[DDSketch](https://github.com/datadog/sketches-java) library. The DDSketch 
library provides a fast, and fully-mergeable quantile sketch with relative 
error. If the true quantile is 100, a sketch with relative error of 1% 
guarantees a quantile value between 101 and 99. This is important and highly 
valuable behavior for long tail distributions. The best use case for these 
sketches is for accurately describing the upper quantiles of long tailed 
distributions such as network latencies.
+
+To use this Apache Druid extension, 
[include](../../configuration/extensions.md#loading-extensions) in the 
extensions load list.
+
+```
+druid.extensions.loadList=["druid-ddsketch", ...]
+```
+
+### Aggregator
+
+The result of the aggregation is a DDSketch that is the union of all sketches 
either built from raw data or read from the segments. The single number that is 
returned represents the total number of included data points. The default 
aggregator type of `ddSketch` uses the collapsingLowestDense strategy for 
storing and merging sketch. This means that in favor of keeping the highest 
values represented at the highest accuracy, the sketch will collapse and merge 
lower, smaller values in the sketch. Collapsed bins will lose accuracy 
guarantees. The default number of bins is 1000. Sketches can only be merged 
when using the same relativeError values.
+
+The `ddSketch` aggregator operates over raw data and precomputed sketches.
+
+```json
+{
+  "type" : "ddSketch",
+  "name" : <output_name>,
+  "fieldName" : <input_name>,
+  "relativeError" : <double(0, 1)>,
+  "numBins": <int>
+ }
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "ddSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|fieldName|A String for the name of the input field (can contain sketches or 
raw numeric values).|yes|
+|relativeError||Describes the precision in which to store the sketch. Must be 
a number between 0 and 1.|no, defaults to 0.01 (1% error)|
+|numBins|Total number of bins the sketch is allowed to use to describe the 
distribution. This has a direct impact on max memory used. The more total bins 
available, the larger the range of accurate quantiles.* |no, defaults to 1000|
+
+* Examples Tuning: With relative accuracy of 2%, only 275 bins are required to 
cover values between 1 millisecond and 1 minute. 800 bins are required to cover 
values between 1 nanosecond and 1 day.
+
+### Post Aggregators
+
+Users can query for a set of quantiles using the `quantilesFromDDSketch` 
post-aggregator on the sketches created by the `ddSketch` aggregators.
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "quantilesFromDDSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|field|A computed ddSketch.|yes|
+|fractions|list of doubles from 0 to 1 of the quantiles to compute|yes|
+
+|property|description|required?|

Review Comment:
   Thanks for updating the header for each. The difference between the two is 
still easy to overlook (quantile vs. quantiles, fraction vs. fractions), so 
consider adding an intro to each section. I'll leave some suggestions.
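
To show how little separates the two specs, here is a minimal Python sketch that builds both post-aggregator payloads side by side. The helper names are hypothetical, and the `fieldAccess` reference is an assumption: the doc only says `field` is a reference to a DDSketch.

```python
import json


def _check_fraction(value):
    """Reject fractions outside the documented [0, 1] range."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"fraction must be in [0, 1], got {value}")
    return float(value)


def quantiles_post_agg(name, sketch_name, fractions):
    """Plural form: takes an array under the key "fractions"."""
    return {
        "type": "quantilesFromDDSketch",
        "name": name,
        # Assumption: the sketch is referenced with a fieldAccess post-aggregator.
        "field": {"type": "fieldAccess", "fieldName": sketch_name},
        "fractions": [_check_fraction(f) for f in fractions],
    }


def quantile_post_agg(name, sketch_name, fraction):
    """Singular form: takes one double under the key "fraction"."""
    return {
        "type": "quantileFromDDSketch",
        "name": name,
        "field": {"type": "fieldAccess", "fieldName": sketch_name},
        "fraction": _check_fraction(fraction),
    }


many = quantiles_post_agg("latency_quantiles", "sketch", [0.5, 0.95, 0.99])
one = quantile_post_agg("latency_p99", "sketch", 0.99)
print(json.dumps([many, one], indent=2))
```

The two payloads differ only in the `type` string and in the `fraction` versus `fractions` key, which is why a one-line intro per section helps readers pick the right one.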



##########
docs/development/extensions-contrib/ddsketch-quantiles.md:
##########
@@ -0,0 +1,136 @@
+---
+id: ddsketch-quantiles
+title: "DDSketches for Approximate Quantiles module"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+
+This module provides aggregators for approximate quantile queries using the 
[DDSketch](https://github.com/datadog/sketches-java) library. The DDSketch 
library provides a fast, and fully-mergeable quantile sketch with relative 
error. If the true quantile is 100, a sketch with relative error of 1% 
guarantees a quantile value between 101 and 99. This is important and highly 
valuable behavior for long tail distributions. The best use case for these 
sketches is for accurately describing the upper quantiles of long tailed 
distributions such as network latencies.
+
+To use this Apache Druid extension, 
[include](../../configuration/extensions.md#loading-extensions) in the 
extensions load list.
+
+```
+druid.extensions.loadList=["druid-ddsketch", ...]
+```
+
+### Aggregator
+
+The result of the aggregation is a DDSketch that is the union of all sketches 
either built from raw data or read from the segments. The single number that is 
returned represents the total number of included data points. The default 
aggregator type of `ddSketch` uses the collapsingLowestDense strategy for 
storing and merging sketch. This means that in favor of keeping the highest 
values represented at the highest accuracy, the sketch will collapse and merge 
lower, smaller values in the sketch. Collapsed bins will lose accuracy 
guarantees. The default number of bins is 1000. Sketches can only be merged 
when using the same relativeError values.
+
+The `ddSketch` aggregator operates over raw data and precomputed sketches.
+
+```json
+{
+  "type" : "ddSketch",
+  "name" : <output_name>,
+  "fieldName" : <input_name>,
+  "relativeError" : <double(0, 1)>,
+  "numBins": <int>
+ }
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "ddSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|fieldName|A String for the name of the input field (can contain sketches or 
raw numeric values).|yes|
+|relativeError|Describes the precision in which to store the sketch. Must be a 
number between 0 and 1.|no, defaults to 0.01 (1% error)|
+|numBins|Total number of bins the sketch is allowed to use to describe the 
distribution. This has a direct impact on max memory used. The more total bins 
available, the larger the range of accurate quantiles. With relative accuracy 
of 2%, only 275 bins are required to cover values between 1 millisecond and 1 
minute. 800 bins are required to cover values between 1 nanosecond and 1 
day.|no, defaults to 1000|
+
+
+### Post Aggregators
+
+Users can query for a set of quantiles using the `quantilesFromDDSketch` 
post-aggregator on the sketches created by the `ddSketch` aggregators.
+
+#### quantilesFromDDSketch
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "quantilesFromDDSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|field|A computed ddSketch.|yes|
+|fractions|list of doubles from 0 to 1 of the quantiles to compute|yes|
+
+#### quantileFromDDSketch
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "quantileFromDDSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|field|A computed ddSketch.|yes|
+|fraction|double from 0 to 1 of the quantile to compute|yes|
+
+
+
+```json
+{
+  "type"  : "quantilesFromDDSketch",
+  "name" : <output_name>,
+  "field" : <reference to DDSketch>,
+  "fractions" : <array of doubles in [0,1]>
+}
+```
+
+Single quantiles may be fetched as well.
+
+```json
+{
+  "type"  : "quantileFromDDSketch",
+  "name" : <output_name>,
+  "field" : <reference to DDsketch>,
+  "fraction" : <double [0,1]>
+}
+```
+
+### Example
+As an example of a query with sketches pre-aggregated at ingestion time, one 
could set up the following aggregator at ingest:
+
+```json
+{
+  "type": "ddSketch",
+  "name": "sketch",
+  "fieldName": "value",
+  "relativeError": 0.01,
+  "numBins": 1000
+}
+```
+
+and make queries using the following aggregator + post-aggregator:

Review Comment:
   ```suggestion
   You can query pre-aggregated sketches using the following aggregator and 
post-aggregator:
   ```
   I'm not sure whether this changes the meaning of the sentence; please confirm.
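
For context on what "aggregator and post-aggregator" means in a query, here is a minimal Python sketch that assembles such a request body. It assumes a native `timeseries` query, a hypothetical datasource name and interval, and a `fieldAccess` reference to the merged sketch; none of these come from the PR text, so treat the shape as illustrative only.

```python
import json


def build_quantiles_query(datasource, interval, sketch_column="sketch"):
    """Build a query body that merges stored sketches and reads quantiles from them."""
    merged_name = "merged_sketch"
    return {
        "queryType": "timeseries",  # assumption: any native query type with aggregations would do
        "dataSource": datasource,   # hypothetical datasource name supplied by the caller
        "granularity": "all",
        "intervals": [interval],
        "aggregations": [
            {
                # Merges the sketches that were pre-aggregated at ingestion time.
                "type": "ddSketch",
                "name": merged_name,
                "fieldName": sketch_column,
                "relativeError": 0.01,
                "numBins": 1000,
            }
        ],
        "postAggregations": [
            {
                # Reads several quantiles from the merged sketch.
                "type": "quantilesFromDDSketch",
                "name": "quantiles",
                "field": {"type": "fieldAccess", "fieldName": merged_name},
                "fractions": [0.5, 0.95, 0.99],
            }
        ],
    }


query = build_quantiles_query("example_datasource", "2024-01-01/2024-01-02")
print(json.dumps(query, indent=2))
```

The aggregator uses the same `relativeError` as the ingestion-time spec because, per the doc, sketches can only be merged when their `relativeError` values match.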



##########
docs/development/extensions-contrib/ddsketch-quantiles.md:
##########
@@ -0,0 +1,136 @@
+---
+id: ddsketch-quantiles
+title: "DDSketches for Approximate Quantiles module"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+
+This module provides aggregators for approximate quantile queries using the 
[DDSketch](https://github.com/datadog/sketches-java) library. The DDSketch 
library provides a fast, and fully-mergeable quantile sketch with relative 
error. If the true quantile is 100, a sketch with relative error of 1% 
guarantees a quantile value between 101 and 99. This is important and highly 
valuable behavior for long tail distributions. The best use case for these 
sketches is for accurately describing the upper quantiles of long tailed 
distributions such as network latencies.
+
+To use this Apache Druid extension, 
[include](../../configuration/extensions.md#loading-extensions) in the 
extensions load list.
+
+```
+druid.extensions.loadList=["druid-ddsketch", ...]
+```
+
+### Aggregator
+
+The result of the aggregation is a DDSketch that is the union of all sketches 
either built from raw data or read from the segments. The single number that is 
returned represents the total number of included data points. The default 
aggregator type of `ddSketch` uses the collapsingLowestDense strategy for 
storing and merging sketches. This means that in favor of keeping the highest 
values represented at the highest accuracy, the sketch will collapse and merge 
lower, smaller values in the sketch. Collapsed bins will lose accuracy 
guarantees. The default number of bins is 1000. Sketches can only be merged 
when using the same relativeError values.
+
+The `ddSketch` aggregator operates over raw data and precomputed sketches.
+
+```json
+{
+  "type" : "ddSketch",
+  "name" : <output_name>,
+  "fieldName" : <input_name>,
+  "relativeError" : <double(0, 1)>,
+  "numBins": <int>
+ }
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "ddSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|fieldName|A String for the name of the input field (can contain sketches or 
raw numeric values).|yes|
+|relativeError|Describes the precision in which to store the sketch. Must be a 
number between 0 and 1.|no, defaults to 0.01 (1% error)|
+|numBins|Total number of bins the sketch is allowed to use to describe the 
distribution. This has a direct impact on max memory used. The more total bins 
available, the larger the range of accurate quantiles. With relative accuracy 
of 2%, only 275 bins are required to cover values between 1 millisecond and 1 
minute. 800 bins are required to cover values between 1 nanosecond and 1 
day.|no, defaults to 1000|
+
+
+### Post Aggregators
+
+Users can query for a set of quantiles using the `quantilesFromDDSketch` 
post-aggregator on the sketches created by the `ddSketch` aggregators.
+
+#### quantilesFromDDSketch

Review Comment:
   Mention both post-aggregators in the introduction to the section.



##########
docs/development/extensions-contrib/ddsketch-quantiles.md:
##########
@@ -0,0 +1,136 @@
+---
+id: ddsketch-quantiles
+title: "DDSketches for Approximate Quantiles module"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+
+This module provides aggregators for approximate quantile queries using the 
[DDSketch](https://github.com/datadog/sketches-java) library. The DDSketch 
library provides a fast, and fully-mergeable quantile sketch with relative 
error. If the true quantile is 100, a sketch with relative error of 1% 
guarantees a quantile value between 101 and 99. This is important and highly 
valuable behavior for long tail distributions. The best use case for these 
sketches is for accurately describing the upper quantiles of long tailed 
distributions such as network latencies.
+
+To use this Apache Druid extension, 
[include](../../configuration/extensions.md#loading-extensions) in the 
extensions load list.
+
+```
+druid.extensions.loadList=["druid-ddsketch", ...]
+```
+
+### Aggregator
+
+The result of the aggregation is a DDSketch that is the union of all sketches 
either built from raw data or read from the segments. The single number that is 
returned represents the total number of included data points. The default 
aggregator type of `ddSketch` uses the collapsingLowestDense strategy for 
storing and merging sketches. This means that in favor of keeping the highest 
values represented at the highest accuracy, the sketch will collapse and merge 
lower, smaller values in the sketch. Collapsed bins will lose accuracy 
guarantees. The default number of bins is 1000. Sketches can only be merged 
when using the same relativeError values.
+
+The `ddSketch` aggregator operates over raw data and precomputed sketches.
+
+```json
+{
+  "type" : "ddSketch",
+  "name" : <output_name>,
+  "fieldName" : <input_name>,
+  "relativeError" : <double(0, 1)>,
+  "numBins": <int>
+ }
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "ddSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|fieldName|A String for the name of the input field (can contain sketches or 
raw numeric values).|yes|
+|relativeError|Describes the precision in which to store the sketch. Must be a 
number between 0 and 1.|no, defaults to 0.01 (1% error)|
+|numBins|Total number of bins the sketch is allowed to use to describe the 
distribution. This has a direct impact on max memory used. The more total bins 
available, the larger the range of accurate quantiles. With relative accuracy 
of 2%, only 275 bins are required to cover values between 1 millisecond and 1 
minute. 800 bins are required to cover values between 1 nanosecond and 1 
day.|no, defaults to 1000|
+
+
+### Post Aggregators
+
+Users can query for a set of quantiles using the `quantilesFromDDSketch` 
post-aggregator on the sketches created by the `ddSketch` aggregators.
+
+#### quantilesFromDDSketch

Review Comment:
   ```suggestion
   To compute approximate quantiles, you can use `quantilesFromDDSketch` to 
query for a set of quantiles or `quantileFromDDSketch` to query for a single 
quantile. Call these post-aggregators on the sketches created by the `ddSketch` 
aggregators.
   
   #### quantilesFromDDSketch
   
   Use `quantilesFromDDSketch` to compute a set of quantiles.
   
   ```


