Hi,

I recently contributed t-digest based sketch aggregators to Druid. The
contribution also includes a post-aggregator that lets you generate
quantiles from the aggregated sketches.

Example query:

{
        "queryType": "groupBy",
        "dataSource": "test_datasource",
        "granularity": "ALL",
        "dimensions": [],
        "aggregations": [{
                "type": "mergeTDigestSketch",
                "name": "merged_sketch",
                "fieldName": "ingested_sketch",
                "compression": 200
        }],
        "postAggregations": [{
                "type": "quantilesFromTDigestSketch",
                "name": "quantiles",
                "fractions": [0, 0.5, 1],
                "field": {
                        "type": "fieldAccess",
                        "fieldName": "merged_sketch"
                }
        }],
        "intervals": ["2016-01-01T00:00:00.000Z/2016-01-31T00:00:00.000Z"]
}

The one limitation I have been running into is that the above query returns
both the aggregated merged_sketch and the quantiles array that the
post-aggregator generates from it. In this case I would prefer that the
query return only the quantiles array.

So instead of

{
        "version": "v1",
        "timestamp": "2019-06-25T00:00:00.000Z",
        "event": {
                "quantiles": [
                        0,
                        162569.21411280808,
                        5814934
                ],
                "merged_sketch": "AAAABBAXAS"
        }
}

I would prefer this:

{
        "version": "v1",
        "timestamp": "2019-06-25T00:00:00.000Z",
        "event": {
                "quantiles": [
                        0,
                        162569.21411280808,
                        5814934
                ]
        }
}

Is there a way to achieve this today? I tried changing the post-aggregator's
field accessor from

"field": {
        "type": "fieldAccess",
        "fieldName": "merged_sketch"
}

to

"field": {
        "type": "finalizingFieldAccess",
        "fieldName": "merged_sketch"
}

but that didn't help either.
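
For reference, the shape I am after is equivalent to dropping the
intermediate field from each result row on the client, which is the stopgap
I would like to avoid. A minimal sketch of that stopgap, assuming result
rows shaped like the example above; drop_fields is a hypothetical helper of
my own, not part of Druid:

```python
from typing import Any, Dict, Iterable, List


def drop_fields(rows: Iterable[Dict[str, Any]],
                unwanted: Iterable[str]) -> List[Dict[str, Any]]:
    """Return copies of groupBy result rows with the named event fields removed.

    Hypothetical client-side helper; it does not change what Druid computes
    or sends over the wire, it only trims the rows after they arrive.
    """
    unwanted_set = set(unwanted)
    cleaned = []
    for row in rows:
        # Keep every event field except the ones listed in `unwanted`.
        event = {k: v for k, v in row.get("event", {}).items()
                 if k not in unwanted_set}
        # Copy the row so the caller's original data is left untouched.
        cleaned.append({**row, "event": event})
    return cleaned


# One result row, copied from the example response above.
rows = [{
    "version": "v1",
    "timestamp": "2019-06-25T00:00:00.000Z",
    "event": {
        "quantiles": [0, 162569.21411280808, 5814934],
        "merged_sketch": "AAAABBAXAS",
    },
}]

trimmed = drop_fields(rows, ["merged_sketch"])
print(trimmed[0]["event"])
```

This only hides the field after the fact; the serialized sketch is still
computed and transferred, which is why a server-side option would be
preferable.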

Thanks,
Samarth
