Hey Samarth,

This kind of thing is doable in Druid SQL, which only returns the columns
you SELECT. Native queries don't have a concept like that, so they always
return every aggregator and post-aggregator result, even if you intended
some of them to be 'internal' computations and aren't interested in seeing
them directly. If it makes sense for you to use SQL, I would suggest going
that route. Otherwise, it might be interesting to add a native query
feature to select only certain fields.
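
If you need to stay on native queries in the meantime, one workaround is
to drop the unwanted fields on the client after the results come back.
Here is a minimal Python sketch of that idea. It is not a Druid feature:
the helper name keep_fields and the wanted list are hypothetical, and it
assumes the groupBy result shape from your example (a list of rows, each
with an "event" map).

```python
from typing import Any, Dict, Iterable, List


def keep_fields(rows: Iterable[Dict[str, Any]],
                wanted: Iterable[str]) -> List[Dict[str, Any]]:
    """Return copies of groupBy result rows whose "event" map contains
    only the wanted fields. Hypothetical client-side helper, not a Druid API."""
    wanted_set = set(wanted)
    trimmed = []
    for row in rows:
        new_row = dict(row)  # shallow copy so the input row is not modified
        event = row.get("event", {})
        new_row["event"] = {k: v for k, v in event.items() if k in wanted_set}
        trimmed.append(new_row)
    return trimmed


# Sample row copied from the result shown in the quoted message.
rows = [{
    "version": "v1",
    "timestamp": "2019-06-25T00:00:00.000Z",
    "event": {
        "quantiles": [0, 162569.21411280808, 5814934],
        "merged_sketch": "AAAABBAXAS",
    },
}]

result = keep_fields(rows, ["quantiles"])
```

Note that this only hides merged_sketch from whatever consumes the result.
The sketch is still computed and sent back by Druid, so it does not save
any work on the cluster or on the wire.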

On Wed, Jun 26, 2019 at 3:30 PM Samarth Jain <samarth.j...@gmail.com> wrote:

> Hi,
>
> I recently contributed TDigest-based sketch aggregators to Druid. The
> contribution also included a post-aggregator that lets you generate
> quantiles from the aggregated sketches.
>
> Example query:
>
> {
>         "queryType": "groupBy",
>         "dataSource": "test_datasource",
>         "granularity": "ALL",
>         "dimensions": [],
>         "aggregations": [{
>                 "type": "mergeTDigestSketch",
>                 "name": "merged_sketch",
>                 "fieldName": "ingested_sketch",
>                 "compression": 200
>         }],
>         "postAggregations": [{
>                 "type": "quantilesFromTDigestSketch",
>                 "name": "quantiles",
>                 "fractions": [0, 0.5, 1],
>                 "field": {
>                         "type": "fieldAccess",
>                         "fieldName": "merged_sketch"
>                 }
>         }],
>         "intervals": ["2016-01-01T00:00:00.000Z/2016-01-31T00:00:00.000Z"]
> }
>
> The one limitation I have been running into is that the above query returns
> both the merged_sketch that was aggregated and the quantiles array that was
> generated by applying the post-aggregation to merged_sketch. What I would
> prefer in this case is for the query to return just the quantiles array.
>
> So instead of
>
> "version": "v1",
> "timestamp": "2019-06-25T00:00:00.000Z",
> "event": {
>         "quantiles": [
>                 0,
>                 162569.21411280808,
>                 5814934
>         ],
>         "merged_sketch": "AAAABBAXAS"
> }
>
> I would prefer this:
>
> "version": "v1",
> "timestamp": "2019-06-25T00:00:00.000Z",
> "event": {
>         "quantiles": [
>                 0,
>                 162569.21411280808,
>                 5814934
>         ]
> }
>
> Is there a way to achieve this today? I tried changing the post-aggregation
> field access from
>
> "field": {
>         "type": "fieldAccess",
>         "fieldName": "merged_sketch"
> }
>
> to
>
> "field": {
>         "type": "finalizingFieldAccess",
>         "fieldName": "merged_sketch"
> }
>
> but that didn't help either.
>
> Thanks,
> Samarth
>
