cisco2825 opened a new issue, #14914:
URL: https://github.com/apache/druid/issues/14914
I have written a Druid topN query to fetch the last n transactions from the database. I aggregate the __time column with a longLast aggregator and use it as the metric for the topN query.
However, this query takes too long when the data spans many segments. Can someone please suggest optimizations to make the query faster, or another, faster way to fetch the last n transactions?
Here is the structure of the query:
```json
{
  "dataSource": {
    "type": "query",
    "query": {
      "dataSource": {
        "type": "table",
        "name": "txn"
      },
      "queryType": "topN",
      "intervals": [
        "2021-08-24T15:51:54.599+05:30/2023-08-24T10:21:54.599Z"
      ],
      "granularity": "all",
      "virtualColumns": [
        {
          "type": "expression",
          "name": "sourceValue",
          "outputType": "STRING",
          "expression": "regexp_extract(trim(transEntityRecords), '(.*)nslAttributeID(.*)909075081867,.values.:(..([-0-9]*(.[-0-9]+)?))(.*)', 4)"
        }
      ],
      "filter": {
        "type": "and",
        "fields": [
          {
            "type": "in",
            "dimension": "gsiId",
            "values": ["1595974428413"]
          },
          {
            "type": "in",
            "dimension": "triggerCuId",
            "values": ["1240994321229"]
          },
          {
            "type": "selector",
            "dimension": "triggerState",
            "value": "COMPLETED"
          },
          {
            "type": "or",
            "fields": [
              {
                "type": "selector",
                "dimension": "gsiName",
                "value": "MaxLimitLoadTestingGSI"
              },
              {
                "type": "selector",
                "dimension": "cuName",
                "value": "test CU27281"
              },
              {
                "type": "selector",
                "dimension": "cuIndex",
                "value": "1"
              },
              {
                "type": "selector",
                "dimension": "txnSlotItemDataName",
                "value": "NumberEntity 862"
              }
            ]
          }
        ]
      },
      "aggregations": [
        {
          "type": "count",
          "name": "count"
        },
        {
          "type": "doubleLast",
          "name": "sourceValue",
          "fieldName": "sourceValue"
        },
        {
          "type": "longLast",
          "name": "time",
          "fieldName": "__time"
        }
      ],
      "dimension": {
        "type": "default",
        "dimension": "transactionId",
        "outputName": "transactionId",
        "outputType": "STRING"
      },
      "threshold": 10000,
      "metric": "time"
    }
  },
  "queryType": "groupBy",
  "intervals": [
    "2021-08-24T15:51:54.599+05:30/2023-08-24T10:21:54.599Z"
  ],
  "granularity": "all",
  "aggregations": [
    {
      "type": "doubleMean",
      "name": "sourceMean",
      "fieldName": "sourceValue"
    },
    {
      "type": "count",
      "name": "count"
    }
  ],
  "dimensions": []
}
```
NOTE:
1) The interval needs to cover 2 years, so it cannot be reduced.
2) All the filters and virtual columns are essential.
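For comparison, one common pattern for "latest rows" in Druid is a Scan query ordered on __time descending, which avoids the per-transactionId topN aggregation entirely. This is only a sketch of the idea, and only equivalent under the assumption that each transactionId corresponds to a single row; the filter and virtual column from the query above are omitted here for brevity but would be carried over unchanged:

```json
{
  "queryType": "scan",
  "dataSource": "txn",
  "intervals": [
    "2021-08-24T15:51:54.599+05:30/2023-08-24T10:21:54.599Z"
  ],
  "columns": ["__time", "transactionId", "transEntityRecords"],
  "order": "descending",
  "limit": 10000
}
```

The downstream doubleMean over sourceValue would then need to be computed by the caller (or via a wrapping query), since Scan returns raw rows rather than aggregates.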
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]