estebanz01 opened a new issue, #12734:
URL: https://github.com/apache/pinot/issues/12734

   Hola! 👋 
   
   So, playing around with Pinot, I found a small issue that bothers me a bit, so I thought it might be good to raise it here. I'm dealing with high-frequency data, so we need to timestamp points with _microsecond_ precision. We could even have a case for _nanoseconds_, but at that point it's better to just handle it at the edge. Anyhow, here's what I have as a schema:
   
   <details>
   <summary>Microseconds schema</summary>
   
   ```json
   {
     "schemaName": "data_counting_temp",
     "dimensionFieldSpecs": [
       {
         "name": "device_uuid",
         "dataType": "STRING"
       }
     ],
     "metricFieldSpecs": [
       {
         "name": "noise",
         "dataType": "LONG"
       }
     ],
     "dateTimeFieldSpecs": [
       {
         "name": "__key",
         "dataType": "TIMESTAMP",
         "format": "1:MICROSECONDS:EPOCH",
         "granularity": "1:MICROSECONDS"
       },
       {
         "name": "__metadata$eventTime",
         "dataType": "TIMESTAMP",
         "format": "1:MICROSECONDS:EPOCH",
         "granularity": "1:MICROSECONDS"
       }
     ]
   }
   ```
   
   </details>
   
   The data comes from Pulsar, which is why I have the `__metadata$eventTime` and `__key` fields.
   
   The granularity is supported, since it uses the time unit enums from https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/TimeUnit.html. The problem appears when querying through the API. If I look at the table data in the Pinot UI, I get the following:
   
   ![a pinot SQL table result showing wrong 
timestamp](https://github.com/apache/pinot/assets/2417465/2482b721-bb2a-4cdc-a716-2f5ddc3ce86c)
   
   which I initially assumed was fine, thinking the values were converted on the frontend for display purposes. But that's not it: the pre-formatted data _comes_ from the backend:
   
   ```json
   {
     "resultTable": {
       "dataSchema": {
         "columnNames": [
           "__key",
           "__metadata$eventTime"
         ],
         "columnDataTypes": [
           "TIMESTAMP",
           "TIMESTAMP"
         ]
       },
       "rows": [
         [
           "56202-05-11 12:33:13.431",
           "56202-05-11 12:33:13.431"
         ],
         [
           "56202-05-11 12:34:14.949",
           "56202-05-11 12:34:14.949"
         ],
         [
           "56202-05-11 12:34:37.971",
           "56202-05-11 12:34:37.971"
         ],
         [
           "56202-05-11 12:34:44.0",
           "56202-05-11 12:34:44.0"
         ],
         [
           "56202-05-11 12:34:54.065",
           "56202-05-11 12:34:54.065"
         ],
         [
           "56202-05-11 12:35:55.881",
           "56202-05-11 12:35:55.881"
         ],
         [
           "56202-05-11 12:36:19.978",
           "56202-05-11 12:36:19.978"
         ],
         [
           "56202-05-11 12:36:24.787",
           "56202-05-11 12:36:24.787"
         ],
         [
           "56202-05-11 12:36:36.25",
           "56202-05-11 12:36:36.25"
         ],
         [
           "56202-05-11 12:37:36.541",
           "56202-05-11 12:37:36.541"
         ]
       ]
     },
     "requestId": "1093678062000000000",
     "brokerId": "<broker-id>",
     "exceptions": [],
     "numServersQueried": 4,
     "numServersResponded": 4,
     "numSegmentsQueried": 2632,
     "numSegmentsProcessed": 337,
     "numSegmentsMatched": 4,
     "numConsumingSegmentsQueried": 1,
     "numConsumingSegmentsProcessed": 1,
     "numConsumingSegmentsMatched": 0,
     "numDocsScanned": 40,
     "numEntriesScannedInFilter": 0,
     "numEntriesScannedPostFilter": 80,
     "numGroupsLimitReached": false,
     "totalDocs": 42393394,
     "timeUsedMs": 365,
     "offlineThreadCpuTimeNs": 0,
     "realtimeThreadCpuTimeNs": 0,
     "offlineSystemActivitiesCpuTimeNs": 0,
     "realtimeSystemActivitiesCpuTimeNs": 0,
     "offlineResponseSerializationCpuTimeNs": 0,
     "realtimeResponseSerializationCpuTimeNs": 0,
     "offlineTotalCpuTimeNs": 0,
     "realtimeTotalCpuTimeNs": 0,
     "brokerReduceTimeMs": 13,
     "segmentStatistics": [],
     "traceInfo": {},
     "minConsumingFreshnessTimeMs": 1711547747582,
     "numSegmentsPrunedByBroker": 0,
     "numSegmentsPrunedByServer": 2295,
     "numSegmentsPrunedInvalid": 0,
     "numSegmentsPrunedByLimit": 0,
     "numSegmentsPrunedByValue": 2295,
     "explainPlanNumEmptyFilterSegments": 0,
     "explainPlanNumMatchAllFilterSegments": 0,
     "numRowsResultSet": 10
   }
   ```
   
   Basically, when querying these two fields, I don't get the `epoch` number I expect, but a naive conversion to `yyyy-MM-dd hh:mm:ss.sssssZ`. Shouldn't the SQL API check the declared format and, based on that, either apply the transformation or return the raw value?
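
   A quick back-of-the-envelope check (a hedged Python sketch; the sample epoch is hypothetical, picked from the same late-March-2024 range as `minConsumingFreshnessTimeMs` above) shows that reading a microsecond epoch as if it were milliseconds lands right around year 56200, matching the rows above:

   ```python
   from datetime import datetime, timezone

   # Hypothetical microsecond-precision epoch, roughly matching the
   # ingestion window of the data above (~2024-03-27).
   epoch_micros = 1711547747582000

   # Interpreted correctly as microseconds since the Unix epoch:
   correct = datetime.fromtimestamp(epoch_micros / 1_000_000, tz=timezone.utc)
   print(correct.year)  # 2024

   # Interpreted naively as milliseconds, the value is 1000x too large.
   # datetime can't even represent it (max year is 9999), so approximate
   # the resulting year arithmetically:
   SECONDS_PER_YEAR = 365.2425 * 24 * 3600
   naive_year = 1970 + (epoch_micros / 1_000) / SECONDS_PER_YEAR
   print(round(naive_year))  # ~56206, i.e. the "56202-..." rows above
   ```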
   
   My workaround is ugly, but oh well: use `timeConvert` as follows:
   ```sql
   SELECT timeConvert(__key, 'MICROSECONDS', 'MICROSECONDS') as epoch FROM table
   ```
   
   which gives me the expected info:
   ![another table results showing correct 
epochs](https://github.com/apache/pinot/assets/2417465/fff843c7-3215-4f2f-8bbc-4e4cefc4637c)
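   
   For anyone scripting around this, the workaround query can be issued against the broker's standard `/query/sql` endpoint. A minimal sketch (the broker URL and table name below are placeholders for your own deployment):
   
   ```python
   import json
   from urllib import request

   # Placeholders: adjust to your deployment (8099 is the default broker port).
   BROKER_URL = "http://localhost:8099/query/sql"
   SQL = ("SELECT timeConvert(__key, 'MICROSECONDS', 'MICROSECONDS') AS epoch "
          "FROM data_counting_temp LIMIT 10")

   payload = json.dumps({"sql": SQL}).encode("utf-8")

   def fetch_epochs():
       """POST the query to the Pinot broker and return the raw LONG epochs."""
       req = request.Request(
           BROKER_URL, data=payload,
           headers={"Content-Type": "application/json"})
       with request.urlopen(req) as resp:
           body = json.load(resp)
       return [row[0] for row in body["resultTable"]["rows"]]
   ```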
   
   Let me know if I should close this. Thanks in advance!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

