chenboat opened a new issue #3515: DATETIMECONVERT udf conversion to Simple_Date_Format are 50x slower than conversion to epoch time URL: https://github.com/apache/incubator-pinot/issues/3515 The following query which uses DATETIMECONVERT to convert epoch time to simple data format time finishes in 3mins on a table with 400 million rows. `SELECT COUNT(*) AS count FROM table1 WHERE secondsSinceEpoch BETWEEN 1541894400 AND 1542499200 GROUP BY DATETIMECONVERT(secondsSinceEpoch, '1:SECONDS:EPOCH', '1:DAYS:SIMPLE_DATE_FORMAT:yyyy-MM-dd HH:mm:ss', '1:DAYS')` If we use epoch time instead, the processing time is only 5 seconds. `SELECT COUNT(*) AS count FROM table1 WHERE secondsSinceEpoch BETWEEN 1541894400 AND 1542499200 GROUP BY DATETIMECONVERT(secondsSinceEpoch, '1:SECONDS:EPOCH', '1:DAYS:EPOCH', '1:DAYS')` Most of the query time is spent on time format conversion. If we perform the time format conversion only during the server combination step instead of doing it for every row, can we get better query perf? @fx19880617 @npawar
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
