ly923976094 commented on issue #3626: timezone problem
URL: https://github.com/apache/incubator-pinot/issues/3626#issuecomment-450606119

> Yes, you are right: the UDF converts each row rather than the result. This is required to ensure that the right grouping happens after applying the UDF. However, in your case it is not necessary, as each day already has a single millis value (which is a special case). We will explore making this faster for your case (e.g. by caching the conversion). In the meanwhile, is it possible for you to apply the conversion on the client side? Also, if you can share, are you already using Pinot in production, or evaluating it at the moment?
>
> ________________________________
> From: Sun-Li <[email protected]>
> Sent: Sunday, December 30, 2018 6:37 PM
> To: apache/incubator-pinot
> Cc: Mayank Shrivastava; Comment
> Subject: Re: [apache/incubator-pinot] timezone problem (#3626)
>
> > Thanks for sharing the information. From the second result it seems the query selects 1.2B records, which means too much computation needs to happen. Do you really need to store the data at millisecond granularity? If not, pre-aggregating the offline data at a larger granularity will reduce the data size, and hence the number of records to process. May I ask which company/organization this use case is for?
>
> I work at Sina Weibo (Beijing) as a big data R&D engineer. The data in this table has already been pre-aggregated by day outside Pinot, and the millisecond timestamp of each day is stored in Pinot. In fact, if the time is not converted using SIMPLE_DATE_FORMAT, the query is very fast (but then the time zone cannot be specified, so the reported time does not match the real time). SIMPLE_DATE_FORMAT converts each record rather than the query result.

Yes, I already use Pinot in production. I want to use pql2 (select sum(total_count) from video_market-video_download_performance_full_day where fdate >= 1543593600000 and fdate < 1546271999000 group by dateTimeConvert(fdate, '1:MILLISECONDS:EPOCH','1:MILLISECONDS:EPOCH', '1:DAYS')) to get the result and then convert the timestamp on the client side, but what comes out is in UTC: the stored timestamps are day boundaries in the Shanghai time zone, Pinot processes them as UTC, and the result is off by 16 hours. What do you mean by caching the conversion?
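
To make the 16-hour offset and the client-side workaround concrete, here is a minimal Python sketch. It is not Pinot code, and several things in it are assumptions: that `1:DAYS` bucketing floors to UTC day boundaries (as described above), that a fixed UTC+8 offset is an adequate stand-in for the Shanghai zone over the queried range, that the query groups by the raw `fdate` value instead of `dateTimeConvert`, and that "caching the conversion" means something like memoizing the formatting per distinct millis value. The helper names and the counts in `rows` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from functools import lru_cache

# Assumption: a fixed UTC+8 offset stands in for the Shanghai time zone.
SHANGHAI = timezone(timedelta(hours=8))
MILLIS_PER_DAY = 24 * 60 * 60 * 1000


def utc_day_bucket(epoch_millis: int) -> int:
    """Floor a timestamp to the start of its UTC day, in epoch milliseconds."""
    return epoch_millis - epoch_millis % MILLIS_PER_DAY


@lru_cache(maxsize=None)
def to_shanghai_date(epoch_millis: int) -> str:
    """Format an epoch-millisecond timestamp as yyyy-MM-dd in UTC+8.

    Memoized: with one millis value per day, each distinct value is
    formatted once and later lookups are served from the cache.
    """
    local = datetime.fromtimestamp(epoch_millis / 1000, tz=SHANGHAI)
    return local.strftime("%Y-%m-%d")


def relabel(rows):
    """Map (fdate_millis, total) pairs to {local_date: summed_total}."""
    totals = {}
    for fdate_millis, total in rows:
        day = to_shanghai_date(fdate_millis)
        totals[day] = totals.get(day, 0) + total
    return totals


# Hypothetical result rows (fdate, sum(total_count)); the counts are made up.
rows = [(1543593600000, 10), (1543593600000, 5), (1543680000000, 7)]

# The lower bound of the query is midnight on 2018-12-01 in UTC+8, but
# flooring it to a UTC day lands 16 hours earlier, on 2018-11-30 UTC.
print(utc_day_bucket(1543593600000))
print(relabel(rows))
```

Because the table already holds exactly one millis value per day, grouping on the raw value and formatting it afterwards touches only a handful of distinct timestamps, which is why the conversion is cheap on the client.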
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
