richardstartin opened a new pull request #8397:
URL: https://github.com/apache/pinot/pull/8397


   This PR starts with a small refactoring to allow `TransformFunctionType`s to 
have aliases, then introduces `TransformFunction`s implemented in terms of 
`java.time` for the following functions:
   
   * `year`
   * `yearOfWeek`
   * `yow`
   * `month`
   * `week`
   * `weekOfYear`
   * `quarter`
   * `dayOfWeek`
   * `dow`
   * `dayOfYear`
   * `doy`
   * `dayOfMonth`
   * `day`
   * `hour`
   * `minute`
   * `millisecond`
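
To illustrate the idea, here is a minimal sketch of a function type that carries aliases and extracts a field with `java.time`. It is not the code in this PR: the `DateField` enum, its `forName` and `extract` methods, and the grouping of names into aliases (for example `yow` with `yearOfWeek`) are assumptions made for illustration, and values are interpreted as epoch milliseconds in UTC.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.temporal.ChronoField;
import java.time.temporal.IsoFields;
import java.time.temporal.TemporalField;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a function type with aliases; not Pinot's actual enum.
enum DateField {
  YEAR(ChronoField.YEAR, "year"),
  YEAR_OF_WEEK(IsoFields.WEEK_BASED_YEAR, "yearOfWeek", "yow"),
  MONTH(ChronoField.MONTH_OF_YEAR, "month"),
  WEEK(IsoFields.WEEK_OF_WEEK_BASED_YEAR, "week", "weekOfYear"),
  QUARTER(IsoFields.QUARTER_OF_YEAR, "quarter"),
  DAY_OF_WEEK(ChronoField.DAY_OF_WEEK, "dayOfWeek", "dow");

  // Case-insensitive lookup from every name or alias to its constant.
  private static final Map<String, DateField> BY_NAME = new HashMap<>();

  static {
    for (DateField dateField : values()) {
      for (String name : dateField.names) {
        BY_NAME.put(name.toLowerCase(Locale.ROOT), dateField);
      }
    }
  }

  private final TemporalField field;
  private final String[] names;

  DateField(TemporalField field, String... names) {
    this.field = field;
    this.names = names;
  }

  // Resolves a function name or alias, ignoring case.
  static DateField forName(String name) {
    DateField dateField = BY_NAME.get(name.toLowerCase(Locale.ROOT));
    if (dateField == null) {
      throw new IllegalArgumentException("Unknown function: " + name);
    }
    return dateField;
  }

  // Extracts this field from each epoch-millisecond value, interpreted in UTC.
  int[] extract(long[] epochMillis) {
    int[] result = new int[epochMillis.length];
    for (int i = 0; i < epochMillis.length; i++) {
      result[i] = Instant.ofEpochMilli(epochMillis[i]).atZone(ZoneOffset.UTC).get(field);
    }
    return result;
  }

  public static void main(String[] args) {
    long[] values = {0L, 1648080000000L};
    for (String name : new String[] {"year", "yow", "month", "dow"}) {
      System.out.println(name + " -> " + java.util.Arrays.toString(forName(name).extract(values)));
    }
  }
}
```

The alias lookup is resolved once per query rather than per row, so the per-row work in this sketch is only the `java.time` field extraction.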
   
   The transform functions are tested against the existing `@ScalarFunction`s, which 
provide a free test oracle for migrating the latter away from Joda-Time in the 
future.
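
The test-oracle approach can be sketched roughly as follows. This is not the test code in this PR: the `OracleCheck` class and its methods are hypothetical, and the legacy `java.util.Calendar` API stands in for the Joda-backed scalar function so that the example needs only the JDK.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Random;
import java.util.TimeZone;
import java.util.function.LongToIntFunction;

// Hypothetical differential test: compare a new implementation with a trusted one.
final class OracleCheck {

  // Reference implementation (the oracle). Calendar stands in for the scalar function here.
  static int yearReference(long epochMillis) {
    Calendar calendar = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
    calendar.setTimeInMillis(epochMillis);
    return calendar.get(Calendar.YEAR);
  }

  // Candidate implementation written against java.time.
  static int yearCandidate(long epochMillis) {
    return Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC).getYear();
  }

  // Counts the sampled inputs on which the two implementations disagree.
  static int countMismatches(LongToIntFunction reference, LongToIntFunction candidate,
      long seed, int samples) {
    Random random = new Random(seed);
    int mismatches = 0;
    for (int i = 0; i < samples; i++) {
      // Whole seconds between 1970 and 2038, expressed as epoch milliseconds.
      long epochMillis = random.nextInt(Integer.MAX_VALUE) * 1000L;
      if (reference.applyAsInt(epochMillis) != candidate.applyAsInt(epochMillis)) {
        mismatches++;
      }
    }
    return mismatches;
  }

  public static void main(String[] args) {
    int mismatches = countMismatches(OracleCheck::yearReference, OracleCheck::yearCandidate, 42L, 10_000);
    System.out.println("mismatches: " + mismatches);
  }
}
```

Because the oracle already exists, no expected values need to be written by hand; any disagreement points at a bug in one of the two implementations.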
   
   The motivation for the change is the high cost of using scalar functions in 
transforms. Here is the allocation rate observed for 
   ```sql
   select count(*), year(event_time) as y, month(event_time) as m from githubEvents group by y, m
   ```
   <img width="854" alt="Screenshot 2022-03-24 at 12 11 19" src="https://user-images.githubusercontent.com/16439049/159913668-602fc431-56e7-43ab-8187-c14952d40f21.png">
   
   
   
   vs. the similar query
   ```sql
   select count(*), dateTrunc('year', event_time) as y, dateTrunc('month', event_time) as m from githubEvents group by y, m
   ```
   <img width="1607" alt="Screenshot 2022-03-24 at 12 11 33" src="https://user-images.githubusercontent.com/16439049/159913696-563defb7-aade-4ffe-97f9-62aab1c01dc8.png">
   
   When the scalar function is used, 16% of samples are spent verifying the 
accessibility of the `Method`, so `@ScalarFunction` should be used only as a last 
resort.
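
For context, the sketch below contrasts the two call styles. It is an illustration under assumptions, not Pinot code: `InvocationStyles` and `plusOne` are hypothetical names, and the sketch does not reproduce the 16% figure. The reflective path goes through `Method.invoke` for every value, which boxes the argument and result and checks access on each call unless `setAccessible(true)` has been set; the direct path is an ordinary call over a primitive array.

```java
import java.lang.reflect.Method;
import java.util.function.LongUnaryOperator;

// Hypothetical comparison of reflective and direct invocation over a block of values.
final class InvocationStyles {

  // Hypothetical scalar function, looked up by name in the reflective path.
  public static long plusOne(long value) {
    return value + 1;
  }

  // Calls the function through reflection once per value.
  static long[] viaReflection(long[] input) {
    try {
      Method method = InvocationStyles.class.getDeclaredMethod("plusOne", long.class);
      long[] output = new long[input.length];
      for (int i = 0; i < input.length; i++) {
        // Each call boxes the argument, wraps it in an Object[], and unboxes the result.
        output[i] = (Long) method.invoke(null, input[i]);
      }
      return output;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  // Calls the function directly on primitives, with no boxing or access checks.
  static long[] direct(long[] input, LongUnaryOperator function) {
    long[] output = new long[input.length];
    for (int i = 0; i < input.length; i++) {
      output[i] = function.applyAsLong(input[i]);
    }
    return output;
  }

  public static void main(String[] args) {
    long[] input = {1L, 2L, 3L};
    System.out.println(java.util.Arrays.toString(viaReflection(input)));
    System.out.println(java.util.Arrays.toString(direct(input, InvocationStyles::plusOne)));
  }
}
```

Both paths return the same values; the difference is only in the per-row overhead, which is what the profiles above measure.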
   
   <img width="782" alt="Screenshot 2022-03-24 at 12 12 00" src="https://user-images.githubusercontent.com/16439049/159913825-21ccc0bd-fb2c-41ce-b60b-932e6749af0e.png">
    


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


