jacobmarble commented on issue #3015:
URL: 
https://github.com/apache/arrow-datafusion/issues/3015#issuecomment-1204739188

   I don't know how far DataFusion likes to deviate from PostgreSQL-flavored 
SQL, but here's an idea for this feature.
   
   ### Suggestion 1
   Make the default value of the `origin` argument `1970-01-01 00:00:00 UTC`.
   
   It seems there is no default value specified by PostgreSQL. Certainly there 
exists a need for `origin` values like `XXXX-XX-XX 00:02:30 UTC`, but the far 
more common uses of `date_bin` are certainly with `stride` values like 1s, 10s, 
30s, 1m, 5m, 10m, 15m, 20m, 30m, 1h, 2h, 3h, 6h, etc, with the reasonable 
expectation that the result of `date_bin` align with a top-of-the-hour origin.
   
   The result is that 99% of queries written with this function will be shorter 
and less error prone.
   
   Some proposed, equivalent statements:
   ```sql
   -- as specified by PostgreSQL - origin at Unix epoch
   select date_bin('15 minutes', timestamp '2022-08-03 14:49:50 PST', timestamp 
'1970-01-01 00:00:00 UTC');
   -- 2022-08-03 14:45:00
   
   -- proposed
   select date_bin('15 minutes', timestamp '2022-08-03 14:49:50 PST');
   ```
   
   ### Suggestion 2
   Allow `origin` values with type `timestamp`, `date`, `time` or `interval`. 
Again, PostgreSQL doesn't have any prior art here.
   
   The result enables better readability.
   
   Some proposed, equivalent statements:
   ```sql
   -- as specified by PostgreSQL - origin shifts bins forward 5 minutes
   select date_bin('15 minutes', timestamp '2022-08-03 14:49:50 PST', timestamp 
'1970-01-01 00:05:00 UTC');
   -- 2022-08-03 14:35:00
   
   -- proposed
   select date_bin('15 minutes', timestamp '2022-08-03 14:49:50 PST', time 
'00:05:00');
   select date_bin('15 minutes', timestamp '2022-08-03 14:49:50 PST', interval 
'5 minutes');
   ```
   
   ```sql
   -- as specified by PostgreSQL - origin shifts bins back 5 minutes
   select date_bin('15 minutes', timestamp '2022-08-03 14:49:50 PST', timestamp 
'1969-12-31 23:55:00 UTC');
   -- 2022-08-03 14:40:00
   
   -- proposed
   select date_bin('15 minutes', timestamp '2022-08-03 14:49:50 PST', - time 
'00:05:00');
   select date_bin('15 minutes', timestamp '2022-08-03 14:49:50 PST', interval 
'- 5 minutes');
   ```
   
   ```sql
   -- as specified by PostgreSQL - origin shifts bins forward one day
   select date_bin('7 days', timestamp '2022-08-03 14:49:50 PST', timestamp 
'1970-01-02 00:00:00 UTC');
   -- 2022-07-29 00:00:00
   
   -- proposed
   select date_bin('7 days', timestamp '2022-08-03 14:49:50 PST', date 
'1970-01-02');
   select date_bin('7 days', timestamp '2022-08-03 14:49:50 PST', interval '1 
day');
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to