wolffcm commented on issue #4809:
URL: 
https://github.com/apache/arrow-datafusion/issues/4809#issuecomment-1378146101

   @jiacai2050 
   > Subquery seems unnecessary, if time range in time_bucket_gapfill different with range in where clause, maybe we can overwrite where clause, and filter data in GapFill plan node, something like this(adopted from google docs above):
   
   I understand what you're suggesting, but I worry that rewriting a filter 
like that would have unforeseen effects that are difficult to understand. For 
example, if the input to `Aggregate` was not a simple scan or filter, but 
instead the output of a derived table or a join, it could be hard to do a 
rewrite. What would the behavior be for that case?
   
   I think this problem is a really tricky one. In the Timescale 
[docs](https://docs.timescale.com/api/latest/hyperfunctions/gapfilling-interpolation/locf/)
 for `locf()` there is a `prev` parameter that addresses this problem. It is 
basically a subquery that looks up the last value before the queried time 
range. It's a little awkward to have to type, but it has the advantage of not 
requiring any other part of the plan to be rewritten.
   ```sql
     locf(
       avg(temperature),
       (SELECT temperature FROM metrics m2
        WHERE m2.time < now() - INTERVAL '2 week' AND m.device_id = m2.device_id
        ORDER BY time DESC LIMIT 1)
     )
   ```
   I'm curious about what you think of that approach.
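
To make the effect of `prev` concrete, here is a minimal Python sketch of last-observation-carried-forward over already-bucketed values. It is only an illustration of the semantics, not Timescale's or DataFusion's implementation; the function name `locf`, its signature, and the sample values are assumptions made up for this example.

```python
from typing import List, Optional, Sequence


def locf(values: Sequence[Optional[float]],
         prev: Optional[float] = None) -> List[Optional[float]]:
    """Fill gaps (None) with the last observed value.

    `prev` is a hypothetical seed: the last value seen before the
    queried time range, e.g. the result of a lookup subquery.
    """
    filled: List[Optional[float]] = []
    last = prev  # value carried in from before the first bucket
    for value in values:
        if value is not None:
            last = value  # a real observation replaces the carried value
        filled.append(last)
    return filled


# One aggregated value per time bucket; None marks an empty bucket.
buckets = [None, None, 21.5, None, 23.0, None]

without_prev = locf(buckets)           # leading gaps stay empty
with_prev = locf(buckets, prev=20.0)   # leading gaps take the seed value
```

Without a seed, the leading empty buckets cannot be filled from anything inside the queried range, which is the case the `prev` lookup covers without touching the `WHERE` clause.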


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.