wolffcm commented on issue #4809: URL: https://github.com/apache/arrow-datafusion/issues/4809#issuecomment-1378146101
@jiacai2050

> Subquery seems unnecessary, if time range in time_bucket_gapfill different with range in where clause, maybe we can overwrite where clause, and filter data in GapFill plan node, something like this(adopted from google docs above):

I understand what you're suggesting, but I worry that rewriting a filter like that would have unforeseen effects that are difficult to understand. For example, if the input to `Aggregate` was not a simple scan or filter, but instead the output of a derived table or a join, it could be hard to do a rewrite. What would the behavior be for that case?

I think this problem is a really tricky one. In the Timescale [docs](https://docs.timescale.com/api/latest/hyperfunctions/gapfilling-interpolation/locf/) for `locf()` there is a `prev` parameter which solves this problem. It is basically a subquery. It's a little awkward to have to type, but it has the advantage of not requiring other parts of the plan to be rewritten.

```sql
locf(
  avg(temperature),
  (SELECT temperature
   FROM metrics m2
   WHERE m2.time < now() - INTERVAL '2 week'
     AND m.device_id = m2.device_id
   ORDER BY time DESC
   LIMIT 1)
)
```

I'm curious about what you think of that approach.
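
To make the role of `prev` concrete, here is a minimal Python sketch of last-observation-carried-forward with an optional seed value. It only illustrates the semantics discussed above; it is not DataFusion or Timescale code, and the function name, signature, and sample values are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def locf(values: Sequence[Optional[float]],
         prev: Optional[float] = None) -> List[Optional[float]]:
    """Fill gaps (None) with the last observed value.

    `prev` is a hypothetical seed standing in for the result of the
    subquery: it is used only for gaps that occur before the first
    observed value in the bucketed range.
    """
    last = prev
    filled: List[Optional[float]] = []
    for value in values:
        if value is not None:
            # A real observation replaces whatever was being carried forward.
            last = value
        filled.append(last)
    return filled


# Aggregated buckets inside the queried time range; None marks an empty bucket.
buckets = [None, None, 21.5, None, 22.0, None]

# Without a seed, the leading gaps stay empty.
without_seed = locf(buckets)

# With a seed (the last value before the range), the leading gaps are filled.
with_seed = locf(buckets, prev=20.0)
```

The point of the sketch is that the seed is computed independently of the main scan, so the filter feeding `Aggregate` does not need to be widened or rewritten.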
