mustafasrepo commented on issue #7518: URL: https://github.com/apache/arrow-datafusion/issues/7518#issuecomment-1713623935
When a window function such as rank is used, a single pass over the whole data is sufficient; all we need is to find the partition boundaries in the data. However, for window frames (with GROUPS and RANGE) we have to find the preceding and following boundary for each row, and those boundaries are determined by the values of that row. Hence this calculation has to be done per row, on the subset of the table around the row of interest (storing the last boundaries found helps shrink the search space). I am not sure whether the window frame boundary search can be vectorized. That being said, I guess we can use the `Row` type instead of `Vec<ScalarValue>`.
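
To make the per-row boundary search concrete, here is a minimal sketch in plain Rust. It is not DataFusion's implementation: `range_frame_bounds` is a hypothetical helper, and it assumes a single `i64` ORDER BY column that is already sorted ascending within one partition, with non-negative RANGE offsets. It shows how remembering the last boundaries found lets each row resume the search instead of restarting it.

```rust
use std::ops::Range;

/// Hypothetical helper (not a DataFusion API): for each row of an
/// ascending-sorted column, return the half-open index range of rows whose
/// value lies in [value - preceding, value + following].
/// Assumes `preceding` and `following` are non-negative.
fn range_frame_bounds(values: &[i64], preceding: i64, following: i64) -> Vec<Range<usize>> {
    let mut bounds = Vec::with_capacity(values.len());
    // Last boundaries found; because the input is sorted, they never move
    // backwards, so each row resumes the search where the previous one stopped.
    let (mut start, mut end) = (0usize, 0usize);
    for &v in values {
        let lower = v.saturating_sub(preceding);
        let upper = v.saturating_add(following);
        // Advance the frame start past rows that fall below the lower bound.
        while start < values.len() && values[start] < lower {
            start += 1;
        }
        // Advance the frame end over rows that are still within the upper bound.
        while end < values.len() && values[end] <= upper {
            end += 1;
        }
        bounds.push(start..end);
    }
    bounds
}

fn main() {
    let values = [1, 2, 4, 7, 8];
    // Equivalent in spirit to RANGE BETWEEN 1 PRECEDING AND 2 FOLLOWING.
    for (v, frame) in values.iter().zip(range_frame_bounds(&values, 1, 2)) {
        println!("value {v}: frame rows {frame:?}");
    }
}
```

Each of the two cursors only moves forward, so the total work is linear in the number of rows, but the result for a row still depends on the state left by the previous row. That sequential dependency is the reason vectorizing the search is not obvious.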
