Jimexist commented on pull request #403:
URL: https://github.com/apache/arrow-datafusion/pull/403#issuecomment-850049189


   > Thank you @Jimexist  -- the  code structure looks clean to me . Nice work
   > 
   > 
   > 
   > However, I am concerned about the correctness of these results. As I 
understand it, `first_value`, `last_value` and `nth_value` are not well defined 
unless there is an ordering on the window (aka without an ordering you 
basically can get some arbitrary value from the window). 
   > 
   > 
   > 
   > I wonder if it would make sense to implement ordering for windows first, 
so we can write tests will well defined output
   > 
   > 
   > 
   > I also see some change to the `parquet-testing` module which I wonder if 
that was intended
   
   I guess it's not arbitrary but rather just take the ordering as is. When 
https://github.com/apache/arrow-datafusion/pull/425 is merged I'll add one test 
case to compare with psql so the behavior is consistent.
   
   Of course when ordering clause is implemented then the behavior can also be 
tested in the same way, along with some unit test.
   
   I plan to implement ordering after this and the lead/lag PR are merged 
because it requires some structural changes to the planner.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to