[GitHub] [arrow] jorgecarleitao commented on pull request #7880: ARROW-9619: [Rust] [DataFusion] Add predicate push-down

GitBox Thu, 13 Aug 2020 22:24:56 -0700


jorgecarleitao commented on pull request #7880:
URL: https://github.com/apache/arrow/pull/7880#issuecomment-673890773



   Thank you very much @alamb for reviewing it!
   
   This optimizer is mostly useful in the `table` or `DataFrame` API, in which 
a view can be declared as a sequence of statements that is organized for 
logical clarity and code structure rather than for efficient execution.
   
   One example is when we have a dataframe `df` that was constructed optimally, 
but we would like to look only at rows where `a > 2`. Instead of going 
through the code that built that DataFrame and working out the correct place 
for the filter, we can just write `df.filter(df['a'] > 2).collect()` and let 
the optimizer figure out where to place it.
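   
   To make the idea concrete, here is a minimal sketch of the rewrite. It uses a 
toy `Plan` enum and a hypothetical `push_down` function that I made up for 
illustration; it is not DataFusion's `LogicalPlan`, nor the algorithm in this PR, 
and it only handles one case (a `column > min` filter sitting on top of a 
projection that keeps that column).

```rust
// Toy logical plan for illustration only; NOT DataFusion's `LogicalPlan`.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Read all rows of a named table.
    Scan { table: String },
    /// Keep only the listed columns.
    Projection { columns: Vec<String>, input: Box<Plan> },
    /// Keep only rows where `column > min`.
    Filter { column: String, min: i64, input: Box<Plan> },
}

/// Hypothetical rule: move a `Filter` below a `Projection` whenever the
/// projection keeps the filtered column, so rows are dropped closer to the scan.
fn push_down(plan: Plan) -> Plan {
    match plan {
        Plan::Filter { column, min, input } => match push_down(*input) {
            // The projection keeps `column`, so the filter can go underneath it.
            Plan::Projection { columns, input: inner } if columns.contains(&column) => {
                Plan::Projection {
                    columns,
                    input: Box::new(push_down(Plan::Filter { column, min, input: inner })),
                }
            }
            // Anything else (e.g. a scan): leave the filter where it is.
            other => Plan::Filter { column, min, input: Box::new(other) },
        },
        // Not a filter: just optimize the child.
        Plan::Projection { columns, input } => Plan::Projection {
            columns,
            input: Box::new(push_down(*input)),
        },
        scan @ Plan::Scan { .. } => scan,
    }
}
```

   With this sketch, a plan written as `Filter(a > 2)` over `Projection([a, b])` 
over `Scan(t)` is rewritten so that the filter sits directly above the scan, 
which is the placement one would otherwise have to find by hand.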
   
   I incorporated the comments above into #7879, as IMO they are part of that 
PR, and rebased the whole thing. I will still address your comment about not 
fully understanding the algorithm by adding a more extended comment and maybe 
drawing some ASCII diagram to better explain the idea, so that it is not only 
in my head.


