[GitHub] [arrow] yordan-pavlov commented on pull request #8960: ARROW-10540: [Rust] Extended filter kernel to all types and improved performance

GitBox Sat, 19 Dec 2020 13:11:54 -0800


yordan-pavlov commented on pull request #8960:
URL: https://github.com/apache/arrow/pull/8960#issuecomment-748525543



   @jorgecarleitao thanks for the detailed explanation. It's great to see that 
you have thought about optimizing the filtering of both single and multiple 
columns as much as possible.
   
   Regarding the meaning of high vs. low selectivity of a filter, I agree it 
can be confusing: a highly selective filter is one that discards most of the 
data. It's not easy to find a good explanation from a credible source, but here 
is one:
   
   > Selectivity could be defined as “the percentage of matching rows compared 
to total rows, regarding a given query’s criteria.” A lower percentage 
indicates higher selectivity. This means that if there are very few rows that 
meet a query’s (or an index’s, in the case of a filtered index) WHERE criteria 
compared to the total number of rows, the index is considered very selective.
   
   Source: 
https://www.red-gate.com/simple-talk/sql/performance/introduction-to-sql-server-filtered-indexes/
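   
   To make the quoted definition concrete, here is a minimal sketch in plain 
Rust. It does not use the Arrow API; the names `matching_fraction` and 
`is_highly_selective`, and the idea of a cut-off `threshold`, are hypothetical 
and only for illustration. It computes the fraction of rows a boolean mask 
keeps and treats a low fraction as high selectivity:

```rust
/// Fraction of rows kept by a boolean filter mask.
/// Returns 0.0 for an empty mask (an arbitrary choice for this sketch).
fn matching_fraction(mask: &[bool]) -> f64 {
    if mask.is_empty() {
        return 0.0;
    }
    // Count the `true` entries, i.e. the rows that pass the filter.
    let kept = mask.iter().filter(|&&keep| keep).count();
    kept as f64 / mask.len() as f64
}

/// A filter is "highly selective" when it keeps few rows, i.e. when the
/// matching fraction is below `threshold` (a hypothetical cut-off).
fn is_highly_selective(mask: &[bool], threshold: f64) -> bool {
    matching_fraction(mask) < threshold
}
```

   For example, a mask that keeps 10 of 1000 rows has a matching fraction of 
about 0.01 and so counts as highly selective, whereas a mask that keeps 990 of 
1000 rows has low selectivity even though it matches far more rows.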
   
   You might be right, though: it might be better to come up with more 
intuitive names for those benchmarks.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
