yordan-pavlov commented on pull request #8960:
URL: https://github.com/apache/arrow/pull/8960#issuecomment-748525543

@jorgecarleitao thanks for the detailed explanation. It's great to see that you have thought about optimizing the filtering of both single and multiple columns as much as possible.

Regarding the meaning of high vs. low selectivity of a filter, I agree it can be confusing: a highly selective filter is one which discards most of the data. It's not easy to find a good explanation from a credible source, but here is one:

> Selectivity could be defined as “the percentage of matching rows compared to total rows, regarding a given query’s criteria.” A lower percentage indicates higher selectivity. This means that if there are very few rows that meet a query’s (or an index’s, in the case of a filtered index) WHERE criteria compared to the total number of rows, the index is considered very selective.

from here: https://www.red-gate.com/simple-talk/sql/performance/introduction-to-sql-server-filtered-indexes/

You might be right though: it might be better to come up with more intuitive names for those benchmarks.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
