alamb commented on issue #7363: URL: https://github.com/apache/arrow-rs/issues/7363#issuecomment-2797040089
I spent some more time today thinking about what filter patterns are important to test and came up with the following six patterns (to replace the 4 I suggested above).

What do you think @zhuqi-lucas? @XiangpengHao, did I miss any important filter patterns?

```text
  ┌───────────────┐     ┌───────────────┐
  │               │     │               │
  │               │     │      ...      │
  │               │     │               │
  │               │     │               │
  │      ...      │     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
  │               │     │               │
  │               │     │      ...      │
  │               │     │               │
  │               │     │               │
  └───────────────┘     └───────────────┘

    "Point Lookup": selects a single row
         (1 RowSelection of 1 row)

  ┌───────────────┐     ┌───────────────┐
  │      ...      │     │               │
  │               │     │               │
  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│     │               │
  │               │     │      ...      │
  │               │     │               │
  │               │     │               │
  │      ...      │     │               │
  │               │     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
  │               │     │               │
  └───────────────┘     └───────────────┘

     selective (1%) unclustered filter
   (1,000 RowSelections of 10 rows each)

  ┌───────────────┐     ┌───────────────┐         ┌───────────────┐     ┌───────────────┐
  │      ...      │     │               │         │               │     │               │
  │               │     │               │         │               │     │               │
  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│         │               │     │      ...      │
  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│     │               │         │               │     │               │
  │               │     │               │         │      ...      │     │               │
  │               │     │      ...      │         │               │     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
  │      ...      │     │               │         │               │     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
  │               │     │               │         │               │     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
  │               │     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│         │               │     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
  └───────────────┘     └───────────────┘         └───────────────┘     └───────────────┘

moderately selective (10%) clustered filter      moderately selective (10%) unclustered filter
  (10 RowSelections of 10,000 rows each)            (10,000 RowSelections of 10 rows each)

  ┌───────────────┐     ┌───────────────┐         ┌───────────────┐     ┌───────────────┐
  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│         │               │     │               │
  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│         │               │     │               │
  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│         │               │     │      ...      │
  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│         │               │     │               │
  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│         │      ...      │     │               │
  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│     │               │         │               │     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│         │               │     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│         │               │     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│         │               │     │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
  └───────────────┘     └───────────────┘         └───────────────┘     └───────────────┘

   unselective (99%) unclustered filter             unselective (90%) clustered filter
  (99,000 RowSelections of 10 rows each)            (99 RowSelections of 10,000 rows each)
```
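To make the arithmetic behind the captions concrete, here is a minimal sketch that lays out each pattern as alternating skip/select runs and reports how many rows end up selected. It rests on assumptions: the comment does not state the total row count, so `TOTAL_ROWS` is set to 1,000,000 (the value implied by 1,000 selections of 10 rows being 1%), and `Run` and `evenly_spaced` are hypothetical helpers for illustration, not the parquet crate's `RowSelection` API. Under that assumption, 99 selections of 10,000 rows come to 99% of the rows, so the "90%" in the last caption may be a typo.

```rust
/// Hypothetical stand-in for one skip/select run; NOT the parquet crate's API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Run {
    select: bool,
    rows: usize,
}

/// Assumed total row count (not stated in the comment; implied by the 1% caption).
const TOTAL_ROWS: usize = 1_000_000;

/// Lay out `count` selected runs of `len` rows each, separated by skipped gaps
/// that are as even as integer division allows.
fn evenly_spaced(total: usize, count: usize, len: usize) -> Vec<Run> {
    assert!(count > 0 && count * len <= total, "selection does not fit");
    let skipped = total - count * len;
    let mut runs = Vec::with_capacity(2 * count + 1);
    for i in 0..count {
        // Gap before the i-th selected run; these gaps plus the tail sum to `skipped`.
        let gap = skipped * (i + 1) / (count + 1) - skipped * i / (count + 1);
        if gap > 0 {
            runs.push(Run { select: false, rows: gap });
        }
        // When a gap is zero, consecutive selected runs are simply adjacent.
        runs.push(Run { select: true, rows: len });
    }
    let tail = skipped - skipped * count / (count + 1);
    if tail > 0 {
        runs.push(Run { select: false, rows: tail });
    }
    runs
}

/// Total number of rows covered by selected runs.
fn selected_rows(runs: &[Run]) -> usize {
    runs.iter().filter(|r| r.select).map(|r| r.rows).sum()
}

fn main() {
    // (name, number of selections, rows per selection), taken from the captions.
    let patterns = [
        ("point lookup", 1, 1),
        ("selective unclustered", 1_000, 10),
        ("moderately selective clustered", 10, 10_000),
        ("moderately selective unclustered", 10_000, 10),
        ("unselective unclustered", 99_000, 10),
        ("unselective clustered", 99, 10_000),
    ];
    for (name, count, len) in patterns {
        let runs = evenly_spaced(TOTAL_ROWS, count, len);
        let selected = selected_rows(&runs);
        let pct = selected as f64 * 100.0 / TOTAL_ROWS as f64;
        println!("{name}: {selected} rows selected ({pct}%), {} runs", runs.len());
    }
}
```

A real benchmark would presumably build the parquet reader's `RowSelection` from runs like these; even spacing is just one way to produce an "unclustered" layout, and a random layout would be an equally reasonable choice.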