alamb commented on issue #7363: URL: https://github.com/apache/arrow-rs/issues/7363#issuecomment-2797040089
I spent some more time today thinking about which filter patterns are
important to test and came up with the following six patterns (to replace the 4
I suggested above).
What do you think, @zhuqi-lucas? @XiangpengHao, did I miss any important
filter patterns?
```text
┌───────────────┐  ┌───────────────┐
│               │  │               │
│               │  │      ...      │
│               │  │               │
│               │  │               │
│      ...      │  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
│               │  │               │
│               │  │      ...      │
│               │  │               │
│               │  │               │
└───────────────┘  └───────────────┘
"Point Lookup": selects a single row
(1 RowSelection of 1 row)
┌───────────────┐  ┌───────────────┐
│      ...      │  │               │
│               │  │               │
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│  │               │
│               │  │      ...      │
│               │  │               │
│               │  │               │
│      ...      │  │               │
│               │  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
│               │  │               │
└───────────────┘  └───────────────┘
selective (1%) unclustered filter
(1000 RowSelection of 10 rows each)
┌───────────────┐  ┌───────────────┐    ┌───────────────┐  ┌───────────────┐
│      ...      │  │               │    │               │  │               │
│               │  │               │    │               │  │               │
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│    │               │  │      ...      │
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│  │               │    │               │  │               │
│               │  │               │    │      ...      │  │               │
│               │  │      ...      │    │               │  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
│      ...      │  │               │    │               │  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
│               │  │               │    │               │  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
│               │  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│    │               │  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
└───────────────┘  └───────────────┘    └───────────────┘  └───────────────┘
moderately selective (10%)              moderately selective (10%)
unclustered filter                      clustered filter
(10000 RowSelection of 10 rows each)    (10 RowSelections of 10,000 rows each)
┌───────────────┐  ┌───────────────┐    ┌───────────────┐  ┌───────────────┐
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│    │               │  │               │
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│    │               │  │               │
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│    │               │  │      ...      │
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│    │               │  │               │
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│    │      ...      │  │               │
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│  │               │    │               │  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│    │               │  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│    │               │  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│    │               │  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│  │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│    └───────────────┘  └───────────────┘
└───────────────┘  └───────────────┘
unselective (99%) unclustered filter    unselective (90%) clustered filter
(99,000 RowSelections of 10 rows each)  (99 RowSelection of 10,000 rows each)
```
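As a quick sanity check on the captions, the counts can be turned into selectivities. This is only a sketch and is not taken from the benchmark code: `Pattern`, `TOTAL_ROWS`, `patterns`, and the method names are hypothetical, and the 1,000,000-row total is an assumption inferred from the captions (1000 selections of 10 rows is 1% only if there are 1,000,000 rows).

```rust
/// One filter pattern, described the way the captions describe it:
/// `selections` runs of `rows_each` selected rows.
/// (Hypothetical type for illustration; not part of arrow-rs.)
struct Pattern {
    name: &'static str,
    selections: u64,
    rows_each: u64,
}

impl Pattern {
    /// Total number of rows the pattern selects.
    fn selected_rows(&self) -> u64 {
        self.selections * self.rows_each
    }

    /// Selected rows as a percentage of `total_rows`.
    fn selectivity_pct(&self, total_rows: u64) -> f64 {
        100.0 * self.selected_rows() as f64 / total_rows as f64
    }
}

/// ASSUMPTION: the total row count is not stated in the comment;
/// 1,000,000 is inferred from "1% = 1000 selections of 10 rows".
const TOTAL_ROWS: u64 = 1_000_000;

/// The six patterns, with counts copied from the diagram captions.
fn patterns() -> Vec<Pattern> {
    vec![
        Pattern { name: "point lookup", selections: 1, rows_each: 1 },
        Pattern { name: "selective (1%) unclustered", selections: 1_000, rows_each: 10 },
        Pattern { name: "moderately selective (10%) unclustered", selections: 10_000, rows_each: 10 },
        Pattern { name: "moderately selective (10%) clustered", selections: 10, rows_each: 10_000 },
        Pattern { name: "unselective (99%) unclustered", selections: 99_000, rows_each: 10 },
        Pattern { name: "unselective (90%) clustered", selections: 99, rows_each: 10_000 },
    ]
}

fn main() {
    // Print each caption next to the selectivity its counts imply.
    for p in patterns() {
        println!(
            "{:<40} {:>7} rows ({}%)",
            p.name,
            p.selected_rows(),
            p.selectivity_pct(TOTAL_ROWS)
        );
    }
}
```

Under that assumption the 1%, 10% (both variants), and 99% captions match their counts. The last caption, 99 selections of 10,000 rows, works out to 99% rather than the 90% in its label, so one of those two numbers may be a typo.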
