I’m playing with Drill’s kudu-storage code and converting it to use the ScanToken API. It’s a fairly simple matter to do this by serializing ScanTokens (thanks for the great API!), but I’m not sure how much to “push down” to Kudu. For tablet/Drillbit affinity and for pruning, it’s clear I need to push down hash-key and range-bound predicates. It gets a bit tricky, though, because the ScanToken API does not support OR logic unless it fits into an IN_LIST expression, so for more complex logic I suppose it’s best not to push down the filtering at all. However, I’m wondering if there is a way to push disjoint bounds to Kudu. For example, if I have a table with range keys of Year, Month, Day, is there a way I can include only (2017,6,1) OR (2017,5,5) in a group scan?
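To make the problem concrete, here is a small stand-alone Java sketch. It does not use the Kudu client at all; the class and method names are made up for illustration, and a key is just a list of three ints. It assumes only that per-column predicates on a scan are ANDed together, and shows why one IN_LIST per key column is not the same as the two tuples I want: the lists combine as a cross product.

```java
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: no Kudu classes are involved.
public class DisjointBoundsSketch {

    // Mimics one IN_LIST predicate per key column, all ANDed together.
    static boolean matchesInLists(List<Integer> key, Set<Integer> years,
                                  Set<Integer> months, Set<Integer> days) {
        return years.contains(key.get(0))
            && months.contains(key.get(1))
            && days.contains(key.get(2));
    }

    // The filter I actually want: membership in an explicit set of key tuples.
    static boolean matchesTuples(List<Integer> key, Set<List<Integer>> wanted) {
        return wanted.contains(key);
    }

    // Counts rows selected by Year IN (2017) AND Month IN (5,6) AND Day IN (1,5).
    static long countInLists(List<List<Integer>> rows) {
        return rows.stream()
            .filter(k -> matchesInLists(k, Set.of(2017), Set.of(5, 6), Set.of(1, 5)))
            .count();
    }

    // Counts rows selected by (2017,6,1) OR (2017,5,5).
    static long countTuples(List<List<Integer>> rows) {
        Set<List<Integer>> wanted = Set.of(List.of(2017, 6, 1), List.of(2017, 5, 5));
        return rows.stream().filter(k -> matchesTuples(k, wanted)).count();
    }

    public static void main(String[] args) {
        List<List<Integer>> rows = List.of(
            List.of(2017, 6, 1), List.of(2017, 5, 5),   // the two keys I want
            List.of(2017, 6, 5), List.of(2017, 5, 1),   // extra keys the IN_LISTs also admit
            List.of(2016, 6, 1));                       // rejected by both filters
        System.out.println("IN_LIST per column: " + countInLists(rows));
        System.out.println("explicit tuples:    " + countTuples(rows));
    }
}
```

With the five sample rows, the per-column IN_LIST filter keeps four of them while the tuple filter keeps two, so pushing the IN_LISTs down would still leave residual filtering to do on the Drill side.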
Thanks, Cliff
