On Wed, 5 May 2021 13:23:36 -0400
Benjamin Kietzman <bengil...@gmail.com> wrote:
> Currently, Expressions (used to specify dataset filters and projections)
> are simplified by direct rewriting: a filter such as `alpha == 2 and beta > 3`
> on a partition where we are guaranteed that `beta == 5` will be rewritten
> to `alpha == 2` before evaluation against scanned batches. This can
> potentially occur for each scanned batch: for example, Parquet's row group
> statistics are used in the same way to simplify filters.
> 
> Rewriting is not extremely expensive (a microbenchmark estimate on
> my machine shows that a simple case such as the above takes 4ms).

4ms for a single rewrite actually sounds quite large to me.
(Or did you mean 4µs?)
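
To make the question concrete, here is a minimal sketch of this kind of
guarantee-based rewriting in pure Python. It is not Arrow's implementation:
the `Cmp` and `And` classes and the `simplify` function are hypothetical names
invented for illustration, and a guarantee is modeled simply as a dict of
known field values. The timing it prints says nothing about the C++ code; it
only shows how a per-rewrite cost could be measured.

```python
import operator
import timeit
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Cmp:
    """Hypothetical comparison node: <field> <op> <value>."""
    field: str
    op: str  # one of "==", ">", "<"
    value: int


@dataclass(frozen=True)
class And:
    """Hypothetical conjunction node."""
    left: "Expr"
    right: "Expr"


# An expression is a comparison, a conjunction, or a folded boolean constant.
Expr = Union[Cmp, And, bool]

_OPS = {"==": operator.eq, ">": operator.gt, "<": operator.lt}


def simplify(expr, guarantee):
    """Rewrite `expr` given `guarantee`, a dict of field name -> known value."""
    if isinstance(expr, Cmp):
        if expr.field in guarantee:
            # The field's value is known, so the comparison folds to a constant.
            return _OPS[expr.op](guarantee[expr.field], expr.value)
        return expr
    if isinstance(expr, And):
        left = simplify(expr.left, guarantee)
        right = simplify(expr.right, guarantee)
        if left is False or right is False:
            return False
        if left is True:
            return right
        if right is True:
            return left
        return And(left, right)
    return expr


# The example from the quoted message: `alpha == 2 and beta > 3`
# on a partition where `beta == 5` is guaranteed.
filter_expr = And(Cmp("alpha", "==", 2), Cmp("beta", ">", 3))
simplified = simplify(filter_expr, {"beta": 5})

# Average the cost over many calls to get a per-rewrite figure.
n = 10_000
per_call = timeit.timeit(lambda: simplify(filter_expr, {"beta": 5}), number=n) / n
print(f"{per_call * 1e6:.2f} µs per rewrite")
```

Here `simplified` compares equal to `Cmp("alpha", "==", 2)`, matching the
rewrite described above; a guarantee such as `beta == 1` would instead fold
the whole filter to `False`.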


