GitHub user wangyum commented on a diff in the pull request:
https://github.com/apache/spark/pull/21677#discussion_r200260115
--- Diff: sql/core/benchmarks/FilterPushdownBenchmark-results.txt ---
@@ -0,0 +1,556 @@
+############################[ Pushdown for many distinct value case ]############################
--- End diff --
How about this?
```
...
Select all int rows (value != -1):       Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
Parquet Vectorized                            1140 / 1165          0.9        1087.4       1.0X
Parquet Vectorized (Pushdown)                 1140 / 1172          0.9        1086.8       1.0X
Native ORC Vectorized                         1158 / 1206          0.9        1104.7       1.0X
Native ORC Vectorized (Pushdown)              1151 / 1220          0.9        1098.1       1.0X

================================================================================================
Pushdown for few distinct value case (use dictionary encoding)
================================================================================================
Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
Select 0 distinct string row (value IS NULL): Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
Parquet Vectorized                              512 / 565          2.0         488.6       1.0X
Parquet Vectorized (Pushdown)                     27 / 33         39.3          25.5      19.2X
Native ORC Vectorized                           509 / 546          2.1         485.0       1.0X
Native ORC Vectorized (Pushdown)                  79 / 91         13.2          75.5       6.5X
...
```
---