[ https://issues.apache.org/jira/browse/SPARK-24706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yuming Wang updated SPARK-24706:
--------------------------------
    Description:     (was: Benchmark result:
{noformat}
###############################[ Pushdown benchmark for tinyint ]################################
Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz

Select 1 tinyint row (value = CAST(63 AS tinyint)):  Best/Avg Time(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------
Parquet Vectorized                      4307 / 4575        3.7         273.8       1.0X
Parquet Vectorized (Pushdown)             227 / 241       69.4          14.4      19.0X
Native ORC Vectorized                   3646 / 3727        4.3         231.8       1.2X
Native ORC Vectorized (Pushdown)          736 / 744       21.4          46.8       5.9X

Select 10% tinyint rows (value < 12):  Best/Avg Time(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------
Parquet Vectorized                      5209 / 5843        3.0         331.2       1.0X
Parquet Vectorized (Pushdown)           1296 / 1759       12.1          82.4       4.0X
Native ORC Vectorized                   4455 / 4594        3.5         283.2       1.2X
Native ORC Vectorized (Pushdown)        1736 / 1813        9.1         110.4       3.0X

Select 50% tinyint rows (value < 63):  Best/Avg Time(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------
Parquet Vectorized                      8362 / 8394        1.9         531.7       1.0X
Parquet Vectorized (Pushdown)           6303 / 6530        2.5         400.7       1.3X
Native ORC Vectorized                   7962 / 8113        2.0         506.2       1.1X
Native ORC Vectorized (Pushdown)        6680 / 7556        2.4         424.7       1.3X

Select 90% tinyint rows (value < 114):  Best/Avg Time(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------
Parquet Vectorized                    11572 / 11715        1.4         735.7       1.0X
Parquet Vectorized (Pushdown)         11198 / 11326        1.4         712.0       1.0X
Native ORC Vectorized                 11041 / 11209        1.4         702.0       1.0X
Native ORC Vectorized (Pushdown)      11104 / 11472        1.4         706.0       1.0X

###############################[ Pushdown benchmark for smallint ]###############################
Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz

Select 1 smallint row (value = CAST(63 AS smallint)):  Best/Avg Time(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------
Parquet Vectorized                      2939 / 2966        5.4         186.9       1.0X
Parquet Vectorized (Pushdown)               85 / 91      184.9           5.4      34.6X
Native ORC Vectorized                   2927 / 3026        5.4         186.1       1.0X
Native ORC Vectorized (Pushdown)          418 / 432       37.7          26.6       7.0X

Select 10% smallint rows (value < CAST(3276 AS smallint)):  Best/Avg Time(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------
Parquet Vectorized                      3735 / 3897        4.2         237.5       1.0X
Parquet Vectorized (Pushdown)           1204 / 1222       13.1          76.6       3.1X
Native ORC Vectorized                   3796 / 3831        4.1         241.4       1.0X
Native ORC Vectorized (Pushdown)        1570 / 1581       10.0          99.8       2.4X

Select 50% smallint rows (value < CAST(16383 AS smallint)):  Best/Avg Time(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------
Parquet Vectorized                      7194 / 8522        2.2         457.4       1.0X
Parquet Vectorized (Pushdown)           5758 / 5806        2.7         366.1       1.2X
Native ORC Vectorized                   7311 / 7585        2.2         464.8       1.0X
Native ORC Vectorized (Pushdown)        6123 / 6342        2.6         389.3       1.2X

Select 90% smallint rows (value < CAST(29490 AS smallint)):  Best/Avg Time(ms)  Rate(M/s)  Per Row(ns)  Relative
------------------------------------------------------------------------------------------------
Parquet Vectorized                    10558 / 10638        1.5         671.3       1.0X
Parquet Vectorized (Pushdown)         10380 / 10517        1.5         659.9       1.0X
Native ORC Vectorized                 11045 / 11202        1.4         702.2       1.0X
Native ORC Vectorized (Pushdown)      10912 / 11176        1.4         693.7       1.0X
{noformat}
)

> Support ByteType and ShortType pushdown to parquet
> --------------------------------------------------
>
>                 Key: SPARK-24706
>                 URL: https://issues.apache.org/jira/browse/SPARK-24706
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Yuming Wang
>            Priority: Major
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org