This is an automated email from the ASF dual-hosted git repository.

yumwang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 59ba09a3c51 [SPARK-42926][BUILD][SQL] Upgrade Parquet to 1.13.0
59ba09a3c51 is described below

commit 59ba09a3c511b3f11a07138afae6dc9f15edf99d
Author: Yuming Wang <yumw...@ebay.com>
AuthorDate: Sat Apr 15 09:15:34 2023 +0800

    [SPARK-42926][BUILD][SQL] Upgrade Parquet to 1.13.0
    
    ### What changes were proposed in this pull request?
    
    This PR upgrades Apache Parquet to 1.13.0. See the Apache Parquet [1.13.0 release notes](https://github.com/apache/parquet-mr/blob/apache-parquet-1.13.0/CHANGES.md?plain=1#L22-L78).
    
    ### Why are the changes needed?
    
    1. This release includes [PARQUET-2160](https://issues.apache.org/jira/browse/PARQUET-2160), so we no longer need [SPARK-41952](https://issues.apache.org/jira/browse/SPARK-41952).
    2. This release includes [Java Vector API support](https://github.com/apache/parquet-mr/blob/apache-parquet-1.13.0/README.md?plain=1#L88-L100).
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    Existing unit tests and benchmark tests.
    
    TPC-DS benchmark result:
    Query | Parquet 1.13.0 (first time) | Parquet 1.12.3 (first time) | Parquet 1.13.0 (second time) | Parquet 1.12.3 (second time) | Parquet 1.13.0 (third time) | Parquet 1.12.3 (third time)
    -- | -- | -- | -- | -- | -- | --
    q1.sql | 37.819 | 37.786 | 36.322 | 37.59 | 37.772 | 36.776
    q2.sql | 42.132 | 41.513 | 43.189 | 42.274 | 42.859 | 42.605
    q3.sql | 5.933 | 6.1 | 6.082 | 6.071 | 6.128 | 6.094
    q4.sql | 335.051 | 319.173 | 322.396 | 320.977 | 324.464 | 326.822
    q5.sql | 78.41 | 76.631 | 76.841 | 76.37 | 78.257 | 76.502
    q6.sql | 9.006 | 9.11 | 8.737 | 8.577 | 8.729 | 9.05
    q7.sql | 12.881 | 12.731 | 12.685 | 12.662 | 12.606 | 12.675
    q8.sql | 10.122 | 10.092 | 10.035 | 10.853 | 10.277 | 10.841
    q9.sql | 72.562 | 71.942 | 73.649 | 73.04 | 72.899 | 72.01
    q10.sql | 14.127 | 13.075 | 14.276 | 13.913 | 13.281 | 13.229
    q11.sql | 111.334 | 111.612 | 110.952 | 110.776 | 111.686 | 112.27
    q12.sql | 3.138 | 3.854 | 3.187 | 3.613 | 3.437 | 3.306
    q13.sql | 13.131 | 12.676 | 12.516 | 12.417 | 12.739 | 12.987
    q14a.sql | 217.664 | 213.632 | 214.655 | 213.333 | 217.601 | 213.341
    q14b.sql | 191.553 | 182.775 | 184.35 | 187.004 | 188.313 | 189.876
    q15.sql | 10.308 | 10.46 | 10.304 | 9.901 | 10.175 | 10.307
    q16.sql | 81.97 | 82.059 | 82.41 | 81.263 | 83.179 | 82.042
    q17.sql | 28.876 | 28.905 | 30.41 | 29.573 | 29.555 | 28.837
    q18.sql | 14.183 | 13.929 | 14.11 | 14.466 | 13.969 | 14.022
    q19.sql | 6.611 | 7.593 | 6.652 | 6.659 | 6.446 | 6.533
    q20.sql | 3.263 | 3.701 | 3.56 | 3.503 | 3.53 | 3.627
    q21.sql | 2.252 | 2.188 | 2.249 | 2.128 | 2.161 | 2.252
    q22.sql | 14.809 | 14.715 | 14.324 | 14.266 | 14.567 | 14.123
    q23a.sql | 554.385 | 544.75 | 546.213 | 542.194 | 553.784 | 547.388
    q23b.sql | 781.236 | 768.367 | 770.584 | 776.065 | 776.502 | 776.006
    q24a.sql | 196.806 | 193.989 | 197.608 | 194.416 | 194.71 | 192.817
    q24b.sql | 176.56 | 183.084 | 177.486 | 177.936 | 177.776 | 177.389
    q25.sql | 22.323 | 22.089 | 22.665 | 22.049 | 22.248 | 22.317
    q26.sql | 8.574 | 8.356 | 8.174 | 8.753 | 8.186 | 8.302
    q27.sql | 9.056 | 8.252 | 8.37 | 8.319 | 8.516 | 8.38
    q28.sql | 102.185 | 102.382 | 102.344 | 103.058 | 102.024 | 102.786
    q29.sql | 75.655 | 75.604 | 75.217 | 75.532 | 75.835 | 76.024
    q30.sql | 12.476 | 12.966 | 13.039 | 14.108 | 12.19 | 13.143
    q31.sql | 26.343 | 27.632 | 26.337 | 26.791 | 26.74 | 26.098
    q32.sql | 3.251 | 3.41 | 3.378 | 3.333 | 3.371 | 3.516
    q33.sql | 7.143 | 6.125 | 6.85 | 6.718 | 7.067 | 6.615
    q34.sql | 8.53 | 8.656 | 8.536 | 8.866 | 8.358 | 8.589
    q35.sql | 35.212 | 35.571 | 35.659 | 37.631 | 36.292 | 35.603
    q36.sql | 9.264 | 9.166 | 9.748 | 9.488 | 9.45 | 9.469
    q37.sql | 36.368 | 35.881 | 37.023 | 36.578 | 35.823 | 36.7
    q38.sql | 74.58 | 73.472 | 72.926 | 73.823 | 71.097 | 73.329
    q39a.sql | 8.596 | 7.637 | 8.036 | 7.984 | 7.849 | 7.88
    q39b.sql | 7.233 | 6.641 | 6.278 | 7.06 | 6.595 | 6.691
    q40.sql | 17.34 | 16.558 | 16.448 | 16.864 | 16.432 | 16.413
    q41.sql | 1.223 | 1.105 | 1.103 | 1.182 | 1.232 | 1.304
    q42.sql | 2.464 | 2.441 | 2.554 | 2.544 | 2.314 | 2.393
    q43.sql | 7.477 | 7.396 | 7.394 | 7.764 | 7.381 | 7.534
    q44.sql | 30.228 | 30.516 | 30.859 | 31.057 | 30.372 | 29.008
    q45.sql | 9.93 | 10.089 | 9.874 | 10.075 | 9.802 | 9.838
    q46.sql | 9.544 | 9.949 | 9.503 | 9.755 | 9.395 | 9.25
    q47.sql | 27.322 | 26.952 | 26.974 | 26.83 | 27.087 | 26.991
    q48.sql | 14.266 | 14.39 | 14.517 | 14.684 | 14.471 | 14.61
    q49.sql | 21.279 | 21.733 | 20.286 | 20.945 | 22.388 | 21.52
    q50.sql | 191.416 | 194.256 | 196.701 | 194.113 | 193.354 | 191.004
    q51.sql | 37.552 | 37.767 | 38.317 | 37.731 | 37.369 | 38.187
    q52.sql | 2.206 | 2.406 | 2.235 | 2.362 | 2.337 | 2.278
    q53.sql | 5.282 | 5.131 | 5.465 | 5.137 | 5.142 | 5.069
    q54.sql | 13.039 | 12.655 | 13.047 | 12.382 | 12.992 | 12.988
    q55.sql | 2.534 | 2.39 | 2.375 | 2.867 | 2.623 | 2.546
    q56.sql | 7.365 | 7.087 | 6.902 | 7.406 | 7.586 | 7.081
    q57.sql | 18.064 | 17.945 | 18.699 | 17.664 | 18.362 | 18.222
    q58.sql | 6.198 | 6.702 | 6.109 | 6.211 | 5.9 | 6.101
    q59.sql | 28.266 | 28.195 | 27.876 | 28.748 | 29.027 | 28.543
    q60.sql | 6.847 | 7.143 | 7.322 | 7.1 | 7.207 | 7.215
    q61.sql | 7.258 | 7.62 | 7.317 | 7.781 | 7.616 | 7.669
    q62.sql | 10.334 | 11.523 | 10.389 | 10.378 | 10.072 | 10.583
    q63.sql | 4.631 | 4.944 | 4.947 | 5.124 | 4.61 | 4.865
    q64.sql | 249.694 | 252.117 | 254.359 | 254.813 | 253.236 | 250.401
    q65.sql | 78.742 | 79.184 | 78.559 | 78.305 | 78.985 | 78.515
    q66.sql | 14.98 | 14.854 | 14.794 | 14.767 | 14.781 | 14.696
    q67.sql | 1019.744 | 1048.439 | 987.894 | 972.062 | 927.566 | 1002.206
    q68.sql | 8.903 | 8.915 | 8.277 | 8.709 | 9.349 | 9.178
    q69.sql | 13.097 | 13.01 | 14.352 | 12.036 | 12.302 | 12.843
    q70.sql | 21.175 | 21.085 | 21.102 | 20.471 | 20.129 | 19.678
    q71.sql | 15.13 | 15.526 | 14.929 | 15.231 | 15.406 | 15.487
    q72.sql | 76.463 | 75.851 | 72.002 | 72.356 | 72.676 | 74.798
    q73.sql | 5.894 | 6.09 | 5.877 | 6.051 | 6.365 | 6.634
    q74.sql | 99.106 | 99.356 | 100.291 | 99.51 | 96.766 | 97.292
    q75.sql | 126.625 | 128.094 | 127.364 | 128.575 | 127.418 | 125.806
    q76.sql | 35.172 | 33.601 | 34.752 | 34.764 | 34.228 | 35.748
    q77.sql | 8.394 | 8.01 | 7.951 | 8.061 | 7.839 | 8.348
    q78.sql | 289.061 | 287.508 | 283.615 | 288.768 | 288.448 | 288.661
    q79.sql | 10.048 | 9.251 | 9.396 | 9.81 | 8.607 | 8.341
    q80.sql | 59.68 | 59.458 | 60.234 | 60.415 | 61.325 | 60.744
    q81.sql | 17.822 | 18.815 | 18.488 | 18.95 | 17.911 | 18.113
    q82.sql | 64.781 | 63.957 | 63.621 | 64.38 | 63.637 | 64.488
    q83.sql | 4.686 | 4.922 | 4.635 | 4.827 | 4.678 | 5.071
    q84.sql | 10.987 | 10.629 | 10.841 | 11.151 | 10.646 | 10.6
    q85.sql | 12.689 | 13.304 | 13.362 | 13.19 | 13.779 | 12.657
    q86.sql | 6.48 | 6.491 | 6.722 | 6.667 | 6.833 | 6.52
    q87.sql | 77.589 | 77.377 | 77.177 | 77.011 | 78.339 | 78.399
    q88.sql | 83.876 | 83.676 | 84.044 | 83.761 | 84.201 | 84.089
    q89.sql | 6.741 | 6.564 | 6.755 | 6.708 | 6.704 | 6.794
    q90.sql | 7.79 | 7.812 | 7.882 | 7.88 | 7.875 | 7.854
    q91.sql | 4.072 | 3.728 | 3.883 | 3.976 | 4.151 | 4.035
    q92.sql | 3.05 | 3.155 | 3.336 | 3.067 | 2.942 | 3.099
    q93.sql | 356.412 | 360.731 | 358.14 | 356 | 356.108 | 358.011
    q94.sql | 43.202 | 43.561 | 44.63 | 44.486 | 43.993 | 42.693
    q95.sql | 197.185 | 199.657 | 193.975 | 195.843 | 201.801 | 196.113
    q96.sql | 12.765 | 12.481 | 12.682 | 12.799 | 12.528 | 12.505
    q97.sql | 82.895 | 82.067 | 81.754 | 82.799 | 81.788 | 81.572
    q98.sql | 7.338 | 7.066 | 7.133 | 7.005 | 7.254 | 7.047
    q99.sql | 18.431 | 17.874 | 17.826 | 17.861 | 17.705 | 17.878
    total | 7105.675 | 7091.391 | 7030.209 | 7021.7 | 6992.413 | 7047.295
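
The `total` row above can be compared across runs with a few lines of arithmetic. The following is a minimal sketch, not part of the patch: it copies the six totals from the table and computes the relative change of Parquet 1.13.0 against 1.12.3 for each run and for the mean of the three runs. The variable and function names are illustrative, and the unit of the timings is not stated in the commit message, so only relative differences are computed.

```python
# Totals copied from the "total" row of the TPC-DS table above.
# The timing unit is not stated in the commit message.
totals = {
    "1.13.0": [7105.675, 7030.209, 6992.413],
    "1.12.3": [7091.391, 7021.7, 7047.295],
}


def relative_change_pct(new, old):
    """Percentage change of `new` relative to `old` (positive means slower)."""
    return (new - old) / old * 100.0


# Per-run comparison: first, second, and third run.
per_run = [
    relative_change_pct(new, old)
    for new, old in zip(totals["1.13.0"], totals["1.12.3"])
]

# Comparison of the means over the three runs.
mean_new = sum(totals["1.13.0"]) / len(totals["1.13.0"])
mean_old = sum(totals["1.12.3"]) / len(totals["1.12.3"])
overall = relative_change_pct(mean_new, mean_old)

for i, pct in enumerate(per_run, start=1):
    print(f"run {i}: {pct:+.3f}%")
print(f"mean of runs: {overall:+.3f}%")
```

With these totals, every per-run difference stays under 1% in either direction, and the sign changes between runs.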
    
    Closes #40555 from wangyum/SPARK-42926.
    
    Authored-by: Yuming Wang <yumw...@ebay.com>
    Signed-off-by: Yuming Wang <yumw...@ebay.com>
---
 dev/deps/spark-deps-hadoop-2-hive-2.3 | 12 ++++++------
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 12 ++++++------
 docs/sql-data-sources-parquet.md      |  4 ++--
 pom.xml                               |  4 ++--
 4 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-2-hive-2.3 b/dev/deps/spark-deps-hadoop-2-hive-2.3
index fc320529fda..5fa2ddfd367 100644
--- a/dev/deps/spark-deps-hadoop-2-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-2-hive-2.3
@@ -229,12 +229,12 @@ orc-shims/1.8.3//orc-shims-1.8.3.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
-parquet-column/1.12.3//parquet-column-1.12.3.jar
-parquet-common/1.12.3//parquet-common-1.12.3.jar
-parquet-encoding/1.12.3//parquet-encoding-1.12.3.jar
-parquet-format-structures/1.12.3//parquet-format-structures-1.12.3.jar
-parquet-hadoop/1.12.3//parquet-hadoop-1.12.3.jar
-parquet-jackson/1.12.3//parquet-jackson-1.12.3.jar
+parquet-column/1.13.0//parquet-column-1.13.0.jar
+parquet-common/1.13.0//parquet-common-1.13.0.jar
+parquet-encoding/1.13.0//parquet-encoding-1.13.0.jar
+parquet-format-structures/1.13.0//parquet-format-structures-1.13.0.jar
+parquet-hadoop/1.13.0//parquet-hadoop-1.13.0.jar
+parquet-jackson/1.13.0//parquet-jackson-1.13.0.jar
 pickle/1.3//pickle-1.3.jar
 protobuf-java/2.5.0//protobuf-java-2.5.0.jar
 py4j/0.10.9.7//py4j-0.10.9.7.jar
diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index 25a54fdd2a9..f30984f60ea 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -215,12 +215,12 @@ orc-shims/1.8.3//orc-shims-1.8.3.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
-parquet-column/1.12.3//parquet-column-1.12.3.jar
-parquet-common/1.12.3//parquet-common-1.12.3.jar
-parquet-encoding/1.12.3//parquet-encoding-1.12.3.jar
-parquet-format-structures/1.12.3//parquet-format-structures-1.12.3.jar
-parquet-hadoop/1.12.3//parquet-hadoop-1.12.3.jar
-parquet-jackson/1.12.3//parquet-jackson-1.12.3.jar
+parquet-column/1.13.0//parquet-column-1.13.0.jar
+parquet-common/1.13.0//parquet-common-1.13.0.jar
+parquet-encoding/1.13.0//parquet-encoding-1.13.0.jar
+parquet-format-structures/1.13.0//parquet-format-structures-1.13.0.jar
+parquet-hadoop/1.13.0//parquet-hadoop-1.13.0.jar
+parquet-jackson/1.13.0//parquet-jackson-1.13.0.jar
 pickle/1.3//pickle-1.3.jar
 protobuf-java/2.5.0//protobuf-java-2.5.0.jar
 py4j/0.10.9.7//py4j-0.10.9.7.jar
diff --git a/docs/sql-data-sources-parquet.md b/docs/sql-data-sources-parquet.md
index 4a4a3938c86..58d90fb491b 100644
--- a/docs/sql-data-sources-parquet.md
+++ b/docs/sql-data-sources-parquet.md
@@ -257,7 +257,7 @@ REFRESH TABLE my_table;
 
 Since Spark 3.2, columnar encryption is supported for Parquet tables with Apache Parquet 1.12+.
 
-Parquet uses the envelope encryption practice, where file parts are encrypted 
with "data encryption keys" (DEKs), and the DEKs are encrypted with "master 
encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each 
encrypted file/column. The MEKs are generated, stored and managed in a Key 
Management Service (KMS) of user’s choice. The Parquet Maven 
[repository](https://repo1.maven.org/maven2/org/apache/parquet/parquet-hadoop/1.12.3/)
 has a jar with a mock KMS implementati [...]
+Parquet uses the envelope encryption practice, where file parts are encrypted 
with "data encryption keys" (DEKs), and the DEKs are encrypted with "master 
encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each 
encrypted file/column. The MEKs are generated, stored and managed in a Key 
Management Service (KMS) of user’s choice. The Parquet Maven 
[repository](https://repo1.maven.org/maven2/org/apache/parquet/parquet-hadoop/1.13.0/)
 has a jar with a mock KMS implementati [...]
 
 <div class="codetabs">
 
@@ -350,7 +350,7 @@ Dataset<Row> df2 = spark.read().parquet("/path/to/table.parquet.encrypted");
 
 #### KMS Client
 
-The InMemoryKMS class is provided only for illustration and simple 
demonstration of Parquet encryption functionality. **It should not be used in a 
real deployment**. The master encryption keys must be kept and managed in a 
production-grade KMS system, deployed in user's organization. Rollout of Spark 
with Parquet encryption requires implementation of a client class for the KMS 
server. Parquet provides a plug-in 
[interface](https://github.com/apache/parquet-mr/blob/1.12.3/parquet-hadoop/s 
[...]
+The InMemoryKMS class is provided only for illustration and simple 
demonstration of Parquet encryption functionality. **It should not be used in a 
real deployment**. The master encryption keys must be kept and managed in a 
production-grade KMS system, deployed in user's organization. Rollout of Spark 
with Parquet encryption requires implementation of a client class for the KMS 
server. Parquet provides a plug-in 
[interface](https://github.com/apache/parquet-mr/blob/1.13.0/parquet-hadoop/s 
[...]
 
 <div data-lang="java"  markdown="1">
 {% highlight java %}
diff --git a/pom.xml b/pom.xml
index 198a82b8c27..9811742b866 100644
--- a/pom.xml
+++ b/pom.xml
@@ -140,7 +140,7 @@
     <kafka.version>3.4.0</kafka.version>
     <!-- After 10.15.1.3, the minimum required version is JDK9 -->
     <derby.version>10.14.2.0</derby.version>
-    <parquet.version>1.12.3</parquet.version>
+    <parquet.version>1.13.0</parquet.version>
     <orc.version>1.8.3</orc.version>
     <orc.classifier>shaded-protobuf</orc.classifier>
     <jetty.version>9.4.51.v20230217</jetty.version>
@@ -2361,7 +2361,7 @@
             <groupId>${hive.group}</groupId>
             <artifactId>hive-service-rpc</artifactId>
           </exclusion>
-          <!-- parquet-hadoop-bundle:1.8.1 conflict with 1.12.3 -->
+          <!-- parquet-hadoop-bundle:1.8.1 conflict with 1.13.0 -->
           <exclusion>
             <groupId>org.apache.parquet</groupId>
             <artifactId>parquet-hadoop-bundle</artifactId>


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
