This is an automated email from the ASF dual-hosted git repository.
yao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 0b214f166a92 [MINOR][DOCS][TESTS] Update repo name and link from `parquet-mr` to `parquet-java`
0b214f166a92 is described below
commit 0b214f166a92c4e6b4fdc102f7718903a1a152d5
Author: Wei Guo <[email protected]>
AuthorDate: Fri Jun 14 10:33:49 2024 +0800
[MINOR][DOCS][TESTS] Update repo name and link from `parquet-mr` to `parquet-java`
### What changes were proposed in this pull request?
This PR replaces the Parquet repo name `parquet-mr` with `parquet-java`, and the repo link `https://github.com/apache/parquet-mr` with `https://github.com/apache/parquet-java`.
### Why are the changes needed?
The upstream repository was renamed in [INFRA-25802](https://issues.apache.org/jira/browse/INFRA-25802) and [PARQUET-2475](https://issues.apache.org/jira/browse/PARQUET-2475), so it is better to update to the latest name and link.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Passed GitHub Actions (GA).
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #46963 from wayneguow/parquet.
Authored-by: Wei Guo <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
---
 docs/sql-data-sources-load-save-functions.md               | 2 +-
 docs/sql-data-sources-parquet.md                           | 6 +++---
 .../datasources/parquet/ParquetInteroperabilitySuite.scala | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/docs/sql-data-sources-load-save-functions.md b/docs/sql-data-sources-load-save-functions.md
index b42f6e84076d..70105c22e583 100644
--- a/docs/sql-data-sources-load-save-functions.md
+++ b/docs/sql-data-sources-load-save-functions.md
@@ -109,7 +109,7 @@ For example, you can control bloom filters and dictionary encodings for ORC data
 The following ORC example will create bloom filter and use dictionary encoding only for `favorite_color`.
 For Parquet, there exists `parquet.bloom.filter.enabled` and `parquet.enable.dictionary`, too.
 To find more detailed information about the extra ORC/Parquet options,
-visit the official Apache [ORC](https://orc.apache.org/docs/spark-config.html) / [Parquet](https://github.com/apache/parquet-mr/tree/master/parquet-hadoop) websites.
+visit the official Apache [ORC](https://orc.apache.org/docs/spark-config.html) / [Parquet](https://github.com/apache/parquet-java/tree/master/parquet-hadoop) websites.
ORC data source:
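As context for the hunk above, here is a minimal sketch (not part of the patch) of passing the two Parquet options mentioned there through Spark's DataFrameWriter. The paths are illustrative, and the `#favorite_color` suffix follows parquet-hadoop's per-column option convention:

```scala
// Minimal sketch: enable a Parquet bloom filter for one column plus
// dictionary encoding on write. Paths and column names are illustrative.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("ParquetWriteOptions").getOrCreate()
val usersDF = spark.read.load("examples/src/main/resources/users.parquet")

usersDF.write.format("parquet")
  .option("parquet.bloom.filter.enabled#favorite_color", "true")
  .option("parquet.enable.dictionary", "true")
  .save("users_with_options.parquet")
```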
diff --git a/docs/sql-data-sources-parquet.md b/docs/sql-data-sources-parquet.md
index f5c5ccd3b89a..5a0ca595fabb 100644
--- a/docs/sql-data-sources-parquet.md
+++ b/docs/sql-data-sources-parquet.md
@@ -350,7 +350,7 @@ Dataset<Row> df2 = spark.read().parquet("/path/to/table.parquet.encrypted");
#### KMS Client
-The InMemoryKMS class is provided only for illustration and simple demonstration of Parquet encryption functionality. **It should not be used in a real deployment**. The master encryption keys must be kept and managed in a production-grade KMS system, deployed in user's organization. Rollout of Spark with Parquet encryption requires implementation of a client class for the KMS server. Parquet provides a plug-in [interface](https://github.com/apache/parquet-mr/blob/apache-parquet-1.13.1/p [...]
+The InMemoryKMS class is provided only for illustration and simple demonstration of Parquet encryption functionality. **It should not be used in a real deployment**. The master encryption keys must be kept and managed in a production-grade KMS system, deployed in user's organization. Rollout of Spark with Parquet encryption requires implementation of a client class for the KMS server. Parquet provides a plug-in [interface](https://github.com/apache/parquet-java/blob/apache-parquet-1.13.1 [...]
<div data-lang="java" markdown="1">
{% highlight java %}
@@ -371,9 +371,9 @@ public interface KmsClient {
</div>
-An [example](https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/test/java/org/apache/parquet/crypto/keytools/samples/VaultClient.java) of such class for an open source [KMS](https://www.vaultproject.io/api/secret/transit) can be found in the parquet-mr repository. The production KMS client should be designed in cooperation with organization's security administrators, and built by developers with an experience in access control management. Once such class is created, it c [...]
+An [example](https://github.com/apache/parquet-java/blob/master/parquet-hadoop/src/test/java/org/apache/parquet/crypto/keytools/samples/VaultClient.java) of such class for an open source [KMS](https://www.vaultproject.io/api/secret/transit) can be found in the parquet-java repository. The production KMS client should be designed in cooperation with organization's security administrators, and built by developers with an experience in access control management. Once such class is created, [...]
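To make the plug-in contract above concrete, here is a hedged Scala sketch of a custom client implementing Parquet's `KmsClient` interface. The class name, the `callKms` helper, and the endpoint layout are hypothetical; only the three overridden methods come from the interface itself:

```scala
// Hedged sketch of a custom KMS client for Parquet encryption, following the
// org.apache.parquet.crypto.keytools.KmsClient plug-in interface. The REST
// endpoint layout and callKms helper are hypothetical stand-ins.
import org.apache.hadoop.conf.Configuration
import org.apache.parquet.crypto.keytools.KmsClient

class MyOrgKmsClient extends KmsClient {
  private var kmsUrl: String = _
  private var token: String = _

  override def initialize(
      configuration: Configuration,
      kmsInstanceID: String,
      kmsInstanceURL: String,
      accessToken: String): Unit = {
    kmsUrl = kmsInstanceURL
    token = accessToken
  }

  // Ask the KMS to encrypt a key with the given master key.
  override def wrapKey(keyBytes: Array[Byte], masterKeyIdentifier: String): String =
    callKms(s"$kmsUrl/wrap/$masterKeyIdentifier",
      java.util.Base64.getEncoder.encodeToString(keyBytes))

  // Ask the KMS to decrypt a previously wrapped key.
  override def unwrapKey(wrappedKey: String, masterKeyIdentifier: String): Array[Byte] =
    java.util.Base64.getDecoder.decode(
      callKms(s"$kmsUrl/unwrap/$masterKeyIdentifier", wrappedKey))

  private def callKms(endpoint: String, payload: String): String = {
    // Placeholder: a production client would perform an authenticated HTTPS
    // request here, using `token` for authorization.
    throw new UnsupportedOperationException("wire up to a real KMS")
  }
}
```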
-Note: By default, Parquet implements a "double envelope encryption" mode, that minimizes the interaction of Spark executors with a KMS server. In this mode, the DEKs are encrypted with "key encryption keys" (KEKs, randomly generated by Parquet). The KEKs are encrypted with MEKs in KMS; the result and the KEK itself are cached in Spark executor memory. Users interested in regular envelope encryption, can switch to it by setting the `parquet.encryption.double.wrapping` parameter to `false` [...]
+Note: By default, Parquet implements a "double envelope encryption" mode, that minimizes the interaction of Spark executors with a KMS server. In this mode, the DEKs are encrypted with "key encryption keys" (KEKs, randomly generated by Parquet). The KEKs are encrypted with MEKs in KMS; the result and the KEK itself are cached in Spark executor memory. Users interested in regular envelope encryption, can switch to it by setting the `parquet.encryption.double.wrapping` parameter to `false` [...]
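As a usage note for the paragraph above, and assuming the encryption factory and key configuration described in the surrounding documentation are already in place, switching to regular envelope encryption is a single Hadoop configuration change:

```scala
// Minimal sketch: opt out of double wrapping so DEKs are sent to the KMS
// for direct encryption with the master keys (regular envelope encryption).
spark.sparkContext.hadoopConfiguration
  .set("parquet.encryption.double.wrapping", "false")
```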
## Data Source Option
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetInteroperabilitySuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetInteroperabilitySuite.scala
index fffc9e2b1924..baa11df302b0 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetInteroperabilitySuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetInteroperabilitySuite.scala
@@ -213,8 +213,8 @@ class ParquetInteroperabilitySuite extends ParquetCompatibilityTest with SharedS
           // predicates because (a) in ParquetFilters, we ignore TimestampType and (b) parquet
           // does not read statistics from int96 fields, as they are unsigned. See
           // scalastyle:off line.size.limit
-          // https://github.com/apache/parquet-mr/blob/2fd62ee4d524c270764e9b91dca72e5cf1a005b7/parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java#L419
-          // https://github.com/apache/parquet-mr/blob/2fd62ee4d524c270764e9b91dca72e5cf1a005b7/parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java#L348
+          // https://github.com/apache/parquet-java/blob/2fd62ee4d524c270764e9b91dca72e5cf1a005b7/parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java#L419
+          // https://github.com/apache/parquet-java/blob/2fd62ee4d524c270764e9b91dca72e5cf1a005b7/parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java#L348
           // scalastyle:on line.size.limit
           //
           // Just to be defensive in case anything ever changes in parquet, this test checks
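For readers outside the test, here is a hedged sketch of the behavior this comment describes. The path and timestamps are illustrative; the point is that a filter on an INT96 timestamp column is evaluated by Spark itself rather than pruned via Parquet row-group statistics:

```scala
import java.sql.Timestamp
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("Int96FilterBehavior").getOrCreate()
import spark.implicits._

// Write a timestamp column as INT96, the legacy physical type discussed above.
spark.conf.set("spark.sql.parquet.outputTimestampType", "INT96")
Seq(Timestamp.valueOf("2024-06-14 10:33:49")).toDF("ts")
  .write.mode("overwrite").parquet("/tmp/int96_example")

// The predicate below is applied by Spark during the scan; it is not turned
// into a Parquet filter, because INT96 column statistics are not usable.
spark.read.parquet("/tmp/int96_example")
  .filter($"ts" > Timestamp.valueOf("2020-01-01 00:00:00"))
  .show()
```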