yuqi1129 commented on code in PR #4631: URL: https://github.com/apache/gravitino/pull/4631#discussion_r1727087404
########## docs/spark-connector/spark-catalog-iceberg.md: ##########
@@ -97,44 +97,29 @@ DESC EXTENDED employee;
 
 For more details about `CALL`, please refer to the [Spark Procedures description](https://iceberg.apache.org/docs/1.5.2/spark-procedures/#spark-procedures) in Iceberg official document.
 
-## Apache Iceberg catalog backend support
-- HiveCatalog
-- JdbcCatalog
-- RESTCatalog
-
-### Catalog properties
+## Catalog properties
 
 Gravitino spark connector will transform below property names which are defined in catalog properties to Spark Iceberg connector configuration.
 
-#### HiveCatalog
-
-| Gravitino catalog property name | Spark Iceberg connector configuration | Default Value | Required | Description | Since Version |
-|---------------------------------|---------------------------------------|---------------|----------|---------------------------|---------------|
-| `catalog-backend` | `type` | `memory` | Yes | Catalog backend type | 0.5.0 |
-| `uri` | `uri` | (none) | Yes | Catalog backend uri | 0.5.0 |
-| `warehouse` | `warehouse` | (none) | Yes | Catalog backend warehouse | 0.5.0 |
-
-#### JdbcCatalog
-
-| Gravitino catalog property name | Spark Iceberg connector configuration | Default Value | Required | Description | Since Version |
-|---------------------------------|---------------------------------------|---------------|----------|---------------------------|---------------|
-| `catalog-backend` | `type` | `memory` | Yes | Catalog backend type | 0.5.0 |
-| `uri` | `uri` | (none) | Yes | Catalog backend uri | 0.5.0 |
-| `warehouse` | `warehouse` | (none) | Yes | Catalog backend warehouse | 0.5.0 |
-| `jdbc-user` | `jdbc.user` | (none) | Yes | JDBC user name | 0.5.0 |
-| `jdbc-password` | `jdbc.password` | (none) | Yes | JDBC password | 0.5.0 |
+| Gravitino catalog property name | Spark Iceberg connector configuration | Description | Since Version |
+|---------------------------------|---------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------|
+| `catalog-backend` | `type` | Catalog backend type | 0.5.0 |
+| `uri` | `uri` | Catalog backend uri | 0.5.0 |
+| `warehouse` | `warehouse` | Catalog backend warehouse | 0.5.0 |
+| `jdbc-user` | `jdbc.user` | JDBC user name | 0.5.0 |
+| `jdbc-password` | `jdbc.password` | JDBC password | 0.5.0 |
+| `io-impl` | `io-impl` | The io implementation for `FileIO` in Iceberg. | 0.6.0 |
+| `s3-endpoint` | `s3.endpoint` | An alternative endpoint of the S3 service, This could be used for S3FileIO with any s3-compatible object storage service that has a different endpoint, or access a private S3 endpoint in a virtual private cloud. | 0.6.0 |
+| `s3-region` | `client.region` | The region of the S3 service, like `us-west-2`. | 0.6.0 |
 
-#### RESTCatalog
-
-| Gravitino catalog property name | Spark Iceberg connector configuration | Default Value | Required | Description | Since Version |
-|---------------------------------|---------------------------------------|---------------|----------|---------------------------|---------------|
-| `catalog-backend` | `type` | `memory` | Yes | Catalog backend type | 0.5.1 |
-| `uri` | `uri` | (none) | Yes | Catalog backend uri | 0.5.1 |
-| `warehouse` | `warehouse` | (none) | No | Catalog backend warehouse | 0.5.1 |
-
-Gravitino catalog property names with the prefix `spark.bypass.` are passed to Spark Iceberg connector. For example, using `spark.bypass.io-impl` to pass the `io-impl` to the Spark Iceberg connector.
+Gravitino catalog property names with the prefix `spark.bypass.` are passed to Spark Iceberg connector. For example, using `spark.bypass.clients` to pass the `clients` to the Spark Iceberg connector.
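The property-name transform described in the table above, together with the `spark.bypass.` prefix rule, can be sketched as a small function. This is an illustrative sketch of the documented mapping only, not the Gravitino connector's actual implementation:

```python
# Illustrative sketch of the property-name transform documented in the table
# above; not the Gravitino spark connector's real code.

# Gravitino catalog property name -> Spark Iceberg connector configuration key
PROPERTY_MAP = {
    "catalog-backend": "type",
    "uri": "uri",
    "warehouse": "warehouse",
    "jdbc-user": "jdbc.user",
    "jdbc-password": "jdbc.password",
    "io-impl": "io-impl",
    "s3-endpoint": "s3.endpoint",
    "s3-region": "client.region",
}

BYPASS_PREFIX = "spark.bypass."


def to_spark_iceberg_conf(catalog_props: dict) -> dict:
    """Map Gravitino catalog properties to Spark Iceberg connector keys."""
    conf = {}
    for name, value in catalog_props.items():
        if name.startswith(BYPASS_PREFIX):
            # Properties with the `spark.bypass.` prefix are passed through
            # with the prefix stripped, e.g. `spark.bypass.clients` -> `clients`.
            conf[name[len(BYPASS_PREFIX):]] = value
        elif name in PROPERTY_MAP:
            conf[PROPERTY_MAP[name]] = value
    return conf


print(to_spark_iceberg_conf({"catalog-backend": "jdbc", "spark.bypass.clients": "4"}))
# {'type': 'jdbc', 'clients': '4'}
```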
 
 :::info
 Iceberg catalog property `cache-enabled` is setting to `false` internally and not allowed to change.
 :::
 
+## Storage
+
+### S3
+
+To access the table stored on S3, you must add s3 secret in Spark configuration by `spark.sql.catalog.${iceberg_catalog_name}.s3.access-key-id` and `spark.sql.catalog.${iceberg_catalog_name}.s3.secret-access-key`, download [Iceberg AWS bundle](https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-aws-bundle) and place it to the classpath of Spark.

Review Comment:
   the table -> tables.

   > you must add s3 secret in Spark configuration by `spark.sql.catalog.${iceberg_catalog_name}.s3.access-key-id` and `spark.sql.catalog.${iceberg_catalog_name}.s3.secret-access-key`

   You need to add s3 ... to the spark configuration using ... and ....., additionally, download the ... and place it in spark's classpath.

########## docs/spark-connector/spark-catalog-iceberg.md: ##########
@@ -97,44 +97,29 @@ DESC EXTENDED employee;
+| `s3-endpoint` | `s3.endpoint` | An alternative endpoint of the S3 service, This could be used for S3FileIO with any s3-compatible object storage service that has a different endpoint, or access a private S3 endpoint in a virtual private cloud. | 0.6.0 |

Review Comment:
   What's the meaning of `alternative`? Is this configuration optional?

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
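For the Storage/S3 section quoted in the diff above, the two credential settings can be sketched as Spark `--conf` entries. The catalog name `mycatalog` and the credential placeholders below are hypothetical examples, not values from the PR:

```python
# Hypothetical sketch of the two S3 credential settings from the Storage/S3
# section above. "mycatalog" and the angle-bracket values are placeholders.
iceberg_catalog_name = "mycatalog"

s3_conf = {
    f"spark.sql.catalog.{iceberg_catalog_name}.s3.access-key-id": "<access-key-id>",
    f"spark.sql.catalog.{iceberg_catalog_name}.s3.secret-access-key": "<secret-access-key>",
}

# These would be passed via --conf on spark-submit or spark-sql; per the doc
# text, the iceberg-aws-bundle jar must also be on Spark's classpath.
for key, value in s3_conf.items():
    print(f"--conf {key}={value}")
```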
