This is an automated email from the ASF dual-hosted git repository.
bhavanisudha pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new d283192def4 added link and command (#10293)
d283192def4 is described below
commit d283192def43a7bc9009db877933def237fec1c2
Author: Sagar Lakshmipathy <[email protected]>
AuthorDate: Tue Dec 12 05:33:14 2023 -0800
added link and command (#10293)
---
website/docs/syncing_aws_glue_data_catalog.md | 15 +++++++++++++++
.../version-0.12.0/syncing_aws_glue_data_catalog.md | 15 +++++++++++++++
.../version-0.12.1/syncing_aws_glue_data_catalog.md | 15 +++++++++++++++
.../version-0.12.2/syncing_aws_glue_data_catalog.md | 15 +++++++++++++++
.../version-0.12.3/syncing_aws_glue_data_catalog.md | 15 +++++++++++++++
.../version-0.13.0/syncing_aws_glue_data_catalog.md | 15 +++++++++++++++
.../version-0.13.1/syncing_aws_glue_data_catalog.md | 15 +++++++++++++++
.../version-0.14.0/syncing_aws_glue_data_catalog.md | 15 +++++++++++++++
8 files changed, 120 insertions(+)
diff --git a/website/docs/syncing_aws_glue_data_catalog.md b/website/docs/syncing_aws_glue_data_catalog.md
index 3ab47deeab7..e54c6d52887 100644
--- a/website/docs/syncing_aws_glue_data_catalog.md
+++ b/website/docs/syncing_aws_glue_data_catalog.md
@@ -16,3 +16,18 @@ be passed along.
```shell
--sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
```
+
+#### Running AWS Glue Catalog Sync for Spark DataSource
+
+To write a Hudi table to Amazon S3 and catalog it in AWS Glue Data Catalog, you can use the options mentioned in the
+[AWS documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html#aws-glue-programming-etl-format-hudi-write)
+
+#### Running AWS Glue Catalog Sync from EMR
+
+If you're running HiveSyncTool on an EMR cluster backed by Glue Data Catalog as external metastore, you can simply run the sync from command line like below:
+
+```shell
+cd /usr/lib/hudi/bin
+
+./run_sync_tool.sh --base-path s3://<bucket_name>/<prefix>/<table_name> --database <database_name> --table <table_name> --partitioned-by <column_name>
+```
\ No newline at end of file
diff --git a/website/versioned_docs/version-0.12.0/syncing_aws_glue_data_catalog.md b/website/versioned_docs/version-0.12.0/syncing_aws_glue_data_catalog.md
index 0d9075993ec..1228c0b21c4 100644
--- a/website/versioned_docs/version-0.12.0/syncing_aws_glue_data_catalog.md
+++ b/website/versioned_docs/version-0.12.0/syncing_aws_glue_data_catalog.md
@@ -16,3 +16,18 @@ be passed along.
```shell
--sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
```
+
+#### Running AWS Glue Catalog Sync for Spark DataSource
+
+To write a Hudi table to Amazon S3 and catalog it in AWS Glue Data Catalog, you can use the options mentioned in the
+[AWS documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html#aws-glue-programming-etl-format-hudi-write)
+
+#### Running AWS Glue Catalog Sync from EMR
+
+If you're running HiveSyncTool on an EMR cluster backed by Glue Data Catalog as external metastore, you can simply run the sync from command line like below:
+
+```shell
+cd /usr/lib/hudi/bin
+
+./run_sync_tool.sh --base-path s3://<bucket_name>/<prefix>/<table_name> --database <database_name> --table <table_name> --partitioned-by <column_name>
+```
\ No newline at end of file
diff --git a/website/versioned_docs/version-0.12.1/syncing_aws_glue_data_catalog.md b/website/versioned_docs/version-0.12.1/syncing_aws_glue_data_catalog.md
index 0d9075993ec..1228c0b21c4 100644
--- a/website/versioned_docs/version-0.12.1/syncing_aws_glue_data_catalog.md
+++ b/website/versioned_docs/version-0.12.1/syncing_aws_glue_data_catalog.md
@@ -16,3 +16,18 @@ be passed along.
```shell
--sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
```
+
+#### Running AWS Glue Catalog Sync for Spark DataSource
+
+To write a Hudi table to Amazon S3 and catalog it in AWS Glue Data Catalog, you can use the options mentioned in the
+[AWS documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html#aws-glue-programming-etl-format-hudi-write)
+
+#### Running AWS Glue Catalog Sync from EMR
+
+If you're running HiveSyncTool on an EMR cluster backed by Glue Data Catalog as external metastore, you can simply run the sync from command line like below:
+
+```shell
+cd /usr/lib/hudi/bin
+
+./run_sync_tool.sh --base-path s3://<bucket_name>/<prefix>/<table_name> --database <database_name> --table <table_name> --partitioned-by <column_name>
+```
\ No newline at end of file
diff --git a/website/versioned_docs/version-0.12.2/syncing_aws_glue_data_catalog.md b/website/versioned_docs/version-0.12.2/syncing_aws_glue_data_catalog.md
index 0d9075993ec..1228c0b21c4 100644
--- a/website/versioned_docs/version-0.12.2/syncing_aws_glue_data_catalog.md
+++ b/website/versioned_docs/version-0.12.2/syncing_aws_glue_data_catalog.md
@@ -16,3 +16,18 @@ be passed along.
```shell
--sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
```
+
+#### Running AWS Glue Catalog Sync for Spark DataSource
+
+To write a Hudi table to Amazon S3 and catalog it in AWS Glue Data Catalog, you can use the options mentioned in the
+[AWS documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html#aws-glue-programming-etl-format-hudi-write)
+
+#### Running AWS Glue Catalog Sync from EMR
+
+If you're running HiveSyncTool on an EMR cluster backed by Glue Data Catalog as external metastore, you can simply run the sync from command line like below:
+
+```shell
+cd /usr/lib/hudi/bin
+
+./run_sync_tool.sh --base-path s3://<bucket_name>/<prefix>/<table_name> --database <database_name> --table <table_name> --partitioned-by <column_name>
+```
\ No newline at end of file
diff --git a/website/versioned_docs/version-0.12.3/syncing_aws_glue_data_catalog.md b/website/versioned_docs/version-0.12.3/syncing_aws_glue_data_catalog.md
index 0d9075993ec..1228c0b21c4 100644
--- a/website/versioned_docs/version-0.12.3/syncing_aws_glue_data_catalog.md
+++ b/website/versioned_docs/version-0.12.3/syncing_aws_glue_data_catalog.md
@@ -16,3 +16,18 @@ be passed along.
```shell
--sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
```
+
+#### Running AWS Glue Catalog Sync for Spark DataSource
+
+To write a Hudi table to Amazon S3 and catalog it in AWS Glue Data Catalog, you can use the options mentioned in the
+[AWS documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html#aws-glue-programming-etl-format-hudi-write)
+
+#### Running AWS Glue Catalog Sync from EMR
+
+If you're running HiveSyncTool on an EMR cluster backed by Glue Data Catalog as external metastore, you can simply run the sync from command line like below:
+
+```shell
+cd /usr/lib/hudi/bin
+
+./run_sync_tool.sh --base-path s3://<bucket_name>/<prefix>/<table_name> --database <database_name> --table <table_name> --partitioned-by <column_name>
+```
\ No newline at end of file
diff --git a/website/versioned_docs/version-0.13.0/syncing_aws_glue_data_catalog.md b/website/versioned_docs/version-0.13.0/syncing_aws_glue_data_catalog.md
index 0d9075993ec..1228c0b21c4 100644
--- a/website/versioned_docs/version-0.13.0/syncing_aws_glue_data_catalog.md
+++ b/website/versioned_docs/version-0.13.0/syncing_aws_glue_data_catalog.md
@@ -16,3 +16,18 @@ be passed along.
```shell
--sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
```
+
+#### Running AWS Glue Catalog Sync for Spark DataSource
+
+To write a Hudi table to Amazon S3 and catalog it in AWS Glue Data Catalog, you can use the options mentioned in the
+[AWS documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html#aws-glue-programming-etl-format-hudi-write)
+
+#### Running AWS Glue Catalog Sync from EMR
+
+If you're running HiveSyncTool on an EMR cluster backed by Glue Data Catalog as external metastore, you can simply run the sync from command line like below:
+
+```shell
+cd /usr/lib/hudi/bin
+
+./run_sync_tool.sh --base-path s3://<bucket_name>/<prefix>/<table_name> --database <database_name> --table <table_name> --partitioned-by <column_name>
+```
\ No newline at end of file
diff --git a/website/versioned_docs/version-0.13.1/syncing_aws_glue_data_catalog.md b/website/versioned_docs/version-0.13.1/syncing_aws_glue_data_catalog.md
index 0d9075993ec..1228c0b21c4 100644
--- a/website/versioned_docs/version-0.13.1/syncing_aws_glue_data_catalog.md
+++ b/website/versioned_docs/version-0.13.1/syncing_aws_glue_data_catalog.md
@@ -16,3 +16,18 @@ be passed along.
```shell
--sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
```
+
+#### Running AWS Glue Catalog Sync for Spark DataSource
+
+To write a Hudi table to Amazon S3 and catalog it in AWS Glue Data Catalog, you can use the options mentioned in the
+[AWS documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html#aws-glue-programming-etl-format-hudi-write)
+
+#### Running AWS Glue Catalog Sync from EMR
+
+If you're running HiveSyncTool on an EMR cluster backed by Glue Data Catalog as external metastore, you can simply run the sync from command line like below:
+
+```shell
+cd /usr/lib/hudi/bin
+
+./run_sync_tool.sh --base-path s3://<bucket_name>/<prefix>/<table_name> --database <database_name> --table <table_name> --partitioned-by <column_name>
+```
\ No newline at end of file
diff --git a/website/versioned_docs/version-0.14.0/syncing_aws_glue_data_catalog.md b/website/versioned_docs/version-0.14.0/syncing_aws_glue_data_catalog.md
index 3ab47deeab7..e54c6d52887 100644
--- a/website/versioned_docs/version-0.14.0/syncing_aws_glue_data_catalog.md
+++ b/website/versioned_docs/version-0.14.0/syncing_aws_glue_data_catalog.md
@@ -16,3 +16,18 @@ be passed along.
```shell
--sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
```
+
+#### Running AWS Glue Catalog Sync for Spark DataSource
+
+To write a Hudi table to Amazon S3 and catalog it in AWS Glue Data Catalog, you can use the options mentioned in the
+[AWS documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html#aws-glue-programming-etl-format-hudi-write)
+
+#### Running AWS Glue Catalog Sync from EMR
+
+If you're running HiveSyncTool on an EMR cluster backed by Glue Data Catalog as external metastore, you can simply run the sync from command line like below:
+
+```shell
+cd /usr/lib/hudi/bin
+
+./run_sync_tool.sh --base-path s3://<bucket_name>/<prefix>/<table_name> --database <database_name> --table <table_name> --partitioned-by <column_name>
+```
\ No newline at end of file