This is an automated email from the ASF dual-hosted git repository. bhavanisudha pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
     new 0aa1ecc907e added onetable support (#10461)
0aa1ecc907e is described below

commit 0aa1ecc907ea2337e6e6b0e26dad4eba8c26e36a
Author: Sagar Lakshmipathy <18vidhyasa...@gmail.com>
AuthorDate: Mon Jan 8 16:33:53 2024 -0800

    added onetable support (#10461)
---
 website/docs/syncing_onetable.md                   | 54 ++++++++++++++++++++++
 website/sidebars.js                                |  3 +-
 .../version-0.14.0/syncing_onetable.md             | 54 ++++++++++++++++++++++
 .../version-0.14.1/syncing_onetable.md             | 54 ++++++++++++++++++++++
 .../version-0.14.0-sidebars.json                   |  3 +-
 .../version-0.14.1-sidebars.json                   |  3 +-
 6 files changed, 168 insertions(+), 3 deletions(-)

diff --git a/website/docs/syncing_onetable.md b/website/docs/syncing_onetable.md
new file mode 100644
index 00000000000..99760e101c0
--- /dev/null
+++ b/website/docs/syncing_onetable.md
@@ -0,0 +1,54 @@
+---
+title: OneTable
+keywords: [onetable, hudi, delta-lake, iceberg, sync]
+---
+
+Hudi (tables created from 0.14.0 onwards) supports syncing to [OneTable](https://onetable.dev/), providing users with the option to interoperate with other table formats like Delta Lake and Apache Iceberg.
+
+## Interoperating with OneTable
+
+If you have tables in one of the supported formats (Delta/Iceberg), you can use OneTable to translate the existing metadata to read as a Hudi table and vice versa.
+
+### Installation
+
+You can work with OneTable by either building the jar from the [source](https://github.com/onetable-io/onetable) or by downloading from [GitHub packages](https://github.com/onetable-io/onetable/packages/1986830).
+
+:::tip Note
+If you're using one of the JVM languages to work with Hudi/Delta/Iceberg, you can directly use OneTable as a [dependency](https://github.com/onetable-io/onetable/packages/1986830) in your project.
+This is highlighted in this [demo](https://onetable.dev/docs/demo/docker).
+:::
+
+### Syncing to OneTable
+
+Once you have the jar, you can simply run it against a Hudi/Delta/Iceberg table to add target table format metadata to the table.
+Below is an example configuration to translate a Hudi table to Delta & Iceberg table.
+
+```shell md title="my_config.yaml"
+sourceFormat: HUDI
+targetFormats:
+  - DELTA
+  - ICEBERG
+datasets:
+  -
+    tableBasePath: path/to/hudi/table
+    tableName: tableName
+    partitionSpec: partition_field_name:VALUE
+```
+
+```shell md title="shell"
+java -jar path/to/bundled-onetable.jar --datasetConfig path/to/my_config.yaml
+```
+
+### Hudi Streamer Extensions
+If you want to use OneTable with Hudi Streamer to sync each commit into other table formats, you have to
+
+1. Add the [extensions jar](https://github.com/onetable-io/onetable/tree/main/hudi-support/extensions) `hudi-extensions-0.1.0-SNAPSHOT-bundled.jar` to your class path.
+2. Add `io.onetable.hudi.sync.OneTableSyncTool` to your list of sync classes
+3. Set the following configurations based on your preferences:
+
+   ```
+   hoodie.onetable.formats: "ICEBERG,DELTA"
+   hoodie.onetable.target.metadata.retention.hr: 168
+   ```
+
+For more examples, you can refer to the [OneTable docs](https://onetable.dev/docs/how-to).
\ No newline at end of file
diff --git a/website/sidebars.js b/website/sidebars.js
index 9ee664a606c..72456f554bb 100644
--- a/website/sidebars.js
+++ b/website/sidebars.js
@@ -57,7 +57,8 @@ module.exports = {
                 'syncing_aws_glue_data_catalog',
                 'syncing_datahub',
                 'syncing_metastore',
-                "gcp_bigquery"
+                'gcp_bigquery',
+                'syncing_onetable'
             ],
         }
     ],
diff --git a/website/versioned_docs/version-0.14.0/syncing_onetable.md b/website/versioned_docs/version-0.14.0/syncing_onetable.md
new file mode 100644
index 00000000000..99760e101c0
--- /dev/null
+++ b/website/versioned_docs/version-0.14.0/syncing_onetable.md
@@ -0,0 +1,54 @@
+---
+title: OneTable
+keywords: [onetable, hudi, delta-lake, iceberg, sync]
+---
+
+Hudi (tables created from 0.14.0 onwards) supports syncing to [OneTable](https://onetable.dev/), providing users with the option to interoperate with other table formats like Delta Lake and Apache Iceberg.
+
+## Interoperating with OneTable
+
+If you have tables in one of the supported formats (Delta/Iceberg), you can use OneTable to translate the existing metadata to read as a Hudi table and vice versa.
+
+### Installation
+
+You can work with OneTable by either building the jar from the [source](https://github.com/onetable-io/onetable) or by downloading from [GitHub packages](https://github.com/onetable-io/onetable/packages/1986830).
+
+:::tip Note
+If you're using one of the JVM languages to work with Hudi/Delta/Iceberg, you can directly use OneTable as a [dependency](https://github.com/onetable-io/onetable/packages/1986830) in your project.
+This is highlighted in this [demo](https://onetable.dev/docs/demo/docker).
+:::
+
+### Syncing to OneTable
+
+Once you have the jar, you can simply run it against a Hudi/Delta/Iceberg table to add target table format metadata to the table.
+Below is an example configuration to translate a Hudi table to Delta & Iceberg table.
+
+```shell md title="my_config.yaml"
+sourceFormat: HUDI
+targetFormats:
+  - DELTA
+  - ICEBERG
+datasets:
+  -
+    tableBasePath: path/to/hudi/table
+    tableName: tableName
+    partitionSpec: partition_field_name:VALUE
+```
+
+```shell md title="shell"
+java -jar path/to/bundled-onetable.jar --datasetConfig path/to/my_config.yaml
+```
+
+### Hudi Streamer Extensions
+If you want to use OneTable with Hudi Streamer to sync each commit into other table formats, you have to
+
+1. Add the [extensions jar](https://github.com/onetable-io/onetable/tree/main/hudi-support/extensions) `hudi-extensions-0.1.0-SNAPSHOT-bundled.jar` to your class path.
+2. Add `io.onetable.hudi.sync.OneTableSyncTool` to your list of sync classes
+3. Set the following configurations based on your preferences:
+
+   ```
+   hoodie.onetable.formats: "ICEBERG,DELTA"
+   hoodie.onetable.target.metadata.retention.hr: 168
+   ```
+
+For more examples, you can refer to the [OneTable docs](https://onetable.dev/docs/how-to).
\ No newline at end of file
diff --git a/website/versioned_docs/version-0.14.1/syncing_onetable.md b/website/versioned_docs/version-0.14.1/syncing_onetable.md
new file mode 100644
index 00000000000..99760e101c0
--- /dev/null
+++ b/website/versioned_docs/version-0.14.1/syncing_onetable.md
@@ -0,0 +1,54 @@
+---
+title: OneTable
+keywords: [onetable, hudi, delta-lake, iceberg, sync]
+---
+
+Hudi (tables created from 0.14.0 onwards) supports syncing to [OneTable](https://onetable.dev/), providing users with the option to interoperate with other table formats like Delta Lake and Apache Iceberg.
+
+## Interoperating with OneTable
+
+If you have tables in one of the supported formats (Delta/Iceberg), you can use OneTable to translate the existing metadata to read as a Hudi table and vice versa.
+
+### Installation
+
+You can work with OneTable by either building the jar from the [source](https://github.com/onetable-io/onetable) or by downloading from [GitHub packages](https://github.com/onetable-io/onetable/packages/1986830).
+
+:::tip Note
+If you're using one of the JVM languages to work with Hudi/Delta/Iceberg, you can directly use OneTable as a [dependency](https://github.com/onetable-io/onetable/packages/1986830) in your project.
+This is highlighted in this [demo](https://onetable.dev/docs/demo/docker).
+:::
+
+### Syncing to OneTable
+
+Once you have the jar, you can simply run it against a Hudi/Delta/Iceberg table to add target table format metadata to the table.
+Below is an example configuration to translate a Hudi table to Delta & Iceberg table.
+
+```shell md title="my_config.yaml"
+sourceFormat: HUDI
+targetFormats:
+  - DELTA
+  - ICEBERG
+datasets:
+  -
+    tableBasePath: path/to/hudi/table
+    tableName: tableName
+    partitionSpec: partition_field_name:VALUE
+```
+
+```shell md title="shell"
+java -jar path/to/bundled-onetable.jar --datasetConfig path/to/my_config.yaml
+```
+
+### Hudi Streamer Extensions
+If you want to use OneTable with Hudi Streamer to sync each commit into other table formats, you have to
+
+1. Add the [extensions jar](https://github.com/onetable-io/onetable/tree/main/hudi-support/extensions) `hudi-extensions-0.1.0-SNAPSHOT-bundled.jar` to your class path.
+2. Add `io.onetable.hudi.sync.OneTableSyncTool` to your list of sync classes
+3. Set the following configurations based on your preferences:
+
+   ```
+   hoodie.onetable.formats: "ICEBERG,DELTA"
+   hoodie.onetable.target.metadata.retention.hr: 168
+   ```
+
+For more examples, you can refer to the [OneTable docs](https://onetable.dev/docs/how-to).
\ No newline at end of file
diff --git a/website/versioned_sidebars/version-0.14.0-sidebars.json b/website/versioned_sidebars/version-0.14.0-sidebars.json
index f56d56a1e20..245e9eb9bec 100644
--- a/website/versioned_sidebars/version-0.14.0-sidebars.json
+++ b/website/versioned_sidebars/version-0.14.0-sidebars.json
@@ -50,7 +50,8 @@
         "syncing_aws_glue_data_catalog",
         "syncing_datahub",
         "syncing_metastore",
-        "gcp_bigquery"
+        "gcp_bigquery",
+        "syncing_onetable"
       ]
     }
   ]
diff --git a/website/versioned_sidebars/version-0.14.1-sidebars.json b/website/versioned_sidebars/version-0.14.1-sidebars.json
index 1de288cea81..f6ce1b05983 100644
--- a/website/versioned_sidebars/version-0.14.1-sidebars.json
+++ b/website/versioned_sidebars/version-0.14.1-sidebars.json
@@ -50,7 +50,8 @@
         "syncing_aws_glue_data_catalog",
         "syncing_datahub",
         "syncing_metastore",
-        "gcp_bigquery"
+        "gcp_bigquery",
+        "syncing_onetable"
       ]
     }
   ]
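
Illustration of the "Syncing to OneTable" section added by this commit: a minimal Python sketch that renders the same `my_config.yaml` keys and assembles the `java -jar` command shown in the new docs. The helper names `render_dataset_config` and `build_sync_command` are hypothetical and are not part of OneTable or Hudi; the jar, config, and table paths are the placeholders from the diff. The sketch only prints its output and does not invoke the jar.

```python
import shlex


def render_dataset_config(table_base_path, table_name, partition_spec,
                          source_format="HUDI",
                          target_formats=("DELTA", "ICEBERG")):
    """Render dataset config text using the keys from my_config.yaml in the diff.

    Hypothetical helper: it emits plain text and does not validate the values.
    """
    lines = [f"sourceFormat: {source_format}", "targetFormats:"]
    lines += [f"  - {fmt}" for fmt in target_formats]
    lines += [
        "datasets:",
        "  -",
        f"    tableBasePath: {table_base_path}",
        f"    tableName: {table_name}",
        f"    partitionSpec: {partition_spec}",
    ]
    return "\n".join(lines) + "\n"


def build_sync_command(jar_path, config_path):
    """Build the argument list for the java -jar call shown in the docs."""
    return ["java", "-jar", str(jar_path), "--datasetConfig", str(config_path)]


if __name__ == "__main__":
    # Placeholder values copied from the documentation in the diff.
    print(render_dataset_config(
        table_base_path="path/to/hudi/table",
        table_name="tableName",
        partition_spec="partition_field_name:VALUE",
    ))
    # Print the command rather than running it; the jar path is a placeholder.
    print(shlex.join(build_sync_command(
        "path/to/bundled-onetable.jar", "path/to/my_config.yaml")))
```

Returning the command as an argument list keeps it suitable for `subprocess.run` without shell quoting concerns, should a reader choose to execute it against a real jar.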