This is an automated email from the ASF dual-hosted git repository.
adutra pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/polaris-tools.git
The following commit(s) were added to refs/heads/main by this push:
new 601fee2 docs(iceberg-catalog-migrator): Updating the Iceberg Catalog
Migrator documentation (#42)
601fee2 is described below
commit 601fee246b7e917b420e5a1a1f73da4770fe9c45
Author: Adam Christian
<[email protected]>
AuthorDate: Thu Nov 13 13:51:29 2025 -0500
docs(iceberg-catalog-migrator): Updating the Iceberg Catalog Migrator
documentation (#42)
---
iceberg-catalog-migrator/README.md | 377 +--------------------
iceberg-catalog-migrator/docs/examples.md | 122 +++++++
iceberg-catalog-migrator/docs/getting-started.md | 116 +++++++
.../docs/object-store-access-configuration.md | 36 ++
iceberg-catalog-migrator/docs/troubleshooting.md | 37 ++
5 files changed, 322 insertions(+), 366 deletions(-)
diff --git a/iceberg-catalog-migrator/README.md
b/iceberg-catalog-migrator/README.md
index 4128f4d..04fee21 100644
--- a/iceberg-catalog-migrator/README.md
+++ b/iceberg-catalog-migrator/README.md
@@ -17,379 +17,24 @@
- under the License.
-->
-# Objective
-Introduce a command-line tool that enables bulk migration of Iceberg tables
from one catalog to another without the need to copy the data.
+# Iceberg Catalog Migrator
+
+The Iceberg Catalog Migrator is a command-line tool that enables bulk
migration of [Apache Iceberg](https://iceberg.apache.org/) tables from one
[Iceberg Catalog](https://iceberg.apache.org/rest-catalog-spec/) to another
without the need to copy the data. This tool works with all Iceberg Catalogs,
not just Polaris.
-There are various reasons why users may want to move their Iceberg tables to a
different catalog. For instance,
-* They were using hadoop catalog and later realized that it is not production
recommended. So, they want to move tables to other production ready catalogs.
-* They just heard about the awesome Apache Polaris catalog and want to move
their existing iceberg tables to Apache Polaris catalog.
-* They had an on-premise Hive catalog, but want to move tables to a
cloud-based catalog as part of their cloud migration strategy.
-
-The CLI tool should support two commands
-* migrate - To bulk migrate the iceberg tables from source catalog to target
catalog without data copy.
-Table entries from source catalog will be deleted after the successful
migration to the target catalog.
-* register - To bulk register the iceberg tables from source catalog to target
catalog without data copy.
+The migrator tool provides two operations:
+* Migrate - Bulk migration of Iceberg tables from a source catalog to a target
catalog. Table entries are deleted from the source catalog after a successful
migration to the target catalog.
+* Register - Bulk registration of Iceberg tables from a source catalog to a
target catalog.
> :warning: The `register` command just registers the table,
which means the table will be present in both catalogs after registering.
-**Operating same table from more than one catalog can lead to missing updates,
loss of data and table corruption.
-So, it is recommended to use the 'migrate' command in CLI to automatically
delete the table from source catalog after registering
+**Operating the same table from more than one catalog can lead to missing
updates, loss of data, and table corruption.
+It is recommended to use the `migrate` command to automatically delete the
table from the source catalog after registering,
or to avoid operating on tables from the source catalog after registering if
the `migrate` command is not used.**
-> :warning: **Avoid using this CLI tool when there are in-progress commits for
tables in the source catalog
+> :warning: **Avoid using this tool when there are in-progress commits for
tables in the source catalog
to prevent missing updates, data loss, and table corruption in the target
catalog.
In-progress commits may not be properly transferred and could compromise the
integrity of your data.**
-# Iceberg-catalog-migrator
-Need to have Java installed in your machine (Java 21 is recommended and the
minimum Java version) to use this CLI tool.
-
-Below is the CLI syntax:
-```
-$ java -jar iceberg-catalog-migrator-cli-0.0.1.jar -h
-Usage: iceberg-catalog-migrator [-hV] [COMMAND]
- -h, --help Show this help message and exit.
- -V, --version Print version information and exit.
-Commands:
- migrate Bulk migrate the iceberg tables from source catalog to target
catalog without data copy. Table entries from source catalog will be
- deleted after the successful migration to the target catalog.
- register Bulk register the iceberg tables from source catalog to target
catalog without data copy.
-```
-
-```
-$ java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate -h
-Usage: iceberg-catalog-migrator migrate [-hV] [--disable-safety-prompts]
[--dry-run] [--stacktrace] [--output-dir=<outputDirPath>]
- (--source-catalog-type=<type>
--source-catalog-properties=<String=String>[,<String=String>...]
-
[--source-catalog-properties=<String=String>[,<String=String>...]]...
-
[--source-catalog-hadoop-conf=<String=String>[,<String=String>...]]...
-
[--source-custom-catalog-impl=<customCatalogImpl>])
(--target-catalog-type=<type>
-
--target-catalog-properties=<String=String>[,<String=String>...]
[--target-catalog-properties=<String=String>
- [,<String=String>...]]...
[--target-catalog-hadoop-conf=<String=String>[,<String=String>...]]...
-
[--target-custom-catalog-impl=<customCatalogImpl>])
[--identifiers=<identifiers>[,<identifiers>...]
-
[--identifiers=<identifiers>[,<identifiers>...]]... |
--identifiers-from-file=<identifiersFromFile> |
- --identifiers-regex=<identifiersRegEx>]
-Bulk migrate the iceberg tables from source catalog to target catalog without
data copy. Table entries from source catalog will be deleted after the
-successful migration to the target catalog.
- --output-dir=<outputDirPath>
- Optional local output directory path to write CLI output
files like `failed_identifiers.txt`, `failed_to_delete_at_source.txt`,
- `dry_run_identifiers.txt`. If not specified, uses the
present working directory.
- Example: --output-dir /tmp/output/
- --output-dir $PWD/output_folder
- --dry-run Optional configuration to simulate the registration
without actually registering. Can learn about a list of tables that will be
- registered by running this.
- --disable-safety-prompts
- Optional configuration to disable safety prompts which
needs console input.
- --stacktrace Optional configuration to enable capturing stacktrace in
logs in case of failures.
- -h, --help Show this help message and exit.
- -V, --version Print version information and exit.
-Source catalog options:
- --source-catalog-type=<type>
- Source catalog type. Can be one of these [CUSTOM,
DYNAMODB, ECS, GLUE, HADOOP, HIVE, JDBC, NESSIE, REST].
- Example: --source-catalog-type GLUE
- --source-catalog-type NESSIE
- --source-catalog-properties=<String=String>[,<String=String>...]
- Iceberg catalog properties for source catalog (like uri,
warehouse, etc).
- Example: --source-catalog-properties
uri=http://localhost:19120/api/v1,ref=main,warehouse=/tmp/warehouseNessie
- --source-catalog-hadoop-conf=<String=String>[,<String=String>...]
- Optional source catalog Hadoop configurations required by
the Iceberg catalog.
- Example: --source-catalog-hadoop-conf
key1=value1,key2=value2
- --source-custom-catalog-impl=<customCatalogImpl>
- Optional fully qualified class name of the custom catalog
implementation of the source catalog. Required when the catalog type
- is CUSTOM.
- Example: --source-custom-catalog-impl
org.apache.iceberg.AwesomeCatalog
-Target catalog options:
- --target-catalog-type=<type>
- Target catalog type. Can be one of these [CUSTOM,
DYNAMODB, ECS, GLUE, HADOOP, HIVE, JDBC, NESSIE, REST].
- Example: --target-catalog-type GLUE
- --target-catalog-type NESSIE
- --target-catalog-properties=<String=String>[,<String=String>...]
- Iceberg catalog properties for target catalog (like uri,
warehouse, etc).
- Example: --target-catalog-properties
uri=http://localhost:19120/api/v1,ref=main,warehouse=/tmp/warehouseNessie
- --target-catalog-hadoop-conf=<String=String>[,<String=String>...]
- Optional target catalog Hadoop configurations required by
the Iceberg catalog.
- Example: --target-catalog-hadoop-conf
key1=value1,key2=value2
- --target-custom-catalog-impl=<customCatalogImpl>
- Optional fully qualified class name of the custom catalog
implementation of the target catalog. Required when the catalog type
- is CUSTOM.
- Example: --target-custom-catalog-impl
org.apache.iceberg.AwesomeCatalog
-Identifier options:
- --identifiers=<identifiers>[,<identifiers>...]
- Optional selective set of identifiers to register. If not
specified, all the tables will be registered. Use this when there are
- few identifiers that need to be registered. For a large
number of identifiers, use the `--identifiers-from-file` or
- `--identifiers-regex` option.
- Example: --identifiers foo.t1,bar.t2
- --identifiers-from-file=<identifiersFromFile>
- Optional text file path that contains a set of table
identifiers (one per line) to register. Should not be used with
- `--identifiers` or `--identifiers-regex` option.
- Example: --identifiers-from-file /tmp/files/ids.txt
- --identifiers-regex=<identifiersRegEx>
- Optional regular expression pattern used to register only
the tables whose identifiers match this pattern. Should not be used
- with `--identifiers` or '--identifiers-from-file'
option.
- Example: --identifiers-regex ^foo\..*
-```
-
-Note: Options for register command is exactly same as migrate command.
-
-# Sample Inputs
-
-Note:
-a) Before migrating tables to Apache polaris, Make sure the catalog instance
is configured to the `base-location`
-same as source catalog `warehouse` location during catalog creation.
-
-```
-{
- "catalog": {
- "name": "test",
- "type": "INTERNAL",
- "readOnly": false,
- "properties": {
- "default-base-location": "file:/path/to/source_catalog"
- },
- "storageConfigInfo": {
- "storageType": "FILE",
- "allowedLocations": [
- "file:/path/to/source_catalog"
- ]
- }
- }
-}
-```
-
-b) Get the Oauth token and export it to the local variable
-
-```shell
-curl -X POST http://localhost:8181/api/catalog/v1/oauth/tokens \
--d "grant_type=client_credentials" \
--d "client_id=my-client-id" \
--d "client_secret=my-client-secret" \
--d "scope=PRINCIPAL_ROLE:ALL"
-
-export TOKEN=xxxxxxx
-```
-
-c) Also export the required storage related configs and use them respectively
for catalog configuration.
-For s3,
-
-```shell
-export AWS_ACCESS_KEY_ID=xxxxxxx
-export AWS_SECRET_ACCESS_KEY=xxxxxxx
-export AWS_S3_ENDPOINT=xxxxxxx
-```
-
-for ADLS,
-```shell
-export AZURE_SAS_TOKEN=<token>
-```
-
-## Bulk registering all the tables from Hadoop catalog to Polaris catalog
-```shell
-java -jar iceberg-catalog-migrator-cli-0.0.1.jar register \
---source-catalog-type HADOOP \
---source-catalog-properties warehouse=/tmp/warehouse,type=hadoop \
---target-catalog-type REST \
---target-catalog-properties
uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN
-```
-
-## Migrate selected tables (t1,t2 in namespace foo) from Hadoop catalog to
Polaris catalog
-
-```shell
-java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
---source-catalog-type HADOOP \
---source-catalog-properties warehouse=/tmp/warehouse,type=hadoop \
---target-catalog-type REST \
---target-catalog-properties
uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN \
---identifiers foo.t1,foo.t2
-```
-
-## Migrate all tables from GLUE catalog to Polaris catalog
-```shell
-java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
---source-catalog-type GLUE \
---source-catalog-properties
warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO \
---target-catalog-type REST \
---target-catalog-properties
uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN
-```
-
-## Migrate all tables from HIVE catalog to Polaris catalog
-```shell
-java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
---source-catalog-type HIVE \
---source-catalog-properties
warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,uri=thrift://localhost:9083
\
---target-catalog-type REST \
---target-catalog-properties
uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN
-```
-
-Note: Need to configure `ALLOW_UNSTRUCTURED_TABLE_LOCATION` property at the
polaris server side as
-HMS creates a namespace folder with ".db" extension. Also need to configure
`allowedLocations` to be
-source catalog directory in `storage_configuration_info`.
-
-## Migrate all tables from DYNAMODB catalog to Polaris catalog
-```shell
-java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
---source-catalog-type DYNAMODB \
---source-catalog-properties
warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO \
---target-catalog-type REST \
---target-catalog-properties
uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN
-```
-
-## Migrate all tables from JDBC catalog to Polaris catalog
-```shell
-java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
---source-catalog-type JDBC \
---source-catalog-properties
warehouse=/tmp/warehouseJdbc,jdbc.user=root,jdbc.password=pass,uri=jdbc:mysql://localhost:3306/db1,name=catalogName
\
---target-catalog-type REST \
---target-catalog-properties
uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN
-```
-
-# Scenarios
-## A. User wants to try out a new catalog
-Users can use a new catalog by creating a fresh table to test the new
catalog's capabilities.
-
-## B. Users wants to move the tables from one catalog (example: Hive) to
another (example: Nessie).
-
-### B.1) Executes `--dry-run` option to check which tables will get migrated.
-
-Sample input:
-```shell
-java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
---source-catalog-type HIVE \
---source-catalog-properties
warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,uri=thrift://localhost:9083
\
---target-catalog-type NESSIE \
---target-catalog-properties
uri=http://localhost:19120/api/v1,ref=main,warehouse=/tmp/warehouse \
---dry-run
-```
-
-After validating all inputs, the console will display a list of table
identifiers, that are identified for migration, along with the total count.
-This information will also be written to a file called `dry_run.txt`,
-The list of table identifiers in `dry_run.txt` can be altered (if needed) and
reused for the actual migration using the `--identifiers-from-file` option;
thus eliminating the need for the tool to list the tables from the catalog in
the actual run.
-
-### B.2) Executes the migration of all 1000 tables and all the tables are
successfully migrated.
-
-Sample input:
-```shell
-java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
---source-catalog-type HIVE \
---source-catalog-properties
warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,uri=thrift://localhost:9083
\
---target-catalog-type NESSIE \
---target-catalog-properties
uri=http://localhost:19120/api/v1,ref=main,warehouse=/tmp/warehouse
-```
-
-After input validation, users will receive a prompt message with the option to
either abort or continue the operation.
-
-```
-WARN - User has not specified the table identifiers. Will be selecting all
the tables from all the namespaces from the source catalog.
-INFO - Configured source catalog: SOURCE_CATALOG_HIVE
-INFO - Configured target catalog: TARGET_CATALOG_NESSIE
-WARN -
- a) Executing catalog migration when the source catalog has some
in-progress commits
- can lead to a data loss as the in-progress commits will not be
considered for migration.
- So, while using this tool please make sure there are no in-progress
commits for the source catalog.
-
- b) After the migration, successfully migrated tables will be deleted
from the source catalog
- and can only be accessed from the target catalog.
-INFO - Are you certain that you wish to proceed, after reading the above
warnings? (yes/no):
-```
-
-If the user chooses to continue, additional information will be displayed on
the console.
-
-```
-INFO - Continuing...
-INFO - Identifying tables for migration ...
-INFO - Identified 1000 tables for migration.
-INFO - Started migration ...
-INFO - Attempted Migration for 100 tables out of 1000 tables.
-INFO - Attempted Migration for 200 tables out of 1000 tables.
-.
-.
-.
-INFO - Attempted Migration for 900 tables out of 1000 tables.
-INFO - Attempted Migration for 1000 tables out of 1000 tables.
-INFO - Finished migration ...
-INFO - Summary:
-INFO - Successfully migrated 1000 tables from HIVE catalog to NESSIE catalog.
-INFO - Details:
-INFO - Successfully migrated these tables:
-[foo.tbl-1, foo.tbl-2, bar.tbl-4, bar.tbl-3, …, …,bar.tbl-1000]
-```
-
-Please note that a log file will be created, which will print "successfully
migrated table X" for every table migration,
-and also log any table level failures, if present.
-
-### B.3) Executes the migration and out of 1000 tables 10 tables have failed
to migrate because of some error. Remaining 990 tables were successfully
migrated.
-
-Sample input:
-```shell
-java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
---source-catalog-type HIVE \
---source-catalog-properties
warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,uri=thrift://localhost:9083
\
---target-catalog-type NESSIE \
---target-catalog-properties
uri=http://localhost:19120/api/v1,ref=main,warehouse=/tmp/warehouse \
---stacktrace
-```
-
-Console output will be same as B.2) till summary because even in case of
failure,
-all the identified tables will be attempted for migration.
-
-```
-INFO - Summary:
-INFO - Successfully migrated 990 tables from HIVE catalog to NESSIE catalog.
-ERROR - Failed to migrate 10 tables from HIVE catalog to NESSIE catalog.
Please check the `catalog_migration.log` file for the failure reason.
-Failed Identifiers are written to `failed_identifiers.txt`. Retry with that
file using the `--identifiers-from-file` option if the failure is because of
network/connection timeouts.
-INFO - Details:
-INFO - Successfully migrated these tables:
-[foo.tbl-1, foo.tbl-2, bar.tbl-4, bar.tbl-3, …, …,bar.tbl-1000]
-ERROR - Failed to migrate these tables:
-[bar.tbl-201, foo.tbl-202, …, …,bar.tbl-210]
-```
-
-Please note that a log file will be generated, which will print "successfully
migrated table X" for every table migration and log any table-level failures in
the `failed_identifiers.txt` file.
-Users can use this file to identify failed tables and search for them in the
log, which will contain the exception stacktrace for those 10 tables.
-This can help users understand why the migration failed.
-* If the migration of those tables failed with `TableAlreadyExists` exception,
users can rename the tables in the source catalog and migrate only those 10
tables using any of the identifier options available in the argument.
-* If the migration of those tables failed with `ConnectionTimeOut` exception,
users can retry migrating only those 10 tables using the
`--identifiers-from-file` option with the `failed_identifiers.txt` file.
-* If the migration is successful but deletion of some tables form source
catalog is failed, summary will mention that these table names were written
into the `failed_to_delete.txt` file and logs will capture the failure reason.
-Do not operate these tables from the source catalog and user will have to
delete them manually.
-
-### B.4) Executes the migration and out of 1000 tables. But manually aborts
the migration by killing the process.
-
-To determine the number of migrated tables, the user can either review the log
or use the `listTables()` function in the target catalog.
-In the event of an abort, migrated tables may not be deleted from the source
catalog, and users should avoid manipulating them from there.
-To recover, users can manually remove these tables from the source catalog or
attempt a bulk migration to transfer all tables from the source catalog.
-
-### B.5) Users need to move away from one catalog to another with selective
tables (maybe want to move only the production tables, test tables, etc)
-
-Users can provide the selective list of identifiers to migrate using any of
these 3 options
-`--identifiers`, `--identifiers-from-file`, `--identifier-regex` and it can be
used along with the dry-run option too.
-
-Sample input: (only migrate tables that starts with "foo.")
-```shell
-java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
---source-catalog-type HIVE \
---source-catalog-properties
warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,uri=thrift://localhost:9083
\
---target-catalog-type NESSIE \
---target-catalog-properties
uri=http://localhost:19120/api/v1,ref=main,warehouse=/tmp/warehouse \
---identifiers-regex ^foo\..*
-
-```
-
-Sample input: (migrate all tables in the file ids.txt where each entry is
delimited by newline)
-```shell
-java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
---source-catalog-type HIVE \
---source-catalog-properties warehouse=/tmp/warehouse,type=hadoop \
---target-catalog-type NESSIE \
---target-catalog-properties
uri=http://localhost:19120/api/v1,ref=main,warehouse=/tmp/warehouse \
---identifiers-from-file ids.txt
-```
-
-Sample input: (migrate only two tables foo.tbl1, foo.tbl2)
-```shell
-java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
---source-catalog-type HIVE \
---source-catalog-properties
warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,uri=thrift://localhost:9083
\
---target-catalog-type NESSIE \
---target-catalog-properties
uri=http://localhost:19120/api/v1,ref=main,warehouse=/tmp/warehouse \
---identifiers foo.tbl1,foo.tbl2
-```
+Please see the [getting started guide](docs/getting-started.md) for
step-by-step instructions on how to use the tool.
-Console will clearly print that only these identifiers are used for table
migration.
-Rest of the behavior will be the same as mentioned in the previous sections.
\ No newline at end of file
+Please see the [examples guide](docs/examples.md) to learn about the
different options available in the tool.
diff --git a/iceberg-catalog-migrator/docs/examples.md
b/iceberg-catalog-migrator/docs/examples.md
new file mode 100644
index 0000000..dbc7a8a
--- /dev/null
+++ b/iceberg-catalog-migrator/docs/examples.md
@@ -0,0 +1,122 @@
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one
+ - or more contributor license agreements. See the NOTICE file
+ - distributed with this work for additional information
+ - regarding copyright ownership. The ASF licenses this file
+ - to you under the Apache License, Version 2.0 (the
+ - "License"); you may not use this file except in compliance
+ - with the License. You may obtain a copy of the License at
+ -
+ - http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing,
+ - software distributed under the License is distributed on an
+ - "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ - KIND, either express or implied. See the License for the
+ - specific language governing permissions and limitations
+ - under the License.
+ -->
+
+# Examples
+
+This document provides examples of how to use the Iceberg Catalog Migrator
with various Iceberg Catalogs. It is organized into:
+1. [Registration Examples](#registration-examples)
+2. [Migration Examples](#migration-examples)
+3. [Tips](#tips)
+
+For more information on how to handle failures, please refer to [the
troubleshooting guide](./troubleshooting.md).
+
+## Registration Examples
+Below are some examples of registering tables from one catalog to another.
+
+### Registering All Tables from Hadoop Catalog to Polaris Catalog
+
+```shell
+java -jar iceberg-catalog-migrator-cli-0.0.1.jar register \
+--source-catalog-type HADOOP \
+--source-catalog-properties warehouse=/tmp/warehouse,type=hadoop \
+--target-catalog-type REST \
+--target-catalog-properties
uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN
+```
+
+## Migration Examples
+
+### Migrate Selected Tables from Hadoop Catalog to Polaris Catalog
+
+In this example, only tables `t1` and `t2` in namespace `foo` will be migrated.
+
+```shell
+java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
+--source-catalog-type HADOOP \
+--source-catalog-properties warehouse=/tmp/warehouse,type=hadoop \
+--target-catalog-type REST \
+--target-catalog-properties
uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN \
+--identifiers foo.t1,foo.t2
+```
+
+### Migrate from GLUE Catalog to Polaris Catalog
+```shell
+java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
+--source-catalog-type GLUE \
+--source-catalog-properties
warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO \
+--target-catalog-type REST \
+--target-catalog-properties
uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN
+```
+
+### Migrate from HIVE Catalog to Polaris Catalog
+```shell
+java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
+--source-catalog-type HIVE \
+--source-catalog-properties
warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,uri=thrift://localhost:9083
\
+--target-catalog-type REST \
+--target-catalog-properties
uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN
+```
+
+Note: You will need to configure the `ALLOW_UNSTRUCTURED_TABLE_LOCATION`
property on the Polaris server side, as
+HMS creates a namespace folder with a `.db` extension. In addition, you will
need to configure `allowedLocations` in
+`storage_configuration_info` to include the source catalog directory.
+
+### Migrate from DYNAMODB Catalog to Polaris Catalog
+```shell
+java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
+--source-catalog-type DYNAMODB \
+--source-catalog-properties
warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO \
+--target-catalog-type REST \
+--target-catalog-properties
uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN
+```
+
+### Migrate from JDBC Catalog to Polaris Catalog
+```shell
+java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
+--source-catalog-type JDBC \
+--source-catalog-properties
warehouse=/tmp/warehouseJdbc,jdbc.user=root,jdbc.password=pass,uri=jdbc:mysql://localhost:3306/db1,name=catalogName
\
+--target-catalog-type REST \
+--target-catalog-properties
uri=http://localhost:60904/api/catalog,warehouse=test,token=$TOKEN
+```
+
+### Migrate Only Tables Starting with "foo."
+```shell
+java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
+--source-catalog-type HIVE \
+--source-catalog-properties
warehouse=s3a://some-bucket/wh/,io-impl=org.apache.iceberg.aws.s3.S3FileIO,uri=thrift://localhost:9083
\
+--target-catalog-type NESSIE \
+--target-catalog-properties
uri=http://localhost:19120/api/v1,ref=main,warehouse=/tmp/warehouse \
+--identifiers-regex ^foo\..*
+```
+
+### Migrate Tables from a File
+The file `ids.txt` contains the table identifiers to migrate, one per line.
+
+```shell
+java -jar iceberg-catalog-migrator-cli-0.0.1.jar migrate \
+--source-catalog-type HIVE \
+--source-catalog-properties warehouse=/tmp/warehouse,type=hadoop \
+--target-catalog-type NESSIE \
+--target-catalog-properties
uri=http://localhost:19120/api/v1,ref=main,warehouse=/tmp/warehouse \
+--identifiers-from-file ids.txt
+```
+
+## Tips
+1. Before migrating tables to Polaris, make sure the target catalog was
created with its `default-base-location` set to the same location as the
source catalog's `warehouse` location, as shown in the sketch below.
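+
+For example, a Polaris catalog creation payload with a matching
`default-base-location` might look like this (illustrative local-filesystem
values for a `FILE`-based setup):
+
+```json
+{
+  "catalog": {
+    "name": "test",
+    "type": "INTERNAL",
+    "readOnly": false,
+    "properties": {
+      "default-base-location": "file:/path/to/source_catalog"
+    },
+    "storageConfigInfo": {
+      "storageType": "FILE",
+      "allowedLocations": [
+        "file:/path/to/source_catalog"
+      ]
+    }
+  }
+}
+```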
diff --git a/iceberg-catalog-migrator/docs/getting-started.md
b/iceberg-catalog-migrator/docs/getting-started.md
new file mode 100644
index 0000000..48bd6f9
--- /dev/null
+++ b/iceberg-catalog-migrator/docs/getting-started.md
@@ -0,0 +1,116 @@
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one
+ - or more contributor license agreements. See the NOTICE file
+ - distributed with this work for additional information
+ - regarding copyright ownership. The ASF licenses this file
+ - to you under the Apache License, Version 2.0 (the
+ - "License"); you may not use this file except in compliance
+ - with the License. You may obtain a copy of the License at
+ -
+ - http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing,
+ - software distributed under the License is distributed on an
+ - "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ - KIND, either express or implied. See the License for the
+ - specific language governing permissions and limitations
+ - under the License.
+ -->
+
+# Getting Started
+
+This document provides a step-by-step guide on how to use the Iceberg Catalog
Migrator.
+This guide uses an example of migrating from one Polaris catalog to another
Polaris catalog, both backed by an AWS S3 bucket.
+
+## Prerequisites
+1. Java 21 or later installed
+2. Have a target catalog created and configured
+3. Have a source catalog to migrate from
+4. Block in-progress commits to the source catalog
+
+## Migration Steps
+Migration happens in five steps:
+1. Build the Iceberg Catalog Migrator
+2. Set the object storage environment variables
+3. Get access to the source and target catalogs
+4. Validate the migration
+5. Migrate the tables
+
+### Step 1: Build the Iceberg Catalog Migrator
+Execute the following commands to build the tool:
+```shell
+git clone https://github.com/apache/polaris-tools.git
+cd polaris-tools/iceberg-catalog-migrator
+./gradlew build
+```
+
+These commands:
+1. Clone the repository
+2. Navigate to the `iceberg-catalog-migrator` directory
+3. Build the tool
+4. Create a JAR file in the `iceberg-catalog-migrator/cli/build/libs/` directory
+
+The JAR file will be created with name
`iceberg-catalog-migrator-cli-<version>.jar` where `<version>` is the version
of the tool found in the `iceberg-catalog-migrator/version.txt` file. For the
examples below, we will assume the version is `0.0.1-SNAPSHOT`, so the JAR file
name will be `iceberg-catalog-migrator-cli-0.0.1-SNAPSHOT.jar`.
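+
+You can verify the build by printing the CLI help:
+
+```shell
+java -jar ./cli/build/libs/iceberg-catalog-migrator-cli-0.0.1-SNAPSHOT.jar -h
+```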
+
+### Step 2: Set the Object Storage Environment Variables
+The tool will need access to the underlying object storage via environment
variables. For this example, we will use AWS S3 with an access key ID and
secret access key:
+```shell
+export AWS_ACCESS_KEY_ID=<access_key>
+export AWS_SECRET_ACCESS_KEY=<secret_key>
+```
+
+For more information on configuring access to object storage, please see [this
guide](./object-store-access-configuration.md).
+
+### Step 3: Get Access to the Source and Target Catalogs
+The tool will need to be authorized to both the source and target catalogs.
In this example, we will use two Polaris catalogs. To get access to a Polaris
catalog, use the OAuth token endpoint:
+```shell
+curl -X POST http://sourcecatalog:8181/api/catalog/v1/oauth/tokens \
+-d "grant_type=client_credentials" \
+-d "client_id=my-client-id" \
+-d "client_secret=my-client-secret" \
+-d "scope=PRINCIPAL_ROLE:ALL"
+
+export TOKEN_SOURCE=xxxxxxx
+
+curl -X POST http://targetcatalog:8181/api/catalog/v1/oauth/tokens \
+-d "grant_type=client_credentials" \
+-d "client_id=my-client-id" \
+-d "client_secret=my-client-secret" \
+-d "scope=PRINCIPAL_ROLE:ALL"
+
+export TOKEN_TARGET=xxxxxxx
+```
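+
+The token endpoint returns a JSON document. Assuming a standard OAuth2
response with an `access_token` field, the token can be extracted directly,
for example with `jq`:
+
+```shell
+# Assumes a standard OAuth2 JSON response containing an "access_token" field.
+export TOKEN_SOURCE=$(curl -s -X POST http://sourcecatalog:8181/api/catalog/v1/oauth/tokens \
+-d "grant_type=client_credentials" \
+-d "client_id=my-client-id" \
+-d "client_secret=my-client-secret" \
+-d "scope=PRINCIPAL_ROLE:ALL" | jq -r '.access_token')
+```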
+
+### Step 4: Validate the Migration
+Execute the following command to see the available options (the `register`
and `migrate` commands accept the same options):
+```shell
+java -jar ./cli/build/libs/iceberg-catalog-migrator-cli-0.0.1-SNAPSHOT.jar
register -h
+```
+
+In the example, execute the following command to perform a dry run. This will
not register any tables but will report which tables would be affected:
+```shell
+java -jar ./cli/build/libs/iceberg-catalog-migrator-cli-0.0.1-SNAPSHOT.jar
register \
+--source-catalog-type REST \
+--source-catalog-properties
uri=http://sourcecatalog:8181/api/catalog,warehouse=test,token=$TOKEN_SOURCE \
+--target-catalog-type REST \
+--target-catalog-properties
uri=http://targetcatalog:8181/api/catalog,warehouse=test,token=$TOKEN_TARGET \
+--dry-run
+```
+
+After validating all inputs, the console will display a list of table
identifiers that are identified for migration. This information will also be
written to a file called `dry_run.txt`.
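+
+The list of identifiers in `dry_run.txt` can be edited if needed and reused
for the actual run via the `--identifiers-from-file` option, which avoids
having to list the tables from the catalog again. A sketch, assuming the same
catalog settings as above:
+
+```shell
+java -jar ./cli/build/libs/iceberg-catalog-migrator-cli-0.0.1-SNAPSHOT.jar migrate \
+--source-catalog-type REST \
+--source-catalog-properties uri=http://sourcecatalog:8181/api/catalog,warehouse=test,token=$TOKEN_SOURCE \
+--target-catalog-type REST \
+--target-catalog-properties uri=http://targetcatalog:8181/api/catalog,warehouse=test,token=$TOKEN_TARGET \
+--identifiers-from-file dry_run.txt
+```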
+
+### Step 5: Migrate the Tables
+
+In the example, execute the following command to perform a migration:
+```shell
+java -jar ./cli/build/libs/iceberg-catalog-migrator-cli-0.0.1-SNAPSHOT.jar
migrate \
+--source-catalog-type REST \
+--source-catalog-properties
uri=http://sourcecatalog:8181/api/catalog,warehouse=test,token=$TOKEN_SOURCE \
+--target-catalog-type REST \
+--target-catalog-properties
uri=http://targetcatalog:8181/api/catalog,warehouse=test,token=$TOKEN_TARGET
+```
+
+Please note that a log file (`catalog_migration.log`) will be created, which
you can use to verify that the migration proceeded successfully.
+If any issues occur, please use [the troubleshooting
guide](./troubleshooting.md).
+
+For more example migrations, please see [this guide](./examples.md).
diff --git a/iceberg-catalog-migrator/docs/object-store-access-configuration.md
b/iceberg-catalog-migrator/docs/object-store-access-configuration.md
new file mode 100644
index 0000000..a28b760
--- /dev/null
+++ b/iceberg-catalog-migrator/docs/object-store-access-configuration.md
@@ -0,0 +1,36 @@
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one
+ - or more contributor license agreements. See the NOTICE file
+ - distributed with this work for additional information
+ - regarding copyright ownership. The ASF licenses this file
+ - to you under the Apache License, Version 2.0 (the
+ - "License"); you may not use this file except in compliance
+ - with the License. You may obtain a copy of the License at
+ -
+ - http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing,
+ - software distributed under the License is distributed on an
+ - "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ - KIND, either express or implied. See the License for the
+ - specific language governing permissions and limitations
+ - under the License.
+ -->
+
+# Object Store Access Configuration
+
+This document provides a guide on how to configure access to object stores for
the Iceberg Catalog Migrator.
+
+## AWS S3
+For AWS, you can use the following environment variables:
+```shell
+export AWS_ACCESS_KEY_ID=xxxxxxx
+export AWS_SECRET_ACCESS_KEY=xxxxxxx
+export AWS_S3_ENDPOINT=xxxxxxx
+```
+
+## ADLS
+For ADLS, you can use the following environment variables:
+```shell
+export AZURE_SAS_TOKEN=xxxxxxx
+```
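+
+## Hadoop-backed Catalogs
+For Hadoop-based catalogs backed by S3A, credentials can also be passed
through the catalog's Hadoop configuration using the
`--source-catalog-hadoop-conf` / `--target-catalog-hadoop-conf` options. A
sketch, assuming the standard Hadoop S3A connector property names:
+```shell
+# The fs.s3a.* property names assume the Hadoop S3A connector.
+--source-catalog-hadoop-conf fs.s3a.access.key=xxxxxxx,fs.s3a.secret.key=xxxxxxx,fs.s3a.endpoint=xxxxxxx
+```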
diff --git a/iceberg-catalog-migrator/docs/troubleshooting.md
b/iceberg-catalog-migrator/docs/troubleshooting.md
new file mode 100644
index 0000000..dd44493
--- /dev/null
+++ b/iceberg-catalog-migrator/docs/troubleshooting.md
@@ -0,0 +1,37 @@
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one
+ - or more contributor license agreements. See the NOTICE file
+ - distributed with this work for additional information
+ - regarding copyright ownership. The ASF licenses this file
+ - to you under the Apache License, Version 2.0 (the
+ - "License"); you may not use this file except in compliance
+ - with the License. You may obtain a copy of the License at
+ -
+ - http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing,
+ - software distributed under the License is distributed on an
+ - "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ - KIND, either express or implied. See the License for the
+ - specific language governing permissions and limitations
+ - under the License.
+ -->
+
+# Troubleshooting
+
+This document provides troubleshooting information for common issues
encountered while using the Iceberg Catalog Migrator:
+1. [Errors while migrating tables](#errors-while-migrating-tables)
+2. [Manually aborting the migration](#manually-aborting-the-migration)
+
+## Errors while Migrating Tables
+There can be errors while migrating tables. These errors can come from the
source or the target. To troubleshoot:
+1. Look at the console output or the log file to identify the failed tables.
In the logs, there will be an exception stacktrace for each failed table (run
the tool with the `--stacktrace` option to capture stacktraces in the logs).
+2. If the migration of those tables failed with a `TableAlreadyExists`
exception, there is a conflict in the table identifiers in the target catalog.
Users can rename the conflicting tables in the source catalog and retry.
+3. If the migration of those tables failed with a `ConnectionTimeOut`
exception, users can retry migrating only those tables using the
`--identifiers-from-file` option with the `failed_identifiers.txt` file created
in the output directory (see the sketch after this list).
+4. If the migration succeeds but deletion of some tables from the source
catalog fails, the summary will mention that these table names were written
into the `failed_to_delete.txt` file, and the logs will capture the failure
reason. Do not operate on these tables from the source catalog; users will
have to delete them manually.
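+
+A sketch of retrying only the failed tables, assuming the same source and
target settings as the original run:
+
+```shell
+java -jar ./cli/build/libs/iceberg-catalog-migrator-cli-0.0.1-SNAPSHOT.jar migrate \
+--source-catalog-type REST \
+--source-catalog-properties uri=http://sourcecatalog:8181/api/catalog,warehouse=test,token=$TOKEN_SOURCE \
+--target-catalog-type REST \
+--target-catalog-properties uri=http://targetcatalog:8181/api/catalog,warehouse=test,token=$TOKEN_TARGET \
+--identifiers-from-file failed_identifiers.txt
+```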
+
+## Manually Aborting the Migration
+If a migration was manually aborted:
+1. Determine the number of migrated tables. A user can either review the log
or use the `listTables()` function in the target catalog (see the sketch after
this list).
+2. Migrated tables may not be deleted from the source catalog. Users should
avoid manipulating these tables in the source catalog.
+3. To recover, the user can manually remove these tables from the source
catalog or attempt a bulk migration to transfer all tables from the source
catalog. Please note that this may result in several `TableAlreadyExists`
exceptions, as many of the tables may have already been migrated.
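+
+A quick way to list tables in a REST-based target catalog such as Polaris,
assuming the Iceberg REST catalog spec's list-tables route, a catalog named
`test`, and a namespace named `foo`:
+
+```shell
+# Path, catalog name, and namespace are assumptions for illustration.
+curl -s -H "Authorization: Bearer $TOKEN_TARGET" \
+  "http://targetcatalog:8181/api/catalog/v1/test/namespaces/foo/tables"
+```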