jackye1995 opened a new pull request #2688: URL: https://github.com/apache/iceberg/pull/2688
Add a DynamoDB catalog implementation with the following specification:

1. `identifier` column (partition key): table identifier string, or `NAMESPACE` for namespaces
2. `namespace` column (sort key): namespace string
3. a global secondary index (GSI) with `namespace` as partition key and `identifier` as sort key
4. `version` column: UUID string, used for optimistic locking
5. `updated_at` column: timestamp long, records the latest update time
6. `created_at` column: timestamp long, records the initial create time
7. `p.[property_key]` columns: string, used to store properties (namespace properties, or Iceberg-defined table properties including `table_type`, `metadata_location` and `previous_metadata_location`)

This design has the following benefits:

1. the table identifier is used directly as the partition key, which avoids potential hot partitions compared to using the namespace as partition key and the table name as sort key
2. namespace operations are clustered in a single partition, so they do not affect table commit operations
3. a reverse GSI is used for the list-tables operation; all other operations are single-row operations or single-partition queries
4. a string UUID `version` field is used for locking instead of `updated_at`, so two processes committing in the same millisecond cannot be mistaken for the same version
5. a multi-row transaction is used for `renameTable` to ensure idempotency
6. storage per row and update overhead are minimized by flattening properties into columns with a `p.` prefix, instead of placing them in a single nested map-type column

Limitations:

1. To avoid complications in parsing namespaces, dot (`.`) is not allowed in any level of a namespace
2. Similarly, to avoid complications in parsing table identifiers, dot is not allowed in table names

@yyanyy @rdblue @SreeramGarlapati @johnclara @danielcweeks
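For reviewers who want to see the row layout and the version check in one place, here is a minimal in-memory sketch. It is not the code in this PR and it does not call DynamoDB: a `ConcurrentHashMap` stands in for the table, and the class name `DynamoCatalogSketch`, its methods, the `identifier|namespace` map key, the dot-joined encoding of namespace levels, and the `s3://bucket/...` paths are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory illustration of the row layout and UUID-based optimistic locking. */
class DynamoCatalogSketch {
  // Hypothetical stand-in for the DynamoDB table, keyed by "identifier|namespace".
  private final Map<String, Map<String, String>> rows = new ConcurrentHashMap<>();

  /** Joins namespace levels with dots (assumed encoding); rejects dots inside a level. */
  static String namespaceString(List<String> levels) {
    for (String level : levels) {
      if (level.contains(".")) {
        throw new IllegalArgumentException("dot is not allowed in namespace level: " + level);
      }
    }
    return String.join(".", levels);
  }

  /** Builds a table row: key columns, version, timestamps, and "p."-prefixed properties. */
  static Map<String, String> tableRow(List<String> levels, String table, Map<String, String> props) {
    if (table.contains(".")) {
      throw new IllegalArgumentException("dot is not allowed in table name: " + table);
    }
    String namespace = namespaceString(levels);
    String now = Long.toString(System.currentTimeMillis());
    Map<String, String> row = new LinkedHashMap<>();
    row.put("identifier", namespace + "." + table); // partition key
    row.put("namespace", namespace);                // sort key
    row.put("version", UUID.randomUUID().toString());
    row.put("created_at", now);
    row.put("updated_at", now);
    props.forEach((key, value) -> row.put("p." + key, value)); // flattened properties
    return row;
  }

  /** Inserts a new row; returns false if a row with the same key already exists. */
  boolean create(Map<String, String> row) {
    return rows.putIfAbsent(row.get("identifier") + "|" + row.get("namespace"), row) == null;
  }

  /** Returns the stored row, or null if absent. */
  Map<String, String> load(String identifier, String namespace) {
    return rows.get(identifier + "|" + namespace);
  }

  /** Updates metadata_location only if the stored version still equals expectedVersion. */
  boolean commit(String identifier, String namespace, String expectedVersion, String newLocation) {
    String key = identifier + "|" + namespace;
    Map<String, String> current = rows.get(key);
    if (current == null || !expectedVersion.equals(current.get("version"))) {
      return false; // another writer committed first
    }
    Map<String, String> updated = new LinkedHashMap<>(current);
    updated.put("p.previous_metadata_location", current.get("p.metadata_location"));
    updated.put("p.metadata_location", newLocation);
    updated.put("version", UUID.randomUUID().toString());
    updated.put("updated_at", Long.toString(System.currentTimeMillis()));
    // Atomic compare-and-swap: succeeds only if the stored row still equals `current`.
    return rows.replace(key, current, updated);
  }

  public static void main(String[] args) {
    DynamoCatalogSketch catalog = new DynamoCatalogSketch();
    Map<String, String> row =
        tableRow(List.of("db"), "events", Map.of("metadata_location", "s3://bucket/v1.json"));
    String version = row.get("version");
    System.out.println(catalog.create(row));
    System.out.println(catalog.commit("db.events", "db", version, "s3://bucket/v2.json"));
    // Reusing the stale version simulates a second writer that lost the race.
    System.out.println(catalog.commit("db.events", "db", version, "s3://bucket/v3.json"));
  }
}
```

In the sketch, a commit that reuses a stale `version` is rejected because every successful commit writes a fresh UUID. Against a real DynamoDB table, the same check would presumably be expressed as a condition on the `version` attribute of the write rather than as an in-process compare-and-swap.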
