[ https://issues.apache.org/jira/browse/HUDI-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Raymond Xu updated HUDI-4626:
-----------------------------
Fix Version/s: 0.12.1
> Partitioning table by `_hoodie_partition_path` fails
> ----------------------------------------------------
>
> Key: HUDI-4626
> URL: https://issues.apache.org/jira/browse/HUDI-4626
> Project: Apache Hudi
> Issue Type: Bug
> Affects Versions: 0.12.0
> Reporter: Alexey Kudinkin
> Priority: Blocker
> Fix For: 0.12.1
>
>
>
> Currently, creating a table partitioned by "_hoodie_partition_path" using the
> Glue catalog fails with the following exception:
> {code:java}
> AnalysisException: Found duplicate column(s) in the data schema and the
> partition schema: _hoodie_partition_path
> {code}
> Using the following DDL:
> {code:java}
> CREATE EXTERNAL TABLE `active_storage_attachments`(
>   `_hoodie_commit_time` string COMMENT '',
>   `_hoodie_commit_seqno` string COMMENT '',
>   `_hoodie_record_key` string COMMENT '',
>   `_hoodie_file_name` string COMMENT '',
>   `_change_operation_type` string COMMENT '',
>   `_upstream_event_processed_ts_ms` bigint COMMENT '',
>   `db_shard_source_partition` string COMMENT '',
>   `_event_origin_ts_ms` bigint COMMENT '',
>   `_event_tx_id` bigint COMMENT '',
>   `_event_lsn` bigint COMMENT '',
>   `_event_xmin` bigint COMMENT '',
>   `id` bigint COMMENT '',
>   `name` string COMMENT '',
>   `record_type` string COMMENT '',
>   `record_id` bigint COMMENT '',
>   `blob_id` bigint COMMENT '',
>   `created_at` timestamp COMMENT '')
> PARTITIONED BY ( `_hoodie_partition_path` string COMMENT '')
> ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
> WITH SERDEPROPERTIES (
>   'hoodie.query.as.ro.table'='false',
>   'path'='...')
> STORED AS INPUTFORMAT 'org.apache.hudi.hadoop.HoodieParquetInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
> LOCATION '...'
> TBLPROPERTIES ( 'spark.sql.sources.provider'='hudi' )
> {code}
>
>
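
The exception above names a column that appears in both the data schema and the partition schema. The following Python sketch only illustrates that kind of overlap check; it is not Spark's or Hudi's actual code. The helper name find_duplicate_columns, the case-insensitive comparison, and the abbreviated column list are assumptions made for illustration. It also assumes that the schema read from the Hudi data files contains the `_hoodie_partition_path` meta column, which the issue itself does not state.

```python
def find_duplicate_columns(data_columns, partition_columns, case_sensitive=False):
    """Return the partition column names that also occur in the data schema.

    Hypothetical helper for illustration only, not a Spark or Hudi API.
    """
    normalize = (lambda name: name) if case_sensitive else str.lower
    data_names = {normalize(name) for name in data_columns}
    return [name for name in partition_columns if normalize(name) in data_names]


# Assumption: the schema inferred from the data files includes all Hudi meta
# columns, among them `_hoodie_partition_path` (list abbreviated here).
file_schema_columns = [
    "_hoodie_commit_time",
    "_hoodie_commit_seqno",
    "_hoodie_record_key",
    "_hoodie_partition_path",
    "_hoodie_file_name",
    "id",
    "name",
]

# Partition column as declared in the DDL from the issue.
ddl_partition_columns = ["_hoodie_partition_path"]

duplicates = find_duplicate_columns(file_schema_columns, ddl_partition_columns)
if duplicates:
    print(
        "Found duplicate column(s) in the data schema and the "
        "partition schema: " + ", ".join(duplicates)
    )
```

Under these assumptions the check reports `_hoodie_partition_path`, the same column the exception names.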
--
This message was sent by Atlassian Jira
(v8.20.10#820010)