[
https://issues.apache.org/jira/browse/HUDI-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ethan Guo updated HUDI-3319:
----------------------------
Description:
The goal is to set up a long-running ingestion pipeline writing to a Hudi table
with the metadata table enabled in a cluster environment with Spark. A few
things to consider:
* Long-running for a few days
* Different table types: COW and MOR
* Aggressive configs around compaction, archival, and the cleaner to hit
possible concurrency cases (see the writer config sketch after this list)
* Multi-writer: one writer for continuous ingestion, another writer performing
periodic backfills / async table services
* Data validation: make sure both the data table and the metadata table are
intact and contain the expected data (see the validation sketch after this list)
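As a starting point, a minimal sketch of one such writer is below, assuming a
Spark datasource write. The table path, record key / precombine / partition
field names, ZooKeeper endpoint, and the specific threshold values are
placeholders, not the final test configuration:
{code:scala}
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder().appName("hudi-metadata-stress").getOrCreate()
val basePath = "/tmp/hudi/metadata_stress_test"  // placeholder table path

// Placeholder batch; the real pipeline would ingest continuously.
val inputDf = spark.read.json("/tmp/input/batch.json")

inputDf.write.format("hudi").
  option("hoodie.table.name", "metadata_stress_test").
  option("hoodie.datasource.write.table.type", "MERGE_ON_READ").
  option("hoodie.datasource.write.recordkey.field", "uuid").
  option("hoodie.datasource.write.precombine.field", "ts").
  option("hoodie.datasource.write.partitionpath.field", "partition").
  // Metadata table under test
  option("hoodie.metadata.enable", "true").
  // Aggressive table services so compaction/cleaning/archival run often
  option("hoodie.compact.inline", "true").
  option("hoodie.compact.inline.max.delta.commits", "2").
  option("hoodie.cleaner.commits.retained", "4").
  option("hoodie.keep.min.commits", "5").
  option("hoodie.keep.max.commits", "6").
  // Multi-writer: optimistic concurrency with a ZooKeeper lock provider
  option("hoodie.write.concurrency.mode", "optimistic_concurrency_control").
  option("hoodie.cleaner.policy.failed.writes", "LAZY").
  option("hoodie.write.lock.provider",
    "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider").
  option("hoodie.write.lock.zookeeper.url", "zk-host").
  option("hoodie.write.lock.zookeeper.port", "2181").
  option("hoodie.write.lock.zookeeper.lock_key", "metadata_stress_test").
  option("hoodie.write.lock.zookeeper.base_path", "/hudi/locks").
  mode(SaveMode.Append).
  save(basePath)
{code}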
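For the validation item, one coarse end-to-end check (a sketch only; the table
path is the same placeholder as above) is to read the table twice, once listing
files through the metadata table and once through direct file system listing,
and compare the results:
{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("hudi-mdt-validation").getOrCreate()
val basePath = "/tmp/hudi/metadata_stress_test"  // placeholder table path

// Read once through the metadata table and once through direct FS listing.
val viaMetadata = spark.read.format("hudi").
  option("hoodie.metadata.enable", "true").
  load(basePath)
val viaFsListing = spark.read.format("hudi").
  option("hoodie.metadata.enable", "false").
  load(basePath)

// Coarse check: row counts must match; a stricter check could diff record keys.
assert(viaMetadata.count() == viaFsListing.count())
{code}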
> Prepare metadata table testing environment in cluster
> -----------------------------------------------------
>
> Key: HUDI-3319
> URL: https://issues.apache.org/jira/browse/HUDI-3319
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Ethan Guo
> Assignee: Yue Zhang
> Priority: Blocker
> Fix For: 0.11.0
>
>
> The goal is to set up a long-running ingestion pipeline writing to a Hudi
> table with the metadata table enabled in a cluster environment with Spark. A
> few things to consider:
> * Long-running for a few days
> * Different table types: COW and MOR
> * Aggressive configs around compaction, archival, and the cleaner to hit
> possible concurrency cases
> * Multi-writer: one writer for continuous ingestion, another writer performing
> periodic backfills / async table services
> * Data validation: make sure both the data table and the metadata table are
> intact and contain the expected data