Re: RichCdcSinkBuilder with Iceberg catalog?

Andrew Otto Fri, 19 Jul 2024 11:25:38 -0700

TIL about XTable.  Cool!


On Fri, Jul 19, 2024 at 2:11 PM Kyle Weller <k...@onehouse.ai> wrote:

> I wonder if Apache XTable <https://xtable.apache.org/> is also a
> viable option to consider? Data could still be written and stored natively
> as Paimon, with the Iceberg manifest files generated asynchronously and
> synced to an Iceberg catalog. It is working great between Iceberg, Hudi, and
> Delta in production today. There may be some code in that project to leverage,
> or adding a Paimon XTable interface would automatically unlock omnidirectional
> translation to all 4 table formats, versus a one-by-one integration.
>
> On Fri, Jul 19, 2024 at 8:41 AM Andrew Otto <o...@wikimedia.org> wrote:
>
>> > Another approach is to create a snapshot compatible way for Paimon
>> > to generate Iceberg, which is what we are working on.
>>
>> Hi, just checking in!  How is this going? Thanks!
>>
>> On Mon, Jun 10, 2024 at 9:17 AM Andrew Otto <o...@wikimedia.org> wrote:
>>
>>> Awesome, I look forward to it!  Thank you!
>>>
>>> On Mon, Jun 10, 2024 at 2:35 AM Jingsong Li <jingsongl...@gmail.com>
>>> wrote:
>>>
>>>> We are developing a prototype internally.
>>>>
>>>> It will take about 2 to 3 months.
>>>>
>>>> On Wed, May 29, 2024 at 9:46 PM Andrew Otto <o...@wikimedia.org> wrote:
>>>>
>>>>> > Another approach is to create a snapshot compatible way for Paimon
>>>>> > to generate Iceberg, which is what we are working on.
>>>>>
>>>>> Oh!  Very interesting.  Can you say more? And/or do you have links to
>>>>> Jira or anything?
>>>>>
>>>>> Thanks for your response! :)
>>>>>
>>>>>
>>>>>
>>>>> On Wed, May 29, 2024 at 7:41 AM Jingsong Li <jingsongl...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Andrew,
>>>>>>
>>>>>> It is difficult to move this mechanism to the Iceberg sink. The table
>>>>>> structure change in Iceberg's design requires generating a new
>>>>>> snapshot, which poses significant challenges to schema evolution.
>>>>>>
>>>>>> Another approach is to create a snapshot compatible way for Paimon to
>>>>>> generate Iceberg, which is what we are working on.
>>>>>>
>>>>>> Best,
>>>>>> Jingsong
>>>>>>
>>>>>> On Fri, May 24, 2024 at 8:11 PM Andrew Otto <o...@wikimedia.org>
>>>>>> wrote:
>>>>>> >
>>>>>> > Hi!
>>>>>> >
>>>>>> > How coupled to Paimon catalogs and tables is the cdc part of
>>>>>> > Paimon?  RichCdcMultiplexRecord and related code seem incredibly useful
>>>>>> > even outside of the context of the Paimon table format.
>>>>>> >
>>>>>> > I'm asking because the database sync action feature is amazing.  At
>>>>>> > the Wikimedia Foundation, we are on an all-in journey with Iceberg.  I'm
>>>>>> > wondering how hard it would be to extract the CDC logic from Paimon and
>>>>>> > abstract the Sink bits.
>>>>>> >
>>>>>> > Could the table/database sync with schema evolution (without Flink
>>>>>> > job restarts!) potentially work with the Iceberg sink?
>>>>>> >
>>>>>> > Thanks!
>>>>>> > -Andrew Otto
>>>>>> >  Wikimedia Foundation
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>>
>>>>>
