Re: [DISCUSS] PIP-18: Introduce clone Procedure

Aitozi Tue, 19 Mar 2024 08:24:58 -0700

Hi Jingsong,

    Cloning a table is a useful feature. I have two questions here.


1. Is the `clone` procedure equivalent to the following process?
       - Create the target table in the target catalog.
       - INSERT INTO target_table SELECT * FROM source_table /*+
OPTIONS('scan.tag-name' = 'tag-1') */;
2. What do you mean by

> "The target table may need to synchronize Hive metadata, which means
using HiveCatalog, which cannot be solved by copying files."

Does it mean cloning a Paimon table to a Hive table?

If we clone a Paimon table to another Paimon table, can we use the file
copy solution?
I guess it may be useful when we have to move data between clusters.
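
To make question 1 concrete, here is a rough Flink SQL sketch of the two
steps. The catalog and database names (`source_catalog`, `target_catalog`,
`db`) are placeholders I made up, and I am assuming `CREATE TABLE ... LIKE`
can resolve a table in another catalog; if it cannot, the target table would
need to be created with an explicit schema instead.

```sql
-- Step 1: create the target table in the target catalog.
-- Assumption: LIKE can copy the schema from a table in another catalog.
CREATE TABLE target_catalog.db.target_table
    LIKE source_catalog.db.source_table;

-- Step 2: copy the data as of tag 'tag-1' into the target table.
INSERT INTO target_catalog.db.target_table
SELECT * FROM source_catalog.db.source_table
    /*+ OPTIONS('scan.tag-name' = 'tag-1') */;
```

This rewrites every row through a job, whereas a file-level clone would only
copy the existing files, which is why I am asking whether the two are meant
to be equivalent.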

Best,
Aitozi

On Mon, Mar 18, 2024 at 13:30, Jingsong Li <[email protected]> wrote:

> Hi devs,
>
> I have heard many times that there is a need to copy the entire table,
> and my advice to them is often to use file system file copying.
>
> But there are a few issues:
> 1. It is necessary to copy a large number of files, and it is likely
> that some files will be deleted due to ongoing work, resulting in
> copying failure.
> 2. The target table may need to synchronize Hive metadata, which means
> using HiveCatalog, which cannot be solved by copying files.
>
> So I suggest we have a clone procedure. [1]
>
> Also, welcome contributors to develop this PIP together, and I will
> help you review your code.
>
> [1]
> https://cwiki.apache.org/confluence/display/PAIMON/PIP-18%3A+Introduce+clone+Procedure
>
> Best,
> Jingsong
>
