Hi Jingsong,
Cloning a table is a useful feature. I have two questions.
1. Is the `clone` procedure equivalent to the following process?
- Create the target table in the target catalog
- INSERT INTO target_table SELECT * FROM source_table /*+
OPTIONS('scan.tag-name' = 'tag-1') */;
2. What do you mean by
> "The target table may need to synchronize Hive metadata, which means
using HiveCatalog, which cannot be solved by copying files."
Does this mean cloning a Paimon table to a Hive table?
If we clone a Paimon table to another Paimon table, can we use the
file copy solution?
I guess that would be useful when we have to move data between clusters.
Best,
Atiozi
Jingsong Li <[email protected]> wrote on Mon, Mar 18, 2024 at 13:30:
> Hi devs,
>
> I have heard many times that there is a need to copy the entire table,
> and my advice to them is often to use file system file copying.
>
> But there are a few issues:
> 1. It is necessary to copy a large number of files, and it is likely
> that some files will be deleted due to ongoing work, resulting in
> copying failure.
> 2. The target table may need to synchronize Hive metadata, which means
> using HiveCatalog, which cannot be solved by copying files.
>
> So I suggest we have a clone procedure. [1]
>
> Contributors are also welcome to develop this PIP together, and I will
> help review your code.
>
> [1]
> https://cwiki.apache.org/confluence/display/PAIMON/PIP-18%3A+Introduce+clone+Procedure
>
> Best,
> Jingsong
>