> I want to know that if in the clone procedure, the specified snapshot or tag is being deleted, how do we handle the exception? Should we stop the procedure and clean the temporary target table directory?
- For cloning a specified snapshot or tag, the operation should undoubtedly be rolled back (deleting the copied files) and an exception should be thrown.
- For cloning all snapshots and tags, we should ignore deleted files so that the clone keeps working. This avoids conflicts with snapshot expiration and file deletion in streaming write jobs.

Best,
Jingsong

On Wed, Apr 3, 2024 at 3:08 PM yu zelin <[email protected]> wrote:
>
> Hi Jingsong,
>
> I want to know that if in the clone procedure, the specified snapshot or
> tag is being deleted, how do we handle the exception?
> Should we stop the procedure and clean the temporary target table directory?
>
> Best regards,
> Zelin Yu
>
> On Mon, Mar 18, 2024 at 1:30 PM Jingsong Li <[email protected]> wrote:
>
> > Hi devs,
> >
> > I have heard many times that there is a need to copy the entire table,
> > and my advice to them is often to use file system file copying.
> >
> > But there are a few issues:
> > 1. It is necessary to copy a large number of files, and it is likely
> > that some files will be deleted due to ongoing work, resulting in
> > copying failure.
> > 2. The target table may need to synchronize Hive metadata, which means
> > using HiveCatalog, which cannot be solved by copying files.
> >
> > So I suggest we have a clone procedure. [1]
> >
> > Also, welcome contributors to develop this PIP together, and I will
> > help you review your code.
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/PAIMON/PIP-18%3A+Introduce+clone+Procedure
> >
> > Best,
> > Jingsong
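
The two policies described in the reply at the top of this message (roll back and throw when cloning a specified snapshot or tag, skip deleted files when cloning everything) can be illustrated with a minimal sketch. This is not Paimon code: it is plain Python over local files, and the names `clone_files`, `ignore_missing`, and `CloneSourceMissingError` are hypothetical, chosen only for illustration.

```python
import os
import shutil
import tempfile


class CloneSourceMissingError(Exception):
    """Hypothetical error: a file needed by the specified snapshot/tag vanished."""


def clone_files(files, src_dir, dst_dir, ignore_missing):
    """Copy `files` (paths relative to src_dir) into dst_dir.

    ignore_missing=False models cloning one specified snapshot or tag:
    a missing source file rolls back everything copied so far and raises.
    ignore_missing=True models cloning all snapshots and tags:
    a missing source file is skipped so the clone keeps going.
    Returns the list of relative paths that were copied.
    """
    copied = []
    for rel in files:
        src = os.path.join(src_dir, rel)
        dst = os.path.join(dst_dir, rel)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        try:
            shutil.copyfile(src, dst)
        except FileNotFoundError as exc:
            if ignore_missing:
                # The file was deleted concurrently (e.g. by snapshot expiration); skip it.
                continue
            # Rollback: delete every file this call has already copied.
            for done in copied:
                os.remove(os.path.join(dst_dir, done))
            raise CloneSourceMissingError(rel) from exc
        copied.append(rel)
    return copied
```

In this sketch, rollback removes only the files copied by the current call; a real procedure might instead remove the whole temporary target directory, which is the alternative raised in the question.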
