Thanks Yong for your feedback.

Sounds good~

Best,
Jingsong

On Thu, Jun 27, 2024 at 12:24 PM Yong Fang <[email protected]> wrote:
>
> I reviewed my original proposal and found that 'replaceBranch' is mainly an
> optimization for operations between Paimon branches and Flink jobs. It can be
> replaced with: stop job -> merge branch -> restart job.
>
> Relatively speaking, this is a low-frequency operation, so we can remove it
> for now and consider adding it back in an appropriate place when needed in
> the future, without adding additional IO. WDYT?
>
> On Wednesday, June 26, 2024, Jingsong Li <[email protected]> wrote:
>
> > Hi Shammon,
> >
> > After some implementation, I discovered an issue:
> >
> > replace_branch incurs an expensive IO overhead for most operations in
> > the normal code path. For HDFS, it is a NameNode access, and for
> > object storage, it is a separately billed request.
> >
> > This is difficult to accept, and if replace_branch is not that useful, I
> > suggest removing this operation.
> >
> > If we remove replace_branch, can we consider renaming merge_branch,
> > for example to fast_forward, which seems closer to its original
> > meaning?
> >
> > Best,
> > Jingsong
> >
> > On Fri, Sep 29, 2023 at 2:25 AM Jingsong Li <[email protected]> wrote:
> > >
> > > Thanks Shammon for driving.
> > >
> > > Sounds good to me to start a voting process.
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Mon, Sep 25, 2023 at 7:14 PM Shammon FY <[email protected]> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > Thanks for all the valuable feedback. If there are no more comments, I will
> > > > start a vote for this PIP in the next 2 days.
> > > >
> > > > Best,
> > > > Shammon FY
> > > >
> > > >
> > > > On Thu, Sep 21, 2023 at 5:19 PM Shammon FY <[email protected]> wrote:
> > > >
> > > > > The feature `Replace Main With Branch` is used in duplicate data
> > > > > correction without modifying jobs. For example:
> > > > >
> > > > > 1. We can create branches with the same name for a series of Paimon tables
> > > > > 2. Re-submit all streaming jobs to read and write these branches for the tables
> > > > > 3. After the data in the branch has caught up with main, we can stop all the
> > > > > jobs which read and write the main branch
> > > > > 4. Replace the main branch with the created branch; we don't need to do
> > > > > anything with the jobs that read and write the specified branch
> > > > >
> > > > > We cannot `Merge Branch To Main` here because the correct jobs will still
> > > > > read and write the branches, which will be completely independent of main.
> > > > >
> > > > > Best,
> > > > > Shammon FY
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Sep 21, 2023 at 12:21 AM Jingsong Li <[email protected]> wrote:
> > > > >
> > > > >> Can you explain more about "Replace Main With Branch"?
> > > > >>
> > > > >> Does this need to be implemented?
> > > > >>
> > > > >> Best,
> > > > >> Jingsong
> > > > >>
> > > > >> On Tue, Sep 19, 2023 at 2:17 PM Shammon FY <[email protected]> wrote:
> > > > >> >
> > > > >> > Hi ConradJam,
> > > > >> >
> > > > >> > How to handle data conflicts between the main branch and other branches is a
> > > > >> > complex problem. At present, we would like to replace the data in main with
> > > > >> > the branch directly. You can think of it this way: during merge and replace
> > > > >> > operations, the data after the specified tag in the main branch will be
> > > > >> > deleted, and then the data after the tag in the branch will be used in main.
> > > > >> >
> > > > >> > We can consider "merging" conflicting data in the future when we meet
> > > > >> > these requirements.
> > > > >> >
> > > > >> > Best,
> > > > >> > Shammon FY
> > > > >> >
> > > > >> > On Tue, Sep 19, 2023 at 10:50 AM ConradJam <[email protected]> wrote:
> > > > >> >
> > > > >> > > +1 This feature looks a bit like Git’s branch management. If this is really
> > > > >> > > the case, how do we resolve data conflicts when merging branches? Do we
> > > > >> > > need the user to specify which branch's data should prevail?
> > > > >> > >
> > > > >> > > On Mon, Sep 18, 2023 at 20:06, Shammon FY <[email protected]> wrote:
> > > > >> > >
> > > > >> > > > Hi Jingsong,
> > > > >> > > >
> > > > >> > > > I have updated PIP-9 to explain that the main `Snapshot`, `Schema` and
> > > > >> > > > `Tag` will exist in the base directory by default, the same as in the
> > > > >> > > > current directory structure. Thanks
> > > > >> > > >
> > > > >> > > > Best,
> > > > >> > > > Shammon FY
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > On Fri, Sep 15, 2023 at 10:32 AM Shammon FY <[email protected]> wrote:
> > > > >> > > >
> > > > >> > > > > Hi Jingsong,
> > > > >> > > > >
> > > > >> > > > > Thanks for your suggestion, it sounds good to me. Currently I have only
> > > > >> > > > > mentioned it in the `Compatibility` section; I'll update the PIP to explain
> > > > >> > > > > this more clearly.
> > > > >> > > > >
> > > > >> > > > > Best,
> > > > >> > > > > Shammon FY
> > > > >> > > > >
> > > > >> > > > > On Wed, Sep 13, 2023 at 12:26 PM Jingsong Li <[email protected]> wrote:
> > > > >> > > > >
> > > > >> > > > >> Thanks Shammon for the proposal!
> > > > >> > > > >>
> > > > >> > > > >> It looks very good!
> > > > >> > > > >>
> > > > >> > > > >> I don't get the main branch file.
> > > > >> > > > >>
> > > > >> > > > >> Can we keep the main branch as it is? Just put snapshot/ tag/ schema/
> > > > >> > > > >> in the table root directory.
> > > > >> > > > >>
> > > > >> > > > >> Best,
> > > > >> > > > >> Jingsong
> > > > >> > > > >>
> > > > >> > > > >> On Tue, Sep 12, 2023 at 3:55 PM Shammon FY <[email protected]> wrote:
> > > > >> > > > >> >
> > > > >> > > > >> > Hi devs,
> > > > >> > > > >> >
> > > > >> > > > >> > I would like to start a discussion about PIP-9: Support Branch [1]. Branch
> > > > >> > > > >> > in Paimon will help us deal with data correction without copying all data
> > > > >> > > > >> > from original tables, and it can also enhance Tag for Paimon like
> > > > >> > > > >> > traditional Hive partition tables, providing data correction capabilities
> > > > >> > > > >> > on the basis of Tag.
> > > > >> > > > >> >
> > > > >> > > > >> > Looking forward to your feedback, thanks!
> > > > >> > > > >> >
> > > > >> > > > >> >
> > > > >> > > > >> > [1] https://cwiki.apache.org/confluence/display/PAIMON/PIP-9%3A+Support+Branch
> > > > >> > > > >> >
> > > > >> > > > >> > Best,
> > > > >> > > > >> > Shammon FY
> > > > >> > > > >>
> > > > >> > > > >
> > > > >> > > >
> > > > >> > >
> > > > >> > >
> > > > >> > > --
> > > > >> > > Best
> > > > >> > >
> > > > >> > > ConradJam
> > > > >> > >
> > > > >>
> > > > >
> >
