I'm fine starting a branch later if we do run into those issues, but I
don't think it is a good idea to do it now in anticipation. All of the work
that we can do on master we should try to do on master. We can start a
branch when we need one.

On Mon, Mar 30, 2020 at 7:44 PM OpenInx <open...@gmail.com> wrote:

> Hi Ryan
>
> The reason I suggest opening a new dev branch for row-delete development
> is that we will split the whole feature into many small issues, each with
> a pull request of a reasonable size, so that contributors and reviewers
> can discuss one point at a time and iterate on this feature faster.
> During implementation, we will ensure that v1 works after every separate
> PR, but master may not be ready for cutting a release. For example, when
> we release 0.8.0, I'm sure we won't want the release to contain only part
> of the v2 spec (such as providing the sequence_number but no file_type).
> The Spark reader/writer and data/delete manifests may also need some
> refactoring, which could be split into several PRs.
> Splitting the work into multiple pull requests may block the release of the
> new version for a certain period of time, which is not what we want to see.
>
> About maintaining the new branch: in my experience we could rebase the new
> branch onto master periodically (such as every three days), so that each
> new pull request for row-delete is based on the newest changes. This should
> work as long as master does not receive too many new changes, which is in
> line with our current situation.
>
> In this case, I weighed the maintenance costs of the new branch against
> the delay of the row-delete. I think we should let the
> row-delete go a little faster (almost all community users are looking
> forward to this feature), and I think the current maintenance
> cost is acceptable.
>
> Thanks
>
> On Tue, Mar 31, 2020 at 5:52 AM Ryan Blue <rb...@netflix.com.invalid>
> wrote:
>
>> Sorry, I didn't address the suggestion to add a Flink branch as well. The
>> work needed for the Flink sink is to remove parts that are specific to
>> Netflix, so I'm not sure what the rationale for a branch would be. Is there
>> a reason why this can't be done in master, but requires a shared branch? If
>> multiple people want to contribute, why not contribute to the same PR?
>>
>> A shared PR branch makes the most sense to me for this because it is
>> regularly tested against master.
>>
>> On Mon, Mar 30, 2020 at 2:48 PM Ryan Blue <rb...@netflix.com> wrote:
>>
>>> I think we may eventually want a branch, but I think it is too
>>> early to create one now.
>>>
>>> Branches are expensive. They require maintenance to stay in sync with
>>> master, usually copying changes from master into the branch with updates.
>>> Updating the changes to master for the branch is more difficult because it
>>> is usually not the original contributor or reviewer porting them. And it is
>>> better to catch problems between changes in master and the branch early.
>>>
>>> I'm not against branches, but I don't want to create them unless they
>>> are valuable. In this case, I don't see the value. We plan to add v2 in
>>> parallel so you can still write v1 tables for compatibility, and most of
>>> the work that needs to be done -- like creating readers and writers for
>>> diff formats -- can be done in master.
>>>
>>> rb
>>>
>>> On Mon, Mar 30, 2020 at 9:00 AM Gautam <gautamkows...@gmail.com> wrote:
>>>
>>>> Thanks for bringing this up OpenInx.  That's a great idea: to open a
>>>> separate branch for row-level deletes.
>>>>
>>>> I would like to help support/contribute/review this as well. If there
>>>> are sub-tasks you guys have identified that can be added to
>>>> https://github.com/apache/incubator-iceberg/milestone/4 we can start
>>>> taking those up too.
>>>>
>>>> thanks for the good work,
>>>> - Gautam.
>>>>
>>>>
>>>>
>>>> On Mon, Mar 30, 2020 at 8:39 AM Junjie Chen <chenjunjied...@gmail.com>
>>>> wrote:
>>>>
>>>>> +1 to create the branch. Some row-level delete subtasks must be based
>>>>> on the sequence number as well as end to end tests.
>>>>>
>>>>> On Fri, Mar 27, 2020 at 4:42 PM OpenInx <open...@gmail.com> wrote:
>>>>>
>>>>>> Dear Dev:
>>>>>>
>>>>>>      On Tuesday we had a sync meeting and discussed the following:
>>>>>>          1. cutting the 0.8.0 release;
>>>>>>          2. the Flink connector;
>>>>>>          3. Iceberg row-level deletes;
>>>>>>          4. Map-Reduce formats and Hive support.
>>>>>>
>>>>>>      We'll release version 0.8.0 around April 15, and the following
>>>>>>      0.9.0 will be released in the next few months. On the other hand,
>>>>>>      Ryan, Junjie Chen and I have done three PoC versions of the
>>>>>>      row-level deletes. We had a full discussion[4] and started on the
>>>>>>      relevant code design. We're sure that the feature will introduce
>>>>>>      some incompatible specification changes, such as the
>>>>>>      sequence_number spec[1] and the file_type spec[2]; the sortedOrder
>>>>>>      feature also seems to be a breaking change[3].
>>>>>>
>>>>>>      To avoid affecting the release of version 0.8.0 while still pushing
>>>>>>      the row-delete feature forward early, I suggest opening a new branch
>>>>>>      for the row-delete feature, named branch-1. Once the row-delete
>>>>>>      feature is stable, we could release 1.0.0. Or we can just open a
>>>>>>      row-delete feature branch and, once the work is done, merge it back
>>>>>>      to the master branch and continue on to the 0.9.0 release.
>>>>>>
>>>>>>      I guess the Flink connector developers are facing the same problem?
>>>>>>
>>>>>>      What do you think about this?
>>>>>>
>>>>>>      Thank you.
>>>>>>
>>>>>>
>>>>>>   [1]. https://github.com/apache/incubator-iceberg/pull/588
>>>>>>   [2]. https://github.com/apache/incubator-iceberg/issues/824
>>>>>>   [3]. https://github.com/apache/incubator-iceberg/issues/317
>>>>>>   [4]. https://docs.google.com/document/d/1CPFun2uG-eXdJggqKcPsTdNa2wPMpAdw8loeP-0fm_M/edit?usp=sharing
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Best Regards
>>>>>
>>>>
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Netflix
>>>
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>

-- 
Ryan Blue
Software Engineer
Netflix
