I agree that a threshold on the mere number of changed files (20 in the PR) would be nice to avoid.
On the other hand, I think leaving this to be a reviewer's call is fine from my perspective. Any committer could flag a change as large enough to be discussed on the dev list of go through the improvement process. Cheers, Dmitri. On Fri, Aug 9, 2024 at 2:50 PM Walaa Eldin Moustafa <wa.moust...@gmail.com> wrote: > Thanks, I still have similar concerns as my original ones. > > There is some ambiguity in the first paragraph that might lead to > inconsistencies in how different committers handle potential conflicts. > Some might be overly cautious even when it’s unnecessary, while others > might proceed with reviews even in situations where there is a clear > conflict, but they don't recognize it as such. > > Also, I would avoid basing what is worth discussion on the number of > files. It is better to define criteria around public APIs/spec changes/REST > APIs, etc (I see some of those in the current doc, +1 to that). Would be > great to have a discussion around the specific criteria for changes > requiring a discussion, vote, etc from API/spec perspective. > > Also, it is not very clear when to use IIP vs voting vs discussion. I > think the last two paragraphs can benefit from more clarity on the > criteria. For example, if the committer were to follow a flowchart to test > whether a PR is mergeable, what would the flowchart look like? (Of course, > we do not have to depict a flowchart, but I am just conveying the style of > guidance that could potentially be more effective and clear). > > Thanks, > Walaa. > > > On Fri, Aug 9, 2024 at 8:05 AM Micah Kornfield <emkornfi...@gmail.com> > wrote: > >> As a summary, I think we are potentially at consensus. I think there are >> some concerns but not hard blockers: >> >> 1. Having a vote for every spec change might be too onerous (we can >> maybe change this policy once we have a sense of the overhead). >> 2. Some of the content is slightly repetitive with other documentation >> in ASF (I've tried to shorten this to minimal possible length with links). >> 3. Rules can be used to point fingers instead of having constructive >> conversations on how to improve process or use judgement. >> >> Please comment on the PR or reply here if we need more discussion. If >> there aren't other items I'll start a vote next week to merge the change. >> >> Thanks, >> Micah >> >> On Tue, Aug 6, 2024 at 12:18 PM Micah Kornfield <emkornfi...@gmail.com> >> wrote: >> >>> My only question is around "conflict of interest" while reviewing PRs. I >>>> think it needs further explanation and some concrete examples. >>> >>> >>> I was hesitant to add details here as I thought people could use >>> reasonable judgement and to my knowledge I don't think the project has had >>> to deal with PR that represented a conflict. Since this has come up a few >>> times I added the following text: "There is no strict definition of >>> conflict of interest. A conflict of interest is most likely to occur when >>> the author of a pull request and the reviewer share an employer, and the >>> change in question could be construed as providing preferential treatment >>> to their shared employer. " >>> >>> Hopefully, this is sufficient for a first draft and if there is a >>> necessity for more clarity the wording can be refined and more concrete >>> examples can be given if this starts to be a problem in the community >>> (happy to also refine this language on the PR if others want to reword or >>> add examples that I'm not aware of). >>> >>> Thans, >>> Micah >>> >>> On Tue, Aug 6, 2024 at 11:55 AM Anton Okolnychyi <aokolnyc...@gmail.com> >>> wrote: >>> >>>> The doc in its current form seems reasonable to me. I understand it >>>> reiterates some of the ASF guidelines but I don't mind that given that it >>>> is fairly concise. My only question is around "conflict of interest" while >>>> reviewing PRs. I think it needs further explanation and some concrete >>>> examples. >>>> >>>> - Anton >>>> >>>> нд, 4 серп. 2024 р. о 15:04 Micah Kornfield <emkornfi...@gmail.com> >>>> пише: >>>> >>>>> Thank you for the feedback. I've put some replies inline to specific >>>>> points below. >>>>> >>>>> >>>>>> "Request Changes" should only be used to literally block a PR from >>>>>> being >>>>>> merged - either for timing issues (e.g. "can only be merged right >>>>>> after >>>>>> a release is out, because it changes public docs") or because of >>>>>> really >>>>>> fundamental issues with the change, with a clear and precise >>>>>> justification. >>>>> >>>>> >>>>> I'm not sure I'm understanding this correctly, but it sounds like a >>>>> fundamentally different usage of how "request changes" is used in my >>>>> experience (which is typically, reviewer is done reviewing and there are >>>>> still outstanding issues to address). IIUC this actually is a different >>>>> usage as what is currently described in the PR? For the cases mentioned >>>>> here, especially the latter it seems like this should be specifically >>>>> mentioned on the PR (and it's not clear Request changes by itself is >>>>> sufficient) . >>>>> >>>>> "Large changes" (5 lines? 5000 lines?) is also rather subjective - >>>>>> proposal: "User facing functional and behavior changes". >>>>> >>>>> >>>>> I changed the Iceberg improvement proposal language to be functional >>>>> and behavior changes to the specification. I also added in a bullet point >>>>> which says changes affecting more than 20 files must be discussed on the >>>>> mailing list (20 was chosen arbitrarily but seems to be higher than any >>>>> file count observed in sampling ~30 PRs that are currently open in the >>>>> repo). >>>>> >>>>> [1] A few paragraphs/statements that delegate to the comitter's >>>>>> judgement call. (e.g., "Committer is trusted", "If committer feels" >>>>>> ..). So the value of adding them is not very clear to me. >>>>> >>>>> >>>>> [2] A few things that are implied (e.g., re-iterating ASF guidelines, >>>>>> lack of agreement will definitely lead to not merging the PR -- >>>>>> whether to discuss the topic as a proposal or not should have a >>>>>> separate process to state what qualifies as a proposal discussion). >>>>> >>>>> I would avoid [1] and [2] to avoid repetition or too much reading of >>>>>> those rules, especially when they are subjective (e.g., [1]) or >>>>>> implied (e.g., [2]). >>>>> >>>>> >>>>> I've tried to remove as much language here as possible and provide >>>>> links to Apache documents. The value it provides is documenting norms so >>>>> new contributors and committers understand the responsibility of the roles >>>>> for doing day to day work. I imagine it will be much more common for >>>>> people >>>>> to read the contributing guide first and then possibly any ASF >>>>> documentation. Leaving things as implied by ASF guidelines works for >>>>> people already familiar with them, but not for newcomers (and the ASF >>>>> documentation is considerably more verbose than the current text). Also, >>>>> some things are not defined in ASF guidelines (I tried to clarify that >>>>> code >>>>> changes in Iceberg follow Review then Commit, since they require a second >>>>> reviewer). >>>>> >>>>> Last, despite ASF guidelines there have been some people expressing >>>>> the opinion on the mailing list that some reviews were blocked on specific >>>>> people having to review them, I think a reminder that this shouldn't be >>>>> the >>>>> case is useful (or if there are specific people that need to review some >>>>> changes we can add guidelines as necessary). >>>>> >>>>> What remains (e.g., [3] or the way to proceed if >>>>>> a committer feels something is worthy of a proposal-level discussion) >>>>>> fits more in a process that organizes what qualifies as a proposal vs >>>>>> code change etc (e.g., the rule should not be about "whether a >>>>>> committer feels that something is worth a proposal", but rather there >>>>>> should be a process that guides both the contributor and committer for >>>>>> what qualifies as proposal vs a direct PR). >>>>> >>>>> >>>>> If I am understanding correctly, this is saying more guidelines for >>>>> what should go through the improvement proposals should be established. >>>>> This sounds reasonable to me, but it also seems like it can be handled >>>>> independently in a follow-up discussion/pull request? >>>>> >>>>> Thanks, >>>>> Micah >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Sat, Aug 3, 2024 at 1:14 AM Robert Stupp <sn...@snazy.de> wrote: >>>>> >>>>>> I agree with Walaa's comments and appreciate Micah's intent to >>>>>> clarify >>>>>> things! >>>>>> >>>>>> Sticking to a few important rules, using a crisp phrasing, and >>>>>> leaving >>>>>> out the subjective, optional and "ASF implied" things, would be >>>>>> better. >>>>>> >>>>>> "Large changes" (5 lines? 5000 lines?) is also rather subjective - >>>>>> proposal: "User facing functional and behavior changes". >>>>>> >>>>>> I'm +1 on clarifying what "Request Changes" on a PR means. Some >>>>>> cultures >>>>>> can interpret this as a "push back", so I'd like to propose that >>>>>> "Request Changes" should only be used to literally block a PR from >>>>>> being >>>>>> merged - either for timing issues (e.g. "can only be merged right >>>>>> after >>>>>> a release is out, because it changes public docs") or because of >>>>>> really >>>>>> fundamental issues with the change, with a clear and precise >>>>>> justification. >>>>>> >>>>>> Robert >>>>>> >>>>>> On 02.08.24 21:06, Walaa Eldin Moustafa wrote: >>>>>> > My concern with this change (in its current form) is that it >>>>>> combines >>>>>> > (mixes?) three things: >>>>>> > >>>>>> > [1] A few paragraphs/statements that delegate to the comitter's >>>>>> > judgement call. (e.g., "Committer is trusted", "If committer feels" >>>>>> > ..). So the value of adding them is not very clear to me. >>>>>> > [2] A few things that are implied (e.g., re-iterating ASF >>>>>> guidelines, >>>>>> > lack of agreement will definitely lead to not merging the PR -- >>>>>> > whether to discuss the topic as a proposal or not should have a >>>>>> > separate process to state what qualifies as a proposal discussion). >>>>>> > [3] A few crisp rules, e.g, how to treat changes under spec and >>>>>> files >>>>>> > under the `format` directory. >>>>>> > >>>>>> > I would avoid [1] and [2] to avoid repetition or too much reading of >>>>>> > those rules, especially when they are subjective (e.g., [1]) or >>>>>> > implied (e.g., [2]). What remains (e.g., [3] or the way to proceed >>>>>> if >>>>>> > a committer feels something is worthy of a proposal-level >>>>>> discussion) >>>>>> > fits more in a process that organizes what qualifies as a proposal >>>>>> vs >>>>>> > code change etc (e.g., the rule should not be about "whether a >>>>>> > committer feels that something is worth a proposal", but rather >>>>>> there >>>>>> > should be a process that guides both the contributor and committer >>>>>> for >>>>>> > what qualifies as proposal vs a direct PR). >>>>>> > >>>>>> > Thanks, >>>>>> > Walaa. >>>>>> > >>>>>> > >>>>>> > On Fri, Aug 2, 2024 at 11:43 AM Micah Kornfield < >>>>>> emkornfi...@gmail.com> wrote: >>>>>> >> I just wanted to follow up on this. >>>>>> >> >>>>>> >> Ryan and Szehon unless you strongly object to the current >>>>>> formulation of the PR, I'd like to move forward with a vote to merge. >>>>>> There are already a fair number of people in the community that have >>>>>> reviewed and seemed to think the proposal is not unreasonable. For votes >>>>>> it seems like most of the recent threads on the matter have decided to >>>>>> err >>>>>> on the side of votes for changes to specs, as it reduces the judgement >>>>>> call >>>>>> of reviewers. Thoughts? >>>>>> >> >>>>>> >> Thanks, >>>>>> >> Micah >>>>>> >> >>>>>> >> On Tue, Jul 30, 2024 at 11:08 AM Micah Kornfield < >>>>>> emkornfi...@gmail.com> wrote: >>>>>> >>>> The problem I'm worried about is the tendency to misuse docs >>>>>> like this and become focused on it as a rule. People tend to apply >>>>>> written >>>>>> rules mechanically and I worry about people substituting a reading of >>>>>> this >>>>>> text for judgement. For example, a strict reading of "encouraged to ask >>>>>> another committer" means that it is optional. >>>>>> >>> >>>>>> >>> I agree with this concern which is why I tried to make everything >>>>>> optional. I hope if there ever needs to be a revision here it is framed >>>>>> as >>>>>> constructive and not a blame game on someone breaking the rules. In >>>>>> general, I think placing the reminder on trust and putting the project >>>>>> first are good reminders in either case. IMO the text is short enough >>>>>> that >>>>>> they should be included directly in the contributors guideline. >>>>>> >>> >>>>>> >>> >>>>>> >>>> To me, (2) is a bit new here and in the grey area for >>>>>> interpretation. I was thinking about this while reviewing >>>>>> https://github.com/apache/iceberg/pull/10793, which could be a >>>>>> category (2) and non-functional change but would need a full >>>>>> code-modification vote as per [Iceberg improvement >>>>>> proposal](#apache-iceberg-improvement-proposals). I can see both sides, >>>>>> to >>>>>> avoid a potential dispute/misunderstanding over the clarification, it >>>>>> would >>>>>> be nice to have a vote on the devlist. But it may also be yet another >>>>>> burden, when something can be more easily decided on the github >>>>>> discussion >>>>>> itself via approval by the relevant parties. >>>>>> >>> >>>>>> >>> The PR documents how I understood the requirement communicated to >>>>>> me about spec modifications. I don't recall seeing exactly where this >>>>>> requirement was agreed. so maybe we can revise it. IMO spec changes are >>>>>> important enough that it is worth explicit email to the mailing list >>>>>> notifying people of the proposed change. A vote might be too heavyweight >>>>>> though. Would updating the guideline to require a separate email to the >>>>>> mailing list for changes and requiring the change be open for at least 72 >>>>>> hours strike the right balance? >>>>>> >>> >>>>>> >>> Thanks, >>>>>> >>> Micah >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> On Mon, Jul 29, 2024 at 3:08 PM Szehon Ho < >>>>>> szehon.apa...@gmail.com> wrote: >>>>>> >>>> Typo , wrong link: >>>>>> >>>> (2) requiring full code-modification vote as per [Iceberg >>>>>> improvement proposal](#apache-iceberg-improvement-proposals) => full >>>>>> code-modification vote as per [code modification]( >>>>>> https://www.apache.org/foundation/voting.html#votes-on-code-modification >>>>>> ) >>>>>> >>>> >>>>>> >>>> On Mon, Jul 29, 2024 at 1:53 PM Szehon Ho < >>>>>> szehon.apa...@gmail.com> wrote: >>>>>> >>>>> Hi, >>>>>> >>>>> >>>>>> >>>>> Also if I read it correctly, I think this proposal imposes the >>>>>> following workflows in "spec" folders : >>>>>> >>>>> >>>>>> >>>>> Large and functional changes. These redirect to Iceberg >>>>>> improvement proposals, which ends in code-modification vote >>>>>> >>>>> bug-fixes or clarification, which is specified to require >>>>>> code-modification vote >>>>>> >>>>> grammar, spelling, minor formatting fix, not covered here (I >>>>>> guess it is like any normal code review?) >>>>>> >>>>> >>>>>> >>>>> To me, (2) is a bit new here and in the grey area for >>>>>> interpretation. I was thinking about this while reviewing >>>>>> https://github.com/apache/iceberg/pull/10793, which could be a >>>>>> category (2) and non-functional change but would need a full >>>>>> code-modification vote as per [Iceberg improvement >>>>>> proposal](#apache-iceberg-improvement-proposals). I can see both sides, >>>>>> to >>>>>> avoid a potential dispute/misunderstanding over the clarification, it >>>>>> would >>>>>> be nice to have a vote on the devlist. But it may also be yet another >>>>>> burden, when something can be more easily decided on the github >>>>>> discussion >>>>>> itself via approval by the relevant parties. So I think I would agree >>>>>> with >>>>>> Ryan in mentioning that significant (would maybe add "functional") spec >>>>>> change needs a vote on the dev list. >>>>>> >>>>> >>>>>> >>>>> Thanks >>>>>> >>>>> Szehon >>>>>> >>>>> >>>>>> >>>>> On Mon, Jul 29, 2024 at 1:16 PM Ryan Blue >>>>>> <b...@databricks.com.invalid> wrote: >>>>>> >>>>>> I think the proposed doc looks good, but I'm not sure that it >>>>>> is better to add this to our guidelines. >>>>>> >>>>>> >>>>>> >>>>>> On one hand the doc describes how ASF communities work in >>>>>> general: committers review and commit PRs and are expected to use good >>>>>> judgement, ask one another for help when necessary, and broaden the set >>>>>> of >>>>>> people in the discussion when there's a disagreement. I really appreciate >>>>>> that Micah called out that this is intentionally vague to emphasize >>>>>> committer judgement. >>>>>> >>>>>> >>>>>> >>>>>> The problem I'm worried about is the tendency to misuse docs >>>>>> like this and become focused on it as a rule. People tend to apply >>>>>> written >>>>>> rules mechanically and I worry about people substituting a reading of >>>>>> this >>>>>> text for judgement. For example, a strict reading of "encouraged to ask >>>>>> another committer" means that it is optional. >>>>>> >>>>>> >>>>>> >>>>>> Given that the majority of the content here is stating how ASF >>>>>> communities work and the only Iceberg-specific parts are the proposal >>>>>> process and calling out that we vote on spec changes, I would probably >>>>>> just >>>>>> have a description of how to handle proposals (which is already there) >>>>>> and >>>>>> a note that significant spec changes should use a vote on the dev list. >>>>>> >>>>>> >>>>>> >>>>>> On Sun, Jul 28, 2024 at 11:15 PM Jean-Baptiste Onofré < >>>>>> j...@nanthrax.net> wrote: >>>>>> >>>>>>> Hi Micah >>>>>> >>>>>>> >>>>>> >>>>>>> Thanks ! It looks good to me now you have included comments >>>>>> from everyone. >>>>>> >>>>>>> >>>>>> >>>>>>> Regards >>>>>> >>>>>>> JB >>>>>> >>>>>>> >>>>>> >>>>>>> On Fri, Jul 26, 2024 at 1:15 AM Micah Kornfield < >>>>>> emkornfi...@gmail.com> wrote: >>>>>> >>>>>>>> As part of the bylaws discussions that have been happening, >>>>>> we are trying to make small focused proposals to move things forward. >>>>>> As a >>>>>> first step towards this I created a proposal for guidelines on committing >>>>>> pull requests [1]. Feedback is appreciated. >>>>>> >>>>>>>> >>>>>> >>>>>>>> Given the level of interest in the discussions so far, it >>>>>> seems that the best path forward is to hold an official vote before >>>>>> merging. I intend to do this once we appear to have consensus but if the >>>>>> people prefer we can try to avoid the overhead. >>>>>> >>>>>>>> >>>>>> >>>>>>>> Thanks, >>>>>> >>>>>>>> Micah >>>>>> >>>>>>>> >>>>>> >>>>>>>> >>>>>> >>>>>>>> [1] https://github.com/apache/iceberg/pull/10780 >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> >>>>>> Ryan Blue >>>>>> >>>>>> Databricks >>>>>> >>>>>> -- >>>>>> Robert Stupp >>>>>> @snazy >>>>>> >>>>>>