On 03/12/2014 05:57 PM, Alon Bar-Lev wrote: > > > ----- Original Message ----- >> From: "Itamar Heim" <[email protected]> >> To: "Eli Mesika" <[email protected]>, [email protected] >> Cc: "Maor Lipchuk" <[email protected]>, "Tomasz Kołek" >> <[email protected]>, "infra" <[email protected]> >> Sent: Wednesday, March 12, 2014 5:52:25 PM >> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions >> >> On 03/12/2014 04:57 PM, Eli Mesika wrote: >>> >>> >>> ----- Original Message ----- >>>> From: "David Caro" <[email protected]> >>>> To: "Itamar Heim" <[email protected]> >>>> Cc: "Maor Lipchuk" <[email protected]>, [email protected], "Tomasz Kołek" >>>> <[email protected]>, "infra" >>>> <[email protected]> >>>> Sent: Wednesday, March 12, 2014 11:01:21 AM >>>> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions >>>> >>>> On Wed 12 Mar 2014 08:16:02 AM CET, Itamar Heim wrote: >>>>> On 03/11/2014 10:08 PM, Maor Lipchuk wrote: >>>>>> On 03/11/2014 05:20 PM, Itamar Heim wrote: >>>>>>> On 03/11/2014 05:14 PM, Eyal Edri wrote: >>>>>>>> >>>>>>>> >>>>>>>> ----- Original Message ----- >>>>>>>>> From: "Itamar Heim" <[email protected]> >>>>>>>>> To: "Eyal Edri" <[email protected]>, "Tomasz Kołek" >>>>>>>>> <[email protected]>, [email protected], "infra" <[email protected]> >>>>>>>>> Sent: Tuesday, March 11, 2014 5:10:54 PM >>>>>>>>> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - >>>>>>>>> questions >>>>>>>>> >>>>>>>>> On 03/11/2014 05:06 PM, Ewoud Kohl van Wijngaarden wrote: >>>>>>>>>> On Tue, Mar 11, 2014 at 10:37:22AM -0400, Eyal Edri wrote: >>>>>>>>>>>> Tomasz Kołek wrote: >>>>>>>>>>>> >>>>>>>>>>>> I've got a few questions about project description. >>>>>>>>>>>> Please tell me if my problem's understanding is good or not. >>>>>>>>>>>> We need to add a few flags/methods to git review module. This >>>>>>>>>>>> flags >>>>>>>>>>>> should >>>>>>>>>>>> allow to add potential reviewers in gerrit. >>>>>>>>>>>> So: >>>>>>>>>>>> Let's assume that we've got special flags for this operations. >>>>>>>>>>>> What's >>>>>>>>>>>> next? >>>>>>>>>>>> 1. In gerrit system we need to add special place for potential >>>>>>>>>>>> reviewers? >>>>>>>>>>>> 2. Potential reviewers should agree that they want to review? >>>>>>>>>>>> 3. We can have more than one accepted reviewer? >>>>>>>>>>> >>>>>>>>>>> I'm not sure i understood exactly what you mean by 'potential >>>>>>>>>>> reviewers'. do want gerrit (hook?) to automatically add reviewers >>>>>>>>>>> to >>>>>>>>>>> a patch according to the code sent? so in fact you'll have a place >>>>>>>>>>> somewhere for mapping code & specific developers? >>>>>>>>>> >>>>>>>>>> I really like this idea. Gerrit currently requires new users to know >>>>>>>>>> who >>>>>>>>>> to add as reviewers, IMHO impeding new contributors. >>>>>>>>>> >>>>>>>>>> One relative simple solution would be to look at who recently >>>>>>>>>> touched >>>>>>>>>> the files that are being modified and add them as reviewers. This >>>>>>>>>> can be >>>>>>>>>> done by looking at the git log for a file. Some pseudo python code >>>>>>>>>> solution: >>>>>>>>>> >>>>>>>>>> reviewers = set() >>>>>>>>>> >>>>>>>>>> for modified_file in commit.files: >>>>>>>>>> reviewers += set(commit.author for commit in >>>>>>>>>> git.log(modified_file)) >>>>>>>>>> >>>>>>>>>> return reviewers >>>>>>>>>> >>>>>>>>>> This gives a system that those who touche a file, become the >>>>>>>>>> maintainer >>>>>>>>>> for that file. A more complex solution could improve on that and >>>>>>>>>> limit >>>>>>>>>> the reviewers added per patch. One can think of limiting to only >>>>>>>>>> contributions in the last X months, weigh contributions so common >>>>>>>>>> committers are prefered. It could also combine several methods. >>>>>>>>>> >>>>>>>>>> For example to limit to the 5 authors who touched the most files: >>>>>>>>>> >>>>>>>>>> reviewers = collections.Counter() # New in python 2.7 >>>>>>>>>> >>>>>>>>>> for modified_file in commit.files: >>>>>>>>>> reviewers += collections.Counter(commit.author for commit in >>>>>>>>>> git.log(modified_file)) >>>>>>>>>> >>>>>>>>>> return [author for author, count in reviewers.most_common(5)] >>>>>>>>>> >>>>>>>>>> Since Counter also accepts a dictionary, one could also weigh the >>>>>>>>>> touched lines per file. Downside there is big whitespace/formatting >>>>>>>>>> patches can skew the line count. >>>>>>>>>> >>>>>>>>>> In short, I think an entire thesis could be written on the optimal >>>>>>>>>> way >>>>>>>>>> to determine reviewers but a simple algorithm could do to show the >>>>>>>>>> method works. >>>>>>>>>> >>>>>>>>>> Does this help? >>> >>> Maybe it will be worth to use the information we have in Bugzilla here: >>> >>> We can browse the BZ that were closed/verified in the last XXX days >>> Per BZ , we know which patches are involved, who reviewed the patches, >>> which files were changed, when files were changed and the rank of the >>> change (number of lines changed) >>> I believe that from this information we can compose a simple ranking >>> algorithm that its output will be a list of N potential reviewers for the >>> patch. >>> Since we can aggregate the above information for all files related to the >>> patch we want to add reviewers, we can have this set for the whole patch. >>> This information should be processed and stored each N days and gerrit will >>> be able to use it. >> >> why are we trying to invent hueristics, instead of declaring clear >> ownership? > > Reminder: my source metadata plan that requires cooperation. > > Each source and component should have an explicit ownership up into bug > database. > > I won't repeat it now, it is available at archives. > I think we are discussing two different issues here: The first one considers the GSOC project, of adding reviewers automatically to a patch, this can be done in many different heuristics depending what the submitter prefers (use blame, maintainers who acked the patches, bugs, list of owners to a file and so on).
The second issue is adding and maintain a list of owners by files/folders in oVirt. this is a specific use case to be used by the "add potential reviewers" project. >> >>> >>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> Users mailing list >>>>>>>>>> [email protected] >>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users >>>>>>>>>> >>>>>>>>> >>>>>>>>> I think if we do this, we want to make sure we cover per file who is >>>>>>>>> required to +2 it before we consider it acked. >>>>>>>>> >>>>>>>> >>>>>>>> won't it require maintaining static lists of people per >>>>>>>> file/path/project? >>>>>>>> >>>>>>> >>>>>>> yes, but considering our project layout, i don't see an alternative. >>>>>>> (some of the layout could be improved to be path based, rather than >>>>>>> file >>>>>>> based) >>>>>> I think it could be done automatically by analysing the file and see who >>>>>> mostly changed it recently, since the "owner" of the file might be >>>>>> dynamic, who ever changed most of it few days ago might be more familiar >>>>>> with it today >>>>>> >>>>>> IMO the algorithm of adding the reviewers should be flexible. >>>>>> For example, using a folder which will contain files, where each file >>>>>> implement an algorithm to add the reviewers. >>>>>> >>>>>> for instance we can have two files: >>>>>> 1. Add a reviewers by blame - the contributor which changed recently the >>>>>> code lines >>>>>> 2. Add a reviewers by file - the contributor who changed most of the >>>>>> file recently. >>>>>> >>>>>> Each file will implement the functional operation and will output the >>>>>> reviewers emails. >>>>>> >>>>>> The user can then add a new algorithm or change it to be more specific >>>>>> to its project. >>>>>> for example the user can add also the maintainers which acked the patch >>>>>> that was blamed. >>>>>>> _______________________________________________ >>>>>>> Users mailing list >>>>>>> [email protected] >>>>>>> http://lists.ovirt.org/mailman/listinfo/users >>>>>> >>>>> >>>>> this shouldn't be automatic. we need to clearly define ownership. we >>>>> can't >>>>> do >>>>> this per repo for the engine/vdsm. we can do this per repo for the other >>>>> repo's probably (though solving the folder/file approach would cover the >>>>> simpler repos as a private case). >>>>> >>>>> yes, it will require some work, maybe some moving around of files to make >>>>> this >>>>> easier by folders (topics) which should be relevant anyway. >>>> >>>> I think it would easier to maintain if we just have one file at the >>>> root, instead of having the ownership information distributed >>>> throughout the files/directories. That way you'll know where to look at >>>> to check/modify the ownership as opposed to having to walk all the >>>> files and upper directories. >>>> >>>> Also, adding all that ownership logic to gerrit might not be easy, as >>>> it's not meant to go checking the source of the repositories looking >>>> for configuration. We might want to take a look (again) to zuul, the >>>> tool that openstack uses as gateway to trigger jenkins jobs and merge >>>> patches. >>>> >>>>> _______________________________________________ >>>>> Infra mailing list >>>>> [email protected] >>>>> http://lists.ovirt.org/mailman/listinfo/infra >>>> >>>> >>>> >>>> -- >>>> David Caro >>>> >>>> Red Hat S.L. >>>> Continuous Integration Engineer - EMEA ENG Virtualization R&D >>>> >>>> Email: [email protected] >>>> Web: www.redhat.com >>>> RHT Global #: 82-62605 >>>> >>>> >>>> _______________________________________________ >>>> Infra mailing list >>>> [email protected] >>>> http://lists.ovirt.org/mailman/listinfo/infra >>>> >> >> _______________________________________________ >> Infra mailing list >> [email protected] >> http://lists.ovirt.org/mailman/listinfo/infra >> _______________________________________________ Users mailing list [email protected] http://lists.ovirt.org/mailman/listinfo/users

