RE: Commit reviews' author statistics: bus factor issue?
Please remove this person from the mailing list. He has cancer and is no longer able to make cognitive decisions regarding this discussion.

Thank-you,
Caretaker of Luke Perkins

-----Original Message-----
From: Daniel Shahaf
Sent: Thursday, April 22, 2021 5:11 PM
To: Johan Corveleyn
Cc: Subversion Development
Subject: Re: Commit reviews' author statistics: bus factor issue?

Johan Corveleyn wrote on Thu, 22 Apr 2021 19:53 +00:00:
> On Thu, Apr 22, 2021 at 9:22 PM Daniel Shahaf wrote:
> >
> > [ Forwarding from private@ with an addition between triple dashes
> > and some paragraphs omitted altogether. ]
> >
> > Methodology: In my dev@ mailbox, I looked at "Re: svn commit"
> > threads where the subject line contained "trunk" somewhere, filtered
> > by date (using, e.g., ~s 'Re: svn commit' !~<( ~s 'Re: svn commit' )
> > ~d '<730d' ~s trunk in Mutt). I then did an author histogram (the
> > moral equivalent of SELECT author, COUNT(*) AS cnt FROM
> > results_of_the_filter GROUP BY author ORDER BY cnt).
> >
> > With the date filter set to ">6 years ago", the histogram is:
> >
> > 1, 1, 1, 1, 2, 3, 6, 7, 10, 12, 13, 13, 19, 27, 49, 58, 86.
> > Top three: 28.1%, 19.0%, 16.0%.
> >
> > With the date filter set to "<2 years ago", the histogram is:
> >
> > 1, 1, 1, 1, 1, 1, 1, 1, 4, 5, 30.
> > Top three: 64%, 10.6%, 8.5%.
> >
> > Do we have a bus factor problem?
> >
> > ---
> >
> > I'm deliberately not posting the author identities part of the
> > histograms. It's public info (and I literally did just post
> > instructions for how to compute it, for reproducibility), but no
> > individual's contributions or contribution statistics are the point.
> >
> > The histogram is of the authors of commit review threads, not of
> > everyone who participated in such threads.
> >
> > ---
> >
> > Having few reviewers is problematic in various ways:
> >
> > - Bus factor
> >
> > - Single point of failure (cf. Linus' Law)
> >
> > - Possibility of zero reviews for some areas of the code
> >
> > - Review standards should be seen as community standards rather than
> > a reviewer's idiosyncrasies; cf. the point about new projects needing
> > at least two mentors ("parents"), rather than just one
> >
> > - [not an exhaustive list]
> >
> > Cheers,
> >
> > Daniel
> >
> > There may be a better way to express "first in a thread". I tried
> > !~<(^) , but couldn't get it to work.
>
> Good point. But I believe we have many other areas where our bus
> factor is getting very low.
>
> - RM -- [ ]
>
> - Security issues: [ ]
>
> - Approving backports. [ ]
>
> - Signing releases: [ ]
>
>
> Perhaps the lack of review-activity for us as a CTR project is more
> critical, I don't know. Good observation in any case.

Note some of these issues are related: Approving backports amounts to reviewing some specific commits carefully; work on security issues involves some reviewing, some RMing, and some signing.

I'd be hesitant to say "more critical" because I'm not sure the criticalities of these five areas are orderable, but historically, I think we rely on commit reviews to catch certain classes of bugs: for instance, cross-version compatibility (in the ABI, over the wire, in on-disk formats) and FSFS concurrency guarantees are both core promises that have little test coverage. And, of course, there's any number of C gotchas and portability quirks that our compilers don't warn on (in the default build configuration).

Cheers,

Daniel
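
For readers who want to redo the arithmetic behind the "Top three" figures, here is a minimal Python sketch. It takes the per-author thread counts quoted above for the "<2 years ago" filter and computes each top author's share of the total. The function names are made up for illustration, and the "smallest group holding a majority" measure is only one rough proxy for a bus factor, not something the thread itself defines.

```python
def top_shares(counts, n=3):
    """Return the n largest counts as percentages of the total (one decimal)."""
    total = sum(counts)
    largest = sorted(counts, reverse=True)[:n]
    return [round(100 * c / total, 1) for c in largest]


def min_authors_for_majority(counts):
    """Smallest number of authors who together started more than half the threads."""
    total = sum(counts)
    running = 0
    for i, c in enumerate(sorted(counts, reverse=True), start=1):
        running += c
        if 2 * running > total:
            return i
    return len(counts)


# Per-author counts quoted in the message for the "<2 years ago" filter.
recent = [1, 1, 1, 1, 1, 1, 1, 1, 4, 5, 30]
print(top_shares(recent))                # [63.8, 10.6, 8.5]
print(min_authors_for_majority(recent))  # 1
```

The first share prints as 63.8, which the message rounds to 64%.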
RE: The future of the Subversion book - Thank-you
Mike,

Thank-you for all your work with subversion. It is a lot of work and we would not have the tool we have today without the effort of people like you.

Regards,
Luke Perkins
2581 Flagstone Drive
San Jose, CA 95132
Cell: 719-339-0987

-----Original Message-----
From: C. Michael Pilato
Sent: Wednesday, September 5, 2018 12:39 PM
To: dev@subversion.apache.org
Subject: The future of the Subversion book

Hello, all!

It's been a long while since I interacted with any degree of regularity with this community, and I've had to come to terms with some essential truths.

First, my time as an active Subversion developer has *definitely* passed. Oh, I may get a chance to return to it at some point in the (likely distant) future, but without CollabNet commissioning my efforts here, I simply don't have the extra cycles these days to offer. Given that my contributions over the last few years can be measured in the smallest of numbers, this isn't news to anyone here and certainly has no effect on the trajectory and velocity of the project!

Of greater concern to (at least) myself is that the cognitive distance I have from Subversion these days -- combined with the craziness of just life as a twice-employed[1], soccer-coaching father of three -- means that the Subversion book is getting next-to-zero attention, too. Oh, I'm still paying attention to the work our translators are doing, and wordsmithing here and there as concerns are raised. But the (as-yet-unfinished) trunk of the book is still attached to Subversion 1.8, which means that this community has pounded out all kinds of improvements whose documentation is mostly limited to release notes and email threads. Put simply, the service that Ben and Fitz (both long gone from contributing to the book at all) and I formerly offered to the wider Subversion community has arguably now become a disservice.

I'm done telling myself that I can fix this by re-engaging and taking up authorship again. That just isn't gonna happen. It's time to pass the torch to someone else, and I would love to immediately begin tossing around some ideas toward this end.

To be clear, red-bean.com is happy to continue hosting the book's HTML/PDF builds. The source lives at SourceForge these days, and I can grant commit permissions (or transfer ownership) as needed. Moreover, there's no deadline for maintainership handoff that I'm trying to impose or anything. I want to do what's best for the Subversion ecosystem, whatever this community determines that to be.

Feel free to consider alternate approaches, too, such as conversion of the book's content into a Wiki. But I would caution against doing anything that discourages or complicates the workflow of the book's translators, especially since they are the only ones actually doing anything in the project at all! :-)

So what do you think?

-- Mike

[1] Beyond my regular CollabNet work week, I give additional hours as a member of the staff of my local church.
RE: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump
Martin and team,

Statement: "So the only way to solve your problem is to create a tool which parses the dump files and creates a checksum in a defined way so that they are comparable."

Agreed. My thoughts exactly.

Thank-you,
Luke Perkins

-----Original Message-----
From: Martin Furter [mailto:mfur...@bluewin.ch]
Sent: Tuesday, January 24, 2017 19:56
To: lukeperk...@epicdgs.us
Subject: Re: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump

On 01/25/2017 03:15 AM, Luke Perkins wrote:
> Michael,
>
> I appreciate everyone's audience on this issue. I have not felt a need to be
> directly involved in the subversion system mainly because it works so well.
> This is the first time in 10 years I have felt the need to get directly
> involved in the SVN development team.
>
> Statement: "As a bug report alone, this one seems pretty easy:
> Closed/INVALID."
>
> I completely disagree with this statement. I have nearly 300GB of dump files
> used as a means of backing up my repositories. Some of these dump files are
> 10 years old. The incremental SVN dump file is automatically generated at
> each and every commit. After these incremental SVN dump files are created,
> they are copied and distributed to offsite locations. That way if my server
> farm crashes, I have a means of assured recovery.
>
> Every month I run sha512sum integrity checks on both the dump files (remotely
> located in 3 different locations) and the dump file produced by the
> subversion server. Transferring thousands of 128 byte files is a much better
> option than transferring thousands of MB dump files over the internet to
> remote locations. This method and automated scripts have worked for 10 years.
> I have rebuilt my servers from the original dump files on at least 2
> occasions because of computer crashes. This provides me a sanity and
> validation methodology so that I can spot problems quickly and rebuild before
> things get out of hand.
>
> Asking me to redistribute 300GB of data to 3 different offsite (and remote)
> locations, is not a good option.
>
> The SVN dump file has always been presented as the ultimate backup tool of
> the subversion system. The integrity of the SVN dump file system is of
> paramount importance. The whole reason why SVN exists in the first place is
> "data integrity and traceability". The code was changed back in 2015, for
> better or worse, and we need present solutions to address legacy backups.

A stable order of header lines will solve your problem for now. But in the future somebody might add a new feature to subversion and a new header field to the dump files. This will break your checksums again.

Back in the pre-1.0 days when I was working on svndumptool I had the same troubles with changing headers and new fields.

So the only way to solve your problem is to create a tool which parses the dump files and creates a checksum in a defined way so that they are comparable.

- Martin
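
The kind of tool Martin describes could look roughly like the Python sketch below. It is not svndumptool and not anything proposed in this thread. It assumes each dump record is a block of "Name: value" header lines ended by a blank line, optionally followed by a body whose byte count is given by a Content-length header. The function name canonical_dump_digest is hypothetical. Within each record, header lines are sorted before hashing, so two dumps that differ only in header order produce the same digest.

```python
import hashlib
import io


def canonical_dump_digest(stream):
    """SHA-512 hex digest of a dump stream, ignoring header order per record."""
    digest = hashlib.sha512()
    while True:
        line = stream.readline()
        if not line:
            break  # end of stream
        if line == b"\n":
            continue  # blank separator between records
        # Collect the header block up to the terminating blank line.
        headers = [line]
        while True:
            line = stream.readline()
            if not line or line == b"\n":
                break
            headers.append(line)
        # Assumption: the body length, if any, is given by Content-length.
        length = 0
        for header in headers:
            key, _, value = header.partition(b":")
            if key == b"Content-length":
                length = int(value.strip())
        body = stream.read(length)
        # Hash headers in sorted order so their original order is irrelevant.
        for header in sorted(headers):
            digest.update(header)
        digest.update(b"\n")
        digest.update(body)
    return digest.hexdigest()


# Two toy records that differ only in the order of their header lines.
one = b"Node-path: a\nNode-kind: file\nContent-length: 3\n\nabc\n\n"
two = b"Node-kind: file\nNode-path: a\nContent-length: 3\n\nabc\n\n"
print(canonical_dump_digest(io.BytesIO(one)) == canonical_dump_digest(io.BytesIO(two)))
```

As Martin notes, a header added by a future release would still change this digest. A stricter variant could hash only an explicit list of header names.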
RE: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump
Mark and team,

Again thank you for the team’s consideration of this issue.

Statement: “That said, based on I think Julian's comment, it seemed like we could restore the old order quite easily without breaking anything so that seemed harmless to me and I do not see that it has a negative impact on 1.9.x users for whom their order would now change either .. for same reason that we are still not claiming the order is significant to us.”

I am all in favor of moving forward fixing the order as proposed in Issue #4668 as soon as possible. If we do not call it an issue but yet still fix it, that would be fine. My experience with the “INVALID” term is that the issue is dropped and ignored; that would not be palatable to address my system needs.

As a system administrator of an SVN repository system, I do not see the overwhelming need to maintain the 1.9 node key order. I would need to redistribute about 5GB of dump files instead of 300GB. So implementing a fixed order without a switch would be a reasonable compromise.

Thank-you,
Luke Perkins

From: Mark Phippard [mailto:markp...@gmail.com]
Sent: Tuesday, January 24, 2017 14:07
To: lukeperk...@epicdgs.us
Cc: C. Michael Pilato <cmpil...@collab.net>; Subversion Development <dev@subversion.apache.org>; Daniel Shahaf <danie...@apache.org>; Julian Foad <julianf...@apache.org>
Subject: Re: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump

On Tue, Jan 24, 2017 at 4:45 PM, Luke Perkins <lukeperk...@epicdgs.us> wrote:
> Michael,
>
> I appreciate everyone's audience on this issue. I have not felt a need to be directly involved in the subversion system mainly because it works so well. This is the first time in 10 years I have felt the need to get directly involved in the SVN development team.
>
> Statement: "As a bug report alone, this one seems pretty easy: Closed/INVALID."
>
> I completely disagree with this statement. I have nearly 300GB of dump files used as a means of backing up my repositories. Some of these dump files are 10 years old. The incremental SVN dump file is automatically generated at each and every commit. After these incremental SVN dump files are created, they are copied and distributed to offsite locations. That way if my server farm crashes, I have a means of assured recovery.
>
> Every month I run sha512sum integrity checks on both the dump files (remotely located in 3 different locations) and the dump file produced by the subversion server. Transferring thousands of 128 byte files is a much better option than transferring thousands of MB dump files over the internet to remote locations. This method and automated scripts have worked for 10 years. I have rebuilt my servers from the original dump files on at least 2 occasions because of computer crashes. This provides me a sanity and validation methodology so that I can spot problems quickly and rebuild before things get out of hand.
>
> Asking me to redistribute 300GB of data to 3 different offsite (and remote) locations, is not a good option.
>
> The SVN dump file has always been presented as the ultimate backup tool of the subversion system. The integrity of the SVN dump file system is of paramount importance. The whole reason why SVN exists in the first place is "data integrity and traceability". The code was changed back in 2015, for better or worse, and we need present solutions to address legacy backups.

OK, but aren't you moving the goal posts now? You are implying those old dump files no longer work or will not load. That is not true. The only issue is with your own process where you diff a dump file. Mike is simply saying you are doing something we never claimed should work. The fact that it did for you was just luck that may have run out.

That said, based on I think Julian's comment, it seemed like we could restore the old order quite easily without breaking anything so that seemed harmless to me and I do not see that it has a negative impact on 1.9.x users for whom their order would now change either .. for same reason that we are still not claiming the order is significant to us.

Mike seemed to be pushing back on trying to formalize support for something we specifically do not support, which is that the headers in the dump file appear in a specific order. I cannot really disagree with him on that point.

--
Thanks
Mark Phippard
http://markphip.blogspot.com/
RE: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump
Daniel and team,

I appreciate all the consideration for this issue. I anticipated that there would be some naming adjustments. If someone has a better naming convention, I am all ears.

My vote is that we implement fixed order for the four keys outlined in the JIRA issue 4668 as soon as possible. The switch would activate the current 1.9 scheme. I am always sensitive to the plight of the system administrator who is tasked with deploying SVN repositories in a real world environment. Thus, maintaining a 1.9 formatting option is appropriate.

Regarding the SVN dump comparison tool: I am actively reviewing the options on my own machine. I should have a proposal for a new tool to the development team after I have vetted it with my own servers and networks. I am swamped with other engineering activities so this new tool proposal might be a few weeks out.

Thank-you,
Luke Perkins

-----Original Message-----
From: Daniel Shahaf [mailto:danie...@apache.org]
Sent: Tuesday, January 24, 2017 09:01
To: Julian Foad <julianf...@apache.org>
Cc: lukeperk...@epicdgs.us; dev@subversion.apache.org
Subject: Re: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump

Julian Foad wrote on Tue, Jan 24, 2017 at 09:51:52 +0000:
> Luke Perkins wrote:
> >I have defined a new switch for svnadmin dump, pre-1.8-dump, to
> >activate the old node key order. I did my best to try to keep the
> >original author's style. Is this an acceptable switch name?
> >
> >A new parameter to the primary function, svn_repos_dump_fs4, called
> >svn_boolean_t pre_1_8_dump. I have searched the source code and I
> >think I have all of the function calls covered. Are there any other
> >considerations?
>
> Every new option we add carries new costs with it -- maintenance,
> documentation, user education, more variations to test, etc.
>
> I think there is no significant reason to disable the ordering. Of
> course compatibility is always a concern, and we don't want to fix one
> use case while breaking another. I suppose it could be that the
> supposedly unstable order has actually been coming out stable for some
> people, in which case they would potentially benefit by leaving it
> as it is -- but that's being too "tricky" -- we should keep it simple by not
> adding an option.
>
> Do you have any particular reason why you think the option is necessary?

The bug is about 1.9 using a different order to 1.8. If we make svnadmin use the 1.8 unconditionally, then 1.9 will use a different order to 1.10, which essentially recreates the bug for other users.

That is: the option serves to allow admins to choose which of the two orders to be consistent with, 1.8 or 1.9.

An alternative to having this option would be a dumpfile comparator that ignores header order differences.

> >What is the verification procedure for implementing this change?
>
> Running the regression test suite with the (rather hidden)
> "--dump-load-cross-check" option enabled, plus feedback from yourself
> and one or two others that it "works for you".

I think Luke was asking about whether a unit test is required. I'll leave it to you guys to decide that, but just point out for Luke that svnadmin_tests.py would be where such a test would live.

Cheers,

Daniel
RE: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump
Michael,

I appreciate everyone's audience on this issue. I have not felt a need to be directly involved in the subversion system mainly because it works so well. This is the first time in 10 years I have felt the need to get directly involved in the SVN development team.

Statement: "As a bug report alone, this one seems pretty easy: Closed/INVALID."

I completely disagree with this statement. I have nearly 300GB of dump files used as a means of backing up my repositories. Some of these dump files are 10 years old. The incremental SVN dump file is automatically generated at each and every commit. After these incremental SVN dump files are created, they are copied and distributed to offsite locations. That way if my server farm crashes, I have a means of assured recovery.

Every month I run sha512sum integrity checks on both the dump files (remotely located in 3 different locations) and the dump file produced by the subversion server. Transferring thousands of 128 byte files is a much better option than transferring thousands of MB dump files over the internet to remote locations. This method and automated scripts have worked for 10 years. I have rebuilt my servers from the original dump files on at least 2 occasions because of computer crashes. This provides me a sanity and validation methodology so that I can spot problems quickly and rebuild before things get out of hand.

Asking me to redistribute 300GB of data to 3 different offsite (and remote) locations, is not a good option.

The SVN dump file has always been presented as the ultimate backup tool of the subversion system. The integrity of the SVN dump file system is of paramount importance. The whole reason why SVN exists in the first place is "data integrity and traceability". The code was changed back in 2015, for better or worse, and we need present solutions to address legacy backups.

Thank-you,
Luke Perkins

-----Original Message-----
From: C. Michael Pilato [mailto:cmpil...@collab.net]
Sent: Tuesday, January 24, 2017 09:38
To: Daniel Shahaf <danie...@apache.org>; Julian Foad <julianf...@apache.org>
Cc: lukeperk...@epicdgs.us; dev@subversion.apache.org
Subject: Re: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump

On 01/24/2017 12:01 PM, Daniel Shahaf wrote:
> The bug is about 1.9 using a different order to 1.8. If we make
> svnadmin use the 1.8 unconditionally, then 1.9 will use a different
> order to 1.10, which essentially recreates the bug for other users.
>
> That is: the option serves to allow admins to choose which of the two
> orders to be consistent with, 1.8 or 1.9.
>
> An alternative to having this option would be a dumpfile comparator
> that ignores header order differences.

As a bug report alone, this one seems pretty easy: Closed/INVALID. Dumpfile headers were never promised in a particular order, therefore that their order should differ in one version than in another is an interesting factoid, but not actionable *as a defect*. I think it unwise to introduce an option to dictate that new code should try to adhere to an old promise that wasn't.

Now, it's completely reasonable to introduce into the current release a promise regarding future header ordering, though. And it is completely reasonable to backport an enforcement of that ordering (minus its attached promise?) to older releases for the benefit of users that care. And it may very well be that "the 1.8 ordering" is the very ordering you'd settle on, but perhaps not.

But to even get here, I think folks have to decide if this is, in fact, a promise that Subversion wants to make. And if so, how universally? Does this order apply to svndumpfilter and svnrdump, too?

In all these scenarios though, nothing can be done about released code save, as you suggest, Daniel, to introduce some external comparator.

-- Mike
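
The monthly integrity check Luke describes (hash every dump file, ship only the small digests, compare against the offsite copies) can be sketched in Python as below. This is an illustration under assumptions, not Luke's actual scripts. The "*.dump" file pattern and all function names are hypothetical. A SHA-512 hex digest is 128 characters, which fits the "128 byte files" he mentions.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large dump files never sit fully in memory."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory, pattern="*.dump"):
    """Map each dump file name to its SHA-512 hex digest."""
    return {p.name: sha512_of(p) for p in sorted(Path(directory).glob(pattern))}


def mismatches(local, remote):
    """Names whose digests differ, or that exist in only one manifest."""
    return sorted(name for name in local.keys() | remote.keys()
                  if local.get(name) != remote.get(name))
```

Each site would run build_manifest on its own copy and exchange only the resulting name-to-digest mapping. mismatches then lists the files that need attention. Because the digest covers raw bytes, any change in header order shows up as a mismatch, which is the problem this thread is about.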
[PATCH] Issue #4668: Fixing the node key order during svnadmin dump
[[[
Fix issue #4668: svnadmin dump node header order has changed

These changes provide a means for the user to output the svnadmin dump using the node key order used in svnadmin version 1.8 and earlier.

* subversion/include/svn_repos.h
  Added comment regarding the switch usage.
  (svn_repos_dump_fs4): Added Boolean parameter to function prototype called pre_1_8_dump.

* subversion/svnadmin/svnadmin.c
  (svnadmin_cmdline_options_t): Added svnadmin__pre_1_8_dump to enumerated definition.
  (options_table): Added pre_1_8_dump definition to the table.
  (cmd_table): Modified the help usage text to include the pre-1.8-dump switch.
  (svnadmin_opt_state): Added pre_1_8_dump boolean member to structure definition.
  (subcommand_dump): Added pre_1_8_dump variable to svn_repos_dump_fs4 function call.
  (sub_main): Added case supporting svnadmin__pre_1_8_dump value.

* subversion/libsvn_repos/dump.c
  (write_revision_headers, write_revision_headers_v1651614): Renamed original write_revision_headers function to write_revision_headers_v1651614.
  (write_revision_headers_svn_4668): Added new function with fixed node key ordering as prescribed in the prose of Issue 4668.
  (write_revision_headers): Modified function to switch between two node key formatting methods.
  (svn_repos__dump_revision_record, write_revision_record, svn_repos_dump_fs4): Added pre_1_8_dump parameter to define the node key ordering during svnadmin dump operations.

Patch by: L. Perkins <lukeperk...@epicdgs.us>
]]]

Thank-you,
Luke Perkins

Attachment: ProposedChanges_201701231053_issue-4668.patch
Description: Binary data
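
The idea of switching between a fixed header order and whatever order the container happens to produce can be illustrated in a few lines of Python. This is a conceptual sketch only. It is not the patch's C code, format_headers is a hypothetical helper, and the order tuple is an illustrative assumption rather than the order prescribed in issue #4668.

```python
def format_headers(headers, fixed_order=None):
    """Render a header block as text.

    headers: dict mapping header name to value.
    fixed_order: optional sequence of names to emit first, in that order;
    remaining names follow sorted. With None, the dict's own iteration
    order is used, standing in for an unspecified, container-dependent order.
    """
    if fixed_order is None:
        names = list(headers)
    else:
        first = [name for name in fixed_order if name in headers]
        rest = sorted(name for name in headers if name not in fixed_order)
        names = first + rest
    return "".join(f"{name}: {headers[name]}\n" for name in names) + "\n"


# Illustrative order only; not taken from the issue.
ORDER = ("Revision-number", "Prop-content-length", "Content-length")
block = {"Content-length": 10, "Revision-number": 1, "Prop-content-length": 10}
print(format_headers(block, ORDER), end="")
```

With a fixed order the output no longer depends on how the headers were inserted, which is the property a byte-level checksum of the dump needs.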
SVN Build Bot Farm
Team Members,

I am an active user and SVN administrator for three systems in both personal and professional realms. I have not been part of the SVN developers or user mailing lists until recently, mainly because the SVN system works so well I have not seen a need to get involved. I repent.

My personal interest is Low Power Intel Celeron systems running Ubuntu. I have one of these systems, running 16.10, available for SVN testing. NOTE: I have found an Ubuntu bug with 16.04 LTS.

If someone can provide me setup instructions, I can make this system available for testing.

The resources I have available:

1. Two business class HSI routers with bandwidth to spare. These firewalls have DynDNS URLs that I can make available to the appropriate people.
2. 1 Intel NUC Ubuntu 16.10 server.

Also, if someone can provide me a system need, I could build a dedicated system and leave it on one of my two networks. This machine would be dedicated to SVN testing and verification.

NOTE: I live in California with very expensive electricity. I locate high power servers at an undisclosed remote location where electricity is much cheaper.

Thank-you,
Luke Perkins
Cell: (USA) 719-339-0987