RE: Commit reviews' author statistics: bus factor issue?

2021-04-23 Thread Luke Perkins
Please remove this person from the mailing list. He has cancer and is no longer 
able to make cognitive decisions regarding this discussion.

Thank-you,

Caretaker of Luke Perkins

-Original Message-
From: Daniel Shahaf  
Sent: Thursday, April 22, 2021 5:11 PM
To: Johan Corveleyn 
Cc: Subversion Development 
Subject: Re: Commit reviews' author statistics: bus factor issue?

Johan Corveleyn wrote on Thu, 22 Apr 2021 19:53 +00:00:
> On Thu, Apr 22, 2021 at 9:22 PM Daniel Shahaf  wrote:
> >
> > [ Forwarding from private@ with an addition between triple dashes 
> > and some paragraphs omitted altogether. ]
> >
> > Methodology: In my dev@ mailbox, I looked at "Re: svn commit" 
> > threads where the subject line contained "trunk" somewhere, filtered 
> > by date (using, e.g., 
> >  ~s 'Re: svn commit' !~<( ~s 'Re: svn commit' ) ~d '<730d' ~s trunk 
> > in Mutt).  I then did an author histogram (the moral equivalent of 
> >  SELECT author, COUNT(*) AS cnt FROM results_of_the_filter 
> >  GROUP BY author ORDER BY cnt ).
> >
> > With the date filter set to ">6 years ago", the histogram is:
> > .
> > 1, 1, 1, 1, 2, 3, 6, 7, 10, 12, 13, 13, 19, 27, 49, 58, 86 .
> > Top three: 28.1%, 19.0%, 16.0%.
> >
> > With the date filter set to "<2 years ago", the histogram is:
> > .
> > 1, 1, 1, 1, 1, 1, 1, 1, 4, 5, 30 .
> > Top three: 64%, 10.6%, 8.5%.
> >
> > Do we have a bus factor problem?
> >
> > ---
> >
> > I'm deliberately not posting the author identities part of the 
> > histograms.  It's public info (and I literally did just post 
> > instructions for how to compute it, for reproducibility), but no 
> > individual's contributions or contribution statistics are the point.
> >
> > The histogram is of the authors of commit review threads, not of 
> > everyone who participated in such threads.
> >
> > ---
> >
> > Having few reviewers is problematic in various ways:
> >
> > - Bus factor
> >
> > - Single point of failure (cf. Linus' Law)
> >
> > - Possibility of zero reviews for some areas of the code
> >
> > - Review standards should be seen as community standards rather than
> >   a reviewer's idiosyncrasies; cf. the point about new projects needing
> >   at least two mentors ("parents"), rather than just one
> >
> > - [not an exhaustive list]
> >
> > Cheers,
> >
> > Daniel
> >
> >   There may be a better way to express "first in a thread".  I tried 
> >  !~<(^) , but couldn't get it to work.
> 
> Good point. But I believe we have many other areas where our bus 
> factor is getting very low.
> 
> - RM -- [ ]
> 
> - Security issues: [ ]
> 
> - Approving backports. [ ]
> 
> - Signing releases: [ ]
> 
> 
> Perhaps the lack of review-activity for us as a CTR project is more 
> critical, I don't know. Good observation in any case.

Note some of these issues are related: Approving backports amounts to reviewing 
some specific commits carefully; work on security issues involves some 
reviewing, some RMing, and some signing.

I'd be hesitant to say "more critical" because I'm not sure the criticalities 
of these five areas are orderable, but historically, I think we rely on commit 
reviews to catch certain classes of bugs: for instance, cross-version 
compatibility (in the ABI, over the wire, in on-disk formats) and FSFS 
concurrency guarantees are both core promises that have little test coverage.

And, of course, there's any number of C gotchas and portability quirks that our 
compilers don't warn on (in the default build configuration).

Cheers,

Daniel
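
The histogram arithmetic described in the quoted methodology (count review threads per author, then look at the share held by the busiest reviewers) can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used in the thread: the function names author_histogram and top_shares are hypothetical, and the sample input is the "<2 years ago" histogram quoted above.

```python
from collections import Counter


def author_histogram(authors):
    """Per-author thread counts, smallest first (GROUP BY author ORDER BY cnt)."""
    return sorted(Counter(authors).values())


def top_shares(counts, k=3):
    """Percentage of all threads started by each of the k busiest authors."""
    total = sum(counts)
    busiest = sorted(counts, reverse=True)[:k]
    return [round(100 * count / total, 1) for count in busiest]


# Counts copied from the "<2 years ago" histogram in the message above.
recent = [1, 1, 1, 1, 1, 1, 1, 1, 4, 5, 30]
print(top_shares(recent))  # [63.8, 10.6, 8.5]
```

The result agrees with the quoted "64%, 10.6%, 8.5%" after rounding; a single author started 30 of the 47 review threads in that window.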



RE: The future of the Subversion book - Thank-you

2018-09-06 Thread Luke Perkins
Mike,

Thank-you for all your work with subversion. It is a lot of work and we would 
not have the tool we have today without the effort of people like you.

Regards,
Luke Perkins
2581 Flagstone Drive
San Jose, CA 95132
Cell: 719-339-0987

-Original Message-
From: C. Michael Pilato  
Sent: Wednesday, September 5, 2018 12:39 PM
To: dev@subversion.apache.org
Subject: The future of the Subversion book

Hello, all!

It's been a long while since I interacted with any degree of regularity with 
this community, and I've had to come to terms with some essential truths.

First, my time as an active Subversion developer has *definitely* passed.  Oh, 
I may get a chance to return to it at some point in the (likely distant) 
future, but without CollabNet commissioning my efforts here, I simply don't 
have the extra cycles these days to offer.  Given that my contributions over 
the last few years can be measured in the smallest of numbers, this isn't news 
to anyone here and certainly has no effect on the trajectory and velocity of 
the project!

Of greater concern to (at least) myself is that the cognitive distance I have 
from Subversion these days -- combined with the craziness of just life as a 
twice-employed[1], soccer-coaching, father of three -- means that the 
Subversion book is getting next-to-zero attention, too.  Oh, I'm still paying 
attention to the work our translators are doing, and wordsmithing here and 
there as concerns are raised.  But the (as-yet-unfinished) trunk of the book 
is still attached to Subversion 1.8, 
which means that this community has pounded out all kinds of improvements whose 
documentation is mostly limited to release notes and email threads.  Put 
simply, the service that Ben and Fitz (both long gone from contributing to 
the book at all) and I formerly offered to the wider Subversion community has 
arguably now become a disservice.

I'm done telling myself that I can fix this by re-engaging and taking up 
authorship again.  That just isn't gonna happen.  It's time to pass the torch 
to someone else, and I would love to immediately begin tossing around some 
ideas toward this end.

To be clear, red-bean.com is happy to continue hosting the book's HTML/PDF 
builds.  The source lives at SourceForge these days, and I can grant commit 
permissions (or transfer ownership) as needed.  Moreover, there's no deadline 
for maintainership handoff that I'm trying to impose or anything.  I want to do 
what's best for the Subversion ecosystem, whatever this community determines 
that to be.

Feel free to consider alternate approaches, too, such as conversion of the 
book's content into a Wiki.  But I would caution against doing anything that 
discourages or complicates the workflow of the book's translators, especially 
since they are the only ones actually doing anything in the project at all!  :-)

So what do you think?

-- Mike


[1] Beyond my regular CollabNet work week, I give additional hours as a
 member of the staff of my local church.



RE: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump

2017-01-25 Thread Luke Perkins
Martin and team,

Statement: "So the only way to solve your problem is to create a tool which 
parses the dump files and creates a checksum in a defined way so that they are 
comparable."

Agreed. My thoughts exactly.

Thank-you,

Luke Perkins

-Original Message-
From: Martin Furter [mailto:mfur...@bluewin.ch] 
Sent: Tuesday, January 24, 2017 19:56
To: lukeperk...@epicdgs.us
Subject: Re: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump

On 01/25/2017 03:15 AM, Luke Perkins wrote:
> Michael,
>
> I appreciate everyone's audience on this issue. I have not felt a need to be 
> directly involved in the subversion system mainly because it works so well. 
> This is the first time in 10 years I have felt the need to get directly 
> involved in the SVN development team.
>
> Statement: " As a bug report alone, this one seems pretty easy:  
> Closed/INVALID."
>
> I completely disagree with this statement. I have nearly 300GB of dump files 
> used as a means of backing up my repositories. Some of these dump files are 
> 10 years old. The incremental SVN dump file is automatically generated at 
> each and every commit. After these incremental SVN dump files are created, 
> they are copied and distributed to offsite locations. That way if my server 
> farm crashes, I have a means of assured recovery.
>
> Every month I run sha512sum integrity checks on both the dump files (remotely 
> located in 3 different locations) and the dump file produced by the 
> subversion server. Transferring thousands of 128 byte files is a much better 
> option than transferring thousands of MB dump files over the internet to 
> remote locations. This method and automated scripts have worked for 10 years. 
> I have rebuilt my servers from the original dump files on at least 2 
> occasions because of computer crashes. This provides me a sanity and 
> validation methodology so that I can spot problems quickly and rebuild before 
> things get out of hand.
>
> Asking me to redistribute 300GB of data to 3 different offsite (and remote) 
> locations, is not a good option.
>
> The SVN dump file has always been presented as the ultimate backup tool of 
> the subversion system. The integrity of the SVN dump file system is of 
> paramount importance. The whole reason why SVN exists in the first place is 
> "data integrity and traceability". The code was changed back in 2015, for 
> better or worse, and we need present solutions to address legacy backups.

A stable order of header lines will solve your problem for now. But in the 
future somebody might add a new feature to subversion and a new header field to 
the dump files. This will break your checksums again.

Back in the pre-1.0 days when I was working on svndumptool I had the same 
troubles with changing headers and new fields. So the only way to solve your 
problem is to create a tool which parses the dump files and creates a checksum 
in a defined way so that they are comparable.

- Martin
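
As a rough illustration of the kind of tool Martin describes, the Python sketch below hashes a dump stream record by record and sorts each record's header lines first, so that two dumps differing only in header order produce the same checksum. It is a sketch under stated assumptions, not part of Subversion: the name canonical_dump_digest is hypothetical, and the parser assumes each record is a block of "Key: value" header lines ended by a blank line, optionally followed by a body whose size is given by a Content-length header.

```python
import hashlib
import io


def canonical_dump_digest(stream):
    """SHA-512 of a dump stream, insensitive to header order within a record."""
    digest = hashlib.sha512()
    while True:
        line = stream.readline()
        if not line:          # end of stream
            break
        if line == b"\n":     # blank separator between records
            continue
        # Collect the header block up to the next blank line (or EOF).
        headers = [line.rstrip(b"\n")]
        while True:
            line = stream.readline()
            if line in (b"", b"\n"):
                break
            headers.append(line.rstrip(b"\n"))
        # Assumption: the body length, if any, is given by Content-length.
        body_len = 0
        for header in headers:
            key, _, value = header.partition(b": ")
            if key == b"Content-length":
                body_len = int(value)
        body = stream.read(body_len)
        # Hash headers in sorted order so their original order is irrelevant.
        for header in sorted(headers):
            digest.update(header + b"\n")
        digest.update(b"\n")
        digest.update(body)
    return digest.hexdigest()


if __name__ == "__main__":
    sample = (b"Node-path: a.txt\nNode-kind: file\nNode-action: add\n"
              b"Text-content-length: 3\nContent-length: 3\n\nhi\n\n\n")
    print(canonical_dump_digest(io.BytesIO(sample)))
```

A new header field added by a future release would still change the digest; handling that would need an explicit list of the headers to include, which is a policy decision this sketch leaves open.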



RE: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump

2017-01-24 Thread Luke Perkins
Mark and team,

Again thank you for the team’s consideration of this issue.

Statement: “That said, based on I think Julian's comment, it seemed like we 
could restore the old order quite easily without breaking anything so that 
seemed harmless to me and I do not see that it has a negative impact on 1.9.x 
users for whom their order would now change either .. for same reason that we 
are still not claiming the order is significant to us.”

I am all in favor of moving forward fixing the order as proposed in Issue #4668 
as soon as possible. If we do not call it an issue but yet still fix it, that 
would be fine. My experience with the “INVALID” term is that the issue is 
dropped and ignored; that would not be palatable to address my system needs.

As a system administrator of an SVN repository system, I do not see the 
overwhelming need to maintain the 1.9 node key order. I would need to 
redistribute about 5GB of dump files instead of 300GB. So implementing a fixed 
order without a switch would be a reasonable compromise.

Thank-you,

Luke Perkins

 

From: Mark Phippard [mailto:markp...@gmail.com] 
Sent: Tuesday, January 24, 2017 14:07
To: lukeperk...@epicdgs.us
Cc: C. Michael Pilato <cmpil...@collab.net>; Subversion Development 
<dev@subversion.apache.org>; Daniel Shahaf <danie...@apache.org>; Julian Foad 
<julianf...@apache.org>
Subject: Re: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump

 

On Tue, Jan 24, 2017 at 4:45 PM, Luke Perkins <lukeperk...@epicdgs.us> wrote:

> Michael,
>
> I appreciate everyone's audience on this issue. I have not felt a need to be 
> directly involved in the subversion system mainly because it works so well. 
> This is the first time in 10 years I have felt the need to get directly 
> involved in the SVN development team.
>
> Statement: " As a bug report alone, this one seems pretty easy:  
> Closed/INVALID."
>
> I completely disagree with this statement. I have nearly 300GB of dump files 
> used as a means of backing up my repositories. Some of these dump files are 10 
> years old. The incremental SVN dump file is automatically generated at each and 
> every commit. After these incremental SVN dump files are created, they are 
> copied and distributed to offsite locations. That way if my server farm 
> crashes, I have a means of assured recovery.
>
> Every month I run sha512sum integrity checks on both the dump files (remotely 
> located in 3 different locations) and the dump file produced by the subversion 
> server. Transferring thousands of 128 byte files is a much better option than 
> transferring thousands of MB dump files over the internet to remote locations. 
> This method and automated scripts have worked for 10 years. I have rebuilt my 
> servers from the original dump files on at least 2 occasions because of 
> computer crashes. This provides me a sanity and validation methodology so that 
> I can spot problems quickly and rebuild before things get out of hand.
>
> Asking me to redistribute 300GB of data to 3 different offsite (and remote) 
> locations, is not a good option.
>
> The SVN dump file has always been presented as the ultimate backup tool of the 
> subversion system. The integrity of the SVN dump file system is of paramount 
> importance. The whole reason why SVN exists in the first place is "data 
> integrity and traceability". The code was changed back in 2015, for better or 
> worse, and we need present solutions to address legacy backups.

OK, but aren't you moving the goal posts now?  You are implying those old dump 
files no longer work or will not load.  That is not true.  The only issue is 
with your own process where you diff a dump file.  Mike is simply saying you 
are doing something we never claimed should work.  The fact that it did for you 
was just luck that may have run out.

That said, based on I think Julian's comment, it seemed like we could restore 
the old order quite easily without breaking anything so that seemed harmless to 
me and I do not see that it has a negative impact on 1.9.x users for whom their 
order would now change either .. for same reason that we are still not claiming 
the order is significant to us. 

Mike seemed to be pushing back on trying to formalize support for something we 
specifically do not support, which is that the headers in the dump file appear 
in a specific order.  I cannot really disagree with him on that point.


 

-- 

Thanks

Mark Phippard
http://markphip.blogspot.com/



RE: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump

2017-01-24 Thread Luke Perkins
Daniel and team,

I appreciate all the consideration for this issue.

I anticipated that there would be some naming adjustments. If someone has a 
better naming convention, I am all ears.

My vote is that we implement fixed order for the four keys outlined in the JIRA 
issue 4668 as soon as possible. The switch would activate the current 1.9 
scheme. I am always sensitive to the plight of the system administrator who is 
tasked with deploying SVN repositories in a real world environment. Thus, 
maintaining a 1.9 formatting option is appropriate.

Regarding the SVN dump comparison tool: I am actively reviewing the options on 
my own machine. I should have a proposal for a new tool to the development team 
after I have vetted it with my own servers and networks. I am swamped with 
other engineering activities so this new tool proposal might be a few weeks out.

Thank-you,

Luke Perkins

-Original Message-
From: Daniel Shahaf [mailto:danie...@apache.org] 
Sent: Tuesday, January 24, 2017 09:01
To: Julian Foad <julianf...@apache.org>
Cc: lukeperk...@epicdgs.us; dev@subversion.apache.org
Subject: Re: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump

Julian Foad wrote on Tue, Jan 24, 2017 at 09:51:52 +0000:
> Luke Perkins wrote:
> >I have defined a new switch for  svnadmin dump ,  pre-1.8-dump , to 
> >activate the old node key order. I did my best to keep the original 
> >author's style. Is this an acceptable switch name?
> >
> >I have added a new parameter to the primary function, 
> > svn_repos_dump_fs4 , declared as  svn_boolean_t pre_1_8_dump . I have 
> >searched the source code and I think I have all of the function calls 
> >covered. Are there any other considerations?
> 
> Every new option we add carries new costs with it -- maintenance, 
> documentation, user education, more variations to test, etc.
> 
> I think there is no significant reason to disable the ordering. Of 
> course compatibility is always a concern, and we don't want to fix one 
> use case while breaking another. I suppose it could be that the 
> supposedly unstable order has actually been coming out stable for some 
> people, in which case they would potentially benefit by leaving it as 
> it is -- but that's being too "tricky" -- we should keep it simple by not 
> adding an option.
> 
> Do you have any particular reason why you think the option is necessary?

The bug is about 1.9 using a different order to 1.8.  If we make svnadmin use 
the 1.8 order unconditionally, then 1.9 will use a different order to 1.10, 
which essentially recreates the bug for other users.

That is: the option serves to allow admins to choose which of the two orders to 
be consistent with, 1.8 or 1.9.

An alternative to having this option would be a dumpfile comparator that 
ignores header order differences.

> >What is the verification procedure for implementing this change?
> 
> Running the regression test suite with the (rather hidden) 
> "--dump-load-cross-check" option enabled, plus feedback from yourself 
> and one or two others that it "works for you".

I think Luke was asking about whether a unit test is required.  I'll leave it 
to you guys to decide that, but just point out for Luke that svnadmin_tests.py 
would be where such a test would live.

Cheers,

Daniel



RE: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump

2017-01-24 Thread Luke Perkins
Michael,

I appreciate everyone's audience on this issue. I have not felt a need to be 
directly involved in the subversion system mainly because it works so well. 
This is the first time in 10 years I have felt the need to get directly 
involved in the SVN development team.

Statement: " As a bug report alone, this one seems pretty easy:  
Closed/INVALID."

I completely disagree with this statement. I have nearly 300GB of dump files 
used as a means of backing up my repositories. Some of these dump files are 10 
years old. The incremental SVN dump file is automatically generated at each and 
every commit. After these incremental SVN dump files are created, they are 
copied and distributed to offsite locations. That way if my server farm 
crashes, I have a means of assured recovery.

Every month I run sha512sum integrity checks on both the dump files (remotely 
located in 3 different locations) and the dump file produced by the subversion 
server. Transferring thousands of 128 byte files is a much better option than 
transferring thousands of MB dump files over the internet to remote locations. 
This method and automated scripts have worked for 10 years. I have rebuilt my 
servers from the original dump files on at least 2 occasions because of 
computer crashes. This provides me a sanity and validation methodology so that 
I can spot problems quickly and rebuild before things get out of hand.

Asking me to redistribute 300GB of data to 3 different offsite (and remote) 
locations, is not a good option.

The SVN dump file has always been presented as the ultimate backup tool of the 
subversion system. The integrity of the SVN dump file system is of paramount 
importance. The whole reason why SVN exists in the first place is "data 
integrity and traceability". The code was changed back in 2015, for better or 
worse, and we need present solutions to address legacy backups.

Thank-you,

Luke Perkins
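
For readers unfamiliar with the integrity check described above, here is a minimal Python sketch of the idea: compute a SHA-512 digest of a dump file and compare it with a previously recorded digest. A SHA-512 digest written in hexadecimal is 128 characters long, which is presumably where the "128 byte files" come from. The function names are hypothetical and this is not the author's actual script, which the message does not show.

```python
import hashlib


def sha512_of_file(path, chunk_size=1 << 20):
    """Hex SHA-512 digest of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_digest(path, recorded_hex):
    """True if the file at path still hashes to the recorded hex digest."""
    return sha512_of_file(path) == recorded_hex.strip().lower()
```

Because the comparison is byte-for-byte, any change in how a dump file is written, including a different header order, makes a freshly generated dump fail this check even when it loads to the same repository content.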

-Original Message-
From: C. Michael Pilato [mailto:cmpil...@collab.net] 
Sent: Tuesday, January 24, 2017 09:38
To: Daniel Shahaf <danie...@apache.org>; Julian Foad <julianf...@apache.org>
Cc: lukeperk...@epicdgs.us; dev@subversion.apache.org
Subject: Re: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump

On 01/24/2017 12:01 PM, Daniel Shahaf wrote:
> The bug is about 1.9 using a different order to 1.8.  If we make 
> svnadmin use the 1.8 order unconditionally, then 1.9 will use a different 
> order to 1.10, which essentially recreates the bug for other users.
>
> That is: the option serves to allow admins to choose which of the two 
> orders to be consistent with, 1.8 or 1.9.
>
> An alternative to having this option would be a dumpfile comparator 
> that ignores header order differences.

As a bug report alone, this one seems pretty easy:  Closed/INVALID. 
Dumpfile headers were never promised in a particular order, therefore that 
their order should differ in one version than in another is an interesting 
factoid, but not actionable *as a defect*.  I think it unwise to introduce an 
option to dictate that new code should try to adhere to an old promise that 
wasn't.

Now, it's completely reasonable to introduce into the current release a promise 
regarding future header ordering, though.  And it is completely reasonable to 
backport an enforcement of that ordering (minus its attached promise?) to older 
releases for the benefit of users that care.  And it may very well be that "the 
1.8 ordering" is the very ordering you'd settle on, but perhaps not.  But to 
even get here, I think folks have to decide if this is, in fact, a promise that 
Subversion wants to make.  And if so, how universally?  Does this order apply 
to svndumpfilter and svnrdump, too?

In all these scenarios though, nothing can be done about released code save, as 
you suggest, Daniel, to introduce some external comparator.

-- Mike



[PATCH] Issue #4668: Fixing the node key order during svnadmin dump

2017-01-23 Thread Luke Perkins
[[[
Fix issue #4668: svnadmin dump node header order has changed

These changes provide a means for the user to output the svnadmin dump using
the node key order used in svnadmin version 1.8 and earlier.

* subversion/include/svn_repos.h
  Added comment regarding the switch usage.
  (svn_repos_dump_fs4): Added Boolean parameter to function prototype called
   pre_1_8_dump.

* subversion/svnadmin/svnadmin.c
  (svnadmin_cmdline_options_t): Added svnadmin__pre_1_8_dump to enumerated
   definition.
  (options_table): Added pre_1_8_dump definition to the table.
  (cmd_table): Modified the help usage text to include the pre-1.8-dump
   switch.
  (svnadmin_opt_state): Added pre_1_8_dump boolean member to structure
   definition.
  (subcommand_dump): Added pre_1_8_dump variable to svn_repos_dump_fs4
   function call.
  (sub_main): Added case supporting svnadmin__pre_1_8_dump value.

* subversion/libsvn_repos/dump.c
  (write_revision_headers, write_revision_headers_v1651614): Renamed original
   write_revision_headers function to write_revision_headers_v1651614.
  (write_revision_headers_svn_4668): Added new function with fixed node key
   ordering as prescribed in the prose of Issue 4668.
  (write_revision_headers): Modified function to switch between two node key
   formatting methods.
  (svn_repos__dump_revision_record, write_revision_record,
   svn_repos_dump_fs4): Added pre_1_8_dump parameter to define the node key
   ordering during svnadmin dump operations.

Patch by: L. Perkins <lukeperk...@epicdgs.us>
]]]

Thank-you,

Luke Perkins


ProposedChanges_201701231053_issue-4668.patch
Description: Binary data


SVN Build Bot Farm

2017-01-23 Thread Luke Perkins
Team Members,

I am an active user and SVN administrator for three systems in both personal
and professional realms. I have not been part of the SVN developers or user
mailing lists until recently, mainly because the SVN system works so well I
have not seen a need to get involved. I repent.

My personal interest is Low Power Intel Celeron systems running Ubuntu. I
have one of these systems, running 16.10, available for SVN testing. NOTE: I
have found an Ubuntu bug with 16.04 LTS.

If someone can provide me with setup instructions, I can make this system
available for testing.

The resources I have available:

1.  Two business class HSI routers with bandwidth to spare. These
    firewalls have DynDNS URLs that I can make available to the appropriate
    people.
2.  1 Intel NUC Ubuntu 16.10 server.

Also, if someone can tell me what kind of system is needed, I could build a
dedicated system and leave it on one of my two networks. This machine would
be dedicated to SVN testing and verification.

NOTE: I live in California with very expensive electricity. I locate high
power servers at an undisclosed remote location where electricity is much
cheaper.

Thank-you,

 

Luke Perkins

Cell: (USA) 719-339-0987