Re: [R-SIG-Mac] Contributing to documentation [Was: Installing gfortran]

Duncan Murdoch Sat, 30 Apr 2022 17:13:14 -0700

On 30/04/2022 1:10 p.m., Jeff Newmiller wrote:

Re revision numbers... yes, you might have to take care to handle aligning the 
baseline code against which the patch was generated manually. Given that 
outside contributions would tend to start from specific releases though this 
shouldn't be too onerous.

I hope contributions wouldn't start from releases. Hopefully they'd start from the head of R-devel on svn or its GitHub mirror.

R-devel is where almost all changes go first, but releases may have split from it a long time ago. For example, right now R 4.2.0 is very recent, but there have been 309 revisions in svn since 4.2.0 was split off from R-devel back in March, many of which affected the documentation.

The previous release (4.1.3 in March) split from R-devel more than a year ago.

It's true that people recognize problems in the docs in releases, but that's not what they should be editing. Often problems there were dealt with months ago, and just weren't seen as important enough to backport to the R-patched branch so they never made it into a release.


Duncan Murdoch


The bigger impediment is that people who are good with documentation but not 
with code face a significant hurdle in learning how patches work. In these 
days where you can use the GitHub web interface to clone a repo, edit a file, 
and submit a pull request without ever leaving the web browser, tools like diff 
and patch seem excessive. Only fogeys/nerds like us view them as the fabric of 
computing.

Re git and empty directories... git is structurally incapable of recording them 
in the repo. A common workaround is to touch a .gitkeep file in the directory, 
but I suspect this will never become an automated feature of git because it 
cannot be hidden from the user without making the chosen filename off-limits to 
the user.


That seems like a design flaw, but a pretty easy one to work around.


Git only cares about the data in files, not how those files are identified. Attempting to 
pretend that a directory is identifiable by its content breaks that principle. There is a 
significant benefit from the implementation and user mental model perspectives associated 
with this shift, but if you don't utilize those benefits then git is probably not for 
you. It has been a worthwhile shift for many, many others though... so calling it a 
design "flaw" might seem innocuous but misses the significant value associated 
with that design principle for others.

Yes, the workaround is easy. But it cannot be hidden unfortunately, so 
automating it within the git software itself has so far been rejected since the 
idea that empty directories exist in git is anathema and pretending otherwise 
breaks the canonical mental model.
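
As a rough illustration of the workaround discussed above, the sketch below 
drops an empty placeholder file into every empty directory of a working tree 
so that git has something to record. The function name add_gitkeep and the 
default marker name ".gitkeep" are assumptions made for this example; the 
filename is only a convention and git attaches no special meaning to it.

```python
import os
import tempfile


def add_gitkeep(root, marker=".gitkeep"):
    """Create an empty marker file in every empty directory under root.

    Returns the sorted list of marker paths that were created.
    The marker name is a convention only (hypothetical default here).
    """
    created = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into git's own metadata directory.
        if ".git" in dirnames:
            dirnames.remove(".git")
        # A directory with no files and no subdirectories is empty.
        if not dirnames and not filenames:
            path = os.path.join(dirpath, marker)
            open(path, "w").close()
            created.append(path)
    return sorted(created)


if __name__ == "__main__":
    # Build a small throwaway tree containing one empty directory.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "doc", "empty"))
        with open(os.path.join(root, "doc", "README"), "w") as fh:
            fh.write("placeholder\n")
        for path in add_gitkeep(root):
            print(os.path.relpath(path, root))
```

Note that the marker is an ordinary, visible file in the tree, which is 
exactly the "cannot be hidden" limitation mentioned above.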

On April 30, 2022 9:06:53 AM PDT, Duncan Murdoch <murdoch.dun...@gmail.com> 
wrote:

On 30/04/2022 11:07 a.m., Jeff Newmiller wrote:

Generating patch files is one of the most fundamental capabilities of git. 
Changes to the Linux kernel are (almost?) universally submitted via patch files 
generated from git.


svn uses nearly standard patch files, but they record svn revision numbers.  
I'd guess directly applying a git patch file to an svn working copy would 
almost always work, but I'm not sure it would always be applied correctly in 
the case where the patch was created from rev X and applied to rev Y.
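
To make the point about revision labels concrete, here is a minimal sketch 
that builds a unified diff with Python's standard difflib module and puts an 
svn-style "(revision N)" label in the header. The file name example.Rd, the 
revision number 12345 and the helper make_patch are all hypothetical, and a 
real "svn diff" or "git diff" emits additional header lines that are not 
reproduced here.

```python
import difflib


def make_patch(old_text, new_text, path, old_label, new_label):
    """Return a unified diff between two versions of one file.

    old_label and new_label are placed after a tab on the '---' and
    '+++' header lines, which is where svn-style revision labels go.
    """
    return "".join(
        difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile=path,
            tofile=path,
            fromfiledate=old_label,
            tofiledate=new_label,
        )
    )


# Two versions of a hypothetical help file; the edit fixes one typo.
old = "\\name{example}\n\\title{An exmaple help page}\n"
new = "\\name{example}\n\\title{An example help page}\n"

# The revision number below is made up for illustration.
patch = make_patch(old, new, "example.Rd", "(revision 12345)", "(working copy)")
print(patch)
```

The hunk lines themselves only describe the textual change, so a tool such 
as patch locates them by context. The revision label in the header is 
informational, which is why applying a patch made against rev X to rev Y 
can succeed while still deserving a manual check.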

Re git and empty directories... git is structurally incapable of recording them 
in the repo. A common workaround is to touch a .gitkeep file in the directory, 
but I suspect this will never become an automated feature of git because it 
cannot be hidden from the user without making the chosen filename off-limits to 
the user.


That seems like a design flaw, but a pretty easy one to work around.

Duncan Murdoch

On April 30, 2022 7:45:40 AM PDT, Duncan Murdoch <murdoch.dun...@gmail.com> 
wrote:

On 30/04/2022 9:53 a.m., Patrick Schratz wrote:

      If that is the case, why not contribute to the documentation? That
      is the whole point of an open source project after all.

Because often it is not easily accessible, e.g. living in an ancient SVN repo 
or lacking an easy and clear contribution guide.


There's a mirror of that repo at https://github.com/wch/r-source .  It is of 
course unofficial and not maintained by R Core so I could understand you might 
worry about using it, but as far as I know it is well maintained.  The only 
difference that I ever heard about in the past was that the official svn repo 
had an empty directory somewhere or other, and git at the time didn't support 
empty directories.  I don't know if either of those is still true.

WRT the Mac dev instructions, I can see that the source lives in 
https://github.com/R-macos/R-mac-dev 
which is definitely a good start.
Yet I think it needs way more cross-linking between the repos, more “official” 
pointers and “how-tos” to really also encourage people to contribute.
The README could give more detailed contribution instructions, such as whom to 
tag for a PR, what should go there and what not, possibly stating that it’s the 
official documentation and distinguishing it from other “random” orgs on developer 
portals - all of these could e.g. go into a CONTRIBUTING.md which is a widely 
known source for such information.
Just some personal thoughts though which could potentially be considered to 
improve things.

To be clear, I acknowledge your effort in opening things up to platforms like 
GH - which not all parts of R/CRAN are doing at the moment AFAIK.
And yes, when complaining about things not being optimal, one should also put 
in effort to make things better.
So I’ll see if I can put some time in to improve things and see how the 
experience is.


If you're happier working in git than in svn, what you could do is fork the 
mirror repo to your own git repo, and make your proposed changes there.  If 
they are good changes it won't be hard for someone (maybe even you) to convert 
into the appropriate format to merge into svn.

The way R development changes is when a change makes things easier for the 
devs.  I suspect whether it's easier for you is only important to them if 
you've got a history of making helpful contributions: they like help, they 
don't like arguments about how to do things.  (I'm saying this as a former 
member of R Core.)

Duncan Murdoch


      The problem is that generally they cannot. You are looking something
      up, because you don't know about it so you can't judge whether it is
      a good answer (SO is a good example proving why crowd-sourcing the
      definition of truth doesn't generally work). At best you may know
      the person and thus judge by that, but even then you may not know if
      the information is still accurate.

I see your point here and generally agree that it’s hard making such judgements 
in this position.
Yet I disagree with referring to Stackoverflow as proof that “crowd-sourcing the definition of 
truth doesn't generally work”. Without SO, we would be nowhere near where we are today 
and I’d argue it has done a lot more positive things than negative ones for every 
single person who ever accessed it.
Cheers
Patrick

On 25 Apr 2022, at 1:04, Simon Urbanek wrote:

          On Apr 23, 2022, at 7:44 PM, Patrick Schratz
          <patrick.schr...@gmail.com> wrote:

          FWIW blog posts which explain such things usually have a (good)
          reason - they aim to help people getting started when the
          official documentation is either unclear, hard to find or
          incomplete.

      If that is the case, why not contribute to the documentation? That
      is the whole point of an open source project after all.

      The problem with random blogs is that many of them are written by
      people trying to find an answer without much knowledge on the
      subject and often post very bad advice that does not necessarily
      address the actual issue. There are rare exceptions of knowledgeable
      people posting explanatory blogs, but if you search for an answer
      you have no way of knowing whether it is of the good kind. In
      addition, blogs tend to get out of date quickly, so what used to be
      good advice may not be anymore (prime example was the R 4.0.0
      release which made a lot of the "hacks" obsolete and the well-meant
      advice out there has only led to more problems).

          It’s on the readers themselves to decide whether such blog posts
          are trustworthy or useful.

      The problem is that generally they cannot. You are looking something
      up, because you don't know about it so you can't judge whether it is
      a good answer (SO is a good example proving why crowd-sourcing the
      definition of truth doesn't generally work). At best you may know
      the person and thus judge by that, but even then you may not know if
      the information is still accurate.

          I have personally profited so often from blog posts of others
          already and therefore find the general advice to not consult
          such resources quite shortsighted.
          Of course the official documentation should always be the first
          point to have a look at - and in this case the required
          information would have been there.

          Apologies for going partly off-topic but I think this point is
          important.

      I agree. That's why I think it would be great if those that have the
      knowledge would help the community to improve the documentation. Of
      all the contributions to R it is the easiest.

      That said, I also agree that complementary information is very
      useful, in particular if it explains the "why" as well - which may
      be too far out of scope of the canonical documentation. In that case
      it is easier to spot mismatches, e.g., if it becomes out of date. It
      is not without precedent to reference such external documentation if
      it is maintained.

      Anyway, I'd like to encourage everyone to contribute - it may be
      pointing out issues in the documentation or by sending PRs with
      proposed updates or posting here. Some did in the past (like you,
      Jan or Bob, thanks!), but the more people contribute the better for the
      community. Often this may also uncover genuine issues that should be
      addressed rather than worked around (like the lack of the symlink in
      the gfortran-12 tar-ball discovered just this morning...).

      Cheers,
      Simon

          Cheers
          Patrick

          On 23 Apr 2022, at 2:13, Simon Urbanek wrote:

              For posterity - please always consult

              https://mac.r-project.org/tools/ (linked from CRAN)

              The old locations like libs* are no longer updated and have
              been deprecated in favor of /tools and /bin which are
              maintained for all builds. Similarly, I would strongly
              discourage following any advice from blogs as they tend to
              be outdated, wrong or both.

              Cheers
              Simon


_______________________________________________
R-SIG-Mac mailing list
R-SIG-Mac@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-mac

