On 13.04.2018 at 10:19, Stephen J. Turnbull wrote:
> Benjamin Franksen writes:
> > On 04/10/2018 08:34 AM, Stephen J. Turnbull wrote:
> > > Any user who understands what a ref is will say "a Darcs tag is too a ref!" I think.
> > Perhaps (but you won't, right?).
> I would, in the sense that it is a name that allows you to rebuild a version exactly, just as a git tag or branch does. It's not a ref into a DAG, of course.
That's what I meant.

> > > How do you identify "official"?
> > I can't, unless it's an "official" repo to start with (e.g. http://darcs.net/) and then I would assume that all branches are "official" (assuming darcs had branches).
> This is generally not true with git. In corporate situations, including large volunteer projects like Python or GHC, it probably is true. But in cases of smaller projects, or even projects with formal organizations that translated repos from centralized systems where public branches were an important form of communication, I would expect a lot of detritus.

But in principle it seems we agree that branches in a public, shared repo should not be used, nowadays, to publish e.g. some experimental development. There should be a better (clearer, more explicit) way to communicate/publish work that deviates from the shared baseline(s) of "officially accepted" branches in a project. The "fork" feature of GitHub is indeed a pretty good solution, except that it is tied to one central service. I think I need not reiterate here the reasons why depending on a single service is problematic, particularly if running such a service is subject to commercial interests. So I would like to have something more "distributed" in nature, similar to peer-to-peer file sharing, with one or more competing services that merely act as a directory for searching and discovering related repos (and, perhaps, communication, i.e. pull requests), and other services where repos and/or bug trackers are hosted. For a smooth user experience this would require a common protocol for all this higher-level information. Perhaps one day someone will develop something like that.

> Also, many projects make official "release branches". Python has several score by now. In the Mercurial days, each was a separate repo, but in git there's been substantial merging.
> I'm not sure if they've *all* been aggregated into one repo, but the backports policy suggests they might have, for convenience in cherry-picking.

Release branches are the prototype of what I meant by "official branches".

> > > I doubt they'd be willing to make "export all branches on clone" a default, and it's not clear to me that the "I just want to see the mainline" aren't the majority.
> > How do they identify "the mainline"?
> To the folks who just want the VCS to stay out of their way, it's "whatever $VCS clone scheme://project.org/official checks out."

This would not work for at least one project I am maintaining. I have several equally usable and maintained branches (versions). It is true that serious development happens only on the latest version, but due to its instability, potential contributors would most probably want to target a more stable release, because that is what they use. It depends on what kind of contributions are expected: occasional bug fixes and minor improvements, or substantial contributions of new features. So I maintain that, at least in /my/ experience, there are cases where this "one default mainline branch" is not appropriate.

> You mentioned "familiar and comfortable with Darcs". I don't think "comfortable" implies "familiar" (in the sense of how the internals work and how it differs from other VCSes the user may be comfortable with).

I meant "familiar" not with how the internals work but with the UI and the user-level idioms, i.e. the ability to use it effectively and with confidence (regarding the outcome of commands issued). (An exception for Darcs is the patch matching options, which are messy and interact in strange and sometimes unpredictable ways. Cleaning that part up has been on my TODO list for a long time.)

> I think it means (to most users) that the VCS stays out of their way.
I can't speak for "most users", but for me it is a way to structure my work, and thus I expect more than that it stays out of my way. There is no way around the fact that VC has a great influence on the way you work and how you share and co-operate with others, very much like the choice of programming language does. I have long since decided to embrace it and use it to improve what I am doing and the way I do it. I am using Darcs even for activities that have nothing to do with programming (e.g. I've written a short story once and as a matter of course I kept the text under Darcs control).

> > Indeed. At work we use Darcs for development of several medium-sized control systems for scientific instruments.
> Interesting to see that description. Sounds like what I would expect (notwithstanding the unfortunate experiment with git submodules, that kind of thing happens in the best-run organizations).

This was a different project (EPICS) that we use but of which we are not the main developers. It is another good example of maintaining a number of long-running branches (there are 3.14, 3.15, 3.16, and now 7), all of which may and do receive contributions, which, if appropriate, are forward- or backward-ported by the maintainers. This is not luxury but necessity: most users are pretty constrained w.r.t. manpower after the initial few years of building a facility. Switching to a new release is usually done only very carefully, step by step, if at all, because we have an experiment or machine to run 24/7, downtimes are scarce, and a lot of the equipment consists of prototypes that exist only this once, so you have to test and debug on the live machine (there is no other). This means users are normally interested in a particular branch and will target mostly that one branch when contributing. The maintainer's job is greatly simplified by Launchpad's pretty advanced bug tracker, which allows tracking the progress of a single ticket along several branches in parallel.
> > > This means that from an individual developer's point of view, the state of master is a triple: (1) what's actually in the official repo (unknown; another dev may have updated),
> > True. (Though it is easy to check if this is the case (hg incoming, darcs pull --dry-run, git <whatever>)).
> Your network is not run by the "MIT of Japan" (my employer, where the abbreviation *really* expands to "minimally informed technicians"), nor is it inside the Great Firewall (currently GitHub is blocked in China, I am informed). And it mostly matters in the last five minutes before a feature freeze. ;-)

You are right that comparing with the remote directly will fail if there is no connectivity. A problem if you are traveling a lot and work from planes etc., not something I am doing a lot, so I didn't think of that.

> > > I don't think this comparison is entirely accurate. All DAG-based systems permit cherry-picking and rebasing, although those like Mercurial and Bazaar do try to deprecate rebase. In git they are first-class operations.
> > Cherry-picking is an attempt to get the effect of patch commutation without paying the price. You get what you pay for: an ad-hoc solution that may or may not give you the results you expect.
> You are willing to say that in public after denying that Darcs has, or you even want, a semantic patch theory? ;-)

Oh, the idea of a semantic patch theory is certainly tempting, but I think it is not realistic today. Even if you succeed in finding the perfect formalism for a particular language, what about all the other languages that are used in a project, not to mention documentation in various formats, the build system, etc. etc. And then think about languages evolving, with various versions and/or option-enabled features.
> > Making sure the results are what you expect is tedious and error prone and I understand if people are nervous about it ("untested versions, gaah!").
> Every Darcs repo implies a number of untested versions which is potentially exponential in the number of patches. I have no idea in practice how many versions are typically generated by repeated obliteration respecting dependencies, but I imagine it's way larger than the number of versions actually subjected to formal testing. (I would guess properly tested versions are approximately linear in the number of patches).

You are absolutely right about that. There may be intermediate states that have never been tested, may not even compile, but so what? Software has bugs, test suites have holes, and implementing new features can be tricky and cause regressions. So you fix the bugs and add more regression tests. This is the same everywhere[1] regardless of VCS, and just because your VCS does not allow you to (easily, transparently) re-order changes and thus makes it more difficult to compose these other intermediate states (create a new branch that forks off in the past and so on) does not mean it is "better tested". If anything, I would claim the opposite: the fact that developers tend to work on slightly different histories actually improves test coverage, in that more intermediate states occur and thus more combinations are tested, reducing the likelihood of subtle interactions between different parts of a program causing bugs that are discovered only much later. (I have to admit that I can't provide evidence for this, not even anecdotal evidence.) It would be possible, theoretically at least, to design a special way to do integration tests with Darcs-managed projects by systematically generating and testing all possible intermediate states.
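As a toy illustration (my own model for this discussion, not Darcs code): enumerate all orderings of a handful of patches that respect a given dependency relation; every prefix of every valid ordering is a potential intermediate state that may never have been tested. With n mutually independent patches there are 2^n such states, while a full dependency chain collapses them to n+1:

```python
from itertools import permutations

def valid_orderings(patches, deps):
    # deps[p] = set of patches that must be recorded before p
    for order in permutations(patches):
        seen = set()
        ok = True
        for p in order:
            if not deps.get(p, set()) <= seen:
                ok = False
                break
            seen.add(p)
        if ok:
            yield order

def intermediate_states(patches, deps):
    # every prefix of every valid ordering is a version that some
    # repo could contain, whether or not anyone ever tested it
    states = set()
    for order in valid_orderings(patches, deps):
        for i in range(len(order) + 1):
            states.add(frozenset(order[:i]))
    return states

# four fully independent patches: 2**4 = 16 reachable states
print(len(intermediate_states("ABCD", {})))  # 16
# a linear chain A -> B -> C -> D: only the 5 prefixes exist
chain = {"B": {"A"}, "C": {"A", "B"}, "D": {"A", "B", "C"}}
print(len(intermediate_states("ABCD", chain)))  # 5
```

The brute-force enumeration itself already hints at the feasibility problem: the number of valid orderings grows factorially when dependencies are sparse.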
In principle, such systematic testing /could/ uncover "sleeping bugs", but I guess what it would uncover mostly is missing explicit ("semantic") dependencies that developers forgot (or couldn't be bothered) to add. Feasibility is another matter, though; as you noted, this will probably run into exponential blow-up.

[1] Even if you have a body of formally specified and verified code, there may be bugs in the specification, and new features mean you have to make changes to the spec etc.

[re-ordered from below:]

> > > [In scaling Darcs, s]torage blows up, the naming conflicts will be frequent unless you're willing to endure network outages and delays, and URLs for personal repos are often long and/or unintuitive.
> > Yes, storage blow-up is a problem, and another one is discoverability, which is why I want to add branches to Darcs. I don't understand what you mean with "naming conflicts will be frequent".
> Names like "test", "new", and in some cases feature names are likely to be used independently across personal repos.

Oh yes, I would expect lots of those in practice (I am always at a loss as to how to name my Darcs repos/branches and often resort to silly names like that). One solution could be to mark branches meant for public consumption explicitly and treat all others as private.

> > > does patch algebra allow you to avoid some conflicts that would occur in a DAG-based system?
> > Some, yes. It depends a lot on the foundation, i.e. the concrete implementation of your patch algebra. It also depends on how conflicts are detected in the DAG-based system.
> I don't know of any DAG-based systems with a substantial advance over patch(1).

Okay, noted.

> > But that is not the main point. The main point is that the patch algebra frees you from having to worry about history, /except/ when it is relevant, i.e. when patches have dependencies.
> But isn't that costly when you are trying to localize a bug by testing which versions exhibit it? When bisection works in a DAG-based system, you have a logarithmic upper bound on search time. (Also when it fails, you find out in logarithmic time.) It's not obvious to me that you get that result in Darcs since its "mainline" is fundamentally nonlinear.

As I noted at the beginning of our exchange, the repo contains the patches in a specific order, and this is the order that 'darcs test' uses. So for this we temporarily forget about possible re-orderings. This works because you are normally not interested in the exact set of versions that fail; instead, what matters is the (one) change that marks the place where behavior goes from good to bad, because it is likely this change where the error was made.

> > > If not, what is the great advantage of patch algebra from your point of view? Is it related to the ability to claim the same branch identity for two workspaces that "haven't diverged too much", where a git rebase in a published branch all too often results in an unusable mess of conflicts?
> > Well, my experience tells me that "an unusable mess of conflicts" can happen with Darcs in just the same way.
> I don't think it's "just the same way". My point is that a rebase changes the "identity" of a branch in a nonlinear way because it's version-based. In Darcs (at least in theory) you can walk forward applying the patches and fixing conflicts one patch at a time. (I guess this is exactly what "darcs rebase" implements.) True, in Darcs a megapatch can do you in, but *every* git rebase is a megapatch!

Hm, is that necessarily so? I would have thought this is only the case if you 'squash' all the intermediate commits into one; AFAIR git rebase -i allows you to proceed more subtly.
But perhaps you refer to a certain way of using rebase, as the Linux kernel devs seem to prefer, where new features are always squashed into a single big commit?

> > When I pull a patch from your repo and it doesn't conflict, I have enlarged the intersection and reduced the (symmetric) difference. When I repeat this, and also push, and everything merges cleanly, then our repos are semantically identical, period. I just don't have to care about the order, either one is fine.
> This is a useful explanation! Thanks.

This is indeed the essential motivation behind patch theory. I should perhaps add that this universal property is (and must be) maintained even in the presence of conflicting patches. Suppose our patch sets are equal except for one patch A in my repo and one patch B in yours, where A and B conflict. If I pull your patch (allowing conflicts) and push mine (also allowing conflicts), then we still have equivalent repos with the same resulting (pristine) tree, even though in your repo the order is [...,B,A] and in mine it is [...,A,B]. This tree is missing all primitive changes in A and B that conflict (the "automatic" resolution is to remove both changes). Trouble ensues when we both manually resolve the conflict (re-adding one or the other change in a suitably modified form, or a mixture of both), because these two resolution patches will inevitably conflict again; so conflict resolution requires co-ordination between developers. The standard work-flow (supported by the default that allows conflicts when pulling but not when pushing) is to push only after resolving conflicts. /How/ to maintain all the required patch properties in the presence of conflicting patches (including an efficient representation of conflicts) is a pretty complicated matter.
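To make the scenario above concrete, here is a small toy model (my own illustration, nothing like Darcs' actual internals): a repo is an ordered list of patch names, each patch carries a set of primitive changes, the resulting tree is the set of surviving changes, and the automatic resolution drops every change belonging to two conflicting patches that are both present. The orders [...,A,B] and [...,B,A] then yield the same tree:

```python
# Toy model of patch-set semantics. The names and change strings
# below are invented for illustration.
changes = {
    "base": {"create file f"},
    "A":    {"f line 5 -> 'foo'"},   # my patch
    "B":    {"f line 5 -> 'bar'"},   # your patch, conflicts with A
}
conflicts = [("A", "B")]  # pairs of patches that conflict

def tree(order):
    surviving = set().union(*(changes[p] for p in order))
    # automatic resolution: drop every primitive change of two
    # conflicting patches that are both present in the repo
    for p, q in conflicts:
        if p in order and q in order:
            surviving -= changes[p] | changes[q]
    return surviving

mine  = ["base", "A", "B"]   # I recorded A, then pulled your B
yours = ["base", "B", "A"]   # you recorded B, then pulled my A
assert tree(mine) == tree(yours) == {"create file f"}
```

The point of the model is only that the result is a function of the patch *set*, not of the order; the hard part Darcs actually solves (representing conflicts so that all patch properties still hold) is exactly what this sketch glosses over.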
Also, for fairness, I should mention that the darcs-2 patch format is unsound as it stands: there are situations where some of the properties are violated -- not the one about equal patch sets resulting in equal trees, AFAIK, but certain others, which can lead to observable strangeness in Darcs' behavior and even crashes. This is due to an improper handling of /duplicate/ patches, which do not necessarily conflict and instead are represented using a special patch type. Fixing this is possible (thanks to work by Ian Lynagh, which also fixes the remaining cases where we run into exponential blow-up). But any such fix will be incompatible with the current patch representation, so it will have to wait until we are ready to release darcs-3 with a new patch format.

> > > This is what happens in git now, except that you are able to set your own defaults in .git/config, and provide aliases for URLs (the remotes). You can argue that remotes provide more confusion than convenience if you like, but several years of experience have shown that for the vast majority of git users it's the other way around.
> > You sound so confident when you say that. As if the git we have today was the result of incorporating years of user feedback. OTOH you keep telling me that git is the way it is because the developers have made and still make it for their own good, primarily. And that the UI is more or less frozen because Mr. T. said so many years ago.
> There's no conflict between itch-scratching and Mr. T's decrees on one hand, and general user satisfaction with the remote feature on the other. Unless you're doing something tricky, the workflow in most projects is pretty simple: go to GitHub, fork the official repo to your account on GitHub, clone your fork to your workstation, make branches for each "piece of work" (defined by the project leadership), push them to your fork when done and submit a "pull request".
> Management of remotes in this scenario is completely transparent to the ordinary contributor: "clone" does all the work.

This is not how it works in practice for me. I first clone the public repo locally, build it and experiment with it. Perhaps later I find a bug, fix it, and want to contribute my patch. Oh noes, I made the patch without forking the repo first. Okay, back to GitHub, fork. (Then comes the point at which I can't remember what the ssh URL of my fork must look like. I don't want to know how often I googled for that particular strangeness; I think one has to say g...@github.com:/user/repo or something like that.) Now I have to re-configure my local repo so that it works with the right remote. Then I can finally push, and then I must make a pull request, using the crappy text editor in the web interface. You may find this easy and natural, but I guess that is because you did it often enough that you don't notice how unnecessarily complicated all this is.

The usual Darcs work-flow is like this: you clone the repo, play with it, record an improvement and 'darcs send' your patch, and that's it. (You can add a text to the description of the patch bundle you are sending in your favorite text editor.) On the other end, the maintainer applies the patch in a local clone and decides whether to accept it or not. If it turns out your patch conflicts (e.g. you forgot to pull before recording your change), or there are other things to criticize, she'll probably ask you to re-send an amended version; or perhaps does that herself.

BTW, I have read that Linux kernel development shuns GitHub because it doesn't scale; they use mailing lists instead. That would fit nicely with the Darcs work-flow I described here :-)

> > > This is not true for branches. "Colocated branches" (ie, the many branches per repo model) do seem to cause confusion. My guess is that a Darcs-with-branches would have the same problem.
> > I hope we can avoid that.
> Perhaps you can.
> It will depend on how many users with a "centralized VCS" mindset you attract. I'm not sure of whether that mindset is "organic", or whether it's a matter of experience with centralized systems. (The canonical example is Richard "I'm a genius hacker and I've always committed directly to the production repo" Stallman, who obviously had decades of experience with RCS and CVS before Emacs switched to Bazaar, and then git. As people who grew up with DVCS become the overwhelming majority, perhaps that mindset will just f-f-f-fade away, as Peter Townsend sang.)

I fear people won't understand DVCS any better because they got hooked by GitHub.

> > > In context, "short-lived deviation" is exactly the sense I meant: in case of a merge with way too many conflicts, you want to "rollback" to the pre-merge state.
> > But doesn't this lose the changes you made?
> Which changes? First, there should be no uncommitted changes in the workspace when the merge is started. If there are, commit them (perhaps to another branch).

Of course; I meant the changes you recorded/committed.

> Second, if you've fixed a few files before discovering the mess, you can commit them separately to an appropriate branch (usually your mainline).

How do I commit a file separately to another branch?

> You'll have to redo the merge, but for those files you always choose your existing fixed version.
>
> Perhaps it's not as good as it could be but you don't need to lose work. I grant that this is *not* the image you would get from "rollback to the premerge state", but in my experience it's usually pretty obvious when you've got a mess before trying to fix it, so that's the majority of cases anyway.

Okay.

> > In the situation where I have complicated conflicts, I usually use 'darcs rebase' to resolve them one patch at a time.
> > The work-flow is like this: you say 'darcs rebase pull', which suspends any local patches that conflict with remote ones. [....]
> > My experience is that it is much easier to resolve complicated conflicts in this step-wise fashion.
> This sounds like the optimization I obliquely referred to above.

> > If you had unrecorded changes you are out of luck:
> There's no good reason for having unrecorded changes in any of the DAG-based systems. They all provide stash or something like it. I can't see any reason for it in Darcs, either: a record followed by an immediate reversion patch is effectively a stash, if Darcs doesn't already have that feature.

Yes, of course. But Darcs doesn't force you to commit changes before you pull, so it is possible to forget to do it, which is why I recommend using --no-allow-conflicts as the default for pull, too.

> > > Sigh. This simply isn't true. *The DAG is immutable.*
> > Ah, I never doubted that the DAG remains consistent in itself. What I meant is the consistency of the changes to your tree. For instance, if you use cherry-picking to re-order changes, can you be sure that after picking all the commits in a branch the resulting tree will be the same as in the original? I don't think so.
> You can be sure in the same circumstances as in Darcs: when the cherry-picking involves no manual resolution of conflicts.

I am pretty certain that this is wrong. Relevant reading includes http://r6.ca/blog/20110416T204742Z.html with discussion on reddit: https://www.reddit.com/r/programming/comments/grqeu/git_is_inconsistent/ (It's been a while since I looked closely at these examples, so perhaps they are out-dated.)

> > Assuming a modernized version of Darcs with in-repo branches, better (guaranteed to be efficient, i.e.
> > polynomial, ideally linear) conflict handling, and a more efficient representation of binary hunks: yes, I think [managing the Linux kernel or GCC with Darcs] would be possible and would actually work better than git.
> Good luck! I hope you have the time and the help to get there. (I don't have time to learn enough Haskell for the foreseeable future.)

Too bad; I would probably enjoy working with you.

> > I still find it interesting that in Darcs I never missed remote tracking branches yet.
> I don't see why you would, since Darcs forces you to manage it manually anyway. That is, the only way you can keep a mirror of the "official" repository's state is by keeping a pristine repository, as you describe below.[1] Keeping "pristines" is the way I have historically managed my Mercurial and Bazaar projects, and still do for those projects still using Mercurial (all my remaining Bazaar projects are now sufficiently stable that I just work in the pristine for my decennial patches ;-).
>
> Otherwise, you just depend on network connectivity, and pull directly into the working copy, or diff against it.

I think I see how, with in-repo branches, it would become natural to keep one or two untouched copies of the remote for comparison when working offline, e.g. one for where you started and one for closely tracking the remote branch.

[re-ordered from below:]

> [1] Theoretically you could use tags, but that would be difficult in Darcs without cooperation from the official repo, AFAICS.

Not at all. You can record tags ad libitum for your own purposes. So instead of keeping the where-I-started branch you could just add a tag. I often do this before making complicated or far-reaching changes. Just make sure you don't accidentally push it...

> > I guess the work-flow with Darcs is just different enough that some concepts (or problems) simply do not transfer naturally.
> I think so.
> I think some of them will arise in a multibranch version of Darcs, though.

> > > No, that default is only for a clone, and it's whatever is checked out in the source repo, which is usually "master" for a public repo.
> > But this is horrible.
> Not in practice. :-)

Only because maintainers of public repos know about this and are careful not to check out other branches in such repos. You will have a hard time convincing me that this was a good design decision. It conflates a purely local setting (what is checked out, i.e. what I currently work on) with the publicly visible interface (the default for what you get with clone). Mercurial does this differently: what you get when cloning is independent of what is "checked out", and there is no need at all for bare repos. I did some googling and found this on Stack Overflow: https://stackoverflow.com/questions/8952865/mercurial-set-a-branch-as-the-new-default-branch which, if it tells the truth, means that it behaves a bit more like how I think it should. Which is, assuming that you really want to support a single 'default' branch, to make this an explicit setting independent of most of the other operations on the repo.

> > "Whatever is checked out in the source repo" is completely unpredictable (unless you make sure it is a bare repo so nobody would check out anything there).
> There still needs to be a HEAD (which is what determines what is checked out). In any formalized workflow, it will be a bare repo, so I'm not sure you would experience any problem.

And I thought git was supposed to shine for the "chaotic" kernel dev style where everyone clones and merges everyone else's repos?

> > > > What about the sharing with colleagues? [...] You really want a third repo in between upstream and local for that.
> > > Yes, as I describe above these days it's typically on GitHub.
> > Unacceptable in many companies. Also unnecessarily slow, etc. etc.
> Sure, but it's trivial to create one in-house: any git repo reachable by network will do. Maintaining and managing that is *non*-trivial; that's why GitHub is so successful, they're darn good at automating that stuff. But it's not *that* hard to create a reasonable workflow, easier to teach it, and only the gatekeepers need to know the necessary operations for acquiring and merging contributions.

Oh, I am sure it can be managed, but I would still prefer it to be simpler. I /am/ one of the "gatekeepers" at work (not in any formally recognized way, though), which is one reason why we still use Darcs ;-)

> > Let's drop [the discussion of what's a URI] and agree to disagree.
> OK, but remember you're also disagreeing with RFC 3986. :^)

I guess you won that round ;-)

> > > > You said earlier that git represents a submodule as a tree object that is itself a commit. But it cannot be the commit that represents the current (pristine) tree in the submodule, else I could not make a commit in the submodule (or pull there) without making a commit in the containing repo/branch.
> > > I'm not sure what you mean by this.
> > I am trying to understand how submodules work in git. So I have a subdir "bar". The tree referenced by the current commit (of the supermodule) has an entry for "bar" and its content object is not a file but another commit. So suppose I pull a different commit inside the submodule. Would that not mean that the supermodule needs to change, too, i.e. refer to this new commit instead of the old one? But that cannot be, since the commit of the supermodule is immutable.... ahh, I think I do understand: git will show me this update as an uncommitted change! I can commit it in the supermodule and then it "officially" refers to this new commit of the submodule. Correct?
> Exactly.

Good! It makes a lot of sense to do it like that.
You track the changes in a subrepo not at the level of files and directories, but at the much coarser level of commits.

Doing the same in Darcs is a bit more difficult. If we regard the (pristine) state of a subrepo as a set of patches, then the (primitive) patches that modify this state consist of a set of abstract patch hashes to remove and another such set to add, quite similar to a file hunk. This requires more storage space than in git, but I guess it's not an unreasonable amount. If we add an internal data structure where we can look up the meta data given its hash, this lets us display the difference that the subrepo-patch represents in a way similar to 'darcs pull --dry-run' plus 'darcs push --dry-run' (which is how you compare the pristine states of two Darcs repos). One shortcoming of this approach is that we cannot (efficiently) support either --verbose here (which, in Darcs, means to output not only the meta data but also the content of the patch), nor --summary (which summarizes the affected files and how they are changed, similar to 'git status'). These options require access to the patch content, and that depends on context, i.e. the order of patches. For the same reason, the set of abstract patch hashes is not sufficient to reconstruct a repo; you need a 'real' repo to actually pull the patches from. In fact, for the sketched approach to work efficiently when initializing a subrepo, we probably need an optimized representation of the pristine repo state. I have a few ideas for that, but the details are probably not of interest in the context of our discussion.

The UI would be less configurable and more automatic than the one git has. For instance, when I apply a patch that modifies a subrepo, I would want the subrepo to be updated automatically (obliterating patches to be removed, then pulling patches to be added). Similarly, when cloning a repo I want the subrepos to be initialized and updated, too.
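A minimal sketch of the hash-set representation described above (a hypothetical design; all names are invented for illustration): the pristine state of a subrepo is a set of abstract patch hashes, and a primitive subrepo-patch is a pair of hash sets (removals, additions), invertible by swapping the two sets, just like a hunk:

```python
from hashlib import sha256

def patch_hash(meta):
    # stand-in for Darcs' abstract patch identifier: a hash of the
    # patch meta data (here just a short hex digest of a string)
    return sha256(meta.encode()).hexdigest()[:12]

def apply_subrepo_patch(state, removals, additions):
    # a subrepo-patch applies only in a state that contains all the
    # hashes to remove and none of the hashes to add
    assert removals <= state, "patch does not apply: missing hashes"
    assert not (additions & state), "patch does not apply: already present"
    return (state - removals) | additions

h = patch_hash
old = {h("fix typo"), h("add feature X")}
# replace the feature patch by an amended version
new = apply_subrepo_patch(old, {h("add feature X")}, {h("add feature X v2")})
assert new == {h("fix typo"), h("add feature X v2")}
# inverting a subrepo-patch swaps the two sets, as with hunks
assert apply_subrepo_patch(new, {h("add feature X v2")}, {h("add feature X")}) == old
```

As noted in the text, the hash sets alone cannot reconstruct the subrepo's contents; applying such a patch for real would mean obliterating and pulling the corresponding patches from an actual repo.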
To associate subrepos with URLs we can use a file similar to .gitmodules (the actual name can be a pref setting, as for the boringfile), but with the difference that a subrepo is identified by a UUID, not a human-readable name. (I think I'll add a ticket to our tracker or an entry to the wiki with a copy of this design sketch.)

> It's really complicated. This is one of those features where "if you don't know (1) *why* you need it (what specific workflow issues it addresses) *and* (2) *how* you will modify your workflow to address those issues using this feature, YOU DO NOT NEED IT and YOU WILL BE SORRY if you try it anyway." :-)

Probably. Nevertheless, I learned that with git understanding how the machinery works is the best approach, and since I had my "aha" above, submodules have lost a lot of their scariness for me. Do you know the reason why 'git clone' does not initialize submodules? Is this for backward compatibility, or because 'git submodule update --init' is not deemed a good enough default?

Cheers
Ben

_______________________________________________
darcs-users mailing list
darcs-users@osuosl.org
https://lists.osuosl.org/mailman/listinfo/darcs-users