Re: [PATCH RFC] repo: add an ability to hide nodes in an appropriate way

Pierre-Yves David Thu, 30 Mar 2017 07:22:19 -0700

On 03/27/2017 07:24 PM, Augie Fackler wrote:

On Mon, Mar 27, 2017 at 10:27:33AM +0200, Pierre-Yves David wrote:



On 03/27/2017 12:32 AM, Kostia Balytskyi wrote:

# HG changeset patch
# User Kostia Balytskyi <ikos...@fb.com>
# Date 1490567500 25200
#      Sun Mar 26 15:31:40 2017 -0700
# Node ID 43cd56f38c1ca2ad1920ed6804e1a90a5e263c0d
# Parent  c60091fa1426892552dd6c0dd4b9c49e3c3da045
repo: add an ability to hide nodes in an appropriate way

A potentially frequent use case is to hide nodes in the way most appropriate
for the current repo configuration. Examples of things which use this
are rebase (strip or obsolete old nodes), shelve (strip or obsolete
temporary nodes) and probably others.
Jun Wu had an idea of having one place in core where this functionality
is implemented so that others can utilize it. This patch
implements this idea.


I do not think this is a step in the right direction. Creating obsolescence
markers is not just about hiding changesets.


But we're using them for that today, all over the place.

Actually, we do not use them all over the place. Current marker usages in core are:


amend:
 - record evolution history
 - prune the temporary amend commit
   (that I would rather see disappear altogether)

rebase:
 - record evolution history on successful completion

histedit:
 - record evolution history on successful completion
 - prune temporary nodes on successful completion
   (These used to be stripped even with evolution)

(let us talk about shelve later)

So they are mainly used to record evolution history between "real-changesets" on successful operation. This is the intended usage of obsolescence markers.

In addition, they are also used in a couple of places to hide temporary changesets (in the "utility-space") after an operation succeeds. This is a somewhat unorthodox use of obsolescence markers, but it is "okay" to use them in these cases because we know:

 1) this is utility-space, so users never interact with them.

 2) we only create them for completed, successful operations, so we'll never need them again.

Strictly speaking we could just strip these temporary commits (and we have done so in the past) and this would not change anything for the user. Any other hiding mechanism aimed at "internal temporary" commits would do the job just fine. The internal commits never need to (and actually should never) leave the user repository.

In practice, we could use in-memory commits for most of these temporary commits (if not all) and never have to deal with hiding them.

We're also having pretty productive (I think!) discussions about non-obsmarker
based mechanisms for hiding things that are implementation details of
a sort (amend changesets that get folded immediately, shelve changes,
etc).


(Let me be a bit long, to try to be clear and comprehensive.)

I can see three categories of things we want to hide:

:obsolete changeset (evolution semantic):

A rewrite operation has created/recorded a new version of this changeset. The old version no longer needs to be shown to users (if possible). There are strong semantics associated with such changesets, and the property will be propagated to other repositories.


:temporary/internal changesets:

Such changesets are created as a side effect of other operations (amend, histedit, etc.). They have the following properties (once we are done using them):


  * They are irrelevant to the users,
  * We should never-ever base anything on them,
  * We should never-ever access them ever again,
  * They should never-ever leave the repository.
  * They never stop being internal-changesets

:locally-hidden changeset:

A hypothetical (not yet used) local-only hiding mechanism (similar to strip but without actual repo stripping). This could be used for local hiding/masking of changesets in the "real-changeset" space. Such data would not be exchanged. The commit reappears when "recreated" or re-pulled.


---

To take practical examples:

:amend:
  - amend source: "obsolete-changeset"
  - temporary changeset: "internal-changeset"

:rebase:
  - rebase source, on success: "obsolete-changeset"
  - rebase result, on abort:  "local-hiding"


:histedit:
  - histedit source, on success:         "obsolete-changeset"
  - histedit temporary node, on success: "internal-changeset"
  - histedit result, on abort:           "local-hiding"
  - histedit temporary node, on abort:   "local-hiding (internal-node)"
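
For reference, the classification above can be written down as a small lookup table. This is only an illustrative sketch: `HideKind`, `CATEGORY` and `is_exchanged` are hypothetical names invented here and are not part of Mercurial.

```python
from enum import Enum


class HideKind(Enum):
    """The three categories of hidden things discussed in this thread."""
    OBSOLETE = "obsolete-changeset"   # evolution semantic
    INTERNAL = "internal-changeset"   # temporary/internal
    LOCAL = "local-hiding"            # local-only, strip-like


# (operation, node role, outcome) -> category, transcribed from the list above.
# The list gives no outcome for amend, so None is used there.
CATEGORY = {
    ("amend", "source", None): HideKind.OBSOLETE,
    ("amend", "temporary", None): HideKind.INTERNAL,
    ("rebase", "source", "success"): HideKind.OBSOLETE,
    ("rebase", "result", "abort"): HideKind.LOCAL,
    ("histedit", "source", "success"): HideKind.OBSOLETE,
    ("histedit", "temporary", "success"): HideKind.INTERNAL,
    ("histedit", "result", "abort"): HideKind.LOCAL,
    ("histedit", "temporary", "abort"): HideKind.LOCAL,
}


def is_exchanged(kind):
    """Only obsolescence data is meant to propagate to other repositories."""
    return kind is HideKind.OBSOLETE
```

Written this way, it is easy to see that only the "on success" sources carry data that leaves the repository.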

extra notes:

* If local hiding (strip) took care of obsolescence markers, rebase and histedit could create them "as they go" instead of at the end, "on success".
* In rebase --abort and histedit --abort, strip is used on freshly created changesets, so its performance impact is limited.
* We use "local hiding" (strip) on temporary nodes on histedit abort because we need to be able to reuse them if histedit is rerun. So we cannot throw them into oblivion just yet.


---

regarding implementations

"obsolete-changesets": we already have a solid mechanism for that.

"internal-changesets" have very constrained properties, which is what allows us to abuse obsolescence markers for them. It would be quite easy and fast to build an ad-hoc mechanism for them and stop abusing obsmarkers (for example a simple bitmap, or maybe some "internal-*" phases). Having dedicated tracking of internal changesets helps enforce their properties and simplifies the obsmarker space.

"locally-hidden-changesets": this is trickier. For now the only usable mechanism we have for them is stripping. We cannot use obsolescence markers for them, since obsolescence markers build a global, persistent and distributed state and we are explicitly aiming at staying light and local in this case (I can elaborate on that last bit if you think it is needed). Having or not having node-version does not make a difference here. Local hiding does not want to have any global side effect, and creating obsmarkers will always result in a global side effect.

Having a local-hiding mechanism is actually not too far away. It requires a couple of things:

1) applying the hiding,
2) storing the data,
3) automatic lifting of the hiding.

(1) is already here for free. As explained during the sprint, the API for hidden things has been flexible enough from the start to be fed by mechanisms other than just obsolescence.

(2) should not be too complicated: either a bitmap approach or a root-based approach (bitmap is probably more solid).

(3) probably already exists in Jun's "archived" experiment. We have to make sure recommitting/re-pulling a locally-hidden changeset makes it available again.
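
To make the three pieces concrete, here is a toy sketch of how they could fit together. Everything in it is an assumption made for illustration: `LocalHidden`, `on_nodes_added` and `compute_hidden` are hypothetical names, a plain Python set stands in for the bitmap, and none of this is Mercurial's actual filtering API.

```python
class LocalHidden:
    """Toy local-only hiding store keyed by node id (hypothetical)."""

    def __init__(self):
        # (2) storing the data; a real store might be an on-disk bitmap.
        self._hidden = set()

    def hide(self, nodes):
        """Record nodes as locally hidden; nothing here is exchanged."""
        self._hidden.update(nodes)

    def on_nodes_added(self, nodes):
        """(3) lift the hiding when a node is recommitted or re-pulled."""
        self._hidden.difference_update(nodes)

    def hidden(self):
        return frozenset(self._hidden)


def compute_hidden(obsolete_hidden, local_hidden):
    """(1) applying the hiding: feed the filter from more than one source."""
    return frozenset(obsolete_hidden) | local_hidden.hidden()
```

The point of the sketch is that the local store never creates markers: dropping a node from the set is enough to make it visible again, with no global side effect.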


Overall, this is not too much work.

One extra thing to take care of is obsmarkers. When we strip a changeset we want to be able to strip the directly related obsmarkers. Strip is not currently able to do so, but it will be soon, and we have a clear technical path here. So non-destructive "hiding" of changesets would need its counterpart for obsmarkers before we can use it in cases where obsmarkers are involved.

It is about recording the
evolution over time of changesets (predecessor ← [successors]). This data is
used in multiple places for advanced logic, exchanged across repositories, etc.
Being too heavy-handed on stripping will have problematic consequences for
the user experience. Having an easy interface to do such heavy-handed
pruning will encourage developers to make such mistakes.


I'm struggling a bit to understand your argument here. I think your
argument is this: hiding changes can be done for many reasons, and
having a simple "just hide this I don't care how" API means people
won't think before they do the hiding. Is that an accurate
simplification and restating of your position here? (If no, please try
and restate with more nuance.)


My point goes a bit further than that. Let me restate:

* The purpose of obsolescence markers is to (globally) track the evolution of changesets through rewrites,

* The fact that obsolete changesets get hidden is just an (intended) consequence of this tracking,

* Creation of new obsolescence markers should be done after careful consideration and weighing.[1]

This current proposal goes in the wrong direction with regard to the above. Creating a shiny "This hides nodes" hammer that uses obsmarkers underneath is misleading. It will encourage developers to create obsolescence markers in cases where it is inappropriate to do so, because "hide" is the wrong semantic for markers.

As a general rule, a function creating markers without taking a (predecessors → successors) mapping is a strong hint that this function is not approaching obsolescence markers in an appropriate way.

[1] e.g. we allow ourselves to use obsolescence markers for some internal changesets because we have carefully thought about it and decided it won't be problematic. I would rather not do it at all.
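
The difference between the two API shapes can be sketched as follows. Both functions are hypothetical stand-ins written for this illustration (a plain list plays the role of the marker store); they are not Mercurial functions.

```python
def record_rewrite(markers, mapping):
    """Append one (predecessor, successors) marker per rewritten node.

    `mapping` is {predecessor: iterable of successors}, so the caller has
    to say what each node evolved into.
    """
    for pred, succs in sorted(mapping.items()):
        markers.append((pred, tuple(succs)))
    return markers


def hide_nodes(markers, nodes):
    """Node-list shape: it can only ever record 'no successors'.

    The evolution history is lost because the caller never supplied it.
    """
    for node in sorted(nodes):
        markers.append((node, ()))
    return markers
```

With the first shape, a rebase of "old" into "new" yields a marker that still says where "old" went. With the second, the same call site can only produce a prune-like marker, whatever the real history was.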

The fact this function only takes a list of nodes (instead of a (prec, succ)
mapping) is a good example of such simplification. The series from Jun about
histedit is another. Obsolescence markers are not meant as a generic
temporary hiding mechanism for local/temporary nodes and should not be used
that way.


In broad strokes, I agree here. Have we made any conclusive progress
on a mechanism for archiving changes? The reason we're seeing all
these "abuses" of obsolete markers is that they're currently the only
append-only means by which things can be discarded, and there are (as
you know) ugly cache implications to running strip.

Strip is far from ideal. However, multiple mitigation mechanisms exist today to limit its impact on caches, and there is much low-hanging fruit to improve this mitigation.

In addition, changeset filtering currently also has some bad impact on caches, so these problems do not simply go away if we stop using strip. There is also much low-hanging fruit to mitigate this impact.

(I'm sort of +0 on this series, but I'd be more enthusiastic about
having a proper hiding mechanism in place that isn't obsmarkers before
we go too far down this road.)

To summarize my rationale above, the proposed function misuses the semantics of obsolescence markers and will lead to problematic usage of them. So I'm -1 for it.

I'm not just bringing this up out of hypothetical fear. In the past years I've seen (and prevented) people misusing obsolescence markers for histedit-abort, rebase-abort, and histedit-abort again, etc. (we just fixed a regression as this thread was starting).

I hope this long message helps to clarify the various concepts. We have ways forward to reduce the use of stripping without abusing the obsolescence concept in a way that will create issues for users. These ways forward are within reach and would not take too long to build.


Cheers,

--
Pierre-Yves David
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
