On 03/27/2017 07:24 PM, Augie Fackler wrote:
On Mon, Mar 27, 2017 at 10:27:33AM +0200, Pierre-Yves David wrote:


On 03/27/2017 12:32 AM, Kostia Balytskyi wrote:
# HG changeset patch
# User Kostia Balytskyi <ikos...@fb.com>
# Date 1490567500 25200
#      Sun Mar 26 15:31:40 2017 -0700
# Node ID 43cd56f38c1ca2ad1920ed6804e1a90a5e263c0d
# Parent  c60091fa1426892552dd6c0dd4b9c49e3c3da045
repo: add an ability to hide nodes in an appropriate way

Potentially frequent usecase is to hide nodes in the most appropriate
for current repo configuration way. Examples of things which use this
are rebase (strip or obsolete old nodes), shelve (strip of obsolete
temporary nodes) and probably others.
Jun Wu had an idea of having one place in core where this functionality
is implemented so that others can utilize it. This patch
implements this idea.

I do not think this is a step in the right direction, creating obsolescence
marker is not just about hiding changeset.

But we're using them for that today, all over the place.

Actually, we do not use it all over the place. Current markers usages in core are:

amend:
 - record evolution history
 - prune the temporary amend commit
  (that I would rather see disappear all together)

rebase:
 - record evolution history on successful completion

histedit:
 - record evolution history on successful completion
 - prune temporary nodes on successsful completion
   (This ones used to be stripped even with evolution)

(let us talk about shelve later)


So they are mainly used to record evolution history on successful operation (their intended usage) between "real-changesets". This is the intended usage of obsolescence markers.

In addition, they are also used in a couple of place to hide temporary changesets (in the "utility-space") after an operation succeeed. This a bit an unorthodox uses of the obsolescence markers, but is "okay" to use them in these case because we know:
 1) this is utility-space so user never interact with them.
2) we only create them for completed successful operation so we'll never need them ever again. Strictly speaking we could just strip these temporary commits (and we have done so in the past) and this will not change anything for the user. Any other hiding mechanism aimed at "internal temporary" commit would do the job just fine. The internal commit never needs (and actually should ever) to leave the user repository. In practice, we could use in memory commit for most of these temporary commit (if not all) and never have to deal with hiding them.

We're also having pretty productive (I think!) discussions about non-obsmarker
based mechanisms for hiding things that are implementation details of
a sort (amend changesets that get folded immediately, shelve changes,
etc).

(Let me be at be long to try to be clear and comprehensive)

I can see three categories of things we want to hide:

:obsolete changeset (evolution semantic):

A rewrite operation has created/recorded a new version of this changesets. that old version no longer needs to be shown to users (if possible). There is a strong semantic associated with such changesets and the property will be propagated to other repositories

:temporary/internal changesets:

Such changesets are created as a side effect of other operation (amend, histedit, etc). They have the following properties (once we are done using them)

  * They are irrelevant to the users,
  * We should never-ever base anything on them,
  * We should never-ever access them ever again,
  * They should never-ever leave the repository.
  * They never stop being internal-changesets

:locally-hidden changeset:

An hypothetically (not used) yet local-only hidden mechanism (similar to strip but without actual repo stripping). This could be used for local hiding/masking of changeset in the "real-changeset" space. Such data would not be exchanged. Commit reappears when "recreated" or re-pulled.

---

To take practical examples:

:amend:
  - amend source: "obsolete-changeset"
  - temporary changeset: "internal-changeset"

:rebase:
  - rebase source, on success: "obsolete-changeset"
  - rebase result, on abort:  "local-hiding"


:histedit:
  - histedit source, on success:         "obsolete-changeset"
  - histedit temporary node, on success: "internal-changeset"
  - histedit result, on abort:           "local-hiding"
  - histedit temporary node, on abort:   "local-hiding (internal-node)"

extra notes:
* If local hiding (strip) would take care of obsolescence marker, rebase and histedit could create them "as they go" instead of to the end: "on success". * In rebase --abort and histedit --abort, strip is used on freshly created changesets, so its performance impact is limited * we use "local hiding" (strip) on temporary nodes on histedit abort because we needs to be able to reuse them if histedit is rerun. So we cannot throw them in oblivion just yet.

---

regarding implementations

"obsolete-changesets": we already have a solid mechanism for that.

"internal-changesets"" has very constrained property. it allow us to abuse obsolescence markers for them. It would be quite easy and fast to build an ad-hoc mechanism for them and stop abusing obsmarkers. (for example a simple bitmaps or maybe some "internal-*" phases. Having a dedicated tacking of internal changesets helps enforcing the their property and simplify the obsmarker space.

"locally-hidden-changesets": this is trickier. For now the only usable mechanism we have for them is stripping. We cannot use obsolescence marker for them since obsolescence markers are building a global persistent and distributed stated and we are explicitly aims at staying light and local in this case (I can elaborate on that last bit if you think it is needed). Having or not having node-version does not make a difference here. locally-hiding do not want to have any global side effect and creating obsmarkers will always results in global side effect.

Having local-hidding mechanism is actually not too far away, it requires a couple of things:
1) applying the hiding,
2) storing the data,
3) automatic lifting of the hiding.

(1) is already here for free. As explained during the sprint, the API for hidden things is already flexible enough from the start to be fed with other mechanism than just obsolescence. (2) should not be too complicated, either a bitmap approach or a root based approached (bitmap is probably more solid). (3) probably already exist in Jun "archived" experiment. We have to make sure recommit/repulling a locally-hidden changeset makes it available again.

Overall, this is not too much work.

One extra thing to take care of is obsmarkers. When we strip a changesets we want to be able to strip directly related obsmarker. Strip is not currently able to do so, but it will be able to do so soon and we have a clear technical path here. So non-destructive "hiding" of changeset would needs its counterpart for obsmarkers before we can use it in case where obsmarkers are involved..

It is about recording the
evolution of time of changesets (predecessor ← [successors]). This data is
used in multiple place for advance logic, exchanged across repository etc.
Being too heavy handed on stripping will have problematic consequence for
the user experiences. Having an easy interfaces to do such heavy handed
pruning will encourage developer do to such mistake.

I'm struggling a bit to understand your argument here. I think your
argument is this: hiding changes can be done for many reasons, and
having a simple "just hide this I don't care how" API means people
won't think before they do the hiding. Is that an accurate
simplification and restating of your position here? (If no, please try
and restate with more nuance.)

My point go a bit further than that. Let me restate:

* The purpose of obsolescence marker is to (globally) track the evolution of changesets through rewrite,

* The fact obsolete changesets get hidden is just an (intended) consequence of this tracking,

* Creation of new obsolescence markers should be done after careful consideration and weighting.[1]

This current proposal goes in the wrong direction in regards with the above. Creating a shinny "This hides nodes" hammer that use obsmarker underneath is misleading. It will encourage developer to create obsolescence markers in case where it is inappropriate to do so. Because "hide" is the wrong semantic for markers.

On a general basis, a function creating markers without taking a (predecessors → successors) mapping is a strong hit that this function not approaching obsolescence markers in an appropriate way.

[1] eg: we allow ourself to use obsolescence marker for some internal changeset because we have carefully though about it and we decided it won't be problematic. I would rather not do it at all.

The fact this function only takes a list of nodes (instead of (pec, suc)
mapping) is a good example of such simplification. Series from Jun about
histedit is another. Obsolescence markers are not meant as a generic
temporary hidden mechanism for local/temporary nodes and should not be used
that way.

In broad strokes, I agree here. Have we made any conclusive progress
on a mechanism for archiving changes? The reason we're seeing all
these "abuses" of obsolete markers is that they're currently the only
append-only means by which things can be discarded, and there are (as
you know) ugly cache implications to running strip.

Strip is far from ideal. However there are multiple mitigation mechanism that exist today to limit its impact on cache and there is many low hanging fruit to improves this mitigation.

In addition, changeset filtering currently also have some bad impact on cache, so they do not simply goes away if we stop using strip. Their are also many low hanging fruits to mitigate this impact.

(I'm sort of +0 on this series, but I'd be more enthusiastic about
having a proper hiding mechanism in place that isn't obsmarkers before
we go too far down this road.)

To summarize my rational above, the proposed function is misusing the obsolescence markers semantic and will leads to problematic usages of them. So I'm -1 for it.

I'm not just bringing this out of potential fear. In the past years I've seen (and prevented) people to misuse obsolescence marker for histedit-abort, rebase-abort, and histedit-abort again, etc… (we just fixed a regression as this thread was starting).

I hope this long message help to clarify various concept. We have way forward to reduce the use of stripping without abusing the obsolescence concept in a way that will create issue for users. These way forward are in reach and would not take too long to build.

Cheers,

--
Pierre-Yves David
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Reply via email to