On 03/27/2017 07:24 PM, Augie Fackler wrote:
On Mon, Mar 27, 2017 at 10:27:33AM +0200, Pierre-Yves David wrote:
On 03/27/2017 12:32 AM, Kostia Balytskyi wrote:
# HG changeset patch
# User Kostia Balytskyi <ikos...@fb.com>
# Date 1490567500 25200
# Sun Mar 26 15:31:40 2017 -0700
# Node ID 43cd56f38c1ca2ad1920ed6804e1a90a5e263c0d
# Parent c60091fa1426892552dd6c0dd4b9c49e3c3da045
repo: add an ability to hide nodes in an appropriate way
Potentially frequent usecase is to hide nodes in the most appropriate
for current repo configuration way. Examples of things which use this
are rebase (strip or obsolete old nodes), shelve (strip of obsolete
temporary nodes) and probably others.
Jun Wu had an idea of having one place in core where this
functionality
is implemented so that others can utilize it. This patch
implements this idea.
I do not think this is a step in the right direction, creating
obsolescence
marker is not just about hiding changeset.
But we're using them for that today, all over the place.
Actually, we do not use it all over the place. Current markers usages
in core are:
amend:
- record evolution history
- prune the temporary amend commit
(that I would rather see disappear all together)
rebase:
- record evolution history on successful completion
histedit:
- record evolution history on successful completion
- prune temporary nodes on successsful completion
(This ones used to be stripped even with evolution)
(let us talk about shelve later)
So they are mainly used to record evolution history on successful
operation (their intended usage) between "real-changesets". This is
the intended usage of obsolescence markers.
In addition, they are also used in a couple of place to hide temporary
changesets (in the "utility-space") after an operation succeeed.
This a bit an unorthodox uses of the obsolescence markers, but is
"okay" to use them in these case because we know:
1) this is utility-space so user never interact with them.
2) we only create them for completed successful operation so we'll
never need them ever again.
Strictly speaking we could just strip these temporary commits (and we
have done so in the past) and this will not change anything for the
user. Any other hiding mechanism aimed at "internal temporary" commit
would do the job just fine. The internal commit never needs (and
actually should ever) to leave the user repository.
In practice, we could use in memory commit for most of these temporary
commit (if not all) and never have to deal with hiding them.
We're also having pretty productive (I think!) discussions about
non-obsmarker
based mechanisms for hiding things that are implementation details of
a sort (amend changesets that get folded immediately, shelve changes,
etc).
(Let me be at be long to try to be clear and comprehensive)
I can see three categories of things we want to hide:
:obsolete changeset (evolution semantic):
A rewrite operation has created/recorded a new version of this
changesets. that old version no longer needs to be shown to users (if
possible). There is a strong semantic associated with such changesets
and the property will be propagated to other repositories
:temporary/internal changesets:
Such changesets are created as a side effect of other operation
(amend, histedit, etc). They have the following properties (once we
are done using them)
* They are irrelevant to the users,
* We should never-ever base anything on them,
* We should never-ever access them ever again,
* They should never-ever leave the repository.
* They never stop being internal-changesets
:locally-hidden changeset:
An hypothetically (not used) yet local-only hidden mechanism
(similar to strip but without actual repo stripping). This could be
used for local hiding/masking of changeset in the "real-changeset"
space. Such data would not be exchanged. Commit reappears when
"recreated" or re-pulled.
---
To take practical examples:
:amend:
- amend source: "obsolete-changeset"
- temporary changeset: "internal-changeset"
:rebase:
- rebase source, on success: "obsolete-changeset"
- rebase result, on abort: "local-hiding"
:histedit:
- histedit source, on success: "obsolete-changeset"
- histedit temporary node, on success: "internal-changeset"
- histedit result, on abort: "local-hiding"
- histedit temporary node, on abort: "local-hiding (internal-node)"
extra notes:
* If local hiding (strip) would take care of obsolescence marker,
rebase and histedit could create them "as they go" instead of to the
end: "on success".
* In rebase --abort and histedit --abort, strip is used on freshly
created changesets, so its performance impact is limited
* we use "local hiding" (strip) on temporary nodes on histedit abort
because we needs to be able to reuse them if histedit is rerun. So we
cannot throw them in oblivion just yet.
---
regarding implementations
"obsolete-changesets": we already have a solid mechanism for that.
"internal-changesets"" has very constrained property. it allow us to
abuse obsolescence markers for them.
It would be quite easy and fast to build an ad-hoc mechanism for them
and stop abusing obsmarkers. (for example a simple bitmaps or maybe
some "internal-*" phases. Having a dedicated tacking of internal
changesets helps enforcing the their property and simplify the
obsmarker space.
"locally-hidden-changesets": this is trickier. For now the only usable
mechanism we have for them is stripping. We cannot use obsolescence
marker for them since obsolescence markers are building a global
persistent and distributed stated and we are explicitly aims at
staying light and local in this case (I can elaborate on that last bit
if you think it is needed). Having or not having node-version does not
make a difference here. locally-hiding do not want to have any global
side effect and creating obsmarkers will always results in global side
effect.
Having local-hidding mechanism is actually not too far away, it
requires a couple of things:
1) applying the hiding,
2) storing the data,
3) automatic lifting of the hiding.
(1) is already here for free. As explained during the sprint, the API
for hidden things is already flexible enough from the start to be fed
with other mechanism than just obsolescence.
(2) should not be too complicated, either a bitmap approach or a root
based approached (bitmap is probably more solid).
(3) probably already exist in Jun "archived" experiment. We have to
make sure recommit/repulling a locally-hidden changeset makes it
available again.
Overall, this is not too much work.
One extra thing to take care of is obsmarkers. When we strip a
changesets we want to be able to strip directly related obsmarker.
Strip is not currently able to do so, but it will be able to do so
soon and we have a clear technical path here. So non-destructive
"hiding" of changeset would needs its counterpart for obsmarkers
before we can use it in case where obsmarkers are involved..
It is about recording the
evolution of time of changesets (predecessor ← [successors]). This
data is
used in multiple place for advance logic, exchanged across
repository etc.
Being too heavy handed on stripping will have problematic
consequence for
the user experiences. Having an easy interfaces to do such heavy handed
pruning will encourage developer do to such mistake.
I'm struggling a bit to understand your argument here. I think your
argument is this: hiding changes can be done for many reasons, and
having a simple "just hide this I don't care how" API means people
won't think before they do the hiding. Is that an accurate
simplification and restating of your position here? (If no, please try
and restate with more nuance.)
My point go a bit further than that. Let me restate:
* The purpose of obsolescence marker is to (globally) track the
evolution of changesets through rewrite,
* The fact obsolete changesets get hidden is just an (intended)
consequence of this tracking,
* Creation of new obsolescence markers should be done after careful
consideration and weighting.[1]
This current proposal goes in the wrong direction in regards with the
above. Creating a shinny "This hides nodes" hammer that use obsmarker
underneath is misleading. It will encourage developer to create
obsolescence markers in case where it is inappropriate to do so.
Because "hide" is the wrong semantic for markers.
On a general basis, a function creating markers without taking a
(predecessors → successors) mapping is a strong hit that this function
not approaching obsolescence markers in an appropriate way.
[1] eg: we allow ourself to use obsolescence marker for some internal
changeset because we have carefully though about it and we decided it
won't be problematic. I would rather not do it at all.
The fact this function only takes a list of nodes (instead of (pec,
suc)
mapping) is a good example of such simplification. Series from Jun
about
histedit is another. Obsolescence markers are not meant as a generic
temporary hidden mechanism for local/temporary nodes and should not
be used
that way.
In broad strokes, I agree here. Have we made any conclusive progress
on a mechanism for archiving changes? The reason we're seeing all
these "abuses" of obsolete markers is that they're currently the only
append-only means by which things can be discarded, and there are (as
you know) ugly cache implications to running strip.
Strip is far from ideal. However there are multiple mitigation
mechanism that exist today to limit its impact on cache and there is
many low hanging fruit to improves this mitigation.
In addition, changeset filtering currently also have some bad impact
on cache, so they do not simply goes away if we stop using strip.
Their are also many low hanging fruits to mitigate this impact.
(I'm sort of +0 on this series, but I'd be more enthusiastic about
having a proper hiding mechanism in place that isn't obsmarkers before
we go too far down this road.)
To summarize my rational above, the proposed function is misusing the
obsolescence markers semantic and will leads to problematic usages of
them. So I'm -1 for it.
I'm not just bringing this out of potential fear. In the past years
I've seen (and prevented) people to misuse obsolescence marker for
histedit-abort, rebase-abort, and histedit-abort again, etc… (we just
fixed a regression as this thread was starting).
I hope this long message help to clarify various concept. We have way
forward to reduce the use of stripping without abusing the
obsolescence concept in a way that will create issue for users. These
way forward are in reach and would not take too long to build.