Re: [PATCH RFC] repo: add an ability to hide nodes in an appropriate way

Martin von Zweigbergk via Mercurial-devel Thu, 30 Mar 2017 10:36:59 -0700

On Thu, Mar 30, 2017 at 10:08 AM, Ryan McElroy <r...@fb.com> wrote:
> On 3/30/17 3:15 PM, Pierre-Yves David wrote:
>>
>> On 03/27/2017 07:24 PM, Augie Fackler wrote:
>>>
>>> On Mon, Mar 27, 2017 at 10:27:33AM +0200, Pierre-Yves David wrote:
>>>>
>>>>
>>>>
>>>> On 03/27/2017 12:32 AM, Kostia Balytskyi wrote:
>>>>>
>>>>> # HG changeset patch
>>>>> # User Kostia Balytskyi <ikos...@fb.com>
>>>>> # Date 1490567500 25200
>>>>> #      Sun Mar 26 15:31:40 2017 -0700
>>>>> # Node ID 43cd56f38c1ca2ad1920ed6804e1a90a5e263c0d
>>>>> # Parent  c60091fa1426892552dd6c0dd4b9c49e3c3da045
>>>>> repo: add an ability to hide nodes in an appropriate way
>>>>>
>>>>> Potentially frequent usecase is to hide nodes in the most appropriate
>>>>> for current repo configuration way. Examples of things which use this
>>>>> are rebase (strip or obsolete old nodes), shelve (strip of obsolete
>>>>> temporary nodes) and probably others.
>>>>> Jun Wu had an idea of having one place in core where this functionality
>>>>> is implemented so that others can utilize it. This patch
>>>>> implements this idea.
>>>>
>>>>
>>>> I do not think this is a step in the right direction, creating
>>>> obsolescence
>>>> marker is not just about hiding changeset.
>>>
>>>
>>> But we're using them for that today, all over the place.
>>
>>
>> Actually, we do not use it all over the place. Current markers usages in
>> core are:
>>
>> amend:
>>  - record evolution history
>>  - prune the temporary amend commit
>>   (that I would rather see disappear all together)
>>
>> rebase:
>>  - record evolution history on successful completion
>>
>> histedit:
>>  - record evolution history on successful completion
>>  - prune temporary nodes on successsful completion
>>    (This ones used to be stripped even with evolution)
>>
>> (let us talk about shelve later)
>>
>>
>> So they are mainly used to record evolution history on successful
>> operation (their intended usage) between "real-changesets". This is the
>> intended usage of obsolescence markers.
>>
>> In addition, they are also used in a couple of place to hide temporary
>> changesets (in the "utility-space") after an operation succeeed.
>> This a bit an unorthodox uses of the obsolescence markers, but is "okay"
>> to use them in these case because we know:
>>  1) this is utility-space so user never interact with them.
>>  2) we only create them for completed successful operation so we'll never
>> need them ever again.
>> Strictly speaking we could just strip these temporary commits (and we have
>> done so in the past) and this will not change anything for the user. Any
>> other hiding mechanism aimed at "internal temporary" commit would do the job
>> just fine. The internal commit never needs (and actually should ever) to
>> leave the user repository.
>> In practice, we could use in memory commit for most of these temporary
>> commit  (if not all) and never have to deal with hiding them.
>>
>>> We're also having pretty productive (I think!) discussions about
>>> non-obsmarker
>>> based mechanisms for hiding things that are implementation details of
>>> a sort (amend changesets that get folded immediately, shelve changes,
>>> etc).
>>
>>
>> (Let me be at be long to try to be clear and comprehensive)
>>
>> I can see three categories of things we want to hide:
>>
>> :obsolete changeset (evolution semantic):
>>
>>   A rewrite operation has created/recorded a new version of this
>> changesets. that old version no longer needs to be shown to users (if
>> possible). There is a strong semantic associated with such changesets and
>> the property will be propagated to other repositories
>>
>> :temporary/internal changesets:
>>
>>   Such changesets are created as a side effect of other operation (amend,
>> histedit, etc). They have the following properties (once we are done using
>> them)
>>
>>   * They are irrelevant to the users,
>>   * We should never-ever base anything on them,
>>   * We should never-ever access them ever again,
>>   * They should never-ever leave the repository.
>>   * They never stop being internal-changesets
>>
>> :locally-hidden changeset:
>>
>>   An hypothetically (not used) yet local-only hidden mechanism (similar to
>> strip but without actual repo stripping). This could be used for local
>> hiding/masking of changeset in the "real-changeset" space. Such data would
>> not be exchanged. Commit reappears when "recreated" or re-pulled.
>>
>> ---
>>
>> To take practical examples:
>>
>> :amend:
>>   - amend source: "obsolete-changeset"
>>   - temporary changeset: "internal-changeset"
>>
>> :rebase:
>>   - rebase source, on success: "obsolete-changeset"
>>   - rebase result, on abort:  "local-hiding"
>>
>>
>> :histedit:
>>   - histedit source, on success:         "obsolete-changeset"
>>   - histedit temporary node, on success: "internal-changeset"
>>   - histedit result, on abort:           "local-hiding"
>>   - histedit temporary node, on abort:   "local-hiding (internal-node)"
>>
>> extra notes:
>>  * If local hiding (strip) would take care of obsolescence marker, rebase
>> and histedit could create them "as they go" instead of to the end: "on
>> success".
>>  * In rebase --abort and histedit --abort, strip is used on freshly
>> created changesets, so its performance impact is limited
>>  * we use "local hiding" (strip) on temporary nodes on histedit abort
>> because we needs to be able to reuse them if histedit is rerun. So we cannot
>> throw them in oblivion just yet.
>>
>> ---
>>
>> regarding implementations
>>
>> "obsolete-changesets": we already have a solid mechanism for that.
>>
>> "internal-changesets"" has very constrained property. it allow us to abuse
>> obsolescence markers for them.
>> It would be quite easy and fast to build an ad-hoc mechanism for them and
>> stop abusing obsmarkers. (for example a simple bitmaps or maybe some
>> "internal-*" phases. Having a dedicated tacking of internal changesets helps
>> enforcing the their property and simplify the obsmarker space.
>>
>> "locally-hidden-changesets": this is trickier. For now the only usable
>> mechanism we have for them is stripping. We cannot use obsolescence marker
>> for them since obsolescence markers are building a global persistent and
>> distributed stated and we are explicitly aims at staying light and local in
>> this case (I can elaborate on that last bit if you think it is needed).
>> Having or not having node-version does not make a difference here.
>> locally-hiding do not want to have any global side effect and creating
>> obsmarkers will always results in global side effect.
>>
>> Having local-hidding mechanism is actually not too far away, it requires a
>> couple of things:
>> 1) applying the hiding,
>> 2) storing the data,
>> 3) automatic lifting of the hiding.
>>
>> (1) is already here for free. As explained during the sprint, the API for
>> hidden things is already flexible enough from the start to be fed with other
>> mechanism than just obsolescence.
>> (2) should not be too complicated, either a bitmap approach or a root
>> based approached (bitmap is probably more solid).
>> (3) probably already exist in Jun "archived" experiment. We have to make
>> sure recommit/repulling a locally-hidden changeset makes it available again.
>>
>> Overall, this is not too much work.
>>
>> One extra thing to take care of is obsmarkers. When we strip a changesets
>> we want to be able to strip directly related obsmarker. Strip is not
>> currently able to do so, but it will be able to do so soon and we have a
>> clear technical path here. So non-destructive "hiding" of changeset would
>> needs its counterpart for obsmarkers before we can use it in case where
>> obsmarkers are involved..
>>
>>>> It is about recording the
>>>> evolution of time of changesets (predecessor ← [successors]). This data
>>>> is
>>>> used in multiple place for advance logic, exchanged across repository
>>>> etc.
>>>> Being too heavy handed on stripping will have problematic consequence
>>>> for
>>>> the user experiences. Having an easy interfaces to do such heavy handed
>>>> pruning will encourage developer do to such mistake.
>>>
>>>
>>> I'm struggling a bit to understand your argument here. I think your
>>> argument is this: hiding changes can be done for many reasons, and
>>> having a simple "just hide this I don't care how" API means people
>>> won't think before they do the hiding. Is that an accurate
>>> simplification and restating of your position here? (If no, please try
>>> and restate with more nuance.)
>>
>>
>> My point go a bit further than that. Let me restate:
>>
>> * The purpose of obsolescence marker is to (globally) track the evolution
>> of changesets through rewrite,
>>
>> * The fact obsolete changesets get hidden is just an (intended)
>> consequence of this tracking,
>>
>> * Creation of new obsolescence markers should be done after careful
>> consideration and weighting.[1]
>>
>> This current proposal goes in the wrong direction in regards with the
>> above. Creating a shinny "This hides nodes" hammer that use obsmarker
>> underneath is misleading. It will encourage developer to create obsolescence
>> markers in case where it is inappropriate to do so. Because "hide" is the
>> wrong semantic for markers.
>>
>> On a general basis, a function creating markers without taking a
>> (predecessors → successors) mapping is a strong hit that this function not
>> approaching obsolescence markers in an appropriate way.
>>
>> [1] eg: we allow ourself to use obsolescence marker for some internal
>> changeset because we have carefully though about it and we decided it won't
>> be problematic. I would rather not do it at all.
>>
>>>> The fact this function only takes a list of nodes (instead of (pec, suc)
>>>> mapping) is a good example of such simplification. Series from Jun about
>>>> histedit is another. Obsolescence markers are not meant as a generic
>>>> temporary hidden mechanism for local/temporary nodes and should not be
>>>> used
>>>> that way.
>>>
>>>
>>> In broad strokes, I agree here. Have we made any conclusive progress
>>> on a mechanism for archiving changes? The reason we're seeing all
>>> these "abuses" of obsolete markers is that they're currently the only
>>> append-only means by which things can be discarded, and there are (as
>>> you know) ugly cache implications to running strip.
>>
>>
>> Strip is far from ideal. However there are multiple mitigation mechanism
>> that exist today to limit its impact on cache and there is many low hanging
>> fruit to improves this mitigation.
>>
>> In addition, changeset filtering currently also have some bad impact on
>> cache, so they do not simply goes away if we stop using strip. Their are
>> also many low hanging fruits to mitigate this impact.
>>
>>> (I'm sort of +0 on this series, but I'd be more enthusiastic about
>>> having a proper hiding mechanism in place that isn't obsmarkers before
>>> we go too far down this road.)
>>
>>
>> To summarize my rational above, the proposed function is misusing the
>> obsolescence markers semantic and will leads to problematic usages of them.
>> So I'm -1 for it.
>>
>> I'm not just bringing this out of potential fear. In the past years I've
>> seen (and prevented) people to misuse obsolescence marker for
>> histedit-abort, rebase-abort, and histedit-abort again, etc… (we just fixed
>> a regression as this thread was starting).
>>
>> I hope this long message help to clarify various concept. We have way
>> forward to reduce the use of stripping without abusing the obsolescence
>> concept in a way that will create issue for users. These way forward are in
>> reach and would not take too long to build.
>>
>>
>
> For what it's worth, I find this essay pretty convincing: obsmarkers have
> their use-case, they are lightly abused now (for temp-amend-commit) but it
> probably makes sense to not abuse them further.


I agree.

>
> It sounds like an "internal" phase above "secret" would actually map fairly
> well onto the temp-amend-commit, hidden shelves never meant to be exchanged,
> aborted rebase, etc places.
>
> Pierre-Yves, since you've thought about this a lot, does a phase here make
> sense (note that I'm not saying a general hiding place like "archive" phase,
> but just for internal nodes that we generally don't want shown or exchanged,
> but are also not "technically obsolete".

I was talking to Durham and Augie about this yesterday. We may not
want to overload phases for this purpose, but something like phases
(although boolean) makes sense. By not using phases for it, the phase
information will not be lost when it's hidden. It should probably hide
all descendants just like phases do, but whether we store a set of
roots or a bitmap of nodeids is a different question.

I'll let Durham elaborate here or elsewhere on his ideas.

>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Re: [PATCH RFC] repo: add an ability to hide nodes in an appropriate way

Reply via email to