Re: Topics [was: news from the topic experiment]
Pierre-Yves Davidwrites: > On 10/14/2016 05:48 PM, Erik van Zijst wrote: >> On Thu, Oct 13, 2016 at 7:01 AM, Pierre-Yves David >> > >> wrote: >> >> >> So, thanks for exploring possibilities to make this frontier thiner. >> However, I can see some issues with some aspects of this proposal, >> using the same field for either branch or topic make them mostly >> mutually exclusive. Publishing a topic on a named branch would >> require to alter the changesets data (and therefore, hash) This >> means people could use topic only with the default branch (or >> significant more complexity to work around this). As I understand, >> Bitbucket enforces a single master branch so that might actually fit >> your model. This is probably too restricting for the general project >> (more about that later). >> >> >> I'm not sure I understand. I would think that the mutually exclusive >> part would be desirable, not a limitation. I'm not sure how that would >> limit people to using only the default branch? >> >> One could still create 2 long-lived branched. The most common workflows >> are based around gitflow and typically define 2 long-lived branches: >> e.g. "master" and "develop". While Bitbucket only treats one branch as >> the "main branch", most teams have more than 1 long-lived branch. >> Bitbucket's main branch merely serves as the default pull request >> destination and initial context for the source browser. >> >> It would seem to me that this could have some benefits: >> >> 1. There would be no risk of a name collision between the branch and >> topic namespaces. >> >> >> I'm not certain this actually avoid the risk of name collision. >> People could use the same branch/topic name on different changesets >> with different values for the flag. That would lead to both a topic >> and a named branch to exists with the same name. >> >> >> Clashes after pulling between forks are a reality. I could have a topic >> "foo" and pull in a named branch "foo", this is true. However, I'm not >> sure I see how a separate topic and branch namespace would solve that. >> >> If I have a topic "default:foo", I could pull a named branch "foo" and >> unless we expose the prefix part in the topic name, that would get >> ambiguous too. >> >> Additionally, a distinct namespace might open us up to more fork >> ambiguity. I could pull a topic "develop:foo". Would that be considered >> the same topic? Should it be merged with mine? Would it have a >> consequence for the implicit merge/rebase target? >> >> >> In all cases we should have the local UI fight hard to prevent >> people to create collisions between branch and topic. (And some >> descent way to point out name conflict if they appends). >> >> >> For sure. Though I'm not sure I see the advantage of separate namespace >> in this situation. >> >> >> 2. Interoperability with non-topic clients would mostly just work. >> >> This could be a big boon for existing ecosystem tools like CI >> servers >> that wouldn't have to be modified. >> >> >> There is some very interesting ramification to this proposal. Even >> if we do not go with the flag approach. We store the full (branch, >> topic) pair into the branch field. For example we could use ":" as a >> separator branch=BRANCH:TOPIC. Not only this would allow old clients >> to transport the data (we already have that) but this also mean old >> client can also view and preserve that data (and this is new) even >> if it does not get the behavior improvement related to topic. That >> would be a large usability boost for old client. >> >> >> I really like that idea and I hadn't considered myself. It would, at >> least in theory, mean old clients won't wreak havoc and will be able to >> see all topics and branches. They could even commit on top of a topic >> without breaking anything. > > So, the other branch of that thread made me realised that we cannot do > this "BRANCH:TOPIC" storage in the current "branch" field. > > An important part of topic is to have them fade away when the changesets > become public. This fading away will be implemented client side (logic > to stop reading the value). If we expose this data to old client who > does not have this logic this means that topic would live as > "BRANCH:TOPIC" forever on old client, breaking them in various way. I don't see that. Currently, we have: - old client cannot see topic at all (making it unusable) - new client can see topics If we move to branchname + flag, we have: - old client can see and update using 'hg update NAME' (extremely valuable) - old client can merge a topic to default (extremely helpful and surprising) - new client will hide those branches Your argument is that old clients will see those branches. That's a
Re: [PATCH 1 of 2 sqldirstate] sqldirstate: specify checkambig=True to avoid file stat ambiguity
At Fri, 14 Oct 2016 21:31:36 +0200, Pierre-Yves David wrote: > > Could we make the flag for this 'sqldirstate-ext' to make it clearer > that it has an external target? Oops, sorry for lack of "-ext" > On 10/14/2016 09:10 PM, FUJIWARA Katsunori wrote: > > # HG changeset patch > > # User FUJIWARA Katsunori> > # Date 1476456912 -32400 > > # Fri Oct 14 23:55:12 2016 +0900 > > # Node ID 72486917916f6d47525f721cd5b27012091122ed > > # Parent d2c3a2c02eb6c7e5a7331ba0cf15e5bf7c8dc8dc > > # Available At https://fo...@bitbucket.org/foozy/hg-experimental > > # hg pull https://fo...@bitbucket.org/foozy/hg-experimental -r > > 72486917916f > > sqldirstate: specify checkambig=True to avoid file stat ambiguity > > > > With sqldirstate extension, size of .hg/dirstate is always fixed: node > > ID x 2 + dummy fixed text. > > > > This increases occurrence of file stat ambiguity, and > > processes/threads running parallelly might overlook change of > > dirstate, because reloading @filecache-ed property repo.dirstate is > > avoided in such case. See wiki page below for detail about file stat > > ambiguity. > > > > https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan > > > > This patch specifies checkambig=True at opening .hg/dirstate file for > > writing, to avoid file stat ambiguity. > > > > Strictly speaking, to enable sqldirstate with Mercurial earlier than > > upcoming 4.0 (or 76f1ea360c7e), it should be examined whether > > checkambig optional argument is available for opener or not, because > > it has been available since 4.0 (or 76f1ea360c7e). > > > > But current sqldirstate implementation depends on 2ebd507e5ac3 of > > Mercurial, which has introduced dirstate._origpl after 76f1ea360c7e. > > > > Therefor, changes in this patch are safe enough, unless removing > > dependence on dirstate._origpl from sqldirstate. > > > > diff --git a/sqldirstate/sqldirstate.py b/sqldirstate/sqldirstate.py > > --- a/sqldirstate/sqldirstate.py > > +++ b/sqldirstate/sqldirstate.py > > @@ -563,7 +563,7 @@ def makedirstate(cls): > > > > > > def writefakedirstate(dirstate): > > -st = dirstate._opener(FAKEDIRSTATE, "w", atomictemp=True) > > +st = dirstate._opener(FAKEDIRSTATE, "w", atomictemp=True, > > checkambig=True) > > st.write("".join(dirstate._pl)) > > st.write("\nThis is fake dirstate put here by the sqldirsate.") > > st.write("\nIt contains only working copy parents info.") > > @@ -629,7 +629,7 @@ def toflat(sqldirstate): > > it's not touching anything but the dirstate file > > """ > > # converts a sqldirstate to a flat one > > -st = sqldirstate._opener("dirstate", "w", atomictemp=True) > > +st = sqldirstate._opener("dirstate", "w", atomictemp=True, > > checkambig=True) > > newmap = {} > > for k, v in sqldirstate._map.iteritems(): > > newmap[k] = v > > ___ > > Mercurial-devel mailing list > > Mercurial-devel@mercurial-scm.org > > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel > > > > -- > Pierre-Yves David > -- [FUJIWARA Katsunori] fo...@lares.dti.ne.jp ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 2 sqldirstate] sqldirstate: specify checkambig=True to avoid file stat ambiguity
Could we make the flag for this 'sqldirstate-ext' to make it clearer that it has an external target? On 10/14/2016 09:10 PM, FUJIWARA Katsunori wrote: # HG changeset patch # User FUJIWARA Katsunori# Date 1476456912 -32400 # Fri Oct 14 23:55:12 2016 +0900 # Node ID 72486917916f6d47525f721cd5b27012091122ed # Parent d2c3a2c02eb6c7e5a7331ba0cf15e5bf7c8dc8dc # Available At https://fo...@bitbucket.org/foozy/hg-experimental # hg pull https://fo...@bitbucket.org/foozy/hg-experimental -r 72486917916f sqldirstate: specify checkambig=True to avoid file stat ambiguity With sqldirstate extension, size of .hg/dirstate is always fixed: node ID x 2 + dummy fixed text. This increases occurrence of file stat ambiguity, and processes/threads running parallelly might overlook change of dirstate, because reloading @filecache-ed property repo.dirstate is avoided in such case. See wiki page below for detail about file stat ambiguity. https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan This patch specifies checkambig=True at opening .hg/dirstate file for writing, to avoid file stat ambiguity. Strictly speaking, to enable sqldirstate with Mercurial earlier than upcoming 4.0 (or 76f1ea360c7e), it should be examined whether checkambig optional argument is available for opener or not, because it has been available since 4.0 (or 76f1ea360c7e). But current sqldirstate implementation depends on 2ebd507e5ac3 of Mercurial, which has introduced dirstate._origpl after 76f1ea360c7e. Therefor, changes in this patch are safe enough, unless removing dependence on dirstate._origpl from sqldirstate. diff --git a/sqldirstate/sqldirstate.py b/sqldirstate/sqldirstate.py --- a/sqldirstate/sqldirstate.py +++ b/sqldirstate/sqldirstate.py @@ -563,7 +563,7 @@ def makedirstate(cls): def writefakedirstate(dirstate): -st = dirstate._opener(FAKEDIRSTATE, "w", atomictemp=True) +st = dirstate._opener(FAKEDIRSTATE, "w", atomictemp=True, checkambig=True) st.write("".join(dirstate._pl)) st.write("\nThis is fake dirstate put here by the sqldirsate.") st.write("\nIt contains only working copy parents info.") @@ -629,7 +629,7 @@ def toflat(sqldirstate): it's not touching anything but the dirstate file """ # converts a sqldirstate to a flat one -st = sqldirstate._opener("dirstate", "w", atomictemp=True) +st = sqldirstate._opener("dirstate", "w", atomictemp=True, checkambig=True) newmap = {} for k, v in sqldirstate._map.iteritems(): newmap[k] = v ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel -- Pierre-Yves David ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 1 of 2 sqldirstate] sqldirstate: specify checkambig=True to avoid file stat ambiguity
# HG changeset patch # User FUJIWARA Katsunori# Date 1476456912 -32400 # Fri Oct 14 23:55:12 2016 +0900 # Node ID 72486917916f6d47525f721cd5b27012091122ed # Parent d2c3a2c02eb6c7e5a7331ba0cf15e5bf7c8dc8dc # Available At https://fo...@bitbucket.org/foozy/hg-experimental # hg pull https://fo...@bitbucket.org/foozy/hg-experimental -r 72486917916f sqldirstate: specify checkambig=True to avoid file stat ambiguity With sqldirstate extension, size of .hg/dirstate is always fixed: node ID x 2 + dummy fixed text. This increases occurrence of file stat ambiguity, and processes/threads running parallelly might overlook change of dirstate, because reloading @filecache-ed property repo.dirstate is avoided in such case. See wiki page below for detail about file stat ambiguity. https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan This patch specifies checkambig=True at opening .hg/dirstate file for writing, to avoid file stat ambiguity. Strictly speaking, to enable sqldirstate with Mercurial earlier than upcoming 4.0 (or 76f1ea360c7e), it should be examined whether checkambig optional argument is available for opener or not, because it has been available since 4.0 (or 76f1ea360c7e). But current sqldirstate implementation depends on 2ebd507e5ac3 of Mercurial, which has introduced dirstate._origpl after 76f1ea360c7e. Therefor, changes in this patch are safe enough, unless removing dependence on dirstate._origpl from sqldirstate. diff --git a/sqldirstate/sqldirstate.py b/sqldirstate/sqldirstate.py --- a/sqldirstate/sqldirstate.py +++ b/sqldirstate/sqldirstate.py @@ -563,7 +563,7 @@ def makedirstate(cls): def writefakedirstate(dirstate): -st = dirstate._opener(FAKEDIRSTATE, "w", atomictemp=True) +st = dirstate._opener(FAKEDIRSTATE, "w", atomictemp=True, checkambig=True) st.write("".join(dirstate._pl)) st.write("\nThis is fake dirstate put here by the sqldirsate.") st.write("\nIt contains only working copy parents info.") @@ -629,7 +629,7 @@ def toflat(sqldirstate): it's not touching anything but the dirstate file """ # converts a sqldirstate to a flat one -st = sqldirstate._opener("dirstate", "w", atomictemp=True) +st = sqldirstate._opener("dirstate", "w", atomictemp=True, checkambig=True) newmap = {} for k, v in sqldirstate._map.iteritems(): newmap[k] = v ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: Topics [was: news from the topic experiment]
On 10/14/2016 05:48 PM, Erik van Zijst wrote: On Thu, Oct 13, 2016 at 7:01 AM, Pierre-Yves David> wrote: So, thanks for exploring possibilities to make this frontier thiner. However, I can see some issues with some aspects of this proposal, using the same field for either branch or topic make them mostly mutually exclusive. Publishing a topic on a named branch would require to alter the changesets data (and therefore, hash) This means people could use topic only with the default branch (or significant more complexity to work around this). As I understand, Bitbucket enforces a single master branch so that might actually fit your model. This is probably too restricting for the general project (more about that later). I'm not sure I understand. I would think that the mutually exclusive part would be desirable, not a limitation. I'm not sure how that would limit people to using only the default branch? One could still create 2 long-lived branched. The most common workflows are based around gitflow and typically define 2 long-lived branches: e.g. "master" and "develop". While Bitbucket only treats one branch as the "main branch", most teams have more than 1 long-lived branch. Bitbucket's main branch merely serves as the default pull request destination and initial context for the source browser. It would seem to me that this could have some benefits: 1. There would be no risk of a name collision between the branch and topic namespaces. I'm not certain this actually avoid the risk of name collision. People could use the same branch/topic name on different changesets with different values for the flag. That would lead to both a topic and a named branch to exists with the same name. Clashes after pulling between forks are a reality. I could have a topic "foo" and pull in a named branch "foo", this is true. However, I'm not sure I see how a separate topic and branch namespace would solve that. If I have a topic "default:foo", I could pull a named branch "foo" and unless we expose the prefix part in the topic name, that would get ambiguous too. Additionally, a distinct namespace might open us up to more fork ambiguity. I could pull a topic "develop:foo". Would that be considered the same topic? Should it be merged with mine? Would it have a consequence for the implicit merge/rebase target? In all cases we should have the local UI fight hard to prevent people to create collisions between branch and topic. (And some descent way to point out name conflict if they appends). For sure. Though I'm not sure I see the advantage of separate namespace in this situation. 2. Interoperability with non-topic clients would mostly just work. This could be a big boon for existing ecosystem tools like CI servers that wouldn't have to be modified. There is some very interesting ramification to this proposal. Even if we do not go with the flag approach. We store the full (branch, topic) pair into the branch field. For example we could use ":" as a separator branch=BRANCH:TOPIC. Not only this would allow old clients to transport the data (we already have that) but this also mean old client can also view and preserve that data (and this is new) even if it does not get the behavior improvement related to topic. That would be a large usability boost for old client. I really like that idea and I hadn't considered myself. It would, at least in theory, mean old clients won't wreak havoc and will be able to see all topics and branches. They could even commit on top of a topic without breaking anything. So, the other branch of that thread made me realised that we cannot do this "BRANCH:TOPIC" storage in the current "branch" field. An important part of topic is to have them fade away when the changesets become public. This fading away will be implemented client side (logic to stop reading the value). If we expose this data to old client who does not have this logic this means that topic would live as "BRANCH:TOPIC" forever on old client, breaking them in various way. [sad panda] I guess topic will keep being stored in there dedicated extra field then. Cheers, -- Pierre-Yves David ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: Topics [was: news from the topic experiment]
On 10/14/2016 07:29 PM, Erik van Zijst wrote: On Fri, Oct 14, 2016 at 9:54 AM, Pierre-Yves Davidwrote: If we use the same field for either topic or name branch a changeset can either be: * on a named branch but no topic * on a topic but no named branch (so default branch) Why would a topic imply that it is on the default branch? I don't think I see that. In my mental model a topic is really just a branch, just like any named branch. What does it mean for a topic to be "on the default branch"? What does that facilitate? The current rules are: * without branch information, a changeset is on the default branch, * When a changeset is public, the topic fade away and stop applying to the changeset. You proposal is: * let's use the branch field for topic + a flag (if I got that right) So the result of this is: * When a changeset turns public we stop reading the topic data, the changeset. As the changeset have no other information, it is on the default branch. However, going deeper on this made me realize that we cannot use the branch field for anything topic related. A strong part of topic is that it fades away when it become public. An old client will not have any logic to enforce the fade out of the topic informations. So we cannot expose the topic information to any old client as they would wrongly keep being affected by closed topic. Sadly this also affect the BRANCH:TOPIC idea. If one insists on some form of named-branch-context, then wouldn't that simply be the named branch that the topic was originally forked off of? I don't think I see the need for that information to be carried forward in every topic commit. The current way branch works is by having a dedicated field stored in all changesets. The work around topic is not planning to change that and this will probably be true until the end of time for backward compatibility reason. Cheers, -- Pierre-Yves David ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: Topics [was: news from the topic experiment]
> On Oct 14, 2016, at 1:29 PM, Erik van Zijstwrote: > >> * on a named branch but no topic >> * on a topic but no named branch (so default branch) > > Why would a topic imply that it is on the default branch? I don't > think I see that. In my mental model a topic is really just a branch, > just like any named branch. What does it mean for a topic to be "on > the default branch"? What does that facilitate? > > If one insists on some form of named-branch-context, then wouldn't > that simply be the named branch that the topic was originally forked > off of? I don't think I see the need for that information to be > carried forward in every topic commit. I agree. signature.asc Description: Message signed with OpenPGP using GPGMail ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: Topics [was: news from the topic experiment]
On Fri, Oct 14, 2016 at 9:54 AM, Pierre-Yves Davidwrote: > If we use the same field for either topic or name branch a changeset can > either be: > > * on a named branch but no topic > * on a topic but no named branch (so default branch) Why would a topic imply that it is on the default branch? I don't think I see that. In my mental model a topic is really just a branch, just like any named branch. What does it mean for a topic to be "on the default branch"? What does that facilitate? If one insists on some form of named-branch-context, then wouldn't that simply be the named branch that the topic was originally forked off of? I don't think I see the need for that information to be carried forward in every topic commit. -E ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: Topics [was: news from the topic experiment]
Erik van Zijstwrites: > On Thu, Oct 13, 2016 at 7:01 AM, Pierre-Yves David < > pierre-yves.da...@ens-lyon.org> wrote: > >> >> So, thanks for exploring possibilities to make this frontier thiner. >> However, I can see some issues with some aspects of this proposal, using >> the same field for either branch or topic make them mostly mutually >> exclusive. Publishing a topic on a named branch would require to alter the >> changesets data (and therefore, hash) This means people could use topic >> only with the default branch (or significant more complexity to work around >> this). As I understand, Bitbucket enforces a single master branch so that >> might actually fit your model. This is probably too restricting for the >> general project (more about that later). >> > > I'm not sure I understand. I would think that the mutually exclusive part > would be desirable, not a limitation. I'm not sure how that would limit > people to using only the default branch? > > One could still create 2 long-lived branched. The most common workflows are > based around gitflow and typically define 2 long-lived branches: e.g. > "master" and "develop". While Bitbucket only treats one branch as the "main > branch", most teams have more than 1 long-lived branch. Bitbucket's main > branch merely serves as the default pull request destination and initial > context for the source browser. > > It would seem to me that this could have some benefits: >>> >>> 1. There would be no risk of a name collision between the branch and >>> topic namespaces. >>> >> >> I'm not certain this actually avoid the risk of name collision. People >> could use the same branch/topic name on different changesets with different >> values for the flag. That would lead to both a topic and a named branch to >> exists with the same name. >> > > Clashes after pulling between forks are a reality. I could have a topic > "foo" and pull in a named branch "foo", this is true. However, I'm not sure > I see how a separate topic and branch namespace would solve that. > > If I have a topic "default:foo", I could pull a named branch "foo" and > unless we expose the prefix part in the topic name, that would get > ambiguous too. > > Additionally, a distinct namespace might open us up to more fork ambiguity. > I could pull a topic "develop:foo". Would that be considered the same > topic? Should it be merged with mine? Would it have a consequence for the > implicit merge/rebase target? I hadn't thought of these two situations. I am now of the opinion that this is extremely confusing and undesirable. >> In all cases we should have the local UI fight hard to prevent people to >> create collisions between branch and topic. (And some descent way to point >> out name conflict if they appends). >> > > For sure. Though I'm not sure I see the advantage of separate namespace in > this situation. > > >> 2. Interoperability with non-topic clients would mostly just work. >>> >>> This could be a big boon for existing ecosystem tools like CI servers >>> that wouldn't have to be modified. >>> >> >> There is some very interesting ramification to this proposal. Even if we >> do not go with the flag approach. We store the full (branch, topic) pair >> into the branch field. For example we could use ":" as a separator >> branch=BRANCH:TOPIC. Not only this would allow old clients to transport the >> data (we already have that) but this also mean old client can also view and >> preserve that data (and this is new) even if it does not get the behavior >> improvement related to topic. That would be a large usability boost for old >> client. >> > > I really like that idea and I hadn't considered myself. It would, at least > in theory, mean old clients won't wreak havoc and will be able to see all > topics and branches. They could even commit on top of a topic without > breaking anything. > > That said, the situation might be different for CI servers and generally > any other automated tool that doesn't anticipate the (in their legacy view, > illegal branch name). Assuming new clients would still hide the prefix in > their UI (and so would Bitbucket), telling a legacy CI server to checkout > topic "foo" would likely fail. > > I understand we might be driven by somewhat different motivations. The > prospect to some level of compatibility with existing tools (not > necessarily just an old hg client) I think is huge in our world. If > Bitbucket teams couldn't switch to using topics until all CI and deployment > infrastructure had been patched and upgraded first, that might lead to a > chicken-and-egg problem. What about other hosting services (e.g. kallithea)? Requiring them to adjust their UI and data model to accommodate a sub-branch name / different namespace is quite onerous at this point. Having 'hg up NAME' work with existing tools and expected behavior (not to mention a developer's mental model) is such a clear win that I am pretty strongly for it. Plus, it doesn't
Re: [PATCH 1 of 2 V2] py3: refactor token parsing to handle call args properly
> This loop would never stop if malformed tokens were provided. I guess you > had a working version in which _isop() could raise IndexError. Nice catch! I'll turn the `while True` into a `for j in xrange(i + 2, len(tokens)):` loop instead, and avoid IndexError or looping to infinity altogether. Look for a V3! signature.asc Description: Message signed with OpenPGP using GPGMail ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 2 of 2 V3] py3: use namedtuple._replace to produce new tokens
# HG changeset patch # User Martijn Pieters# Date 1476347257 -3600 # Thu Oct 13 09:27:37 2016 +0100 # Node ID d20dd7db86044bdca79825499b913840d726d841 # Parent 9031460519503abe5dc430c8ece29d198121cd65 py3: use namedtuple._replace to produce new tokens diff --git a/mercurial/__init__.py b/mercurial/__init__.py --- a/mercurial/__init__.py +++ b/mercurial/__init__.py @@ -233,9 +233,7 @@ """ st = tokens[j] if st.type == token.STRING and st.string.startswith(("'", '"')): -rt = tokenize.TokenInfo(st.type, 'u%s' % st.string, -st.start, st.end, st.line) -tokens[j] = rt +tokens[j] = st._replace(string='u%s' % st.string) for i, t in enumerate(tokens): # Convert most string literals to byte literals. String literals @@ -266,8 +264,7 @@ continue # String literal. Prefix to make a b'' string. -yield tokenize.TokenInfo(t.type, 'b%s' % s, t.start, t.end, - t.line) +yield t._replace(string='b%s' % t.string) continue # Insert compatibility imports at "from __future__ import" line. @@ -287,10 +284,8 @@ for u in tokenize.tokenize(io.BytesIO(l).readline): if u.type in (tokenize.ENCODING, token.ENDMARKER): continue -yield tokenize.TokenInfo(u.type, u.string, - (r, c + u.start[1]), - (r, c + u.end[1]), - '') +yield u._replace( +start=(r, c + u.start[1]), end=(r, c + u.end[1])) continue # This looks like a function call. @@ -322,8 +317,7 @@ # It changes iteritems to items as iteritems is not # present in Python 3 world. elif fn == 'iteritems': -yield tokenize.TokenInfo(t.type, 'items', - t.start, t.end, t.line) +yield t._replace(string='items') continue # Emit unmodified token. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 1 of 2 V3] py3: refactor token parsing to handle call args properly
# HG changeset patch # User Martijn Pieters# Date 1476464102 -3600 # Fri Oct 14 17:55:02 2016 +0100 # Node ID 9031460519503abe5dc430c8ece29d198121cd65 # Parent 733fb9f7bc92c694ba6bededaeb93206528c0bcd py3: refactor token parsing to handle call args properly The token parsing was getting unwieldy and was too naive about accessing arguments. diff --git a/mercurial/__init__.py b/mercurial/__init__.py --- a/mercurial/__init__.py +++ b/mercurial/__init__.py @@ -185,6 +185,58 @@ OR CACHED FILES WON'T GET INVALIDATED PROPERLY. """ futureimpline = False + +# The following utility functions access the tokens list and i index of +# the for i, t enumerate(tokens) loop below +def _isop(j, *o): +"""Assert that tokens[j] is an OP with one of the given values""" +try: +return tokens[j].type == token.OP and tokens[j].string in o +except IndexError: +return False + +def _findargnofcall(n): +"""Find arg n of a call expression (start at 0) + +Returns index of the first token of that argument, or None if +there is not that many arguments. + +Assumes that token[i + 1] is '('. + +""" +nested = 0 +for j in xrange(i + 2, len(tokens)): +if _isop(j, ')', ']', '}'): +# end of call, tuple, subscription or dict / set +nested -= 1 +if nested < 0: +return None +elif n == 0: +# this is the starting position of arg +return j +elif _isop(j, '(', '[', '{'): +nested += 1 +elif _isop(j, ',') and nested == 0: +n -= 1 + +return None + +def _ensureunicode(j): +"""Make sure the token at j is a unicode string + +This rewrites a string token to include the unicode literal prefix +so the string transformer won't add the byte prefix. + +Ignores tokens that are not strings. Assumes bounds checking has +already been done. + +""" +st = tokens[j] +if st.type == token.STRING and st.string.startswith(("'", '"')): +rt = tokenize.TokenInfo(st.type, 'u%s' % st.string, +st.start, st.end, st.line) +tokens[j] = rt + for i, t in enumerate(tokens): # Convert most string literals to byte literals. String literals # in Python 2 are bytes. String literals in Python 3 are unicode. @@ -241,91 +293,35 @@ '') continue -try: -nexttoken = tokens[i + 1] -except IndexError: -nexttoken = None - -try: -prevtoken = tokens[i - 1] -except IndexError: -prevtoken = None - # This looks like a function call. -if (t.type == token.NAME and nexttoken and -nexttoken.type == token.OP and nexttoken.string == '('): +if t.type == token.NAME and _isop(i + 1, '('): fn = t.string # *attr() builtins don't accept byte strings to 2nd argument. -# Rewrite the token to include the unicode literal prefix so -# the string transformer above doesn't add the byte prefix. -if fn in ('getattr', 'setattr', 'hasattr', 'safehasattr'): -try: -# (NAME, 'getattr') -# (OP, '(') -# (NAME, 'foo') -# (OP, ',') -# (NAME|STRING, foo) -st = tokens[i + 4] -if (st.type == token.STRING and -st.string[0] in ("'", '"')): -rt = tokenize.TokenInfo(st.type, 'u%s' % st.string, -st.start, st.end, st.line) -tokens[i + 4] = rt -except IndexError: -pass +if (fn in ('getattr', 'setattr', 'hasattr', 'safehasattr') and +not _isop(i - 1, '.')): +arg1idx = _findargnofcall(1) +if arg1idx is not None: +_ensureunicode(arg1idx) # .encode() and .decode() on str/bytes/unicode don't accept -# byte strings on Python 3. Rewrite the token to include the -# unicode literal prefix so the string transformer above doesn't -# add the byte prefix. The loop helps in handling multiple -# arguments. -if (fn
Re: Topics [was: news from the topic experiment]
On 10/14/2016 05:48 PM, Erik van Zijst wrote: On Thu, Oct 13, 2016 at 7:01 AM, Pierre-Yves David> wrote: So, thanks for exploring possibilities to make this frontier thiner. However, I can see some issues with some aspects of this proposal, using the same field for either branch or topic make them mostly mutually exclusive. Publishing a topic on a named branch would require to alter the changesets data (and therefore, hash) This means people could use topic only with the default branch (or significant more complexity to work around this). As I understand, Bitbucket enforces a single master branch so that might actually fit your model. This is probably too restricting for the general project (more about that later). I'm not sure I understand. I would think that the mutually exclusive part would be desirable, not a limitation. I'm not sure how that would limit people to using only the default branch? If we use the same field for either topic or name branch a changeset can either be: * on a named branch but no topic * on a topic but no named branch (so default branch) As a result, when the changeset become public we end up with: * on a named branch [no topic] * on the default branch [no topic] So using named branch and topic on the same project become impossible//impracticable as topic is restricted to the default branch me and I cannot use topic when targeting a named branch (also meh). (I'm repeating myself a lot here, but I'm still unsure if I managed to get the idea through) One could still create 2 long-lived branched. The most common workflows are based around gitflow and typically define 2 long-lived branches: e.g. "master" and "develop". While Bitbucket only treats one branch as the "main branch", most teams have more than 1 long-lived branch. Bitbucket's main branch merely serves as the default pull request destination and initial context for the source browser. It would seem to me that this could have some benefits: 1. There would be no risk of a name collision between the branch and topic namespaces. I'm not certain this actually avoid the risk of name collision. People could use the same branch/topic name on different changesets with different values for the flag. That would lead to both a topic and a named branch to exists with the same name. Clashes after pulling between forks are a reality. I could have a topic "foo" and pull in a named branch "foo", this is true. However, I'm not sure I see how a separate topic and branch namespace would solve that. I'm not claiming having different namespace is preventing name collision between topic and branch. It does not. What I'm saying it that your proposal of using the branch field and a flag does not prevent it either. If I have a topic "default:foo", I could pull a named branch "foo" and unless we expose the prefix part in the topic name, that would get ambiguous too. [yep, name colision are possible] Additionally, a distinct namespace might open us up to more fork ambiguity. I could pull a topic "develop:foo". Would that be considered the same topic? Should it be merged with mine? Would it have a consequence for the implicit merge/rebase target?\ Yes, topic on multiple branch is something we need to iron more in the future. In all cases we should have the local UI fight hard to prevent people to create collisions between branch and topic. (And some descent way to point out name conflict if they appends). For sure. Though I'm not sure I see the advantage of separate namespace in this situation. [None, I'm just saying that using the same field does not seems to have advantages either] 2. Interoperability with non-topic clients would mostly just work. This could be a big boon for existing ecosystem tools like CI servers that wouldn't have to be modified. There is some very interesting ramification to this proposal. Even if we do not go with the flag approach. We store the full (branch, topic) pair into the branch field. For example we could use ":" as a separator branch=BRANCH:TOPIC. Not only this would allow old clients to transport the data (we already have that) but this also mean old client can also view and preserve that data (and this is new) even if it does not get the behavior improvement related to topic. That would be a large usability boost for old client. I really like that idea and I hadn't considered myself. It would, at least in theory, mean old clients won't wreak havoc and will be able to see all topics and branches. They could even commit on top of a topic without breaking anything. That said, the situation might be different for CI servers and generally any other automated tool that doesn't anticipate the (in their legacy view, illegal branch
Re: [PATCH] revset: for x^2, do not take null as a valid p2 revision
On 10/14/2016 05:35 PM, Yuya Nishihara wrote: # HG changeset patch # User Yuya Nishihara# Date 1476455580 -32400 # Fri Oct 14 23:33:00 2016 +0900 # Node ID 18ba0036dd81db9e45fd1a450c7f207f1b89f22c # Parent 5cb830801855dbb63e98b948e355bc995d295bf3 revset: for x^2, do not take null as a valid p2 revision Since we don't count null p2 revision as a parent, x^2 should never return null even if null is explicitly populated. diff --git a/mercurial/revset.py b/mercurial/revset.py --- a/mercurial/revset.py +++ b/mercurial/revset.py @@ -1620,7 +1620,7 @@ def parentspec(repo, subset, x, n, order ps.add(cl.parentrevs(r)[0]) elif n == 2: parents = cl.parentrevs(r) -if len(parents) > 1: +if parents[1] != node.nullrev: Is there any chance that p1 is nullid and p2 is not? Also, how does this behave if r is a root (p1 and p2 is nullid). Should we instead filter the parents list ?:: parents = [p for p in cl.parentrevs(r) if p != nullrev] ps.add(parents[1]) return subset & ps diff --git a/tests/test-revset.t b/tests/test-revset.t --- a/tests/test-revset.t +++ b/tests/test-revset.t @@ -2750,6 +2750,7 @@ parentrevspec 5 $ log 'merge()^2' 4 + $ log '(not merge())^2' $ log 'merge()^^' 3 $ log 'merge()^1^' ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel -- Pierre-Yves David ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: Topics [was: news from the topic experiment]
On Thu, Oct 13, 2016 at 7:01 AM, Pierre-Yves David < pierre-yves.da...@ens-lyon.org> wrote: > > So, thanks for exploring possibilities to make this frontier thiner. > However, I can see some issues with some aspects of this proposal, using > the same field for either branch or topic make them mostly mutually > exclusive. Publishing a topic on a named branch would require to alter the > changesets data (and therefore, hash) This means people could use topic > only with the default branch (or significant more complexity to work around > this). As I understand, Bitbucket enforces a single master branch so that > might actually fit your model. This is probably too restricting for the > general project (more about that later). > I'm not sure I understand. I would think that the mutually exclusive part would be desirable, not a limitation. I'm not sure how that would limit people to using only the default branch? One could still create 2 long-lived branched. The most common workflows are based around gitflow and typically define 2 long-lived branches: e.g. "master" and "develop". While Bitbucket only treats one branch as the "main branch", most teams have more than 1 long-lived branch. Bitbucket's main branch merely serves as the default pull request destination and initial context for the source browser. It would seem to me that this could have some benefits: >> >> 1. There would be no risk of a name collision between the branch and >> topic namespaces. >> > > I'm not certain this actually avoid the risk of name collision. People > could use the same branch/topic name on different changesets with different > values for the flag. That would lead to both a topic and a named branch to > exists with the same name. > Clashes after pulling between forks are a reality. I could have a topic "foo" and pull in a named branch "foo", this is true. However, I'm not sure I see how a separate topic and branch namespace would solve that. If I have a topic "default:foo", I could pull a named branch "foo" and unless we expose the prefix part in the topic name, that would get ambiguous too. Additionally, a distinct namespace might open us up to more fork ambiguity. I could pull a topic "develop:foo". Would that be considered the same topic? Should it be merged with mine? Would it have a consequence for the implicit merge/rebase target? > In all cases we should have the local UI fight hard to prevent people to > create collisions between branch and topic. (And some descent way to point > out name conflict if they appends). > For sure. Though I'm not sure I see the advantage of separate namespace in this situation. > 2. Interoperability with non-topic clients would mostly just work. >> >> This could be a big boon for existing ecosystem tools like CI servers >> that wouldn't have to be modified. >> > > There is some very interesting ramification to this proposal. Even if we > do not go with the flag approach. We store the full (branch, topic) pair > into the branch field. For example we could use ":" as a separator > branch=BRANCH:TOPIC. Not only this would allow old clients to transport the > data (we already have that) but this also mean old client can also view and > preserve that data (and this is new) even if it does not get the behavior > improvement related to topic. That would be a large usability boost for old > client. > I really like that idea and I hadn't considered myself. It would, at least in theory, mean old clients won't wreak havoc and will be able to see all topics and branches. They could even commit on top of a topic without breaking anything. That said, the situation might be different for CI servers and generally any other automated tool that doesn't anticipate the (in their legacy view, illegal branch name). Assuming new clients would still hide the prefix in their UI (and so would Bitbucket), telling a legacy CI server to checkout topic "foo" would likely fail. I understand we might be driven by somewhat different motivations. The prospect to some level of compatibility with existing tools (not necessarily just an old hg client) I think is huge in our world. If Bitbucket teams couldn't switch to using topics until all CI and deployment infrastructure had been patched and upgraded first, that might lead to a chicken-and-egg problem. Cheers, Erik ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 6] largefiles: when setting/clearing x bit on largefiles, don't change other bits
On Thu, 13 Oct 2016 17:57:57 +0200, Pierre-Yves David wrote: > On 10/13/2016 04:13 PM, Yuya Nishihara wrote: > > On Thu, 13 Oct 2016 16:03:03 +0200, Mads Kiilerich wrote: > >> On 10/13/2016 03:58 PM, Yuya Nishihara wrote: > >>> On Thu, 13 Oct 2016 15:46:25 +0200, Mads Kiilerich wrote: > On 10/13/2016 03:39 PM, Yuya Nishihara wrote: > > The series looks good to me. Queued up to the patch 5, thanks. > I have also looked into reworking it to increasing the default chunksize > to 128k everywhere - that seems a bit cleaner in hindsight. I will send > that version if you un-queue it again ;-) > >>> Unqueue everything? > >> > >> I have no changes to the first two so it would be great to keep them ... > >> but I could also easily resend them - whatever is least trouble ;-) > > > > Okay, pruned the following changes from hg-committed: > > Urg, direct pruning on hg-commmitted is problematic as it makes > changesets disappear on the contributor see (see the related thread on > the reviewers mailing list a while back and the drophack¹ extension (for > reviewers only)). > > [1] https://www.mercurial-scm.org/repo/evolve/file/tip/hgext/drophack.py Oh, but my name is not Matt Mackall. :-) I pruned them assuming Mads would already start working on new version so he would have new commits in his local repository. Do we need to ssh into m-s-o and run "hg drop" ? ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 2 V2] py3: refactor token parsing to handle call args properly
On Thu, 13 Oct 2016 10:22:08 +0100, Martijn Pieters wrote: > # HG changeset patch > # User Martijn Pieters> # Date 1476350485 -3600 > # Thu Oct 13 10:21:25 2016 +0100 > # Node ID 36607cf1bbd9fd4d99b607f927bc807fcc48d0ea > # Parent 733fb9f7bc92c694ba6bededaeb93206528c0bcd > py3: refactor token parsing to handle call args properly Nice, but one nit. > +def _findargnofcall(n): > +"""Find arg n of a call expression (start at 0) > + > +Returns index of the first token of that argument, or None if > +there is not that many arguments. > + > +Assumes that token[i + 1] is '('. > + > +""" > +j = i + 2 > +nested = 0 > +try: > +while True: > +if _isop(j, ')', ']', '}'): > +# end of call, tuple, subscription or dict / set > +nested -= 1 > +if nested < 0: > +return None > +elif n == 0: > +# this is the starting position of arg > +return j > +elif _isop(j, '(', '[', '{'): > +nested += 1 > +elif _isop(j, ',') and nested == 0: > +n -= 1 > +j += 1 > +except IndexError: > +return None This loop would never stop if malformed tokens were provided. I guess you had a working version in which _isop() could raise IndexError. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 2 of 2] eol: make sure we always release the wlock when writing cache
On Thu, 13 Oct 2016 13:53:13 +0200, Pierre-Yves David wrote: > # HG changeset patch > # User Pierre-Yves David> # Date 1476359267 -7200 > # Thu Oct 13 13:47:47 2016 +0200 > # Node ID db0b3bc2e7ae19973240275fb9e3f964ecdeb019 > # Parent 88cc944830d0c1895e527d6ca13687f1d5e1c785 > eol: make sure we always release the wlock when writing cache Queued this, thanks. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH] pathencode: use assert() for PyBytes_Check()
On Thu, 13 Oct 2016 21:42:42 +0200, Gregory Szorc wrote: > # HG changeset patch > # User Gregory Szorc> # Date 1476387731 -7200 > # Thu Oct 13 21:42:11 2016 +0200 > # Node ID 4bbdf012fafbdffd4d4402a712fdc01d3252d00c > # Parent c0410814002f467c24ef07ce73850ba15b306f8e > pathencode: use assert() for PyBytes_Check() Queued, thanks! ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH] dirs: document Py_SIZE weirdness
FYI This is the violation of C API, explicit assignment would be probably better (at least it's CLEAR that you're doing something dodgy) On Thu, Oct 13, 2016 at 9:43 PM, Gregory Szorcwrote: > # HG changeset patch > # User Gregory Szorc > # Date 1475939263 -7200 > # Sat Oct 08 17:07:43 2016 +0200 > # Node ID c05716f3eb0a4a70ef1f59b51cd1b29ee683a145 > # Parent c0410814002f467c24ef07ce73850ba15b306f8e > dirs: document Py_SIZE weirdness > > Assigning to what looks like a function is clown shoes. Document that > it is a macro referring to a struct member. > > diff --git a/mercurial/dirs.c b/mercurial/dirs.c > --- a/mercurial/dirs.c > +++ b/mercurial/dirs.c > @@ -75,8 +75,10 @@ static int _addpath(PyObject *dirs, PyOb > pos < 2 ? 2 : pos); > if (key == NULL) > goto bail; > } > + /* Py_SIZE(o) refers to the ob_size member of the struct. Yes, > + * assigning to what looks like a function seems wrong. */ > Py_SIZE(key) = pos; > ((PyBytesObject *)key)->ob_sval[pos] = '\0'; > > val = PyDict_GetItem(dirs, key); > ___ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 3 of 3] merge: clarify warning for (not) merging flags without ancestor
On 10/12/2016 12:25 PM, Mads Kiilerich wrote: # HG changeset patch # User Mads Kiilerich# Date 1476267738 -7200 # Wed Oct 12 12:22:18 2016 +0200 # Node ID 9588752fc3a6d2b5b1f40ddfc48965997270892d # Parent 29bd20c4999865fc19ca3f0344c2c1231a318b1c merge: clarify warning for (not) merging flags without ancestor Pushed, thanks -- Pierre-Yves David ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH] dirs: document Py_SIZE weirdness
On 10/13/2016 09:43 PM, Gregory Szorc wrote: # HG changeset patch # User Gregory Szorc# Date 1475939263 -7200 # Sat Oct 08 17:07:43 2016 +0200 # Node ID c05716f3eb0a4a70ef1f59b51cd1b29ee683a145 # Parent c0410814002f467c24ef07ce73850ba15b306f8e dirs: document Py_SIZE weirdness Assigning to what looks like a function is clown shoes. Document that it is a macro referring to a struct member. Pushed, thanks. -- Pierre-Yves David ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 2 of 2 V2] record: return code from underlying commit
On 10/14/2016 09:55 AM, Philippe Pepiot wrote: # HG changeset patch # User Philippe Pepiot# Date 1476267774 -7200 # Wed Oct 12 12:22:54 2016 +0200 # Node ID 617220c6b76e9faf873dbacf69baff0196b52d92 # Parent 9eb1db967e9db678e06723b26966b06f060ef36e # EXP-Topic record-return-code record: return code from underlying commit These two are pushed, thanks. Cheers, -- Pierre-Yves David ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH] demandimport: disable lazy import of __builtin__
On 10/14/2016 03:03 AM, Mads Kiilerich wrote: # HG changeset patch # User Mads Kiilerich# Date 1476407019 -7200 # Fri Oct 14 03:03:39 2016 +0200 # Node ID c0fd76f612735f0354102603ee630437208d8336 # Parent c0410814002f467c24ef07ce73850ba15b306f8e demandimport: disable lazy import of __builtin__ Pushed, thanks. -- Pierre-Yves David ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 2 of 2 V2] record: return code from underlying commit
# HG changeset patch # User Philippe Pepiot# Date 1476267774 -7200 # Wed Oct 12 12:22:54 2016 +0200 # Node ID 617220c6b76e9faf873dbacf69baff0196b52d92 # Parent 9eb1db967e9db678e06723b26966b06f060ef36e # EXP-Topic record-return-code record: return code from underlying commit diff --git a/hgext/record.py b/hgext/record.py --- a/hgext/record.py +++ b/hgext/record.py @@ -70,7 +70,7 @@ def record(ui, repo, *pats, **opts): backup = ui.backupconfig('experimental', 'crecord') try: ui.setconfig('experimental', 'crecord', False, 'record') -commands.commit(ui, repo, *pats, **opts) +return commands.commit(ui, repo, *pats, **opts) finally: ui.restoreconfig(backup) diff --git a/tests/test-record.t b/tests/test-record.t --- a/tests/test-record.t +++ b/tests/test-record.t @@ -77,6 +77,7 @@ Select no files examine changes to 'empty-rw'? [Ynesfdaq?] n no changes to record + [1] $ hg tip -p changeset: -1: ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 2 of 2] changegroup: skip delta when the underlying revlog do not use them
Cool. I was going to author this patch when I got back home! This patch will result in CPU regression for old clients having to re-deltify. It would be nice to have numbers for that. I'm optimistic it is roughly the same as the server gains and it won't be significant enough to not take the patch. We also don't have a perf* command to measure changegroup application for a single component IIRC. So getting data isn't trivial :/ > On Oct 14, 2016, at 03:08, Pierre-Yves David> wrote: > > # HG changeset patch > # User Pierre-Yves David > # Date 1476401471 -7200 > # Fri Oct 14 01:31:11 2016 +0200 > # Node ID 747c0a1084ef6251a987e469197bad7796756403 > # Parent e19eb107706e7210c3b359d66f5274911b181566 > # EXP-Topic storedeltachains > changegroup: skip delta when the underlying revlog do not use them > > Revlog can now be configured to store full snapshot only. This is used on the > changelog. However, the changegroup packing was still recomputing deltas to be > sent over the wire. > > We now just reuse the full snapshot directly in this case, skipping delta > computation. This provides use with a large speed up(-30%): > > # perfchangegroupchangelog on mercurial > ! wall 2.010326 comb 2.02 user 2.00 sys 0.02 (best of 5) > ! wall 1.382039 comb 1.38 user 1.37 sys 0.01 (best of 8) > > # perfchangegroupchangelog on pypy > ! wall 5.792589 comb 5.78 user 5.78 sys 0.00 (best of 3) > ! wall 3.911158 comb 3.92 user 3.90 sys 0.02 (best of 3) > > # perfchangegroupchangelog on mozilla central > ! wall 20.683727 comb 20.68 user 20.63 sys 0.05 (best of 3) > ! wall 14.190204 comb 14.19 user 14.15 sys 0.04 (best of 3) > > Many tests have to be updated because of the change in bundle content. All > theses update have been verified. Because diffing changelog was not very > valuable, the resulting bundle have similar size (often a bit smaller): > > # full bundle of mozilla central > with delta:1142740533B > without delta: 1142173300B > > So this is a win all over the board. > > diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py > --- a/mercurial/changegroup.py > +++ b/mercurial/changegroup.py > @@ -818,18 +818,24 @@ class cg2packer(cg1packer): > > def deltaparent(self, revlog, rev, p1, p2, prev): > dp = revlog.deltaparent(rev) > -# Avoid sending full revisions when delta parent is null. Pick > -# prev in that case. It's tempting to pick p1 in this case, as p1 > -# will be smaller in the common case. However, computing a delta > -# against p1 may require resolving the raw text of p1, which could > -# be expensive. The revlog caches should have prev cached, meaning > -# less CPU for changegroup generation. There is likely room to add > -# a flag and/or config option to control this behavior. > -# > -# Pick prev when we can't be sure remote has the base revision. > -if dp == nullrev or (dp != p1 and dp != p2 and dp != prev): > +if dp == nullrev and revlog.storedeltachains: > +# Avoid sending full revisions when delta parent is null. Pick > prev > +# in that case. It's tempting to pick p1 in this case, as p1 will > +# be smaller in the common case. However, computing a delta > against > +# p1 may require resolving the raw text of p1, which could be > +# expensive. The revlog caches should have prev cached, meaning > +# less CPU for changegroup generation. There is likely room to > add > +# a flag and/or config option to control this behavior. > return prev > -return dp > +elif dp == nullrev: > +# revlog is configure to use full snapshot for a reason, > +# stick to full snapshot. > +return nullrev > +elif dp not in (p1, p2, prev): > +# Pick prev when we can't be sure remote has the base revision. > +return prev > +else: > +return dp > > def builddeltaheader(self, node, p1n, p2n, basenode, linknode, flags): > # Do nothing with flags, it is implicitly 0 in cg1 and cg2problem > diff --git a/tests/test-acl.t b/tests/test-acl.t > --- a/tests/test-acl.t > +++ b/tests/test-acl.t > @@ -113,7 +113,7 @@ Extension disabled for lack of a hook > adding quux/file.py revisions > added 3 changesets with 3 changes to 3 files > updating the branch cache > - bundle2-input-part: total payload size 1606 > + bundle2-input-part: total payload size 1553 > bundle2-input-part: "pushkey" (params: 4 mandatory) supported > pushing key for "phases:911600dab2ae7a9baff75958b84fe606851ce955" > bundle2-input-bundle: 3 parts total > @@ -178,7 +178,7 @@ Extension disabled for lack of acl.sourc > calling hook pretxnchangegroup.acl: hgext.acl.hook > acl: changes have
Re: [PATCH] color: replace "purple_background" by "magenta_background"
Pierre-Yves David a écrit : On 10/11/2016 11:39 AM, Denis Laxalde wrote: # HG changeset patch # User Denis Laxalde# Date 1476176128 -7200 # Tue Oct 11 10:55:28 2016 +0200 # Node ID 1cf20c1c54cb8172816604de5e1d98d3cd62d711 # Parent b85fa6bf298be07804a74d8fdec0d19fdbc6d740 # EXP-Topic color-purple-crash color: replace "purple_background" by "magenta_background" We probably want to keep both value around for backward compatibility reason. Aggreed. I'm also a bit curious about what all this is about. Can I convinced you to dig it a bit further? Not sure how deep you'd like me to dig. This occurs with "terminfo" color mode. So, to reproduce: HGRCPATH= hg --config extensions.color= --config color.mode=terminfo debugcolor -- Denis Laxalde Logilab http://www.logilab.fr ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH] osutil: implement listdir for linux
# HG changeset patch # User Maciej Fijalkowski# Date 1476428549 -7200 # Fri Oct 14 09:02:29 2016 +0200 # Node ID 4e80a66124279e235dec7ae8f58c0a14b0b137bf # Parent c770219dc4c253d7cd82519ce3c74438bb2829d3 osutil: implement listdir for linux diff --git a/mercurial/pure/osutil.py b/mercurial/pure/osutil.py --- a/mercurial/pure/osutil.py +++ b/mercurial/pure/osutil.py @@ -66,13 +66,20 @@ return result ffi = None -if modulepolicy not in policynocffi and sys.platform == 'darwin': +if modulepolicy not in policynocffi and ( +sys.platform == 'darwin' or sys.platform.startswith('linux')): try: from _osutil_cffi import ffi, lib except ImportError: if modulepolicy == 'cffi': # strict cffi import raise +class stat_res(object): +def __init__(self, st_mode, st_mtime, st_size): +self.st_mode = st_mode +self.st_mtime = st_mtime +self.st_size = st_size + if sys.platform == 'darwin' and ffi is not None: listdir_batch_size = 4096 # tweakable number, only affects performance, which chunks @@ -88,12 +95,6 @@ attrkinds[lib.VFIFO] = statmod.S_IFIFO attrkinds[lib.VSOCK] = statmod.S_IFSOCK -class stat_res(object): -def __init__(self, st_mode, st_mtime, st_size): -self.st_mode = st_mode -self.st_mtime = st_mtime -self.st_size = st_size - tv_sec_ofs = ffi.offsetof("struct timespec", "tv_sec") buf = ffi.new("char[]", listdir_batch_size) @@ -117,7 +118,7 @@ tp = attrkinds[cur.obj_type] if name == "." or name == "..": continue -if skip == name and tp == statmod.S_ISDIR: +if skip == name and tp == statmod.S_IFDIR: return [] if stat: mtime = cur.mtime.tv_sec @@ -146,12 +147,74 @@ try: ret = listdirinternal(dfd, req, stat, skip) finally: +lib.close(dfd) +return ret + +elif sys.platform.startswith('linux') and ffi is not None: + +_known_common_file_types = { +lib.DT_DIR: statmod.S_IFDIR, +lib.DT_CHR: statmod.S_IFCHR, +lib.DT_BLK: statmod.S_IFBLK, +lib.DT_REG: statmod.S_IFREG, +lib.DT_FIFO: statmod.S_IFIFO, +lib.DT_LNK: statmod.S_IFLNK, +lib.DT_SOCK: statmod.S_IFSOCK, +} + +def listdirinternal(dir, dirfd, stat, skip): +buf = ffi.new("struct stat *") +ret = [] +while True: +ffi.errno = 0 +dirent = lib.readdir(dir) +if not dirent: +break +dname = dirent.d_name +if dname[0] == "." and (dname[1] == "\x00" or +(dname[1] == "." and dname[2] == "\x00")): +continue +# +dtype = dirent.d_type +if stat or dtype not in _known_common_file_types: +if lib.fstatat(dirfd, dirent.d_name, buf, + lib.AT_SYMLINK_NOFOLLOW) != 0: +raise OSError(ffi.errno, os.strerror(ffi.errno)) +tp = statmod.S_IFMT(buf.st_mode) +else: +tp = _known_common_file_types[dtype] +# +name = ffi.string(dirent.d_name) +if skip == name and tp == statmod.S_IFDIR: +return [] +if stat: +ret.append((name, tp, stat_res( + st_mode = buf.st_mode, + st_mtime = buf.st_mtim.tv_sec, + st_size = buf.st_size))) +else: +ret.append((name, tp)) + +if ffi.errno != 0: +raise OSError(ffi.errno, os.strerror(ffi.errno)) +return ret + +def listdir(path, stat=False, skip=None): +dfd = os.open(path, os.O_RDONLY) +dir = lib.fdopendir(dfd) +if dir == ffi.NULL: try: -lib.close(dfd) +os.close(dfd) except BaseException: -pass # we ignore all the errors from closing, not -# much we can do about that +pass +raise OSError(ffi.errno, os.strerror(ffi.errno)) + +try: +ret = listdirinternal(dir, dfd, stat, skip) +finally: +lib.closedir(dir) return ret + else: listdir = listdirpure diff --git a/setup.py b/setup.py --- a/setup.py +++ b/setup.py @@ -323,7 +323,7 @@ exts = [setup_mpatch_cffi.ffi.distutils_extension(), setup_bdiff_cffi.ffi.distutils_extension()] # cffi modules go here -if sys.platform == 'darwin': +if sys.platform == 'darwin' or sys.platform.startswith('linux'): import setup_osutil_cffi exts.append(setup_osutil_cffi.ffi.distutils_extension())