Re: Git garden shears, was Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
Hi Kuba, On Wed, 21 Sep 2016, Jakub Narębski wrote: > W dniu 11.09.2016 o 10:33, Johannes Schindelin napisał: > > On Fri, 9 Sep 2016, Jakub Narębski wrote: > [...] > > >> When preserving merges, there are (as far as I understand it), two > >> problems: > >> - what it means to preserve changes (which change to pick, > >>that is what is the mainline changes rebase is re-applying) > >> - what are parents of the merge commit (at least one parent > >>would be usually rewritten) > >> > >> Maybe the internal (and perhaps also user-visible) representation > >> of merge in instruction sheet could use the notation of filter-branch, > >> that is 'map()'... it could also imply the mainline. > >> > >> That is the instruction in the internal instruction sheet could > >> look like this: > >> > >> merge -m 1 map(2fd4e1c6...) da39a3ee... \t Merge 'foo' into master > >> > >> > >> Note that it has nothing to do with this series! > > > > Right. But I did solve that already. In the Git garden shears [*1*] > > (essentially my New And Improved attempt at recreating branch structures > > while rebasing), I generate and process scripts like this: > > > > mark onto > > > > # Branch: super-cool-feature > > rewind onto > > pick 1 feature > > pick 2 documentation > > mark super-cool-feature > > > > # Branch: typo-fix > > rewind onto > > pick a fix a tyop > > There probably should be there > > mark typo-fix Correct. Sorry for the omission. > > rewind onto > > merge -C cafebabe super-cool-feature > > merge -C babecafe typo-fix > > > > cleanup super-cool-feature typo-fix > > > > Of course this will change a little, still, once I get around to implement > > this on top of the rebase--helper. > > Do I understand it correctly that it is user-visible instruction sheet, and > not the internal instruction sheet for sequencer? This looks very nice > and is well readable. It is intended as that. Currently I need a little trickery to make this work, as rebase -i does not understand rewind nor merge. The trick is to re-use the shears.sh script as editor that then populates the edit script, calls the real editor, and then installs a temporary alias that gets called for all custom commands via exec (turning e.g. "rewind abc" into "exec git .r rewind abc"). The temporary alias point back to the shears script, too, of course. In the end, I hope to teach the sequencer a variant of this dialect, as well as the trick to generate such edit scripts. > I guess that it needs to be pre-populated by Git based on topology of the > branch being rebased. Yes. The shears.sh script is in charge of that, and it has to perform a couple of Git calls to do so. > As I see, there are three basic topologies of non-linear branch to be > rebased; all else is combination of thereof, or derivative: > > 1. Merge commit without branching point, that is we need to go >from the following situation > >*---*---*---#---o---o---o<-- old base >\\ > \\=a===b===M===c<-- branch being rebased > / > ...---x---x---x-/ <-- side branch > > to the following: > >*---*---*---#---o---o---o > \ > \-a'--b'--M'--c' > / > ...---x---x---x-/ In other words: rebasing a merge commit merging non-rebased commits. This is not yet supported by the shears script, as it would require logic that is not only slow in shell script, but also convoluted. IOW this feature waits for the sequencer to know how to run regular rebase -i already. > I think this case is the only one supported by `--preserve-merges`, > but I may be mistaken - I never had the need to use this feature IRL. No, -p would handle merges of non-rebased commits as well as merges of to-be-rebased commits. > 2. Branching point without accompanying merge commit, or in other words >rebasing many branches tied together; a shrub if you will. That is, >we want to go from the following situation: > >*---*---*---#---o---o---o <-- old base >\ > \--a---b---c <-- branch being rebased > \ > \-1 <-- dependent branch > >to the following one: > >*---*---*---#---o---o---o > \ > \--a'--b'--c' > \ > \-1' This is outside the scope of rebase -i (and of the shears), as you are talking about *parallel* rebases. I don't do that, nor does rebase -i, rebase -p nor the shears. > I don't think Git supports something like that out of the box, but it > is not hard to create something like that "by hand". It is not much > of a problem... unless you forget to rebase the
Re: Git garden shears, was Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
Hello Dscho, W dniu 11.09.2016 o 10:33, Johannes Schindelin napisał: > On Fri, 9 Sep 2016, Jakub Narębski wrote: [...] >> When preserving merges, there are (as far as I understand it), two >> problems: >> - what it means to preserve changes (which change to pick, >>that is what is the mainline changes rebase is re-applying) >> - what are parents of the merge commit (at least one parent >>would be usually rewritten) >> >> Maybe the internal (and perhaps also user-visible) representation >> of merge in instruction sheet could use the notation of filter-branch, >> that is 'map()'... it could also imply the mainline. >> >> That is the instruction in the internal instruction sheet could >> look like this: >> >> merge -m 1 map(2fd4e1c6...) da39a3ee... \t Merge 'foo' into master >> >> >> Note that it has nothing to do with this series! > > Right. But I did solve that already. In the Git garden shears [*1*] > (essentially my New And Improved attempt at recreating branch structures > while rebasing), I generate and process scripts like this: > > mark onto > > # Branch: super-cool-feature > rewind onto > pick 1 feature > pick 2 documentation > mark super-cool-feature > > # Branch: typo-fix > rewind onto > pick a fix a tyop There probably should be there mark typo-fix > > rewind onto > merge -C cafebabe super-cool-feature > merge -C babecafe typo-fix > > cleanup super-cool-feature typo-fix > > Of course this will change a little, still, once I get around to implement > this on top of the rebase--helper. Do I understand it correctly that it is user-visible instruction sheet, and not the internal instruction sheet for sequencer? This looks very nice and is well readable. I guess that it needs to be pre-populated by Git based on topology of the branch being rebased. As I see, there are three basic topologies of non-linear branch to be rebased; all else is combination of thereof, or derivative: 1. Merge commit without branching point, that is we need to go from the following situation *---*---*---#---o---o---o<-- old base \\ \\=a===b===M===c<-- branch being rebased / ...---x---x---x-/ <-- side branch to the following: *---*---*---#---o---o---o \ \-a'--b'--M'--c' / ...---x---x---x-/ I think this case is the only one supported by `--preserve-merges`, but I may be mistaken - I never had the need to use this feature IRL. 2. Branching point without accompanying merge commit, or in other words rebasing many branches tied together; a shrub if you will. That is, we want to go from the following situation: *---*---*---#---o---o---o <-- old base \ \--a---b---c <-- branch being rebased \ \-1 <-- dependent branch to the following one: *---*---*---#---o---o---o \ \--a'--b'--c' \ \-1' I don't think Git supports something like that out of the box, but it is not hard to create something like that "by hand". It is not much of a problem... unless you forget to rebase the second dependent branch. 3. Branching point with merge point, that is subbranch created and merged - an "eye" (it is not a loop in DAG): *---*---*---#---o---o---o <-- old base \ \--a---b---c---M---d <-- branch being rebased \ / \-1---2-/ [ <-- possibly a branch ] All edges are directed edges, with arrows pointing from right to left; that is *---* is really *<---* The expected result is: *---*---*---#---o---o---o \ \--a'--b'--c'--M'--d' \ / \-1'--2'/ I guess that is the main purpose of your git-garden-shears script, isn't it? > > For example, I am not so hot about the "merge -C ..." syntax. I'll > probably split that into a "remerge " and a new "merge > " command (the latter asking interactively for the merge commit > message). There is also an additional complication in that merge commit message may be *partially* automatically generated. First there is the subject generated by 'git merge' ("Merge branch 'foo'") or 'git pull '. It might have been translated, or extended. Second there is a place for branch cover letter. Third, subject to merge.log / merge.summary there is a shortlog. >From those shortlog should be surely updated to correspond to the post-rebase state. The first line could b
Git garden shears, was Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
Hi Kuba, On Fri, 9 Sep 2016, Jakub Narębski wrote: > Hello Johannes, > > W dniu 09.09.2016 o 17:12, Johannes Schindelin napisał: > > On Thu, 1 Sep 2016, Junio C Hamano wrote: > >> Johannes Schindelin writes: > > >> I was sort of expecting that, when you do the preserve-merges mode > >> of "rebase -i", you would need to jump around, doing "we have > >> reconstructed the side branch on a new 'onto', let's give the result > >> this temporary name ':1', and then switch to the trunk (which would > >> call for 'reset ' instruction) and merge that thing (which > >> would be 'merge :1' or perhaps called 'pick :1')", and at that point > >> you no longer validate the object references upfront. > > > > Except that is not how --preserve-merges works: it *still* uses the SHA-1s > > as identifiers, even when the SHA-1 may have changed in the meantime. > > > > That is part of why it was a bad design. > > When preserving merges, there are (as far as I understand it), two > problems: > - what it means to preserve changes (which change to pick, >that is what is the mainline changes rebase is re-applying) > - what are parents of the merge commit (at least one parent >would be usually rewritten) > > Maybe the internal (and perhaps also user-visible) representation > of merge in instruction sheet could use the notation of filter-branch, > that is 'map()'... it could also imply the mainline. > > That is the instruction in the internal instruction sheet could > look like this: > > merge -m 1 map(2fd4e1c67a2d28fced849ee1bb76e7391b93eb12) > da39a3ee5e6b4b0d3255bfef95601890afd80709 \t Merge 'foo' into master > > > Note that it has nothing to do with this series! Right. But I did solve that already. In the Git garden shears [*1*] (essentially my New And Improved attempt at recreating branch structures while rebasing), I generate and process scripts like this: mark onto # Branch: super-cool-feature rewind onto pick 1 feature pick 2 documentation mark super-cool-feature # Branch: typo-fix rewind onto pick a fix a tyop rewind onto merge -C cafebabe super-cool-feature merge -C babecafe typo-fix cleanup super-cool-feature typo-fix Of course this will change a little, still, once I get around to implement this on top of the rebase--helper. For example, I am not so hot about the "merge -C ..." syntax. I'll probably split that into a "remerge " and a new "merge " command (the latter asking interactively for the merge commit message). And also: the cleanup stage should not be necessary, as the "mark" commands can accumulate the known marks into a file in the state directory. But you get the idea. No :1 or some such. That's machine readable. But it's utter nonsense for user-facing UIs. Ciao, Dscho Footnote *1*: https://github.com/git-for-windows/build-extra/blob/master/shears.sh
Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
Hello Johannes, W dniu 09.09.2016 o 17:12, Johannes Schindelin napisał: > On Thu, 1 Sep 2016, Junio C Hamano wrote: >> Johannes Schindelin writes: >> I was sort of expecting that, when you do the preserve-merges mode >> of "rebase -i", you would need to jump around, doing "we have >> reconstructed the side branch on a new 'onto', let's give the result >> this temporary name ':1', and then switch to the trunk (which would >> call for 'reset ' instruction) and merge that thing (which >> would be 'merge :1' or perhaps called 'pick :1')", and at that point >> you no longer validate the object references upfront. > > Except that is not how --preserve-merges works: it *still* uses the SHA-1s > as identifiers, even when the SHA-1 may have changed in the meantime. > > That is part of why it was a bad design. When preserving merges, there are (as far as I understand it), two problems: - what it means to preserve changes (which change to pick, that is what is the mainline changes rebase is re-applying) - what are parents of the merge commit (at least one parent would be usually rewritten) Maybe the internal (and perhaps also user-visible) representation of merge in instruction sheet could use the notation of filter-branch, that is 'map()'... it could also imply the mainline. That is the instruction in the internal instruction sheet could look like this: merge -m 1 map(2fd4e1c67a2d28fced849ee1bb76e7391b93eb12) da39a3ee5e6b4b0d3255bfef95601890afd80709 \t Merge 'foo' into master Note that it has nothing to do with this series! Best regards, -- Jakub Narębski
Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
Hi Junio, On Thu, 1 Sep 2016, Junio C Hamano wrote: > Johannes Schindelin writes: > > >> though). The "one sequencer to rule them all" may even have to say > >> "now give name ':1' to the result of the previous operation" in one > >> step and in another later step have an instruction "merge ':1'". > >> When that happens, you cannot even pre-populate the commit object > >> when the sequencer reads the file, as the commit has not yet been > >> created at that point. > > > > These considerations are pretty hypothetical. I would even place a bet > > that we will *never* have ":1" as names, not if I have anything to say... > > ;-) > > If you can always work with pre-existing commit, then you can > validate all object references that appear in the instructions > upfront. Or if *some* of the commands work with pre-existing commits, *those* commands can be validated up-front. Which is exactly what my code does. > I was sort of expecting that, when you do the preserve-merges mode > of "rebase -i", you would need to jump around, doing "we have > reconstructed the side branch on a new 'onto', let's give the result > this temporary name ':1', and then switch to the trunk (which would > call for 'reset ' instruction) and merge that thing (which > would be 'merge :1' or perhaps called 'pick :1')", and at that point > you no longer validate the object references upfront. Except that is not how --preserve-merges works: it *still* uses the SHA-1s as identifiers, even when the SHA-1 may have changed in the meantime. That is part of why it was a bad design. > If you do not have to have such a "mark this point" and a "refer to > that point we previously marked", then I agree that you should be > able to pre-validate and keep the result in the structure. Even then, those markers should *still* be validated. They, too, need to be created and later used, usage before creation would be an error. But... 1) this is not yet a problem, so why are we discussing it here? Do we not have actual problems with these patches to discuss anymore? 2) the SHA-1s that *can* be validated *should* be validated, so I find the objection a little bogus. Ciao, Dscho
Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
Hi Junio, On Thu, 1 Sep 2016, Junio C Hamano wrote: > Jakub Narębski writes: > > > I wonder how probable is situation where we save instruction sheet > > for interactive rebase, with shortened SHA-1, and during rebase > > shortened SHA-1 stops being unambiguous... > > It is my understanding that the shortened ones are only for end-user > consumption. The insn sheet internally uses fully expanded form for > this exact reason, and then abbreviated back at each step before the > updated one is presented to the end-user. Uniqueness guarantee is > enforced with new objects created during each step taken into > account by doing it this way. Indeed, the rebase -i shortens the SHA-1s just before letting the user edit git-rebase-todo and then expands them back. Ciao, Dscho
Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
Jakub Narębski writes: > I wonder how probable is situation where we save instruction sheet > for interactive rebase, with shortened SHA-1, and during rebase > shortened SHA-1 stops being unambiguous... It is my understanding that the shortened ones are only for end-user consumption. The insn sheet internally uses fully expanded form for this exact reason, and then abbreviated back at each step before the updated one is presented to the end-user. Uniqueness guarantee is enforced with new objects created during each step taken into account by doing it this way.
Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
Hello, W dniu 01.09.2016 o 15:12, Johannes Schindelin pisze: > On Wed, 31 Aug 2016, Jakub Narębski wrote: >> W dniu 31.08.2016 o 21:10, Junio C Hamano pisze: >>> So I am not sure if we want a parsed commit there (I would not >>> object if we kept the texual object name read from the file, >>> though). The "one sequencer to rule them all" may even have to say >>> "now give name ':1' to the result of the previous operation" in one >>> step and in another later step have an instruction "merge ':1'". >>> When that happens, you cannot even pre-populate the commit object >>> when the sequencer reads the file, as the commit has not yet been >>> created at that point. >> >> True, --preserve-merges rebase is well, different. > > It is mis-designed. And I can be that harsh because it was my design. > > In the meantime I came up with a much better design, and implemented it as > a shell script on top of rebase -i. Since shell scripts run like slow > molasses, even more so on Windows, I have a loose plan to implement its > functionality as a new --recreate-merges option, and to deprecate > --preserve-merges when that new option works. > > It needs to be a new option (not a --preserve-merges=v2) because it is a > totally different beast. For starters, it does not need its own code path > that overrides pick_one, as --preserve-merges does. Better preserving for merges (with cleanly defined sematics) would be certainly nice to have. > But I get way ahead of myself. First we need to get these last few bits > and pieces in place to accelerate (non --preserve-merges) rebase -i. But it can wait, right. -- Jakub Narębski
Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
W dniu 01.09.2016 o 11:37, Johannes Schindelin pisze: > On Wed, 31 Aug 2016, Junio C Hamano wrote: >> Jakub Narębski writes: >> diff --git a/sequencer.c b/sequencer.c index 06759d4..3398774 100644 --- a/sequencer.c +++ b/sequencer.c @@ -709,6 +709,8 @@ static int read_and_refresh_cache(struct replay_opts *opts) struct todo_item { enum todo_command command; struct commit *commit; + const char *arg; + int arg_len; >>> >>> Why 'arg', and not 'oneline', or 'subject'? >>> I'm not saying it is bad name. >> >> I am not sure what the "commit" field of type "struct commit *" is >> for. It is not needed until it is the commit's turn to be picked or >> reverted; if we end up stopping in the middle, parsing the commit >> object for later steps will end up being wasted effort. > > No, it won't be wasted effort, as we *validate* the todo script this way. > And since we may very well need the info later (most rebases do not fail > in the middle), we store it, too. The question was (I think) whether we should do eager parsing of commits, or whether we can do lazy parsing by postponing full parsing "until it is the commit's turn to be picked or reverted", and possibly when saving todo file. I wonder how probable is situation where we save instruction sheet for interactive rebase, with shortened SHA-1, and during rebase shortened SHA-1 stops being unambiguous... >[...] after parsing the todo_list, we will > have to act on the information contained therein. For example we will have > to cherry-pick some of the indicated commits (requiring a struct commit * > for use in do_pick_commit()). Another example: we may need to determine > the oneline for use in fixup!/squash! reordering. > > So: keeping *that* aspect of the previous todo_list parsing, i.e. store a > pointer to the already-parsed commit, is the right thing to do. The above probably means that eager eval is better Best, -- Jakub Narębski
Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
Johannes Schindelin writes: >> though). The "one sequencer to rule them all" may even have to say >> "now give name ':1' to the result of the previous operation" in one >> step and in another later step have an instruction "merge ':1'". >> When that happens, you cannot even pre-populate the commit object >> when the sequencer reads the file, as the commit has not yet been >> created at that point. > > These considerations are pretty hypothetical. I would even place a bet > that we will *never* have ":1" as names, not if I have anything to say... > ;-) If you can always work with pre-existing commit, then you can validate all object references that appear in the instructions upfront. I was sort of expecting that, when you do the preserve-merges mode of "rebase -i", you would need to jump around, doing "we have reconstructed the side branch on a new 'onto', let's give the result this temporary name ':1', and then switch to the trunk (which would call for 'reset ' instruction) and merge that thing (which would be 'merge :1' or perhaps called 'pick :1')", and at that point you no longer validate the object references upfront. If you do not have to have such a "mark this point" and a "refer to that point we previously marked", then I agree that you should be able to pre-validate and keep the result in the structure.
Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
Hi Junio, On Wed, 31 Aug 2016, Junio C Hamano wrote: > Jakub Narębski writes: > > @@ -709,6 +709,8 @@ static int read_and_refresh_cache(struct replay_opts > *opts) > struct todo_item { > enum todo_command command; > struct commit *commit; > +const char *arg; > +int arg_len; > > > >> I am not sure what the "commit" field of type "struct commit *" is > >> for. It is not needed until it is the commit's turn to be picked or > >> reverted; if we end up stopping in the middle, parsing the commit > >> object for later steps will end up being wasted effort. > > > > From what I understand this was what sequencer did before this > > series, so it is not a regression (I think; the commit parsing > > was in different function, but I think at the same place in > > the callchain). > > Yes, I agree with you and I didn't mean "this is a new bug" at all. > It just is an indication that further refactoring after this step is > needed, and it is likely to involve removal or modification of this > field. Not so. After you review the sequencer-i patches, you will see that it makes tons of sense to have that "commit" field: "pick" is by far the most common case, and it would not make sense at all to validate the todo script, only to throw away the information we got, only to re-gain it later in the sequencer. Read: I strongly disagree with your cursory verdict that the "commit" field should go. Ciao, Dscho
Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
Hi Kuba & Junio, On Wed, 31 Aug 2016, Jakub Narębski wrote: > W dniu 31.08.2016 o 21:10, Junio C Hamano pisze: > > Jakub Narębski writes: > > > >>> diff --git a/sequencer.c b/sequencer.c > >>> index 06759d4..3398774 100644 > >>> --- a/sequencer.c > >>> +++ b/sequencer.c > >>> @@ -709,6 +709,8 @@ static int read_and_refresh_cache(struct replay_opts > >>> *opts) > >>> struct todo_item { > >>> enum todo_command command; > >>> struct commit *commit; > >>> + const char *arg; > >>> + int arg_len; > > > I am not sure what the "commit" field of type "struct commit *" is > > for. It is not needed until it is the commit's turn to be picked or > > reverted; if we end up stopping in the middle, parsing the commit > > object for later steps will end up being wasted effort. > > From what I understand this was what sequencer did before this > series, so it is not a regression (I think; the commit parsing > was in different function, but I think at the same place in > the callchain). Exactly. > > Also, when the sequencer becomes one sequencer to rule them all, the > > command set may contain something that does not even mention any > > commit at all (think: exec). > > The "exec" line is a bit of exception, all other rebase -i commands > take commit as parameter. It could always use NULL. There is also "noop". > > So I am not sure if we want a parsed commit there (I would not > > object if we kept the texual object name read from the file, > > though). The "one sequencer to rule them all" may even have to say > > "now give name ':1' to the result of the previous operation" in one > > step and in another later step have an instruction "merge ':1'". > > When that happens, you cannot even pre-populate the commit object > > when the sequencer reads the file, as the commit has not yet been > > created at that point. > > True, --preserve-merges rebase is well, different. It is mis-designed. And I can be that harsh because it was my design. In the meantime I came up with a much better design, and implemented it as a shell script on top of rebase -i. Since shell scripts run like slow molasses, even more so on Windows, I have a loose plan to implement its functionality as a new --recreate-merges option, and to deprecate --preserve-merges when that new option works. It needs to be a new option (not a --preserve-merges=v2) because it is a totally different beast. For starters, it does not need its own code path that overrides pick_one, as --preserve-merges does. But I get way ahead of myself. First we need to get these last few bits and pieces in place to accelerate (non --preserve-merges) rebase -i. Ciao, Dscho
Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
Hi Junio, On Wed, 31 Aug 2016, Junio C Hamano wrote: > Jakub Narębski writes: > > >> diff --git a/sequencer.c b/sequencer.c > >> index 06759d4..3398774 100644 > >> --- a/sequencer.c > >> +++ b/sequencer.c > >> @@ -709,6 +709,8 @@ static int read_and_refresh_cache(struct replay_opts > >> *opts) > >> struct todo_item { > >>enum todo_command command; > >>struct commit *commit; > >> + const char *arg; > >> + int arg_len; > > > > Why 'arg', and not 'oneline', or 'subject'? > > I'm not saying it is bad name. > > I am not sure what the "commit" field of type "struct commit *" is > for. It is not needed until it is the commit's turn to be picked or > reverted; if we end up stopping in the middle, parsing the commit > object for later steps will end up being wasted effort. No, it won't be wasted effort, as we *validate* the todo script this way. And since we may very well need the info later (most rebases do not fail in the middle), we store it, too. > Also, when the sequencer becomes one sequencer to rule them all, the > command set may contain something that does not even mention any > commit at all (think: exec). Right, in which case the "commit" field will have the value... *drum roll*... NULL! > So I am not sure if we want a parsed commit there (I would not > object if we kept the texual object name read from the file, > though). The "one sequencer to rule them all" may even have to say > "now give name ':1' to the result of the previous operation" in one > step and in another later step have an instruction "merge ':1'". > When that happens, you cannot even pre-populate the commit object > when the sequencer reads the file, as the commit has not yet been > created at that point. These considerations are pretty hypothetical. I would even place a bet that we will *never* have ":1" as names, not if I have anything to say... ;-) What is not so hypothetical is that after parsing the todo_list, we will have to act on the information contained therein. For example we will have to cherry-pick some of the indicated commits (requiring a struct commit * for use in do_pick_commit()). Another example: we may need to determine the oneline for use in fixup!/squash! reordering. So: keeping *that* aspect of the previous todo_list parsing, i.e. store a pointer to the already-parsed commit, is the right thing to do. Ciao, Dscho
Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
Hi Kuba, On Wed, 31 Aug 2016, Jakub Narębski wrote: > W dniu 29.08.2016 o 10:05, Johannes Schindelin pisze: > > > The `git-rebase-todo` file contains a list of commands. Most of those > > commands have the form > > > > > > > > The is displayed primarily for the user's convenience, as > > rebase -i really interprets only the part. However, there > > are *some* places in interactive rebase where the is used to > > display messages, e.g. for reporting at which commit we stopped. > > > > So let's just remember it when parsing the todo file; we keep a copy of > > the entire todo file anyway (to write out the new `done` and > > `git-rebase-todo` file just before processing each command), so all we > > need to do is remember the begin and end offsets. > > Actually what we remember is pointer and length, or begin offset and length, > not offset and offset. Right. Fixed. > > diff --git a/sequencer.c b/sequencer.c > > index 06759d4..3398774 100644 > > --- a/sequencer.c > > +++ b/sequencer.c > > @@ -709,6 +709,8 @@ static int read_and_refresh_cache(struct replay_opts > > *opts) > > struct todo_item { > > enum todo_command command; > > struct commit *commit; > > + const char *arg; > > + int arg_len; > > Why 'arg', and not 'oneline', or 'subject'? > I'm not saying it is bad name. Because we will use it for `exec` commands' args, too. Clarified in the commit message. > > @@ -760,6 +762,9 @@ static int parse_insn_line(struct todo_item *item, > > const char *bol, char *eol) > > status = get_sha1(bol, commit_sha1); > > *end_of_object_name = saved; > > > > + item->arg = end_of_object_name + strspn(end_of_object_name, " \t"); > > + item->arg_len = (int)(eol - item->arg); > > + > > Does it work correctly for line without , that is > > > > I think it does, but I not entirely sure. It does work correctly: in the example, *end_of_object_name would be '\n', and strspn(end_of_object_name, " \t") would return 0. Thanks for the review! Dscho
Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
Jakub Narębski writes: @@ -709,6 +709,8 @@ static int read_and_refresh_cache(struct replay_opts *opts) struct todo_item { enum todo_command command; struct commit *commit; + const char *arg; + int arg_len; > >> I am not sure what the "commit" field of type "struct commit *" is >> for. It is not needed until it is the commit's turn to be picked or >> reverted; if we end up stopping in the middle, parsing the commit >> object for later steps will end up being wasted effort. > > From what I understand this was what sequencer did before this > series, so it is not a regression (I think; the commit parsing > was in different function, but I think at the same place in > the callchain). Yes, I agree with you and I didn't mean "this is a new bug" at all. It just is an indication that further refactoring after this step is needed, and it is likely to involve removal or modification of this field.
Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
W dniu 31.08.2016 o 21:10, Junio C Hamano pisze: > Jakub Narębski writes: > >>> diff --git a/sequencer.c b/sequencer.c >>> index 06759d4..3398774 100644 >>> --- a/sequencer.c >>> +++ b/sequencer.c >>> @@ -709,6 +709,8 @@ static int read_and_refresh_cache(struct replay_opts >>> *opts) >>> struct todo_item { >>> enum todo_command command; >>> struct commit *commit; >>> + const char *arg; >>> + int arg_len; > I am not sure what the "commit" field of type "struct commit *" is > for. It is not needed until it is the commit's turn to be picked or > reverted; if we end up stopping in the middle, parsing the commit > object for later steps will end up being wasted effort. >From what I understand this was what sequencer did before this series, so it is not a regression (I think; the commit parsing was in different function, but I think at the same place in the callchain). > > Also, when the sequencer becomes one sequencer to rule them all, the > command set may contain something that does not even mention any > commit at all (think: exec). The "exec" line is a bit of exception, all other rebase -i commands take commit as parameter. It could always use NULL. > > So I am not sure if we want a parsed commit there (I would not > object if we kept the texual object name read from the file, > though). The "one sequencer to rule them all" may even have to say > "now give name ':1' to the result of the previous operation" in one > step and in another later step have an instruction "merge ':1'". > When that happens, you cannot even pre-populate the commit object > when the sequencer reads the file, as the commit has not yet been > created at that point. True, --preserve-merges rebase is well, different. Best, -- Jakub Narebski
Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
Jakub Narębski writes: >> diff --git a/sequencer.c b/sequencer.c >> index 06759d4..3398774 100644 >> --- a/sequencer.c >> +++ b/sequencer.c >> @@ -709,6 +709,8 @@ static int read_and_refresh_cache(struct replay_opts >> *opts) >> struct todo_item { >> enum todo_command command; >> struct commit *commit; >> +const char *arg; >> +int arg_len; > > Why 'arg', and not 'oneline', or 'subject'? > I'm not saying it is bad name. I am not sure what the "commit" field of type "struct commit *" is for. It is not needed until it is the commit's turn to be picked or reverted; if we end up stopping in the middle, parsing the commit object for later steps will end up being wasted effort. Also, when the sequencer becomes one sequencer to rule them all, the command set may contain something that does not even mention any commit at all (think: exec). So I am not sure if we want a parsed commit there (I would not object if we kept the texual object name read from the file, though). The "one sequencer to rule them all" may even have to say "now give name ':1' to the result of the previous operation" in one step and in another later step have an instruction "merge ':1'". When that happens, you cannot even pre-populate the commit object when the sequencer reads the file, as the commit has not yet been created at that point.
Re: [PATCH 13/22] sequencer: remember the onelines when parsing the todo file
W dniu 29.08.2016 o 10:05, Johannes Schindelin pisze: > The `git-rebase-todo` file contains a list of commands. Most of those > commands have the form > > > > The is displayed primarily for the user's convenience, as > rebase -i really interprets only the part. However, there > are *some* places in interactive rebase where the is used to > display messages, e.g. for reporting at which commit we stopped. > > So let's just remember it when parsing the todo file; we keep a copy of > the entire todo file anyway (to write out the new `done` and > `git-rebase-todo` file just before processing each command), so all we > need to do is remember the begin and end offsets. Actually what we remember is pointer and length, or begin offset and length, not offset and offset. > > Signed-off-by: Johannes Schindelin Nice, I'll see how it is used later (and in which commit in series). > --- > sequencer.c | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/sequencer.c b/sequencer.c > index 06759d4..3398774 100644 > --- a/sequencer.c > +++ b/sequencer.c > @@ -709,6 +709,8 @@ static int read_and_refresh_cache(struct replay_opts > *opts) > struct todo_item { > enum todo_command command; > struct commit *commit; > + const char *arg; > + int arg_len; Why 'arg', and not 'oneline', or 'subject'? I'm not saying it is bad name. > size_t offset_in_buf; > }; > > @@ -760,6 +762,9 @@ static int parse_insn_line(struct todo_item *item, const > char *bol, char *eol) > status = get_sha1(bol, commit_sha1); > *end_of_object_name = saved; > > + item->arg = end_of_object_name + strspn(end_of_object_name, " \t"); > + item->arg_len = (int)(eol - item->arg); > + Does it work correctly for line without , that is I think it does, but I not entirely sure. > if (status < 0) > return -1; > > @@ -880,6 +885,8 @@ static int walk_revs_populate_todo(struct todo_list > *todo_list, > > item->command = command; > item->commit = commit; > + item->arg = NULL; > + item->arg_len = 0; > item->offset_in_buf = todo_list->buf.len; > subject_len = find_commit_subject(commit_buffer, &subject); > strbuf_addf(&todo_list->buf, "%s %s %.*s\n", >
[PATCH 13/22] sequencer: remember the onelines when parsing the todo file
The `git-rebase-todo` file contains a list of commands. Most of those commands have the form The is displayed primarily for the user's convenience, as rebase -i really interprets only the part. However, there are *some* places in interactive rebase where the is used to display messages, e.g. for reporting at which commit we stopped. So let's just remember it when parsing the todo file; we keep a copy of the entire todo file anyway (to write out the new `done` and `git-rebase-todo` file just before processing each command), so all we need to do is remember the begin and end offsets. Signed-off-by: Johannes Schindelin --- sequencer.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/sequencer.c b/sequencer.c index 06759d4..3398774 100644 --- a/sequencer.c +++ b/sequencer.c @@ -709,6 +709,8 @@ static int read_and_refresh_cache(struct replay_opts *opts) struct todo_item { enum todo_command command; struct commit *commit; + const char *arg; + int arg_len; size_t offset_in_buf; }; @@ -760,6 +762,9 @@ static int parse_insn_line(struct todo_item *item, const char *bol, char *eol) status = get_sha1(bol, commit_sha1); *end_of_object_name = saved; + item->arg = end_of_object_name + strspn(end_of_object_name, " \t"); + item->arg_len = (int)(eol - item->arg); + if (status < 0) return -1; @@ -880,6 +885,8 @@ static int walk_revs_populate_todo(struct todo_list *todo_list, item->command = command; item->commit = commit; + item->arg = NULL; + item->arg_len = 0; item->offset_in_buf = todo_list->buf.len; subject_len = find_commit_subject(commit_buffer, &subject); strbuf_addf(&todo_list->buf, "%s %s %.*s\n", -- 2.10.0.rc1.114.g2bd6b38