Re: [GUILT 06/28] Fix and simplify the do_get_patch function.
n Tue, May 6, 2014 at 9:08 PM, Jeff Sipek wrote: > On Sun, Mar 23, 2014 at 10:03:48PM +0100, Per Cederqvist wrote: >> On Sun, Mar 23, 2014 at 6:09 PM, Jeff Sipek wrote: >> >> > On Fri, Mar 21, 2014 at 08:31:44AM +0100, Per Cederqvist wrote: >> >> When extracting the patch, we only want the actual patches. We don't >> >> want the "---" delimiter. Simplify the extraction by simply deleting >> >> everything before the first "diff " line. (Use sed instead of awk to >> >> simplify the code.) >> >> >> >> Without this patch, "guilt fold" and "guilt push" sometimes fails if >> >> guilt.diffstat is true. > > Hrm, I just did a test and guilt-push seems to work with a patch such as: > > aoeuaoeu > this is > --- > not a patch! > --- > foo |1 + > 1 files changed, 1 insertions(+), 0 deletions(-) > > diff --git a/foo b/foo > index e69de29..203a557 100644 > --- a/foo > +++ b/foo > @@ -0,0 +1 @@ > +aoue > > What sort of thing breaks fold/push? This sequence breaks: mkdir guiltdemo cd guiltdemo git init git commit --allow-empty -mstart guilt init git config guilt.diffstat true guilt new empty-1 guilt new empty-2 guilt pop guilt fold empty-2 guilt pop guilt push That is, create two empty patches, fold them together, pop them, and try to push the combined (still empty) patch. The last command fails with: Applying patch..empty-1 fatal: unrecognized input To force apply this patch, use 'guilt push -f' At this point, the patch series consists of a single patch, empty-1, which contains the five characters "-", "-", "-", newline, newline. The 06 patch (Added test cases for "guilt fold".) contains a test case for this issue. Which opens a style issue. When you fix a bug, should you: - commit the bug fix first and the test case later, so that "git bisect" always finds commits where "make test" works, or - commit the test case first, and the bug fix later, so that it is more obvious that you are actually fixing a pre-existing bug, or - commit the test case and the bug fix in the same commit? In this series I have committed bug fixes first and test cases later, but I'm not convinced that is the right thing to do. > ... >> >> diff --git a/guilt b/guilt >> >> index 8701481..c59cd0f 100755 >> >> --- a/guilt >> >> +++ b/guilt >> >> @@ -332,12 +332,7 @@ do_make_header() >> >> # usage: do_get_patch patchfile >> >> do_get_patch() >> >> { >> >> - awk ' >> >> -BEGIN{} >> >> -/^(diff |---$|--- )/ {patch = 1} >> >> -patch == 1 {print $0} >> >> -END{} >> >> -' < "$1" >> >> + sed -n '/^diff /,$p' < "$1" >> > >> > So, the thought behind this mess was to allow minimal patches to work. The >> > 'diff' line is *not* required by patch(1)! >> >> I see. I can see that this is sometimes useful, even though I think >> it is more important that guilt actually works with the patches itself >> creates. > > Fair enough. I'm convinced that we should assume that all patches start > with 'diff '. > > ... >> I had to solve a similar problem when I wrote my utility for diffing two >> diff files (https://git.lysator.liu.se/pdiffdiff). Based on the experience >> I got doing that, I propose this new do_get_patch function: > ... >> >> The idea is to collect lines that *might* start the patch in >> patchheader. Once we detect the start of the patch (via a >> line that matches "--- " or any of the mode change lines) >> we print the patcheader and the current line. If we get a >> line that neither looks like a patchheader nor starts a patch, >> we discard the patcheader. This is the case of a commit >> message with a line that starts with "diff ". >> >> The function above solves the issue with lines that >> start with "diff " in the commit message. On the other >> hand, it introduces the same issue for lines that start >> with for instance "old mode ". > > Hrm. I'm open to revisiting the get-patch/get-header functions to address > the ambiguity issues in the future. Postponing that for the future seems like a good plan. I think there are three possible ways to deal with the problem. - Store the commit message, diffstat, and diffs in separate files. I don't like this option, as it would make it a lot less convenient to work with the patch files. - Add a quoting mechanism, so that you need to write ">diff" or ">---" instead of just "diff" or "---" if you want to include those characters at the start of the line in a commit message. - Ignore the problem. This is what guilt has done in the past, and I see no reason why it needs to change *in this patch series*. >> Actually, a smaller change that probably fixes the >> issue with diffstat is to replace >> >> /^(diff |---$|--- )/ {patch = 1} >> >> witih >> >> /^(diff |--- )/ {patch = 1} >> >> as the problem with the original implementation is that >> the "---\n" line that starts the diffstat should not start >> the patch. > > (Thinking out loud...) I suppose there are three portions to a patch file... > > (1) the description > (2) o
Re: [GUILT 06/28] Fix and simplify the do_get_patch function.
On Sun, Mar 23, 2014 at 10:03:48PM +0100, Per Cederqvist wrote: > On Sun, Mar 23, 2014 at 6:09 PM, Jeff Sipek wrote: > > > On Fri, Mar 21, 2014 at 08:31:44AM +0100, Per Cederqvist wrote: > >> When extracting the patch, we only want the actual patches. We don't > >> want the "---" delimiter. Simplify the extraction by simply deleting > >> everything before the first "diff " line. (Use sed instead of awk to > >> simplify the code.) > >> > >> Without this patch, "guilt fold" and "guilt push" sometimes fails if > >> guilt.diffstat is true. Hrm, I just did a test and guilt-push seems to work with a patch such as: aoeuaoeu this is --- not a patch! --- foo |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/foo b/foo index e69de29..203a557 100644 --- a/foo +++ b/foo @@ -0,0 +1 @@ +aoue What sort of thing breaks fold/push? ... > >> diff --git a/guilt b/guilt > >> index 8701481..c59cd0f 100755 > >> --- a/guilt > >> +++ b/guilt > >> @@ -332,12 +332,7 @@ do_make_header() > >> # usage: do_get_patch patchfile > >> do_get_patch() > >> { > >> - awk ' > >> -BEGIN{} > >> -/^(diff |---$|--- )/ {patch = 1} > >> -patch == 1 {print $0} > >> -END{} > >> -' < "$1" > >> + sed -n '/^diff /,$p' < "$1" > > > > So, the thought behind this mess was to allow minimal patches to work. The > > 'diff' line is *not* required by patch(1)! > > I see. I can see that this is sometimes useful, even though I think > it is more important that guilt actually works with the patches itself > creates. Fair enough. I'm convinced that we should assume that all patches start with 'diff '. ... > I had to solve a similar problem when I wrote my utility for diffing two > diff files (https://git.lysator.liu.se/pdiffdiff). Based on the experience > I got doing that, I propose this new do_get_patch function: ... > > The idea is to collect lines that *might* start the patch in > patchheader. Once we detect the start of the patch (via a > line that matches "--- " or any of the mode change lines) > we print the patcheader and the current line. If we get a > line that neither looks like a patchheader nor starts a patch, > we discard the patcheader. This is the case of a commit > message with a line that starts with "diff ". > > The function above solves the issue with lines that > start with "diff " in the commit message. On the other > hand, it introduces the same issue for lines that start > with for instance "old mode ". Hrm. I'm open to revisiting the get-patch/get-header functions to address the ambiguity issues in the future. > Actually, a smaller change that probably fixes the > issue with diffstat is to replace > > /^(diff |---$|--- )/ {patch = 1} > > witih > > /^(diff |--- )/ {patch = 1} > > as the problem with the original implementation is that > the "---\n" line that starts the diffstat should not start > the patch. (Thinking out loud...) I suppose there are three portions to a patch file... (1) the description (2) optional diffstat (3) the patch You just convinced me that the patch should start with '^diff '. Currently, the diffstat begins with '^---$'. Sadly, the description can contain what looks like the beginning of a diffstat or a patch. In the case of do_get_patch, we're only interested in the patch, so we can just look for '^diff ' (because garbage before the diff itself seems to be ignored by git). (If we wanted to allow patches without the 'diff ' line, we'd need '^(diff |--- )'.) I feel like I'm missing something. Jeff. -- I'm somewhere between geek and normal. - Linus Torvalds -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GUILT 06/28] Fix and simplify the do_get_patch function.
On Sun, Mar 23, 2014 at 6:09 PM, Jeff Sipek wrote: > On Fri, Mar 21, 2014 at 08:31:44AM +0100, Per Cederqvist wrote: >> When extracting the patch, we only want the actual patches. We don't >> want the "---" delimiter. Simplify the extraction by simply deleting >> everything before the first "diff " line. (Use sed instead of awk to >> simplify the code.) >> >> Without this patch, "guilt fold" and "guilt push" sometimes fails if >> guilt.diffstat is true. >> >> Signed-off-by: Per Cederqvist >> --- >> guilt | 7 +-- >> 1 file changed, 1 insertion(+), 6 deletions(-) >> >> diff --git a/guilt b/guilt >> index 8701481..c59cd0f 100755 >> --- a/guilt >> +++ b/guilt >> @@ -332,12 +332,7 @@ do_make_header() >> # usage: do_get_patch patchfile >> do_get_patch() >> { >> - awk ' >> -BEGIN{} >> -/^(diff |---$|--- )/ {patch = 1} >> -patch == 1 {print $0} >> -END{} >> -' < "$1" >> + sed -n '/^diff /,$p' < "$1" > > So, the thought behind this mess was to allow minimal patches to work. The > 'diff' line is *not* required by patch(1)! I see. I can see that this is sometimes useful, even though I think it is more important that guilt actually works with the patches itself creates. > Is it a problem if a patch description contains a line that starts with > 'diff '? (The current code has this issue as well.) Yes. > For the record, this ambiguity is what's on several occasions made me > consider splitting the header and the patch into separate files. Yeah, it'd > be messier. :/ I really like having the header and the patches in the same file. I had to solve a similar problem when I wrote my utility for diffing two diff files (https://git.lysator.liu.se/pdiffdiff). Based on the experience I got doing that, I propose this new do_get_patch function: # usage: do_get_patch patchfile do_get_patch() { awk ' BEGIN { patchheader = "" patch = 0 } patch == 1 { print $0 next } /^(diff |index |=|RCS file:|retrieving revision )/ { patchheader = patchheader $0 "\n" next } /^(--- |((new|deleted) file|old) mode )/ { printf "%s", patchheader -/^(diff |---$|--- )/ {patch = 1} print $0 patch = 1 } { patchheader = "" }' < "$1" } The idea is to collect lines that *might* start the patch in patchheader. Once we detect the start of the patch (via a line that matches "--- " or any of the mode change lines) we print the patcheader and the current line. If we get a line that neither looks like a patchheader nor starts a patch, we discard the patcheader. This is the case of a commit message with a line that starts with "diff ". The function above solves the issue with lines that start with "diff " in the commit message. On the other hand, it introduces the same issue for lines that start with for instance "old mode ". Actually, a smaller change that probably fixes the issue with diffstat is to replace /^(diff |---$|--- )/ {patch = 1} witih /^(diff |--- )/ {patch = 1} as the problem with the original implementation is that the "---\n" line that starts the diffstat should not start the patch. /ceder >> } >> >> # usage: do_get_header patchfile >> -- >> 1.8.3.1 >> > > -- > Defenestration n. (formal or joc.): > The act of removing Windows from your computer in disgust, usually > followed by the installation of Linux or some other Unix-like operating > system. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GUILT 06/28] Fix and simplify the do_get_patch function.
On Fri, Mar 21, 2014 at 08:31:44AM +0100, Per Cederqvist wrote: > When extracting the patch, we only want the actual patches. We don't > want the "---" delimiter. Simplify the extraction by simply deleting > everything before the first "diff " line. (Use sed instead of awk to > simplify the code.) > > Without this patch, "guilt fold" and "guilt push" sometimes fails if > guilt.diffstat is true. > > Signed-off-by: Per Cederqvist > --- > guilt | 7 +-- > 1 file changed, 1 insertion(+), 6 deletions(-) > > diff --git a/guilt b/guilt > index 8701481..c59cd0f 100755 > --- a/guilt > +++ b/guilt > @@ -332,12 +332,7 @@ do_make_header() > # usage: do_get_patch patchfile > do_get_patch() > { > - awk ' > -BEGIN{} > -/^(diff |---$|--- )/ {patch = 1} > -patch == 1 {print $0} > -END{} > -' < "$1" > + sed -n '/^diff /,$p' < "$1" So, the thought behind this mess was to allow minimal patches to work. The 'diff' line is *not* required by patch(1)! Is it a problem if a patch description contains a line that starts with 'diff '? (The current code has this issue as well.) For the record, this ambiguity is what's on several occasions made me consider splitting the header and the patch into separate files. Yeah, it'd be messier. :/ > } > > # usage: do_get_header patchfile > -- > 1.8.3.1 > -- Defenestration n. (formal or joc.): The act of removing Windows from your computer in disgust, usually followed by the installation of Linux or some other Unix-like operating system. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html