Greetings,
I am attempting to improve the CVS -> CVSps -> git-cvsimport process.
The part involving git-cvsimport has to do with how it parses the CVSps
PatchSet file. Consider what happens if a CVS log/commit message
includes lines which start with "Members:", say from a copy-and-paste
[2]. The parser takes the first such line as the start of the member
list, so the log message is cut short.
To avoid this issue, I have proposed that CVSps append the line count
of the original CVS log/commit message to the "Log:" tag [1].
The idea is that if a line count is found after "Log:", that many (CVS
log message) lines get consumed before $state advances to look for
"^Members:".
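To illustrate the idea, here is a minimal sketch in Python (not the Perl
from the patch) of a parser that consumes a counted log message before it
starts looking for "Members:". The names split_patchset and LOG_RE are
hypothetical, and the state handling is heavily simplified compared to
git-cvsimport.

```python
import re

# Same pattern as in the patch: "Log:" optionally followed by a line count.
LOG_RE = re.compile(r"^Log:\s*(\d+)?$")


def split_patchset(lines):
    """Split one PatchSet body into (log_lines, member_lines).

    Simplified illustration only: header tags before "Log:" are skipped,
    and everything after the real "Members:" tag is treated as a member.
    """
    it = iter(lines)
    log, members = [], []

    # Skip ahead to the "Log:" tag.
    for line in it:
        m = LOG_RE.match(line)
        if m:
            break
    else:
        return log, members  # no "Log:" tag found

    # A missing count means "unpatched cvsps": consume nothing up front.
    count = int(m.group(1)) if m.group(1) else -1
    while count > 0:
        nxt = next(it, None)
        if nxt is None:
            break  # input ended early
        log.append(nxt)  # taken verbatim, even if it says "Members:"
        count -= 1

    # Only now look for the "Members:" tag.
    in_members = False
    for line in it:
        if in_members:
            members.append(line)
        elif line.startswith("Members:"):
            in_members = True
        else:
            log.append(line)
    return log, members
```

Fed the PatchSet from [2] with a hypothetical "Log: 6" tag, this keeps all
six pasted lines in the log and reports ABC:1.1->1.2 as the only member.
With a bare "Log:" it falls back to the old behaviour and stops the log
at the first "Members:" line.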
Current git-cvsimport isn't strict in matching the "Log:" tag
(fortunately), so it accepts the new form as-is, and my proposed change
to git-cvsimport should be fully backward compatible with unpatched
CVSps output.
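As a quick check of the compatibility claim, the following Python sketch
compares the old and new "Log:" patterns from the patch. The helper name
log_count is hypothetical; the -1 default mirrors the "$1 // -1" in the
Perl change.

```python
import re

OLD_LOG_RE = re.compile(r"^Log:")             # pattern before the patch
NEW_LOG_RE = re.compile(r"^Log:\s*(\d+)?$")   # pattern after the patch


def log_count(line):
    """Return the line count from a "Log:" tag.

    -1   -> tag matched but carries no count (unpatched cvsps)
    None -> the line is not accepted as a "Log:" tag by the new pattern
    """
    m = NEW_LOG_RE.match(line)
    if m is None:
        return None
    return int(m.group(1)) if m.group(1) is not None else -1
```

A bare "Log:" yields -1 and "Log: 6" yields 6, while the old pattern
matches both. One difference worth noting: the new pattern rejects
something like "Log: abc", which the old one accepted. That should not
matter as long as CVSps only ever emits "Log:" alone or with a count.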
See attached patch.
Cheers,
--patrick
P.S. For reference, why I'm doing this and the RFC sent to the CVS list:
http://lists.nongnu.org/archive/html/info-cvs/2017-11/msg00000.html
[1] https://github.com/andreyvit/cvsps/pull/4
[2] Example PatchSet with "Members:" line in original CVS commit message:
---------------------
PatchSet 3
Date: 2017/10/30 23:25:20
Author: catbert
Branch: HEAD
Tag: (none)
Log:
This will confuse git-cvsimport's parser
Members:
somefile.c:1.1->1.2
another.h:1.7->1.8
foo.mk:1.22->1.23
Imagine these were lines pasted to note something
Members:
ABC:1.1->1.2
commit c3e406c54b8cd3a2bbf0aa729fef201e20fa6df5
Author: patrick keshishian <[email protected]>
Date: Sat Nov 4 08:42:12 2017 -0700
Optionally parse line count out of PatchSets with "Log: count"
This is a change being suggested to CVSps where the line count of the
commit message gets added to the "Log:" tag to help Git cvsimport not
get confused if the CVS log/commit message included lines starting with
any of the tags found in CVSps PatchSet, e.g., Members:
This is part of a larger change to make CVS to Git import more robust.
diff --git a/git-cvsimport.perl b/git-cvsimport.perl
index 36929921e..5d78c5e87 100755
--- a/git-cvsimport.perl
+++ b/git-cvsimport.perl
@@ -786,6 +786,13 @@ open(CVS, "<$cvspsfile") or die $!;
#
#---------------------
+# NOTE:
+## pk, 2017/10/30
+# patched cvsps will output ^Log: line with number of lines of log
+# which are to follow. This makes parsing robust for cases where the
+# log message contains ^Members: lines! Happens in OpenBSD sources:
+# e.g., See src/usr.sbin/bgpd/rde.c
+
my $state = 0;
sub update_index (\@\@) {
@@ -816,7 +823,7 @@ sub write_tree () {
return $tree;
}
-my ($patchset,$date,$author_name,$author_email,$author_tz,$branch,$ancestor,$tag,$logmsg);
+my ($patchset,$date,$author_name,$author_email,$author_tz,$branch,$ancestor,$tag,$logmsg,$loglines);
my (@old,@new,@skipped,%ignorebranch,@commit_revisions);
# commits that cvsps cannot place anywhere...
@@ -1005,8 +1012,13 @@ while (<CVS>) {
$tag = $_;
}
$state = 7;
- } elsif ($state == 7 and /^Log:/) {
+ } elsif ($state == 7 and /^Log:\s*(\d+)?$/) {
+ $loglines = $1 // -1;
$logmsg = "";
+ while ($loglines-- > 0 && defined($_ = <CVS>)) {
+ chomp;
+ $logmsg .= "$_\n";
+ }
$state = 8;
} elsif ($state == 8 and /^Members:/) {
$branch = $opt_o if $branch eq "HEAD";