Hi Rom,

On 2/25/13 18:14, Rom Walton wrote:
> Wouldn't the SHA-1 values of all the SVN commits be different between v1
> and v2 because of the missing files?

Sure, that's why I picked the respective SHA1s of equivalent commits
(repo states) in both repos involved and compared their *content* using
git diff.

> Remember the commit you pointed us to where a previously missing file
> was first modified, after SVN #12399. From svn2git's perspective, the
> file didn't exist in Git so svn2git spontaneously brought the whole file
> into existence. So what should have been just a one or two line diff,
> became a whole file inclusion. That would change the SHA-1 value for
> that commit.

That's irrelevant when you compare content (see above).

> Now we come to the extraction of patch sets from v1, the patches
> reference SHA-1 values that exist in the botched SVN migration. So
> git-am would complain about missing SHA-1 values and abort. In order to
> fix that situation I created a new branch, manually applied the patch,
> extracted the patch, copy the patch header from the old patch file to
> the new patch file, rename new patch file to the old patch file,
> checkout master and apply modified patch.

Fine. Anyway, whatever you do in this step, the content should be the
same again afterwards, despite the changes to the commit history itself.

> It turns out line feed issues from the svn2git migration were my biggest
> nemesis.

What exactly do you mean by "line feed issues"? This might be the actual
root cause of the problem since content is being modified. I haven't yet
checked what the actual differences I found are... Any idea where these
line feed issues come from? Since you're using .gitattributes, did you
make sure all v2 files fit those rules?

> Since any subsequent commit depends on the SHA-1 value of the previous
> version of said file I would expect every file changed post-SVN
> migration to be considered different from gits perspective.

Relative to what?

> I'm not
> sure git would be a good tool to use to check differences on files as it
> would tend to take shortcuts based on that it stores in the object
> database. You'll probably need to use a diffing tool that will look at
> the contents of a file.

No, that's what git diff (or git diff-tree) does. It just runs a diff
against two repo states (or trees) that can be specified in numerous
ways. Running a git diff against two SHA1s will compare their respective
trees - no shortcuts. You may of course use two local clones, checkout
the commits I compared and run your favorite diff tool against those two
trees manually to verify.

> I've uploaded the patch sets I applied post-svn conversion:
> http://boinc.berkeley.edu/dl2/boincgitpatches.zip
>
> patches contains all the patches I applied post SVN migration and brings
> the tree up to 12/14/2012.

I'll have a look at those in more detail or might even try to roll my
own set. The only obstacle should be to discern the v1 commits that
should be omitted in v2 because they were solely used to fix the flawed
v1 migration. Do you have a list of those?

In general: do you agree with me that, eventually, the *content* of
boinc.git and boinc-v2.git should be identical when comparing their
latest common commits/trees?

> I created the 7.0b branch and did
> the merges at the same points in the git log as they existed in
> boinc-v1. I only applied the tag/version change commits after the
> merge. I created the tag after that.

Why? This makes it hard to review the history. I need to look into this
as well.

Cheers,
Oliver
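
To illustrate why comparing content is independent of the commit history,
and why line endings still matter: git names a file's blob by the SHA-1 of
a short header plus the file's exact bytes, so identical bytes get the same
blob ID in any repository, while an LF/CRLF change alone produces a
different one. The following is a minimal Python sketch of that scheme; the
helper name git_blob_sha1 and the sample file contents are made up for
illustration and are not taken from either BOINC repository.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the SHA-1 that git assigns to a blob holding exactly these bytes."""
    # A blob object is hashed as: b"blob <size in bytes>\0" followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical file content, once with LF and once with CRLF line endings.
lf_version = b"int main() { return 0; }\n"
crlf_version = lf_version.replace(b"\n", b"\r\n")

# Same bytes give the same blob ID, no matter which commits reference the file.
same_bytes_same_id = git_blob_sha1(lf_version) == git_blob_sha1(bytes(lf_version))

# Changing only the line endings changes the blob ID, so the file shows up as different.
line_endings_change_id = git_blob_sha1(lf_version) != git_blob_sha1(crlf_version)

print(same_bytes_same_id, line_endings_change_id)
```

If two commits from different repositories contain byte-identical files, the
blobs match even though the commit SHA-1s differ; any remaining difference
therefore points at real content changes such as converted line endings.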
