Thanks for all this careful analysis and presentation of alternatives, Jamon. To me, option 2 sounds fine. An important part of our responsibility in keeping a source history is keeping track of IP - however, empty revisions carry no IP. I think it is important to get our work into git as soon as is practicable, and option 2 seems the fastest approach that will yield an acceptable result.

One issue might be that we have some implementation which depended on the existence of a particular empty directory, and that a git checkout of this history would produce a non-working image. I believe there has only ever been one instance of this issue, in the mid-period of Engage... but as I understand it, none of the other options would help with this in any case, since empty directories cannot be represented in git.

That's my opinion - Colin, what are your thoughts?
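For anyone who wants to check a working copy for the empty-directory issue above, here is a minimal sketch in Python. It is only an illustration and not part of the migration tooling: the function name dirs_lost_in_git and the ignored metadata directory names are my own assumptions. It relies on the fact that git records files rather than directories, so a directory with no files anywhere beneath it will be absent from a git checkout.

```python
import os
import sys

# Assumption: version-control metadata directories are not project content.
IGNORED = {".svn", ".git"}


def dirs_lost_in_git(root):
    """Return directories (relative to root) that hold no files at any depth.

    Git tracks files, not directories, so these would not exist after
    a checkout of the converted history.
    """
    lost = []

    def has_files(path):
        found = False
        for entry in os.scandir(path):
            if entry.is_dir(follow_symlinks=False):
                if entry.name in IGNORED:
                    continue
                # Always recurse so that every empty subdirectory is recorded.
                if has_files(entry.path):
                    found = True
            else:
                # Regular files and symlinks are both stored by git.
                found = True
        if not found and path != root:
            lost.append(os.path.relpath(path, root))
        return found

    has_files(root)
    return sorted(lost)


if __name__ == "__main__" and len(sys.argv) > 1:
    for name in dirs_lost_in_git(sys.argv[1]):
        print(name)
```

Run against an SVN checkout of a given revision, it prints one candidate directory per line. Whether any code actually depends on those directories still has to be checked by hand.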
Jamon Camisso <[email protected]> wrote:
>On 1/25/2011 4:07 PM, Jamon Camisso wrote:
>> For anyone who is interested, here is a list of all relevant directories
>> from SVN, including those that were deleted at some point in the past.
>> The plan is to map what Colin has outlined below to directories in this
>> file, and then to convert each to a tag, branch, or master branch
>> depending on where it needs to live in Git.
>
>Responding to myself here, and would like to hear from people about the
>following:
>
>Justin and I have been working on importing SVN into Git this week, with
>a fair amount of success. We managed to cut infusion down to about
>22-24 MB by removing extraneous psd files from the repository.
>
>However, in shuffling repositories and branches around, we have
>discovered that the tool being used, svn-all-fast-export[1][2], does not
>incorporate SVN commits to empty directories into the git repository.
>This behaviour is by design - both Git and Mercurial explicitly do not
>support tracking directories.
>
>This feature (or bug depending on which side of the fence is most
>attractive or comfortable) means that where historical changes to SVN
>like the move from /utoronto/fluid to /fluid occurred, the particular
>commit tracking that change is not present in Git.
>
>One of the goals during this migration to Git is to preserve as much
>history in the various repositories that are being forked as possible.
>This attempt at maintaining the historical integrity of Fluid's source code
>repositories will ensure that future members or external participants in
>the Fluid community will have access to relevant information about the
>historical development of various projects.
>
>With all that in mind, Justin and I can think of a few options that are
>or will be more or less palatable to those who have read this far:
>
>Option 1) Stick with SVN. Unlikely. This choice would not be in keeping
>with the distributed collaborative nature of Fluid.
>As such it would be
>a very unsavory outcome.
>
>Option 2) Use svn-all-fast-export as it currently runs, with the proviso
>that any SVN commit of an empty directory or directories will be elided
>from the history of the repository. This option is semi-palatable in
>that the final repositories would look and behave exactly as if they
>were created in Git in the first place.
>
>Option 3) Convert repositories using svn-all-fast-export and run
>"git commit --amend" on each commit in question. Said commits can be found
>using the output of the svn-all-fast-export tool with full rule
>debugging output enabled and piped to a log file or extracted directly
>using grep:
>
>grep -E "Exporting revision ([0-9]{4,5})?{4,5}(.*)nothing to do" import.log
>
>That output (of 4286 commits) could then be matched to specific commits
>that solely affected A/D changes to directories in SVN. For example,
>r4124-4126 is one such series of commits.
>
>Whereas each Git commit would initially look like the following:
>
>commit ec2571d0833cbd72fa42d471ba2acdbe9ece71dd
>Author: Joseph Scheuhammer <[email protected]>
>Date: Fri May 18 15:56:36 2007 +0000
>
>    Initial Fluid branch of Berkeley's Gallery Tool
>
>    svn path=/utoronto/fluid/gallery/; revision=4126
>
>The affected commits can then be edited to look like this:
>
>    svn path=/utoronto/fluid/gallery/; revision=4124,4125,4126
>    Extra comment here pointing to Wiki, or SVN, or a file in Git
>    outlining changes to the repository
>
>Option 4) Hack on svn-all-fast-export to make it do something with
>directory modifications. This option would likely take a fair amount of
>time and work to get it working just right, and is not in keeping with
>the fundamental design of Git.
>
>Option 5) Use a different tool altogether, like git-svn, or the original
>svn2git tool.
>These tools are not nearly as sophisticated as
>svn-all-fast-export in that they are a) incredibly slow and b) unable to
>track changes to a file's location between historically
>deleted directories the same way that svn-all-fast-export does.
>
>My first preference would be Option 3. However, successfully mapping
>commits of empty directories to preceding commits depends on how much
>information can be extracted and correlated programmatically. If there
>is too much manual work required then my other preference would be Option 2.
>
>Option 2 is viable and would be the faster of the two. This option
>takes into account the fact that SVN will still be online. I would
>imagine that anyone who is interested enough in who created an empty
>directory would probably be willing to do the work of quickly running an
>svn log -r0001 on the repository and extracting the information that way.
>
>The fact that not all information is being imported from SVN to Git
>(Photoshop psd files for example) makes option 2 that much more
>compelling in that it would take very little time to freeze SVN and just
>do the conversion.
>
>In the end options 2 and 3 both preserve information about empty
>directories, albeit in two different locations. Whereas the former
>retains an intact record in SVN, the latter entails taking small
>liberties with the historical record in Git. However, in both cases, the
>fact that committer X created directory Y will still be easily gleaned
>from some easily found and well documented location for those who are
>interested in such information.
>
>tl;dr there is no easy way to import empty directories into Git. Option
>2 is less disruptive and faster, while leaving information in multiple
>locations. Option 3 will require some small amount of historical
>revisionism, while retaining what history and files are deemed important
>in one repository format.
>
>Feedback is welcome at this point.
>I imagine Colin and Antranig will be
>especially interested in sharing their thoughts.
>
>Regards, Jamon
>
>[1] http://packages.debian.org/testing/main/svn-all-fast-export
>[2] svn-all-fast-export has been forked and named svn2git, the confusing
>part being that there is a Ruby project that precedes the fork with the
>same name.
>_______________________________________________________
>fluid-work mailing list - [email protected]
>To unsubscribe, change settings or access archives,
>see http://fluidproject.org/mailman/listinfo/fluid-work
