Re: Fwd: Another migration tool

Toby Johnson Thu, 30 Nov 2006 23:31:23 -0800

Kirit Sælensminde wrote:

Is there more information available in the actual SourceSafe database than is exposed through SS.EXE? If so then it may be that replacing SS.EXE with a database reader could simplify some of the heuristics used for ordering etc. in the tool. It would only be worthwhile though if my tool has better modeling of how the SourceSafe repository changed over time than the existing vss2svn tool does. How is this handled?

Reading the database directly, as you can probably imagine, solves many of the problems of reading ss.exe (not the least of which is how to parse the output, which as you said can be tricky) but also raises new ones. Of course we -- and by "we", I mean Dirk :) -- had to reverse-engineer the database format through trial-and-error, and there are likely still some places where we're doing it wrong.

However, this approach also exposes some data that is simply impossible to retrieve using ss.exe or the OLE API, such as recovering child items from a deleted project, or correctly recovering the history of a renamed item, especially if different items of the same name existed in the repository at multiple points in time.

Unfortunately, the bottom line is that the VSS database structure is rather cumbersome, incomplete, and fragile, and regardless of how the data is retrieved there's a good chance some information is lost. For example, there is no sort of auto-incremented counter in any of the database files to give even the correct order of actions (although the ss.exe output gives the illusion of ordered version numbers, these are derived at runtime and aren't actually stored anywhere). So this means we must rely entirely on timestamps, and since VSS is a file-based system that has only the system clocks of the various client machines that connect (sometimes even in different time zones!), this information is very unreliable -- especially, as Dirk mentioned, when an archive/restore cycle is performed, because then the timestamps are overwritten with the time of the restore, and not the time of the original commit!
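
To make the ordering problem concrete, here is a minimal Python sketch of deriving version numbers at read time from timestamps alone. It is not the vss2svn code; the `Action` record, its field names, and `derive_versions` are all hypothetical names chosen for illustration. Any clock skew or restore-time overwrite in `timestamp` would silently change the resulting order.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical stand-in for one recorded VSS action."""
    item: str        # name of the item that was changed
    timestamp: int   # seconds since epoch, as written by the client machine
    author: str
    comment: str


def derive_versions(actions):
    """Sort actions by timestamp and number each item's versions on the fly.

    Nothing here is read from a stored counter: the version number is just
    the position of the action among that item's actions after sorting.
    """
    # sorted() is stable, so actions with equal timestamps keep input order.
    ordered = sorted(actions, key=lambda a: a.timestamp)
    counters = defaultdict(int)
    numbered = []
    for action in ordered:
        counters[action.item] += 1
        numbered.append((counters[action.item], action))
    return numbered
```

Because the timestamp is the only sort key, a restored archive whose actions all carry the restore time would collapse into ties and fall back to whatever order the records happened to be read in.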

    Since we worked also very hard on getting things "right" during the
    conversion, there are a few concepts that are not easily mapped
    between the two tools. Esp. the archive and restore cycles are the
    most problematic ones. Have you solved this problem domain and how
    did you solve it?

I'm not 100% sure what you mean here. SourceSafe has no concept of transactions - each file submission is handled separately, so the migration doesn't attempt to guess where transactions might be valid. In practice each file version that is sent to Subversion is a separate transaction (revision number).

Dirk was referring here to the act of using the VSS "Archive" command followed by a later "Restore"; as I mentioned above, this really screws with the timestamps. However, since you mention transactions, I should point out that we try to deduce atomic transactions in VSS by assuming that if consecutive VSS commits have the same author and comment, they are part of the same logical transaction, and are recreated in Subversion that way. We keep track of any files that are modified in a given transaction, and "commit" that transaction whenever the same file is about to be modified twice (there are also other cases where we always immediately commit, such as after a rename).
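
The grouping heuristic described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual vss2svn implementation: the `VssAction` tuple, its `kind` field, and `group_transactions` are hypothetical names, and only the three rules mentioned in the text are modeled (same author and comment, no file modified twice, immediate commit after a rename).

```python
from collections import namedtuple

# Hypothetical record for one VSS action, already sorted into processing order.
VssAction = namedtuple("VssAction", ["author", "comment", "path", "kind"])


def group_transactions(actions):
    """Group consecutive actions into inferred transactions."""
    transactions = []
    current = []   # actions in the transaction being built
    seen = set()   # paths already touched by the current transaction

    def flush():
        nonlocal current, seen
        if current:
            transactions.append(current)
        current = []
        seen = set()

    for action in actions:
        if current:
            same_change = (action.author, action.comment) == (
                current[0].author,
                current[0].comment,
            )
            # Close the open transaction if the author or comment changed,
            # or if this file is about to be modified a second time.
            if not same_change or action.path in seen:
                flush()
        current.append(action)
        seen.add(action.path)
        # A rename is committed immediately, on its own boundary.
        if action.kind == "rename":
            flush()

    flush()
    return transactions
```

Each inner list would then become one Subversion revision. Note that the result depends entirely on the input order, so it inherits any timestamp problems from the ordering step.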

Better handling of shared files is the main thing that the tool is able to handle. If you have a simple situation where a file is developed and then shared to each location it is used then this tool will handle that much better than other tools I've seen, i.e. it will not put multiple versions of that file into Subversion until after the share occurs.
What does vss2svn do in this situation? I've been thinking of putting together a single page with all of the tools I can find with a short description of what they actually import in terms of the SourceSafe history into Subversion.

I believe we are doing the same thing here; specifically, when an item was shared in VSS we treat that as a Subversion "cheap copy". We keep track of all shares during the migration, and after a share occurs, then any commits which are made to any of the various logical locations which point to the same physical file are propagated to each file in Subversion. So when foo.txt is shared to bar.txt, that is treated as an "svn copy" action. Then if a commit is made to foo.txt, that change will be made to both foo.txt and bar.txt in the same transaction.
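
As a rough sketch of that share bookkeeping (again an illustration, not the real vss2svn code), one could map each logical path to the physical file it points at and fan commits out to every path that shares it. The `ShareTracker` class, its method names, and the tuple-shaped "operations" it returns are all assumptions made for this example.

```python
class ShareTracker:
    """Track which logical paths point at the same physical VSS file."""

    def __init__(self):
        self.physical_of = {}  # logical path -> physical file id
        self.links = {}        # physical file id -> logical paths, in share order

    def add(self, path, physical):
        """Register a newly created file at one logical path."""
        self.physical_of[path] = physical
        self.links.setdefault(physical, []).append(path)
        return [("add", path)]

    def share(self, src, dst):
        """A VSS share becomes a single copy operation from src to dst."""
        physical = self.physical_of[src]
        self.physical_of[dst] = physical
        self.links[physical].append(dst)
        return [("copy", src, dst)]

    def commit(self, path):
        """A commit to any shared location is applied to all of them.

        The returned operations are meant to go into one transaction.
        """
        physical = self.physical_of[path]
        return [("modify", p) for p in self.links[physical]]
```

With foo.txt shared to bar.txt, `commit("foo.txt")` and `commit("bar.txt")` both yield a modify for each of the two paths, which matches the behavior described above.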

Unfortunately, as you can imagine, all of this is rather complex, and the learning curve for just getting familiar with the code is very steep. Couple that with the fact that most people will only use such a tool once, and you can see that it is very difficult to continue innovation of such a project! I doubt I will ever need to perform another VSS migration (I hope to live the rest of my life without ever actually using the tool for real source control again :) so the "scratch your itch" motivation of most open source projects quickly diminishes.


toby

_______________________________________________
vss2svn-users mailing list
Project homepage:
http://www.pumacode.org/projects/vss2svn/
Subscribe/Unsubscribe/Admin:
http://lists.pumacode.org/mailman/listinfo/vss2svn-users-lists.pumacode.org
Mailing list web interface (with searchable archives):
http://dir.gmane.org/gmane.comp.version-control.subversion.vss2svn.user
