Re: Various problems with import

Stephen Lee Fri, 13 Oct 2006 09:28:58 -0700

Many thanks for your rapid response. It is certainly looking quite promising.


Toby Johnson wrote:


> Due to the way vss2svn retrieves data, it often doesn't know until later
> that a data item refers to a "destroyed" item. Even after you use
> "destroy" in VSS, there are traces left of the file.

Fair enough. There are plenty of items that were destroyed in this database, particularly when we were first starting to adopt SourceSafe during the Windows port, or in temporary test projects. Just that they dominate the output so much that it is quite a bit to trawl through for the real errors... I could soon pipe the output through grep -v though.
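As a rough illustration of that kind of filtering, here is a small Python equivalent of grep -v. The pattern and the first two sample lines are made up for the example (the real wording of the destroyed-item messages is not shown here); only the "No more active itempath" line is taken from the actual output.

```python
import re

def drop_matching(lines, pattern):
    """Return the lines that do NOT match pattern, like `grep -v`."""
    rx = re.compile(pattern)
    return [line for line in lines if not rx.search(line)]

# Hypothetical sample output; the "destroyed" wording is an assumption.
SAMPLE = [
    "WARNING -- hypothetical message about a destroyed item AAAAAAAA",
    "WARNING -- hypothetical message about a destroyed item BAAAAAAA",
    "ERROR -- No more active itempath to commit to 'IYAAAAAA':",
]

# Hypothetical pattern: adjust to whatever the noisy lines really contain.
remaining = drop_matching(SAMPLE, r"destroyed item")
for line in remaining:
    print(line)
```

Only the last sample line survives the filter, which is the sort of line worth reading.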

However, from further investigation later, it seems some items are being incorrectly identified as orphaned. See below.

>> ERROR -- No more active itempath to commit to 'IYAAAAAA':
> This is a bit trickier and I'm not sure I've ever actually seen this
> error, but it probably also has to do with destroyed or corrupted files.

This appears to be related to a project that was moved around and renamed somewhat... not sure what the best way to extract relevant debug is but approx sequence of actions follows (depending on what is considered as a "related" action). It ended up imported in "the wrong place".


ADD /Project/Subproject/
RENAME /Project/Subproject/ to Sub/
MOVE /Project/Sub/ to /Sub/
ADD /Project/Sub/ (different physical name)
RENAME /Project/Sub/ to Sub1/
MOVE /Sub/ to /Project/Sub/
DELETE /Project/Sub1/

vss2svn seems to have actually imported this at /Sub/
It is thus also missed out of all labelled copies of /Project/
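To show where that sequence should end up, here is a minimal Python sketch that replays the actions against a simple path-to-item map. PHYS_A and PHYS_B are placeholder physical names, and the tuple format is invented for the example; it is not how vss2svn represents actions internally.

```python
def replay(actions):
    """Replay project-level actions and return {path: physical_name}."""
    tree = {}
    for action in actions:
        kind = action[0]
        if kind == "ADD":
            _, path, phys = action
            tree[path] = phys
        elif kind == "RENAME":
            _, path, new_name = action
            # Keep the parent directory, swap only the last path component.
            parent = path.rstrip("/").rsplit("/", 1)[0]
            tree[parent + "/" + new_name] = tree.pop(path)
        elif kind == "MOVE":
            _, src, dst = action
            tree[dst] = tree.pop(src)
        elif kind == "DELETE":
            _, path = action
            tree.pop(path)
    return tree

# The sequence from the message; physical names are placeholders.
ACTIONS = [
    ("ADD", "/Project/Subproject/", "PHYS_A"),
    ("RENAME", "/Project/Subproject/", "Sub/"),
    ("MOVE", "/Project/Sub/", "/Sub/"),
    ("ADD", "/Project/Sub/", "PHYS_B"),
    ("RENAME", "/Project/Sub/", "Sub1/"),
    ("MOVE", "/Sub/", "/Project/Sub/"),
    ("DELETE", "/Project/Sub1/"),
]

final = replay(ACTIONS)
print(final)  # {'/Project/Sub/': 'PHYS_A'}
```

Under these assumptions the first physical item finishes at /Project/Sub/ and nothing is left at /Sub/, so an import at /Sub/ looks like the final MOVE was not applied.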

Other subprojects that had been similarly moved around and renamed etc. ended up in the Orphaned folder, and were similarly missed out of the relevant labels.

Also for individual files that have been messed around in similar ways, it seems (at least sometimes) to be losing track of which file is which, and leaving projects with missing files.

e.g. for one file name (a header file, the final version of which [GCDAAAAA] is used and shared between several projects) the physical action file (first few columns, when sorted by date and filtered by filename) shows:

7840    FPBAAAAA    \N    FNCAAAAA    SHARE
15492    MTCAAAAA    \N    KTCAAAAA    ADD
15749    MTCAAAAA    \N    KTCAAAAA    DELETE
9315    GCDAAAAA    1    \N    ADD
15824    GCDAAAAA    \N    KTCAAAAA    ADD
7927    FPBAAAAA    \N    FNCAAAAA    DELETE
7930    GCDAAAAA    \N    FNCAAAAA    SHARE
33639    GCDAAAAA    3    YQFAAAAA    SHARE

(the above excludes COMMIT actions which do not mention the filename)

KTCAAAAA is a project that remains in the final version of the SourceSafe database. Interestingly FPBAAAAA seems to have "appeared from nowhere". Note that all of FPBAAAAA, MTCAAAAA, GCDAAAAA have the same filename, and have been present in (some of) the same projects as each other.


In the VssAction, I see the following operations on that file:
5794    _GCDAAAAA    GCDAAAAA    1    ADD
5837    \N    GCDAAAAA    2    COMMIT
6395    FNCAAAAA    GCDAAAAA    2    SHARE
8505    \N    GCDAAAAA    3    COMMIT
9863    YQFAAAAA    GCDAAAAA    3    SHARE
... more shares later, etc.

The file is added as an orphan when it should have been added to KTCAAAAA. Indeed adjacent VssActions are working on KTCAAAAA:


5793    KTCAAAAA    FCDAAAAA    1    ADD
5794    _GCDAAAAA    GCDAAAAA    1    ADD
5795    KTCAAAAA    HCDAAAAA    1    ADD

PhysicalAction (sorted by timestamp) shows for these:
7208    FCDAAAAA    1    \N    ADD
15823    FCDAAAAA    \N    KTCAAAAA    ADD
9315    GCDAAAAA    1    \N    ADD
10544    HCDAAAAA    1    \N    ADD
15824    GCDAAAAA    \N    KTCAAAAA    ADD
15825    HCDAAAAA    \N    KTCAAAAA    ADD

This, and probably many other files / projects, seem to be getting incorrectly identified as orphans. Possibly the script is getting confused between GCDAAAAA and FPBAAAAA?
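The check implied here can be sketched in Python: an item only counts as an orphan if no PhysicalAction ADD row ever names a parent for it. The column order (id, physical name, version, parent, action) is a reading of the dump above, and the function names are invented for the example; this is not how vss2svn itself decides.

```python
# PhysicalAction rows from above; \N is represented as None.
ROWS = [
    (7208, "FCDAAAAA", 1, None, "ADD"),
    (15823, "FCDAAAAA", None, "KTCAAAAA", "ADD"),
    (9315, "GCDAAAAA", 1, None, "ADD"),
    (10544, "HCDAAAAA", 1, None, "ADD"),
    (15824, "GCDAAAAA", None, "KTCAAAAA", "ADD"),
    (15825, "HCDAAAAA", None, "KTCAAAAA", "ADD"),
]

def parents_of(rows):
    """Map each added physical name to the set of parents it was added to."""
    parents = {}
    for _row_id, phys, _version, parent, action in rows:
        if action != "ADD":
            continue
        parents.setdefault(phys, set())
        if parent is not None:
            parents[phys].add(parent)
    return parents

def find_orphans(rows):
    """Items that were added but never attached to any parent project."""
    return sorted(p for p, ps in parents_of(rows).items() if not ps)

print(find_orphans(ROWS))  # []
```

On these rows GCDAAAAA has KTCAAAAA as a parent, the same as FCDAAAAA and HCDAAAAA, so by this test none of the three would be an orphan.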

> Yes, that seems to be the problem. Probably another item for the
> sanity-checker to prevent adding the same item twice. I thought we were
> already checking for that, and making sure the same file wasn't touched
> twice in the same delete, but maybe we're not doing that for adds, or
> maybe the label logic is bypassing it.

Sequence for this file seems to be
ADD (at /Project1/filename)
SHARE (to /Project2/filename)
DELETE (/Project2/filename)
COMMIT (at /Project1/filename)
SHARE (to Project2/filename)
[... other actions ... COMMIT and SHARE to other projects ...]
LABEL
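A sanity check of the kind mentioned might look something like the sketch below. It flags any ADD or SHARE into a path where the file is already live. The (action, path) tuples are a simplified representation for illustration only, not the real VssAction schema, and the function name is made up.

```python
def find_duplicate_adds(actions):
    """Return (index, action, path) for each ADD/SHARE onto a live path."""
    live = set()
    duplicates = []
    for index, (kind, path) in enumerate(actions):
        if kind in ("ADD", "SHARE"):
            if path in live:
                duplicates.append((index, kind, path))
            live.add(path)
        elif kind == "DELETE":
            live.discard(path)
        # COMMIT and LABEL do not change which paths are live.
    return duplicates

# The sequence from the message, up to the first label.
SEQUENCE = [
    ("ADD", "/Project1/filename"),
    ("SHARE", "/Project2/filename"),
    ("DELETE", "/Project2/filename"),
    ("COMMIT", "/Project1/filename"),
    ("SHARE", "/Project2/filename"),
]

print(find_duplicate_adds(SEQUENCE))  # []
```

The sequence as listed is clean because the DELETE comes before the second SHARE. A duplicate would only be reported if the same path were added again while still live, for example if an extra ("ADD", "/Project1/filename") were appended.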

Another label that had a similar problem (this time for a whole project) went through the sequence

ADD (at location1)
MOVE (to location2)
MOVE (to location1)
LABEL

I note that as you say, duplicates seem to be silently swallowed for the subsequent "COMMIT" actions on the first file.

> Yes, it's possible to modify the datacache files and then use the
> "resume" feature to restart the conversion at a particular point.

Couldn't get this to work... if I resume from the BUILDACTIONHIST or GETPHYSHIST phase then the datacache files are recreated, but if I resume from immediately following phases (MERGEPARENTDATA or IMPORTSVN) then datacache files seem to be ignored - having presumably been incorporated into the vss_data.db database already.

As you point out, there are countless other ways to trim the sections out of the file if this proves necessary, and I've now just knocked up a quick filter to pipe the output through that removes these known duplicates. It all imported ok then.

> b: not too surprising; the pin/unpin logic can be tricky to follow.

It seems that there are only about 2 or 3 dozen files pinned to the wrong version, so these shouldn't be too arduous to fix up manually if this proves necessary.

It does look like I'll have a fair bit of work to recover old versions of orphaned projects and copy these to the appropriate label folders unless those other problems get fixed in the meantime. It may be better just to not bother with labels, and keep SourceSafe handy for maintaining older versions.

> Since you will likely need to change your development
> model to not rely on these features anyway, I would suggest running the
> conversion as you did, checking it out from SVN, then doing one final
> export from VSS "on top of" the SVN checkout directory and committing to
> ensure you have the latest of everything. Then rearrange your project as
> necessary to not rely on those VSS features.

Was thinking of doing part of this the other way round... while various highly duplicated header files remain "shared", it is easier to replace all of these with a stub to #include a central copy rather than hunting them down in SVN after a conversion.

The amount of things that may need manually fixing up afterwards is potentially looking concerning though, particularly files that are missing from and/or incorrectly [un]pinned in labelled versions. I recognise that this is an alpha release not expected to be perfect, so if there is anyone who would like more detailed debug info or for me to run additional checks, please let me know.

> I filed a ticket for the duplicate-label and uninitialized-value issues,
> but again there is likely not much to be done regarding the
> destroyed/corrupted items.

There does seem (from the above) to be some kind of problem with the vss2svn script that gets itself out of sync between the PhysicalAction and VssAction output.

About time I did what I should have done in the first place and checked outstanding problem logs for anything that correlates, or the appropriate form for a minimal test case...


--
Stephen Lee

_______________________________________________
vss2svn-users mailing list
Project homepage:
http://www.pumacode.org/projects/vss2svn/
Subscribe/Unsubscribe/Admin:
http://lists.pumacode.org/mailman/listinfo/vss2svn-users-lists.pumacode.org
Mailing list web interface (with searchable archives):
http://dir.gmane.org/gmane.comp.version-control.subversion.vss2svn.user
