Hi Stephen,

Thanks for looking into this. This is indeed a problem. MOVE and especially ARCHIVE / RESTORE actions have had little support so far, since they were not used much in the main contributors' context.

The GetChildRecs function is used in two contexts: MergeParentData and MergeMoveData. In both contexts, a match of more than one file is an error. Previously, we had a time constraint that led to numerous orphaned files because matching rows were missed. Since we started using the "widetime" branch, things are better in this area.

As you write yourself, your fix will still give a matching pair, even if the two actions are years apart. What would happen if we had picked up the wrong action in the first place?

Two ideas come to mind for solving the problem:

1.) Small timeframe for move actions:
The big timeframe is needed because committing files into the archive and finally updating the parent is not a single atomic action. If you have folders with very large files, it can take some time to transfer them over the network, create the physical file and so on. The parent is updated after the last file is committed to the folder. A move action, by contrast, completes within a very short time frame, since no files have to be transferred and no new files have to be created. Only the source and destination parents have to be changed. So we can expect short timeframes here.

2.) In a move situation the database rows must be next to each other. Since the move action is done in one go, the two move actions must be adjacent in the sorted database even if their timestamps are not identical, since no other VSS action could interfere (hopefully).
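
To make the two ideas concrete, here is a rough sketch in Python (not the Perl of vss2svn). The Action fields, the MAX_MOVE_GAP value and the find_move_partner name are all made up for illustration; the real database rows look different. It only accepts a direct neighbour in the timestamp-sorted rows (idea 2) that also lies within a short gap (idea 1).

```python
from typing import List, NamedTuple, Optional


class Action(NamedTuple):
    # Hypothetical row layout, not the real vss2svn schema.
    action_id: int
    timestamp: int   # seconds since epoch
    author: str
    physname: str
    is_parent: bool  # True for the parent-side half of the move


MAX_MOVE_GAP = 10  # seconds; an assumed value that would need tuning


def find_move_partner(rows: List[Action], index: int,
                      max_gap: int = MAX_MOVE_GAP) -> Optional[Action]:
    """Return the other half of the move at rows[index], or None.

    rows must already be sorted by timestamp. Only the two direct
    neighbours are considered, and only if they are within max_gap.
    """
    me = rows[index]
    candidates = []
    for j in (index - 1, index + 1):
        if 0 <= j < len(rows):
            other = rows[j]
            if (other.physname == me.physname
                    and other.author == me.author
                    and other.is_parent != me.is_parent
                    and abs(other.timestamp - me.timestamp) <= max_gap):
                candidates.append(other)
    # Two matching neighbours would still be ambiguous, so report no match.
    return candidates[0] if len(candidates) == 1 else None
```

With a check like this, a project moved back and forth by the same user years apart would pair each half only with its immediate neighbour, and anything else would be left unmatched instead of being matched wrongly.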

I will think about this a little further.
Good to hear that your conversion is now nearly successful.

Best regards
Dirk


Stephen Lee wrote:
If a project is moved more than once by the same person, this causes problems (on vss2svn trunk).

You see log messages like this:
TASK: MERGEMOVEDATA
ERROR -- Multiple chidl recs for parent MOVE rec '36'
at vss2svn.pl line 620
ERROR -- Multiple chidl recs for parent MOVE rec '116'
at vss2svn.pl line 620
ERROR -- Multiple chidl recs for parent MOVE rec '17614'
at vss2svn.pl line 620

This seems to be because GetChildRecs will match the "other half" of any move done by the same user to the same folder irrespective of how far apart the timestamps are. If a project has been moved back and forth, it pulls in both (or all) records and "does the wrong thing".

Since both calls to GetChildRecs complain if they get multiple records (and thus presumably have no valid reason to do so), I fixed this by adding a "LIMIT 1" to the SQL. Together with the 'ORDER BY' this will then return only the record with the closest timestamp. It should probably have some sensible limit on timestamp though, as in some cases it could match events that happened months or even years apart.
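
For illustration, the effect of 'ORDER BY ABS(? - timestamp)' plus "LIMIT 1" can be tried against a throwaway SQLite database. This is only a sketch in Python with a simplified, assumed table layout (the column set and the closest_child and make_demo_db helpers are made up here, and the optional max_gap cut-off is the "sensible limit" mentioned above, not something in the patch).

```python
import sqlite3


def make_demo_db():
    """Build an in-memory table with two candidate child records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PhysicalAction ("
        " action_id INTEGER PRIMARY KEY, physname TEXT, author TEXT,"
        " timestamp INTEGER, parentdata INTEGER)"
    )
    conn.executemany(
        "INSERT INTO PhysicalAction VALUES (?, ?, ?, ?, ?)",
        [(1, "AAAA", "Admin", 1000, 0),
         (2, "AAAA", "Admin", 5000, 0)],
    )
    return conn


def closest_child(conn, physname, author, timestamp, max_gap=None):
    """Return (action_id, timestamp) of the nearest child record, or None."""
    sql = ("SELECT action_id, timestamp FROM PhysicalAction"
           " WHERE parentdata = 0 AND physname = ? AND author = ?")
    params = [physname, author]
    if max_gap is not None:
        # Assumed extra guard: ignore records further away than max_gap.
        sql += " AND ABS(? - timestamp) <= ?"
        params += [timestamp, max_gap]
    # Nearest timestamp first, and keep only that one row.
    sql += " ORDER BY ABS(? - timestamp) LIMIT 1"
    params.append(timestamp)
    return conn.execute(sql, params).fetchone()
```

Without max_gap the query always returns some record as long as one exists for that user and physname, which is exactly the case where events months or years apart could be paired.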


I also had a project that had been moved into a folder that was later deleted, and moved back out again. i.e. equivalent of:

Create $/Project/Subproj
Move to $/Project/Tmpproj/Subproj
Move to $/Project/Subproj  (by a different user)
Destroy $/Project/Tmpproj

The parent records from "the other half of the move" were lost with Tmpproj. Other projects initially created in Tmpproj were orphaned, and I see that there is code in the move to try and "do the right thing" by moving to orphaned in such a case.

The change to Dumpfile.pm in move_handler ensures that it will create the appropriate subfolder of orphaned (based on the same thing in the add_handler)
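
The idea of "adding missing parents" can be sketched roughly as follows. This is Python for illustration only, not the Perl in Dumpfile.pm; the missing_parents helper and the example paths are made up, and the real code works on its own repository model rather than a plain set of paths.

```python
def missing_parents(existing, newpath):
    """Return the parent folders of newpath that are not in `existing`.

    `existing` is a set of absolute folder paths. The result is ordered
    shallowest first, so the folders can be created in that order.
    """
    parts = newpath.strip("/").split("/")[:-1]  # drop the item itself
    missing = []
    current = ""
    for part in parts:
        current = f"{current}/{part}"
        if current not in existing:
            missing.append(current)
    return missing
```

For a move into an orphaned location, only the intermediate subfolder would typically be reported as missing and need to be created before the item itself is added.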

Finally, when the project is moved back out of its orphaned location, it was not getting picked up correctly. Instead an old, inactive copy of the project was being shared.

The change to ActionHandler.pm's move_handler changes how it determines if it needs to move from orphaned... now based on active parents rather than physinfo->{order} (which seems to be all parents an item has ever had). At the same time I also promoted the source comment
        # Don't know from where to move. Share it there instead
into an error
$self->{errmsg} .= "Don't know from where to move $physname. Sharing instead.\n";
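
The difference between the two checks can be shown with a small sketch (Python, for illustration only; resolve_move_source and the parent names are hypothetical, and it assumes that an "active" parent is simply one that has not been deleted):

```python
def resolve_move_source(order, deleted):
    """Pick the parent to move from, or report that it is ambiguous.

    `order` lists every parent the item has ever had, `deleted` is the
    set of parents that no longer exist. Returns (parent, error).
    """
    active = [p for p in order if p not in deleted]
    if len(active) == 1:
        return active[0], None
    return None, "Don't know from where to move. Sharing instead."
```

In the Tmpproj case above, the item has had two parents over its lifetime, so a check on the length of the full history fails, while the check on active parents still finds the single remaining location.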


This works well on my database, which now fully matches SourceSafe except for one file on a couple of branches (that file had such obscure things done to it I have trouble following it by hand!)


There are a few possible caveats with the changes, in particular where I've tweaked code that was obviously in there to fix problems that did not exist in my repository... would be interested to know if it improves or has adverse effects on other repositories.

I also attach a batch file that I had used to create a test SourceSafe database to trigger this problem (not quite a minimal test case - I'd kept on adding things on to try and match the history of what the real project had done, until I realised the critical part was the move done by a different user - 'Guest' in the batch file).

------------------------------------------------------------------------

Index: script/vss2svn.pl
===================================================================
--- script/vss2svn.pl   (revision 285)
+++ script/vss2svn.pl   (working copy)
@@ -550,6 +550,7 @@
     AND author = ?
 ORDER BY
     ABS(? - timestamp)
+LIMIT 1
 EOSQL
     my $sth = $gCfg{dbh}->prepare($sql);
Index: script/Vss2Svn/ActionHandler.pm
===================================================================
--- script/Vss2Svn/ActionHandler.pm     (revision 285)
+++ script/Vss2Svn/ActionHandler.pm     (working copy)
@@ -413,14 +413,17 @@
     if (!defined $row->{parentphys}) {
       # Check if this is an orphaned item
-      if (scalar @{$physinfo->{order}} == 1) {
-        $row->{parentphys} = $physinfo->{order}[0];
+      # or if there is otherwise only one active item path
+      my @parents = $self->_get_active_parents ($row->{physname});
+      if (scalar @parents == 1) {
+        $row->{parentphys} = $parents[0];
       } else {
-        # Don't know from where to move. Share it there instead
+        $self->{errmsg} .= "Don't know from where to move $physname. Sharing instead.\n";
         $row->{parentphys} = $row->{info};
         $row->{info} = undef;
         $self->{action} = 'SHARE';
-        return $self->_share_handler();
+        $self->_share_handler();
+        return 0;
       }
     }
Index: script/Vss2Svn/Dumpfile.pm
===================================================================
--- script/Vss2Svn/Dumpfile.pm  (revision 285)
+++ script/Vss2Svn/Dumpfile.pm  (working copy)
@@ -390,7 +390,20 @@
             . "missing recover; skipping");
         return 0;
     }
-
+
+    my $success = $self->{repository}->exists_parent ($newpath);
+    if(!defined($success)) {
+        $self->add_error("Attempt to move item '$itempath' to '$newpath' at "
+            . "revision $data->{revision_id}, but path consistency failure at dest");
+        return 0;
+    }
+    elsif ($success == 0) {
+        $self->add_error("Parent path missing while trying to move "
+            . "item '$itempath' to '$newpath' at "
+            . "revision $data->{revision_id}: adding missing parents");
+        $self->_create_svn_path ($nodes, $newpath);
+    }
+
     my $node = Vss2Svn::Dumpfile::Node->new();
     $node->set_initial_props($newpath, $data);
     $node->{action} = 'add';
------------------------------------------------------------------------

set SSDIR=D:\sstestdb
rmdir /s /q %SSDIR%\data
mkdir %SSDIR%\data
mkss %SSDIR%\data
ddconv %SSDIR%\data
ss cp $/ -YAdmin
ss create myproject -YAdmin -Crevision1
pause
ss create myproject/mysubproj -YAdmin -Crevision1
pause
echo revision1 > testfile.txt
ss cp $/myproject/mysubproj -YAdmin
ss add testfile.txt -YAdmin -Crevision1 -W
pause
ss cp $/ -YAdmin
ss rename myproject/mysubproj myproject/subproj -YAdmin
pause
ss rename myproject project -YAdmin
pause
ss move project/subproj subproj -YAdmin
pause
ss create project/subproj -YAdmin -Ctemporary
pause
ss move subproj project/subproj/subproj -YAdmin
pause
ss cp $/ -YAdmin
ss create project/subproj/subproj2 -YAdmin -Csubproj2
pause
ss cp $/project/subproj/subproj2 -YAdmin
ss share $/project/subproj/subproj/testfile.txt -YAdmin -G-
pause
ss cp $/project/subproj/subproj -YAdmin
ss checkout testfile.txt -YAdmin -G-
echo revision2 > testfile.txt
ss checkin testfile.txt -YAdmin -W -Crevision2
pause
ss cp $/ -YAdmin
ss rename project/subproj project/tmpproj -YAdmin
pause
ss move project/tmpproj/subproj project/subproj -YGuest
pause
ss move project/tmpproj/subproj2 project/subproj2 -YAdmin
pause
ss destroy project/tmpproj -YAdmin
------------------------------------------------------------------------

_______________________________________________
vss2svn-users mailing list
Project homepage:
http://www.pumacode.org/projects/vss2svn/
Subscribe/Unsubscribe/Admin:
http://lists.pumacode.org/mailman/listinfo/vss2svn-users-lists.pumacode.org
Mailing list web interface (with searchable archives):
http://dir.gmane.org/gmane.comp.version-control.subversion.vss2svn.user
