Greetings,
First, some background: I've been working on converting a rather large
(~19 GB) VSS repository to SVN.  Yes, it's a mess.  The source
repository is full of gunk.  But the other issues can be addressed
separately.

The issue here is memory usage.  The repository is large (the resulting
SVN repository has about 22000 revisions), and a few files in it are
several hundred megabytes in size (why, I don't know).  Since the
converter reads each file's full contents into memory, memory usage
became a huge problem.

Note that I've never used VSS in my life, and my background in Perl is
fairly limited.  I'm just the poor sucker who got tasked with
converting this repository.  This patch implements something more or
less like the recommendation made by Toby here:
http://thread.gmane.org/gmane.comp.version-control.subversion.vss2svn.user/1652/focus=1654

It adds a 'file' attribute to the Node class associating a node with
the physical file that that node's contents are read from. It also
keeps the 'text' attribute around, though now nothing is using it.
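
To illustrate how the two attributes can coexist, here is a minimal
sketch, not code from the patch.  It uses a plain hash as a stand-in for
a node, and content_length is a hypothetical helper name.  In-memory
'text' wins if it is set; otherwise the size comes from the file on
disk, so nothing has to be read into memory just to learn the length.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of vss2svn): report how many content bytes
# a node would contribute, given either in-memory text or a file path.
sub content_length {
    my ($node) = @_;

    # In-memory text takes precedence, the same order the patch checks.
    return length $node->{text} if defined $node->{text};

    # Otherwise ask the filesystem for the size. -s is false for an empty
    # file and undef for a missing one, so fall back to 0 in both cases.
    if (defined $node->{file}) {
        return (-s $node->{file}) || 0;
    }

    return 0;
}
```

Note that a missing file and an empty file both report 0 here; a real
implementation would probably want to treat the missing file as an
error.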

In Dumpfile.pm it renames get_export_contents to get_export_file, which
now sets the 'file' attribute but doesn't actually open the file yet.
Finally, in output_content, if $node->{text} is undefined, it checks
for $node->{file}, and if that is set it uses File::Copy to copy that
file's contents into the dump file.
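
As a rough standalone illustration of that last step (not the patch
itself), the sketch below writes a length header and then streams a
file into an already-open handle.  write_content_from_file is a
hypothetical helper name.  The sketch also does two things the patch
does not: it checks the results of -s and copy(), and it flushes the
handle before copying.  The flush is there because File::Copy writes
with syswrite, which bypasses Perl's output buffer; whether the real
dump file handle needs this depends on how it is opened, which I am
only assuming here.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use IO::Handle;    # provides the flush method on filehandles

# Hypothetical helper (not part of vss2svn): write "Content-length" followed
# by the file's bytes, without reading the whole file into memory.
sub write_content_from_file {
    my ($fh, $file) = @_;

    # -s returns undef if the file is missing, so check before using it.
    my $len = -s $file;
    return 0 unless defined $len;

    print {$fh} "Content-length: $len\n\n";

    # Push out anything print has buffered so the header lands before the
    # bytes that copy() writes directly with syswrite.
    $fh->flush;

    # copy() accepts an open filehandle as the destination and copies in
    # chunks; it returns 1 on success and 0 on failure.
    copy($file, $fh) or return 0;

    print {$fh} "\n";
    return 1;
}
```

The caller is expected to have put the output handle into binmode
already, since copy() only does that for handles it opens itself.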

I've tested this with a couple of smaller repositories, and our huge
one, and it works as advertised.  Whereas previously my memory usage
would slowly increase, slowing my system to a crawl, now it just stays
flat, since File::Copy buffers at most 2MB at a time by default.  So
assuming I haven't made any glaring errors (and please point them out
if I have), enjoy :)

Erik
diff -ur vss2svn/script/Vss2Svn/Dumpfile/Node.pm vss2svn-working/script/Vss2Svn/Dumpfile/Node.pm
--- vss2svn/script/Vss2Svn/Dumpfile/Node.pm	2007-06-11 17:19:29.000000000 -0400
+++ vss2svn-working/script/Vss2Svn/Dumpfile/Node.pm	2007-06-11 17:20:59.000000000 -0400
@@ -32,6 +32,7 @@
          copypath => undef,
          props => undef,
          hideprops => 0,
+         file => undef,
          text => undef,
         };
 
diff -ur vss2svn/script/Vss2Svn/Dumpfile.pm vss2svn-working/script/Vss2Svn/Dumpfile.pm
--- vss2svn/script/Vss2Svn/Dumpfile.pm	2007-06-11 17:15:42.000000000 -0400
+++ vss2svn-working/script/Vss2Svn/Dumpfile.pm	2007-06-11 17:20:59.000000000 -0400
@@ -9,6 +9,8 @@
 use warnings;
 use strict;
 
+use File::Copy;
+
 our %gHandlers =
     (
      ADD        => \&_add_handler,
@@ -215,7 +217,7 @@
     $node->{action} = 'add';
 
     if ($data->{itemtype} == 2) {
-        $self->get_export_contents($node, $data, $expdir);
+        $self->get_export_file($node, $data, $expdir);
     }
 
 #    $self->track_modified($data->{physname}, $data->{revision_id}, $itempath);
@@ -243,7 +245,7 @@
     $node->{action} = 'change';
 
     if ($data->{itemtype} == 2) {
-        $self->get_export_contents($node, $data, $expdir);
+        $self->get_export_file($node, $data, $expdir);
     }
 
 #    $self->track_modified($data->{physname}, $data->{revision_id}, $itempath);
@@ -706,9 +708,9 @@
 }  #  End last_deleted_rev_path
 
 ###############################################################################
-#  get_export_contents
+#  get_export_file
 ###############################################################################
-sub get_export_contents {
+sub get_export_file {
     my($self, $node, $data, $expdir) = @_;
 
     if (!defined($expdir)) {
@@ -719,23 +721,10 @@
         return 0;
     }
 
-    my $file = "$expdir/$data->{physname}.$data->{version}";
-
-    if (!open EXP, "$file") {
-        $self->add_error("Could not open export file '$file'");
-        return 0;
-    }
-
-    binmode(EXP);
-
-#   $node->{text} = join('', <EXP>);
-    $node->{text} = do { local( $/ ) ; <EXP> } ;
-
-    close EXP;
-
+    $node->{file} = "$expdir/$data->{physname}.$data->{version}";
     return 1;
 
-}  #  End get_export_contents
+}  #  End get_export_file
 
 ###############################################################################
 #  output_node
@@ -747,18 +736,18 @@
     my $string = $node->get_headers();
     print $fh $string;
     $self->output_content($node->{hideprops}? undef : $node->{props},
-                          $node->{text});
+                          $node->{text}, $node->{file});
 }  #  End output_node
 
 ###############################################################################
 #  output_content
 ###############################################################################
 sub output_content {
-    my($self, $props, $text) = @_;
+    my($self, $props, $text, $file) = @_;
 
     my $fh = $self->{fh};
 
-    $text = '' unless defined $text;
+    $text = '' unless defined $text || defined $file;
 
     my $proplen = 0;
     my $textlen = 0;
@@ -780,7 +769,11 @@
         $proplen = length($propout);
     }
 
-    $textlen = length($text);
+    if(!defined $text && defined $file) {
+        $textlen = -s $file;
+    } else {
+        $textlen = length($text);
+    }
     return if ($textlen + $proplen == 0);
 
     if ($proplen > 0) {
@@ -792,7 +785,14 @@
     }
 
     print $fh "Content-length: " . ($proplen + $textlen)
-        . "\n\n$propout$text\n";
+        . "\n\n$propout";
+
+    if(!defined $text && defined $file) {
+        copy($file, $fh);
+        print $fh "\n";
+    } else {
+        print $fh "$text\n";
+    }
 
 }  #  End output_content
 
