The attached patch greatly improves DBMirror.pl performance.
DBMirror.pl is known to have problems when replicating
bytea columns that hold binary (non-printable) data,
e.g. TIFF, PDF, JPEG or bzip2 files.

Replication times of minutes (or even hours) for 500 kB columns were not unusual.

There has even been an effort by Peter Wilson 
(<petew ( at ) yellowhawk ( dot ) co ( dot ) uk>) 
to write a C alternative that tries to overcome these problems.
However, I think another C program is not really needed
at this point.

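This patch changes the way extractData parses the data field:
instead of stopping at every single quote or backslash, it scans
to the next single quote and uses the parity of the backslash count
in that chunk to decide whether the quote is escaped or ends the value.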
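
For readers who do not want to decode the diff, here is a rough
standalone sketch of the parsing idea. It is not the patched code
itself: extract_quoted_value is a hypothetical helper name, and the
sketch assumes that the only escape sequences in the data field are
\\ and \' and that the input starts right after the opening quote.
Unlike extractData, it does not skip the separator that follows the
closing quote.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of DBMirror.pl.
# $data starts just after the opening quote of a value.
# Returns the unescaped value and the still-unparsed remainder.
sub extract_quoted_value {
    my ($data) = @_;
    my $value = '';
    while (length($data) > 0) {
        # Take everything up to and including the next single quote.
        last unless $data =~ m/\A(.*?')/s;
        my $chunk = $1;
        $data = substr($data, length($chunk));
        # Escaped backslashes come in pairs, so an odd backslash count
        # in the chunk means the quote itself is escaped (\').
        my $escaped_quote = ($chunk =~ tr/\\//) % 2;
        # Drop the trailing quote, plus its escaping backslash if any.
        my $body = substr($chunk, 0, length($chunk) - 1 - $escaped_quote);
        $body =~ s/\\\\/\\/g;    # collapse \\ into a single backslash
        if ($escaped_quote) {
            $value .= $body . "'";    # intermediate quote, keep scanning
        } else {
            $value .= $body;          # unescaped quote ends the value
            return ($value, $data);
        }
    }
    return ($value, $data);    # unterminated input: return what was parsed
}
```

The loop runs once per single quote rather than once per backslash,
which is where the gain for backslash-heavy bytea data should come from.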

Please have a look at the patch and let me know if you see any potential problems.

Thank you.

P.S. 
I also emailed the -sql, -general and -pgreplication lists
with a relevant message.

Steven Singer (the original author) is not reachable.
However, I think that some users might already be suffering from
poor DBMirror.pl performance.

-- 
-Achilleus
*** DBMirror.pl Tue Jan 24 09:36:24 2006
--- DBMirror.pl.new     Tue Jan 24 09:41:11 2006
***************
*** 874,879 ****
--- 874,880 ----
  }
  
  
+ 
  sub extractData($$) {
    my $pendingResult = $_[0];
    my $currentTuple = $_[1];
***************
*** 881,886 ****
--- 882,888 ----
    my %valuesHash;
    $fnumber = 4;
    my $dataField = $pendingResult->getvalue($currentTuple,$fnumber);
+   my $numofbs;
  
    while(length($dataField)>0) {
      # Extract the field name that is surronded by double quotes
***************
*** 902,929 ****
             #Recommended in perlsyn manpage.
        do {
        my $matchString;
        #Find the substring ending with the first ' or first \
!       $dataField =~ m/(.*?[\'\\])?/s; 
        $matchString = $1;
-       $value .= substr $matchString,0,length($matchString)-1;
- 
-       if($matchString =~ m/(\'$)/s) {
-         # $1 runs to the end of the field value.
-           $dataField = substr $dataField,length($matchString)+1;
-           last;
-         
-       }
-       else {
-         #deal with the escape character.
-         #It The character following the escape gets appended.
-           $dataField = substr $dataField,length($matchString);            
-           $dataField =~ s/(^.)//s;        
-           $value .=  $1;
  
  
!         
        }
!       
           
        } until(length($dataField)==0);
    }
--- 904,930 ----
             #Recommended in perlsyn manpage.
        do {
        my $matchString;
+       my $matchString2;
        #Find the substring ending with the first ' or first \
!       $dataField =~ m/(.*?[\'])?/s; 
        $matchString = $1;
  
+       $numofbs = ($matchString =~ tr/\\//) % 2;       
  
!       if ($numofbs == 1) { #// odd number of \, i.e. intermediate '
!               $matchString2 = substr $matchString,0, length($matchString)-2;
!               $matchString2 =~ s/\\\\/\\/g;
!               $value .= ($matchString2 . "\'");
!               $dataField = substr $dataField,length($matchString);
        }
!       else { #// even number of \, i.e. found end of data
!               $matchString2 = substr $matchString,0, length($matchString)-1;
!               $matchString2 =~ s/\\\\/\\/g;
!               $value .= $matchString2;
!               $dataField = substr $dataField,length($matchString)+1;
!               last;
!       }
! 
           
        } until(length($dataField)==0);
    }