Date: Mon, 4 Aug 2003 13:53:52 -0700 (PDT)
   From: David Byrne <[EMAIL PROTECTED]>

   I am fairly new to Perl and haven't approached a script
   this complex or computation this intensive.  So I
   would certainly appreciate any advice.

   I have successfully created a hash of arrays
   equivalent to a 122 x 6152 matrix that I want to run
   in 'pairwise combinations' and execute the 'sum of the
   difference squares' for each combination.

   In other words:
   rows:      y1...y122
   columns:   x1...x6152

Is this a single large matrix?  Sparse or dense?
If sparse, a hash of hashes is probably the most memory-efficient way to store it:
$matrix{y32}{x53} = "value for row 32, column 53";
If dense, you could use an array of arrays:
$matrix[32][53] = "value for row 32, column 53";
Or you could investigate PDL ("Piddle", Perl Data Language).
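
To make the sparse option concrete, here is a minimal sketch of the
'sum of the difference squares' over a hash of hashes.  The toy data,
the sparse_ssd helper name, and the rule that a missing cell counts as
0 are all assumptions for illustration, not something from the
original post.  It uses the // operator, so it needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical toy data: two sparse rows; a missing cell is assumed to be 0.
my %matrix = (
    y1 => { x1 => 2, x3 => 5 },
    y2 => { x1 => 1, x2 => 4 },
);

# Hypothetical helper: sum of squared differences between two sparse rows.
sub sparse_ssd {
    my ($row_a, $row_b) = @_;
    my %seen;
    my $sum = 0;
    # Visit each column that appears in either row exactly once.
    for my $col (grep { !$seen{$_}++ } keys %$row_a, keys %$row_b) {
        my $diff = ($row_a->{$col} // 0) - ($row_b->{$col} // 0);
        $sum += $diff ** 2;
    }
    return $sum;
}

my $ssd = sparse_ssd($matrix{y1}, $matrix{y2});
print "y1 vs y2: $ssd\n";
```

Only the cells that actually exist are stored and visited, which is
where the memory saving comes from if most cells are empty.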

   so...
   comb(y1,y2): 
   {( y1[x1] - y2[x1] ) ^2 + ( y1[x2] - y2[x2] ) ^2 + ...
   + ( y1[x122] - y2[x122] ) ^2};

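You've reversed x and y compared to above.  I'll assume 6152 rows of
122 values each, which matches the n=6152 in your combinations count below.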

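# array of arrays version: @matrix holds 6152 rows of 122 values each,
# indexed from 1 to match the numbering above
my @comb;
for my $i (1..6152) {
    for my $j ($i+1 .. 6152) {
        # sum of squared differences between rows $i and $j
        $comb[$i][$j] = 0;
        $comb[$i][$j] += ($matrix[$i][$_] - $matrix[$j][$_]) ** 2
            for (1..122);
    }
}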
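
A self-contained miniature of the same loop may be easier to test
before running it on the full data.  This is only a sketch: the 3 x 2
toy matrix is made up, and it uses ordinary 0-based indexes and takes
the sizes from the data rather than hard-coding 6152 and 122.

```perl
use strict;
use warnings;

# Hypothetical toy data: 3 rows of 2 values each.
my @matrix = (
    [1, 2],
    [4, 6],
    [1, 0],
);

my @comb;
for my $i (0 .. $#matrix) {
    for my $j ($i + 1 .. $#matrix) {
        # Sum of squared differences between rows $i and $j.
        my $sum = 0;
        for my $k (0 .. $#{ $matrix[$i] }) {
            $sum += ($matrix[$i][$k] - $matrix[$j][$k]) ** 2;
        }
        $comb[$i][$j] = $sum;
    }
}

# Print every pair once (only $i < $j is ever filled in).
for my $i (0 .. $#matrix) {
    for my $j ($i + 1 .. $#matrix) {
        print "rows $i and $j: $comb[$i][$j]\n";
    }
}
```

Checking by hand: rows 0 and 1 give (1-4)**2 + (2-6)**2 = 25.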

   This is going to be very large.  According to the
   combinations formula (nCk, n=6152, k=2), the output
   will be a hash (with, for example, 'y1y2' key and
   'd^2' value) of about 19 million records.  

Yes.  PDL is more memory-efficient, since it keeps numbers in packed
arrays rather than as individual Perl scalars.  Or just run it on a
machine that has lots of RAM+swap.  Or use various techniques to move
most of the storage out of memory into files or a database.

(Simplest example: instead of building the @comb AoA above, compute a
single $comb scalar for each pair, then write it out:
print "comb of rows $i and $j is $comb\n";
)
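
Spelled out a little further, the write-it-out approach might look
like the sketch below.  The toy matrix, the comb.txt file name, and
the tab-separated output format are assumptions for illustration.  It
also counts the pairs and compares the count against n*(n-1)/2 as a
sanity check.

```perl
use strict;
use warnings;

# Hypothetical toy data standing in for the real matrix.
my @matrix = ([1, 2], [4, 6], [1, 0], [0, 0]);

# Hypothetical output file name.
my $out_file = 'comb.txt';
open my $out, '>', $out_file or die "Cannot open $out_file: $!";

my $pairs = 0;
for my $i (0 .. $#matrix) {
    for my $j ($i + 1 .. $#matrix) {
        # Only one scalar is held in memory per pair.
        my $comb = 0;
        $comb += ($matrix[$i][$_] - $matrix[$j][$_]) ** 2
            for 0 .. $#{ $matrix[$i] };
        print {$out} "$i\t$j\t$comb\n";
        $pairs++;
    }
}
close $out or die "Cannot close $out_file: $!";

# Number of unordered pairs of n rows is n*(n-1)/2.
my $n        = @matrix;
my $expected = $n * ($n - 1) / 2;
print "wrote $pairs lines, expected $expected\n";
```

For n = 6152 the same formula gives 6152 * 6151 / 2 = 18,920,476
pairs, which is the "about 19 million" above.  Memory use stays flat
because nothing but the input matrix is kept around.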

--kag
_______________________________________________
Boston-pm mailing list
[EMAIL PROTECTED]
http://mail.pm.org/mailman/listinfo/boston-pm
