Date: Mon, 4 Aug 2003 13:53:52 -0700 (PDT)
From: David Byrne <[EMAIL PROTECTED]>
I am fairly new to Perl and haven't approached a script
this complex or a computation this intensive. So I
would certainly appreciate any advice.
I have successfully created a hash of arrays
equivalent to a 122 x 6152 matrix that I want to run
in 'pairwise combinations', computing the 'sum of the
squared differences' for each combination.
In other words:
rows: y1...y122
columns: x1...x6152
This is a single large matrix? Sparse or dense?
If sparse, a hash of hashes is probably the most memory-efficient way to store it:
$matrix{y32}{x53} = "value for row 32, column 53";
If dense, you could use an array of arrays:
$matrix[32][53] = "value for row 32, column 53";
Or you could investigate PDL ("Piddle", Perl Data Language).
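To make the two layouts concrete, here is a minimal sketch. The variable names (%sparse, @dense) and the values are made up for illustration; only the access patterns matter. Note that looking up a missing cell in a nested hash creates the outer key as a side effect, so the sketch checks with exists() first.

```perl
use strict;
use warnings;

# Sparse: hash of hashes; only the cells that exist take up memory.
my %sparse;
$sparse{y32}{x53} = 1.5;

# Dense: array of arrays; cells are addressed by number.
my @dense;
$dense[32][53] = 1.5;

# Reading a possibly missing sparse cell: test with exists() first,
# because a plain lookup of $sparse{y1}{x1} would autovivify $sparse{y1}.
my $value = exists $sparse{y1} && exists $sparse{y1}{x1}
          ? $sparse{y1}{x1}
          : 0;

my $sparse_rows = scalar keys %sparse;
my $dense_rows  = scalar @dense;
print "sparse rows stored: $sparse_rows\n";
print "dense row slots allocated: $dense_rows\n";
```

The dense version allocates a slot for every row index up to the highest one used, which is why it only pays off when most cells are filled.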
so...
comb(y1,y2):
{( y1[x1] - y2[x1] ) ^2 + ( y1[x2] - y2[x2] ) ^2 + ...
+ ( y1[x122] - y2[x122] ) ^2};
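You've reversed x and y compared to above: this formula sums over 122
values per row, so the code below treats the data as 6152 rows of 122
values each.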
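# array of arrays version: 6152 rows, 122 values per row (1-based indices)
for my $i (1 .. 6152) {
    for my $j ($i + 1 .. 6152) {
        # sum of squared differences between rows $i and $j
        $comb[$i][$j] = 0;
        $comb[$i][$j] += ($matrix[$i][$_] - $matrix[$j][$_]) ** 2
            for (1 .. 122);
    }
}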
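If you want to check the loop on something small first, here is a self-contained sketch of the same idea. The toy matrix and the helper name sum_sq_diff are mine, not anything from your script, and it uses 0-based indices rather than the 1-based ones above.

```perl
use strict;
use warnings;

# Hypothetical toy data: 3 rows of 4 values (0-based indices).
my @matrix = (
    [1, 2, 3, 4],
    [2, 2, 5, 4],
    [0, 1, 3, 6],
);

# Sum of squared differences between two equal-length rows.
sub sum_sq_diff {
    my ($row_a, $row_b) = @_;
    my $sum = 0;
    $sum += ($row_a->[$_] - $row_b->[$_]) ** 2 for 0 .. $#$row_a;
    return $sum;
}

# Visit each unordered pair of rows once ($i < $j).
my @comb;
for my $i (0 .. $#matrix) {
    for my $j ($i + 1 .. $#matrix) {
        $comb[$i][$j] = sum_sq_diff($matrix[$i], $matrix[$j]);
    }
}

print "rows 0 and 1: $comb[0][1]\n";
print "rows 0 and 2: $comb[0][2]\n";
print "rows 1 and 2: $comb[1][2]\n";
```

Only the upper triangle of @comb is filled, since the result for rows ($i, $j) is the same as for ($j, $i).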
This is going to be very large. According to the
combinations formula (nCk, n=6152, k=2), the output
will be a hash (with, for example, 'y1y2' key and
'd^2' value) of about 19 million records.
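As a quick check of that figure, the number of unordered pairs among n rows is n * (n - 1) / 2. A small sketch (the helper name pair_count is just for illustration):

```perl
use strict;
use warnings;

# Number of unordered pairs among $n items: n * (n - 1) / 2.
sub pair_count {
    my ($n) = @_;
    return $n * ($n - 1) / 2;
}

my $pairs = pair_count(6152);
print "$pairs\n";    # prints 18920476
```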
Yes. PDL is more memory efficient. Or just run it on a machine that
has lots of RAM+swap. Or use various techniques to move most of the
storage out of memory into files or a database.
(Simplest example: instead of creating a $comb AoA above, just create
a $comb scalar each round, then write it out:
print "comb of rows $i and $j is $comb\n";
)
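A slightly fuller sketch of that write-as-you-go approach, assuming a small stand-in matrix and a tab-separated output format of my own choosing. The sub name write_pairs is hypothetical. It takes a filehandle, so you can point it at STDOUT, a file, or anything else you have open.

```perl
use strict;
use warnings;

# Hypothetical toy data standing in for the real matrix (0-based).
my @matrix = ([1, 2, 3], [4, 6, 8], [1, 0, 3]);

# Compute each pair's sum of squared differences and print it
# immediately, so only one result is held in memory at a time.
# Returns the number of lines written.
sub write_pairs {
    my ($fh, $rows) = @_;
    my $written = 0;
    for my $i (0 .. $#$rows) {
        for my $j ($i + 1 .. $#$rows) {
            my $comb = 0;
            $comb += ($rows->[$i][$_] - $rows->[$j][$_]) ** 2
                for 0 .. $#{ $rows->[$i] };
            print {$fh} "$i\t$j\t$comb\n";
            $written++;
        }
    }
    return $written;
}

my $count = write_pairs(\*STDOUT, \@matrix);
```

To send the output to a file instead, open one with the three-argument open and pass that handle in place of \*STDOUT.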
--kag
_______________________________________________
Boston-pm mailing list
[EMAIL PROTECTED]
http://mail.pm.org/mailman/listinfo/boston-pm