OK, thanks, Jim. I may be able to slim down some of the hashes and/or
break up the program some.
Thanks,
Aimee
On Jan 8, 2010, at 4:46 PM, Jim Gibson wrote:
On 1/8/10 Fri Jan 8, 2010 2:06 PM, "Aimee Cardenas"
<aim...@sfbrgenetics.org> scribbled:
Hi, All,
Hope everyone's new year is starting out very well.
I'm having to adjust a Perl program I wrote to manipulate some genetics
data. Originally, the program had no memory problems, but now that I've
added a couple more hashes, I'm having memory issues. It now runs out of
memory when it's about halfway through processing the data.
Unfortunately, the data is very interconnected, and the statistics I
need to compute involve data from several of the hashes. I'm thinking of
using threads & threads::shared in order to be able to process, store,
and access the data among several of the 8 processors on a Sun SPARC
system. Of course, I'll need to install the threads and threads::shared
modules and possibly even recompile perl on this machine, but before I
go and do all this fun stuff, I wanted to ask your opinion about whether
I'm going down the right rabbit hole or just digging myself a shallow
grave. Would this be the way you might do it? I've also heard of
semaphores. Might this be a better way to go about spreading the data in
hash form among several processors on one machine and still be able to
access the data in each hash from the main program?
I am not familiar with the details of the SPARC platform architecture,
but in general multi-threading will not help with memory problems.
Normally, all processors in a multi-processor system are accessing the
same physical memory, although each process will have access to
different parts. In a multi-threaded program, all threads are accessing
the same subset of memory.
How much memory does your system have? Are you running in 32-bit mode or
64-bit mode (the latter can access more memory)?
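One way to answer the 32-bit/64-bit question from perl itself is to look
at how the interpreter was built. The sketch below uses the core Config
module; the variable names are only illustrative, and it reports on the
perl binary rather than on the machine's installed memory, which has to
be checked at the operating-system level.

```perl
use strict;
use warnings;
use Config;   # core module describing how this perl was built

# Pointer size in bytes: 4 means a 32-bit perl, 8 means a 64-bit perl.
my $ptr_bits = $Config{ptrsize} * 8;
print "perl $Config{version} is a ${ptr_bits}-bit build on $Config{archname}\n";

# This entry is set only when perl was compiled with interpreter threads.
my $has_threads = $Config{useithreads} ? 'yes' : 'no';
print "ithreads support: $has_threads\n";
```

The second check also indicates whether perl would have to be recompiled
before the threads module could be used at all.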
You need to minimize your memory usage by trimming your hashes to the
minimum, making sure you have deleted data from memory after you are
through with it, and using a compact data representation. If you still
run out of memory, you will need to keep some data in a disk file or a
disk-based database, although that will slow down your program
noticeably.
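As a minimal sketch of the disk-based option, a hash can be tied to a
DBM file so that its contents live on disk instead of in memory. This
example assumes SDBM_File, which is bundled with perl; the file location,
keys, and values are hypothetical stand-ins for the real genetics data.

```perl
use strict;
use warnings;
use Fcntl;                     # O_RDWR, O_CREAT
use SDBM_File;                 # simple disk-based hash bundled with perl
use File::Temp qw(tempdir);

# Hypothetical location; a real program would choose its own path.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/genotypes";

# Tie the hash to a file so its contents are stored on disk.
tie(my %genotype, 'SDBM_File', $file, O_RDWR | O_CREAT, 0666)
    or die "Cannot tie $file: $!";

# Reads and writes look like ordinary hash operations.
$genotype{'sample001:rs123'} = 'AG';
$genotype{'sample002:rs123'} = 'GG';
print "sample001:rs123 => $genotype{'sample001:rs123'}\n";

# Remove an entry as soon as it is no longer needed.
delete $genotype{'sample002:rs123'};
my $remaining = keys %genotype;
print "$remaining record(s) left on disk\n";

untie %genotype;
```

SDBM_File stores only flat strings and limits the size of each key/value
pair, so nested structures would need to be flattened or serialized
first, or a different DBM module chosen.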
Arrays have a little less overhead than hashes, so use them instead when
you can.
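One common way to swap a hash-per-record for an array-per-record without
losing readability is to name the array positions with constants. The
sketch below is only an illustration; the field names and sample values
are assumptions, not taken from the original program.

```perl
use strict;
use warnings;

# Hypothetical field layout; constants give the array slots readable names.
use constant { ID => 0, ALLELE1 => 1, ALLELE2 => 2 };

# Hash-per-record form.
my %as_hash  = (id => 'sample001', allele1 => 'A', allele2 => 'G');

# Array-per-record form: same data, addressed by position instead of key.
my @as_array = ('sample001', 'A', 'G');

printf "hash:  %s %s/%s\n", @as_hash{qw(id allele1 allele2)};
printf "array: %s %s/%s\n", @as_array[ID, ALLELE1, ALLELE2];
```

The saving per record is small, so this mostly pays off when there are a
great many records with the same fixed set of fields.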
--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/