Peter,

--- "Lobsingerp,Peter [CIS]" <[EMAIL PROTECTED]> wrote:
> As for sharing resources, the data that I am analysing is 3-4 GB
> (more than half of what I have available in RAM), so I wasn't too
> keen to copy all of this into 101-1001 processes, although forks.pm
> shows promise. Sounds a lot easier than convincing my sysadmin to
> recompile Perl.
>

Given this memory usage, I also *highly* recommend you evaluate
forks::BerkeleyDB:

http://search.cpan.org/~rybskej/forks-BerkeleyDB-0.03/lib/forks/BerkeleyDB/shared.pm

It is an add-on module for forks.pm that moves all shared variable data
into separate, high-performance BerkeleyDB databases. This means *all*
your shared data will be stored on and accessed from the physical drive,
with very efficient shared memory caching of commonly accessed data via
the BerkeleyDB shared memory architecture. In your case, it sounds like
this should result in huge memory savings, allowing you to run many more
threads.

You could further optimize performance by setting up a RAM disk to
logically remap all this data back into memory:

http://search.cpan.org/~rybskej/forks-BerkeleyDB-0.03/lib/forks/BerkeleyDB/shared.pm#Location_of_database_files

This may seem like an odd workaround, except that you now have an
ithreads application that benefits from fork() copy-on-write but still
keeps all shared data in RAM. I've actually done this for one
forks::BerkeleyDB real-time data processing application that required
fast access to cached data stored in nested forks::BerkeleyDB::shared
hashes.

forks::BerkeleyDB also relieves forks::shared of its heaviest IPC socket
load (shared data access), so it massively improves overall
forks::shared performance under medium to high volume shared data
access (to a level surprisingly comparable to native threads::shared,
in some preliminary analysis I've conducted).

> Another thing to note is that I do not use locking because this is an
> analysis and I do not write to my data.
>

Both forks::shared and forks::BerkeleyDB::shared support safe,
concurrent reads of shared data. (I'm pretty certain this is also safe
with native threads::shared.) In general, if you modify any shared data
while other threads read it, you must lock before all associated reads
to ensure the data state is consistent; otherwise, skipping locks is
"at your own risk" (and may even affect application stability with
native threads::shared).

Regards,
Eric
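
P.S. A minimal sketch of the read-only setup described above, under a
few assumptions. It assumes forks::BerkeleyDB is installed and that the
module picks its database directory from $ENV{TMPDIR}, as I read the
"Location of database files" section linked above. The /dev/shm path,
the worker count, and the placeholder data are all hypothetical; use
whatever RAM disk mount and data loading you actually have.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: the database location is taken from TMPDIR, so it is set
# before the modules load. /dev/shm is a hypothetical RAM-backed path.
BEGIN { $ENV{TMPDIR} = '/dev/shm' if -d '/dev/shm' && -w _; }

use forks::BerkeleyDB;          # fork()-based replacement for threads
use forks::BerkeleyDB::shared;  # shared variables backed by BerkeleyDB

my %data : shared;              # loaded once, then only read

# Placeholder data; a real application would load its data set here.
$data{$_} = $_ * 2 for 1 .. 1000;

my $workers = 4;                # hypothetical worker count

my @threads = map {
    my $id = $_;
    threads->new(sub {
        my $sum = 0;
        # Concurrent reads only, so no lock() is taken here.
        for my $key (keys %data) {
            $sum += $data{$key} if $key % $workers == $id;
        }
        return $sum;            # handed back to the parent via join()
    });
} 0 .. $workers - 1;

# Collect each worker's partial result in the parent.
my $total = 0;
$total += $_->join for @threads;
print "total: $total\n";
```

Returning results through join() keeps the workers read-only on the
shared hash. If workers ever write to shared data, the lock() advice
above applies to every associated read as well.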
