> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> >
> > So, if I've got a massively parallel database, and I want to run
> > a batch of queries against it in parallel--let's say five hundred,
> > six at a time, what would be the best way to do that? My first
> > thought was a fork/connect for every query, but that's way too
> > much overhead.
>
> Six connections wouldn't be much. Fork the processes, connect each to
> the database, and use IO::Socket and IO::Select to feed the child
> processes queries. That's what I did with my upd_stats utility (the
> utility is just for Informix databases, but the principle is
> there), at
> http://www.iiug.org/ver2/software/index_DBA.html

I tried something similar with threads in Ruby (because I, too, wanted to avoid the overhead of lots of fork calls). I thought it would be really cool: create a separate thread for each query and run them all in parallel.

What did I discover? At the end of the day, the database engine (in my case, the Oracle server) was the bottleneck, not a lack of forking or threading. This was probably because I couldn't set non-blocking mode without crashing (the Ruby Oracle driver isn't as far along as the Perl version). It's been a while, so I can't remember for sure.

My point is that if your DB engine has a non-blocking mode, you will need to set it, or you will get very little, if any, performance increase. Even then, it may not be the windfall you're expecting.

Just my 2 cents.

Regards,

Mr. Sunblade
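
For what it's worth, here is a rough sketch of the "six workers, five hundred queries" idea from the quoted reply, done with Ruby threads rather than Perl fork plus IO::Select. The names run_query and run_batch are made up for illustration, and run_query is only a stand-in that returns a string. A real version would open one connection per worker and call the database driver there. The SELECT text is a placeholder too.

```ruby
require "thread" # Queue lives here on older Rubies; a no-op on newer ones

# Hypothetical stand-in for a real database call. A real worker would
# execute the SQL on its own long-lived connection instead.
def run_query(conn_id, sql)
  "conn #{conn_id}: #{sql}"
end

# Run all queries using at most pool_size worker threads.
# Returns the results in the same order as the input queries.
def run_batch(queries, pool_size = 6)
  jobs = Queue.new
  queries.each_with_index { |sql, i| jobs << [i, sql] }
  jobs.close # once drained, pop returns nil and the workers stop

  done = Queue.new # thread-safe collection point for results
  workers = Array.new(pool_size) do |conn_id|
    Thread.new do
      # A real worker would connect to the database once, here.
      while (job = jobs.pop)
        index, sql = job
        done << [index, run_query(conn_id, sql)]
      end
    end
  end
  workers.each(&:join)

  # Drain the result queue and restore the original query order.
  Array.new(done.size) { done.pop }.sort_by(&:first).map(&:last)
end

queries = (1..500).map { |n| "SELECT #{n} FROM dual" }
results = run_batch(queries)
puts results.size
```

The pool keeps the connection count fixed at six no matter how many queries are queued, which avoids the fork/connect-per-query overhead. Whether it actually runs faster than one connection still depends on the driver not blocking the whole process during a call, as described above.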
