> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> > 
> > So, if I've got a massively parallel database, and I want to run
> > a batch of queries against it in parallel--let's say five hundred,
> > six at a time, what would be the best way to do that? My first thought
> > was a fork/connect for every query, but that's way too much overhead.
> 
> Six connections wouldn't be much. Fork the processes, connect each to
> the database, and use IO::Socket and IO::Select to feed the child
> processes queries. That's what I did with my upd_stats utility (the
> utility is just for Informix databases, but the principle is there), at
> http://www.iiug.org/ver2/software/index_DBA.html
>

I tried something similar with threads in Ruby (because I, too, wanted to
avoid the overhead of lots of fork calls).  I thought it would be really
cool: create a separate thread for each query and run them all in parallel.
What did I discover?

At the end of the day, the database engine (in my case, the Oracle server)
was the bottleneck, not a lack of forking or threading.  This was probably
because I couldn't set non-blocking mode without crashing (the Ruby Oracle
driver isn't as far along as the Perl version), but it's been a while, so I
can't remember for sure.  My point is that if your DB engine has a
non-blocking mode, you will need to set it, or you will get very little, if
any, performance increase.  Even then, it may not be the windfall you're
expecting.
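
For illustration, here is a minimal Ruby sketch of the "five hundred queries,
six at a time" idea using a fixed pool of threads instead of one thread per
query.  It is not the code from the experiment described above.  run_query
and run_batch are hypothetical names, and run_query only sleeps in place of
a real driver call, so no actual database or Oracle API is shown.  In a real
program each worker would hold its own connection.

```ruby
# Hypothetical stand-in for a real driver call. Each worker would normally
# own one database connection and execute the query on it.
def run_query(worker_id, sql)
  sleep(0.001) # simulate time spent waiting on the server
  "worker #{worker_id} ran: #{sql}"
end

# Run every query, using at most pool_size threads at any one time.
def run_batch(queries, pool_size: 6)
  jobs = Queue.new
  queries.each { |sql| jobs << sql }
  jobs.close # once the queue is empty, pop returns nil and workers exit

  results = Queue.new
  workers = Array.new(pool_size) do |id|
    Thread.new do
      while (sql = jobs.pop)
        results << run_query(id, sql)
      end
    end
  end
  workers.each(&:join)

  # Drain the results queue into a plain array (completion order, not input order).
  Array.new(results.size) { results.pop }
end

queries = Array.new(500) { |i| "SELECT #{i} FROM dual" }
results = run_batch(queries)
puts "completed #{results.size} queries"
```

The sleep call lets other threads run while one is waiting, which is the
behavior a non-blocking driver would give you.  If the driver call blocks the
whole interpreter instead, the six workers simply take turns, and the pool
buys you very little.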

Just my 2 cents.

Regards,

Mr. Sunblade



 
