On Aug 18, 2005, at 10:24 PM, Mark Cotner wrote:
> I'm currently working on an application that will poll
> thousands of cable modems per minute and I would like
> to use PostgreSQL to maintain state between polls of
> each device. This requires a very heavy amount of
> updates in place on a reasonably large table (100k-500k
> rows, ~7 columns mostly integers/bigint). Each row
> will be refreshed every 15 minutes, or at least that's
> how fast I can poll via SNMP. I hope I can tune the
> DB to keep up.
> The app is threaded and will likely have well over 100
> concurrent db connections. Temp tables for storage
> aren't a preferred option since this is designed to be
> a shared nothing approach and I will likely have
> several polling processes.
Somewhat OT, but..
The easiest way to speed that up is to use fewer threads. You're
adding a whole TON of overhead with that many threads that you just
don't want or need. You should probably be using something event-
driven to solve this problem, with just a few database threads to
store all that state. Less is definitely more in this case. See
<http://www.kegel.com/c10k.html> (and there's plenty of other
literature out there saying that event driven is an extremely good
way to do this sort of thing).
Here are some frameworks to look at for this kind of network code:
(Python) Twisted - <http://twistedmatrix.com/>
(Perl) POE - <http://poe.perl.org/>
(Java) java.nio (not familiar enough with the Java thing to know
whether or not there's a high-level wrapper)
(C++) ACE - <http://www.cs.wustl.edu/~schmidt/ACE.html>
(Ruby) IO::Reactor - <http://www.deveiate.org/code/IO-Reactor.html>
(C) libevent - <http://monkey.org/~provos/libevent/>
.. and of course, you have select/poll/kqueue/WaitNextEvent/whatever
that you could use directly, if you wanted to roll your own solution,
but don't do that.
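To make the event-driven idea concrete, here is a minimal sketch in
Python. It uses the standard library's asyncio rather than any of the
frameworks listed above, purely so that it runs without extra
installs. The names poll_modem and poll_all, the max_in_flight limit,
and the simulated SNMP round trip (a short sleep returning a random
counter) are all assumptions made for illustration, not part of any
real poller.

```python
import asyncio
import random


async def poll_modem(modem_id):
    # Hypothetical stand-in for a non-blocking SNMP request. A real poller
    # would await network I/O here instead of sleeping.
    await asyncio.sleep(random.uniform(0.001, 0.005))
    return modem_id, random.randint(0, 1000)


async def poll_all(modem_ids, max_in_flight=200):
    # One thread, many outstanding polls. The semaphore caps how many
    # requests are in flight at once.
    limit = asyncio.Semaphore(max_in_flight)

    async def bounded(modem_id):
        async with limit:
            return await poll_modem(modem_id)

    # gather() returns results in the same order as its arguments.
    return await asyncio.gather(*(bounded(m) for m in modem_ids))


if __name__ == "__main__":
    results = asyncio.run(poll_all(range(1000)))
    print(len(results), "modems polled from a single thread")
```

The point of the sketch is only that a thousand polls can be
outstanding at once without a thousand threads; the waiting happens
in the event loop rather than in blocked threads.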
If you don't want to optimize the whole application, I'd at least
just push the DB operations down to a very small number of
connections (*one* might even be optimal!), waiting on some kind of
thread-safe queue for updates from the rest of the system. This way
you can easily batch those updates into transactions and you won't be
putting so much unnecessary synchronization overhead into your
application and the database.
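A rough sketch of that single-writer arrangement, again in Python.
Everything here is an assumption for illustration: sqlite3 stands in
for PostgreSQL so the example is self-contained, and the table name
modem_state, its two columns, the batch size, and the STOP sentinel
are all made up. Against PostgreSQL the statement would be a plain
UPDATE issued through whatever driver you use.

```python
import os
import queue
import sqlite3
import tempfile
import threading

STOP = object()  # hypothetical sentinel: tells the writer to flush and exit


def db_writer(updates, db_path, batch_size=500):
    """Sole owner of the database connection; drains the queue in batches."""
    conn = sqlite3.connect(db_path)  # stand-in for a PostgreSQL connection
    conn.execute("CREATE TABLE IF NOT EXISTS modem_state "
                 "(modem_id INTEGER PRIMARY KEY, rx_bytes INTEGER)")
    done = False
    while not done:
        batch = [updates.get()]            # block until there is work
        while len(batch) < batch_size:     # then take whatever else is queued
            try:
                batch.append(updates.get_nowait())
            except queue.Empty:
                break
        rows = [item for item in batch if item is not STOP]
        done = len(rows) != len(batch)     # a STOP was in this batch
        with conn:                         # one transaction per batch
            # INSERT OR REPLACE is SQLite syntax, used here to keep the demo short.
            conn.executemany(
                "INSERT OR REPLACE INTO modem_state (modem_id, rx_bytes) "
                "VALUES (?, ?)", rows)
    conn.close()


def demo(n_pollers=4, per_poller=250):
    updates = queue.Queue()                # thread-safe hand-off to the writer
    fd, db_path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    writer = threading.Thread(target=db_writer, args=(updates, db_path))
    writer.start()

    def poller(start):
        # Pollers never touch the database; they only enqueue results.
        for modem_id in range(start, start + per_poller):
            updates.put((modem_id, modem_id * 10))

    pollers = [threading.Thread(target=poller, args=(i * per_poller,))
               for i in range(n_pollers)]
    for t in pollers:
        t.start()
    for t in pollers:
        t.join()
    updates.put(STOP)                      # all polls queued; ask writer to finish
    writer.join()

    conn = sqlite3.connect(db_path)
    count = conn.execute("SELECT COUNT(*) FROM modem_state").fetchone()[0]
    conn.close()
    os.remove(db_path)
    return count
```

Calling demo() runs four polling threads against one writer and
returns the number of rows stored. The pollers only ever touch the
queue, so the database sees one connection and one transaction per
batch instead of a hundred connections each committing single rows.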
Generally, once you have more worker threads (or processes) than
CPUs, you're going to get diminishing returns in a bad way, assuming
those threads are making good use of their time.