On Aug 18, 2005, at 10:24 PM, Mark Cotner wrote:

I'm currently working on an application that will poll
thousands of cable modems per minute and I would like
to use PostgreSQL to maintain state between polls of
each device.  This requires a very heavy amount of
updates in place on a reasonably large table(100k-500k
rows, ~7 columns mostly integers/bigint).  Each row
will be refreshed every 15 minutes, or at least that's
how fast I can poll via SNMP.  I hope I can tune the
DB to keep up.

The app is threaded and will likely have well over 100
concurrent db connections.  Temp tables for storage
aren't a preferred option since this is designed to be
a shared nothing approach and I will likely have
several polling processes.

Somewhat OT, but..

The easiest way to speed that up is to use fewer threads. You're adding a whole TON of overhead with that many threads that you just don't want or need. You should probably be using something event-driven to solve this problem, with just a few database threads to store all that state. Less is definitely more in this case. See <http://www.kegel.com/c10k.html> (and there's plenty of other literature out there saying that event-driven is an extremely good way to do this sort of thing).

Here are some frameworks to look at for this kind of network code:
(Python) Twisted - <http://twistedmatrix.com/>
(Perl) POE - <http://poe.perl.org/>
(Java) java.nio (not familiar enough with the Java thing to know whether or not there's a high-level wrapper)
(C++) ACE - <http://www.cs.wustl.edu/~schmidt/ACE.html>
(Ruby) IO::Reactor - <http://www.deveiate.org/code/IO-Reactor.html>
(C) libevent - <http://monkey.org/~provos/libevent/>

.. and of course, you have select/poll/kqueue/WaitNextEvent/whatever that you could use directly, if you wanted to roll your own solution, but don't do that.
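
To give a rough idea of the event-driven shape, here is a minimal sketch using Python's standard asyncio module (not one of the frameworks listed above, just something that ships with the language). The poll_modem function and its fake counter value are made up for illustration, and asyncio.sleep stands in for the SNMP round trip; a real poller would need an async-capable SNMP library there. One thread keeps many polls in flight, capped by a semaphore:

```python
import asyncio


async def poll_modem(modem_id, limit):
    """Hypothetical poll of one modem; the sleep stands in for SNMP I/O."""
    async with limit:              # cap the number of polls in flight
        await asyncio.sleep(0.01)  # assumption: replace with a real async SNMP call
        return modem_id, modem_id * 10  # fake (modem_id, counter) result


async def poll_all(modem_ids, max_in_flight=100):
    """Poll every modem concurrently from a single thread."""
    limit = asyncio.Semaphore(max_in_flight)
    tasks = (poll_modem(m, limit) for m in modem_ids)
    # gather() returns results in the same order as the inputs
    return await asyncio.gather(*tasks)


def run_poll(n=1000):
    """Convenience wrapper: run the event loop until all polls finish."""
    return asyncio.run(poll_all(range(n)))
```

The max_in_flight value is a guess, not a recommendation; it would need tuning against what the network and the modems can actually take.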

If you don't want to optimize the whole application, I'd at least just push the DB operations down to a very small number of connections (*one* might even be optimal!), waiting on some kind of thread-safe queue for updates from the rest of the system. This way you can easily batch those updates into transactions and you won't be putting so much unnecessary synchronization overhead into your application and the database.
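
A minimal sketch of that single-writer idea, in Python. Everything here is illustrative: the modem_state table, its columns, and the batch_size/max_wait values are assumptions, and the standard library's sqlite3 stands in for PostgreSQL so the example runs on its own. With a real PostgreSQL driver the structure would be the same: poller threads put tuples on a thread-safe queue, and one writer thread drains it and applies each batch in a single transaction.

```python
import queue
import sqlite3
import threading

STOP = object()  # sentinel: tells the writer thread to flush and exit


def db_writer(conn, updates, batch_size=500, max_wait=0.5):
    """Drain `updates` and apply each batch in one transaction."""
    done = False
    while not done:
        batch = []
        try:
            # Block for the first item, then take whatever else is already queued.
            item = updates.get(timeout=max_wait)
            while True:
                if item is STOP:
                    done = True
                    break
                batch.append(item)
                if len(batch) >= batch_size:
                    break
                item = updates.get_nowait()
        except queue.Empty:
            pass  # queue drained (or nothing arrived in time); flush what we have
        if batch:
            with conn:  # commits on success, rolls back on error
                conn.executemany(
                    "UPDATE modem_state SET rx_bytes = ?, tx_bytes = ? "
                    "WHERE modem_id = ?",
                    batch,
                )


def run_demo(n_modems=1000, n_pollers=4):
    """Fake pollers feed one writer; returns the number of rows updated correctly."""
    # check_same_thread=False: only the writer touches conn while it is running.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute(
        "CREATE TABLE modem_state "
        "(modem_id INTEGER PRIMARY KEY, rx_bytes INTEGER, tx_bytes INTEGER)"
    )
    with conn:
        conn.executemany(
            "INSERT INTO modem_state VALUES (?, -1, -1)",
            [(i,) for i in range(n_modems)],
        )

    updates = queue.Queue()
    writer = threading.Thread(target=db_writer, args=(conn, updates))
    writer.start()

    def poller(ids):
        for modem_id in ids:
            # Fake poll result: (rx_bytes, tx_bytes, modem_id)
            updates.put((modem_id * 10, modem_id * 20, modem_id))

    pollers = [
        threading.Thread(target=poller, args=(range(k, n_modems, n_pollers),))
        for k in range(n_pollers)
    ]
    for p in pollers:
        p.start()
    for p in pollers:
        p.join()
    updates.put(STOP)  # all pollers are done, so nothing can arrive after this
    writer.join()

    row = conn.execute(
        "SELECT COUNT(*) FROM modem_state "
        "WHERE rx_bytes = modem_id * 10 AND tx_bytes = modem_id * 20"
    ).fetchone()
    return row[0]
```

The pollers never touch the database, so the only synchronization left is the queue itself, and the database sees one connection doing a few large transactions rather than a hundred connections doing tiny ones.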

Generally, once you have more worker threads (or processes) than CPUs, you're going to get diminishing returns in a bad way, assuming those threads are making good use of their time.

