Excellent feedback. Thank you. Please do keep in mind I'm storing the results of SNMP queries. The majority of the time each thread is in a wait state, listening on a UDP port for a return packet. The number of threads is high because, in order to sustain poll speed, I need to minimize the impact of timeouts and all this waiting for return packets.
I had intended to have a fallback plan which would build a thread safe queue for db stuffs, but the application isn't currently architected that way. It's not completely built yet so now is the time for change. I hadn't thought of building up a batch of queries and creating a transaction from them. I've been looking into memcached as a persistent object store as well and hadn't seen the reactor pattern yet. Still trying to get my puny brain around that one.

Again, thanks for the help.

'njoy,
Mark

On 8/19/05 5:09 AM, "Bob Ippolito" <[EMAIL PROTECTED]> wrote:

>
> On Aug 18, 2005, at 10:24 PM, Mark Cotner wrote:
>
>> I'm currently working on an application that will poll
>> thousands of cable modems per minute and I would like
>> to use PostgreSQL to maintain state between polls of
>> each device. This requires a very heavy amount of
>> updates in place on a reasonably large table(100k-500k
>> rows, ~7 columns mostly integers/bigint). Each row
>> will be refreshed every 15 minutes, or at least that's
>> how fast I can poll via SNMP. I hope I can tune the
>> DB to keep up.
>>
>> The app is threaded and will likely have well over 100
>> concurrent db connections. Temp tables for storage
>> aren't a preferred option since this is designed to be
>> a shared nothing approach and I will likely have
>> several polling processes.
>
> Somewhat OT, but..
>
> The easiest way to speed that up is to use less threads. You're
> adding a whole TON of overhead with that many threads that you just
> don't want or need. You should probably be using something
> event-driven to solve this problem, with just a few database threads to
> store all that state. Less is definitely more in this case. See
> <http://www.kegel.com/c10k.html> (and there's plenty of other
> literature out there saying that event driven is an extremely good
> way to do this sort of thing).
>
> Here are some frameworks to look at for this kind of network code:
> (Python) Twisted - <http://twistedmatrix.com/>
> (Perl) POE - <http://poe.perl.org/>
> (Java) java.nio (not familiar enough with the Java thing to know
> whether or not there's a high-level wrapper)
> (C++) ACE - <http://www.cs.wustl.edu/~schmidt/ACE.html>
> (Ruby) IO::Reactor - <http://www.deveiate.org/code/IO-Reactor.html>
> (C) libevent - <http://monkey.org/~provos/libevent/>
>
> .. and of course, you have select/poll/kqueue/WaitNextEvent/whatever
> that you could use directly, if you wanted to roll your own solution,
> but don't do that.
>
> If you don't want to optimize the whole application, I'd at least
> just push the DB operations down to a very small number of
> connections (*one* might even be optimal!), waiting on some kind of
> thread-safe queue for updates from the rest of the system. This way
> you can easily batch those updates into transactions and you won't be
> putting so much unnecessary synchronization overhead into your
> application and the database.
>
> Generally, once you have more worker threads (or processes) than
> CPUs, you're going to get diminishing returns in a bad way, assuming
> those threads are making good use of their time.
>
> -bob
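
For anyone following the thread, the single-writer idea discussed above (pollers push results onto a thread-safe queue, one thread owns the database connection and commits updates in batches) might look roughly like the sketch below. This is only an illustration under several assumptions, not the application's actual design: the stdlib sqlite3 module stands in for PostgreSQL so the example runs anywhere, the modem_state table and its columns are made up, the poller fakes its SNMP results, and BATCH_SIZE and FLUSH_INTERVAL are arbitrary values.

```python
import os
import queue
import sqlite3
import tempfile
import threading

BATCH_SIZE = 500        # hypothetical: max rows per transaction
FLUSH_INTERVAL = 0.5    # hypothetical: seconds to wait before flushing a partial batch
_STOP = object()        # sentinel that tells the writer to shut down


def db_writer(db_path, updates):
    """Single writer: drain the queue and commit updates in batches."""
    # sqlite3 is a stand-in for PostgreSQL; the table layout is hypothetical.
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS modem_state ("
        "modem_id INTEGER PRIMARY KEY, in_octets INTEGER, out_octets INTEGER)"
    )
    conn.commit()
    running = True
    while running:
        batch = []
        item = updates.get()  # block until there is at least one update
        while True:
            if item is _STOP:
                running = False
                break
            batch.append(item)
            if len(batch) >= BATCH_SIZE:
                break
            try:
                item = updates.get(timeout=FLUSH_INTERVAL)
            except queue.Empty:
                break  # queue went quiet: flush what we have
        if batch:
            with conn:  # one transaction per batch, committed on exit
                conn.executemany(
                    "INSERT OR REPLACE INTO modem_state VALUES (?, ?, ?)", batch
                )
    conn.close()


def poller(modem_ids, updates):
    """Pretend to poll each modem; a real poller would do SNMP here."""
    for modem_id in modem_ids:
        updates.put((modem_id, modem_id * 10, modem_id * 20))  # fake counters


def run_demo(n_pollers=4, modems_per_poller=250):
    """Run several pollers against one writer and return the row count."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "state.db")
        updates = queue.Queue()
        writer = threading.Thread(target=db_writer, args=(db_path, updates))
        writer.start()
        pollers = [
            threading.Thread(
                target=poller,
                args=(range(i * modems_per_poller, (i + 1) * modems_per_poller),
                      updates),
            )
            for i in range(n_pollers)
        ]
        for p in pollers:
            p.start()
        for p in pollers:
            p.join()
        updates.put(_STOP)  # all pollers are done, so this is the last item
        writer.join()
        conn = sqlite3.connect(db_path)
        count = conn.execute("SELECT COUNT(*) FROM modem_state").fetchone()[0]
        conn.close()
        return count


if __name__ == "__main__":
    print(run_demo())
```

The pollers never touch the database, so the number of polling threads and the number of database connections can be tuned independently. With PostgreSQL the same shape should apply, with the writer holding the one connection and the upsert statement adjusted to whatever the real schema needs.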