Excellent feedback. Thank you. Please do keep in mind I'm storing the
results of SNMP queries. The majority of the time each thread is in a wait
state, listening on a UDP port for a return packet. The number of threads is
high because, to sustain poll speed, I need to minimize the impact of
timeouts and of all this waiting for return packets.
I had intended to have a fallback plan which would build a thread-safe queue
for the db work, but the application isn't currently architected that way.
It's not completely built yet, so now is the time for change. I hadn't
thought of building up a batch of queries and creating a transaction from
them.
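A minimal sketch of the queue-plus-batch idea, assuming a single writer
thread fed by the polling threads. sqlite3 stands in for a PostgreSQL
driver only so the example is self-contained; the table modem_state, its
columns, BATCH_SIZE and db_writer are hypothetical names, not part of the
application.

```python
import queue
import sqlite3
import threading

BATCH_SIZE = 500   # hypothetical cap on updates per transaction
STOP = object()    # sentinel that tells the writer to shut down


def db_writer(updates, db_path):
    """Single DB thread: drain the queue, commit one transaction per batch."""
    conn = sqlite3.connect(db_path)  # stand-in for a PostgreSQL connection
    done = False
    while not done:
        batch = [updates.get()]      # block until at least one item arrives
        while len(batch) < BATCH_SIZE:
            try:
                batch.append(updates.get_nowait())
            except queue.Empty:
                break                # queue is empty, flush what we have
        if any(item is STOP for item in batch):
            done = True
            batch = [item for item in batch if item is not STOP]
        if batch:
            with conn:               # commits on success, rolls back on error
                conn.executemany(
                    "UPDATE modem_state SET in_octets = ?, out_octets = ? "
                    "WHERE modem_id = ?",
                    batch,
                )
    conn.close()
```

Polling threads would call updates.put((in_octets, out_octets, modem_id))
and never touch the database themselves; putting STOP on the queue lets the
writer flush its last batch and exit.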
I've been looking into memcached as a persistent object store as well and
hadn't seen the reactor pattern yet. Still trying to get my puny brain
around that one.
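For reference, a rough sketch of the event-driven shape, assuming one
thread, one non-blocking UDP socket, and Python's standard selectors
module. This is not real SNMP: the payload is opaque bytes and replies are
matched by source address rather than by request ID, and poll_all is a
hypothetical name.

```python
import selectors
import socket
import time


def poll_all(targets, payload, timeout=2.0):
    """Send payload to every (host, port) target; gather replies in one thread.

    Returns (replies, timed_out): replies maps address -> bytes received,
    timed_out is the set of addresses that never answered.
    """
    sel = selectors.DefaultSelector()
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)
    sel.register(sock, selectors.EVENT_READ)

    pending = set(targets)
    replies = {}
    # Fire off all requests up front. A real poller would pace these and
    # handle BlockingIOError if the send buffer fills.
    for addr in targets:
        sock.sendto(payload, addr)

    deadline = time.monotonic() + timeout
    while pending:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break                    # one shared timeout for every target
        for _key, _events in sel.select(remaining):
            while True:              # drain every datagram that is ready
                try:
                    data, addr = sock.recvfrom(65535)
                except BlockingIOError:
                    break
                if addr in pending:
                    pending.discard(addr)
                    replies[addr] = data
    sel.close()
    sock.close()
    return replies, pending
```

The point of the shape is that a slow or dead device costs nothing but an
entry in the pending set; no thread sits blocked waiting on it.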
Again, thanks for the help.
On 8/19/05 5:09 AM, "Bob Ippolito" <[EMAIL PROTECTED]> wrote:
> On Aug 18, 2005, at 10:24 PM, Mark Cotner wrote:
>> I'm currently working on an application that will poll
>> thousands of cable modems per minute and I would like
>> to use PostgreSQL to maintain state between polls of
>> each device. This requires a very heavy amount of
>> updates in place on a reasonably large table (100k-500k
>> rows, ~7 columns mostly integers/bigint). Each row
>> will be refreshed every 15 minutes, or at least that's
>> how fast I can poll via SNMP. I hope I can tune the
>> DB to keep up.
>> The app is threaded and will likely have well over 100
>> concurrent db connections. Temp tables for storage
>> aren't a preferred option since this is designed to be
>> a shared nothing approach and I will likely have
>> several polling processes.
> Somewhat OT, but..
> The easiest way to speed that up is to use fewer threads. You're
> adding a whole TON of overhead with that many threads that you just
> don't want or need. You should probably be using something event-
> driven to solve this problem, with just a few database threads to
> store all that state. Less is definitely more in this case. See
> <http://www.kegel.com/c10k.html> (and there's plenty of other
> literature out there saying that event driven is an extremely good
> way to do this sort of thing).
> Here are some frameworks to look at for this kind of network code:
> (Python) Twisted - <http://twistedmatrix.com/>
> (Perl) POE - <http://poe.perl.org/>
> (Java) java.nio (not familiar enough with the Java thing to know
> whether or not there's a high-level wrapper)
> (C++) ACE - <http://www.cs.wustl.edu/~schmidt/ACE.html>
> (Ruby) IO::Reactor - <http://www.deveiate.org/code/IO-Reactor.html>
> (C) libevent - <http://monkey.org/~provos/libevent/>
> .. and of course, you have select/poll/kqueue/WaitNextEvent/whatever
> that you could use directly, if you wanted to roll your own solution,
> but don't do that.
> If you don't want to optimize the whole application, I'd at least
> just push the DB operations down to a very small number of
> connections (*one* might even be optimal!), waiting on some kind of
> thread-safe queue for updates from the rest of the system. This way
> you can easily batch those updates into transactions and you won't be
> putting so much unnecessary synchronization overhead into your
> application and the database.
> Generally, once you have more worker threads (or processes) than
> CPUs, you're going to get diminishing returns in a bad way, assuming
> those threads are making good use of their time.