Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-22 Thread Mark Cotner
Thanks again everyone for the excellent suggestions.

I looked into IO::Reactor, but after a few hours of fiddling I decided I was
getting the kind of performance I wanted from a slightly-more-than-modest
number of threads, and (due to dev timelines) I'll come back later to patching
the Ruby SNMP libraries to do async I/O with the Reactor.

I am unfortunately stuck with updates, but I think (with your suggestions)
I've made it work for me.

MySQL = 1500 updates/sec
PostgreSQL w/10k updates per commit, single thread = 1400 updates/sec
Given the update-heavy nature of this table I felt it was necessary to test
during a vacuum.  Turns out the hit wasn't that bad . . .
PostgreSQL w/10k updates per commit, single thread, during a vacuum = 1300
updates/sec
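
For anyone curious what the batching looks like, it's nothing fancy: the
updater wraps a pile of UPDATEs in one transaction and commits every ~10k
statements.  A rough sketch (untested; assumes the Ruby 'pg' driver and an
invented modem_state table):

    require 'pg'

    BATCH_SIZE = 10_000
    poll_results = [[1001, -3], [1002, -7]]   # stand-in for one SNMP sweep's results

    conn = PG.connect(dbname: 'netmon')       # hypothetical database name
    conn.prepare('upd',
      'UPDATE modem_state SET rx_power = $1, last_poll = now() WHERE modem_id = $2')

    pending = 0
    conn.exec('BEGIN')
    poll_results.each do |modem_id, rx_power|
      conn.exec_prepared('upd', [rx_power, modem_id])
      pending += 1
      if pending >= BATCH_SIZE                # commit every ~10k updates
        conn.exec('COMMIT')
        conn.exec('BEGIN')
        pending = 0
      end
    end
    conn.exec('COMMIT')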

100-200 updates/sec is a small price to pay for mature stored procedures,
more stored procedure language options, ACID compliance, MVCC, very few if
any corrupt tables (I get about 2 a week from MySQL on the 40 DBs I manage),
a more crash-resistant DB (about one crash a month on one of my 40 MySQL DBs),
and replication that actually works for more than a day before quitting for
no apparent reason ;) [/flame off]

For those of you with Cox Communications cable modems: look forward to better
customer service and cable plant management.  :)

And if anyone's curious here's the app I'm rebuilding/updating
http://www.mysql.com/customers/customer.php?id=16
We were runner-up behind the Sabre airline reservation system for MySQL app of
the year.  Needless to say, they weren't too happy when they heard we might
be switching DBs.

'njoy,
Mark

On 8/19/05 1:12 PM, "J. Andrew Rogers" <[EMAIL PROTECTED]> wrote:

> On 8/19/05 1:24 AM, "Mark Cotner" <[EMAIL PROTECTED]> wrote:
>> I'm currently working on an application that will poll
>> thousands of cable modems per minute and I would like
>> to use PostgreSQL to maintain state between polls of
>> each device.  This requires a very heavy amount of
>> updates in place on a reasonably large table (100k-500k
>> rows, ~7 columns mostly integers/bigint).  Each row
>> will be refreshed every 15 minutes, or at least that's
>> how fast I can poll via SNMP.  I hope I can tune the
>> DB to keep up.
>> 
>> The app is threaded and will likely have well over 100
>> concurrent db connections.  Temp tables for storage
>> aren't a preferred option since this is designed to be
>> a shared nothing approach and I will likely have
>> several polling processes.
> 
> 
> Mark,
> 
> We have PostgreSQL databases on modest hardware doing exactly what you are
> attempting to do (a massively scalable SNMP monitoring system).  The monitoring
> volume for a single database server appears to exceed what you are trying to
> do by a few orders of magnitude with no scaling or performance issues, so I
> can state without reservation that PostgreSQL can easily handle your
> application in theory.
> 
> However, that is predicated on having a well-architected system that
> minimizes resource contention and unnecessary blocking, and based on your
> description you may be going about it a bit wrong.
> 
> The biggest obvious bottleneck is the use of threads and massive
> process-level parallelization.  As others have pointed out, async queues are
> your friends, as is partitioning the workload horizontally rather than
> vertically through the app stack.  A very scalable high-throughput engine
> for SNMP polling only requires two or three threads handling different parts
> of the workload to saturate the network, and by choosing what each thread
> does carefully you can all but eliminate blocking when there is work to be
> done.
> 
> We only use a single database connection to insert all the data into
> PostgreSQL, and that process/thread receives its data from a work queue.
> Depending on how you design your system, you can batch many records in your
> queue as a single transaction.  In our case, we also use very few updates,
> mostly just inserts, which is probably advantageous in terms of throughput
> if you have the disk for it.  The insert I/O load is easily handled, and our
> disk array is a modest 10k SCSI rig.  The only thing that really hammers the
> server is when multiple reporting processes are running, which frequently
> touch several million rows each (the database is much larger than the system
> memory), and even this is manageable with clever database design.
> 
> 
> In short, what you are trying to do is easily doable on PostgreSQL in
> theory.  However, restrictions on design choices may pose significant
> hurdles.  We did not start out with an ideal system either; it took a fair
> amount of re-engineering to solve all the bottlenecks and problems that pop
> up.
> 
> Good luck,
> 
> J. Andrew Rogers
> [EMAIL PROTECTED]
> 
> 
> 




Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-19 Thread Mark Cotner
:)  Most of the people on this list are systems programmers; I am not.
The tool of choice for this app is Ruby, and the libraries don't support
async SNMP at the moment.

I've done a good deal of async SNMP and the libraries that actually pull it
off generally aren't that good (Net-SNMP and Perl's Net::SNMP).  Granted, UDP
is connectionless to an extent, but you still have to send the PDU, and bind
to the return socket and wait.  If you batch the outgoing PDUs then you can
get away with sending them out synchronously and listening on the returning
socket synchronously, but it would require that your libraries support this.
I understand the concepts well enough, maybe I'll put together a patch.  It
would be much lower overhead than managing all those threads.  Looks like
it's gonna be a fun weekend.
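
The shape I have in mind is roughly this (sketch only; encode_get_pdu and
decode_pdu are placeholders for whatever BER encoder the SNMP library ends up
exposing, not real APIs):

    require 'socket'

    TIMEOUT = 2   # seconds to keep waiting for stragglers
    modems = { '10.0.0.1' => ['1.3.6.1.2.1.1.3.0'] }   # ip => oids, stand-in data

    sock = UDPSocket.new
    sock.bind('0.0.0.0', 0)        # one return socket for the whole batch

    # Send all the GET requests synchronously.
    modems.each do |ip, oids|
      sock.send(encode_get_pdu(oids), 0, ip, 161)
    end

    # Then sit on the one socket and collect whatever comes back.
    responses = {}
    while IO.select([sock], nil, nil, TIMEOUT)
      data, sender = sock.recvfrom(65_535)
      responses[sender[3]] = decode_pdu(data)   # sender[3] is the peer's IP
    end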

Thanks again for all the great feedback.

'njoy,
Mark


On 8/19/05 2:11 PM, "PFC" <[EMAIL PROTECTED]> wrote:

> 
>> While I agree that hundreds of threads seems like overkill, I think the
>> above advice might be going too far in the other direction.  The problem
>> with single-threaded operation is that any delay affects the whole
>> system --- eg, if you're blocked waiting for disk I/O, the CPU doesn't
> 
> You use UDP which is a connectionless protocol... then why use threads?
> 
> I'd advise this:
> 
> Use asynchronous network code (one thread) to do your network stuff. This
> will lower the CPU used by this code immensely.
> Every minute, dump a file containing everything to insert into the table.
> Use another thread to COPY it into the DB, in a temporary table if you
> wish, and then INSERT INTO ... SELECT.
> This should be well adapted to your requirements.
> 





Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-19 Thread Ron

At 03:31 PM 8/19/2005, Alex Turner wrote:

Don't forget that Ultra 320 is the speed of the bus, not each drive.
No matter how many honking 15k disks you put on a 320MB/s bus, you can
only get 320MB/sec, and only so many outstanding I/Os, on the bus.


Of course.  This is exactly why multi-channel SCSI and multichannel 
Fibre Channel cards exist; and why external RAID enclosures usually 
have multiple such cards in them...


Even moderately acceptable U320 SCSI cards are dual channel at this 
point (think Adaptec dual channel AHA's), and Quad channel ones 
are just as common.  The Quads will, of course, saturate a 64b 133MHz 
PCI-X bus.  _IF_ the chipset on them can keep up.


The current kings of RAID card performance are all Fibre Channel 
based, and all the ones I know of are theoretically capable of 
saturating a 64b 133MHz PCI-X bus.  Again, _IF_ the chipset on them 
can keep up.


Most commodity RAID cards have neither adequate CPU nor enough 
buffer, regardless of the peripheral IO technology they use.




Not so with SATA! Each drive is on its own bus, and you are only
limited by the speed of your PCI-X bus, which can be as high as
800MB/sec at 133MHz/64bit.


That's the Theory anyway, and latency should be lower as well.  OTOH, 
as my wife likes to say "In theory, Theory and Practice are the 
same.  In practice, they almost never are."


You are only getting the performance you mention as long as your card 
can keep up with multiplexing N IO streams, crunching RAID 5 XORs 
(assuming you are using RAID 5), etc, etc.  As I'm sure you know, 
"The chain is only as strong as its weakest link.".


Most commodity SATA RAID cards brag about being able to pump 300MB/s 
(they were all over LW SF bragging about this!?), which in this 
context is woefully unimpressive.  Sigh.


I'm impressed with the Areca cards because they usually have CPUs 
that actually can come close to pushing the theoretical IO limit of 
the bus they are plugged into; and they can be upgraded to (barely) 
acceptable buffer amounts (come on, manufacturers! 4GB of DDR 
PC3200 is only -2- DIMMs, and shortly two DIMMs will be enough to hold 8GB 
of DDR PC3200.  Give us more buffer!).



It's cheap and it's fast - all you have to do is pay for the 
enclosure, which can be a bit pricey, but there are some nice 24-bay 
and even 40-bay enclosures out there for SATA.


I've even seen 48-bay ones.  However, good enclosures, particularly 
for larger numbers of HDs, are examples of non-trivial engineering 
and priced accordingly.  Too many times I see people buy "bargain" 
enclosures and set themselves and their organizations up for some 
_very_ unpleasant times that could easily have been avoided by being 
careful to buy quality products.  "Pay when you buy or pay much more later."




Yes, a 15k RPM drive will give you better seek time and better peak
throughput, but put them all on a single U320 bus and you won't see
much return past a stripe size of 3 or 4.


Agreed.  Same holds for 2Gbps FC.  Haven't tested 4Gbps FC personally 
yet, but I'm told the limit is higher in the manner you'd expect.




If it's raw transactions per second, it's all about the xlog, baby, which
is sequential writes; if it's data-warehouse style, it's all about large
block reads, which are sequential reads.

Alex Turner
NetEconomist
P.S. Sorry if I'm a bit punchy, I've been up since yesterday with
server upgrade nightmares that continue ;)


My condolences and sympathies.  I've definitely been there and done that.

Ron Peacetree





Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-19 Thread Alex Turner
Don't forget that Ultra 320 is the speed of the bus, not each drive. 
No matter how many honking 15k disks you put on a 320MB/s bus, you can
only get 320MB/sec, and only so many outstanding I/Os, on the bus.

Not so with SATA! Each drive is on its own bus, and you are only
limited by the speed of your PCI-X bus, which can be as high as
800MB/sec at 133MHz/64bit.

It's cheap and it's fast - all you have to do is pay for the
enclosure, which can be a bit pricey, but there are some nice 24-bay
and even 40-bay enclosures out there for SATA.

Yes, a 15k RPM drive will give you better seek time and better peak
throughput, but put them all on a single U320 bus and you won't see
much return past a stripe size of 3 or 4.

If it's raw transactions per second, it's all about the xlog, baby, which
is sequential writes; if it's data-warehouse style, it's all about large
block reads, which are sequential reads.

Alex Turner
NetEconomist
P.S. Sorry if I'm a bit punchy, I've been up since yesterday with
server upgrade nightmares that continue ;)

On 8/19/05, Ron <[EMAIL PROTECTED]> wrote:
> Alex mentions a nice setup, but I'm pretty sure I know how to beat
> that IO subsystem's HW performance by at least 1.5x or 2x.  Possibly
> more.  (No, I do NOT work for any vendor I'm about to discuss.)
> 
> Start by replacing the WD Raptors with Maxtor Atlas 15K II's.
> At 5.5ms average access, 97.4MB/s outer track throughput, 85.9MB/s
> average, and 74.4 MB/s inner track throughput, they have the best
> performance characteristics of any tested shipping HDs I know
> of.  (Supposedly the new SAS versions will _sustain_ ~98MB/s, but
> I'll believe that only if I see it under independent testing).
> In comparison, the numbers on the WD740GD are 8.1ms average access,
> 71.8, 62.9, and 53.9 MB/s outer, average and inner track throughputs
> respectively.
> 
> Be prepared to use as many of them as possible (read: as many you can
> afford) if you want to maximize transaction rates, particularly for
> small transactions like this application seems to be mentioning.
> 
> Next, use a better RAID card.  The TOL enterprise stuff (Xyratex,
> Engino, Dot-hill) is probably too expensive, but in the commodity
> market benchmarks indicate that Areca's 1GB buffer RAID cards
> currently outperform all the other commodity RAID stuff.
> 
> 9 Atlas II's per card in a RAID 5 set, or 16 per card in a RAID 10
> set, should max the RAID card's throughput and come very close to, if
> not attaining, the real world peak bandwidth of the 64b 133MHz PCI-X
> bus they are plugged into.  Say somewhere in the 700-800MB/s range.
> 
> Repeat the above for as many independent PCI-X buses as you have for
> a very fast commodity RAID IO subsystem.
> 
> Two such configured cards used in the same manner as mentioned by
> Alex should easily attain 1.5x - 2x the transaction numbers mentioned
> by Alex unless there's a bottleneck somewhere else in the system design.
> 
> Hope this helps,
> Ron Peacetree
> 
> At 08:40 AM 8/19/2005, Alex Turner wrote:
> >I have managed tx speeds that high from postgresql going even as high
> >as 2500/sec for small tables, but it does require a good RAID
> >controller card (yes I'm even running with fsync on).  I'm using 3ware
> >9500S-8MI with Raptor drives in multiple RAID 10s.  The box wasn't too
> >$$$ at just around $7k.  I have two independent controllers on two
> >independent PCI buses to give max throughput, one with a 6 drive RAID
> >10 and the other with two 4 drive RAID 10s.
> >
> >Alex Turner
> >NetEconomist
> >
> >On 8/19/05, Mark Cotner <[EMAIL PROTECTED]> wrote:
> > > Hi all,
> > > I bet you get tired of the same ole questions over and
> > > over.
> > >
> > > I'm currently working on an application that will poll
> > > thousands of cable modems per minute and I would like
> > > to use PostgreSQL to maintain state between polls of
> > > each device.  This requires a very heavy amount of
> > > updates in place on a reasonably large table (100k-500k
> > > rows, ~7 columns mostly integers/bigint).  Each row
> > > will be refreshed every 15 minutes, or at least that's
> > > how fast I can poll via SNMP.  I hope I can tune the
> > > DB to keep up.
> > >
> > > The app is threaded and will likely have well over 100
> > > concurrent db connections.  Temp tables for storage
> > > aren't a preferred option since this is designed to be
> > > a shared nothing approach and I will likely have
> > > several polling processes.
> > >
> > > Here are some of my assumptions so far . . .
> > >
> > > HUGE WAL
> > > Vacuum hourly if not more often
> > >
> > > I'm getting 1700tx/sec from MySQL and I would REALLY
> > > prefer to use PG.  I don't need to match the number,
> > > just get close.
> > >
> > > Is there a global temp table option?  In memory tables
> > > would be very beneficial in this case.  I could just
> > > flush it to disk occasionally with an insert into blah
> > > select from memory table.
> > >
> > > Any help or creative alternatives would be greatly
> > > appreciated.  :)

Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-19 Thread PFC



While I agree that hundreds of threads seems like overkill, I think the
above advice might be going too far in the other direction.  The problem
with single-threaded operation is that any delay affects the whole
system --- eg, if you're blocked waiting for disk I/O, the CPU doesn't


You use UDP which is a connectionless protocol... then why use threads?

I'd advise this:

	Use asynchronous network code (one thread) to do your network stuff. This  
will lower the CPU used by this code immensely.

	Every minute, dump a file containing everything to insert into the table.
	Use another thread to COPY it into the DB, in a temporary table if you  
wish, and then INSERT INTO ... SELECT.

This should be well adapted to your requirements.
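
Concretely, something along these lines (a sketch only, untested; assumes
the Ruby 'pg' driver, and uses UPDATE ... FROM against an invented
modem_state table instead of the INSERT ... SELECT, since Mark's table is
updated in place):

    require 'pg'

    rows = [[1001, -3], [1002, -7]]        # stand-in for this minute's poll results
    conn = PG.connect(dbname: 'netmon')    # hypothetical database
    conn.exec('CREATE TEMP TABLE poll_staging (modem_id bigint, rx_power int)
               ON COMMIT DELETE ROWS')

    conn.exec('BEGIN')
    conn.copy_data('COPY poll_staging FROM STDIN') do
      rows.each { |id, rx| conn.put_copy_data("#{id}\t#{rx}\n") }
    end
    # Fold the staging rows into the real table in one statement.
    conn.exec('UPDATE modem_state m SET rx_power = s.rx_power, last_poll = now()
               FROM poll_staging s WHERE m.modem_id = s.modem_id')
    conn.exec('COMMIT')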



Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-19 Thread Ron

At 12:34 PM 8/19/2005, Jeffrey W. Baker wrote:

On Fri, 2005-08-19 at 10:54 -0400, Ron wrote:
> Maxtor Atlas 15K II's.

> Areca's 1GB buffer RAID cards

The former are SCSI disks and the latter is an SATA controller.  The
combination would have a transaction rate of approximately 0.


You are evidently thinking of the Areca ARC-11xx controllers (and you 
are certainly right for that HW combination ;-) ).  Those are not the 
only product Areca makes that can be upgraded to a 1GB cache.


Until SAS infrastructure is good enough, U320 SCSI and FC HD's remain 
the top performing HD's realistically available.  At the most 
fundamental, your DBMS is only as good as your HD IO subsystem, and 
your HD IO subsystem is only as good as your HDs.  As others have 
said here, skimping on your HDs is _not_ a good design choice where 
DBMSs are concerned.


As an aside, the Atlas 15K II's are now available in SAS:
http://www.maxtor.com/portal/site/Maxtor/menuitem.ba88f6d7cf664718376049b291346068/?channelpath=/en_us/Products/SCSI%20Hard%20Drives/Atlas%2015K%20Family/Atlas%2015K%20II%20SAS

I haven't seen independent benches on them, so I explicitly 
referenced the U320 Atlas 15K II's known performance numbers 
instead.  As I said, Maxtor is claiming even better for the SAS 
version of the Atlas 15K II.


None of the SAS <-> PCI-X or PCI-E RAID cards I know of are ready for 
mass market yet, although a few are in beta.




I can vouch for the Areca controllers, however.  You can certainly
achieve pgbench transaction rates in the hundreds per second even with
only 5 7200RPM disks and 128MB cache.

Don't forget to buy the battery.


Agreed.

Hope this is helpful,
Ron Peacetree





Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-19 Thread J. Andrew Rogers
On 8/19/05 1:24 AM, "Mark Cotner" <[EMAIL PROTECTED]> wrote:
> I'm currently working on an application that will poll
> thousands of cable modems per minute and I would like
> to use PostgreSQL to maintain state between polls of
> each device.  This requires a very heavy amount of
> updates in place on a reasonably large table (100k-500k
> rows, ~7 columns mostly integers/bigint).  Each row
> will be refreshed every 15 minutes, or at least that's
> how fast I can poll via SNMP.  I hope I can tune the
> DB to keep up.
> 
> The app is threaded and will likely have well over 100
> concurrent db connections.  Temp tables for storage
> aren't a preferred option since this is designed to be
> a shared nothing approach and I will likely have
> several polling processes.


Mark,

We have PostgreSQL databases on modest hardware doing exactly what you are
attempting to do (a massively scalable SNMP monitoring system).  The monitoring
volume for a single database server appears to exceed what you are trying to
do by a few orders of magnitude with no scaling or performance issues, so I
can state without reservation that PostgreSQL can easily handle your
application in theory.

However, that is predicated on having a well-architected system that
minimizes resource contention and unnecessary blocking, and based on your
description you may be going about it a bit wrong.

The biggest obvious bottleneck is the use of threads and massive
process-level parallelization.  As others have pointed out, async queues are
your friends, as is partitioning the workload horizontally rather than
vertically through the app stack.  A very scalable high-throughput engine
for SNMP polling only requires two or three threads handling different parts
of the workload to saturate the network, and by choosing what each thread
does carefully you can all but eliminate blocking when there is work to be
done.

We only use a single database connection to insert all the data into
PostgreSQL, and that process/thread receives its data from a work queue.
Depending on how you design your system, you can batch many records in your
queue as a single transaction.  In our case, we also use very few updates,
mostly just inserts, which is probably advantageous in terms of throughput
if you have the disk for it.  The insert I/O load is easily handled, and our
disk array is a modest 10k SCSI rig.  The only thing that really hammers the
server is when multiple reporting processes are running, which frequently
touch several million rows each (the database is much larger than the system
memory), and even this is manageable with clever database design.
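
For what it's worth, the writer side of that arrangement can be tiny.  A
minimal sketch in Ruby (since that is Mark's tool of choice), untested, with
invented table and column names, assuming the 'pg' driver:

    require 'pg'
    require 'thread'

    queue = Queue.new       # poller threads push [modem_id, rx_power] pairs here

    writer = Thread.new do
      conn = PG.connect(dbname: 'netmon')   # the single database connection
      loop do
        batch = [queue.pop]                 # block until there is work
        batch << queue.pop while !queue.empty? && batch.size < 5_000
        conn.transaction do |tx|            # one transaction per batch
          batch.each do |id, rx|
            tx.exec_params('UPDATE modem_state SET rx_power = $1, last_poll = now()
                            WHERE modem_id = $2', [rx, id])
          end
        end
      end
    end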


In short, what you are trying to do is easily doable on PostgreSQL in
theory.  However, restrictions on design choices may pose significant
hurdles.  We did not start out with an ideal system either; it took a fair
amount of re-engineering to solve all the bottlenecks and problems that pop
up.

Good luck,

J. Andrew Rogers
[EMAIL PROTECTED]





Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-19 Thread Jeffrey W. Baker
On Fri, 2005-08-19 at 10:54 -0400, Ron wrote:
> Maxtor Atlas 15K II's.

> Areca's 1GB buffer RAID cards 

The former are SCSI disks and the latter is an SATA controller.  The
combination would have a transaction rate of approximately 0.

I can vouch for the Areca controllers, however.  You can certainly
achieve pgbench transaction rates in the hundreds per second even with
only 5 7200RPM disks and 128MB cache.
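
(For reference, that kind of number is just a stock pgbench run; from memory
the invocation is roughly:

    pgbench -i -s 10 testdb          # initialize, scale factor 10
    pgbench -c 8 -t 10000 testdb     # 8 clients, 10000 transactions each

with the caveat that pgbench's TPC-B-style workload is not the same thing as
the update pattern being discussed here.)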

Don't forget to buy the battery.

-jwb



Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-19 Thread Ron

At 09:58 AM 8/19/2005, Andreas Pflug wrote:

The 1-2k xact/sec for MySQL seems suspicious; sounds very much like 
write-back caching, not write-through, esp. considering that heavy 
concurrent write access isn't said to be MySQL's strength...


Don't be suspicious.

I haven't seen the code under discussion, but I have seen MySQL 
easily achieve these kinds of numbers using the MyISAM storage engine 
in write-through cache mode.

MyISAM can be =FAST=.  Particularly when decent HW is thrown at it.

Ron





Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-19 Thread Ron
Alex mentions a nice setup, but I'm pretty sure I know how to beat 
that IO subsystem's HW performance by at least 1.5x or 2x.  Possibly 
more.  (No, I do NOT work for any vendor I'm about to discuss.)


Start by replacing the WD Raptors with Maxtor Atlas 15K II's.
At 5.5ms average access, 97.4MB/s outer track throughput, 85.9MB/s 
average, and 74.4 MB/s inner track throughput, they have the best 
performance characteristics of any tested shipping HDs I know 
of.  (Supposedly the new SAS versions will _sustain_ ~98MB/s, but 
I'll believe that only if I see it under independent testing).
In comparison, the numbers on the WD740GD are 8.1ms average access, 
71.8, 62.9, and 53.9 MB/s outer, average and inner track throughputs 
respectively.


Be prepared to use as many of them as possible (read: as many you can 
afford) if you want to maximize transaction rates, particularly for 
small transactions like this application seems to be mentioning.


Next, use a better RAID card.  The TOL enterprise stuff (Xyratex, 
Engino, Dot-hill) is probably too expensive, but in the commodity 
market benchmarks indicate that Areca's 1GB buffer RAID cards 
currently outperform all the other commodity RAID stuff.


9 Atlas II's per card in a RAID 5 set, or 16 per card in a RAID 10 
set, should max the RAID card's throughput and come very close to, if 
not attaining, the real world peak bandwidth of the 64b 133MHz PCI-X 
bus they are plugged into.  Say somewhere in the 700-800MB/s range.


Repeat the above for as many independent PCI-X buses as you have for 
a very fast commodity RAID IO subsystem.


Two such configured cards used in the same manner as mentioned by 
Alex should easily attain 1.5x - 2x the transaction numbers mentioned 
by Alex unless there's a bottleneck somewhere else in the system design.


Hope this helps,
Ron Peacetree

At 08:40 AM 8/19/2005, Alex Turner wrote:

I have managed tx speeds that high from postgresql going even as high
as 2500/sec for small tables, but it does require a good RAID
controller card (yes I'm even running with fsync on).  I'm using 3ware
9500S-8MI with Raptor drives in multiple RAID 10s.  The box wasn't too
$$$ at just around $7k.  I have two independent controllers on two
independent PCI buses to give max throughput, one with a 6 drive RAID
10 and the other with two 4 drive RAID 10s.

Alex Turner
NetEconomist

On 8/19/05, Mark Cotner <[EMAIL PROTECTED]> wrote:
> Hi all,
> I bet you get tired of the same ole questions over and
> over.
>
> I'm currently working on an application that will poll
> thousands of cable modems per minute and I would like
> to use PostgreSQL to maintain state between polls of
> each device.  This requires a very heavy amount of
> updates in place on a reasonably large table (100k-500k
> rows, ~7 columns mostly integers/bigint).  Each row
> will be refreshed every 15 minutes, or at least that's
> how fast I can poll via SNMP.  I hope I can tune the
> DB to keep up.
>
> The app is threaded and will likely have well over 100
> concurrent db connections.  Temp tables for storage
> aren't a preferred option since this is designed to be
> a shared nothing approach and I will likely have
> several polling processes.
>
> Here are some of my assumptions so far . . .
>
> HUGE WAL
> Vacuum hourly if not more often
>
> I'm getting 1700tx/sec from MySQL and I would REALLY
> prefer to use PG.  I don't need to match the number,
> just get close.
>
> Is there a global temp table option?  In memory tables
> would be very beneficial in this case.  I could just
> flush it to disk occasionally with an insert into blah
> select from memory table.
>
> Any help or creative alternatives would be greatly
> appreciated.  :)
>
> 'njoy,
> Mark
>
>
> --
> Writing software requires an intelligent person,
> creating functional art requires an artist.
> -- Unknown
>
>


Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-19 Thread Tom Lane
Andreas Pflug <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> As far as the question "can PG do 1-2k xact/sec", the answer is "yes
>> if you throw enough hardware at it".  Spending enough money on the
>> disk subsystem is the key ...
>> 
> The 1-2k xact/sec for MySQL seems suspicious; sounds very much like 
> write-back caching, not write-through, esp. considering that heavy 
> concurrent write access isn't said to be MySQL's strength...

> I wonder if preserving the database after a fatal crash is really 
> necessary, since the data stored sounds quite volatile; in this case, 
> fsync=false might be sufficient.

Yeah, that's something to think about.  If you do need full transaction
safety, then you *must* have a decent battery-backed-write-cache setup,
else your transaction commit rate will be limited by disk rotation
speed --- for instance, a single connection can commit at most 250 xacts
per second if the WAL log is on a 15000RPM drive.  (You can improve this
to the extent that you can spread activity across multiple connections,
but I'm not sure you can expect to reliably have 8 or more connections
ready to commit each time the disk goes 'round.)

regards, tom lane



Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-19 Thread Andreas Pflug

Tom Lane wrote:


Bob Ippolito <[EMAIL PROTECTED]> writes:

> If you don't want to optimize the whole application, I'd at least
> just push the DB operations down to a very small number of
> connections (*one* might even be optimal!), waiting on some kind of
> thread-safe queue for updates from the rest of the system.

While I agree that hundreds of threads seems like overkill, I think the
above advice might be going too far in the other direction.  The problem
with single-threaded operation is that any delay affects the whole
system --- eg, if you're blocked waiting for disk I/O, the CPU doesn't
get anything done either.  You want enough DB connections doing things
in parallel to make sure that there's always something else useful to do
for each major component.  This is particularly important for Postgres,
which doesn't do any internal query parallelization (not that it would
help much anyway for the sorts of trivial queries you are worried about).
If you have, say, a 4-way CPU you want at least 4 active connections to
make good use of the CPUs.

I'd suggest trying to build the system so that it uses a dozen or two
active database connections.  If that doesn't match up to the number of
polling activities you want to have in flight at any instant, then you
can do something like what Bob suggested on the client side to bridge
the gap.

As far as the question "can PG do 1-2k xact/sec", the answer is "yes
if you throw enough hardware at it".  Spending enough money on the
disk subsystem is the key ...

The 1-2k xact/sec for MySQL seems suspicious; sounds very much like 
write-back caching, not write-through, esp. considering that heavy 
concurrent write access isn't said to be MySQL's strength...


I wonder if preserving the database after a fatal crash is really 
necessary, since the data stored sounds quite volatile; in this case, 
fsync=false might be sufficient.
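
In postgresql.conf terms that's only a couple of lines, something like the
following (8.0-era settings, just a starting point, and only if, as above,
surviving a fatal crash with the database intact really isn't required):

    fsync = false               # trade crash safety for commit throughput
    wal_buffers = 64            # more WAL buffering for the heavy update stream
    checkpoint_segments = 32    # spread checkpoints out so they stall updates less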


Regards,
Andreas




Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-19 Thread Tom Lane
Bob Ippolito <[EMAIL PROTECTED]> writes:
> If you don't want to optimize the whole application, I'd at least  
> just push the DB operations down to a very small number of  
> connections (*one* might even be optimal!), waiting on some kind of  
> thread-safe queue for updates from the rest of the system.

While I agree that hundreds of threads seems like overkill, I think the
above advice might be going too far in the other direction.  The problem
with single-threaded operation is that any delay affects the whole
system --- eg, if you're blocked waiting for disk I/O, the CPU doesn't
get anything done either.  You want enough DB connections doing things
in parallel to make sure that there's always something else useful to do
for each major component.  This is particularly important for Postgres,
which doesn't do any internal query parallelization (not that it would
help much anyway for the sorts of trivial queries you are worried about).
If you have, say, a 4-way CPU you want at least 4 active connections to
make good use of the CPUs.

I'd suggest trying to build the system so that it uses a dozen or two
active database connections.  If that doesn't match up to the number of
polling activities you want to have in flight at any instant, then you
can do something like what Bob suggested on the client side to bridge
the gap.
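
As a purely illustrative shape for "a dozen or two active database
connections": a small fixed pool of writer threads, each holding its own
connection and draining a shared queue.  Rough Ruby sketch, not tested, names
invented:

    require 'pg'
    require 'thread'

    POOL_SIZE = 12          # "a dozen or two" active connections
    work = Queue.new        # pollers push [sql, params] pairs here

    writers = Array.new(POOL_SIZE) do
      Thread.new do
        conn = PG.connect(dbname: 'netmon')   # hypothetical database
        loop do
          sql, params = work.pop              # block until something needs writing
          conn.exec_params(sql, params)
        end
      end
    end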

As far as the question "can PG do 1-2k xact/sec", the answer is "yes
if you throw enough hardware at it".  Spending enough money on the
disk subsystem is the key ...

regards, tom lane



Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-19 Thread Alex Turner
I have managed tx speeds that high from postgresql going even as high
as 2500/sec for small tables, but it does require a good RAID
controller card (yes I'm even running with fsync on).  I'm using 3ware
9500S-8MI with Raptor drives in multiple RAID 10s.  The box wasn't too
$$$ at just around $7k.  I have two independent controllers on two
independent PCI buses to give max throughput, one with a 6 drive RAID
10 and the other with two 4 drive RAID 10s.

Alex Turner
NetEconomist

On 8/19/05, Mark Cotner <[EMAIL PROTECTED]> wrote:
> Hi all,
> I bet you get tired of the same ole questions over and
> over.
> 
> I'm currently working on an application that will poll
> thousands of cable modems per minute and I would like
> to use PostgreSQL to maintain state between polls of
> each device.  This requires a very heavy amount of
> updates in place on a reasonably large table (100k-500k
> rows, ~7 columns mostly integers/bigint).  Each row
> will be refreshed every 15 minutes, or at least that's
> how fast I can poll via SNMP.  I hope I can tune the
> DB to keep up.
> 
> The app is threaded and will likely have well over 100
> concurrent db connections.  Temp tables for storage
> aren't a preferred option since this is designed to be
> a shared nothing approach and I will likely have
> several polling processes.
> 
> Here are some of my assumptions so far . . .
> 
> HUGE WAL
> Vacuum hourly if not more often
> 
> I'm getting 1700tx/sec from MySQL and I would REALLY
> prefer to use PG.  I don't need to match the number,
> just get close.
> 
> Is there a global temp table option?  In memory tables
> would be very beneficial in this case.  I could just
> flush it to disk occasionally with an insert into blah
> select from memory table.
> 
> Any help or creative alternatives would be greatly
> appreciated.  :)
> 
> 'njoy,
> Mark
> 
> 
> --
> Writing software requires an intelligent person,
> creating functional art requires an artist.
> -- Unknown
> 
> 



Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-19 Thread Bob Ippolito


On Aug 19, 2005, at 12:14 AM, Mark Cotner wrote:

Excellent feedback.  Thank you.  Please do keep in mind I'm storing the
results of SNMP queries.  The majority of the time each thread is in a wait
state, listening on a UDP port for return packet.  The number of threads is
high because in order to sustain poll speed I need to minimize the impact of
timeouts and all this waiting for return packets.


Asynchronous IO via select/poll/etc. basically says: "given these 100
sockets, wake me up when any of them has something to tell me, or wake me
up anyway in N milliseconds".  From one thread, you can usually deal with
thousands of connections without breaking a sweat, whereas with
thread-per-connection you have so much overhead just for the threads that
you probably run out of RAM before your network is throttled.  The reactor
pattern basically just abstracts this a bit so that you worry about what to
do when the sockets have something to say, and it also allows you to
schedule timed events, rather than having to worry about how to implement
that correctly *and* write your application.


With 100 threads you are basically invoking a special case of the same
mechanism that only looks at one socket, but this makes for 100 different
data structures that end up in both userspace and kernel space, plus the
thread stacks (which can easily be a few megs each) and context switching
when any of them wakes up.  You're throwing a lot of RAM and CPU cycles out
the window by using this design.


Also, preemptive threads are hard.

I had intended to have a fallback plan which would build a thread safe queue
for db stuffs, but the application isn't currently architected that way.
It's not completely built yet so now is the time for change.  I hadn't
thought of building up a batch of queries and creating a transaction from
them.


It should be *really* easy to just swap out the implementation of your
"change this record" function with one that simply puts its arguments on a
queue, with another thread that gets them from the queue and actually does
the work.
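
i.e., something like this on the producer side (sketch only; update_modem and
@db_queue are invented names):

    # Before: each poller thread writes to the database itself.
    def update_modem(id, rx_power)
      @conn.exec_params('UPDATE modem_state SET rx_power = $1 WHERE modem_id = $2',
                        [rx_power, id])
    end

    # After: just hand the arguments off to the writer thread's queue.
    def update_modem(id, rx_power)
      @db_queue << [id, rx_power]
    end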


I've been looking into memcached as a persistent object store as well and
hadn't seen the reactor pattern yet.  Still trying to get my puny brain
around that one.


memcached is RAM based, it's not persistent at all... unless you are sure
all of your nodes will be up at all times and will never go down.  IIRC, it
also just starts throwing away data once you hit its size limit.  Of course,
this isn't really any different than MySQL's MyISAM tables if you hit the
row limit, but I think that memcached might not even give you an error when
this happens.  Also, memcached is just key/value pairs over a network, not
much of a database going on there.


If you can fit all this data in RAM and you don't care so much about the
integrity, you might not benefit much from an RDBMS at all.  However, I
don't really know what you're doing with the data once you have it, so I
might be very wrong here...


-bob



Again, thanks for the help.

'njoy,
Mark


On 8/19/05 5:09 AM, "Bob Ippolito" <[EMAIL PROTECTED]> wrote:




On Aug 18, 2005, at 10:24 PM, Mark Cotner wrote:



I'm currently working on an application that will poll
thousands of cable modems per minute and I would like
to use PostgreSQL to maintain state between polls of
each device.  This requires a very heavy amount of
updates in place on a reasonably large table (100k-500k
rows, ~7 columns mostly integers/bigint).  Each row
will be refreshed every 15 minutes, or at least that's
how fast I can poll via SNMP.  I hope I can tune the
DB to keep up.

The app is threaded and will likely have well over 100
concurrent db connections.  Temp tables for storage
aren't a preferred option since this is designed to be
a shared nothing approach and I will likely have
several polling processes.



Somewhat OT, but..

The easiest way to speed that up is to use fewer threads.  You're
adding a whole TON of overhead with that many threads that you just
don't want or need.  You should probably be using something event-
driven to solve this problem, with just a few database threads to
store all that state.  Less is definitely more in this case.  See
 (and there's plenty of other
literature out there saying that event driven is an extremely good
way to do this sort of thing).

Here are some frameworks to look at for this kind of network code:
(Python) Twisted - 
(Perl) POE - 
(Java) java.nio (not familiar enough with the Java thing to know
whether or not there's a high-level wrapper)
(C++) ACE - 
(Ruby) IO::Reactor - 
(C) libevent - 

.. and of course, you have select/poll/kqueue/WaitNextEvent/whatever
that you could use directly, if you wanted to roll your own solution,
but don't do that.

Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-19 Thread Mark Cotner
Excellent feedback.  Thank you.  Please do keep in mind I'm storing the
results of SNMP queries.  The majority of the time each thread is in a wait
state, listening on a UDP port for return packet.  The number of threads is
high because in order to sustain poll speed I need to minimize the impact of
timeouts and all this waiting for return packets.

I had intended to have a fallback plan which would build a thread safe queue
for db stuffs, but the application isn't currently architected that way.
It's not completely built yet so now is the time for change.  I hadn't
thought of building up a batch of queries and creating a transaction from
them.

I've been looking into memcached as a persistent object store as well and
hadn't seen the reactor pattern yet.  Still trying to get my puny brain
around that one.

Again, thanks for the help.

'njoy,
Mark


On 8/19/05 5:09 AM, "Bob Ippolito" <[EMAIL PROTECTED]> wrote:

> 
> On Aug 18, 2005, at 10:24 PM, Mark Cotner wrote:
> 
>> I'm currently working on an application that will poll
>> thousands of cable modems per minute and I would like
>> to use PostgreSQL to maintain state between polls of
>> each device.  This requires a very heavy amount of
>> updates in place on a reasonably large table (100k-500k
>> rows, ~7 columns mostly integers/bigint).  Each row
>> will be refreshed every 15 minutes, or at least that's
>> how fast I can poll via SNMP.  I hope I can tune the
>> DB to keep up.
>> 
>> The app is threaded and will likely have well over 100
>> concurrent db connections.  Temp tables for storage
>> aren't a preferred option since this is designed to be
>> a shared nothing approach and I will likely have
>> several polling processes.
> 
> Somewhat OT, but..
> 
> The easiest way to speed that up is to use fewer threads.  You're
> adding a whole TON of overhead with that many threads that you just
> don't want or need.  You should probably be using something event-
> driven to solve this problem, with just a few database threads to
> store all that state.  Less is definitely more in this case.  See
>  (and there's plenty of other
> literature out there saying that event driven is an extremely good
> way to do this sort of thing).
> 
> Here are some frameworks to look at for this kind of network code:
> (Python) Twisted - 
> (Perl) POE - 
> (Java) java.nio (not familiar enough with the Java thing to know
> whether or not there's a high-level wrapper)
> (C++) ACE - 
> (Ruby) IO::Reactor - 
> (C) libevent - 
> 
> .. and of course, you have select/poll/kqueue/WaitNextEvent/whatever
> that you could use directly, if you wanted to roll your own solution,
> but don't do that.
> 
> If you don't want to optimize the whole application, I'd at least
> just push the DB operations down to a very small number of
> connections (*one* might even be optimal!), waiting on some kind of
> thread-safe queue for updates from the rest of the system.  This way
> you can easily batch those updates into transactions and you won't be
> putting so much unnecessary synchronization overhead into your
> application and the database.
> 
> Generally, once you have more worker threads (or processes) than
> CPUs, you're going to get diminishing returns in a bad way, assuming
> those threads are making good use of their time.
> 
> -bob
> 





Re: [PERFORM] sustained update load of 1-2k/sec

2005-08-19 Thread Bob Ippolito


On Aug 18, 2005, at 10:24 PM, Mark Cotner wrote:


I'm currently working on an application that will poll
thousands of cable modems per minute and I would like
to use PostgreSQL to maintain state between polls of
each device.  This requires a very heavy amount of
updates in place on a reasonably large table (100k-500k
rows, ~7 columns mostly integers/bigint).  Each row
will be refreshed every 15 minutes, or at least that's
how fast I can poll via SNMP.  I hope I can tune the
DB to keep up.

The app is threaded and will likely have well over 100
concurrent db connections.  Temp tables for storage
aren't a preferred option since this is designed to be
a shared nothing approach and I will likely have
several polling processes.


Somewhat OT, but..

The easiest way to speed that up is to use fewer threads.  You're  
adding a whole TON of overhead with that many threads that you just  
don't want or need.  You should probably be using something event- 
driven to solve this problem, with just a few database threads to  
store all that state.  Less is definitely more in this case.  See  
 (and there's plenty of other  
literature out there saying that event driven is an extremely good  
way to do this sort of thing).


Here are some frameworks to look at for this kind of network code:
(Python) Twisted - 
(Perl) POE - 
(Java) java.nio (not familiar enough with the Java thing to know  
whether or not there's a high-level wrapper)

(C++) ACE - 
(Ruby) IO::Reactor - 
(C) libevent - 

.. and of course, you have select/poll/kqueue/WaitNextEvent/whatever  
that you could use directly, if you wanted to roll your own solution,  
but don't do that.


If you don't want to optimize the whole application, I'd at least  
just push the DB operations down to a very small number of  
connections (*one* might even be optimal!), waiting on some kind of  
thread-safe queue for updates from the rest of the system.  This way  
you can easily batch those updates into transactions and you won't be  
putting so much unnecessary synchronization overhead into your  
application and the database.


Generally, once you have more worker threads (or processes) than  
CPUs, you're going to get diminishing returns in a bad way, assuming  
those threads are making good use of their time.


-bob

