Re: [PERFORM] Decreasing BLKSZ

2006-09-26 Thread Bucky Jordan
  The bottom line here is likely to be you need more RAM :-(
 
 Yup.  Just trying to get a handle on what I can do if I need more than
 16G of RAM... That's as much as I can put on the installed base of
 servers, 100s of them.
 
 
  I wonder whether there is a way to use table partitioning to
  make the insert pattern more localized?  We'd need to know a
  lot more about your insertion patterns to guess how, though.
 
  regards, tom lane
 
 We're doing partitioning as well.
 
I'm guessing that you basically have a data collection application that
sends in lots of records, and a reporting application that wants
summaries of the data? So, if I understand the problem correctly, you
don't have enough ram (or may not in the future) to index the data as it
comes in. 

Not sure how much you can change the design, but what about either
updating summary table(s) as the records come in (via a trigger, as
part of the transaction, or in the application), or indexing
periodically? In other words, load a partition (say, a day's worth),
then index that partition all at once, as in the sketch below. If
you're doing real-time analysis that might not work so well, but the
summary tables should.
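
A minimal sketch of the load-then-index approach, assuming
inheritance-based daily partitions of the stats table described in the
original post (partition and index names are illustrative):

CREATE TABLE stats_20060925 () INHERITS (stats);

-- Bulk-load the day's rows with no indexes to maintain...
COPY stats_20060925 FROM '/data/stats_20060925.copy';

-- ...then build the index in one pass once the partition is full.
CREATE INDEX stats_20060925_key_logtime ON stats_20060925 (key, logtime);
ANALYZE stats_20060925;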

I assume the application generates unique records on its own due to the
timestamp, so this isn't really about checking for constraint
violations? If so, you can probably do away with the index on the tables
that you're running the inserts on.

- Bucky

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Decreasing BLKSZ

2006-09-26 Thread Marc Morin
Yes, that is our application.   We have implemented both scenarios...

1- Partitions loaded without indexes on them, and the index built when a
partition is full.  Slow to drill down into incomplete partitions.
2- Partitions with indexes as loaded.  Slow on insert (the problem
mentioned), but good to drill down.

So, I'd like to have my cake and eat it too... :-)

I'd like to have my indexes built as rows are inserted into the
partition, to help with the drill down...

 -Original Message-
 From: Bucky Jordan [mailto:[EMAIL PROTECTED] 
 Sent: Tuesday, September 26, 2006 5:26 PM
 To: Marc Morin; Tom Lane
 Cc: Markus Schaber; pgsql-performance@postgresql.org
 Subject: RE: [PERFORM] Decreasing BLKSZ 
 
   The bottom line here is likely to be you need more RAM :-(
  
 Yup.  Just trying to get a handle on what I can do if I
 need more than
 16G of RAM... That's as much as I can put on the installed base of
 servers, 100s of them.
  
  
   I wonder whether there is a way to use table partitioning to make 
   the insert pattern more localized?  We'd need to know a lot more 
   about your insertion patterns to guess how, though.
  
 regards, tom lane
  
  We're doing partitioning as well.
  
 I'm guessing that you basically have a data collection 
 application that sends in lots of records, and a reporting 
 application that wants summaries of the data? So, if I 
 understand the problem correctly, you don't have enough ram 
 (or may not in the future) to index the data as it comes in. 
 
 Not sure how much you can change the design, but what about
 either updating summary table(s) as the records come in
 (via a trigger, as part of the transaction, or in the
 application), or indexing periodically? In other words, load a
 partition (say, a day's worth), then index that partition all
 at once. If you're doing real-time analysis that might not
 work so well, but the summary tables should.
 
 I assume the application generates unique records on its own 
 due to the timestamp, so this isn't really about checking for 
 constraint violations? If so, you can probably do away with 
 the index on the tables that you're running the inserts on.
 
 - Bucky
 

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Decreasing BLKSZ

2006-09-26 Thread Bucky Jordan
 
 So, I'd like to have my cake and eat it too... :-)
 
 I'd like to have my indexes built as rows are inserted into the
 partition, to help with the drill down...
 
So you want to drill down so fine-grained that summary tables don't do
much good? Keep in mind, even if you roll up only two records, that's
half as many rows you have to process (be it for drill down or index).
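
A rough illustration of that kind of rollup (the summary table, its
hourly granularity, and the literal timestamps are assumptions, not
part of the original design):

CREATE TABLE stats_hourly (
 hour_start timestamptz,
 key int,
 stat1 bigint,
 stat2 bigint,
 stat3 bigint,
 PRIMARY KEY (key, hour_start)
);

-- Roll up a finished hour in a single pass over the raw data.
INSERT INTO stats_hourly (hour_start, key, stat1, stat2, stat3)
SELECT date_trunc('hour', logtime), key, sum(stat1), sum(stat2), sum(stat3)
FROM stats
WHERE logtime >= '2006-09-25 12:00' AND logtime < '2006-09-25 13:00'
GROUP BY date_trunc('hour', logtime), key;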

I've seen applications that have a log table with no indexes/constraints
and lots of records being inserted, where reporting is done only against
very fine-grained summary tables. Drill-downs still work pretty well,
but if you get audited and want to see one specific action, well, you're
in for a bit of a wait, but hopefully that doesn't happen too often.

If that's the case (summary tables won't work), I'd be very curious how
you manage to get your cake and eat it too :) 

- Bucky

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Decreasing BLKSZ

2006-09-26 Thread Markus Schaber
Hi, Marc,

Marc Morin wrote:

 I wonder whether there is a way to use table partitioning to 
 make the insert pattern more localized?  We'd need to know a 
 lot more about your insertion patterns to guess how, though.
 
 We're doing partitioning as well.

And is constraint exclusion set up properly, and have you verified that
it works?
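
For reference, a minimal constraint-exclusion setup might look like the
following (daily partitions; the names and dates are illustrative):

-- postgresql.conf (8.1+): constraint_exclusion = on
CREATE TABLE stats_20060925 (
 CHECK (logtime >= '2006-09-25' AND logtime < '2006-09-26')
) INHERITS (stats);

-- Verify: only the matching partition should show up in the plan.
EXPLAIN SELECT * FROM stats
WHERE logtime >= '2006-09-25' AND logtime < '2006-09-26';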

HTH,
Markus

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Decreasing BLKSZ

2006-09-26 Thread Jim Nasby

On Sep 26, 2006, at 5:36 PM, Marc Morin wrote:

1- Partitions loaded without indexes on them, and the index built when a
partition is full.  Slow to drill down into incomplete partitions.
2- Partitions with indexes as loaded.  Slow on insert (the problem
mentioned), but good to drill down.


How big are your partitions? The number of rows in your active  
partition will determine how large your indexes are (and probably  
more importantly, how many levels there are), which will definitely  
affect your timing. So, you might have better luck with a smaller  
partition size.


I'd definitely try someone else's suggestion of making the PK
(logtime, key) (assuming that you need to enforce uniqueness) and
having an extra index on just key. If you don't need to enforce
uniqueness, just have one index on key and one on logtime. Or, if your
partitions are small enough, don't even create the logtime index until
the partition isn't being inserted into anymore.
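
A sketch of that layout against the stats table from the original post
(the constraint and index names are assumptions based on default
naming):

-- Replace the (key, logtime) primary key with (logtime, key)...
ALTER TABLE stats DROP CONSTRAINT stats_pkey;
ALTER TABLE stats ADD PRIMARY KEY (logtime, key);

-- ...and keep a separate index for lookups by key alone.
CREATE INDEX stats_key_idx ON stats (key);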


If the number of key values is pretty fixed, it'd be an interesting  
experiment to try partitioning on that, perhaps even with one key per  
partition (which would allow you to drop the key from the tables  
entirely, ie:


CREATE TABLE stats_1 (logtime timestamptz PRIMARY KEY,
 stat1 bigint, stat2 bigint, stat3 bigint);
CREATE TABLE stats_2 ...

CREATE VIEW stats AS
SELECT 1 AS key, * FROM stats_1
UNION ALL SELECT 2, * FROM stats_2
...

I wouldn't put too much work into that as no real effort's been  
expended to optimize for that case (especially the resulting monster  
UNION ALL), but you might get lucky.

--
Jim Nasby   [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[PERFORM] Decreasing BLKSZ

2006-09-25 Thread Marc Morin



Our application has a number of inserters posting rows of network
statistics into a database. This is occurring continuously. The
following is an example of a stats table (simplified, but it maintains
the key concepts).


CREATE TABLE stats
(
 logtime timestamptz,
 key int,
 stat1 bigint,
 stat2 bigint,
 stat3 bigint,
 PRIMARY KEY (key, logtime)
);
CREATE INDEX x ON stats (logtime);

There are on the order of 1M unique values for "key", and a new row for
each key value will be inserted, say, every 15 minutes. These rows are
divided up between a number of different inserting elements, but that
isn't relevant.

The problem is, the insert pattern has low correlation with the
(key,logtime) index. In this case, we would need 1M blocks in my
shared_buffer space to prevent a read-modify-write type of pattern
happening during the inserts (given a large enough database).

I'm wondering about lowering the BLKSZ value so that the total working
set of blocks required can be maintained in my shared buffers. Our
database only has 8G of memory and we would likely need to reduce BLKSZ
to 512.

Any comments on other effects or gotchas with lowering the size of
BLKSZ? Currently, our database is thrashing its cache of blocks and
we're getting only ~100 inserts/second; every insert results in an
evict-read-modify operation.

Ideally, I'd like to keep the entire working set of blocks in memory
across insert periods so that the I/O looks more like writes of full
blocks.

Thanks
Marc




Re: [PERFORM] Decreasing BLKSZ

2006-09-25 Thread Markus Schaber
Hi, Marc,

Marc Morin wrote:

 The problem is, the insert pattern has low correlation with the
 (key,logtime) index.   In this case, we would need 1M blocks in my
 shared_buffer space to prevent a read-modify-write type of pattern
 happening during the inserts (given a large enough database).

Would it be possible to change the primary key to (logtime,key)? This
could help keeping the working window small.

Secondly, the real working set is smaller, as the rows are all inserted
at the end of the table, filling each page until it's full, so only the
last pages are accessed. There's no relation between the index order,
and the order of data on disk, unless you CLUSTER.
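
For completeness, CLUSTER in the pre-8.3 syntax would look like the
sketch below (it rewrites the table under an exclusive lock, so it is
probably impractical for this workload; the index name assumes default
primary-key naming):

CLUSTER stats_pkey ON stats;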

 Any comments on other effects or gotchas with lowering the size of
 BLKSZ?  Currently, our database is thrashing its cache of blocks and
 we're getting only ~100 inserts/second; every insert results in an
 evict-read-modify operation.

I'm not sure that's the correct diagnosis.

Do you have one transaction per insert? Every transaction means a forced
sync to the disk, so you won't get more than about 100-200 commits per
second, depending on your actual disk rotation speed.

To improve concurrency across the number of inserters running in
parallel, try to tweak the config variables commit_delay and
commit_siblings, so you get a higher overall throughput at the cost of
an increased delay per connection, and increase the number of
inserters. With sensible tweaking, the throughput should scale nearly
linearly with the number of backends. :-)
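
A minimal sketch of those settings (the values are illustrative, not
recommendations):

-- in postgresql.conf, or per session:
SET commit_delay = 100;    -- microseconds to wait for other commits to piggyback
SET commit_siblings = 5;   -- other open transactions required before delaying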

If feasible for your application, you can also bundle several log
entries into a single transaction. If you're CPU bound, you can use COPY
instead of INSERT or (if you can wait for 8.2) the new multi-row INSERT
to further improve performance, but I doubt that you're CPU bound.
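
As a sketch against the stats table from the original post (the literal
values are placeholders):

COPY stats (logtime, key, stat1, stat2, stat3) FROM STDIN;

-- or, with 8.2's multi-row VALUES syntax:
INSERT INTO stats (logtime, key, stat1, stat2, stat3)
VALUES (now(), 1, 10, 20, 30),
       (now(), 2, 11, 21, 31);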

The only way to really get over the sync limit is to have (at least)
the WAL on a battery backed ram / SSD media that has no spinning disk
physical limit, or abandon crash safety by turning fsync off.

Thanks,
Markus.
-- 
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Decreasing BLKSZ

2006-09-25 Thread Marc Morin
 Would it be possible to change the primary key to 
 (logtime,key)? This could help keeping the working window small.

No, the application accessing the data wants all the rows between start
and end time for a particular key value.  

 
 Secondly, the real working set is smaller, as the rows are 
 all inserted at the end of the table, filling each page until 
 it's full, so only the last pages are accessed. There's no 
 relation between the index order, and the order of data on 
 disk, unless you CLUSTER.

I'm theorizing that my problem is in updating the index itself and not
the heap.  Insert order refers to the order in which the applications
are inserting the rows and, as such, the order in which the index is
being updated.  This, in turn, is causing the b-tree to be traversed.
The problem is that the working set of blocks at the bottom of the
btree is too big for my cache.

 
  Any comments on other effects or gotchas with lowering the size of
  BLKSZ?  Currently, our database is thrashing its cache of blocks and
  we're getting only ~100 inserts/second; every insert results in an
  evict-read-modify operation.
 
 I'm not shure that's the correct diagnosis.
 
 Do you have one transaction per insert? Every transaction 
 means a forced sync to the disk, so you won't get more than 
 about 100-200 commits per second, depending on your actual 
 disk rotation speed.

No, an insert consists of roughly 10,000+ rows per transaction block.  

 
 To improve concurrency across the number of inserters running in
 parallel, try to tweak the config variables commit_delay and
 commit_siblings, so you get a higher overall throughput at the
 cost of an increased delay per connection, and increase the number
 of inserters. With sensible tweaking, the throughput should scale
 nearly linearly with the number of backends. :-)

I don't think this will help us here due to large transactions already.

 
 If feasible for your application, you can also bundle several 
 log entries into a single transaction. If you're CPU bound, 
 you can use COPY instead of INSERT or (if you can wait for 
 8.2) the new multi-row INSERT to further improve performance, 
 but I doubt that you're CPU bound.

 
 The only way to really get over the sync limit is to have 
 (at least) the WAL on a battery backed ram / SSD media that 
 has no spinning disk
 physical limit, or abandon crash safety by turning fsync off.

Again, the problem is not with WAL writing; the WAL is already on its
own RAID1 disk pair.  The I/O pattern we see is about 1-2% load on the
WAL and 100% load on the array holding the indexes and tables.
Throughput is very low, something like 150K-200K bytes/second of real
rows being deposited on the disk.

The disks are busy seeking all over the disk platter to fetch a block,
add a single row, then seek to another spot and write back a previously
dirty buffer.

 
 Thanks,
 Markus.
 --
 Markus Schaber | Logical Tracking & Tracing International AG
 Dipl. Inf. | Software Development GIS
 
 Fight against software patents in Europe! www.ffii.org 
 www.nosoftwarepatents.org
 

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Decreasing BLKSZ

2006-09-25 Thread Tom Lane
Marc Morin [EMAIL PROTECTED] writes:
 No, an insert consists of roughly 10,000+ rows per transaction block.  

Perhaps it would help to pre-sort these rows by key?
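
One possible shape for that pre-sort, assuming a hypothetical staging
table filled by the inserters:

CREATE TEMP TABLE stats_load (LIKE stats);
COPY stats_load FROM STDIN;

-- Insert the batch in index order so the btree is walked mostly
-- sequentially instead of jumping between leaf pages.
INSERT INTO stats SELECT * FROM stats_load ORDER BY key, logtime;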

Like Markus, I'm pretty suspicious of lowering BLCKSZ ... you can try it
but it's likely to prove counterproductive (more btree index levels,
more rows requiring toasting, a tighter limit on what rows will fit at
all).  I doubt I'd try to make it lower than a couple K in any case.

The bottom line here is likely to be you need more RAM :-(

I wonder whether there is a way to use table partitioning to make the
insert pattern more localized?  We'd need to know a lot more about your
insertion patterns to guess how, though.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Decreasing BLKSZ

2006-09-25 Thread Mark Lewis
I'm not sure if decreasing BLKSZ is the way to go.  It would allow you
to have more smaller blocks in memory, but the actual coverage of the
index would remain the same; if only 33% of the index fits in memory
with the 8K BLKSZ then only 33% would fit in memory with a 4k BLKSZ.  I
can see where you're going if the tree nodes for all 15 million key
entries fit in memory as well as the most recent nodes for the logtime
nodes lower down in the index; basically trying to make sure that the
right 33% of the index is in memory. 

But it seems like it might be more useful to have two indexes, one on
logtime and one on key.  Inserts into the logtime index would be
correlated with your insert order and, as such, cache-friendly, so
that's not an issue.  The index on just the key column would be at least
as small as the active subset of a combined index, so performance should
be at least as good as you could possibly achieve by reducing BLKSZ.

PG 8.1 is smart enough to use a bitmap index scan to combine the two
indexes at query time; if that gives you adequate performance then it
would be simpler than reducing BLKSZ.
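
A sketch of that two-index layout and the kind of query that could use
a bitmap combination (index names are illustrative):

CREATE INDEX stats_logtime_idx ON stats (logtime);
CREATE INDEX stats_key_idx ON stats (key);

-- 8.1 can BitmapAnd the two indexes for a query such as:
EXPLAIN ANALYZE
SELECT * FROM stats
WHERE key = 42
  AND logtime >= '2006-09-25' AND logtime < '2006-09-26';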

-- Mark Lewis

On Mon, 2006-09-25 at 17:54 -0400, Marc Morin wrote:
  Would it be possible to change the primary key to 
  (logtime,key)? This could help keeping the working window small.
 
 No, the application accessing the data wants all the rows between start
 and end time for a particular key value.  
 
  
  Secondly, the real working set is smaller, as the rows are 
  all inserted at the end of the table, filling each page until 
  it's full, so only the last pages are accessed. There's no 
  relation between the index order, and the order of data on 
  disk, unless you CLUSTER.
 
 I'm theorizing that my problem is in updating the index itself and not
 the heap.  Insert order refers to the order in which the applications
 are inserting the rows and, as such, the order in which the index is
 being updated.  This, in turn, is causing the b-tree to be traversed.
 The problem is that the working set of blocks at the bottom of the
 btree is too big for my cache.
 
  
   Any comments on other effects or gotchas with lowering the size of
   BLKSZ?  Currently, our database is thrashing its cache of blocks and
   we're getting only ~100 inserts/second; every insert results in an
   evict-read-modify operation.
  
  I'm not sure that's the correct diagnosis.
  
  Do you have one transaction per insert? Every transaction 
  means a forced sync to the disk, so you won't get more than 
  about 100-200 commits per second, depending on your actual 
  disk rotation speed.
 
 No, an insert consists of roughly 10,000+ rows per transaction block.  
 
  
  To improve concurrency across the number of inserters running in
  parallel, try to tweak the config variables commit_delay and
  commit_siblings, so you get a higher overall throughput at the
  cost of an increased delay per connection, and increase the number
  of inserters. With sensible tweaking, the throughput should scale
  nearly linearly with the number of backends. :-)
 
 I don't think this will help us here due to large transactions already.
 
  
  If feasible for your application, you can also bundle several 
  log entries into a single transaction. If you're CPU bound, 
  you can use COPY instead of INSERT or (if you can wait for 
  8.2) the new multi-row INSERT to further improve performance, 
  but I doubt that you're CPU bound.
 
  
  The only way to really get over the sync limit is to have 
  (at least) the WAL on a battery backed ram / SSD media that 
  has no spinning disk
  physical limit, or abandon crash safety by turning fsync off.
 
 Again, the problem is not with WAL writing; the WAL is already on its
 own RAID1 disk pair.  The I/O pattern we see is about 1-2% load on the
 WAL and 100% load on the array holding the indexes and tables.
 Throughput is very low, something like 150K-200K bytes/second of real
 rows being deposited on the disk.
 
 The disks are busy seeking all over the disk platter to fetch a block,
 add a single row, then seek to another spot and write back a previously
 dirty buffer.
 
  
  Thanks,
  Markus.
  --
  Markus Schaber | Logical Tracking & Tracing International AG
  Dipl. Inf. | Software Development GIS
  
  Fight against software patents in Europe! www.ffii.org 
  www.nosoftwarepatents.org
  
 
 ---(end of broadcast)---
 TIP 2: Don't 'kill -9' the postmaster

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Decreasing BLKSZ

2006-09-25 Thread Marc Morin
 
 
 The bottom line here is likely to be you need more RAM :-(

Yup.  Just trying to get a handle on what I can do if I need more than
16G of RAM... That's as much as I can put on the installed base of
servers, 100s of them.

 
 I wonder whether there is a way to use table partitioning to 
 make the insert pattern more localized?  We'd need to know a 
 lot more about your insertion patterns to guess how, though.
 
   regards, tom lane

We're doing partitioning as well.
 

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings