[PERFORM] How many tables is too many tables?

2005-09-19 Thread [EMAIL PROTECTED]
I have a database of hundreds of millions of web links (between sites)
in Postgres.  For each link, we record the url, the referer, and the
most recent date the link exists.  I'm having some serious performance
issues when it comes to writing new data into the database.

One machine is simply not going to be able to scale with the quantities
of links we hope to store information about, so we want to move to some
kind of cluster.  Because of the quantity of data, it seems to make
sense to go for a cluster setup in which, say, each machine in a
4-machine cluster holds a quarter of the data (is this shared-nothing,
or shared-everything?).  To that end, we figured a good first step was
to partition the data on one machine into multiple tables, defining the
logic that finds the appropriate table for a given piece of data.
Then, we assumed, adding the logic to find the appropriate machine and
database in our cluster would be an incremental upgrade.

We implemented a partitioning scheme that segments the data by the
referring domain of each link.  This is clearly not the most even
distribution, but the data within each table is closely related, so
queries should hit the smallest possible number of tables.  We
currently have around 400,000 tables, and I would estimate that the
vast majority of them are relatively small (less than 200 rows).
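
(Concretely, the scheme boils down to one table per referring domain,
each holding the columns described above.  A rough sketch, with an
illustrative table name:

CREATE TABLE links_example_com (
    url       text NOT NULL,
    referer   text NOT NULL,
    last_seen date NOT NULL
);

The real DDL and routing logic are a bit more involved, but that is
the shape of it.)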

Our queries use UNION ALL to combine data from multiple tables (when
that's applicable; never more than 1000 tables at once, usually far
fewer).  When writing to the database, the table for the referring
domain is locked while data is added and updated for the whole
referring domain at once.  We only store one copy of each link, so when
loading we have to do a SELECT (for the date) and then an INSERT or
UPDATE, as applicable, for each link.
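
(For illustration, the per-link load step is roughly equivalent to the
following sketch; the table name is made up, and the real loader picks
the per-domain table via our routing logic:

BEGIN;
LOCK TABLE links_example_com IN EXCLUSIVE MODE;
SELECT last_seen FROM links_example_com
 WHERE url = :url AND referer = :referer;
-- if a row came back and the new date is more recent:
UPDATE links_example_com SET last_seen = :date
 WHERE url = :url AND referer = :referer;
-- otherwise, if no row came back:
INSERT INTO links_example_com (url, referer, last_seen)
VALUES (:url, :referer, :date);
COMMIT;

Reads are the same idea in reverse, UNION ALLing the relevant
per-domain tables together.)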

At this point, the primary performance bottleneck is in adding
additional data to the database.  Our loader program (we load text
files of link information) is currently getting about 40 rows a second,
which is nowhere near the performance we need to be seeing.  In theory,
we want to be able to re-write our entire archive of data within a
1-2 month cycle, so this is a very heavy write application (though
we're also constantly generating reports from the data, so it's not
write-only).

Is the total number of tables prohibitively affecting our write speed,
or is it an I/O problem that can only be addressed by better drive
partitioning (all data is on one drive, which I've already read is a
problem)?  Does this approach to data partitioning make any sense for
performance, or should we move to a more normal distribution of links
across fewer tables which house more rows each?

Thanks in advance for your advice.

-matt




[PERFORM] RAID Stripe size

2005-09-19 Thread bm\mbn
Hi Everyone

The machine is an IBM x345 with a ServeRAID 6i (128MB cache) and 6
SCSI 15k disks.

2 disks are in RAID1 and hold the OS, swap & pg_xlog.
4 disks are in RAID10 and hold the cluster itself.

The DB will have two major tables: one with 10 million rows and one
with 100 million rows.
All activity against these tables will be SELECTs.

Currently the stripe size is 8k.  I read in many places that this is a
poor setting.

Am I right?




[PERFORM] How can this be?

2005-09-19 Thread Martin Nickel
Hello all,
Mostly Postgres makes sense to me.  But now and then it does something
that boggles my brain.  Take the statements below.  I have a table
(agent) with 5300 rows.  The primary key is agent_id.  I can do SELECT
agent_id FROM agent and it returns all PK values in less than half a
second (dual Opteron box, 4G RAM, SATA RAID 10 drive system).

But when I do a DELETE on two rows with an IN statement, using the
primary key index (as stated by EXPLAIN), it takes almost 4 minutes.
pg_stat_activity shows nine other connections, all idle.

If someone can explain this to me it will help restore my general
faith in order and consistency in the universe.

Martin


-- Executing query:
SELECT count(*) from agent;
Total query runtime: 54 ms.
Data retrieval runtime: 31 ms.
1 rows retrieved.
Result: 5353

-- Executing query:
VACUUM ANALYZE agent;

-- Executing query:
DELETE FROM agent WHERE agent_id IN (15395, 15394);
Query returned successfully: 2 rows affected, 224092 ms execution time.

-- Executing query:
EXPLAIN DELETE FROM agent WHERE agent_id IN (15395, 15394);
Index Scan using agent2_pkey, agent2_pkey on agent  (cost=0.00..7.27 rows=2 width=6)
  Index Cond: ((agent_id = 15395) OR (agent_id = 15394))

Here's my table
CREATE TABLE agent
(
  agent_id int4 NOT NULL DEFAULT nextval('agent_id_seq'::text),
  office_id int4 NOT NULL,
  lastname varchar(25),
  firstname varchar(25),
...other columns...  
  CONSTRAINT agent2_pkey PRIMARY KEY (agent_id),
  CONSTRAINT agent_office_fk FOREIGN KEY (office_id) REFERENCES office 
(office_id) ON UPDATE RESTRICT ON DELETE RESTRICT
) 
WITHOUT OIDS;





Re: [PERFORM] How can this be?

2005-09-19 Thread Steve Atkins
On Fri, Sep 16, 2005 at 08:34:14PM -0500, Martin Nickel wrote:

> Hello all,
> Mostly Postgres makes sense to me.  But now and then it does something
> that boggles my brain.  Take the statements below.  I have a table
> (agent) with 5300 rows.  The primary key is agent_id.  I can do SELECT
> agent_id FROM agent and it returns all PK values in less than half a
> second (dual Opteron box, 4G RAM, SATA RAID 10 drive system).
>
> But when I do a DELETE on two rows with an IN statement, using the
> primary key index (as stated by EXPLAIN), it takes almost 4 minutes.
> pg_stat_activity shows nine other connections, all idle.
>
> If someone can explain this to me it will help restore my general
> faith in order and consistency in the universe.

When you delete a row from agent, PG needs to find any matching rows in
office.  Is office large?  Is office(office_id) indexed?
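
(One quick way to check what indexes exist on it, if you don't have
psql's \d handy, is something like:

SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'office';

which lists the index definitions the system knows about.)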

> -- Executing query:
> DELETE FROM agent WHERE agent_id IN (15395, 15394);
> Query returned successfully: 2 rows affected, 224092 ms execution time.
>
> -- Executing query:
> EXPLAIN DELETE FROM agent WHERE agent_id IN (15395, 15394);
> Index Scan using agent2_pkey, agent2_pkey on agent  (cost=0.00..7.27 rows=2 width=6)
>   Index Cond: ((agent_id = 15395) OR (agent_id = 15394))
>
> Here's my table
> CREATE TABLE agent
> (
>   agent_id int4 NOT NULL DEFAULT nextval('agent_id_seq'::text),
>   office_id int4 NOT NULL,
>   lastname varchar(25),
>   firstname varchar(25),
> ...other columns...
>   CONSTRAINT agent2_pkey PRIMARY KEY (agent_id),
>   CONSTRAINT agent_office_fk FOREIGN KEY (office_id) REFERENCES office
>     (office_id) ON UPDATE RESTRICT ON DELETE RESTRICT
> )
> WITHOUT OIDS;

Cheers,
  Steve



[PERFORM] Index Selection: ORDER BY vs. PRIMARY KEY

2005-09-19 Thread Thomas F. O'Connell
I have a query that looks roughly like this (I've removed irrelevant  
SELECT clause material and obfuscated names, trying to keep them  
consistent where altered in EXPLAIN output):


SELECT u.emma_member_id, h.action_ts
FROM user as u, history as h
WHERE u.user_id = h.user_id
AND h.action_id = '$constant_data'
ORDER BY h.action_ts DESC LIMIT 100 OFFSET 0

The user table has ~25,000 rows. The history table has ~750,000 rows.  
Currently, there is an index on history.action_ts and a separate one  
on history.action_id. There's also a PRIMARY KEY on user.user_id. If  
I run the query as such, I get a plan like this:


                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.00..2196.30 rows=100 width=925) (actual time=947.208..3178.775 rows=3 loops=1)
   ->  Nested Loop  (cost=0.00..83898.65 rows=3820 width=925) (actual time=947.201..3178.759 rows=3 loops=1)
         ->  Index Scan Backward using h_action_ts_idx on history h  (cost=0.00..60823.53 rows=3820 width=480) (actual time=946.730..3177.953 rows=3 loops=1)
               Filter: (action_id = $constant_data::bigint)
         ->  Index Scan using user_pkey on user u  (cost=0.00..6.01 rows=1 width=445) (actual time=0.156..0.161 rows=1 loops=3)
               Index Cond: (u.user_id = "outer".user_id)
 Total runtime: 3179.143 ms
(7 rows)
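
(For reference, the two history indexes named in these plans were
presumably created along these lines; this is reconstructed from the
index names, so the exact definitions may differ:

CREATE INDEX h_action_ts_idx ON history (action_ts);
CREATE INDEX h_action_id_idx ON history (action_id);

plus the user_pkey PRIMARY KEY index mentioned above.)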

If I drop the index on the timestamp field, I get a plan like this:

                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=17041.41..17041.66 rows=100 width=925) (actual time=201.725..201.735 rows=3 loops=1)
   ->  Sort  (cost=17041.41..17050.96 rows=3820 width=925) (actual time=201.719..201.722 rows=3 loops=1)
         Sort Key: h.action_ts
         ->  Merge Join  (cost=13488.15..16814.13 rows=3820 width=925) (actual time=7.306..201.666 rows=3 loops=1)
               Merge Cond: ("outer".user_id = "inner".user_id)
               ->  Index Scan using user_pkey on user u  (cost=0.00..3134.82 rows=26802 width=445) (actual time=0.204..151.351 rows=24220 loops=1)
               ->  Sort  (cost=13488.15..13497.70 rows=3820 width=480) (actual time=0.226..0.234 rows=3 loops=1)
                     Sort Key: h.user_id
                     ->  Index Scan using h_action_id_idx on history h  (cost=0.00..13260.87 rows=3820 width=480) (actual time=0.184..0.195 rows=3 loops=1)
                           Index Cond: (action_id = $constant_data::bigint)
 Total runtime: 202.089 ms
(11 rows)

Clearly, if the index on the timestamp field is there, postgres wants  
to use it for the ORDER BY, even though the performance is worse. How  
is this preference made internally? If both indexes exist, will  
postgres always prefer the index on an ordered column? If I need the  
index on the timestamp field for other queries, is my best bet just  
to increase sort_mem for this query?
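
(For what it's worth, if increasing it per query turns out to be the
answer, I assume that would just be a session-level setting along these
lines, with an arbitrary value:

SET work_mem = 20000;  -- in KB; work_mem is 8.0's name for the old sort_mem

run before the query in question.)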


Here's my version string:
PostgreSQL 8.0.3 on i686-pc-linux-gnu, compiled by GCC 2.95.4

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-469-5150
615-469-5151 (fax)




Re: [PERFORM] How can this be?

2005-09-19 Thread Stephan Szabo
On Fri, 16 Sep 2005, Martin Nickel wrote:

> Hello all,
> Mostly Postgres makes sense to me.  But now and then it does something
> that boggles my brain.  Take the statements below.  I have a table
> (agent) with 5300 rows.  The primary key is agent_id.  I can do SELECT
> agent_id FROM agent and it returns all PK values in less than half a
> second (dual Opteron box, 4G RAM, SATA RAID 10 drive system).
>
> But when I do a DELETE on two rows with an IN statement, using the
> primary key index (as stated by EXPLAIN), it takes almost 4 minutes.
> pg_stat_activity shows nine other connections, all idle.

Are there any tables that reference agent, or any other triggers on it?
My first guess would be that there's a foreign key check for something
else that references agent.agent_id, for which an index scan isn't
being used.
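
(For example, if a hypothetical listing table had a listing.agent_id
column referencing agent, with no index on it, every DELETE on agent
would have to scan listing for the FK check; the usual fix is simply:

CREATE INDEX listing_agent_id_idx ON listing (agent_id);

with the real table and column names substituted in.)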



Re: [PERFORM] Index Selection: ORDER BY vs. PRIMARY KEY

2005-09-19 Thread Tom Lane
"Thomas F. O'Connell" <[EMAIL PROTECTED]> writes:
> Clearly, if the index on the timestamp field is there, postgres wants
> to use it for the ORDER BY, even though the performance is worse. How
> is this preference made internally? If both indexes exist, will
> postgres always prefer the index on an ordered column? If I need the
> index on the timestamp field for other queries, is my best bet just
> to increase sort_mem for this query?

If you suppose that Postgres has a preference for one index over
another, you're already fatally off track.  It's all about estimated
costs.  In this case, the plan with h_action_ts_idx is preferred because
it has a lower estimated cost (2196.30) than the other plan (17041.66).
The way to think about this is not that Postgres prefers one index
over another, but that the estimated costs aren't in line with reality.

It looks from the plans that there are a number of estimation errors
giving you trouble, but the one that seems most easily fixable is
here:

   ->  Index Scan using h_action_id_idx on history h  (cost=0.00..13260.87 rows=3820 width=480) (actual time=0.184..0.195 rows=3 loops=1)
         Index Cond: (action_id = $constant_data::bigint)

Estimating 3820 rows matching $constant_data when there are really only
3 is a pretty serious estimation error :-( ... certainly more than
enough to explain a factor-of-100 error in the total estimated costs.

How recently did you last ANALYZE the history file?  If the ANALYZE
stats are up-to-date and it's still blowing the rowcount estimate by
a factor of 1000, maybe you need to increase the statistics target for
this column.
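
(Concretely, that would be something along the lines of:

ALTER TABLE history ALTER COLUMN action_id SET STATISTICS 100;
ANALYZE history;

where 100 is just an example target; the default is 10.)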

regards, tom lane



Re: [PERFORM] How many tables is too many tables?

2005-09-19 Thread Tom Lane
[EMAIL PROTECTED] writes:
> We currently have around 400,000 tables, and I would estimate that the
> vast majority of them are relatively small (less than 200 rows).

Stop right there, and go redesign your schema.  This is unbelievably
wrong :-(

regards, tom lane



Re: [PERFORM] How many tables is too many tables?

2005-09-19 Thread John A Meinel
[EMAIL PROTECTED] wrote:
> I have a database of hundreds of millions of web links (between sites)
> in Postgres.  For each link, we record the url, the referer, and the
> most recent date the link exists.  I'm having some serious performance
> issues when it comes to writing new data into the database.
>
> One machine is simply not going to be able to scale with the quantities
> of links we hope to store information about, so we want to move to some
> kind of cluster.  Because of the quantity of data, it seems to make
> sense to go for a cluster setup in which, say, each machine in a
> 4-machine cluster holds a quarter of the data (is this shared-nothing,
> or shared-everything?).  To that end, we figured a good first step was
> to partition the data on one machine into multiple tables, defining the
> logic that finds the appropriate table for a given piece of data.
> Then, we assumed, adding the logic to find the appropriate machine and
> database in our cluster would be an incremental upgrade.

In a database app, you generally don't win by going to a cluster,
because you are almost always bound by your I/O, which means that a
single machine, just with more disks, is going to outperform a group of
machines.

As Tom mentioned, your schema is not very good. So let's discuss what a
better schema would be, and also how you might be able to get decent
performance with a cluster.

First, 200 rows * 400,000 tables = 80M rows. Postgres can handle this
in a single table without too much difficulty. It all depends on the
selectivity of your indexes, etc.

I'm not sure how you are trying to normalize your data, but it sounds
like you want a url table, so that each entry in the main table can be
a simple integer rather than the full path, considering that you are
likely to have a bunch of repeated information.

This makes your main table something like 2 integers, plus the
interesting stuff (from url, to url, date).
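
Something along these lines, just to sketch the idea (names made up,
tune types and indexes to taste):

CREATE TABLE url (
    url_id  serial PRIMARY KEY,
    url     text NOT NULL UNIQUE
);

CREATE TABLE link (
    from_url_id  integer NOT NULL REFERENCES url (url_id),
    to_url_id    integer NOT NULL REFERENCES url (url_id),
    last_seen    date NOT NULL,
    PRIMARY KEY (from_url_id, to_url_id)
);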

If you find you are running into I/O problems, you could probably use
this layout to move your indexes off onto their own spindles, and maybe
separate the main table from the url table.

What is your hardware? What are you trying to do that you don't think
will scale?

If you were SELECT bound, then maybe a cluster would help you, because
you could off-load the SELECTs onto slave machines, and leave your
primary machine available for INSERTs and replication.


...


> At this point, the primary performance bottleneck is in adding
> additional data to the database.  Our loader program (we load text
> files of link information) is currently getting about 40 rows a second,
> which is nowhere near the performance we need to be seeing.  In theory,
> we want to be able to re-write our entire archive of data within a
> 1-2 month cycle, so this is a very heavy write application (though
> we're also constantly generating reports from the data, so it's not
> write-only).

Are you VACUUMing enough? If you are rewriting all of the data, postgres
needs you to clean up afterwards: updates and deletes leave the old row
versions in place until VACUUM reclaims them.
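
(In its simplest form that's just a periodic pass along the lines of

VACUUM ANALYZE;

or a VACUUM ANALYZE of the specific heavily-rewritten tables; the
contrib pg_autovacuum daemon can also do this for you.)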


> Is the total number of tables prohibitively affecting our write speed,
> or is it an I/O problem that can only be addressed by better drive
> partitioning (all data is on one drive, which I've already read is a
> problem)?  Does this approach to data partitioning make any sense for
> performance, or should we move to a more normal distribution of links
> across fewer tables which house more rows each?

If all data is on a single drive, you are nowhere near needing a cluster
to improve your database. What you need is a 14-drive RAID array. It's
probably cheaper than 4x powerful machines, and will provide you with
much better performance. And put all of your tables back into one.

John
=:-


> Thanks in advance for your advice.
>
> -matt








Re: [PERFORM] RAID Stripe size

2005-09-19 Thread John A Meinel
bm\mbn wrote:
> Hi Everyone
>
> The machine is an IBM x345 with a ServeRAID 6i (128MB cache) and 6
> SCSI 15k disks.
>
> 2 disks are in RAID1 and hold the OS, swap & pg_xlog.
> 4 disks are in RAID10 and hold the cluster itself.
>
> The DB will have two major tables: one with 10 million rows and one
> with 100 million rows.
> All activity against these tables will be SELECTs.

What type of SELECTs will you be doing? Mostly sequential reads of a
bunch of data, or indexed lookups of random pieces?


> Currently the stripe size is 8k.  I read in many places that this is a
> poor setting.

From what I've heard of RAID, if you are doing large sequential
transfers, larger stripe sizes (128k, 256k) generally perform better.
For postgres, though, having the stripe size be around the same size as
your page size (8k) could be advantageous: when postgres reads or
writes a page, it touches only a single stripe, so a series of pages
would be spread across the different disks.

I may be wrong about that, though.

John
=:-


> Am I right?


