Re: [PERFORM] Performances with new Intel Core* processors

2006-08-01 Thread Luke Lonergan
Vivek,

On 7/31/06 2:04 PM, Vivek Khera [EMAIL PROTECTED] wrote:

 No, but it *does* matter how fast said processor can sling the memory
 around, and in my experience, the opterons have been much better at
 that due to the efficiency of the memory transport layer.

My Mac laptop with a Core 1 and DDR2 RAM does 2700 MB/s memory bandwidth.
The Core 2 also has lower memory latency than the Opteron.

That said - Intel still hasn't figured out how to do cache-coherent SMP
scaling yet - the Opteron has the outstanding EV6/HTX bus and the cc-numa
cache coherency logic working today.

- Luke 



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Performances with new Intel Core* processors

2006-08-01 Thread Florian Weimer
* Arjen van der Meijden:

 For a database system, however, processors hardly ever are the main
 bottleneck, are they?

Not directly, but the choice of processor influences which
chipsets/mainboards are available, which in turn has some impact on
the number of RAM slots.  (According to our hardware supplier, beyond
8 GB, the price per GB goes up sharply.)  Unfortunately, it seems that
the Core 2 Duo mainboards do not change that much in this area.

-- 
Florian Weimer[EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Durlacher Allee 47tel: +49-721-96201-1
D-76131 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Are there any performance penalty for opposite endian platform combinations....

2006-08-01 Thread Tom Lane
Guoping Zhang [EMAIL PROTECTED] writes:
 In fact, it is a general question: is it good practice to avoid running the
 application server and database server on platforms with opposite endianness,
 or does it simply not matter?

Our network protocol uses big-endian consistently, so there will be some
tiny hit for little-endian machines, independently of what's on the
other end of the wire.  I can't imagine you could measure the difference
though.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PERFORM] Query/database optimization

2006-08-01 Thread Eugeny N Dzhurinsky
Hello, I have a query:

explain analyze select tu.url_id, tu.url, coalesce(sd.recurse, 100), case when
COALESCE(get_option('use_banner')::integer,0) = 0 then 0 else ts.use_banner
end as use_banner, ts.use_cookies, ts.use_robots,  ts.includes, ts.excludes,
ts.track_domain, ts.task_id,get_available_pages(ts.task_id,ts.customer_id),
ts.redirects from task_url tu  inner join task_scheduler ts on
tu.task_id=ts.task_id  inner join  (subscription s inner join subscription_dic
sd  on sd.id=s.dict_id )  on s.customer_id=ts.customer_id  inner join customer
c on c.customer_id=ts.customer_id AND c.active  WHERE
get_available_pages(ts.task_id,ts.customer_id)  0 AND
((get_option('expired_users')::integer = 0) OR (isfinite(last_login) AND
extract('day' from current_timestamp - last_login)::integer =
coalesce(get_option('expired_users')::integer,100))) AND  ((s.status is null
AND ts.customer_id is null)  OR s.status  0) AND
(get_check_period(ts.task_id,ts.next_check) is null OR
(unix_timestamp(get_check_period(ts.task_id,ts.next_check)) -
unix_timestamp(timenow())  3600)) AND ts.status  1 AND ((ts.start_time 
current_time AND ts.stop_time  current_time)  OR (ts.start_time is null AND
ts.stop_time is null))  AND tu.url_id = 1  AND ts.customer_id not in (select
distinct customer_id from task_scheduler where status = 1)  order by
ts.next_check is not null, unix_timestamp(ts.next_check) -
unix_timestamp(timenow()) limit 10;

which produces this query plan:
 Limit  (cost=2874.98..2874.99 rows=2 width=88) (actual time=11800.535..11800.546 rows=3 loops=1)
   ->  Sort  (cost=2874.98..2874.99 rows=2 width=88) (actual time=11800.529..11800.532 rows=3 loops=1)
         Sort Key: (ts.next_check IS NOT NULL), (date_part('epoch'::text, ts.next_check) - date_part('epoch'::text, (timenow())::timestamp without time zone))
         ->  Nested Loop  (cost=4.37..2874.97 rows=2 width=88) (actual time=10249.115..11800.486 rows=3 loops=1)
               ->  Nested Loop  (cost=4.37..2868.87 rows=2 width=55) (actual time=10247.721..11796.303 rows=3 loops=1)
                     Join Filter: ("inner".id = "outer".dict_id)
                     ->  Nested Loop  (cost=2.03..2865.13 rows=2 width=55) (actual time=10247.649..11796.142 rows=3 loops=1)
                           Join Filter: ((("inner".status IS NULL) AND ("outer".customer_id IS NULL)) OR ("inner".status  0))
                           ->  Nested Loop  (cost=2.03..2858.34 rows=2 width=55) (actual time=10247.583..11795.936 rows=3 loops=1)
                                 ->  Seq Scan on customer c  (cost=0.00..195.71 rows=231 width=4) (actual time=0.082..154.344 rows=4161 loops=1)
                                       Filter: (active AND isfinite(last_login) AND ((date_part('day'::text, (('now'::text)::timestamp(6) with time zone - (last_login)::timestamp with time zone)))::integer = 150))
                                 ->  Index Scan using task_scheduler_icustomer_id on task_scheduler ts  (cost=2.03..11.51 rows=1 width=51) (actual time=2.785..2.785 rows=0 loops=4161)
                                       Index Cond: ("outer".customer_id = ts.customer_id)
                                       Filter: ((get_available_pages(task_id, customer_id)  0) AND ((get_check_period(task_id, next_check) IS NULL) OR ((date_part('epoch'::text, get_check_period(task_id, next_check)) - date_part('epoch'::text, (timenow())::timestamp without time zone))  3600::double precision)) AND (status  1) AND start_time)::time with time zone  ('now'::text)::time(6) with time zone) AND ((stop_time)::time with time zone  ('now'::text)::time(6) with time zone)) OR ((start_time IS NULL) AND (stop_time IS NULL))) AND (NOT (hashed subplan)))
                                       SubPlan
                                         ->  Unique  (cost=2.02..2.03 rows=1 width=4) (actual time=0.617..0.631 rows=3 loops=1)
                                               ->  Sort  (cost=2.02..2.03 rows=1 width=4) (actual time=0.613..0.617 rows=3 loops=1)
                                                     Sort Key: customer_id
                                                     ->  Index Scan using task_scheduler_istatus on task_scheduler  (cost=0.00..2.01 rows=1 width=4) (actual time=0.044..0.580 rows=3 loops=1)
                                                           Index Cond: (status = 1)
                           ->  Index Scan using subscription_icustomer_id on subscription s  (cost=0.00..3.38 rows=1 width=12) (actual time=0.035..0.041 rows=1 loops=3)
                                 Index Cond: ("outer".customer_id = s.customer_id)
                     ->  Materialize  (cost=2.34..2.65 rows=31 width=8) (actual time=0.008..0.027 rows=6 loops=3)
                           ->  Seq Scan on subscription_dic sd  (cost=0.00..2.31 rows=31 width=8) (actual time=0.013..0.034 rows=6 loops=1)
               ->  Index Scan using task_url_storage_task_id on task_url tu  (cost=0.00..3.03 rows=1 width=37) (actual 

Re: Fwd: [PERFORM] Savepoint performance

2006-08-01 Thread Ernest Nishiseki
Actually, what we did in the tests at EnterpriseDB was encapsulate each
SQL statement within its own BEGIN/EXCEPTION/END block.

Using this approach, if a SQL statement aborts, the rollback is
confined 
to the BEGIN/END block that encloses it.  Other SQL statements would
not be affected since the block would isolate and capture that
exception.
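As a rough illustration, each statement inside such a function ends up wrapped
roughly like this (a hypothetical sketch with made-up table and variable names,
not the actual dbt-2 code):

    BEGIN
        -- one protected statement; an error here rolls back only this block
        INSERT INTO order_line (order_id, item_id, qty)
        VALUES (p_order_id, p_item_id, p_qty);
    EXCEPTION
        WHEN OTHERS THEN
            -- the enclosing function keeps running; the failed statement is
            -- the only work that gets undone
            RAISE NOTICE 'order line % failed: %', p_item_id, SQLERRM;
    END;

Each such block establishes an implicit savepoint, which is where the
per-statement overhead being measured here comes from.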

In the tests, the base-line version was a PL/pgSQL function for the
dbt-2 new order transaction written within a single BEGIN/END block.
The experimental version was a variation of the base-line altered so
the processing of each order entailed entering three sub-blocks from
the main BEGIN/END block. In addition, another sub-block was
entered each time a detail line within an order was processed.

The transactions per minute were recorded for runs of 20 minutes
simulating 10 terminals and 6 hours simulating 10 terminals.
Below are some of the numbers we got:

                                Base Line   With Sub-Blocks   Difference   % Variation
                                ---------   ---------------   ----------   -----------
 10 terminals,   Test 1            6128          5861
 20 minutes      Test 2            5700          5702
                 Test 3            6143          5556
                 Test 4            5954          5750
                 Test 5            5695          5925

 Average of tests 1 - 5            5924          5758.8          -165.2         -2.79

 10 terminals, 6 hours             5341          5396                55          1.03

As you can see, we didn't encounter a predictable, significant
difference.

Ernie Nishiseki, Architect
EnterpriseDB Corporation  wrote:

-- Forwarded message --
From: Denis Lussier 
Date: Jul 27, 2006 10:33 PM
Subject: Re: [PERFORM] Savepoint performance
To: Tom Lane 
Cc: pgsql-performance@postgresql.org


My understanding of EDB's approach is that our prototype just
implicitly does a savepoint before each INSERT, UPDATE, or DELETE
statement inside of PLpgSQL. We then rollback to that savepoint if a
SQL error occurs. I don't believe our prelim approach changes any
transaction start/end semantics on the server side and it doesn't
change any PLpgSQL syntax either (although it does allow you to
optionally code commits and/or rollbacks inside stored procs).
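At the SQL level, the pattern being emulated looks roughly like this (a sketch
with a made-up table; the prototype issues these savepoints implicitly):

    BEGIN;
    SAVEPOINT stmt1;
    INSERT INTO orders (order_id, total) VALUES (1, 100.00);
    -- on success:  RELEASE SAVEPOINT stmt1;
    -- on an error: ROLLBACK TO SAVEPOINT stmt1;   -- undoes only the failed INSERT
    SAVEPOINT stmt2;
    UPDATE orders SET total = 110.00 WHERE order_id = 1;
    -- ... one savepoint per statement ...
    COMMIT;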

Can anybody point me to a thread on the 7.3 disastrous experiment?

I personally think that doing commit or rollbacks inside stored
procedures is usually bad coding practice AND can be avoided... It's
a backward compatibility thing for non-ansi legacy stuff and this is
why I was previously guessing that the community wouldn't be
interested in this for PLpgSQL. Actually... does anybody know
offhand if the ansi standard for stored procs allows for explicit
transaction control inside of a stored procedure?

--Luss

On 7/27/06, Tom Lane wrote:
Denis Lussier writes:
Would the community be potentially interested in this feature if we created
a BSD Postgres patch of this feature for PLpgSQL (likely for 8.3)??

Based on our rather disastrous experiment in 7.3, I'd say that fooling
around with transaction start/end semantics on the server side is
unlikely to fly ...

regards, tom lane


---(end of
broadcast)---
TIP 4: Have you searched our list archives?

http://archives.postgresql.org


--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation | fax: 732.331.1301
33 Wood Ave S, 2nd Floor | [EMAIL PROTECTED]
Iselin, New Jersey 08830 | http://www.enterprisedb.com/


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[PERFORM] PITR performance overhead?

2006-08-01 Thread George Pavlov
I am looking for some general guidelines on what is the performance
overhead of enabling point-in-time recovery (archive_command config) on
an 8.1 database. Obviously it will depend on a multitude of factors, but
some broad-brush statements and/or anecdotal evidence will suffice.
Should one worry about its performance implications? Also, what can one
do to mitigate it? 

Thanks,

George

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PITR performance overhead?

2006-08-01 Thread Bill Moran
In response to George Pavlov [EMAIL PROTECTED]:

 I am looking for some general guidelines on what is the performance
 overhead of enabling point-in-time recovery (archive_command config) on
 an 8.1 database. Obviously it will depend on a multitude of factors, but
 some broad-brush statements and/or anecdotal evidence will suffice.
 Should one worry about its performance implications? Also, what can one
 do to mitigate it? 

Prior to implementing PITR, I did some testing to see what kind of
overhead it would add.  It was negligible.  I don't remember the details,
but I seem to remember the performance hit was barely measurable.

Note that in our usage scenarios, we have very little IO compared to
CPU usage.  The result is that our DB servers have plenty of disk
bandwidth to spare.  Since the log backup occurs as a background
process, it made almost no difference in our tests.  If your DB is
very IO intensive, you may have different results.

-- 
Bill Moran
Collaborative Fusion Inc.




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] PostgreSQL scalability on Sun UltraSparc T1

2006-08-01 Thread Jim C. Nasby
On Sat, Jul 29, 2006 at 08:43:49AM -0700, Joshua D. Drake wrote:
 Jochem van Dieten wrote:
 Tweakers.net has done a database performance test between a Sun T2000 (8
 core T1) and a Sun X4200 (2 dual core Opteron 280). The database
 benchmark is developed inhouse and represents the average query pattern
 from their website. It is MySQL centric because Tweakers.net runs on
 MySQL, but Arjen van der Meijden has ported it to PostgreSQL and has
 done basic optimizations like adding indexes.
 
 Arjen wrote about some of the preliminary results previously in
 http://archives.postgresql.org/pgsql-performance/2006-06/msg00358.php
 but the article has now been published http://tweakers.net/reviews/633/7
 This is all the more impressive if you scroll down and look at the
 behaviour of MySQL (after tweaking by both MySQL AB and Sun).
 
 I would love to get my hands on that postgresql version and see how much 
 farther it could be optimized.

I'd love to get an English translation that we could use for PR.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] PostgreSQL scalability on Sun UltraSparc T1

2006-08-01 Thread Arjen van der Meijden

On 1-8-2006 19:26, Jim C. Nasby wrote:

On Sat, Jul 29, 2006 at 08:43:49AM -0700, Joshua D. Drake wrote:

I'd love to get an english translation that we could use for PR.


Actually, we have an english version of the Socket F follow-up. 
http://tweakers.net/reviews/638 which basically displays the same 
results for Postgres vs MySQL.
Whether and when a translation of the other article will arrive, I don't know.
Other follow-up stories will follow as well; whether and how soon those
will be translated, I also don't know. We are actually pretty interested
in doing so, but it's a lot of work to translate correctly :)


Best regards,

Arjen

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] How to speed up this translation query?

2006-08-01 Thread tlm
I need some expert advice on how to optimize a translation query (this word choice will become clear shortly, I hope).

Say I have a HUMONGOUS table of foreign language translations (call it TRANS) with records like these:

meaning_id: 1
language_id: 5
translation: jidoosha

meaning_id: 1
language_id: 2
translation: voiture


meaning_id: 1
language_id: 5
translation: kuruma

meaning_id: 2
language_id: 2
translation: chat

meaning_id: 2
language_id: 5
translation: neko

meaning_id: 2
language_id: 3
translation: katz

meaning_id: 3
language_id: 4
translation: pesce


meaning_id: 3
language_id: 2
translation: poisson


meaning_id: 3
language_id: 5
translation: sakana

For the sake of this description, let's assume that the records above are all the records in TRANS (though in fact the number of records in TRANS is really about ten million times greater).

Now suppose I have a tiny table called INPUT consisting of a single text field (say, word). E.g. suppose that INPUT looks like this:

katz

voiture

pesce

Now, let's fix a language_id, say 5. This is the target language_id. Given this target language_id, and this particular INPUT table, I want the results of the query to be something like this:

neko
jidoosha
kuruma
sakana

I.e. for each word W in INPUT, the query must first find each record R in TRANS that has W as its translation field; then find each record Q in TRANS whose language_id is 5 (the target language_id) AND has the same meaning_id as R does.
E.g. if W is 'katz', then R is


meaning_id: 2
language_id: 3
translation: katz

and therefore the desired Q is


meaning_id: 2
language_id: 5
translation: neko

...and so on.

The only difficulty here is that performance is critical, and in real life, TRANS has around 50M records (and growing), while INPUT has typically between 500 and 1000 records.

Any advice on how to make this as fast as possible would be much appreciated.

Thanks!

G.

P.S. Just to show that this post is not just from a college student trying to get around doing homework, below I post my most successful query so far. It works, but its performance isn't great. And it is annoyingly complex, to boot; I'm very much the SQL noob, and if nothing else, at least I'd like to learn to write better
(i.e. more elegant, more legible, more clueful) SQL than this:

SELECT q3.translation, q2.otherstuff
FROM
(
  SELECT INPUT.word, q1.meaning_id, INPUT.otherstuff
  FROM INPUT
  INNER JOIN
  (
    SELECT translation, meaning_id
    FROM TRANS
    WHERE translation IN (SELECT word FROM INPUT)
  ) AS q1
  ON INPUT.word = q1.translation
) AS q2
LEFT JOIN
(
  SELECT translation, meaning_id
  FROM TRANS
  WHERE language_id=5
) AS q3
ON q2.meaning_id=q3.meaning_id;
As you can see, there are additional fields that I didn't mention in my original description (e.g. INPUT.otherstuff). Also, the above is actually a subquery in a larger query, but it is by far the worst bottleneck. Last, there's an index on TRANS(translation).




Re: [PERFORM] How to speed up this translation query?

2006-08-01 Thread Niklas Johansson


On 1 aug 2006, at 20.09, tlm wrote:

SELECT q3.translation, q2.otherstuff
FROM
(
  SELECT INPUT.word, q1.meaning_id, INPUT.otherstuff
  FROM
  INPUT
  INNER JOIN
  (
SELECT translation, meaning_id
FROM TRANS
WHERE translation IN (SELECT word FROM INPUT)
  ) AS q1
  ON INPUT.word = q1.translation
) AS q2
LEFT JOIN
(
  SELECT translation, meaning_id
  FROM TRANS
  WHERE language_id=5
) AS q3
ON q2.meaning_id=q3.meaning_id;


Maybe I'm not following you properly, but I think you've made things  
a little bit more complicated than they need to be. The nested sub- 
selects look a little nasty.


Now, you didn't provide any explain output but I think the following  
SQL will achieve the same result, and hopefully produce a better plan:


SELECT t2.translation, i.otherstuff
FROM input i INNER JOIN trans t ON i.word=t.translation
INNER JOIN trans t2 ON t.meaning_id=t2.meaning_id
WHERE t2.language_id=5;

The query will also benefit from indices on trans.meaning_id and  
trans.language_id. Also make sure the tables are vacuumed and  
analyzed, to allow the planner to make good estimates.
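For instance (a sketch; the index names are arbitrary):

CREATE INDEX trans_meaning_id_idx  ON trans (meaning_id);
CREATE INDEX trans_language_id_idx ON trans (language_id);
-- or one combined index, since the lookup is always language_id plus meaning_id:
CREATE INDEX trans_lang_meaning_idx ON trans (language_id, meaning_id);
VACUUM ANALYZE trans;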




Sincerely,

Niklas Johansson





---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] XFS filessystem for Datawarehousing

2006-08-01 Thread Milen Kulev
I intend to test Postgres/Bizgres for DWH use. I want to use the XFS filesystem to
get the best possible performance at the FS
level (correct me if I am wrong!).

Is anyone using XFS for storing/retrieving relatively large amount of data  (~ 
200GB)?

If yes, what about the performance and stability of XFS?
I am especially interested in recommendations about XFS mount options and 
mkfs.xfs options.
My setup will be roughly this:
1) 4  SCSI HDD , 128GB each, 
2) RAID 0 on the four SCSI HDD disks using LVM (software RAID)

There are two other SATA HDD in the server.  Server has 2 physical CPUs (XEON 
at 3 GHz),  4 Logical CPUs, 8 GB RAM,  OS
= SLES9 SP3 

My questions:
1) Should I place an external XFS journal on a separate device?
2) What should the journal buffer size (logbsize) be?
3) How many journal buffers (logbufs) should I configure?
4) How many allocation groups (for mkfs.xfs) should I configure?
5) Is it worth setting noatime?
6) What I/O scheduler (elevator) should I use (massive sequential reads)?
7) What is the ideal stripe unit and width (for a RAID device)?

I will appreciate any opinions, suggestions, pointers.

Best  Regards.
Milen Kulev


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


FW: [PERFORM] XFS filessystem for Datawarehousing -2

2006-08-01 Thread Milen Kulev
Sorry, forgot to ask:
What is the recommended/best PG block size for a DWH database?  16k, 32k, 64k?
What should be the relation between XFS/RAID stripe size and PG block size?

Best  Regards. 
Milen Kulev
 


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] XFS filessystem for Datawarehousing

2006-08-01 Thread Milen Kulev
Hi Andrew, 
Thank you for your prompt reply.
Are you using some special XFS options?
I mean special values for log buffers, buffered I/O size, extent size
preallocation, etc.?
I will have only 6 big tables and about 20 other relatively small (fact
aggregation) tables (~ 10-20 GB each).
I believe it should be a good idea to use chunks of space that are as
contiguous as possible (from the OS point of view) in order to make full
table scans as fast as possible.


Best Regards,
Milen Kulev

-Original Message-
From: J. Andrew Rogers [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, August 02, 2006 12:47 AM
To: Milen Kulev
Cc: Pgsql-Performance ((E-mail))
Subject: Re: [PERFORM] XFS filessystem for Datawarehousing



On Aug 1, 2006, at 2:49 PM, Milen Kulev wrote:
 Is anyone using XFS for storing/retrieving relatively large amount
 of data  (~ 200GB)?


Yes, we've been using it on Linux since v2.4 (currently v2.6) and it  
has been rock solid on our database servers (Opterons, running in  
both 32-bit and 64-bit mode).  Our databases are not quite 200GB  
(maybe 75GB for a big one currently), but ballpark enough that the  
experience is probably valid.  We also have a few terabyte+ non- 
database XFS file servers too.

Performance has been very good even with nearly full file systems,  
and reliability has been perfect so far. Some of those file systems  
get used pretty hard for months or years non-stop.  Comparatively, I  
can only tell you that XFS tends to be significantly faster than  
Ext3, but we never did any serious file system tuning either.

Knowing nothing else, my experience would suggest that XFS is a fine  
and safe choice for your application.


J. Andrew Rogers


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] sub select performance due to seq scans

2006-08-01 Thread H Hale
Not sure if this helps solve the problem but... (see below)

As new records are added Indexes are used for awhile and then at some point
postgres switches to seq scan. It is repeatable. Any suggestions/comments to
try and solve this are welcome. Thanks.

Data is as follows:
capsa.flatommemberrelation            1458 records
capsa.flatommemberrelation(srcobj)       3 distinct
capsa.flatommemberrelation(dstobj)     730 distinct
capsa.flatomfilesysentry               732 records
capsa.flatommemberrelation(objectid)   732 distinct

capsa=# set enable_seqscan=on;
SET
Time: 0.599 ms
capsa=# explain analyze select count(*) from capsa.flatomfilesysentry where objectid in (select dstobj from capsa.flatommemberrelation where srcobj='9e5943e0-219f-11db-8504-001143214409');
                                  QUERY PLAN
------------------------------------------------------------------------------
 Aggregate  (cost=196.01..196.02 rows=1 width=0) (actual time=965.420..965.422 rows=1 loops=1)
   ->  Nested Loop IN Join  (cost=0.00..194.19 rows=728 width=0) (actual time=3.373..964.371 rows=729 loops=1)
         Join Filter: ("outer".objectid = "inner".dstobj)
         ->  Seq Scan on flatomfilesysentry  (cost=0.00..65.28 rows=728 width=16) (actual time=0.007..1.505 rows=732 loops=1)
         ->  Seq Scan on flatommemberrelation  (cost=0.00..55.12 rows=725 width=16) (actual time=0.004..0.848 rows=366 loops=732)
               Filter: (srcobj = '9e5943e0-219f-11db-8504-001143214409'::capsa_sys.uuid)
 Total runtime: 965.492 ms
(7 rows)

Time: 966.806 ms
---
capsa=# set enable_seqscan=off;
SET
Time: 0.419 ms
capsa=# explain analyze select count(*) from capsa.flatomfilesysentry where objectid in (select dstobj from capsa.flatommemberrelation where srcobj='9e5943e0-219f-11db-8504-001143214409');
                                  QUERY PLAN
------------------------------------------------------------------------------
 Aggregate  (cost=24847.73..24847.74 rows=1 width=0) (actual time=24.859..24.860 rows=1 loops=1)
   ->  Nested Loop  (cost=90.05..24845.91 rows=728 width=0) (actual time=2.946..23.640 rows=729 loops=1)
         ->  Unique  (cost=88.04..91.67 rows=363 width=16) (actual time=2.917..6.671 rows=729 loops=1)
               ->  Sort  (cost=88.04..89.86 rows=725 width=16) (actual time=2.914..3.998 rows=729 loops=1)
                     Sort Key: flatommemberrelation.dstobj
                     ->  Bitmap Heap Scan on flatommemberrelation  (cost=7.54..53.60 rows=725 width=16) (actual time=0.260..1.411 rows=729 loops=1)
                           Recheck Cond: (srcobj = '9e5943e0-219f-11db-8504-001143214409'::capsa_sys.uuid)
                           ->  Bitmap Index Scan on capsa_flatommemberrelation_srcobj_idx  (cost=0.00..7.54 rows=725 width=0) (actual time=0.244..0.244 rows=729 loops=1)
                                 Index Cond: (srcobj = '9e5943e0-219f-11db-8504-001143214409'::capsa_sys.uuid)
         ->  Bitmap Heap Scan on flatomfilesysentry  (cost=2.00..63.64 rows=364 width=16) (actual time=0.014..0.015 rows=1 loops=729)
               Recheck Cond: (flatomfilesysentry.objectid = "outer".dstobj)
               ->  Bitmap Index Scan on flatomfilesysentry_pkey  (cost=0.00..2.00 rows=364 width=0) (actual time=0.009..0.009 rows=1 loops=729)
                     Index Cond: (flatomfilesysentry.objectid = "outer".dstobj)
 Total runtime: 25.101 ms
(14 rows)

Time: 26.878 ms

H Hale [EMAIL PROTECTED] wrote:
Tom,

It is unique.

Indexes:
    "flatomfilesysentry_pkey" PRIMARY KEY, btree (objectid)
    "capsa_flatomfilesysentry_name_idx" btree (name)
Foreign-key constraints:
    "objectid" FOREIGN KEY (objectid) REFERENCES capsa_sys.master(objectid) ON DELETE CASCADE

Tom Lane [EMAIL PROTECTED] wrote:
H Hale writes:
 ->  Bitmap Heap Scan on flatomfilesysentry  (cost=2.00..274.38 rows=3238 width=30) (actual time=0.011..0.013 rows=1 loops=6473)
       Recheck Cond: (flatomfilesysentry.objectid = "outer".dstobj)
       ->  Bitmap Index Scan on flatomfilesysentry_pkey  (cost=0.00..2.00 rows=3238 width=0) (actual time=0.007..0.007 rows=1 loops=6473)
             Index Cond: (flatomfilesysentry.objectid = "outer".dstobj)

Well, there's our estimation failure: 3238 rows expected, one row actual.
What is the data distribution of flatomfilesysentry.objectid?
It looks from this example like it is unique or nearly so, but the planner
evidently does not think that.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
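A quick way to see what the planner currently believes about the objectid
distribution Tom is asking about (a sketch; it assumes the table has been
ANALYZEd recently) is to look at pg_stats:

SELECT null_frac, n_distinct, most_common_vals
FROM pg_stats
WHERE schemaname = 'capsa'
  AND tablename  = 'flatomfilesysentry'
  AND attname    = 'objectid';

An n_distinct of -1 there would mean the planner already treats the column as
unique; anything far from that would explain the 3238-row estimate.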

Re: [PERFORM] XFS filessystem for Datawarehousing

2006-08-01 Thread J. Andrew Rogers


On Aug 1, 2006, at 2:49 PM, Milen Kulev wrote:
Is anyone using XFS for storing/retrieving relatively large amount  
of data  (~ 200GB)?



Yes, we've been using it on Linux since v2.4 (currently v2.6) and it  
has been rock solid on our database servers (Opterons, running in  
both 32-bit and 64-bit mode).  Our databases are not quite 200GB  
(maybe 75GB for a big one currently), but ballpark enough that the  
experience is probably valid.  We also have a few terabyte+ non- 
database XFS file servers too.


Performance has been very good even with nearly full file systems,  
and reliability has been perfect so far. Some of those file systems  
get used pretty hard for months or years non-stop.  Comparatively, I  
can only tell you that XFS tends to be significantly faster than  
Ext3, but we never did any serious file system tuning either.


Knowing nothing else, my experience would suggest that XFS is a fine  
and safe choice for your application.



J. Andrew Rogers


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] XFS filessystem for Datawarehousing

2006-08-01 Thread Alvaro Herrera
J. Andrew Rogers wrote:
 
 On Aug 1, 2006, at 2:49 PM, Milen Kulev wrote:
 Is anyone using XFS for storing/retrieving relatively large amount  
 of data  (~ 200GB)?
 
 
 Yes, we've been using it on Linux since v2.4 (currently v2.6) and it  
 has been rock solid on our database servers (Opterons, running in  
 both 32-bit and 64-bit mode).  Our databases are not quite 200GB  
 (maybe 75GB for a big one currently), but ballpark enough that the  
 experience is probably valid.  We also have a few terabyte+ non- 
 database XFS file servers too.
 
 Performance has been very good even with nearly full file systems,  
 and reliability has been perfect so far. Some of those file systems  
 get used pretty hard for months or years non-stop.  Comparatively, I  
 can only tell you that XFS tends to be significantly faster than  
 Ext3, but we never did any serious file system tuning either.

Most likely ext3 was used with its default configuration, which journals data
operations as well as metadata, whereas XFS journals only metadata.  I don't
think I've seen any credible comparison between XFS and ext3 with the
metadata-only journal option.

On the other hand I don't think it makes sense to journal data in a
PostgreSQL environment.  Metadata is enough, given that we log data in
WAL anyway.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] PITR performance overhead?

2006-08-01 Thread Merlin Moncure

On 8/1/06, George Pavlov [EMAIL PROTECTED] wrote:

I am looking for some general guidelines on what is the performance
overhead of enabling point-in-time recovery (archive_command config) on
an 8.1 database. Obviously it will depend on a multitude of factors, but
some broad-brush statements and/or anecdotal evidence will suffice.
Should one worry about its performance implications? Also, what can one
do to mitigate it?


pitr is extremely cheap both in performance drag and administration
overhead for the benefits it provides.  it comes almost for free, just
make sure you can handle all the wal files and do sane backup
scheduling.  in fact, pitr can actually reduce the load on a server
due to running less frequent backups.  if your server is heavily i/o
loaded, it might take a bit of planning.
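for reference, the minimal setup on 8.1 is just an archive_command in
postgresql.conf plus somewhere to put the segments (the destination path below
is only a placeholder):

archive_command = 'cp %p /mnt/server/wal_archive/%f'   # %p = WAL segment path, %f = segment file name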

merlin

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] XFS filessystem for Datawarehousing

2006-08-01 Thread Mark Kirkwood

Milen Kulev wrote:


Is anyone using XFS for storing/retrieving relatively large amount of data  (~ 
200GB)?



Yes, but not for that large - only about 40-50 GB of database data.


If yes, what about the performance and stability of  XFS.


I'm pretty happy with the performance, particularly read (get 215MB/s 
sequential 8K reads from 4 (P)ATA drives setup as software RAID 0). I 
have always found XFS very stable (used it on servers for several years).



I am especially interested in recommendations about XFS mount options and 
mkfs.xfs options.
My setup will be roughly this:
1) 4  SCSI HDD , 128GB each, 
2) RAID 0 on the four SCSI HDD disks using LVM (software RAID)






My questions:
1) Should I place external XFS journal on separate device ?
2) What  should be the journal buffer size (logbsize) ?
3)  How many journal buffers (logbufs) should I configure ?
4) How many allocations groups  (for mkfs.xfs) should I  configure
5)  Is it wortj settion noatime ?
6) What I/O scheduler(elevators) should I use (massive sequencial reads)
7) What is the ideal stripe unit and width (for a RAID device) ? 





1-3) I have not done any experimentation with where to put the journal, 
or its buffer size / number of them (well worth doing I suspect tho).


4) I left it at the default.

5) I use noatime, but have not measured if there is any impact if I 
leave it off.


6) deadline scheduler seemed to give slightly better performance for 
sequential performance.


7) I tried out stripe width 2,4 (with 4 disks), and they seemed to give 
the same results. Stripe unit of 256K (tested 32K, 64K, 128K) seemed to 
give the best sequential performance. My software raid stripe size was 
matched to this in each case.



I'll be interested to hear what you discover :-)

Cheers

Mark

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] XFS filessystem for Datawarehousing -2

2006-08-01 Thread Luke Lonergan
Milen,

On 8/1/06 3:19 PM, Milen Kulev [EMAIL PROTECTED] wrote:

 Sorry, forgot to ask:
 What is the recommended/best  PG block size for DWH  database?  16k, 32k, 64k
 ?
 What hsould be the relation  between XFS/RAID stripe size and PG block size ?

We have found that the page size in PG starts to matter only at very high
disk performance levels around 1000MB/s.  Other posters have talked about
maintenance tasks improving in performance, but I haven't seen it.
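For what it's worth, in this era the page size is a compile-time constant
(BLCKSZ in src/include/pg_config_manual.h), so changing it means rebuilding the
server.  Assuming your build exposes the read-only block_size setting, you can
confirm what it was compiled with:

SHOW block_size;
-- 8192 for a stock build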

- Luke



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Query/database optimization

2006-08-01 Thread Tom Lane
Eugeny N Dzhurinsky [EMAIL PROTECTED] writes:
 [slow query]

The bulk of your time seems to be going into this indexscan:

 ->  Index Scan using task_scheduler_icustomer_id on task_scheduler ts  (cost=2.03..11.51 rows=1 width=51) (actual time=2.785..2.785 rows=0 loops=4161)
       Index Cond: ("outer".customer_id = ts.customer_id)
       Filter: ((get_available_pages(task_id, customer_id)  0) AND ((get_check_period(task_id, next_check) IS NULL) OR ((date_part('epoch'::text, get_check_period(task_id, next_check)) - date_part('epoch'::text, (timenow())::timestamp without time zone))  3600::double precision)) AND (status  1) AND start_time)::time with time zone  ('now'::text)::time(6) with time zone) AND ((stop_time)::time with time zone  ('now'::text)::time(6) with time zone)) OR ((start_time IS NULL) AND (stop_time IS NULL))) AND (NOT (hashed subplan)))
       SubPlan
         ->  Unique  (cost=2.02..2.03 rows=1 width=4) (actual time=0.617..0.631 rows=3 loops=1)
 ...

I kinda doubt that the index search itself is that slow --- doubtless
the problem comes from having to evaluate that filter condition on a lot
of rows.  How fast are those functions you're calling?
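A quick way to check that in isolation (a sketch; the function and column names
are taken from the query above) is to time them over a sample of rows in psql:

\timing
SELECT get_available_pages(task_id, customer_id),
       get_check_period(task_id, next_check)
FROM task_scheduler
LIMIT 1000;

If that takes a noticeable fraction of the 11.8 seconds, the functions are the
problem rather than the index scan itself.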

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly