Re: [PERFORM] Wierd context-switching issue on Xeon

2004-04-21 Thread pginfo
Hi,

Dual Xeon P4 2.8
linux RedHat AS 3
kernel 2.4.21-4-EL-smp
2 GB ram

I can see the same problem:

procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff    cache  si  so   bi   bo   in     cs us sy id wa
 1  0      0  96212  61056  1720240   0   0    0    0  101     11 25  0 75  0
 1  0      0  96212  61056  1720240   0   0    0    0  108    139 25  0 75  0
 1  0      0  96212  61056  1720240   0   0    0    0  104    173 25  0 75  0
 1  0      0  96212  61056  1720240   0   0    0    0  102     11 25  0 75  0
 1  0      0  96212  61056  1720240   0   0    0    0  101     11 25  0 75  0
 2  0      0  96204  61056  1720240   0   0    0    0  110  53866 31  4 65  0
 2  0      0  96204  61056  1720240   0   0    0    0  101  83176 41  5 54  0
 2  0      0  96204  61056  1720240   0   0    0    0  102  86050 39  6 55  0
 2  0      0  96204  61056  1720240   0   0    0   49  113  73642 41  5 54  0
 2  0      0  96204  61056  1720240   0   0    0    0  102  84211 40  5 55  0
 2  0      0  96204  61056  1720240   0   0    0    0  101 105165 39  7 54  0
 2  0      0  96204  61056  1720240   0   0    0    0  103  97754 38  6 56  0
 2  0      0  96204  61056  1720240   0   0    0    0  103 113668 36  7 57  0
 2  0      0  96204  61056  1720240   0   0    0    0  103 112003 37  7 56  0

regards,
ivan.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


[PERFORM] slow seqscan

2004-04-21 Thread Edoardo Ceccarelli
My first post to this list :)

Scenario:
I have a database used only for search queries, with a single table that
holds about 450,000-500,000 records.
The table is well indexed, so most queries are executed with an index
scan, but since there is a big text field in the table (360 chars),
some search operations (with certain filters) end up as seq scans.
This table is not written to during normal operation: twice per week a
batch program inserts about 35,000 records and updates another 40,000.
Last Friday morning, after that batch had run, the database started
responding really slowly to queries (especially seq scans); after a
VACUUM FULL ANALYZE things got somewhat better.
Yesterday the same: before the batch everything was perfect, afterwards
every query was really slow; I vacuumed again and now it's ok.
Until now the db had been working fine; it's 4 months old, with two
updates per week, and I vacuum about once per month.

I am using version 7.3; do I need to upgrade to 7.4? Also, I was thinking
about putting this table into a kind of read-only mode to improve
performance. Is this possible?
Thank you for your help
Edoardo Ceccarelli


Re: [PERFORM] slow seqscan

2004-04-21 Thread Christopher Kings-Lynne
Hi Edoardo,

The table is well indexed so that most of the queries are executed with
index scan but since there is a big text field in the table (360chars)
some search operation (with certain filters) ends up with seq scans.
Please paste the exact SELECT query that uses a seqscan, plus the 
EXPLAIN ANALYZE of the SELECT, and the psql output of \d table.

This table is not written to during normal operation: twice per week a
batch program inserts about 35,000 records and updates another 40,000.
After such an update, you need to run VACUUM ANALYZE table;  Run it 
before the update as well, if it doesn't take that long.

Last Friday morning, after that batch had run, the database started
responding really slowly to queries (especially seq scans); after a
VACUUM FULL ANALYZE things got somewhat better.
Yesterday the same: before the batch everything was perfect, afterwards
every query was really slow; I vacuumed again and now it's ok.
Until now the db had been working fine; it's 4 months old, with two
updates per week, and I vacuum about once per month.
You need to VACUUM ANALYZE (NOT full) once an HOUR, not once a month.
Add this command to your crontab to run once an hour and verify that
it's working:

vacuumdb -a -z -q

Otherwise, install the auto vacuum utility found in 
contrib/pg_autovacuum in the postgres source.  Set this up.  It will 
monitor postgres and run vacuums and analyzes when necessary.  You can 
then remove your cron job.

I am using version 7.3; do I need to upgrade to 7.4? Also, I was thinking
about putting this table into a kind of read-only mode to improve
performance. Is this possible?
There's no read only mode to improve performance.

Upgrading to 7.4 will more than likely improve the performance of your 
database in general.  Be careful to read the upgrade notes because there 
were a few incompatibilities.

Chris



Re: [PERFORM] slow seqscan

2004-04-21 Thread Nick Barr
Edoardo Ceccarelli wrote:

My first post to this list :)

Scenario:
I have a database used only for search queries, with a single table that
holds about 450,000-500,000 records.
The table is well indexed, so most queries are executed with an index
scan, but since there is a big text field in the table (360 chars),
some search operations (with certain filters) end up as seq scans.
This table is not written to during normal operation: twice per week a
batch program inserts about 35,000 records and updates another 40,000.
Last Friday morning, after that batch had run, the database started
responding really slowly to queries (especially seq scans); after a
VACUUM FULL ANALYZE things got somewhat better.
Yesterday the same: before the batch everything was perfect, afterwards
every query was really slow; I vacuumed again and now it's ok.
Until now the db had been working fine; it's 4 months old, with two
updates per week, and I vacuum about once per month.

I am using version 7.3; do I need to upgrade to 7.4? Also, I was thinking
about putting this table into a kind of read-only mode to improve
performance. Is this possible?
Thank you for your help
Edoardo Ceccarelli
In general we are going to need more information, like what kind of 
search filters you are using on the text field and an EXPLAIN ANALYZE. 
But can you try running the following, bearing in mind it will take a
while to complete?

REINDEX TABLE table_name

From what I remember, there were issues with index space not being
reclaimed by a vacuum. I believe this was fixed in 7.4. By not
reclaiming the space, the indexes grow larger and larger over time,
causing PG to prefer a sequential scan over an index scan (I think).

Hope that helps

Nick





[PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-21 Thread Nick Barr
Hi,

Has anyone had a look at:

http://people.ac.upc.es/zgomez/

I realize that MySQL & PG cannot really be compared (especially when you 
consider the issues that MySQL has with things like data integrity) but 
still, surely PG would perform better than the stats show (e.g. query #7: 
31.28 seconds versus 42 minutes!!!).

On a side note, it certainly looks like Linux kernel 2.6 is quite a bit 
faster in comparison to 2.4.

Nick





Re: [PERFORM] slow seqscan

2004-04-21 Thread Edoardo Ceccarelli

In general we are going to need more information, like what kind of 
search filters you are using on the text field and an EXPLAIN ANALYZE. 
But can you try and run the following, bearing in mind it will take a 
while to complete.

REINDEX TABLE table_name

From what I remember there were issues with index space not being 
reclaimed in a vacuum. I believe this was fixed in 7.4. By not 
reclaiming the space the indexes grow larger and larger over time, 
causing PG to prefer a sequential scan over an index scan (I think).


The query is this:
SELECT *, oid FROM annuncio400
WHERE  rubric = 'DD' AND LOWER(testo) Like LOWER('cbr%')
OFFSET 0 LIMIT 11
dba400=# explain analyze SELECT *, oid FROM annuncio400 WHERE  rubric = 
'DD' AND LOWER(testo) Like LOWER('cbr%') OFFSET 0 LIMIT 11;
                                    QUERY PLAN
------------------------------------------------------------------------------------------
 Limit  (cost=0.00..3116.00 rows=11 width=546) (actual time=51.47..56.42 rows=11 loops=1)
   ->  Seq Scan on annuncio400  (cost=0.00..35490.60 rows=125 width=546) (actual time=51.47..56.40 rows=12 loops=1)
         Filter: ((rubric = 'DD'::bpchar) AND (lower((testo)::text) ~~ 'cbr%'::text))
 Total runtime: 56.53 msec
(4 rows)

But the strangest thing ever is that if I change the filter to another
one that represents a smaller amount of data, it uses the index scan!!!
Check this (same table, same query, different filter: rubric='MA'):

dba400=# explain analyze SELECT *, oid FROM annuncio400 WHERE  rubric = 
'MA' AND LOWER(testo) Like LOWER('cbr%') OFFSET 0 LIMIT 11; 
                                    QUERY PLAN
------------------------------------------------------------------------------------------
 Limit  (cost=0.00..6630.72 rows=9 width=546) (actual time=42.74..42.74 rows=0 loops=1)
   ->  Index Scan using rubric on annuncio400  (cost=0.00..6968.48 rows=9 width=546) (actual time=42.73..42.73 rows=0 loops=1)
         Index Cond: (rubric = 'MA'::bpchar)
         Filter: (lower((testo)::text) ~~ 'cbr%'::text)
 Total runtime: 42.81 msec
(5 rows)

Thanks for your help
Edoardo




Re: [PERFORM] slow seqscan

2004-04-21 Thread Christopher Kings-Lynne
dba400=# explain analyze SELECT *, oid FROM annuncio400 WHERE  rubric = 
'DD' AND LOWER(testo) Like LOWER('cbr%') OFFSET 0 LIMIT 11;
                                    QUERY PLAN
------------------------------------------------------------------------------------------
 Limit  (cost=0.00..3116.00 rows=11 width=546) (actual time=51.47..56.42 rows=11 loops=1)
   ->  Seq Scan on annuncio400  (cost=0.00..35490.60 rows=125 width=546) (actual time=51.47..56.40 rows=12 loops=1)
         Filter: ((rubric = 'DD'::bpchar) AND (lower((testo)::text) ~~ 'cbr%'::text))
 Total runtime: 56.53 msec
(4 rows)
What happens if you go:

CREATE INDEX annuncio400_rubric_testo_idx ON annuncio400(rubric, 
LOWER(testo));

or even just:

CREATE INDEX annuncio400_rubric_testo_idx ON annuncio400(LOWER(testo));

But the strangest thing ever is that if I change the filter to another
one that represents a smaller amount of data, it uses the index scan!!!
What's strange about that?  The less data is going to be retrieved, the 
more likely postgres is to use the index.

I suggest maybe increasing the amount of stats recorded for your rubric 
column:

ALTER TABLE annuncio400 ALTER rubric SET STATISTICS 100;
ANALYZE annuncio400;
You could also try reducing the random_page_cost value in your 
postgresql.conf a little, say to 3 (if it's currently 4).  That will 
make postgres more likely to use index scans over seq scans.
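Very roughly, the effect of lowering random_page_cost can be sketched like this (illustrative numbers only, not PostgreSQL's actual cost formulas; the page counts are made up for the example):

```python
# Illustrative sketch only -- not PostgreSQL's real cost model.
# It shows why lowering random_page_cost makes index scans win more often.

def seq_scan_cost(total_pages, seq_page_cost=1.0):
    # sequential reads are charged the cheap per-page rate
    return total_pages * seq_page_cost

def index_scan_cost(pages_fetched, random_page_cost):
    # index scans are charged the expensive random per-page rate
    return pages_fetched * random_page_cost

total_pages = 10_000     # hypothetical table size in pages
pages_fetched = 3_000    # hypothetical pages an index scan would touch

for rpc in (4.0, 3.0, 2.0):
    idx = index_scan_cost(pages_fetched, rpc)
    seq = seq_scan_cost(total_pages)
    winner = "index scan" if idx < seq else "seq scan"
    print(f"random_page_cost={rpc}: index={idx:.0f} vs seq={seq:.0f} -> {winner}")
```

With these toy numbers the plan flips from seq scan to index scan as soon as random_page_cost drops from 4 to 3, which is the kind of shift the advice above is aiming for.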

Chris



Re: [PERFORM] slow seqscan

2004-04-21 Thread Edoardo Ceccarelli

What happens if you go:

CREATE INDEX annuncio400_rubric_testo_idx ON annuncio400(rubric, 
LOWER(testo));

or even just:

CREATE INDEX annuncio400_rubric_testo_idx ON annuncio400(LOWER(testo));

I wasn't able to create this 2-field index with lower:

dba400=# CREATE INDEX annuncio400_rubric_testo_idx ON 
annuncio400(rubric, LOWER(testo));
ERROR:  parser: parse error at or near "(" at character 71

It seems impossible to create 2-field indexes with the lower function.

The other one does not make it use the index.


But the strangest thing ever is that if I change the filter to 
another one that represents a smaller amount of data, it uses the 
index scan!!!


What's strange about that?  The less data is going to be retrieved, 
the more likely postgres is to use the index.

I can't understand this policy:

dba400=# SELECT count(*) from annuncio400 where rubric='DD';
count
---
 6753
(1 row)
dba400=# SELECT count(*) from annuncio400 where rubric='MA';
count
---
 2165
(1 row)
So it's using the index for 2,000 rows but not for 6,000? That's not 
that big a difference, is it?


I suggest maybe increasing the amount of stats recorded for your 
rubrik column:

ALTER TABLE annuncio400 ALTER rubrik SET STATISTICS 100;
ANALYZE annuncio400;
Done; almost the same, still not using the index.

You could also try reducing the random_page_cost value in your 
postgresql.conf a little, say to 3 (if it's currently 4).  That will 
make postgres more likely to use index scans over seq scans.

I changed the setting in postgresql.conf and restarted the server;
nothing has changed.
What about setting this to false?
#enable_seqscan = true
thanks again
Edoardo


Re: [PERFORM] slow seqscan

2004-04-21 Thread Edoardo Ceccarelli
I tried

enable_seqscan = false

and now I'm getting all index scans; timing has improved from 600ms to 18ms.

I'm wondering what other implications I might expect.







Re: [PERFORM] slow seqscan

2004-04-21 Thread Christopher Kings-Lynne

enable_seqscan = false

and I'm having all index scans, timing has improved from 600ms to 18ms

wondering what other implications I might expect.
Lots of really bad implications...it's really not a good idea.

Chris



Re: [PERFORM] Wierd context-switching issue on Xeon

2004-04-21 Thread ohp
How long is this test supposed to run?

I've launched just one for testing; the plan seems horrible. The test is
CPU-bound and hasn't finished yet after 17:02 min of CPU time. Dual Xeon
2.6G, UnixWare 713.

The machine is a Fujitsu-Siemens TX 200 server
 On Mon, 19 Apr 2004, Tom Lane wrote:

 Date: Mon, 19 Apr 2004 20:01:56 -0400
 From: Tom Lane [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Cc: Joe Conway [EMAIL PROTECTED], scott.marlowe [EMAIL PROTECTED],
  Bruce Momjian [EMAIL PROTECTED], [EMAIL PROTECTED],
  [EMAIL PROTECTED], Neil Conway [EMAIL PROTECTED]
 Subject: Re: [PERFORM] Wierd context-switching issue on Xeon

 Here is a test case.  To set up, run the test_setup.sql script once;
 then launch two copies of the test_run.sql script.  (For those of
 you with more than two CPUs, see whether you need one per CPU to make
 trouble, or whether two test_runs are enough.)  Check that you get a
 nestloops-with-index-scans plan shown by the EXPLAIN in test_run.

 In isolation, test_run.sql should do essentially no syscalls at all once
 it's past the initial ramp-up.  On a machine that's functioning per
 expectations, multiple copies of test_run show a relatively low rate of
 semop() calls --- a few per second, at most --- and maybe a delaying
 select() here and there.

 What I actually see on Josh's client's machine is a context swap storm:
 vmstat 1 shows CS rates around 170K/sec.  strace'ing the backends
 shows a corresponding rate of semop() syscalls, with a few delaying
 select()s sprinkled in.  top(1) shows system CPU percent of 25-30
 and idle CPU percent of 16-20.

 I haven't bothered to check how long the test_run query takes, but if it
 ends while you're still examining the behavior, just start it again.

 Note the test case assumes you've got shared_buffers set to at least
 1000; with smaller values, you may get some I/O syscalls, which will
 probably skew the results.

   regards, tom lane



-- 
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou   +33-5-61-50-97-01 (Fax)
31190 AUTERIVE   +33-6-07-63-80-64 (GSM)
FRANCE  Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)



Re: [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-21 Thread Rod Taylor
On Wed, 2004-04-21 at 08:19, Rod Taylor wrote:
  I realize that MySQL & PG cannot really be compared (especially when you 
  consider the issues that MySQL has with things like data integrity) but 
  still surely PG would perform better than the stats show (i.e. #7 31.28 
  seconds versus 42 minutes!!!).
 
 We know that PostgreSQL 7.5 will perform much better than 7.4 did due to
 the efforts of OSDN and Tom.

OSDL not OSDN.

 I've enquired as to whether they ran ANALYZE after the data load. They
 don't explicitly mention it, and given they mention it took 2.5 days to
 load 1GB of data, they're not regular PostgreSQL users.




Re: [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-21 Thread Paul Thomas
On 21/04/2004 09:31 Nick Barr wrote:
Hi,

Has anyone had a look at:

http://people.ac.upc.es/zgomez/

I realize that MySQL & PG cannot really be compared (especially when you 
consider the issues that MySQL has with things like data integrity) but 
still surely PG would perform better than the stats show (i.e. #7 31.28 
seconds versus 42 minutes!!!).
Looks like he's using the default postgresql.conf settings, in which case 
I'm not surprised at pg looking so slow. His stated use of foreign keys 
invalidates the tests anyway, as MyISAM tables don't support FKs, so we're 
probably seeing FK check overheads in pg that are simply ignored by MySQL. 
In an honest test, MySQL should be reported as failing those tests.

Perhaps one of the advocacy team will pick up the baton?
On a side note it certainly looks like linux kernel 2.6 is quite a bit 
faster in comparison to 2.4.
Yes, I've seen other benchmarks which also show that.

--
Paul Thomas
+------------------------------+-------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for Business           |
| Computer Consultants         | http://www.thomas-micro-systems-ltd.co.uk |
+------------------------------+-------------------------------------------+



Re: [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-21 Thread Cestmir Hybl
 Looks like he's using the default postgresql.conf settings in which case
 I'm not surprised at pg looking so slow.

The question also is, IMHO, why the hell PostgreSQL still comes out of the
box with such poor configuration defaults, totally underestimated for
today's average hardware configuration (1+ GHz, 0.5+ GB RAM, fast FSB,
fast HDD).

It seems to me a better strategy to force that 1% of users to downgrade
their config than vice-versa.

regards
ch




Re: [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-21 Thread Paul Thomas
On 21/04/2004 14:31 Cestmir Hybl wrote:
 Looks like he's using the default postgresql.conf settings in which case
 I'm not surprised at pg looking so slow.

The question also is, IMHO, why the hell PostgreSQL still comes out of the
box with such poor configuration defaults, totally underestimated for
today's average hardware configuration (1+ GHz, 0.5+ GB RAM, fast FSB,
fast HDD).

It seems to me a better strategy to force that 1% of users to downgrade
their config than vice-versa.

regards
ch
This has been discussed many times before. Check the archives.

--
Paul Thomas
+------------------------------+-------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for Business           |
| Computer Consultants         | http://www.thomas-micro-systems-ltd.co.uk |
+------------------------------+-------------------------------------------+



Re: [PERFORM] slow seqscan

2004-04-21 Thread Stephan Szabo

On Wed, 21 Apr 2004, Edoardo Ceccarelli wrote:


  What happens if you go:
 
  CREATE INDEX annuncio400_rubric_testo_idx ON annuncio400(rubric,
  LOWER(testo));
 
  or even just:
 
  CREATE INDEX annuncio400_rubric_testo_idx ON annuncio400(LOWER(testo));
 
 I wasn't able to make this 2 field index with lower:

 dba400=# CREATE INDEX annuncio400_rubric_testo_idx ON
 annuncio400(rubric, LOWER(testo));
 ERROR:  parser: parse error at or near "(" at character 71

That's a 7.4 feature, I think (and I think the version with two columns
may need extra parens around the lower()). I think the only way to do
something equivalent in 7.3 is to make a function that concatenates the
two in some fashion, after having applied lower to the one part, and
then use that in the queries as well. Plus, if you're not in C
locale, I'm not sure that it'd help in 7.3 anyway.

  But the strangest thing ever is that if I change the filter with
  another one that represent a smaller amount of data  it uses the
  index scan!!!
 
 
  What's strange about that?  The less data is going to be retrieved,
  the more likely postgres is to use the index.
 
 can't understand this policy:

 dba400=# SELECT count(*) from annuncio400 where rubric='DD';
  count
 ---
   6753
 (1 row)

 dba400=# SELECT count(*) from annuncio400 where rubric='MA';
  count
 ---
   2165
 (1 row)

 so it's using the index on 2000 rows and not for 6000?  it's not that
 big difference, isn't it?

It's a question of how many pages it thinks it's going to have to retrieve
in order to handle the request.  If it, say, needs (or thinks it needs) to
retrieve 50% of the pages, then given a random_page_cost of 4, it's going
to expect the index scan to be about twice the cost.
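That 50%-of-pages arithmetic works out like this (toy numbers, assuming sequential page fetches are costed at 1 per page):

```python
# Toy arithmetic behind "about twice the cost": fetching half the table's
# pages in random order, at random_page_cost = 4, versus reading all of
# them sequentially at cost 1 per page.
total_pages = 1000           # hypothetical table size in pages
fraction_fetched = 0.5       # planner thinks it must visit 50% of the pages
random_page_cost = 4.0       # the 7.3/7.4 default
seq_page_cost = 1.0

seq_cost = total_pages * seq_page_cost                          # 1000.0
index_cost = total_pages * fraction_fetched * random_page_cost  # 2000.0
print(index_cost / seq_cost)  # 2.0
```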

Generally speaking one good way to compare is to try the query with
explain analyze and then change parameters like enable_seqscan and try the
query with explain analyze again and compare the estimated rows and costs.
That'll give an idea of how it expects the two versions of the query to
compare speed wise.



Re: [PERFORM] Wierd context-switching issue on Xeon

2004-04-21 Thread Dave Cramer
After some testing: if you use the current HEAD code for s_lock.c, which
has some mods in it to alleviate this situation, and change
SPINS_PER_DELAY to 10, you can drastically reduce the cs with Tom's test.
I am seeing a slight degradation in throughput using pgbench -c 10 -t
1000, but it might be liveable, considering the alternative is unbearable
in some situations.
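The spin-then-delay behaviour being tuned here can be sketched as a toy model (Python stand-in only; the real implementation is C in s_lock.c, and the 10 ms delay there is done via select()):

```python
import threading
import time

class SpinThenSleepLock:
    """Toy model of the s_lock loop: spin on a test-and-set a bounded
    number of times, then back off with a short sleep instead of
    burning CPU (and thrashing the bus) while contending."""

    SPINS_PER_DELAY = 10  # the value being experimented with above

    def __init__(self):
        # A non-blocking Lock.acquire stands in for the TAS instruction.
        self._flag = threading.Lock()

    def acquire(self):
        spins = 0
        while not self._flag.acquire(blocking=False):  # "TAS"
            spins += 1
            if spins >= self.SPINS_PER_DELAY:
                spins = 0
                time.sleep(0.01)  # ~10 ms, like the select() delay

    def release(self):
        self._flag.release()

# Two workers hammering the same lock.
counter = 0
lock = SpinThenSleepLock()

def worker():
    global counter
    for _ in range(1000):
        lock.acquire()
        counter += 1
        lock.release()

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 2000
```

A smaller SPINS_PER_DELAY means contended acquirers give up spinning sooner and sleep, trading a little latency for far fewer wasted test-and-set attempts.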

Can anyone else replicate my results?

Dave
On Wed, 2004-04-21 at 08:10, Dirk_Lutzebäck wrote:
 It is intended to run indefinitely.
 
 Dirk
 
 [EMAIL PROTECTED] wrote:
 
 How long is this test supposed to run?
 
 I've launched just 1 for testing, the plan seems horrible; the test is cpu
 bound and hasn't finished yet after 17:02 min of CPU time, dual XEON 2.6G
 Unixware 713
 
 The machine is a Fujitsu-Siemens TX 200 server
   
 
 
 
 
 
 
 
 
 
-- 
Dave Cramer
519 939 0336
ICQ # 14675561




[PERFORM] Help understanding stat tables

2004-04-21 Thread Chris Hoover
I just want to make sure that I am interpreting this data correctly.

From pg_statio_user_tables, I have pulled relname, heap_blks_read, 
heap_blks_hit.  I get several rows like this:
 relname | heap_blks_read | heap_blks_hit
---------+----------------+---------------
  clmhdr |        8607161 |     196547165


So this means that I am getting over a 100% cache hit ratio for this table, 
right?  If not, please help me understand what these numbers mean.

Thanks,

Chris




[PERFORM] Shared buffers, Sort memory, Effective Cache Size

2004-04-21 Thread Frédéric Robinet
Hello,

I have a dual-PIII server with 2 GB of RAM running Debian and PostgreSQL
7.4. What are the best settings for shared_buffers, sort_mem and
effective_cache_size?

My main database have a small/mid range size: some tables may have 1 or 2
millions of records.

Thanks

Frédéric Robinet
[EMAIL PROTECTED]




Re: [PERFORM] Help understanding stat tables

2004-04-21 Thread Chris Hoover
I think I have figured my problem out.

I was taking heap_blks_hit / heap_blks_read for my hit pct.

It should be heap_blks_hit/(heap_blks_read+heap_blks_hit), correct?
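Plugging the clmhdr numbers from the first post into the corrected formula as a quick sanity check:

```python
# Buffer-cache hit ratio for clmhdr using the corrected formula.
heap_blks_read = 8607161    # block reads that went to the kernel/disk
heap_blks_hit = 196547165   # block reads satisfied from shared buffers

hit_pct = heap_blks_hit / (heap_blks_read + heap_blks_hit)
print(f"{hit_pct:.1%}")  # 95.8%
```

So the ratio comes out just under 100%, as it must, rather than the impossible >100% figure from dividing hits by reads.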

Thanks
On Wednesday 21 April 2004 11:34, Chris Hoover wrote:
 I just want to make sure that I am interpreting this data correctly.

 From pg_statio_user_tables, I have pulled relname, heap_blks_read,
 heap_blks_hit.  I get several rows like this:
 relname   heap_blks_read  heap_blks_hit
  clmhdr   8607161 196547165


 So this means that I am getting over a 100% cache hit ratio for this table,
 right?  If not, please help me understand what these numbers mean.

 Thanks,

 Chris






Re: [PERFORM] Wierd context-switching issue on Xeon

2004-04-21 Thread Josh Berkus
Dave,

 After some testing if you use the current head code for s_lock.c which
 has some mods in it to alleviate this situation, and change
 SPINS_PER_DELAY to 10 you can drastically reduce the cs with tom's test.
 I am seeing a slight degradation in throughput using pgbench -c 10 -t
 1000 but it might be liveable, considering the alternative is unbearable
 in some situations.

 Can anyone else replicate my results?

Can you produce a patch against 7.4.1?   I'd like to test your fix against a 
real-world database.


-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-21 Thread Josh Berkus
Folks,

I've sent a polite e-mail to Mr. Gomez offering our help.  Please, nobody 
flame him!

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] slow seqscan

2004-04-21 Thread Tom Lane
Edoardo Ceccarelli [EMAIL PROTECTED] writes:
 I wasn't able to make this 2 field index with lower:

 dba400=# CREATE INDEX annuncio400_rubric_testo_idx ON 
 annuncio400(rubric, LOWER(testo));
 ERROR:  parser: parse error at or near "(" at character 71

 seems impossible to creat 2 field indexes with lower function.

You need 7.4 to do that; previous releases don't support multi-column
functional indexes.

regards, tom lane



Re: [PERFORM] Wierd context-switching issue on Xeon

2004-04-21 Thread Paul Tuckfield
Dave:

Why would test-and-set increase context switches?
Note that it *does not increase* context switches when the two threads 
are on the two cores of a single Xeon processor. (Use taskset to force 
affinity on Linux.)

Scenario:
If the two test-and-set processes are testing and setting the same bit 
as each other, then they'll see worst-case cache coherency misses. 
They'll ping a cache line back and forth between CPUs. In another case, 
they might be testing and setting different bits or words, but those 
bits or words are always in the same cache line, again causing 
worst-case cache coherency misses. The fact that this doesn't happen 
when the threads are bound to the 2 cores of a single Xeon suggests 
it's because they're now sharing L1 cache: no pings/bounces.

I wonder: do the threads stall so badly when pinging cache lines back 
and forth that the kernel sees it as an opportunity to put the process 
to sleep? Or do these worst-case misses cause an interrupt?

My question is: what is it that the two threads are waiting for when they 
spin? Is it exactly the same resource, or two resources that happen to 
have test-and-set flags in the same cache line?

On Apr 20, 2004, at 7:41 PM, Dave Cramer wrote:

I modified the code in s_lock.c to remove the spins

#define SPINS_PER_DELAY 1

and it doesn't exhibit the behaviour

This effectively changes the code to

while(TAS(lock))
select(1); // 10ms
Can anyone explain why executing TAS 100 times would increase context
switches ?
Dave

On Tue, 2004-04-20 at 12:59, Josh Berkus wrote:
Anjan,

Quad 2.0GHz XEON with highest load we have seen on the applications, 
DB
performing great -
Can you run Tom's test?   It takes a particular pattern of data 
access to
reproduce the issue.
--
Dave Cramer
519 939 0336
ICQ # 14675561





Re: [PERFORM] Wierd context-switching issue on Xeon

2004-04-21 Thread Tom Lane
Paul Tuckfield [EMAIL PROTECTED] writes:
 I wonder do the threads stall so badly when pinging cache lines back 
 and forth,  that the kernel sees it as an opportunity to put the 
 process to sleep? or do these worst case misses cause an interrupt?

No; AFAICS the kernel could not even be aware of that behavior.

The context swap storm is happening because of contention at the next
level up (LWLocks rather than spinlocks).  It could be an independent
issue that just happens to be triggered by the same sort of access
pattern.  I put forward a hypothesis that the cache miss storm caused by
the test-and-set ops induces the context swap storm by making the code
more likely to be executing in certain places at certain times ... but
it's only a hypothesis.

Yesterday evening I had pretty well convinced myself that they were
indeed independent issues: profiling on a single-CPU machine was telling
me that the test case I proposed spends over 10% of its time inside
ReadBuffer, which certainly seems like enough to explain a high rate of
contention on the BufMgrLock, without any assumptions about funny
behavior at the hardware level.  However, your report and Dave's suggest
that there really is some linkage.  So I'm still confused.

regards, tom lane



Re: [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-21 Thread Matthew T. O'Connor
Paul Thomas wrote:

Looks like he's using the default postgresql.conf settings in which 
case I'm not surprised at pg looking so slow. His stated use of foreign 
keys invalidates the tests anyway as MyISAM tables don't support FKs 
so we're probably seeing FK check overheads in pg that are simply 
ignored by MySQL. In an honest test, MySQL should be reported as 
failing those tests.


Either failures, or they should not have been using MyISAM, they should 
have used the table format that supports FK's.  This is just not apples 
to apples.



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Help understanding stat tables

2004-04-21 Thread Tom Lane
Chris Hoover [EMAIL PROTECTED] writes:
 I was taking heap_blks_hit / heap_blks_read for my hit pct.
 It should be heap_blks_hit/(heap_blks_read+heap_blks_hit), correct?

Right.
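In other words (illustrative sketch only; function name and example numbers are made up):

```c
#include <assert.h>

/* Buffer cache hit percentage, computed the corrected way: hits
 * divided by total block requests (hits + reads), not hits divided
 * by reads alone. */
static double hit_pct(long heap_blks_hit, long heap_blks_read)
{
    long total = heap_blks_hit + heap_blks_read;

    return (total == 0) ? 0.0 : 100.0 * (double) heap_blks_hit / (double) total;
}
/* e.g. hit_pct(9000, 1000) -> 90.0; hit/read alone would misleadingly give 9 (900%) */
```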

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Wierd context-switching issue on Xeon

2004-04-21 Thread Dave Cramer
FYI,

I am doing my testing on non hyperthreading dual athlons. 

Also, the test and set is attempting to set the same resource, and not
simply a bit. It's really a lock;xchg in assembler.

Also we are using the PAUSE mnemonic, so we should not be seeing any
cache coherency issues, as the cache is being taken out of the picture
AFAICS?
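For readers following along, a minimal sketch of the primitive being discussed, using GCC's __sync builtins rather than PostgreSQL's actual s_lock.h (illustrative only; the real implementation differs):

```c
#include <assert.h>

/* Sketch of a test-and-set spinlock.  __sync_lock_test_and_set
 * compiles to "lock; xchg" on x86; the PAUSE mnemonic ("rep; nop")
 * is issued while spinning so a hyperthreaded sibling is not starved
 * and cache-coherency traffic eases. */
typedef volatile unsigned char slock_t;

static int tas(slock_t *lock)
{
    /* returns previous value: 0 = we acquired it, 1 = already held */
    return (int) __sync_lock_test_and_set(lock, 1);
}

static void cpu_pause(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("rep; nop");   /* the PAUSE instruction */
#endif
}

static void spin_lock(slock_t *lock)
{
    while (tas(lock))
        cpu_pause();
}

static void spin_unlock(slock_t *lock)
{
    __sync_lock_release(lock);
}
```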

Dave

On Wed, 2004-04-21 at 14:19, Paul Tuckfield wrote:
 Dave:
 
 Why would test and set increase context swtches:
 Note that it *does not increase* context swtiches when the two threads 
 are on the two cores of a single Xeon processor. (use taskset to force 
 affinity on linux)
 
 Scenario:
 If the two test and set processes are testing and setting the same bit 
 as each other, then they'll see worst case cache coherency misses.  
 They'll ping a cache line back and forth between CPUs.  Another case 
 might be that they're testing and setting different bits or words, but 
 those bits or words are always in the same cache line, again causing 
 worst case cache coherency and misses.  The fact that this doesn't 
 happen when the threads are bound to the 2 cores of a single Xeon 
 suggests it's because they're now sharing L1 cache. No pings/bounces.
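The taskset experiment described above can also be done from inside the program; a Linux-specific sketch using sched_setaffinity (the choice of logical CPUs 0 and 1 as the two siblings of one package is an assumption):

```c
#define _GNU_SOURCE
#include <sched.h>

/* Pin the calling process to logical CPUs 0 and 1 -- e.g. the two
 * hyperthread siblings of one physical Xeon -- the same effect as
 * `taskset -c 0,1`.  Returns 0 on success, -1 on failure.
 * Linux-specific sketch; which CPU numbers share a package varies. */
static int pin_to_cpus01(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(0, &set);
    CPU_SET(1, &set);
    /* pid 0 means "this process" */
    return sched_setaffinity(0, sizeof(set), &set);
}
```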
 
 
 I wonder do the threads stall so badly when pinging cache lines back 
 and forth,  that the kernel sees it as an opportunity to put the 
 process to sleep? or do these worst case misses cause an interrupt?
 
 My question is:  What is it that the two threads waiting for when they 
 spin? Is it exactly the same resource, or two resources that happen to 
 have test-and-set flags in the same cache line?
 
 On Apr 20, 2004, at 7:41 PM, Dave Cramer wrote:
 
  I modified the code in s_lock.c to remove the spins
 
  #define SPINS_PER_DELAY 1
 
  and it doesn't exhibit the behaviour
 
  This effectively changes the code to
 
 
  while(TAS(lock))
  select(1); // 10ms
 
  Can anyone explain why executing TAS 100 times would increase context
  switches ?
 
  Dave
 
 
  On Tue, 2004-04-20 at 12:59, Josh Berkus wrote:
  Anjan,
 
  Quad 2.0GHz XEON with highest load we have seen on the applications, 
  DB
  performing great -
 
  Can you run Tom's test?   It takes a particular pattern of data 
  access to
  reproduce the issue.
  -- 
  Dave Cramer
  519 939 0336
  ICQ # 14675561
 
 
  ---(end of 
  broadcast)---
  TIP 8: explain analyze is your friend
 
 
 
 ---(end of broadcast)---
 TIP 9: the planner will ignore your desire to choose an index scan if your
   joining column's datatypes do not match
 
 
 
 
 
-- 
Dave Cramer
519 939 0336
ICQ # 14675561


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-21 Thread Jan Wieck
Josh Berkus wrote:

Folks,

I've sent a polite e-mail to Mr. Gomez offering our help.  Please, nobody 
flame him!

Please keep in mind that the entire test has, other than a similar 
database schema and query types maybe, nothing to do with a TPC-H. I 
don't see any kind of SUT. Foreign key support on the DB level is not 
required by any of the TPC benchmarks. But the System Under Test, which 
is the combination of middleware application and database together with 
all computers and network components these parts are running on, must 
implement all the required semantics, like ACID properties, referential 
integrity &c. One could implement a TPC-H with flat files, it's just a
major pain in the middleware.

A proper TPC benchmark implementation would for example be a complete 
PHP+DB application, where the user interaction is done by an emulated 
browser and what is measured is the http response times, not anything 
going on between PHP and the DB. Assuming that all requirements of the 
TPC specification are implemented by either using available DB features, 
or including appropriate workarounds in the PHP code, that would very 
well lead to something that can compare PHP+MySQL vs. PHP+PostgreSQL.

All TPC benchmarks I have seen are performed by timing such a system 
after a considerable rampup time, giving the DB system a chance to 
properly populate caches and so forth. Rebooting the machine just before 
the test is the wrong thing here and will especially kill any advanced 
cache algorithms like ARC.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Wierd context-switching issue on Xeon patch for 7.4.1

2004-04-21 Thread Dave Cramer
attached.
-- 
Dave Cramer
519 939 0336
ICQ # 14675561
Index: backend/storage/lmgr/s_lock.c
===
RCS file: /usr/local/cvs/pgsql-server/src/backend/storage/lmgr/s_lock.c,v
retrieving revision 1.16
diff -c -r1.16 s_lock.c
*** backend/storage/lmgr/s_lock.c	8 Aug 2003 21:42:00 -	1.16
--- backend/storage/lmgr/s_lock.c	21 Apr 2004 20:27:34 -
***************
*** 76,82 ****
  	 * The select() delays are measured in centiseconds (0.01 sec) because 10
  	 * msec is a common resolution limit at the OS level.
  	 */
! #define SPINS_PER_DELAY		100
  #define NUM_DELAYS			1000
  #define MIN_DELAY_CSEC		1
  #define MAX_DELAY_CSEC		100
--- 76,82 ----
  	 * The select() delays are measured in centiseconds (0.01 sec) because 10
  	 * msec is a common resolution limit at the OS level.
  	 */
! #define SPINS_PER_DELAY		10
  #define NUM_DELAYS			1000
  #define MIN_DELAY_CSEC		1
  #define MAX_DELAY_CSEC		100
***************
*** 88,93 ****
--- 88,94 ----
  
  	while (TAS(lock))
  	{
+ 		__asm__ __volatile__ ( "rep;nop": : :"memory");
  		if (++spins > SPINS_PER_DELAY)
  		{
  			if (++delays > NUM_DELAYS)
Index: include/storage/s_lock.h
===
RCS file: /usr/local/cvs/pgsql-server/src/include/storage/s_lock.h,v
retrieving revision 1.115.2.1
diff -c -r1.115.2.1 s_lock.h
*** include/storage/s_lock.h	4 Nov 2003 09:43:56 -	1.115.2.1
--- include/storage/s_lock.h	21 Apr 2004 20:26:25 -
***************
*** 103,110 ****
  	register slock_t _res = 1;
  
  	__asm__ __volatile__(
! 			"lock			\n"
 			"xchgb	%0,%1	\n"
 :		"=q"(_res), "=m"(*lock)
 :		"0"(_res));
  	return (int) _res;
--- 103,113 ----
  	register slock_t _res = 1;
  
  	__asm__ __volatile__(
! 		   "cmpb $0,%1  \n"
! 		   "jne 1f  \n"
 			"lock		\n"
 			"xchgb	%0,%1	\n"
+ 		   "1:\n"
 :		"=q"(_res), "=m"(*lock)
 :		"0"(_res));
  	return (int) _res;

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Moving postgres to FC disks

2004-04-21 Thread Cott Lang
On Tue, 2004-04-20 at 17:27, Joshua D. Drake wrote:

  -Currently, the internal RAID volume is ext3 filesystem. Any
  recommendations for the filesystem on the new FC volume? Reiserfs?
  
  
 XFS

What Linux distributions are popular in here for PG+XFS?  

I'm very disappointed that Redhat Enterprise 3 doesn't appear to support
XFS/JFS, or anything else. Suse Server 8 seems very dated, at least from
the eval I downloaded. I'm curious as to where other people have gone
with the death of RH9. I'd have gone on to Redhat 3 if I wasn't
interested in getting some of the benefits of XFS at the same time ...




---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Wierd context-switching issue on Xeon

2004-04-21 Thread Tom Lane
Kenneth Marshall [EMAIL PROTECTED] writes:
 If the context swap storm derives from LWLock contention, maybe using
 a random order to assign buffer locks in buf_init.c would prevent
 simple adjacency of buffer allocation to cause the storm.

Good try, but no cigar ;-).  The test cases I've been looking at take
only shared locks on the per-buffer locks, so that's not where the
context swaps are coming from.  The swaps have to be caused by the
BufMgrLock, because that's the only exclusive lock being taken.

I did try increasing the allocated size of the spinlocks to 128 bytes
to see if it would do anything.  It didn't ...
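For reference, padding each spinlock to its own cache line, as in the 128-byte experiment mentioned above, might look roughly like this (GCC-specific attribute; the sizes are illustrative, not PostgreSQL's actual layout):

```c
#include <assert.h>
#include <stddef.h>

/* Give each spinlock a private cache line: pad the lock out to 128
 * bytes (two 64-byte lines, covering the P4's 128-byte sector) and
 * align the array accordingly, so adjacent locks can never share a
 * line and ping-pong between CPUs. */
#define CACHE_PAD 128

typedef struct
{
    volatile unsigned char lock;
    char                   pad[CACHE_PAD - sizeof(unsigned char)];
} __attribute__((aligned(CACHE_PAD))) padded_slock_t;

static padded_slock_t locks[16];
```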

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Wierd context-switching issue on Xeon patch for 7.4.1

2004-04-21 Thread Tom Lane
Dave Cramer [EMAIL PROTECTED] writes:
 diff -c -r1.16 s_lock.c
 *** backend/storage/lmgr/s_lock.c 8 Aug 2003 21:42:00 -   1.16
 --- backend/storage/lmgr/s_lock.c 21 Apr 2004 20:27:34 -
  ***************
  *** 76,82 ****
* The select() delays are measured in centiseconds (0.01 sec) because 10
* msec is a common resolution limit at the OS level.
*/
 ! #define SPINS_PER_DELAY 100
   #define NUM_DELAYS  1000
   #define MIN_DELAY_CSEC  1
   #define MAX_DELAY_CSEC  100
  --- 76,82 ----
* The select() delays are measured in centiseconds (0.01 sec) because 10
* msec is a common resolution limit at the OS level.
*/
 ! #define SPINS_PER_DELAY 10
   #define NUM_DELAYS  1000
   #define MIN_DELAY_CSEC  1
   #define MAX_DELAY_CSEC  100


As far as I can tell, this does reduce the rate of semop's
significantly, but it does so by bringing the overall processing rate
to a crawl :-(.  I see 97% CPU idle time when using this patch.
I believe what is happening is that the select() delay in s_lock.c is
being hit frequently because the spin loop isn't allowed to run long
enough to let the other processor get out of the spinlock.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] Wierd context-switching issue on Xeon patch for 7.4.1

2004-04-21 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 For BSDOS it has:

 #if (CLIENT_OS == OS_FREEBSD) || (CLIENT_OS == OS_BSDOS) || \
 (CLIENT_OS == OS_OPENBSD) || (CLIENT_OS == OS_NETBSD)
 { /* comment out if inappropriate for your *bsd - cyp (25/may/1999) */
   int ncpus; size_t len = sizeof(ncpus);
   int mib[2]; mib[0] = CTL_HW; mib[1] = HW_NCPU;
   if (sysctl( &mib[0], 2, &ncpus, &len, NULL, 0 ) == 0)
   //if (sysctlbyname("hw.ncpu", &ncpus, &len, NULL, 0 ) == 0)
 cpucount = ncpus;
 }

Multiplied by how many platforms?  Ewww...
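For comparison, many modern Unixen expose the CPU count through a single sysconf() call, though it is not universal, which is rather the point being made (hedged sketch):

```c
#include <unistd.h>

/* Many (not all) Unixen report the online-CPU count via sysconf(),
 * sidestepping the per-platform sysctl soup quoted above on systems
 * that support it.  Falls back to assuming a uniprocessor. */
static long count_cpus(void)
{
    long n = -1;

#ifdef _SC_NPROCESSORS_ONLN
    n = sysconf(_SC_NPROCESSORS_ONLN);
#endif
    return (n > 0) ? n : 1;
}
```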

I was wondering about some sort of dynamic adaptation, roughly along the
lines of whenever a spin loop successfully gets the lock after
spinning, decrease the allowed loop count by one; whenever we fail to
get the lock after spinning, increase by 100; if the loop count reaches,
say, 10000, decide we are on a uniprocessor and irreversibly set it to
1.  As written this would tend to incur a select() delay once per
hundred spinlock acquisitions, which is way too much, but I think we
could make it work with a sufficiently slow adaptation rate.  The tricky
part is that a slow adaptation rate means we can't have every backend
figuring this out for itself --- the right value would have to be
maintained globally, and I'm not sure how to do that without adding a
lot of overhead.
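A sketch of the adaptation rule as described, with the global-maintenance question deliberately ignored (all names and constants below are illustrative, not proposed code):

```c
/* Per-backend adaptive spin limit: success after spinning allows one
 * fewer spin next time; failure (falling through to the select()
 * delay) allows 100 more.  On a uniprocessor spinning never succeeds,
 * so the count climbs; past a high-water mark we conclude
 * "uniprocessor" and pin the limit at 1 for good. */
enum { UNIPROCESSOR_THRESHOLD = 10000 };

static int spins_per_delay = 100;
static int known_uniprocessor = 0;

static void spin_succeeded(void)
{
    if (!known_uniprocessor && spins_per_delay > 1)
        spins_per_delay--;
}

static void spin_failed(void)
{
    if (known_uniprocessor)
        return;
    spins_per_delay += 100;
    if (spins_per_delay >= UNIPROCESSOR_THRESHOLD)
    {
        known_uniprocessor = 1;
        spins_per_delay = 1;    /* don't bother spinning again */
    }
}
```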

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend