[PERFORM] Wrong plan: sequential scan instead of an index one
Hi all,
take a look at these plans:

test=# explain analyze SELECT COUNT(id) FROM t_oa_2_00_card WHERE pvcp in (select id from l_pvcp where value ilike '%pi%');
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=154279.01..154279.01 rows=1 width=8) (actual time=4010.094..4010.096 rows=1 loops=1)
   ->  Hash IN Join  (cost=2.22..153835.49 rows=177404 width=8) (actual time=2.908..4001.814 rows=7801 loops=1)
         Hash Cond: ("outer".pvcp = "inner".id)
         ->  Seq Scan on t_oa_2_00_card  (cost=0.00..147670.82 rows=877682 width=12) (actual time=0.030..2904.522 rows=877682 loops=1)
         ->  Hash  (cost=2.17..2.17 rows=19 width=4) (actual time=0.093..0.093 rows=1 loops=1)
               ->  Seq Scan on l_pvcp  (cost=0.00..2.17 rows=19 width=4) (actual time=0.066..0.081 rows=1 loops=1)
                     Filter: (value ~~* '%pi%'::text)
 Total runtime: 4010.413 ms
(8 rows)

test=# explain analyze SELECT COUNT(id) FROM t_oa_2_00_card WHERE pvcp in (select id from l_pvcp where value ilike 'pi');
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=93540.82..93540.83 rows=1 width=8) (actual time=55.333..55.334 rows=1 loops=1)
   ->  Nested Loop  (cost=84.60..93447.44 rows=37348 width=8) (actual time=2.730..46.770 rows=7801 loops=1)
         ->  HashAggregate  (cost=2.18..2.22 rows=4 width=4) (actual time=0.089..0.092 rows=1 loops=1)
               ->  Seq Scan on l_pvcp  (cost=0.00..2.17 rows=4 width=4) (actual time=0.065..0.081 rows=1 loops=1)
                     Filter: (value ~~* 'pi'::text)
         ->  Bitmap Heap Scan on t_oa_2_00_card  (cost=82.42..23216.95 rows=11548 width=12) (actual time=2.633..29.566 rows=7801 loops=1)
               Recheck Cond: (t_oa_2_00_card.pvcp = "outer".id)
               ->  Bitmap Index Scan on i3_t_oa_2_00_card  (cost=0.00..82.42 rows=11548 width=0) (actual time=2.050..2.050 rows=7801 loops=1)
                     Index Cond: (t_oa_2_00_card.pvcp = "outer".id)
 Total runtime: 55.454 ms
(10 rows)

Isn't it too much to choose a sequential scan due to 19 estimated rows, when with 4 estimated rows it does a correct index scan?

Regards
Gaetano Mendola
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
> Hi all,
> take a look at those plans:

Try changing random_page_cost from the default 4 to 2 in postgresql.conf:

random_page_cost = 2

The default in PostgreSQL is somewhat conservative. This setting tells
PostgreSQL how fast your disks are; the lower the value, the faster random
reads are assumed to be.

Could this setting be changed to 2 as the default rather than 4?

regards
Claus
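For example, the setting can be tried for a single session before touching
postgresql.conf (a quick sketch, reusing the query from the original post):

    SET random_page_cost = 2;
    EXPLAIN ANALYZE
    SELECT COUNT(id) FROM t_oa_2_00_card
    WHERE pvcp IN (SELECT id FROM l_pvcp WHERE value ILIKE '%pi%');
    RESET random_page_cost;  -- back to the server-wide setting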
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Gaetano Mendola wrote:
> Hi all, take a look at those plans:
>
> test=# explain analyze SELECT COUNT(id) FROM t_oa_2_00_card WHERE
> pvcp in (select id from l_pvcp where value ilike '%pi%');
>
>   ->  Hash IN Join  (cost=2.22..153835.49 rows=177404 width=8)
>       (actual time=2.908..4001.814 rows=7801 loops=1)
>         Hash Cond: ("outer".pvcp = "inner".id)
>
> Isn't it too much to choose a sequential scan due to 19 estimated rows,
> when with 4 estimated rows it does a correct index scan?

I don't think it's the matches on l_pvcp that's the problem, it's the
fact that it thinks it's getting 177404 rows matching the IN.

Now, why 19 rows from the subquery should produce such a large estimate
in the outer query I'm not sure. Any strange distribution of values on
pvcp?

--
Richard Huxton
Archonet Ltd
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Claus Guttesen wrote:
>> Hi all,
>> take a look at those plans:
>
> Try changing random_page_cost from the default 4 to 2 in postgresql.conf:
>
> random_page_cost = 2
>
> The default in PostgreSQL is somewhat conservative. This setting tells
> PostgreSQL how fast your disks are; the lower the value, the faster
> random reads are assumed to be.
>
> Could this setting be changed to 2 as the default rather than 4?

I have already tuned that number to 2.5; lowering it to 2 doesn't change
the plan.

Regards
Gaetano Mendola
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Richard Huxton wrote:
> Gaetano Mendola wrote:
>> Hi all, take a look at those plans:
>>
>> test=# explain analyze SELECT COUNT(id) FROM t_oa_2_00_card WHERE
>> pvcp in (select id from l_pvcp where value ilike '%pi%');
>>
>>   ->  Hash IN Join  (cost=2.22..153835.49 rows=177404 width=8)
>>       (actual time=2.908..4001.814 rows=7801 loops=1)
>>         Hash Cond: ("outer".pvcp = "inner".id)
>>
>> Isn't it too much to choose a sequential scan due to 19 estimated rows,
>> when with 4 estimated rows it does a correct index scan?
>
> I don't think it's the matches on l_pvcp that's the problem, it's the
> fact that it thinks it's getting 177404 rows matching the IN.
>
> Now, why 19 rows from the subquery should produce such a large estimate
> in the outer query I'm not sure. Any strange distribution of values on
> pvcp?

I don't know what you mean by strange; this is the distribution:

test=# select count(*) from t_oa_2_00_card;
 count
--------
 877682
(1 row)

test=# select count(*), pvcp from t_oa_2_00_card group by pvcp;
 count | pvcp
-------+------
    13 |
     2 |   94
    57 |   93
   250 |   90
  8158 |   89
  4535 |   88
  3170 |   87
 13711 |   86
  5442 |   85
  2058 |   84
    44 |   83
     1 |   82
     4 |   80
     1 |   79
 14851 |   78
 12149 |   77
   149 |   76
     9 |   75
     4 |   74
     2 |   73
     5 |   72
 28856 |   71
 12847 |   70
  8183 |   69
 11246 |   68
  9232 |   67
 14433 |   66
 13970 |   65
  3616 |   64
  2996 |   63
  7801 |   62
  3329 |   61
   949 |   60
 35168 |   59
 18752 |   58
  1719 |   57
  1031 |   56
  1585 |   55
  2125 |   54
  9007 |   53
 22060 |   52
  2800 |   51
  5629 |   50
 16970 |   49
  8254 |   48
 11448 |   47
 20253 |   46
  3637 |   45
 13876 |   44
 19002 |   43
 17940 |   42
  5022 |   41
 24478 |   40
  2374 |   39
  4885 |   38
  3779 |   37
  3532 |   36
 11783 |   35
 15843 |   34
 14546 |   33
 29171 |   32
  5048 |   31
 13411 |   30
  6746 |   29
   375 |   28
  9244 |   27
 10577 |   26
 36096 |   25
  3827 |   24
 29497 |   23
 20362 |   22
  8068 |   21
  2936 |   20
   661 |   19
  8224 |   18
  3016 |   17
  7731 |   16
  8792 |   15
  4486 |   14
     3 |   13
  6859 |   12
  4576 |   11
 13377 |   10
 14578 |    9
  6991 |    8
 52714 |    7
  6477 |    6
 11445 |    5
 24690 |    4
 10522 |    3
  2917 |    2
 34694 |    1
(92 rows)

I think that estimate is something like 877682 / 92 * 19 (about 181,000,
which is close to the planner's 177404).

Regards
Gaetano Mendola
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Gaetano Mendola wrote:
> Richard Huxton wrote:
>> Now, why 19 rows from the subquery should produce such a large estimate
>> in the outer query I'm not sure. Any strange distribution of values on
>> pvcp?
>
> I don't know what you mean by strange; this is the distribution:
>
> test=# select count(*) from t_oa_2_00_card;
>  count
> --------
>  877682
> (1 row)
>
> test=# select count(*), pvcp from t_oa_2_00_card group by pvcp;
>  count | pvcp
> -------+------
> (92 rows)
>
> I think that estimate is something like 877682 / 92 * 19

So if you actually had 19 matches for '%pi%' it might be a sensible plan
then. I'm afraid I don't know of any way to improve PG's prediction on
how many matches you'll get for a substring pattern, though.

--
Richard Huxton
Archonet Ltd
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
I don't know about Postgres, but in Oracle it could be better to write:

SELECT COUNT(distinct c.id)
FROM t_oa_2_00_card c, l_pvcp l
WHERE l.value ilike '%pi%'
  and c.pvcp = l.id;

or

SELECT COUNT(c.id)
FROM t_oa_2_00_card c,
     (select distinct id from l_pvcp where value ilike '%pi%') l
WHERE c.pvcp = l.id;

depending on how many rows, and what kind of rows, are in the l_pvcp
table. Having an index on t_oa_2_00_card.pvcp can slow queries in Oracle.

Ismo

On Fri, 30 Mar 2007, Gaetano Mendola wrote:

> Richard Huxton wrote:
> > Now, why 19 rows from the subquery should produce such a large
> > estimate in the outer query I'm not sure. Any strange distribution
> > of values on pvcp?
>
> I don't know what you mean by strange; this is the distribution:
>
> [92-row distribution table snipped]
>
> I think that estimate is something like 877682 / 92 * 19
>
> Regards
> Gaetano Mendola
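In PostgreSQL, a semantically similar rewrite uses EXISTS; whether the
planner does any better with it is not guaranteed, so this is only a
sketch reusing the table names from this thread:

    SELECT COUNT(c.id)
    FROM t_oa_2_00_card c
    WHERE EXISTS (SELECT 1
                  FROM l_pvcp l
                  WHERE l.id = c.pvcp
                    AND l.value ILIKE '%pi%');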
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
On Fri, Mar 30, 2007 at 12:08:26PM +0200, Gaetano Mendola wrote:
> Claus Guttesen wrote:
> > Try changing random_page_cost from the default 4 to 2 in postgresql.conf:
> >
> > random_page_cost = 2
>
> I have already tuned that number to 2.5; lowering it to 2 doesn't change
> the plan.

The following 19-fold overestimate is influencing the rest of the plan:

  ->  Seq Scan on l_pvcp  (cost=0.00..2.17 rows=19 width=4) (actual time=0.066..0.081 rows=1 loops=1)
        Filter: (value ~~* '%pi%'::text)

Have you tried increasing the statistics target on l_pvcp.value?
I ran your queries against canned data in 8.2.3 and better statistics
resulted in more accurate row count estimates for this and other
parts of the plan. I don't recall if estimates for non-leading-character
matches in earlier versions can benefit from better statistics.

--
Michael Fuhr
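For a quick experiment the target can also be raised session-wide instead
of per-column (a sketch; 100 is an arbitrary trial value, and a per-column
target set with ALTER TABLE would override it):

    SET default_statistics_target = 100;
    ANALYZE l_pvcp;
    EXPLAIN ANALYZE SELECT id FROM l_pvcp WHERE value ILIKE '%pi%';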
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
On Fri, Mar 30, 2007 at 04:46:11AM -0600, Michael Fuhr wrote:
> Have you tried increasing the statistics target on l_pvcp.value?
> I ran your queries against canned data in 8.2.3 and better statistics
> resulted in more accurate row count estimates for this and other
> parts of the plan. I don't recall if estimates for non-leading-character
> matches in earlier versions can benefit from better statistics.

This might work only in 8.2. I see the following in the Release Notes:

* Improve the optimizer's selectivity estimates for LIKE, ILIKE,
  and regular expression operations (Tom)

--
Michael Fuhr
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Michael Fuhr wrote:
> On Fri, Mar 30, 2007 at 12:08:26PM +0200, Gaetano Mendola wrote:
>> Claus Guttesen wrote:
>>> Try changing random_page_cost from the default 4 to 2 in postgresql.conf:
>>>
>>> random_page_cost = 2
>>
>> I have already tuned that number to 2.5; lowering it to 2 doesn't change
>> the plan.
>
> The following 19-fold overestimate is influencing the rest of the plan:
>
>   ->  Seq Scan on l_pvcp  (cost=0.00..2.17 rows=19 width=4) (actual time=0.066..0.081 rows=1 loops=1)
>         Filter: (value ~~* '%pi%'::text)
>
> Have you tried increasing the statistics target on l_pvcp.value?
> I ran your queries against canned data in 8.2.3 and better statistics
> resulted in more accurate row count estimates for this and other
> parts of the plan. I don't recall if estimates for non-leading-character
> matches in earlier versions can benefit from better statistics.

test=# alter table l_pvcp alter column value set statistics 1000;
ALTER TABLE
test=# analyze l_pvcp;
ANALYZE
test=# explain analyze SELECT COUNT(id) FROM t_oa_2_00_card WHERE pvcp in (select id from l_pvcp where value ilike '%pi%');
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=154321.83..154321.84 rows=1 width=8) (actual time=4948.627..4948.628 rows=1 loops=1)
   ->  Hash IN Join  (cost=2.22..153877.08 rows=177898 width=8) (actual time=2.262..4940.395 rows=7801 loops=1)
         Hash Cond: ("outer".pvcp = "inner".id)
         ->  Seq Scan on t_oa_2_00_card  (cost=0.00..147695.25 rows=880125 width=12) (actual time=0.040..3850.074 rows=877682 loops=1)
         ->  Hash  (cost=2.17..2.17 rows=19 width=4) (actual time=0.073..0.073 rows=1 loops=1)
               ->  Seq Scan on l_pvcp  (cost=0.00..2.17 rows=19 width=4) (actual time=0.052..0.067 rows=1 loops=1)
                     Filter: (value ~~* '%pi%'::text)
 Total runtime: 4948.717 ms
(8 rows)

and nothing changed.

Regards
Gaetano Mendola
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Michael Fuhr wrote:
> On Fri, Mar 30, 2007 at 04:46:11AM -0600, Michael Fuhr wrote:
>> Have you tried increasing the statistics target on l_pvcp.value?
>> I ran your queries against canned data in 8.2.3 and better statistics
>> resulted in more accurate row count estimates for this and other
>> parts of the plan. I don't recall if estimates for non-leading-character
>> matches in earlier versions can benefit from better statistics.
>
> This might work only in 8.2. I see the following in the Release Notes:
>
> * Improve the optimizer's selectivity estimates for LIKE, ILIKE,
>   and regular expression operations (Tom)

I will try the same select on 8.2 (that one was 8.1) and I'll let you know.

Regards
Gaetano Mendola
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Gaetano Mendola wrote:
> The match of 19 for '%pi%' is estimated; the real matches are:
>
> test=# select id from l_pvcp where value ilike '%pi%';
>  id
> ----
>  62
> (1 row)
>
> test=# select id from l_pvcp where value ilike 'pi';
>  id
> ----
>  62
> (1 row)
>
> so one row in both cases; that's why I expect the same plan for both.

Ah, but it's got no way of knowing what matches you'll get for
'%anything%'. There's no easy way to get statistics for matching
substrings.

--
Richard Huxton
Archonet Ltd
Re: [PERFORM] Shared buffers, db transactions committed, and write IO on Solaris
>> You are right in that the page size constraint is lifted in that
>> directio cuts out the VM filesystem cache. However, the Solaris
>> kernel still issues io ops in terms of its logical block size (which
>> we have at the default 8K). It can issue io ops for fragments as
>> small as 1/8th of the block size, but Postgres issues its io requests
>> in terms of the block size, which means that io ops from Postgres
>> will be in 8K chunks, which is exactly what we see when we look at
>> our system io stats. In fact, if any io request is made that isn't a
>> multiple of 512 bytes (the disk sector size), the file system
>> switches back to buffered io.
>
> Oh, yes, of course! Yes, you still need to respect a multiple of 512
> bytes block size on read and write - sorry, I was tired :)
>
> Then it seems to be true - the default XLOG block size is 8K, which
> means that even for a small auto-committed transaction we write 8K?...
> Is there any reason to use such a big default block size?...
>
> Probably it would be a good idea to make it an 'initdb' parameter and
> have such a value per database server?
>
> Rgds,
> -Dimitri
>
> However, to understand the TX number mystery I think the only possible
> solution is to reproduce a small live test:
>
> (I'm sure you're aware you can mount/unmount forcedirectio dynamically?)
>
> During a stable workload do:
>
>   # mount -o remount,logging /path_to_your_filesystem
>
> and check if the I/O volume increases along with the TX numbers,
> then come back:
>
>   # mount -o remount,forcedirectio /path_to_your_filesystem
>
> and see if the I/O volume decreases along with the TX numbers...

That's an excellent idea and I'll run it by the rest of our team tomorrow.

erik jones <[EMAIL PROTECTED]>
software developer
615-296-0838
emma(r)
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Gaetano Mendola wrote:
> Michael Fuhr wrote:
>> On Fri, Mar 30, 2007 at 04:46:11AM -0600, Michael Fuhr wrote:
>>> Have you tried increasing the statistics target on l_pvcp.value?
>>> I ran your queries against canned data in 8.2.3 and better statistics
>>> resulted in more accurate row count estimates for this and other
>>> parts of the plan. I don't recall if estimates for
>>> non-leading-character matches in earlier versions can benefit from
>>> better statistics.
>>
>> This might work only in 8.2. I see the following in the Release Notes:
>>
>> * Improve the optimizer's selectivity estimates for LIKE, ILIKE,
>>   and regular expression operations (Tom)
>
> I will try the same select on 8.2 (that one was 8.1) and I'll let you know.

You will also need to set the statistics for the column to at least 100
to trigger the improved selectivity estimate, if memory serves. Not
enough time to check the code, but Tom could better advise.

Best regards
--
Matteo Beccati
http://phpadsnew.com
http://phppgads.com
Re: [PERFORM] Shared buffers, db transactions committed, and write IO on Solaris
On Mar 30, 2007, at 8:14 AM, Dimitri wrote:
>> You are right in that the page size constraint is lifted in that
>> directio cuts out the VM filesystem cache. However, the Solaris
>> kernel still issues io ops in terms of its logical block size (which
>> we have at the default 8K). It can issue io ops for fragments as
>> small as 1/8th of the block size, but Postgres issues its io requests
>> in terms of the block size, which means that io ops from Postgres
>> will be in 8K chunks, which is exactly what we see when we look at
>> our system io stats. In fact, if any io request is made that isn't a
>> multiple of 512 bytes (the disk sector size), the file system
>> switches back to buffered io.
>
> Oh, yes, of course! Yes, you still need to respect a multiple of 512
> bytes block size on read and write - sorry, I was tired :)
>
> Then it seems to be true - the default XLOG block size is 8K, which
> means that even for a small auto-committed transaction we write 8K?...
> Is there any reason to use such a big default block size?...
>
> Probably it would be a good idea to make it an 'initdb' parameter and
> have such a value per database server?

I believe it's because that is a pretty normal Unix kernel block size
and you want the two to match.

erik jones <[EMAIL PROTECTED]>
software developer
615-296-0838
emma(r)
Re: [PERFORM] Shared buffers, db transactions committed, and write IO on Solaris
The problem is that while your goal is to commit as fast as possible,
it's a pity to waste I/O speed just to keep a common block size... Let's
say your transaction's modifications fit into 512 bytes - you'd be able
to write many more 512-byte blocks per second than 8K blocks (for the
same amount of data)... Even if we probably rewrite the same block
several times as transactions come in, it still costs traffic, and we
end up processing more slowly than the hardware could. I don't think
that's good, no? ;)

Rgds,
-Dimitri

On 3/30/07, Erik Jones <[EMAIL PROTECTED]> wrote:
> On Mar 30, 2007, at 8:14 AM, Dimitri wrote:
> [...]
>
> I believe it's because that is a pretty normal Unix kernel block size
> and you want the two to match.
>
> erik jones <[EMAIL PROTECTED]>
> software developer
> 615-296-0838
> emma(r)
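To put rough numbers on the argument (a sketch that ignores WAL record
headers, group commit, and fsync batching): 10,000 tiny commits per
second each forcing an 8K WAL block is about 80 MB/s of write traffic,
while the same commits at 512-byte sector granularity would be about
5 MB/s -- a 16x difference for the same logical data.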
Re: [PERFORM] Weird performance drop
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Vincenzo Romano
>
> Is there any "workaround"?
>
> In my opinion, the later the query planner decisions are taken, the
> more effective they can be.
> It could be an option for the function (body) to delay any query
> planner decision.

I think a possible workaround is to use a plpgsql function and the
EXECUTE statement. The docs will have more info.

Dave
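A minimal sketch of that approach (the names count_matching, some_table,
and some_col are made up for illustration; EXECUTE ... INTO needs plpgsql
on 8.1 or later): building the query as a string and running it with
EXECUTE makes plpgsql plan it at call time, with the actual parameter
value visible to the planner.

    CREATE OR REPLACE FUNCTION count_matching(pattern text) RETURNS bigint AS $$
    DECLARE
        result bigint;
    BEGIN
        -- EXECUTE plans the statement at run time instead of reusing a
        -- cached generic plan; quote_literal() handles quoting safely.
        EXECUTE 'SELECT count(*) FROM some_table WHERE some_col ILIKE '
             || quote_literal(pattern)
        INTO result;
        RETURN result;
    END;
    $$ LANGUAGE plpgsql;

    SELECT count_matching('%pi%');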
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Richard Huxton writes:
> Ah, but it's got no way of knowing what matches you'll get for
> '%anything%'. There's no easy way to get statistics for matching
> substrings.

8.2 actually tries the match on the most-common-values list, if said
list is big enough (I think the threshold is stats target = 100). Not
sure if that will help here, though.

regards, tom lane
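To see whether a column would qualify, one can inspect its per-column
target and the stored MCV list (a sketch; attstattarget is -1 when the
column falls back to default_statistics_target):

    SELECT attstattarget
    FROM pg_attribute
    WHERE attrelid = 'l_pvcp'::regclass AND attname = 'value';

    SELECT most_common_vals, most_common_freqs
    FROM pg_stats
    WHERE tablename = 'l_pvcp' AND attname = 'value';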
Re: [PERFORM] Shared buffers, db transactions committed, and write IO on Solaris
On Mar 30, 2007, at 10:05 AM, Kenneth Marshall wrote:
> On Fri, Mar 30, 2007 at 04:25:16PM +0200, Dimitri wrote:
>> The problem is that while your goal is to commit as fast as possible,
>> it's a pity to waste I/O speed just to keep a common block size...
>> [...]
>
> With block sizes you are always trading off overhead versus space
> efficiency. Most OSes write only in 4k/8k blocks to the underlying
> hardware regardless of the size of the write you issue. Issuing 16
> 512-byte writes has much more overhead than one 8k write. On the light
> transaction end, there is no real benefit to a small write, and it
> will slow performance in high-throughput environments. It would be
> better to batch I/O, and I think that someone is looking into that.
>
> Ken

True, and really, considering that data is only written to disk by the
bgwriter and at checkpoints, writes are already somewhat batched. Also,
Dimitri, I feel I should backtrack a little and point out that it is
possible to have Postgres write in 512-byte blocks (at least for UFS,
which is what's in my head right now) if you set the system's logical
block size to 4K and fragment size to 512 bytes, and then set Postgres's
BLCKSZ to 512 bytes. However, as Ken has just pointed out, what you gain
in space efficiency you lose in performance, so if you're working with a
high-traffic database this wouldn't be a good idea.

erik jones <[EMAIL PROTECTED]>
software developer
615-296-0838
emma(r)
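As an aside, a running server reports its compiled-in data block size;
if memory serves, 8.1 and later expose it as a read-only setting
(a sketch):

    SHOW block_size;  -- reports the compiled-in BLCKSZ, typically 8192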
[PERFORM] scalability problem
Hi all,

When I run multiple TPC-H queries (DBT3) on PostgreSQL, I found the system
is not scalable. My machine has 8GB memory and 4 Xeon dual-core processors
(8 cores in total). The OS kernel is Linux 2.6.9. PostgreSQL is 7.3.18. I
run multiple q2 queries simultaneously. The results are:

1 process takes 0.65 seconds to finish.
2 processes take 1.07 seconds.
4 processes take 4.93 seconds.
8 processes take 16.95 seconds.

For the 4-process and 8-process cases, the queries take even more time
than if they were executed serially one after another. Because the system
has 8GB memory, which is much bigger than the DB size (SF=1), and I warmed
the cache before I ran the test, I do not think the problem was caused by
disk I/O. I think it might be caused by some contention, but I do not
know PostgreSQL well. Can anybody give me some clue to finding the
reasons?

Thanks!

Xiaoning
Re: [PERFORM] scalability problem
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Xiaoning Ding
>
> Hi all,
>
> When I run multiple TPC-H queries (DBT3) on PostgreSQL, I found the
> system is not scalable. My machine has 8GB memory and 4 Xeon dual-core
> processors (8 cores in total). The OS kernel is Linux 2.6.9.
> PostgreSQL is 7.3.18.

Is there any way you can upgrade to 8.2? There have been a lot of
performance and scalability enhancements.
Re: [PERFORM] scalability problem
Xiaoning Ding <[EMAIL PROTECTED]> writes:
> When I run multiple TPC-H queries (DBT3) on PostgreSQL, I found the
> system is not scalable. My machine has 8GB memory and 4 Xeon dual-core
> processors (8 cores in total). The OS kernel is Linux 2.6.9.
> PostgreSQL is 7.3.18.

If you are not running PG 8.1 or later, it's really not worth your time
to test this. Multiprocessor scalability was hardly even on the radar in
7.3 days.

regards, tom lane
Re: [PERFORM] scalability problem
Thanks guys,

I updated PG to 8.2.3. The results are much better now:

1 process:   0.94 seconds
2 processes: 1.32 seconds
4 processes: 2.03 seconds
8 processes: 2.54 seconds

Do you think they are good enough? BTW, where can I find some info on
what 8.2.3 did to improve scalability compared with pre-8.1 versions?

Xiaoning

Tom Lane wrote:
> Xiaoning Ding <[EMAIL PROTECTED]> writes:
>> When I run multiple TPC-H queries (DBT3) on PostgreSQL, I found the
>> system is not scalable. My machine has 8GB memory and 4 Xeon
>> dual-core processors (8 cores in total). The OS kernel is Linux
>> 2.6.9. PostgreSQL is 7.3.18.
>
> If you are not running PG 8.1 or later, it's really not worth your
> time to test this. Multiprocessor scalability was hardly even on the
> radar in 7.3 days.
>
> regards, tom lane
Re: [PERFORM] scalability problem
On Fri, 2007-03-30 at 15:25, Xiaoning Ding wrote:
> Hi all,
>
> When I run multiple TPC-H queries (DBT3) on PostgreSQL, I found the
> system is not scalable. My machine has 8GB memory and 4 Xeon dual-core
> processors (8 cores in total). The OS kernel is Linux 2.6.9.
> PostgreSQL is 7.3.18. I run multiple q2 queries simultaneously.
> The results are:
>
> 1 process takes 0.65 seconds to finish.
> 2 processes take 1.07 seconds.
> 4 processes take 4.93 seconds.
> 8 processes take 16.95 seconds.
>
> For the 4-process and 8-process cases, the queries take even more time
> than if they were executed serially one after another. Because the
> system has 8GB memory, which is much bigger than the DB size (SF=1),
> and I warmed the cache before I ran the test, I do not think the
> problem was caused by disk I/O.

You may be right, you may be wrong. What did top / vmstat have to say
about IO wait / disk idle time? PostgreSQL has to commit transactions to
disk. TPC-H runs mostly-read business decision queries, but it mixes in
writes as well. If you have one hard drive, it may well be that activity
is stacking up waiting on those writes.

> I think it might be caused by some contention, but I do not know
> PostgreSQL well. Can anybody give me some clue to finding the reasons?

Others have mentioned your version of PostgreSQL. 7.3 is quite old, as
it came out at the end of 2002. Seeing as 7.3 is the standard pgsql
version supported by RHEL3, and RHEL came with a 2.6.9 kernel, I'm gonna
guess your OS is about that old too. pgsql 7.3 cannot take advantage of
lots of shared memory and has some issues scaling to lots of CPUs /
processes. While RHEL won't be EOLed for a few more years (Red Hat
promises 7 years, I think) it's really not a great choice for getting
started today. RHEL5 just released and RHEL4 is very stable.

There are several things to look at to get better performance:

1: Late model PostgreSQL. Go with 8.2.3, or 8.1.8 as a minimum.
2: Late model Unix.
3: RAID controller with battery-backed cache.
4: Plenty of memory.
5: Lots of hard drives.
6: 4 to 8 CPUs.

Then, google postgresql performance tuning. There are three or four good
tuning guides that pop up right at the top; a few illustrative first-pass
settings are sketched below.
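For illustration only, the first-pass postgresql.conf changes those
guides usually suggest look something like this on an 8GB box running
8.2.x (the values are assumptions to tune from, not recommendations):

    # postgresql.conf sketch for an 8GB machine on 8.2.x
    shared_buffers = 1GB          # 8.2 accepts memory units
    work_mem = 32MB               # per-sort / per-hash memory
    effective_cache_size = 6GB    # planner hint: OS + PG cache size
    checkpoint_segments = 32      # spread out checkpoint I/O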
Re: [PERFORM] scalability problem
On Fri, 2007-03-30 at 16:38, Xiaoning Ding wrote:
> Thanks guys,
>
> I updated PG to 8.2.3. The results are much better now:
>
> 1 process:   0.94 seconds
> 2 processes: 1.32 seconds
> 4 processes: 2.03 seconds
> 8 processes: 2.54 seconds
>
> Do you think they are good enough? BTW, where can I find some info on
> what 8.2.3 did to improve scalability compared with pre-8.1 versions?

Very nice, eh? I'd say look through -hackers and -performance to see
some of it, but as usual, the source code is the reference. You'd be
surprised how well commented it is.
Re: [PERFORM] Shared buffers, db transactions committed, and write IO on Solaris
Erik,

> You're welcome! However, I believe our situation is very different
> from what you're testing, if I understand you correctly. Are you
> saying that your entire database will fit in memory? If so, then
> these are very different situations, as there is no way ours could
> ever do that. In fact, I'm not sure that forcedirectio would really
> net you any gain in that situation, as the IO service time will be
> basically nil if the filesystem cache doesn't have to page, which I
> would think is why you're seeing what you are.

Even more interesting. I guess we've been doing too much work with
benchmark workloads, which tend to be smaller databases. Thing is,
there's *always* I/O for a read/write database. If nothing else, updates
have to be synched to disk.

Anyway ... regarding the mystery transactions ... are you certain that
it's not your application? I can imagine that, if your app has a fairly
tight retry interval for database non-response, I/O sluggishness could
result in commit attempts spinning out of control.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [PERFORM] scalability problem
Quoth [EMAIL PROTECTED] (Xiaoning Ding):
> When I run multiple TPC-H queries (DBT3) on PostgreSQL, I found the
> system is not scalable. My machine has 8GB memory and 4 Xeon dual-core
> processors (8 cores in total). The OS kernel is Linux 2.6.9.
> PostgreSQL is 7.3.18.
>
> I think it might be caused by some contention, but I do not know
> PostgreSQL well. Can anybody give me some clue to finding the reasons?

Two primary issues:

1. You're running a horrendously ancient version of PostgreSQL. The 7.3
   series is Really Old. Newer versions have *enormous* improvements
   that are likely to be *enormously* relevant. Upgrade to 8.2.

2. There are known issues with the combination of Xeon processors and
   PAE memory addressing; that sort of hardware tends to be *way* less
   speedy than the specs would suggest.

There have been "context switching" issues on this sort of hardware that
are enormously worsened if you're running on a version of PostgreSQL
that is 4 major releases out of date.

--
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','gmail.com').
http://cbbrowne.com/info/x.html
I am not a number! I am a free man!
Re: [PERFORM] scalability problem
On 30.03.2007, at 19:18, Christopher Browne wrote:
> 2. There are known issues with the combination of Xeon processors and
>    PAE memory addressing; that sort of hardware tends to be *way* less
>    speedy than the specs would suggest.

That is not true, as the current series of processors (Woodcrest and the
like) are also called Xeon. You probably mean the Pentium IV era Xeons.

cug