Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware

2009-05-11 Thread Paolo Rizzi
Are you saying that PostgreSQL+PostGIS can actually run on a 
smartphone??? Intriguing...

Did anyone ever actually try that???


If it's a supported CPU type and you've got a suitable build toolchain,
sure.  Seven or eight years ago we were getting a good laugh out of the
fact that you could run PG on a PlayStation 2.

Good to know!!! I imagine that on a PS3 it would be _really_ fast... :-)



The real issue with the kind of hardware you're describing is going to
be the finite write lifetime of a flash device.  For a low-update
application it'll probably be okay, but PG could very easily destroy a
flash in no time if you aren't careful to minimize updates.
This is something I thought about too, but it's something that those 
people (this client of mine) should be well aware of anyway; I'll point 
it out to them.


Anyway, it's interesting that newer flash devices use several techniques, 
such as wear leveling, to spread writes across the least-used cells. 
But this leads to physical file fragmentation, and it may be a case 
where sequential scans are actually slower than random ones!!!




regards, tom lane


Bye
Paolo




Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware

2009-05-11 Thread PFC


A smartphone... you're right, I didn't think of that, but the hardware I  
described is very much like that of a modern smartphone!!!
Are you saying that PostgreSQL+PostGIS can actually run on a  
smartphone??? Intriguing...

Did anyone ever actually try that???


While the performance of the ARM CPUs used in smartphones, PDAs, etc. is
pretty good, this hardware is optimized for small size and low power use,
so you generally get quite low memory bandwidth, the problem of flash
endurance, and a lack of standard interfaces to hook up to the rest of your
system.
Embedded PC-compatible hardware in the 600 MHz range you mention would
probably get a DIMM memory module (maybe for the only reason that
mass production makes them so cheap), so you'd get much higher memory
bandwidth and much larger RAM. Even if the CPU is only 2x faster than a
smartphone, if the memory bandwidth is 10x higher, you'll see the
difference. It would also have standard interfaces, very useful for you,
and you can hook it up to a real SSD (not a micro-SD card) with real
flash wear-leveling algorithms.


But yeah, since today's smartphones are more powerful than the desktops of
10 years ago (which ran PG just fine), it would probably work, if you can
run Linux on it...




Re: [PERFORM] Any better plan for this query?..

2009-05-11 Thread Dimitri
Folks, it's completely crazy, but here is what I found:

- if HISTORY table is analyzed with target 1000 my SELECT response
time is jumping to 3ms, and the max throughput is limited to 6.000 TPS
(it's what happened to 8.3.7)

-if HISTORY table is analyzed with target 5 - my SELECT response time
is decreasing to 1.2ms (!)  and then my max TPS level is ~12.000 !
and CPU is used up to 95% even by 8.3.7 :-)  and 8.4 performed better
just because I left its analyze target to default 100 value.

Can anyone explain to me why the analyze target can have such a huge
negative side effect?..

Next point: SCALABILITY ISSUE

Now both 8.3.7 and 8.4 have similar performance levels, but 8.3.7 is
always slightly better compared to 8.4, but well. The problem I have:
   - on 8 cores: ~5.000 TPS  / 5.500 MAX
   - on 16 cores: ~10.000 TPS / 11.000 MAX
   - on  32 cores: ~10.500 TPS  / 11.500 MAX

What else may limit concurrent SELECTs here?..

Yes, I forgot: MySQL is reaching 17.500 TPS here.

Rgds,
-Dimitri

On 5/7/09, Simon Riggs si...@2ndquadrant.com wrote:

 On Thu, 2009-05-07 at 20:36 +0200, Dimitri wrote:

 I've simply restarted a full test with hashjoin OFF. Up to 32
 concurrent users things are going well. Then from 32 users on, response
 time jumps to 20ms, with 64 users it's higher again, and with 256
 users it reaches 700ms, so TPS drops from 5.000 to ~200..

 With hashjoin ON it's not happening, and I'm reaching at least 11.000
 TPS on fully busy 32 cores.

 Much better to stick to the defaults.

 Sounds like a problem worth investigating further, but not pro bono.

 About the scalability issue - there is one on 8.3.7, because on 32 cores
 with this kind of load it's using only 50% CPU and not exceeding
 6.000 TPS, while 8.4 uses 90% CPU and reaches 11.000 TPS..

 Yeh, small changes make a big difference. Thanks for the info.

 How does MySQL perform?

 --
  Simon Riggs   www.2ndQuadrant.com
  PostgreSQL Training, Services and Support





Re: [PERFORM] Any better plan for this query?..

2009-05-11 Thread Tom Lane
Dimitri dimitrik...@gmail.com writes:
 Can anyone explain to me why the analyze target can have such a huge
 negative side effect?..

If these are simple queries, maybe what you're looking at is the
increase in planning time caused by having to process 10x as much
statistical data.  Cranking statistics_target to the max just because
you can is not necessarily a good strategy.
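
For what it's worth, the target does not have to be global either: it can be
set per column with ALTER TABLE, so a heavily-planned table can keep a small
target while the rest of the database keeps the default.  A sketch (the table
name is the one from your test; the column name is only a placeholder):

  -- keep a small statistics target just for the hot table
  ALTER TABLE history ALTER COLUMN ref_object SET STATISTICS 10;
  ANALYZE history;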

regards, tom lane



Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware

2009-05-11 Thread Stefan Kaltenbrunner

Paolo Rizzi wrote:
Are you saying that PostgreSQL+PostGIS can actually run on a 
smartphone??? Intriguing...

Did anyone ever actually try that???


If it's a supported CPU type and you've got a suitable build toolchain,
sure.  Seven or eight years ago we were getting a good laugh out of the
fact that you could run PG on a PlayStation 2.

Good to know!!! I imagine that on a PS3 it would be _really_ fast... :-)


Well, not really - while it is fairly easy to get PostgreSQL running on a 
PS3, it is not a fast platform. While the main CPU there is a pretty fast 
Power-based core, it only has 256MB of RAM and a single SATA disk 
available (though you could add some USB disks).



Stefan



[PERFORM] What is the most optimal config parameters to keep stable write TPS ?..

2009-05-11 Thread Dimitri
Hi,

what would you suggest as the most optimal postgresql.conf settings to keep
writes as stable as possible?..

What I want is to avoid throughput waves - I want to keep my
response times stable without any activity holes. I've tried to reduce
the checkpoint timeout from 5min to 30sec - it helped, throughput is more
stable now, but instead of big waves I now have short waves anyway..

What is the best options combination here?..

Rgds,
-Dimitri



Re: [PERFORM] What is the most optimal config parameters to keep stable write TPS ?..

2009-05-11 Thread Kevin Grittner
Dimitri dimitrik...@gmail.com wrote: 
 
 what may you suggest as the most optimal postgresql.conf to keep
 writing as stable as possible?..
 
 What I want is to avoid throughput waves  - I want to keep my
 response times stable without any activity holes. I've tried to
 reduce checkpoint timeout from 5min to 30sec - it helped, throughput
 is more stable now, but instead of big waves I have now short waves
 anyway..
 
 What is the best options combination here?..
 
What version of PostgreSQL?  What operating system?  What hardware?
 
The answers are going to depend on the answers to those questions.
 
It would also be good to show all lines from postgresql.conf which are
not commented out.
 
-Kevin



Re: [PERFORM] Any better plan for this query?..

2009-05-11 Thread Dimitri
Hi Tom,

it was not intentional :-)
it just stayed that way after various probes with the query plan.

Anyway, on 8.4 the default target is 100, and just by moving it to 5 I
reached 10.500 TPS on 16 cores instead of 8.000 initially. And I think
you have a good reason to keep it equal to 100 by default, don't you?
;-)

And what about scalability on 32 cores?..
Any idea?

Rgds,
-Dimitri

On 5/11/09, Tom Lane t...@sss.pgh.pa.us wrote:
 Dimitri dimitrik...@gmail.com writes:
 Can anyone explain to me why the analyze target can have such a huge
 negative side effect?..

 If these are simple queries, maybe what you're looking at is the
 increase in planning time caused by having to process 10x as much
 statistical data.  Cranking statistics_target to the max just because
 you can is not necessarily a good strategy.

   regards, tom lane




Re: [PERFORM] What is the most optimal config parameters to keep stable write TPS ?..

2009-05-11 Thread Dimitri
Hi Kevin,

PostgreSQL: 8.3.7 & 8.4
Server: Sun M5000 32cores
OS: Solaris 10

current postgresql.conf:

#
max_connections = 2000  # (change requires restart)
effective_cache_size = 48000MB
shared_buffers = 12000MB
temp_buffers = 200MB
work_mem = 100MB                        # min 64kB
maintenance_work_mem = 600MB            # min 1MB

max_fsm_pages = 2048000
fsync = on                              # turns forced synchronization on or off
synchronous_commit = off                # immediate fsync at commit
wal_sync_method = fdatasync
wal_buffers = 2MB
wal_writer_delay = 400ms                # 1-10000 milliseconds

checkpoint_segments = 128
checkpoint_timeout = 30s

archive_mode = off
track_counts = on
autovacuum = on
log_autovacuum_min_duration = 0
autovacuum_max_workers = 4
autovacuum_naptime = 20 # time between autovacuum runs
autovacuum_vacuum_threshold = 50
autovacuum_analyze_threshold = 50
autovacuum_vacuum_scale_factor = 0.001

lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'

#

Rgds,
-Dimitri


On 5/11/09, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Dimitri dimitrik...@gmail.com wrote:

 what may you suggest as the most optimal postgresql.conf to keep
 writing as stable as possible?..

 What I want is to avoid throughput waves  - I want to keep my
 response times stable without any activity holes. I've tried to
 reduce checkpoint timeout from 5min to 30sec - it helped, throughput
 is more stable now, but instead of big waves I have now short waves
 anyway..

 What is the best options combination here?..

 What version of PostgreSQL?  What operating system?  What hardware?

 The answers are going to depend on the answers to those questions.

 It would also be good to show all lines from postgresql.conf which are
 not commented out.

 -Kevin




Re: [PERFORM] What is the most optimal config parameters to keep stable write TPS ?..

2009-05-11 Thread Kevin Grittner
Dimitri dimitrik...@gmail.com wrote: 
 
 PostgreSQL: 8.3.7 & 8.4
 Server: Sun M5000 32cores
 OS: Solaris 10
 
Does that have a battery backed RAID controller?  If so, is it
configured for write-back?  These both help a lot with smoothing
checkpoint I/O gluts.
 
We've minimized problems by making the background writer more
aggressive.  8.3 and later does a better job in general, but we've
still had to go with:
 
bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 4.0
 
 shared_buffers = 12000MB
 
You might want to test with that set to something much lower, to see
what the checkpoint delays look like.  We've found it best to use a
small (256MB) setting, and leave caching to the OS; in our
environment, it seems to do a better job of scheduling the disk I/O. 
YMMV, of course.
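
A minimal sketch of that combination in postgresql.conf terms (these are just
the values mentioned above, not a recommendation for your particular box):

  shared_buffers = 256MB             # small; leave most caching to the OS
  bgwriter_lru_maxpages = 1000       # more aggressive background writer
  bgwriter_lru_multiplier = 4.0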
 
-Kevin



Re: [PERFORM] What is the most optimal config parameters to keep stable write TPS ?..

2009-05-11 Thread Dimitri
Thanks a lot, I'll try them all!

Yes, I have a good external storage box with battery backed cache enabled.
There are 64GB of RAM, so I expected it would help a little bit to increase
the buffer cache, but OK, we'll see if it's better with 256MB :-)

What about full_page_writes? It seems it's on by default. Does it
make sense to turn it off?..

Rgds,
-Dimitri




On 5/11/09, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Dimitri dimitrik...@gmail.com wrote:

 PostgreSQL: 8.3.7 & 8.4
 Server: Sun M5000 32cores
 OS: Solaris 10

 Does that have a battery backed RAID controller?  If so, is it
 configured for write-back?  These both help a lot with smoothing
 checkpoint I/O gluts.

 We've minimized problems by making the background writer more
 aggressive.  8.3 and later does a better job in general, but we've
 still had to go with:

 bgwriter_lru_maxpages = 1000
 bgwriter_lru_multiplier = 4.0

 shared_buffers = 12000MB

 You might want to test with that set to something much lower, to see
 what the checkpoint delays look like.  We've found it best to use a
 small (256MB) setting, and leave caching to the OS; in our
 environment, it seems to do a better job of scheduling the disk I/O.
 YMMV, of course.

 -Kevin




Re: [PERFORM] What is the most optimal config parameters to keep stable write TPS ?..

2009-05-11 Thread Kevin Grittner
Dimitri dimitrik...@gmail.com wrote:
 
 What about full_page_writes? It seems it's on by default. Does it
 make sense to turn it off?..
 
It would probably help with performance, but the description is a
little disconcerting in terms of crash recovery.  We tried running
with it off for a while (a year or so back), but had problems with
corruption.  I think the specific cause of that has since been fixed, but
it's left us a bit leery of the option.
 
Maybe someone else can speak to how safe (or not) the current
implementation of that option is.
 
-Kevin



Re: [PERFORM] What is the most optimal config parameters to keep stable write TPS ?..

2009-05-11 Thread Scott Marlowe
On Mon, May 11, 2009 at 10:31 AM, Dimitri dimitrik...@gmail.com wrote:
 Hi Kevin,

 PostgreSQL: 8.3.7 & 8.4
 Server: Sun M5000 32cores
 OS: Solaris 10

 current postgresql.conf:

 #
 max_connections = 2000                  # (change requires restart)
 effective_cache_size = 48000MB
 shared_buffers = 12000MB
 temp_buffers = 200MB
 work_mem = 100MB                                # min 64kB
 maintenance_work_mem = 600MB            # min 1MB

 max_fsm_pages = 2048000
 fsync = on                              # turns forced synchronization on or off
 synchronous_commit = off                # immediate fsync at commit
 wal_sync_method = fdatasync
 wal_buffers = 2MB
 wal_writer_delay = 400ms                # 1-10000 milliseconds

 checkpoint_segments = 128
 checkpoint_timeout = 30s

What's your checkpoint completion target set to? Crank that up a bit
to 0.7, 0.8, etc. and make the timeout more, not less.  That should
help.

Also, look into better hardware (RAID controller with battery backed
cache) and also putting pg_xlog on a separate RAID-1 set (or RAID-10
set if you've got a lot of drives under the postgres data set).
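
For example, something along these lines (illustrative values only; test
against your own workload):

  checkpoint_segments = 128
  checkpoint_timeout = 15min              # longer, not shorter
  checkpoint_completion_target = 0.8      # spread the checkpoint writes out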



Re: [PERFORM] What is the most optimal config parameters to keep stable write TPS ?..

2009-05-11 Thread Dimitri
Hi Scott,

good point - the current checkpoint_completion_target is the default
0.5, and it makes sense to set it to 0.8 to make writing smoother,
great!

yes, data and xlog are separated, each one sitting on an independent
RAID1 storage LUN, and the storage box is performant enough

Thank you!

Rgds,
-Dimitri


On 5/11/09, Scott Marlowe scott.marl...@gmail.com wrote:
 On Mon, May 11, 2009 at 10:31 AM, Dimitri dimitrik...@gmail.com wrote:
 Hi Kevin,

 PostgreSQL: 8.3.7 & 8.4
 Server: Sun M5000 32cores
 OS: Solaris 10

 current postgresql.conf:

 #
 max_connections = 2000  # (change requires restart)
 effective_cache_size = 48000MB
 shared_buffers = 12000MB
 temp_buffers = 200MB
 work_mem = 100MB                        # min 64kB
 maintenance_work_mem = 600MB            # min 1MB

 max_fsm_pages = 2048000
 fsync = on                              # turns forced synchronization on or off
 synchronous_commit = off                # immediate fsync at commit
 wal_sync_method = fdatasync
 wal_buffers = 2MB
 wal_writer_delay = 400ms                # 1-10000 milliseconds

 checkpoint_segments = 128
 checkpoint_timeout = 30s

 What's your checkpoint completion target set to? Crank that up a bit
 to 0.7, 0.8, etc. and make the timeout more, not less.  That should
 help.

 Also, look into better hardware (RAID controller with battery backed
 cache) and also putting pg_xlog on a separate RAID-1 set (or RAID-10
 set if you've got a lot of drives under the postgres data set).




Re: [PERFORM] Any better plan for this query?..

2009-05-11 Thread Merlin Moncure
On Mon, May 11, 2009 at 11:18 AM, Dimitri dimitrik...@gmail.com wrote:
 Folks, it's completely crazy, but here is what I found:

 - if HISTORY table is analyzed with target 1000 my SELECT response
 time is jumping to 3ms, and the max throughput is limited to 6.000 TPS
 (it's what happened to 8.3.7)

 -if HISTORY table is analyzed with target 5 - my SELECT response time
 is decreasing to 1.2ms (!)  and then my max TPS level is ~12.000 !
 and CPU is used up to 95% even by 8.3.7 :-)  and 8.4 performed better
 just because I left its analyze target to default 100 value.

 Can anyone explain to me why the analyze target can have such a huge
 negative side effect?..

 Next point: SCALABILITY ISSUE

 Now both 8.3.7 and 8.4 have similar performance levels, but 8.3.7 is
 always slightly better compared to 8.4, but well. The problem I have:
   - on 8 cores: ~5.000 TPS  / 5.500 MAX
   - on 16 cores: ~10.000 TPS / 11.000 MAX
   - on  32 cores: ~10.500 TPS  / 11.500 MAX

 What else may limit concurrent SELECTs here?..

 Yes, I forgot: MySQL is reaching 17.500 TPS here.

why aren't you preparing the query?  mysql uses a simple rule-based
planner and postgresql has a statistics-based planner.  Our planner
has all kinds of advantages in various scenarios, but this is
compensated for by slightly longer planning time in some cases.  OTOH, you
have prepared queries to compensate for this.  (mysql also has prepared
queries, but the syntax is awkward and there is much less benefit to
using them.)

merlin



Re: [PERFORM] What is the most optimal config parameters to keep stable write TPS ?..

2009-05-11 Thread Dimitri
OK, it'll be better to avoid such an improvement :-)
Performance - yes, but not at any price :-)

Thank you!

Rgds,
-Dimitri

On 5/11/09, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Dimitri dimitrik...@gmail.com wrote:

 What about full_page_writes? It seems it's on by default. Does it
 make sense to turn it off?..

 It would probably help with performance, but the description is a
 little disconcerting in terms of crash recovery.  We tried running
 with it off for a while (a year or so back), but had problems with
 corruption.  I think the specific cause of that has since been fixed, but
 it's left us a bit leery of the option.

 Maybe someone else can speak to how safe (or not) the current
 implementation of that option is.

 -Kevin




Re: [PERFORM] Any better plan for this query?..

2009-05-11 Thread Simon Riggs

On Mon, 2009-05-11 at 17:18 +0200, Dimitri wrote:

 Yes, I forgot: MySQL is reaching 17.500 TPS here.

Please share your measurements of MySQL scalability also.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [PERFORM] Any better plan for this query?..

2009-05-11 Thread Simon Riggs

On Mon, 2009-05-11 at 11:23 -0400, Tom Lane wrote:
 Dimitri dimitrik...@gmail.com writes:
  Can anyone explain to me why the analyze target can have such a huge
  negative side effect?..
 
 If these are simple queries, maybe what you're looking at is the
 increase in planning time caused by having to process 10x as much
 statistical data.  Cranking statistics_target to the max just because
 you can is not necessarily a good strategy.

statistics_target affects tables, so we have problems if you have a mix
of simple and complex queries. IMHO we need an explicit planner_effort
control, rather than the more arcane *_limit knobs which are effectively
the same thing, just harder to use in practice.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [PERFORM] Any better plan for this query?..

2009-05-11 Thread Aidan Van Dyk
* Dimitri dimitrik...@gmail.com [090511 11:18]:
 Folks, it's completely crazy, but here is what I found:
 
 - if HISTORY table is analyzed with target 1000 my SELECT response
 time is jumping to 3ms, and the max throughput is limited to 6.000 TPS
 (it's what happened to 8.3.7)
 
 -if HISTORY table is analyzed with target 5 - my SELECT response time
 is decreasing to 1.2ms (!)  and then my max TPS level is ~12.000 !
 and CPU is used up to 95% even by 8.3.7 :-)  and 8.4 performed better
 just because I left its analyze target to default 100 value.
 
 Can anyone explain to me why the analyze target can have such a huge
 negative side effect?..

It's actually pretty straightforward.

The PostgreSQL query planner is a smart planner.  It takes into
consideration all the statistics available on the columns/tables,
expected outputs based on inputs, etc, to choose what it thinks will be
the best plan.  The more data you have in statistics (the larger
statistics target you have), the more CPU time and longer it's going to
take to plan your queries.  The tradeoff is hopefully better plans.

But, in your scenario, where you are hitting the database with the
absolute worst possible way to use PostgreSQL, with small, repeated,
simple queries, you're not getting the advantage  of better plans.  In
your case, you're throwing absolutely simple queries at PG as fast as
you can, and for each query, PostgreSQL has to:

1) Parse the given query string
2) Given the statistics available, plan the query and pick the best one
3) Actually run the query.

Part 2 is going to dominate the CPU time in your tests, more so the more
statistics it has to evaluate, and unless the data has to come from the
disks (i.e. it's not in shared buffers or cache), it is thus going to dominate
the time before you get your results.  More statistics means more time
needed to do the planning/picking of the query.

If you were to use prepared statements, the cost of #1 and #2 is done
once, and then every time you throw a new execution of the query to
PostgreSQL, you get to just do #3, the easy quick part, especially for
small simple queries where all the data is in shared buffers or the cache.
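
At the SQL level that looks roughly like this (the statement name and the
query are made up here, not taken from your benchmark):

  -- parse + plan happen once, at PREPARE time
  PREPARE hist_lookup (int) AS
    SELECT h.* FROM history h WHERE h.ref_id = $1;

  -- each EXECUTE then only pays for step #3
  EXECUTE hist_lookup(42);
  EXECUTE hist_lookup(43);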

a.

-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.




[PERFORM] Query planner making bad decisions

2009-05-11 Thread Cory Coager
I'm running version 8.1.11 on SLES 10 SP2.  I'm trying to improve this 
query and unfortunately I cannot change the application.  For some 
reason the planner is making a bad decision sometimes after an analyze 
of table objectcustomfieldvalues. 



The query is:
SELECT DISTINCT main.*
FROM Tickets main
JOIN CustomFields CustomFields_1 ON ( CustomFields_1.Name = 'QA Origin' )
JOIN CustomFields CustomFields_3 ON ( CustomFields_3.Name = 'QA Group Code' )
JOIN ObjectCustomFieldValues ObjectCustomFieldValues_4
  ON ( ObjectCustomFieldValues_4.ObjectId = main.id )
 AND ( ObjectCustomFieldValues_4.Disabled = '0' )
 AND ( ObjectCustomFieldValues_4.ObjectType = 'RT::Ticket' )
 AND ( ObjectCustomFieldValues_4.CustomField = CustomFields_3.id )
JOIN ObjectCustomFieldValues ObjectCustomFieldValues_2
  ON ( ObjectCustomFieldValues_2.Disabled = '0' )
 AND ( ObjectCustomFieldValues_2.ObjectId = main.id )
 AND ( ObjectCustomFieldValues_2.CustomField = CustomFields_1.id )
 AND ( ObjectCustomFieldValues_2.ObjectType = 'RT::Ticket' )
WHERE (main.Status != 'deleted')
  AND (main.Queue = '60'
       AND ObjectCustomFieldValues_2.Content LIKE '%Patient Sat Survey%'
       AND ObjectCustomFieldValues_4.Content LIKE '%MOT%')
  AND (main.EffectiveId = main.id)
  AND (main.Type = 'ticket')
ORDER BY main.id ASC;



Here is the query run in 12379.816 ms:

Unique  (cost=1560.06..1560.12 rows=1 width=181) (actual time=12379.573..12379.642 rows=13 loops=1)
  ->  Sort  (cost=1560.06..1560.06 rows=1 width=181) (actual time=12379.568..12379.586 rows=13 loops=1)
        Sort Key: main.id, main.effectiveid, main.queue, main.type, main.issuestatement, main.resolution, main.owner, main.subject, main.initialpriority, main.finalpriority, main.priority, main.timeestimated, main.timeworked, main.status, main.timeleft, main.told, main.starts, main.started, main.due, main.resolved, main.lastupdatedby, main.lastupdated, main.creator, main.created, main.disabled
        ->  Nested Loop  (cost=0.00..1560.05 rows=1 width=181) (actual time=9081.782..12379.303 rows=13 loops=1)
              Join Filter: (outer.objectid = inner.id)
              ->  Nested Loop  (cost=0.00..849.90 rows=1 width=8) (actual time=9059.881..12052.548 rows=13 loops=1)
                    Join Filter: (outer.objectid = inner.objectid)
                    ->  Nested Loop  (cost=0.00..424.19 rows=1 width=4) (actual time=0.274..26.660 rows=1575 loops=1)
                          ->  Index Scan using customfields_pkey on customfields customfields_1  (cost=0.00..16.41 rows=1 width=4) (actual time=0.228..0.371 rows=1 loops=1)
                                Filter: ((name)::text = 'QA Origin'::text)
                          ->  Index Scan using ticketcustomfieldvalues2 on objectcustomfieldvalues objectcustomfieldvalues_2  (cost=0.00..407.76 rows=1 width=8) (actual time=0.039..21.243 rows=1575 loops=1)
                                Index Cond: (objectcustomfieldvalues_2.customfield = outer.id)
                                Filter: ((disabled = 0) AND ((objecttype)::text = 'RT::Ticket'::text) AND ((content)::text ~~ '%Patient Sat Survey%'::text))
                    ->  Nested Loop  (cost=0.00..424.99 rows=58 width=4) (actual time=5.188..7.605 rows=18 loops=1575)
                          ->  Index Scan using customfields_pkey on customfields customfields_3  (cost=0.00..16.41 rows=1 width=4) (actual time=0.235..0.419 rows=1 loops=1575)
                                Filter: ((name)::text = 'QA Group Code'::text)
                          ->  Index Scan using ticketcustomfieldvalues2 on objectcustomfieldvalues objectcustomfieldvalues_4  (cost=0.00..407.76 rows=65 width=8) (actual time=4.947..7.130 rows=18 loops=1575)
                                Index Cond: (objectcustomfieldvalues_4.customfield = outer.id)
                                Filter: ((disabled = 0) AND ((objecttype)::text = 'RT::Ticket'::text) AND ((content)::text ~~ '%MOT%'::text))
              ->  Index Scan using tickets1 on tickets main  (cost=0.00..709.77 rows=30 width=181) (actual time=0.020..17.104 rows=5743 loops=13)
                    Index Cond: (queue = 60)
                    Filter: (((status)::text <> 'deleted'::text) AND (effectiveid = id) AND ((type)::text = 'ticket'::text))
 Total runtime: 12379.816 ms
(23 rows)


select attname,n_distinct from pg_stats where tablename='tickets';
     attname     | n_distinct
-----------------+------------
 id              |         -1
 effectiveid     |  -0.968462
 queue           |         37
 type            |          1
 issuestatement  |          1
 resolution      |          1
 owner           |        123
 subject         |  -0.885148
 initialpriority |         12
 finalpriority   |          9
 priority        |         43
 timeestimated   |          5
 timeworked      |          5
 status          |          6
 timeleft        |          3
 told            |  -0.128088
 starts          |         60
 started         |  -0.862352
 due             |       1270
 resolved

Re: [PERFORM] Any better plan for this query?..

2009-05-11 Thread Dimitri
Hi Simon,

it's too early yet to speak about MySQL scalability... :-)
it's only been a few months since there is *no more* regression in MySQL
performance when moving from 8 to 16 cores. But looking at how quickly
it's progressing now, things may change very quickly :-)

For the moment on my tests it gives:
  - on 8 cores: 14.000 TPS
  - on 16 cores: 17.500 TPS
  - on 32 cores: 15.000 TPS (regression)

Rgds,
-Dimitri

On 5/11/09, Simon Riggs si...@2ndquadrant.com wrote:

 On Mon, 2009-05-11 at 17:18 +0200, Dimitri wrote:

 Yes, I forgot: MySQL is reaching 17.500 TPS here.

 Please share your measurements of MySQL scalability also.

 --
  Simon Riggs   www.2ndQuadrant.com
  PostgreSQL Training, Services and Support





Re: [PERFORM] Any better plan for this query?..

2009-05-11 Thread Dimitri
Hi Aidan,

thanks a lot for this detailed summary!

So, why don't I use prepare here: let's say I'm testing the worst
stress case :-)  Imagine you have thousands of such queries -
you cannot prepare all of them! :-)  Or maybe you'll prepare each one once,
but as I showed previously in this thread the prepare statement itself
takes 16ms, so for a single shot there is no gain! :-)  Stressing with
such short and simple queries (and again, they have joins; it could
be even simpler :-)) will give me a result that shows my guaranteed
worst case - I then know that if I have to deploy a bombarding
OLTP-like application, my database engine will be able to keep up with
such a workload, and if I have performance problems they are inside the
application! :-)  (well, it's very simplistic, but it's not far from
the truth :-))

Now, as you can see from your explanation, Part #2 is the most
dominant - so instead of blaming this query, why not implement a QUERY
PLANNER CACHE??? - in a way that if any *similar* query is recognized by
the parser we simply *reuse* the same plan?..

Rgds,
-Dimitri


On 5/11/09, Aidan Van Dyk ai...@highrise.ca wrote:
 * Dimitri dimitrik...@gmail.com [090511 11:18]:
 Folks, it's completely crazy, but here is what I found:

 - if HISTORY table is analyzed with target 1000 my SELECT response
 time is jumping to 3ms, and the max throughput is limited to 6.000 TPS
 (it's what happened to 8.3.7)

 -if HISTORY table is analyzed with target 5 - my SELECT response time
 is decreasing to 1.2ms (!)  and then my max TPS level is ~12.000 !
 and CPU is used up to 95% even by 8.3.7 :-)  and 8.4 performed better
 just because I left its analyze target to default 100 value.

 Can anyone explain to me why the analyze target can have such a huge
 negative side effect?..

 It's actually pretty straightforward.

 The PostgreSQL query planner is a smart planner.  It takes into
 consideration all the statistics available on the columns/tables,
 expected outputs based on inputs, etc, to choose what it thinks will be
 the best plan.  The more data you have in statistics (the larger
 statistics target you have), the more CPU time and longer it's going to
 take to plan your queries.  The tradeoff is hopefully better plans.

 But, in your scenario, where you are hitting the database with the
 absolute worst possible way to use PostgreSQL, with small, repeated,
 simple queries, you're not getting the advantage  of better plans.  In
 your case, you're throwing absolutely simple queries at PG as fast as
 you can, and for each query, PostgreSQL has to:

 1) Parse the given query string
 2) Given the statistics available, plan the query and pick the best one
 3) Actually run the query.

 Part 2 is going to dominate the CPU time in your tests, more so the more
 statistics it has to evaluate, and unless the data has to come from the
 disks (i.e. it's not in shared buffers or cache), it is thus going to dominate
 the time before you get your results.  More statistics means more time
 needed to do the planning/picking of the query.

 If you were to use prepared statements, the cost of #1 and #2 is done
 once, and then every time you throw a new execution of the query to
 PostgreSQL, you get to just do #3, the easy quick part, especially for
 small simple queries where all the data is in shared buffers or the cache.

 a.

 --
 Aidan Van Dyk Create like a god,
 ai...@highrise.ca   command like a king,
 http://www.highrise.ca/   work like a slave.




Re: [PERFORM] Any better plan for this query?..

2009-05-11 Thread Alvaro Herrera
Dimitri wrote:
 Hi Aidan,
 
 thanks a lot for this detailed summary!
 
 So, why I don't use prepare here: let's say I'm testing the worst
 stress case :-)  Imagine you have thousands of such kind of queries -
 you cannot prepare all of them! :-)

Thousands?  Surely there'll be a dozen or three most common queries,
to which you pass different parameters.  You can prepare those.

 Now, as you can see from your explanation, Part #2 is the most
 dominant - so instead of blaming this query, why not implement a QUERY
 PLANNER CACHE??? - in a way that if any *similar* query is recognized by
 the parser we simply *reuse* the same plan?..

This has been discussed in the past, but it turns out that a real
implementation is a lot harder than it seems.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [PERFORM] Query planner making bad decisions

2009-05-11 Thread Tom Lane
Cory Coager ccoa...@davisvision.com writes:
 Even better yet, if I turn off enable_nestloop the query runs in 
 3499.970 ms:

The reason it prefers a nestloop inappropriately is a mistaken estimate
that some plan node is only going to yield a very small number of rows
(like one or two --- there's not a hard cutoff, but usually more than
a couple of estimated rows will lead it away from a nestloop).
In this case the worst problem seems to be here:

->  Index Scan using ticketcustomfieldvalues2 on objectcustomfieldvalues objectcustomfieldvalues_2  (cost=0.00..26514.04 rows=1 width=8) (actual time=1493.091..1721.155 rows=1575 loops=1)
      Filter: ((disabled = 0) AND ((objecttype)::text = 'RT::Ticket'::text) AND ((content)::text ~~ '%Patient Sat Survey%'::text))

where we're off by a factor of 1500+ :-(

I think most likely the ~~ operator is the biggest problem.
Unfortunately 8.1's estimator for ~~ is not terribly bright.  You could
try increasing your statistics target but I don't think it will help
much.  Is there any chance of updating to 8.2 or later?  8.2 can do
significantly better on this type of estimate as long as it has enough
stats.

In any case I'd suggest raising default_statistics_target to 100 or so,
as you seem to be running queries complex enough to need good stats.
But I'm not sure that that will be enough to fix the problem in 8.1.
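
A sketch of the above (the statistics value is illustrative; re-analyze and
re-test after changing it):

  ALTER TABLE objectcustomfieldvalues ALTER COLUMN content SET STATISTICS 100;
  ANALYZE objectcustomfieldvalues;

  -- and, as you already found, this blunt per-session workaround hides the
  -- misestimate rather than fixing it:
  SET enable_nestloop = off;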

regards, tom lane



Re: [PERFORM] What is the most optimal config parameters to keep stable write TPS ?..

2009-05-11 Thread Greg Smith

On Mon, 11 May 2009, Dimitri wrote:

I've tried to reduce checkpoint timeout from 5min to 30sec - it helped, 
throughput is more stable now, but instead of big waves I have now short 
waves anyway..


Tuning for very tiny checkpoints all of the time is one approach here. 
The other is to push up checkpoint_segments (done in your case), 
checkpoint_timeout, and checkpoint_completion_target to as high as you 
can, in order to spread the checkpoint period over as much time as 
possible.  Reducing shared_buffers can also help in both cases, you've set 
that to an extremely high value.


http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm is a 
long discussion of just this topic; if you saw a serious change by 
adjusting checkpoint_timeout, then further experimentation in this area is 
likely to help you out.


You might also want to look at the filesystem parameters you're using 
under Solaris.  ZFS in particular can cache more writes than you may 
expect, which can lead to all of that getting pushed out at the very end of 
checkpoint time.  That may very well be the source of your waves; on a 
system with 64GB of RAM, for all we know *every* write you're doing between 
checkpoints is being buffered until the fsyncs at the checkpoint end. 
There were a couple of sessions at PG East last year that mentioned this 
area; I put a summary of suggestions and links to more detail at 
http://notemagnet.blogspot.com/2008/04/conference-east-08-and-solaris-notes.html


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] What is the most optimal config parameters to keep stable write TPS ?..

2009-05-11 Thread Scott Marlowe
On Mon, May 11, 2009 at 8:15 PM, Greg Smith gsm...@gregsmith.com wrote:

 http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm is a long
 discussion of just this topic, if you saw a serious change by adjusting
 checkpoint_timeout, then further experimentation in this area is likely to
 help you out.

I highly recommend your article on the background writer.  Reading the
one on tuning the 8.2 bgw allowed me to make some changes to the
production servers at my last job that made a huge difference in
sustained tps on a logging server



Re: [PERFORM] Any better plan for this query?..

2009-05-11 Thread Heikki Linnakangas

Dimitri wrote:

Now, as you can see from your explanation, Part #2 is the most
dominant - so instead of blaming this query, why not implement a QUERY
PLANNER CACHE??? - in a way that if any *similar* query is recognized by
the parser we simply *reuse* the same plan?..


At least in JDBC, there's several open source prepared statement cache 
implementations out there that people use. I don't know about other 
client libraries, but it certainly is possible to do in the client.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
