Re: [PERFORM] PostgreSQL with Zabbix - problem of newbe

2010-04-14 Thread Krzysztof Kardas

Hi all.

Well, I have applied all of your recommendations, but I still had no
luck with performance tuning. The machine had moments when it was
utilized at 100%. The problem was I/O on the disks; the CPUs were busy
with system interrupts.

I started looking at I/O performance tuning again, and I changed:

synchronous_commit = off

Of course, with the risk that if there is a power failure I will lose
some data. But this is acceptable.

This caused a monumental performance jump. From a machine that was
utilized at 100%, the machine is now idle and doing almost nothing. I
have executed some SQL on huge tables like history, and everything
executed like lightning. Compared to MySQL, PostgreSQL in this
configuration is about 30-40% faster at serving data. The housekeeper is
about 2 to 3 times faster.
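
For reference, this setting does not have to be global -- a minimal
sketch (the table and values below are only examples, not the real
Zabbix schema) of scoping it to one session or a single transaction:

SET synchronous_commit TO off;           -- this session only

BEGIN;
SET LOCAL synchronous_commit TO off;     -- this transaction only
INSERT INTO history (itemid, clock, value) VALUES (1, 1271250000, 0.42);
COMMIT;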

Many thanks to all helpers and all PostgreSQL team.

-- 
Greetings
Krzysztof Kardas



Re: [PERFORM] PostgreSQL with Zabbix - problem of newbe

2010-04-14 Thread Grzegorz Jaśkiewicz
That really sounds like a hardware issue. The I/O basically causes the
system to freeze.
It happens sometimes on cheaper hardware.


Re: [PERFORM] PostgreSQL with Zabbix - problem of newbe

2010-04-14 Thread Kevin Grittner
Krzysztof Kardas  wrote:
 
> synchronous_commit = off
 
> This caused a monumental performance jump. From a machine that is
> utilized on 100%, machine is now sleeping and doing nothing. I
> have executed some sqls on huge tables like history and all has
> executed like lightning. Comparing to MySQL, PostgreSQL in this
> configuration is about 30 - 40% faster in serving data.
> Housekeeper is about 2 to 3 times faster
 
If you have a good RAID controller with battery backup for the
cache, and it's configured to write-back, this setting shouldn't
make very much difference.  Next time you're looking at hardware for
a database server, I strongly recommend you get such a RAID
controller and make sure it is configured to write-back.
 
Anyway, I'm glad to hear that things are working well for you now!
 
-Kevin



Re: [PERFORM] count is ten times faster

2010-04-14 Thread Kevin Grittner
"Sabin Coanda"  wrote:
 
> How do you explain the cost is about ten times lower in the 2nd
> query than in the first?
 
To elaborate on Pierre's answer:
 
In the first query, you scan the entire table and execute the "f1"
function on each row.  In the second query you pass over the entire
table just counting visible tuples, then run the "f1" function once and
use the resulting value to scan an index, where the planner expects to
find one row.
 
The planner estimates the cost of running the "f1" function 7.7 million
times as being roughly ten times the cost of scanning the table.  Now,
these are all just estimates; if they don't really reflect the relative
cost of *running* the two queries, you might want to adjust cost
factors -- perhaps the estimated cost of the "f1" function.
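
As a sketch of that last suggestion (the argument list of "f1" below is
only a placeholder -- adjust it to the real signature; this assumes 8.3
or later, where per-function costs exist):

-- default COST is 100 for SQL/PLpgSQL functions and 1 for built-ins;
-- raising it tells the planner that each call is expensive, so plans
-- that call the function fewer times look comparatively cheaper
ALTER FUNCTION f1(integer) COST 10000;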
 
-Kevin



Re: [PERFORM] significant slow down with various LIMIT

2010-04-14 Thread Kevin Grittner
"Kevin Grittner"  wrote:
 
> (3) Try it like this (untested, so you may need to fix it up):
>  
> explain analyze
> SELECT core_object.id
>   from (SELECT id, city_id FROM "plugins_guide_address")
>"plugins_guide_address"
>   JOIN "plugins_plugin_addr"
> ON ("plugins_plugin_addr"."address_id"
>= "plugins_guide_address"."id")
>   JOIN "core_object"
> ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
>   WHERE "plugins_guide_address"."city_id" = 4535
>   ORDER BY "core_object"."id" DESC
>   LIMIT 4 -- or whatever it normally takes to cause the problem
> ;
 
Hmph.  I see I didn't take that quite where I intended.
Forget the above and try this:
 
explain analyze
SELECT core_object.id
  from (SELECT id, city_id FROM "plugins_guide_address"
  WHERE "city_id" = 4535) "plugins_guide_address"
  JOIN "plugins_plugin_addr"
ON ("plugins_plugin_addr"."address_id"
   = "plugins_guide_address"."id")
  JOIN "core_object"
ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
  ORDER BY "core_object"."id" DESC
  LIMIT 4 -- or whatever it normally takes to cause the problem
;
 
-Kevin



Re: [PERFORM] PostgreSQL with Zabbix - problem of newbe

2010-04-14 Thread Krzysztof Kardas
On 14 April 2010 at 15:30, Grzegorz Jaśkiewicz wrote:
> That really sounds like a hardware issue. The I/O basically causes the
> system to freeze.
> It happens sometimes on cheaper hardware.
>

You are probably right, because this is an HS21 blade server. And as you
know, blades are cheap and good. Why are blades good? Because they are
cheap (quoting an IBM salesman). I know this hardware is not made for
databases, but for now I do not have any other server. The firmware on
this server is very old and should be upgraded, and there are many other
things to do. The VMware machines (currently I have ESX 3.5; vSphere 4
is based on a 64-bit RedHat 5 system and is much faster than 3.5, but
the migration is not even planned) still have too low performance for
database workloads (at least when using vmdk rather than raw device
mapping or Virtual WWN for accessing LUNs).

The more I read, the more I see that the file system is partitioned
wrong. For example, all logs and database files are on the same volume,
and that is not right.
Kevin Grittner also mentioned the write-back function on the controller.
The LSI controllers for blades have that function as far as I know. I
have to check whether that option is turned on.

As I mentioned, I am not familiar with databases, so I have made some
mistakes, but I am very happy with the results: how fast Zabbix now
works, and how easily PostgreSQL reclaims space. I think it was a good
decision, and maybe I will try to interest some people in my company in
PostgreSQL instead of Oracle XE.

Once more, many thanks to all :)

-- 
Greetings
Krzysztof Kardas



[PERFORM] JDBC question for PG 8.3.9

2010-04-14 Thread Dave Crooke
Hi folks

I am using PG 8.3 from Java. I am considering a performance tweak which will
involve holding about 150 java.sql.PreparedStatement objects open against a
single PGSQL connection. Is this safe?

I know that MySQL does not support prepared statements *per se*, and so
their implementation of PreparedStatement is nothing more than some
client-side convenience code that knows how to escape and format constants
for you. Is this the case for PG, or does the PG JDBC driver do the real
thing? I'm assuming if it's just a client side constant escaper that there
won't be an issue.

Cheers
Dave


Re: [PERFORM] Occasional giant spikes in CPU load

2010-04-14 Thread Bruce Momjian
Craig James wrote:
> On 4/7/10 5:47 PM, Robert Haas wrote:
> > On Wed, Apr 7, 2010 at 6:56 PM, David Rees  wrote:
> >>> max_fsm_pages = 1600
> >>> max_fsm_relations = 625000
> >>> synchronous_commit = off
> >>
> >> You are playing with fire here.  You should never turn this off unless
> >> you do not care if your data becomes irrecoverably corrupted.
> >
> > That is not correct.  Turning off synchronous_commit is sensible if
> > you don't mind losing the last few transactions on a crash.  What will
> > corrupt your database is if you turn off fsync.
> 
> A bit off the original topic, but ...
> 
> I set it this way because I was advised that with a battery-backed
> RAID controller, this was a safe setting.  Is that not the case?

To get good performance, you can either get a battery-backed RAID
controller or risk losing a few transactions with synchronous_commit =
off.  If you already have a battery-backed RAID controller, there is
little benefit to turning synchronous_commit off, and some major
downsides (possible data loss).

--
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com



Re: [PERFORM] JDBC question for PG 8.3.9

2010-04-14 Thread Craig Ringer

On 15/04/10 04:49, Dave Crooke wrote:

Hi folks

I am using PG 8.3 from Java. I am considering a performance tweak which
will involve holding about 150 java.sql.PreparedStatement objects open
against a single PGSQL connection. Is this safe?

I know that MySQL does not support prepared statements /per se/, and so
their implementation of PreparedStatement is nothing more than some
client-side convenience code that knows how to escape and format
constants for you. Is this the case for PG, or does the PG JDBC driver
do the real thing?


Pg supports real server-side prepared statements, as does the JDBC driver.

IIRC (and I can't say this with 100% certainty without checking the 
sources or a good look at TFM) the PostgreSQL JDBC driver initially does 
only a client-side prepare. However, if the PreparedStatement is re-used 
more than a certain number of times (five by default?) it switches to 
server-side prepared statements.


This has actually caused a bunch of performance complaints on the jdbc 
list, because the query plan may change at that switch-over point, since 
with a server-side prepared statement Pg no longer has a specific value 
for each parameter and may pick a more generic plan.


Again only IIRC there's a configurable threshold for prepared statement 
switch-over. I thought all this was in the PgJDBC documentation and/or 
javadoc - if it's not, it needs to be.
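
Conceptually -- a rough sketch only, since the driver actually uses the
extended wire protocol rather than these SQL statements, and the
statement and table names here are made up -- the post-switch-over
behaviour corresponds to SQL-level PREPARE/EXECUTE, and you can see what
a session is holding server-side:

PREPARE find_item (int) AS
    SELECT * FROM items WHERE id = $1;  -- planned once, value unknown

EXECUTE find_item(42);                  -- re-executed with new parameters
EXECUTE find_item(43);

SELECT name, statement, prepare_time
FROM pg_prepared_statements;            -- statements in this session

DEALLOCATE find_item;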


--
Craig Ringer



[PERFORM] Good/Bad query plans based on text criteria

2010-04-14 Thread JmH
Hello,

I am struggling to understand why certain criteria that I supply for a
query alter the query plan. In my "good" case I can see that an index is
used; in my "bad" case, where I only change the text value of the
criterion but do not change or add any conditions, a bitmap heap scan of
the table is performed.

Refer to the attached good/bad query plans.

The basic query is:

SELECT * FROM highrate_log_entry
WHERE
test_seq_number > 26668670
and udc = '2424'
AND (test_signal_number = 'D2030'
)
ORDER BY test_seq_number LIMIT 11

test_seq_number is the pk and is generated by a sequence.

The D2030 value is the only thing that I vary between the good and bad
runs. The issue is possibly related to the data spread for
test_signal_number not being uniform, but there does not appear to be
that much difference between the first sequence number and the last
sequence number (needed to reach the 11 results) when comparing the
test_seq_number ranges that yield good or bad results.
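
One way I could quantify that skew (a sketch, using the partition and
column names from the attached plans) is to look at what ANALYZE has
recorded for the column, or to count it directly:

SELECT most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'highrate_log_entry_2424_2010'
  AND attname = 'test_signal_number';

SELECT test_signal_number, count(*)
FROM highrate_log_entry_2424_2010
GROUP BY test_signal_number
ORDER BY count(*) DESC;                 -- exact, but scans the partition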

I don't believe that the issue has to do with rewriting the query, but
with how the planner chooses its path.

I am using Postgres 8.4 on Windows with the default postgresql.conf. I
have tried changing (increasing) shared_buffers, work_mem and
effective_cache_size without success.

Any suggestions would be appreciated.

Thanks

Jason
explain analyze SELECT * FROM highrate_log_entry
WHERE 
test_seq_number > 26668670
and udc = '2424'
AND (test_signal_number = 'D2030'
)
ORDER BY test_seq_number LIMIT 11

QUERY PLAN
Limit  (cost=6694.14..6694.17 rows=11 width=61) (actual time=2.859..2.881 rows=11 loops=1)
  ->  Sort  (cost=6694.14..6695.82 rows=672 width=61) (actual time=2.856..2.863 rows=11 loops=1)
        Sort Key: public.highrate_log_entry.test_seq_number
        Sort Method:  top-N heapsort  Memory: 18kB
        ->  Result  (cost=0.00..6679.16 rows=672 width=61) (actual time=0.973..2.484 rows=254 loops=1)
              ->  Append  (cost=0.00..6679.16 rows=672 width=61) (actual time=0.970..2.046 rows=254 loops=1)
                    ->  Index Scan using high_rate_index on highrate_log_entry  (cost=0.00..8.27 rows=1 width=145) (actual time=0.009..0.009 rows=0 loops=1)
                          Index Cond: ((udc = 2424::numeric) AND ((test_signal_number)::text = 'D2030'::text))
                          Filter: (test_seq_number > 26668670::numeric)
                    ->  Index Scan using highrate_log_entry_2424_2009_pkey on highrate_log_entry_2424_2009 highrate_log_entry  (cost=0.00..3734.01 rows=1 width=60) (actual time=0.020..0.020 rows=0 loops=1)
                          Index Cond: ((udc = 2424::numeric) AND (test_seq_number > 26668670::numeric))
                          Filter: ((test_signal_number)::text = 'D2030'::text)
                    ->  Index Scan using highrate_log_entry_2424_2010_udc_key on highrate_log_entry_2424_2010 highrate_log_entry  (cost=0.00..2936.88 rows=670 width=61) (actual time=0.935..1.620 rows=254 loops=1)
                          Index Cond: ((udc = 2424::numeric) AND ((test_signal_number)::text = 'D2030'::text))
                          Filter: (test_seq_number > 26668670::numeric)
Total runtime: 2.949 ms
explain analyze SELECT * FROM highrate_log_entry
WHERE 
test_seq_number > 26668670
and udc = '2424'
AND (test_signal_number = 'D2048'
)
ORDER BY test_seq_number LIMIT 11

QUERY PLAN
Limit  (cost=107017.91..107017.94 rows=11 width=61) (actual time=2102.129..2102.152 rows=11 loops=1)
  ->  Sort  (cost=107017.91..107112.96 rows=38021 width=61) (actual time=2102.126..2102.133 rows=11 loops=1)
        Sort Key: public.highrate_log_entry.test_seq_number
        Sort Method:  top-N heapsort  Memory: 18kB
        ->  Result  (cost=0.00..106170.15 rows=38021 width=61) (actual time=1428.960..2066.815 rows=26497 loops=1)
              ->  Append  (cost=0.00..106170.15 rows=38021 width=61) (actual time=1428.957..2024.959 rows=26497 loops=1)
                    ->  Index Scan using high_rate_index on highrate_log_entry  (cost=0.00..8.27 rows=1 width=145) (actual time=0.006..0.006 rows=0 loops=1)
                          Index Cond: ((udc = 2424::numeric) AND ((test_signal_number)::text = 'D2048'::text))
                          Filter: (test_seq_number > 26668670::numeric)
                    ->  Index Scan using highrate_log_entry_2424_2009_pkey on highrate_log_entry_2424_2009 highrate_log_entry  (cost=0.00..3734.01 rows=22 width=60) (actual time=0.017..0.017 rows=0 loops=1)
                          Index Cond: ((udc = 2424::numeric) AND (test_seq_number > 26668670::numeric))
                          Filter: ((test_signal_number)::text = 'D2048'::text)
                    ->  Bitmap Heap Scan on highrate_log_entry_2424_2010 highrate_log_entry  (cost=2600.42..102427.87 rows=37998 width=61) (actual time=1428.928..1987.050 rows=26497 loops=1)
                          Recheck Cond: ((udc = 2424::numeric) AND ((test_signal_number)::text = 'D2048'::text))
                          Filter: (test_seq_number > 26668670::numeric)

Re: [PERFORM] Good/Bad query plans based on text criteria

2010-04-14 Thread Tom Lane
JmH  writes:
> I am struggling to understand why certain criteria that I supply for a
> query alter the query plan. In my "good" case I can see that an index is
> used; in my "bad" case, where I only change the text value of the
> criterion but do not change or add any conditions, a bitmap heap scan of
> the table is performed.

I think you're jumping to conclusions.  The second plan is processing
about 100 times as many rows, because the WHERE conditions are much less
selective.  A change in plan is entirely appropriate.

It might be that you need to change planner parameters (particularly
random_page_cost/seq_page_cost) to more nearly approximate the operating
conditions of your database, but I'd recommend being very cautious about
doing so on the basis of a small number of example queries.  In
particular it's easy to fall into the trap of optimizing for
fully-cached scenarios because repeatedly trying the same example
results in touching only already-cached data --- but that might or might
not be reflective of your whole workload.
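
If you do experiment, a sketch of doing it per session, so nothing is
changed globally until you've tested a representative workload (the
values here are only illustrative, not recommendations):

SET random_page_cost = 2.0;             -- this session only
SET seq_page_cost = 1.0;

EXPLAIN ANALYZE
SELECT * FROM highrate_log_entry
WHERE test_seq_number > 26668670
  AND udc = '2424'
  AND test_signal_number = 'D2048'
ORDER BY test_seq_number LIMIT 11;

RESET random_page_cost;                 -- back to postgresql.conf values
RESET seq_page_cost;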

regards, tom lane



Re: [PERFORM] JDBC question for PG 8.3.9

2010-04-14 Thread Dave Crooke
Mine is a single record INSERT, so no issues with plans :-) Little Java ETL
job.

Is there any setting I'd need to tweak assuming I'm using 150-200 of these
at once?

Cheers
Dave

On Wed, Apr 14, 2010 at 6:10 PM, Craig Ringer wrote:

> On 15/04/10 04:49, Dave Crooke wrote:
>
>> Hi folks
>>
>> I am using PG 8.3 from Java. I am considering a performance tweak which
>> will involve holding about 150 java.sql.PreparedStatement objects open
>> against a single PGSQL connection. Is this safe?
>>
>> I know that MySQL does not support prepared statements /per se/, and so
>> their implementation of PreparedStatement is nothing more than some
>> client-side convenience code that knows how to escape and format
>> constants for you. Is this the case for PG, or does the PG JDBC driver
>> do the real thing?
>>
>
> Pg supports real server-side prepared statements, as does the JDBC driver.
>
> IIRC (and I can't say this with 100% certainty without checking the sources
> or a good look at TFM) the PostgreSQL JDBC driver initially does only a
> client-side prepare. However, if the PreparedStatement is re-used more than
> a certain number of times (five by default?) it switches to server-side
> prepared statements.
>
> This has actually caused a bunch of performance complaints on the jdbc
> list, because the query plan may change at that switch-over point, since
> with a server-side prepared statement Pg no longer has a specific value for
> each parameter and may pick a more generic plan.
>
> Again only IIRC there's a configurable threshold for prepared statement
> switch-over. I thought all this was in the PgJDBC documentation and/or
> javadoc - if it's not, it needs to be.
>
> --
> Craig Ringer
>