[PERFORM] CPU maxed out!
Hi there, I am running PostgreSQL 7.3.4 on Mac OS X on a dual-processor G5 with 8GB of memory. The shared buffer was set to 512MB. The database had been running great until about 10 days ago, when our developers added some indexes to some tables to speed up certain upload operations. Now CPU usage sits at 100% constantly whenever a few users access their information with SELECTs against the database. If I REINDEX all the indexes, performance improves a bit, but before long it goes bad again.

My suspicion is that now that a few more indexes have been added, PostgreSQL considers all of them when calculating the cost of every operation, and this leads to the degraded performance. What do you think of this? What is a possible solution?

Thanks!

Qing

The following is the relevant output from top:

Processes: 92 total, 4 running, 88 sleeping... 180 threads   13:09:18
Load Avg: 2.81, 2.73, 2.50   CPU usage: 95.2% user, 4.8% sys, 0.0% idle
PhysMem: 435M wired, 5.04G active, 2.22G inactive, 7.69G used, 316M free
VM: 32.7G + 81.5M   5281127(13) pageins, 8544145(0) pageouts

  PID  COMMAND      %CPU   TIME
27314  postgres     92.2%  2:14.75
26099  postgres     91.1%  19:28.04
24754  top           2.8%  4:48.33
    0  kernel_task   1.9%  2:12:05
(all remaining processes are below 1% CPU)

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
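One way to test the suspicion above (not part of the original post; the table and index names below are placeholders) is to ask the statistics collector which indexes the SELECTs actually touch, and to compare plans before and after dropping a suspect index. This assumes the stats collector is enabled in postgresql.conf (stats_start_collector = true):

```sql
-- Per-index usage counts since statistics were last reset.
-- An index with idx_scan = 0 costs upkeep on every upload
-- without ever helping a SELECT.
SELECT relname, indexrelname, idx_scan, idx_tup_read
  FROM pg_stat_user_indexes
 ORDER BY idx_scan;

-- For one of the slow SELECTs, check which plan the 7.3 planner picks:
EXPLAIN SELECT * FROM user_info WHERE user_id = 42;  -- placeholder query

-- Dropping a never-used index is cheap to test and easy to reverse:
DROP INDEX some_new_index;  -- placeholder name
```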
[PERFORM] Trying to match Solaris-Oracle performance with directio?
Our product (Sophos PureMessage) runs on a Postgres database. Some of our Solaris customers have Oracle licenses, and they've commented on the performance difference between Oracle and PostgreSQL on such boxes. In-house, we've noticed a 2:1 (sometimes 5:1) performance difference in inserting rows (mostly 2-4K) between PostgreSQL on Solaris 8 and on Linux, for machines with comparable CPUs and RAM. These (big) customers are starting to ask: why don't we just port our dataserver to Oracle for them? I'd like to avoid that, if possible :-)

What we can test on in-house are leetle Sun workstations, while some of our customers have BIG Sun iron --- so I have no means to date to reproduce what their bottleneck is :-( Yes, it has been recommended that we talk to Sun about their iForce test lab ... that's in the pipe.

In the meantime, what I gather from browsing mail archives is that PostgreSQL on Solaris seems to get hung up on IO rather than CPU. Furthermore, I notice that Oracle, and now MySQL, use directio to bypass the system cache when doing heavy writes to the disk, and PostgreSQL does not. Not wishing to alter backend/store/file for this test, I figured I could get a customer to mount the UFS volume for pg_xlog with the option forcedirectio.

Any comment on this? No consideration of what the wal_sync_method is at this point. Presumably it's defaulting to fdatasync on Solaris. BTW this is Postgres 7.4.1, and our customers are on Solaris 8 and 9.
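For reference, the experiment described above might look like this on a customer box (the device name and mount point are placeholders; forcedirectio applies to the whole UFS filesystem, so pg_xlog would need to live on its own volume):

```shell
# Remount the dedicated pg_xlog filesystem with direct I/O (Solaris UFS).
# /dev/dsk/c0t1d0s6 and /pgxlog are placeholders for the real device/mountpoint.
umount /pgxlog
mount -F ufs -o forcedirectio /dev/dsk/c0t1d0s6 /pgxlog

# postgresql.conf -- make the assumed default explicit so runs are comparable:
# wal_sync_method = fdatasync   # Solaris alternatives: fsync, open_datasync
```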
Re: [PERFORM] Trying to match Solaris-Oracle performance with directio?
Mischa Sandberg [EMAIL PROTECTED] writes:
> Our product (Sophos PureMessage) runs on a Postgres database. Some of
> our Solaris customers have Oracle licenses, and they've commented on
> the performance difference between Oracle and Postgresql on such boxes.
> In-house, we've noticed the 2:1 (sometimes 5:1) performance difference
> in inserting rows (mostly 2-4K), between Postgresql on Solaris 8 and on
> Linux, for machines with comparable CPU's and RAM.

You haven't given any evidence at all to say that I/O is where the problem is. I think it would be good first to work through the conventional issues such as configuration parameters, foreign key problems, etc. Give us some more detail about the slow INSERT queries ...

regards, tom lane
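A minimal way to gather the detail Tom asks for (the table and rows here are invented placeholders, since the real schema isn't shown in the thread): time a representative batch inside one transaction to separate per-row commit/WAL-flush cost from the insert itself, and let EXPLAIN ANALYZE report server-side runtime:

```sql
-- Placeholder table standing in for the real 2-4K-per-row data.
CREATE TABLE msg_test (id integer PRIMARY KEY, body text);

-- One transaction around the batch: if this is much faster per row than
-- individual INSERTs, the bottleneck is commit flushing, not raw I/O bandwidth.
BEGIN;
INSERT INTO msg_test VALUES (1, 'two to four KB of text ...');
INSERT INTO msg_test VALUES (2, 'two to four KB of text ...');
COMMIT;

-- EXPLAIN ANALYZE executes the statement and reports actual time,
-- which includes any foreign-key trigger overhead:
EXPLAIN ANALYZE INSERT INTO msg_test VALUES (3, 'two to four KB of text ...');
```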
[PERFORM] Planner having way wrong estimate for group aggregate
Hi, I'm using PostgreSQL 7.4 on a table with ~700,000 rows, looking like this:

              Table "public.enkeltsalg"
   Column   |           Type           | Modifiers
------------+--------------------------+-----------------------------------------------------
 id         | integer                  | not null default nextval('enkeltsalg_id_seq'::text)
 kommentar  | text                     | not null default ''::text
 antall     | numeric(14,4)            | not null
 belop      | numeric(10,0)            | not null
 type       | character(1)             | not null
 tid        | timestamp with time zone | default now()
 eksternid  | integer                  |
 kasseid    | integer                  |
 baraapning | integer                  |
 salgspris  | integer                  |
 firma      | integer                  |
 bongid     | integer                  |
Indexes:
    "enkeltsalg_pkey" primary key, btree (id)
    "enkeltsalg_aapn" btree (baraapning)
    "enkeltsalg_aapn_pris" btree (baraapning, salgspris)
    "enkeltsalg_aapn_type" btree (baraapning, type)
    "enkeltsalg_pris" btree (salgspris)
Check constraints:
    "enkeltsalg_type_valid" CHECK (type = 'K'::bpchar OR type = 'B'::bpchar OR type = 'M'::bpchar OR type = 'T'::bpchar)

And I'm doing the query (after VACUUM ANALYZE):

smt=# explain analyze select sum(belop) as omsetning,date_trunc('day',tid) as dato from enkeltsalg group by date_trunc('day',tid);

 GroupAggregate  (cost=108062.34..114477.98 rows=172735 width=17) (actual time=20977.544..23890.020 rows=361 loops=1)
   ->  Sort  (cost=108062.34..109912.99 rows=740263 width=17) (actual time=20947.372..21627.107 rows=710720 loops=1)
         Sort Key: date_trunc('day'::text, tid)
         ->  Seq Scan on enkeltsalg  (cost=0.00..18010.29 rows=740263 width=17) (actual time=0.091..7180.528 rows=710720 loops=1)
 Total runtime: 23908.538 ms
(5 rows)

Now, as you can see, the GroupAggregate row estimate here is _way_ off, so the planner makes the wrong choice (it should do a hash aggregate). If I set sort_mem to 131072 instead of 16384, it does a hash aggregate (which takes 10 seconds instead of 24), but I can't have sort_mem that high generally.
Now, my first notion was creating a functional index to help the planner:

smt=# create index enkeltsalg_dag on enkeltsalg ( date_trunc('day',tid) );
CREATE INDEX
smt=# vacuum analyze;
VACUUM

However, this obviously didn't help the planner (this came as a surprise to me, but probably won't come as a surprise to the more seasoned users here :-) ):

smt=# explain analyze select sum(belop) as omsetning,date_trunc('day',tid) as dato from enkeltsalg group by date_trunc('day',tid);

 GroupAggregate  (cost=103809.15..110017.11 rows=175512 width=17) (actual time=21061.357..23917.370 rows=361 loops=1)
   ->  Sort  (cost=103809.15..105585.95 rows=710720 width=17) (actual time=21032.239..21695.674 rows=710720 loops=1)
         Sort Key: date_trunc('day'::text, tid)
         ->  Seq Scan on enkeltsalg  (cost=0.00..17641.00 rows=710720 width=17) (actual time=0.091..7231.387 rows=710720 loops=1)
 Total runtime: 23937.791 ms
(5 rows)

I also tried to increase the statistics on the tid column:

smt=# alter table enkeltsalg alter column tid set statistics 500;
ALTER TABLE
smt=# analyze enkeltsalg;
ANALYZE

However, this only made the planner's estimate _worse_:

smt=# explain analyze select sum(belop) as omsetning,date_trunc('day',tid) as dato from enkeltsalg group by date_trunc('day',tid);

 GroupAggregate  (cost=107906.59..114449.09 rows=199715 width=17) (actual time=20947.197..23794.389 rows=361 loops=1)
   ->  Sort  (cost=107906.59..109754.56 rows=739190 width=17) (actual time=20918.001..21588.735 rows=710720 loops=1)
         Sort Key: date_trunc('day'::text, tid)
         ->  Seq Scan on enkeltsalg  (cost=0.00..17996.88 rows=739190 width=17) (actual time=0.092..7166.488 rows=710720 loops=1)
 Total runtime: 23814.624 ms
(5 rows)

Actually, it seems that the higher I set the statistics target on tid, the worse the estimate becomes. Also, I was told (on #postgresql :-) ) to include the following information:

smt=# select n_distinct from pg_stats
Re: [PERFORM] Trying to match Solaris-Oracle performance with directio?
Mischa Sandberg wrote:
> In the meantime, what I gather from browsing mail archives is that
> postgresql on Solaris seems to get hung up on IO rather than CPU.

Well, people more knowledgeable in the secrets of Postgres seem confident that this is not your problem. Fortunately, however, there is a simple way to find out. Just download the utinyint var type from pgfoundry (http://pgfoundry.org/projects/sql2pg/). There is some stuff there you will need to compile yourself from CVS. I'm sorry, but I haven't done a proper release just yet. In any case, the utinyint type should provide you with the data type you seek, and thus allow you to find out whether this is, indeed, the problem.

--
Shachar Shemesh
Lingnu Open Source Consulting ltd.
http://www.lingnu.com/
Re: [PERFORM] Planner having way wrong estimate for group aggregate
Steinar H. Gunderson [EMAIL PROTECTED] writes:
> Now, my first notion was creating a functional index to help the
> planner: ... However, this obviously didn't help the planner (this came
> as a surprise to me, but probably won't come as a surprise to the more
> seasoned users here :-)

7.4 doesn't have any statistics on expression indexes. 8.0 will do what you want, though. (I just fixed an oversight that prevented it from doing so...)

> Actually, it seems that the higher I set statistics on tid, the worse
> the estimate becomes.

I believe that the estimate of the number of groups will be exactly the same as the estimate of the number of values of tid --- there's no knowledge that date_trunc() might reduce the number of distinct values.

> Any ideas for speeding this up?

In 7.4, the only way I can see to force this to use a hash aggregate is to temporarily set enable_sort false or raise sort_mem.

regards, tom lane
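Sketching Tom's two 7.4-era workarounds against the query from the original post (both settings are session-local, so nothing else on the server is affected):

```sql
-- Option 1: raise sort_mem (in KB) for this session so the planner
-- believes the hash table will fit, then put it back.
SET sort_mem = 131072;
SELECT sum(belop) AS omsetning, date_trunc('day', tid) AS dato
  FROM enkeltsalg GROUP BY date_trunc('day', tid);
RESET sort_mem;

-- Option 2: penalize the sort so the hash aggregate wins instead.
SET enable_sort = false;
SELECT sum(belop) AS omsetning, date_trunc('day', tid) AS dato
  FROM enkeltsalg GROUP BY date_trunc('day', tid);
RESET enable_sort;
```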
Re: [PERFORM] Planner having way wrong estimate for group aggregate
On Sat, Sep 18, 2004 at 03:48:13PM -0400, Tom Lane wrote:
> 7.4 doesn't have any statistics on expression indexes. 8.0 will do what
> you want though. (I just fixed an oversight that prevented it from
> doing so...)

OK, so I'll have to wait for 8.0.0beta3 or 8.0.0 (I tried 8.0.0beta2, it gave me zero difference) -- fortunately, I can probably wait, at the rate everything else is progressing here. :-)

/* Steinar */
--
Homepage: http://www.sesse.net/