Re: [PERFORM] Multicolumn order by
On Wed, 2006-04-19 at 01:08, Tom Lane wrote:
> Theo Kramer <[EMAIL PROTECTED]> writes:
> > select * from mytable
> >  where (c1 = 'c1v' and c2 = 'c2v' and c3 >= 'c3v')
> >     or (c1 = 'c1v' and c2 > 'c2v')
> >     or (c1 > 'c1v')
> >  order by c1, c2, c3;
>
> Yeah ... what you really want is the SQL-spec row comparison operator
>
>   select ... where (c1,c2,c3) >= ('c1v','c2v','c3v') order by c1,c2,c3;
>
> This does not work properly in any current PG release :-( but it does
> work and is optimized well in CVS HEAD.  See eg this thread
> http://archives.postgresql.org/pgsql-hackers/2006-02/msg00209.php

That is awesome - I have been fighting with porting my ISAM-based stuff
onto SQL for a long time, and the row comparison operator is exactly
what I have been looking for.

I tried this on my test system running 8.1.3 and it appears to work
fine. I would appreciate it if you could let me know in what cases it
does not work properly.

--
Regards
Theo

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
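For archive readers: the ISAM-style "fetch the next batch after the current key" access pattern Theo describes can be sketched with the row comparison operator like this (table and key values are taken from the thread's example; the index name is invented for illustration):

```sql
-- Fetch the next 50 rows strictly after the last row already seen.
-- Note the strict > here, rather than the >= of the original query,
-- so the current row is not re-read.
SELECT *
  FROM mytable
 WHERE (c1, c2, c3) > ('c1v', 'c2v', 'c3v')
 ORDER BY c1, c2, c3
 LIMIT 50;

-- A matching multicolumn index lets the planner satisfy this with a
-- single index scan instead of a sort (well optimized only from
-- CVS HEAD / 8.2 onward, per Tom's remark):
CREATE INDEX mytable_c1_c2_c3_idx ON mytable (c1, c2, c3);
```

On 8.1.3 the query runs but, as Tom notes, the row comparison is not evaluated per the SQL spec and is not planned as an index range scan; EXPLAIN confirms the difference.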
Re: [PERFORM] SELECT FOR UPDATE performance is bad
On Tue, 2006-04-18 at 11:33 -0400, Tom Lane wrote:
> Mario Splivalo <[EMAIL PROTECTED]> writes:
> >> If there is concurrent locking, you're also running a big risk of
> >> deadlock because two processes might try to lock the same rows in
> >> different orders.
> > I think there is no risk of a deadlock, since that particular
> > function is called from the middleware (functions are used as
> > interface to the database), and the lock order is always the same.
>
> No, you don't even know what the order is, let alone that it's always
> the same.

You got me confused here! :) If I have only one function that acts as
an interface to the middleware, and all the operations on the database
are done through that one function, and I carefully design that
function so that I first grab the lock and then do the stuff, ain't I
pretty sure that I won't be having any deadlocks?

Now, I just need serialization: I need to have clients 'line up' in
order to perform something in the database. Actually, users are sending
codes from the newspaper, beer-cans, Cola-cans, and stuff, and the
database needs to check whether the code has already been played. Since
the system is designed so that it can run multiple code-games (and then
a similar code could exist for the coke-game and the beer-game), I'm
using the messages table to see what code-game (i.e. service) a
particular code belongs to.

> I'd suggest using a table that has exactly one row per code-game, and
> doing a SELECT FOR UPDATE on that row to establish the lock you need.
> This need not have anything to do with the tables/rows you are
> actually intending to update --- although obviously such a convention
> is pretty fragile if you have updates coming from a variety of code.
> I think it's reasonably safe when you're funneling all the operations
> through a bit of middleware.

I tend to design my applications so I don't have flying SQL in my
java/python/c#/php/wherever code; all the database stuff is done
through functions which are designed as interfaces. Those functions are
also designed so they don't step on each other. So, since I need the
serialization, I'll do as you suggested, using a lock-table with
exactly one row per code-game.

Just one more question here - it has to do with postgres internals, but
still I'd like to know why postgres is doing such huge I/O (in my log
file I see a lot of messages that say LOG: archived transaction log
file) when performing that big FOR UPDATE.

	Mario
--
Mario Splivalo
Mob-Art
[EMAIL PROTECTED]

"I can do it quick, I can do it cheap, I can do it well. Pick any two."
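A minimal sketch of the lock-table convention Tom suggests (table and column names are invented for illustration; the real schema is not shown in the thread):

```sql
-- One row per code-game; populated once when a game is set up.
CREATE TABLE code_game_lock (
    code_game_id integer PRIMARY KEY
);
INSERT INTO code_game_lock VALUES (1);  -- e.g. the beer-game
INSERT INTO code_game_lock VALUES (2);  -- e.g. the coke-game

-- Inside the middleware function, grab the per-game lock first;
-- concurrent callers for the same game queue up here until commit.
BEGIN;
SELECT * FROM code_game_lock WHERE code_game_id = 1 FOR UPDATE;
-- ... check whether the code has already been played, record it ...
COMMIT;  -- commit (or rollback) releases the row lock
```

Because every caller for a given code-game locks the same single row before touching anything else, the lock order is trivially identical across callers, which is what removes the deadlock risk.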
Re: [PERFORM] Multicolumn order by
On Wed, 2006-04-19 at 08:00, Theo Kramer wrote:
> I tried this on my test system running 8.1.3 and it appears to work
> fine. I would appreciate it if you could let me know in what cases it
> does not work properly.

Please ignore - 'Explain is your friend' - got to look at the tips :)

--
Regards
Theo
Re: [PERFORM] Inserts optimization?
Hi, Magnus,

Magnus Hagander wrote:
> Bacula already serializes access to the database (they have to support
> mysql/myisam), so this shouldn't help.

Ouch, that hurts. To support mysql, they break performance for _every
other_ database system?

<cynism>
Now, I understand how the mysql people manage to spread the legend of
mysql being fast. They convince software developers to thwart all
others.
</cynism>

Seriously: How can we convince developers to either fix MySQL or
abandon and replace it with a database, instead of crippling client
software?

> Actually, it might well hurt by introducing extra delays.

Well, if you read the documentation, you will see that it will only
wait if there are at least commit_siblings other transactions active.
So when Bacula serializes access, there will be no delays, as there is
only a single transaction alive.

HTH
Markus

--
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org
www.nosoftwarepatents.org
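For context, the behaviour Markus describes is governed by two settings; a postgresql.conf fragment (the parameter names are the real GUCs, the values are purely illustrative):

```
# postgresql.conf -- group commit settings (illustrative values)
# commit_delay is only honoured if, at commit time, at least
# commit_siblings other transactions are currently open; with a single
# serialized writer the delay is therefore never taken.
commit_delay = 100        # microseconds to wait before flushing WAL
commit_siblings = 5       # minimum concurrently open transactions
```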
Re: [PERFORM] Inserts optimization?
> > Bacula already serializes access to the database (they have to
> > support mysql/myisam), so this shouldn't help.
>
> Ouch, that hurts. To support mysql, they break performance for _every
> other_ database system?

Actually, it probably helps on SQLite as well. And considering they
only support postgresql, mysql and sqlite, there is some merit to it
from their perspective. You can find a thread about it in the bacula
archives from a month or two back.

> <cynism>
> Now, I understand how the mysql people manage to spread the legend of
> mysql being fast. They convince software developers to thwart all
> others.
> </cynism>

Yes, same as the fact that most (at least FOSS) web projects-du-jour
are dumbed down to the mysql featureset. (And not just mysql, but
mysql-lowest-common-factors, which means myisam etc)

> > Actually, it might well hurt by introducing extra delays.
>
> Well, if you read the documentation, you will see that it will only
> wait if there are at least commit_siblings other transactions active.
> So when Bacula serializes access, there will be no delays, as there
> is only a single transaction alive.

Hm. Right. Well, it still won't help :-)

//Magnus
Re: [PERFORM] Inserts optimization?
Hi, Magnus,

Magnus Hagander wrote:
> > To support mysql, they break performance for _every other_ database
> > system?
>
> Actually, it probably helps on SQLite as well.

AFAICS from the FAQ, http://www.sqlite.org/faq.html#q7 and #q8, SQLite
does serialize itself.

> And considering they only support postgresql, mysql and sqlite, there
> is some merit to it from their perspective.

Okay, I understand, but I hesitate to endorse it. IMHO, they should
write their application in a normal way, and then have the
serialization etc. encapsulated in the database driver interface
(possibly a wrapper class or so).

> > <cynism>
> > Now, I understand how the mysql people manage to spread the legend
> > of mysql being fast. They convince software developers to thwart
> > all others.
> > </cynism>
>
> Yes, same as the fact that most (at least FOSS) web projects-du-jour
> are dumbed down to the mysql featureset. (And not just mysql, but
> mysql-lowest-common-factors, which means myisam etc)

Well, most of those projects don't need a database, they need a bunch
of tables and a lock. Heck, they even use client-side SELECT loops in
PHP instead of a JOIN "because I always confuse left and right".

Markus
--
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org
www.nosoftwarepatents.org
Re: [PERFORM] Inserts optimization?
On Wed, 2006-04-19 at 07:08, Markus Schaber wrote:
> Magnus Hagander wrote:
> > Bacula already serializes access to the database (they have to
> > support mysql/myisam), so this shouldn't help.
>
> Ouch, that hurts. To support mysql, they break performance for _every
> other_ database system?

Note that should be "to support MySQL with MyISAM tables". If they had
written it for MySQL with InnoDB tables, they would likely be able to
use the same basic methods for performance tuning MySQL as for Oracle
or PostgreSQL. It's the refusal of people to stop using the MyISAM
table type that's the real issue.

Of course, given the shaky ground MySQL is now on with Oracle owning
InnoDB...
Re: [PERFORM] Inserts optimization?
> > Isn't MyISAM still the default over there?
>
> Yes, it's the default.

Personally I compile MySQL without InnoDB... and for any new
development I use postgres.

> It's hardly likely that the average MySQL user would use anything but
> the default table type ...

Double yes; also many hosts provide MySQL 4.0 or even 3.x, both of
which have no subquery support and are really brain-dead; and most OSS
PHP apps have to be compatible... argh.
Re: [PERFORM] Planner doesn't chose Index - (slow select)
Tom,

You are absolutely correct about not having run ANALYZE on the
particular table. In my attempt to create a simple test case I created
that table (pk_c2) from the original and had not run ANALYZE on it,
even though ANALYZE had been run prior to building that table.

The problem with the test table and the simple select count(*) is no
longer there (after ANALYZE). The original issue, however, is still
there. I'm stumped as to how to formulate my question without having
to write a lengthy essay.

As to upgrading from 7.4, I hear you, but I'm trying to support a
deployed product.

Thanks again for your input,
--patrick

On 4/18/06, Tom Lane <[EMAIL PROTECTED]> wrote:
> patrick keshishian <[EMAIL PROTECTED]> writes:
> > I've been struggling with some performance issues with certain SQL
> > queries. I was prepping a long-ish overview of my problem to
> > submit, but I think I'll start out with a simple case of the
> > problem first; hopefully answers I receive will help me solve my
> > initial issue.
>
> Have you ANALYZEd this table lately?
>
> > db=# select count(*) from pk_c2 b0 where b0.offer_id=7141;
> >  count
> > -------
> >      1
> > (1 row)
>
> The planner is evidently estimating that there are 12109 such rows,
> not 1, which is the reason for its reluctance to use an indexscan.
> Generally the only reason for it to be off that far on such a simple
> statistical issue is if you haven't updated the stats in a long time.
> (If you've got a really skewed data distribution for offer_id, you
> might need to raise the statistics target for it.)
>
> > The table has indexes for both 'offer_id' and '(pending=true)':
> > Indexes:
> >     pk_boidx btree (offer_id)
> >     pk_bpidx btree (((pending = true)))
>
> The expression index on (pending = true) won't do you any good,
> unless you spell your query in a weird way like
>   ... WHERE (pending = true) = true
> I'd suggest a plain index on pending instead.
>
> > db=# select version();
> > PostgreSQL 7.4.12 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.6
>
> You might want to think about an update, too. 7.4 is pretty long in
> the tooth.
>
> 			regards, tom lane
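To spell out Tom's index advice as DDL (pk_c2, pending, and the index names come from the thread; the partial-index variant is an additional suggestion, not something Tom proposed):

```sql
-- The expression index only matches the odd spelling
--   WHERE (pending = true) = true
-- so replace it with a plain index on the column, as Tom suggests:
DROP INDEX pk_bpidx;
CREATE INDEX pk_bpidx ON pk_c2 (pending);

-- Alternatively, when most rows have pending = false, a partial index
-- (supported in 7.4) keeps the index small and targeted:
CREATE INDEX pk_c2_pending_offer_idx ON pk_c2 (offer_id)
    WHERE pending = true;
```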
Re: [PERFORM] Takes too long to fetch the data from database
Please provide me some help regarding how I could use a cursor in the
following cases: I want to fetch 50 records at a time starting from the
largest stime.

Total no. of records in the wan table: 82019

pdb=# \d wan
               Table "wan"
 Column |         Type          | Modifiers
--------+-----------------------+-----------
 stime  | bigint                | not null
 kname  | character varying(64) |
 eid    | smallint              |
 rtpe   | smallint              |
 taddr  | character varying(16) |
 ntime  | bigint                |
Primary key: wan_pkey

stime is the primary key.

pdb=# SELECT * FROM wan ORDER BY stime LIMIT 50 OFFSET 81900;
pdb=# explain analyze SELECT * FROM wan ORDER BY stime LIMIT 50 OFFSET 81900;
NOTICE:  QUERY PLAN:

Limit  (cost=17995.15..17995.15 rows=50 width=95) (actual time=9842.92..9843.20 rows=50 loops=1)
  ->  Sort  (cost=17995.15..17995.15 rows=82016 width=95) (actual time=9364.56..9793.00 rows=81951 loops=1)
        ->  Seq Scan on wan  (cost=0.00..3281.16 rows=82016 width=95) (actual time=0.11..3906.29 rows=82019 loops=1)
Total runtime: 10010.76 msec

EXPLAIN

pdb=# SELECT * FROM wan where kname='pluto' ORDER BY stime LIMIT 50 OFFSET 81900;
pdb=# explain analyze SELECT * from wan where kname='pluto' order by stime limit 50 offset 81900;
NOTICE:  QUERY PLAN:

Limit  (cost=3494.13..3494.13 rows=1 width=95) (actual time=9512.85..9512.85 rows=0 loops=1)
  ->  Sort  (cost=3494.13..3494.13 rows=206 width=95) (actual time=9330.74..9494.90 rows=27485 loops=1)
        ->  Seq Scan on wan  (cost=0.00..3486.20 rows=206 width=95) (actual time=0.28..4951.76 rows=27485 loops=1)
Total runtime: 9636.96 msec

EXPLAIN

pdb=# SELECT * FROM wan where kname='pluto' and rtpe=20 ORDER BY stime LIMIT 50 OFFSET 81900;
pdb=# explain analyze SELECT * from wan where kname='pluto' and rtpe = 20 order by stime limit 50 offset 81900;
NOTICE:  QUERY PLAN:

Limit  (cost=3691.25..3691.25 rows=1 width=95) (actual time=7361.50..7361.50 rows=0 loops=1)
  ->  Sort  (cost=3691.25..3691.25 rows=1 width=95) (actual time=7361.50..7361.50 rows=0 loops=1)
        ->  Seq Scan on wan  (cost=0.00..3691.24 rows=1 width=95) (actual time=7361.30..7361.30 rows=0 loops=1)
Total runtime: 7361.71 msec

EXPLAIN
pdb=#

All the above queries take around 7~10 sec. to fetch the last 50
records. I want to reduce this time, because the table is growing; once
the table contains more than 1 GB of data, the above queries will take
far too long.

I am not getting how to use a cursor to fetch records starting from the
last records; in the above case the offset can be any number (less than
the total no. of records). I have used the following cursor, but it
takes the same time as the plain query:

BEGIN;
DECLARE crs CURSOR FOR SELECT * FROM wan ORDER BY stime LIMIT 50 OFFSET 81900;
FETCH ALL in crs;
CLOSE crs;
COMMIT;

On 4/11/06, Merlin Moncure <[EMAIL PROTECTED]> wrote:
> > pdb=# explain analyze SELECT sdate, stime, rbts from lan WHERE
> > (( bname = 'pluto' ) AND ( cno = 17 ) AND ( pno = 1 ) AND
> > ( ( sdate >= '2004-07-21' ) AND ( sdate <= '2004-07-21' ) ))
> > ORDER BY sdate, stime;
>
> this query would benefit from an index on bname, cno, pno, sdate:
>
>   create index lan_idx on lan(bname, cno, pno, sdate);
>
> > pdb=# explain analyze SELECT ALL sdate, stime, rbts from lan WHERE
> > (( bname = 'neptune' ) AND ( cno = 17 ) AND ( pno = 1 ) AND
> > ( ( sdate >= '2004-07-21' ) AND ( sdate <= '2004-07-21' ) ))
> > ORDER BY sdate, stime;
>
> ditto above. Generally, the closer the fields in the where clause are
> matched by the index, the more it will speed up your query.
>
> Merlin
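Since OFFSET always forces the server to compute and throw away the skipped rows (a cursor over the same query does the same work), the usual fix for this access pattern is keyset pagination on the primary key: remember the last stime shown and continue from there. A sketch against the wan table from the question (`:last_stime` stands for a value the client remembers between calls):

```sql
-- First page: the 50 largest stime values.
SELECT * FROM wan ORDER BY stime DESC LIMIT 50;

-- Subsequent pages: continue strictly below the smallest stime already
-- shown, instead of re-sorting and skipping 81900 rows each time.
SELECT * FROM wan
 WHERE stime < :last_stime
 ORDER BY stime DESC
 LIMIT 50;

-- For the filtered variants, a matching multicolumn index lets the
-- same trick work per kname:
CREATE INDEX wan_kname_stime_idx ON wan (kname, stime);
SELECT * FROM wan
 WHERE kname = 'pluto' AND stime < :last_stime
 ORDER BY stime DESC
 LIMIT 50;
```

With these forms the cost is proportional to the 50 rows fetched rather than to the offset, so it stays flat as the table grows past 1 GB.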