[PERFORM] Query's fast standalone - slow as a subquery.
I have a query that runs quite quickly using a hash join when run standalone. When I use this query as a subquery, the planner always seems to pick a different plan with an order of magnitude worse performance. This bad plan is chosen even when the outer SQL statement is a trivial expression like this:

    select * from (query) as a;

which I believe should be a no-op. Should the optimizer have noticed that it could have used a hash join in this case? Anything I can do to help convince it to? Explain analyze output follows.

Thanks,
Ron

fli=# explain analyze SELECT * from (select * from userfeatures.points join icons using (iconid) where the_geom && setSRID('BOX3D(-123.40 25.66,-97.87 43.17)'::BOX3D, -1)) as upf;

                                QUERY PLAN
--------------------------------------------------------------------------
 Nested Loop  (cost=0.00..446.42 rows=1 width=120) (actual time=0.096..7928.546 rows=15743 loops=1)
   Join Filter: ("outer".iconid = "inner".iconid)
   ->  Seq Scan on points  (cost=0.00..444.43 rows=1 width=82) (actual time=0.096..132.255 rows=15743 loops=1)
         Filter: (the_geom && '010300010005009AD95EC0295C8FC2F5A839409AD95EC0F6285C8FC295454048E17A14AE7758C0F6285C8FC295454048E17A14AE7758C0295C8FC2F5A839409AD95EC0295C8FC2F5A83940'::geometry)
   ->  Seq Scan on icons  (cost=0.00..1.44 rows=44 width=42) (actual time=0.006..0.242 rows=44 loops=15743)
 Total runtime: 8005.766 ms
(6 rows)

fli=# explain analyze select * from userfeatures.points join icons using (iconid) where the_geom && setSRID('BOX3D(-123.40 25.66,-97.87 43.17)'::BOX3D, -1);

                                QUERY PLAN
--------------------------------------------------------------------------
 Hash Join  (cost=1.55..682.84 rows=15789 width=120) (actual time=0.641..320.002 rows=15743 loops=1)
   Hash Cond: ("outer".iconid = "inner".iconid)
   ->  Seq Scan on points  (cost=0.00..444.43 rows=15794 width=82) (actual time=0.067..94.307 rows=15743 loops=1)
         Filter: (the_geom && '010300010005009AD95EC0295C8FC2F5A839409AD95EC0F6285C8FC295454048E17A14AE7758C0F6285C8FC295454048E17A14AE7758C0295C8FC2F5A839409AD95EC0295C8FC2F5A83940'::geometry)
   ->  Hash  (cost=1.44..1.44 rows=44 width=42) (actual time=0.530..0.530 rows=0 loops=1)
         ->  Seq Scan on icons  (cost=0.00..1.44 rows=44 width=42) (actual time=0.026..0.287 rows=44 loops=1)
 Total runtime: 397.003 ms
(7 rows)
Re: [PERFORM] Query's fast standalone - slow as a subquery.
Ron Mayer <[EMAIL PROTECTED]> writes:
>   ->  Seq Scan on points  (cost=0.00..444.43 rows=1 width=82) (actual time=0.096..132.255 rows=15743 loops=1)
>         Filter: (the_geom && '010300010005009AD95EC0295C8FC2F5A839409AD95EC0F6285C8FC295454048E17A14AE7758C0F6285C8FC295454048E17A14AE7758C0295C8FC2F5A839409AD95EC0295C8FC2F5A83940'::geometry)
>   ->  Seq Scan on points  (cost=0.00..444.43 rows=15794 width=82) (actual time=0.067..94.307 rows=15743 loops=1)
>         Filter: (the_geom && '010300010005009AD95EC0295C8FC2F5A839409AD95EC0F6285C8FC295454048E17A14AE7758C0F6285C8FC295454048E17A14AE7758C0295C8FC2F5A839409AD95EC0295C8FC2F5A83940'::geometry)

Apparently the selectivity of the && condition is misestimated in the first case (note the radically wrong rowcount estimate), leading to an inefficient join plan choice. I suppose this is a bug in the postgis selectivity routines --- better complain to them.

        regards, tom lane
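Until the PostGIS estimate is fixed, one blunt workaround is to steer the planner away from the nested loop for just this query. This is only a sketch of plan steering, not a fix: enable_nestloop is a standard planner setting, and SET LOCAL confines the change to the enclosing transaction.

    BEGIN;
    SET LOCAL enable_nestloop = off;
    SELECT * FROM (SELECT * FROM userfeatures.points JOIN icons USING (iconid)
                   WHERE the_geom && setSRID('BOX3D(-123.40 25.66,-97.87 43.17)'::BOX3D, -1)) AS upf;
    COMMIT;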
Re: [PERFORM] Help with tuning this query (with explain analyze finally)
Richard,

What do you mean by summary table? Basically a cache of the query into a table with replicated column names of all the joins? I'd probably have to wipe out the table every minute and re-insert the data for each carrier in the system. I'm not sure how expensive this operation would be, but I'm guessing it would be fairly heavyweight. And maintenance would be a lot harder because of the duplicated columns, making refactorings on the database more error-prone. Am I understanding your suggestion correctly? Please correct me if I'm not.

> Can you turn the problem around? Calculate what you want for all users (once every 60 seconds) and stuff those results into a summary table. Then let the users query the summary table as often as they like (with the understanding that the figures aren't going to update any faster than once a minute).
Re: [PERFORM] Help with tuning this query (with explain analyze finally)
Ken wrote:
> Richard,
>
> What do you mean by summary table? Basically a cache of the query into a table with replicated column names of all the joins? I'd probably have to wipe out the table every minute and re-insert the data for each carrier in the system. I'm not sure how expensive this operation would be, but I'm guessing it would be fairly heavyweight. And maintenance would be a lot harder because of the duplicated columns, making refactorings on the database more error-prone. Am I understanding your suggestion correctly? Please correct me if I'm not.
>
>> Can you turn the problem around? Calculate what you want for all users (once every 60 seconds) and stuff those results into a summary table. Then let the users query the summary table as often as they like (with the understanding that the figures aren't going to update any faster than once a minute).

It's the same idea as a materialized view, or possibly just a lazy cache. Just try this query:

    CREATE TABLE cachedview AS
        SELECT p.id AS person_id, s.*, ss.*
          FROM shipment s
               INNER JOIN shipment_status ss ON s.current_status_id = ss.id
               INNER JOIN release_code rc ON ss.release_code_id = rc.id
               LEFT OUTER JOIN driver d ON s.driver_id = d.id
               INNER JOIN carrier_code cc ON s.carrier_code_id = cc.id
               INNER JOIN carrier c ON cc.carrier_id = c.id
               INNER JOIN carrier_to_person ctp ON ctp.carrier_id = c.id
               INNER JOIN person p ON p.id = ctp.person_id
         WHERE s.current_status_id IS NOT NULL
           AND s.is_purged = false
           AND rc.number = '9'
           AND ss.date >= current_date - 31
         ORDER BY ss.date DESC;

Notice that I took out the internal p.id = blah.

Then you can do:

    CREATE INDEX cachedview_person_id_idx ON cachedview (person_id);

Then from the client side, you can just run:

    SELECT * FROM cachedview WHERE person_id = id;

Now, this assumes that rc.number = '9' is what you always want. If that isn't the case, you could refactor a little bit.

This unrolls all of the work into a table which should be really fast to query. If this query takes less than 10s to generate, then just have a service run it every 60s.

I think for refreshing, it is actually faster to drop the table and recreate it, rather than deleting the entries. Dropping also has the advantage that if you ever add more rows to s or ss, then the table automatically gets the new entries.

Another possibility is to have the cachedview not use s.*, ss.*, but instead just include whatever the primary keys are for those tables. Then your final query becomes:

    SELECT s.*, ss.*
      FROM cachedview cv, s, ss
     WHERE cv.person_id = id
       AND cv.s_id = s.pkey
       AND cv.ss_id = ss.pkey;

Again, this should be really fast, because you should have an index on cv.person_id and only have say 300 rows there, and then you are just fetching a few rows from s and ss. You can also use this time to do some of your left joins against other tables.

Does this make sense? The biggest advantage you have is your 60s statement. With that in hand, I think you can do a lot of caching optimizations.

John
=:->
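A sketch of the drop-and-recreate refresh described above, using the keys-only variant of the cache (which also sidesteps the duplicate column names that s.* plus ss.* would produce in CREATE TABLE AS). Schema assumptions are the same as in the query above, with s.id and ss.id standing in for whatever the real primary keys are:

    BEGIN;
    DROP TABLE cachedview;
    CREATE TABLE cachedview AS
        SELECT p.id AS person_id, s.id AS s_id, ss.id AS ss_id
          FROM shipment s
               INNER JOIN shipment_status ss ON s.current_status_id = ss.id
               INNER JOIN release_code rc ON ss.release_code_id = rc.id
               INNER JOIN carrier_code cc ON s.carrier_code_id = cc.id
               INNER JOIN carrier c ON cc.carrier_id = c.id
               INNER JOIN carrier_to_person ctp ON ctp.carrier_id = c.id
               INNER JOIN person p ON p.id = ctp.person_id
         WHERE s.current_status_id IS NOT NULL
           AND s.is_purged = false
           AND rc.number = '9'
           AND ss.date >= current_date - 31;
    CREATE INDEX cachedview_person_id_idx ON cachedview (person_id);
    COMMIT;

Wrapping the rebuild in one transaction keeps the swap close to atomic for new queries, though in-flight queries that already looked up the old table can still error; this is a sketch, not production code.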
Re: [PERFORM] Help with tuning this query (with explain analyze finally)
Ken Egervari wrote:
> Josh,
> ...
> I thought about this, but it's very important since shipment and shipment_status are both updated in real time 24/7/365. I think I might be able to cache it within the application for 60 seconds at most, but it would make little difference since people tend to refresh within that time anyway. It's very important that real-time information exists though.

Is 60s real-time enough for you? That's what it sounds like. It would be nice if you could have 1hr, but there's still a lot of extra work you can do in 60s.

You could also always throw more hardware at it. :) If shipment_status is one of the bottlenecks, create a 4-disk raid10 and move the table over. I don't remember what your hardware is, but I don't remember it being a quad opteron with 16GB ram, and 20 15k SCSI disks, with the transaction log on a solid state disk. :)

> That sounds like an awesome system. I'd love to have something like that. Unfortunately, the production server is just a single processor machine with 1 GB ram. I think throwing more disks at it is probably the best bet, moving the shipment and shipment_status tables over as you suggested. That's great advice.

Well, disk I/O is one side, but probably sticking another 1GB in (2GB total) would also be a fairly economical upgrade for performance.

You are looking for query performance, not really update performance, right? So buy a 4-port SATA controller, and some WD Raptor 10k SATA disks. With this you can create a RAID10 for < $2k (probably like $1k).

> 30ms is a good target, although I guess I was naive for setting that goal perhaps. I've just taken queries that ran at 600ms and with 1 or 2 indexes, they went down to 15ms.

It all depends on your query. If you have a giant table (1M rows), and you are doing a seqscan for only 5 rows, then adding an index will give you enormous productivity gains. But you are getting 30k rows, and combining them with 6k rows, plus a bunch of other stuff.

> I think we've tuned the query about as far as we can. Let's say we have 200 users signed into the application at the same time. The application refreshes their shipment information automatically to make sure it's up to date on the user's screen. The application will execute the query we are trying to tune every 60 seconds for most of these users. Users can set the refresh time to be higher, but 60 is the lowest amount, so I'm just assuming everyone has it at 60.
>
> Anyway, if you have 200 users logged in, that's 200 queries in the 60 second period, which is about 3-4 queries every second. As you can see, it's getting maxed out, and because of bad luck, the queries are bunched together and are being called at the same time, making 8-9 queries in the same second and that's where the performance is starting to degrade. I just know that if I could get this down to 30 ms, or even 100, we'd be okay for a few months without throwing hardware at the problem. Also keep in mind that other application logic and Hibernate mapping is occurring too, so 3-4 queries a second is already no good when everything is running on a single machine.

The other query I just sent, where you do the query for all users at once and then cache the result, *might* be cheaper than doing a bunch of different queries. However, you may find that doing the query for *all* users takes too long. So you could keep another table indicating who the most recently logged-in people are, and then only cache the info for those people.
This does start getting a little more involved, so see if you can do all users before heading down this road.

> This isn't the best setup, but it's the best we can afford. We are just a new startup company. Cheaper servers and open source keep our costs low. But money is starting to come in after 10 months of hard work, so we'll be able to replace our server within the next 2 months. It'll be a necessity because we are signing on some big clients now and they'll have 40 or 50 users for a single company. If they are all logged in at the same time, that's a lot of queries.

Sure. Just realize you can't really support 200 concurrent connections with a single P4 and 1GB of ram.

John
=:->
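A sketch of the "only cache recently active users" idea from the previous message. The table, column names, and interval here are hypothetical, purely to illustrate the shape:

    -- Touched by the application whenever a user requests a refresh:
    CREATE TABLE active_users (person_id int PRIMARY KEY, last_seen timestamp);
    UPDATE active_users SET last_seen = now() WHERE person_id = 42;

    -- The 60s cache rebuild then restricts itself to users seen recently:
    SELECT person_id FROM active_users
     WHERE last_seen > now() - interval '10 minutes';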
[PERFORM] Select in FOR LOOP Performance
I face a problem when running the following plpgsql function. The problem is that it takes more than 24 hours to complete the calculation. The EMP table has about 200,000 records. I execute the function through psql: select calculate(); (There is no cyclic link inside the data.)

Computer used: IBM xSeries 225, RAM 1GB, SCSI 36GB
O/S: RedHat Linux Enterprise 3.0 AS
PostgreSQL version 8.0.1, fsync=false

I would very much appreciate it if anyone can help find out what the problem is, or suggest any other way to improve the performance of the function. Is there any difference in performance between SELECT in a FOR LOOP and a CURSOR?

EMP table:

    GEN      char(3),
    CODE     varchar(20),
    PARENT   varchar(20),
    POSITION INT4 DEFAULT 0,
    PG       NUMERIC(15,2) DEFAULT 0,
    P        NUMERIC(15,2) DEFAULT 0,
    QUA      CHAR(1) DEFAULT '0',
    ...

    create index EMP_GEN on EMP (GEN);
    create index EMP_CODE on EMP (CODE);
    create index EMP_PARENT on EMP (PARENT);

Sample EMP data:

    GEN  CODE  PARENT  POSITION  P    PG  QUA
    ===  ====  ======  ========  ===  ==  ===
    000  A001          3         100  0   '1'
    001  A002  A001    2         50   0   '1'
    001  A003  A001    1         50   0   '1'
    001  A004  A001    1         20   0   '1'
    002  A005  A003    2         20   0   '1'
    002  A006  A004    3         30   0   '1'

...

    for vTMP_ROW in select CODE, PARENT, POSITION from EMP order by GEN desc loop
        vCODE     := vTMP_ROW.CODE;
        vPARENT   := vTMP_ROW.PARENT;
        nPOSITION := vTMP_ROW.POSITION;

        update EMP set PG = PG + P where CODE = vCODE;

        select into vCURR_ROW PG, POSITION from EMP where CODE = vCODE;
        nPG       := vCURR_ROW.PG;
        nPOSITION := vCURR_ROW.POSITION;

        vUPL := vPARENT;
        loop
            select into vUPL_ROW CODE, PARENT, POSITION, P, QUA
              from EMP where CODE = vUPL;
            if found then
                if vUPL_ROW.POSITION > nPOSITION and vUPL_ROW.QUA = ''1'' then
                    update EMP set PG = PG + nPG where CODE = vUPL;
                    exit;
                end if;
            else
                exit;
            end if;
            vUPL := vUPL_ROW.PARENT;
        end loop;
    end loop;

...

Thank You
Re: [PERFORM] Help with tuning this query (with explain analyze finally)
Ken,

> I did everything you said and my query does perform a bit better. I've been getting speeds from 203 to 219 to 234 milliseconds now. I tried increasing the work mem and the effective cache size from the values you provided, but I didn't see any more improvement. I've tried looking into setting the shared buffers for Windows XP, but I'm not sure how to do it. I'm looking in the manual at:

Now that you know how to change the shared_buffers, want to go ahead and run the query again?

I'm pretty concerned about your case, because based on your description I would expect < 100ms on a Linux machine. So I'm wondering if this is a problem with Windows XP performance, or if it's something we can fix through tuning.

--
Josh Berkus
Aglio Database Solutions
San Francisco
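For reference, these settings live in postgresql.conf (shared_buffers requires a server restart to take effect). The numbers below are illustrative guesses for a 1GB machine, not recommendations; 8.0 expresses shared_buffers and effective_cache_size in 8KB pages and work_mem in KB:

    shared_buffers = 10000          # ~80 MB of dedicated buffer cache
    work_mem = 8192                 # 8 MB per sort/hash operation
    effective_cache_size = 65536    # ~512 MB; a planner hint, not an allocation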
Re: [PERFORM] Select in FOR LOOP Performance
Charles Joseph <[EMAIL PROTECTED]> writes:
> I face a problem when running the following plpgsql function. The problem is that it takes more than 24 hours to complete the calculation. The EMP table has about 200,000 records.

Are you sure there are no infinite loops of PARENT links in your table?

Also, if CODE is supposed to be unique, you should probably declare its index that way. Or at least make sure the planner knows it's unique (have you ANALYZEd the table lately?).

        regards, tom lane
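A sketch of both suggestions, using the table and index names from the original post:

    -- Declare CODE's index unique, so the planner knows at most one row matches:
    DROP INDEX EMP_CODE;
    CREATE UNIQUE INDEX EMP_CODE ON EMP (CODE);

    -- Refresh the row-count and distribution statistics the planner relies on:
    ANALYZE EMP;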
Re: [PERFORM] Help with tuning this query (with explain analyze finally)
John Arbash Meinel wrote:
> Ken wrote:
>> What do you mean by summary table? Basically a cache of the query into a table with replicated column names of all the joins? ...
>
> It's the same idea as a materialized view, or possibly just a lazy cache. ... This unrolls all of the work into a table which should be really fast to query. If this query takes less than 10s to generate, then just have a service run it every 60s. I think for refreshing, it is actually faster to drop the table and recreate it, rather than deleting the entries. Dropping also has the advantage that if you ever add more rows to s or ss, then the table automatically gets the new entries.

Just as a small update: if completely regenerating the cache takes too long, the other way to do it is to create insert and update triggers on s and ss, such that as they change, they also update the cachedview table. Something like:

    CREATE TRIGGER on_ss_ins AFTER INSERT ON ss FOR EACH ROW EXECUTE
        INSERT INTO cachedview
        SELECT p.id AS person_id, s.*, ss.*
          FROM <the big stuff>
         WHERE ss.id = NEW.id;

This runs the same query, but notice that the WHERE means it only allows the new row, so this query should run fast. It is a little bit of overhead on each of your inserts, but it should keep the cache up-to-date.

With something like this, I would have the final client query still include the date restriction, since you accumulate older rows into the cached view. But you can run a daily process that prunes out everything older than 31 days, which keeps the cachedview from getting really large.

John
=:->
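The CREATE TRIGGER above is shorthand: PostgreSQL triggers have to call a function rather than run inline SQL. A minimal runnable sketch of the same pattern, on a deliberately stripped-down, hypothetical schema (the real column lists would mirror the big cache query):

    -- Hypothetical stand-ins for shipment_status and the cache table:
    CREATE TABLE ss (id serial PRIMARY KEY, s_id int, date date);
    CREATE TABLE cachedview (ss_id int, s_id int, date date);

    CREATE FUNCTION cachedview_ss_ins() RETURNS trigger AS $$
    BEGIN
        -- Re-run the cache query, restricted to the row just inserted:
        INSERT INTO cachedview (ss_id, s_id, date)
        VALUES (NEW.id, NEW.s_id, NEW.date);
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER on_ss_ins AFTER INSERT ON ss
        FOR EACH ROW EXECUTE PROCEDURE cachedview_ss_ins();

    -- The daily pruning job mentioned above:
    DELETE FROM cachedview WHERE date < current_date - 31;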
[PERFORM] MAIN vs. PLAIN
I notice that by default, postgres sets numeric fields to storage MAIN. What exactly does that mean? Does that mean it stores them in some type of compressed BCD format? If so, how much performance gain can I expect by setting the storage to PLAIN?

Also, the docs say that char(n) is implemented more or less the same way as text. Does that mean that setting a field to, say, char(2) PLAIN is not going to be any faster than text PLAIN? That seems a bit counter-intuitive. I would hope that a char(2) PLAIN would just reserve two chars in the record structure without any overhead of pointers to external data. Is there a reason this isn't supported?

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East, Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129
Re: [PERFORM] name search query speed
Jeremiah Jahn wrote:
> I have about 5M names stored in my DB. Currently the searches are very quick unless they are on a very common last name, i.e. SMITH. The index is always used, but I still hit 10-20 seconds on a SMITH or Jones search, and I average about 6 searches a second, maxing out at about 30/s. Any suggestions on how I could arrange things to make this search quicker? I have 4GB of mem on a RAID 5 w/ 3 drives. I'm hoping that I can increase this speed w/o a HW upgrade.
>
> thanx,
> -jj-

Is there a chance you could benefit from indices spanning multiple columns? Maybe the user that searches for SMITH knows more than the last name, i.e. first name, location (zip code, name of city, etc.)?
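A sketch of the multicolumn index idea; the table and column names here are hypothetical:

    CREATE INDEX people_last_first_idx ON people (last_name, first_name);

    -- A search that supplies both columns narrows SMITH immediately,
    -- instead of scanning every SMITH row:
    SELECT * FROM people
     WHERE last_name = 'SMITH' AND first_name = 'ALICE';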
Re: [PERFORM] What is the postgres sql command for last_user_id ???
[EMAIL PROTECTED] wrote:
> I would like to know whether there is any command through which the server will give the record ID back to the client when the client inserts data and the server generates an autoincrement ID for that record. For example, if many clients try to put money data on the server, each record from each client gets its own record ID through the server's autoincrement process [x+1], and I don't want to lock the record since that would slow the system down. Each client will then want to know which ID the server gave to its record, in order to select that record and print the receipt [bill]. I know that in MySQL there is a command, last_record_id, which acts the same as I describe above. Does anybody know? Please give me the details.
>
> Amrit, Thailand

http://www.postgresql.org/docs/8.0/static/functions-sequence.html
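The usual pattern with a serial column is currval(), which is local to the session, so concurrent clients never see each other's values and no locking is needed. A sketch with hypothetical names (a serial column "id" on table "receipts" gets a sequence named receipts_id_seq by default):

    INSERT INTO receipts (amount) VALUES (100.00);
    SELECT currval('receipts_id_seq');  -- the ID just assigned in this session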
Re: [PERFORM] multi billion row tables: possible or insane?
On Tue, Mar 01, 2005 at 10:34:29AM +0100, Ramon Bastiaans wrote:
> Hi all,
>
> I am doing research for a project of mine where I need to store several billion values for a monitoring and historical tracking system for a big computer system. My current estimate is that I have to store (somehow) around 1 billion values each month (possibly more).

On a side-note, do you need to keep the actual row-level details for history? http://rrs.decibel.org might be of some use.

Other than that, what others have said: lots and lots of disks in RAID10, and Opterons (though I would choose Opterons not for memory size but because of memory *bandwidth*).

--
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?
Re: [PERFORM] multi billion row tables: possible or insane?
Not true - with fsync on I get nearly 500 tx/s, with it off I'm as high as 1600/sec with dual opteron and 14xSATA drives and 4GB RAM on a 3ware Escalade. Database has 3 million rows.

As long as queries use indexes, multi billion rows shouldn't be too bad. Full table scan will suck though.

Alex Turner
netEconomist

On Tue, 1 Mar 2005 16:40:29 +0100, Vig, Sandor (G/FI-2) <[EMAIL PROTECTED]> wrote:
> 385 transaction/sec? fsync = false, risky but fast.
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of John Arbash Meinel
> Sent: Tuesday, March 01, 2005 4:19 PM
> To: Ramon Bastiaans
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] multi billion row tables: possible or insane?
>
> Ramon Bastiaans wrote:
>> Hi all,
>>
>> I am doing research for a project of mine where I need to store several billion values for a monitoring and historical tracking system for a big computer system. My current estimate is that I have to store (somehow) around 1 billion values each month (possibly more).
>
> If you have that 1 billion perfectly distributed over all hours of the day, then you need 1e9/30/24/3600 = 385 transactions per second. Which I'm pretty sure is possible with postgres, you just need pretty beefy hardware. And like Jeff said, lots of disks for lots of IO. Like a quad opteron with 16GB of ram, and around 14-20 very fast disks. raid10 not raid5, etc.
>
> To improve query performance, you can do some load balancing by having replication machines using Slony. Or if you can do batch processing, you could split up the work into a few update machines, which then do bulk updates on the master database. This lets you get more machines into the job, since you can't share a database across multiple machines.
>
>> I was wondering if anyone has had any experience with this kind of big numbers of data in a postgres sql database and how this affects database design and optimization.
>
> Well, one of the biggest things is whether you can get bulk updates, or whether clients can handle data being slightly out of date, so you can use caching. Can you segregate your data into separate tables as much as possible? Are your clients okay if aggregate information takes a little while to update? One trick is to use semi-lazy materialized views to get your updates to be fast.
>
>> What would be important issues when setting up a database this big, and is it at all doable? Or would it be insane to think about storing up to 5-10 billion rows in a postgres database.
>
> I think if you can design the db properly, it is doable. But if you have clients saying "I need up-to-the-second information on 1 billion rows", you're never going to get it.
>
>> The database's performance is important. There would be no use in storing the data if a query will take ages. Queries should be quite fast if possible.
>
> Again, it depends on the queries being done. There are some nice tricks you can use, like doing month-by-month partitioning (if you are getting 1G inserts, you might want week-by-week partitioning), and then with a date column index and a union all view you should be able to get pretty good insert speed, and still keep fast *recent* queries. Going through 1 billion rows is always going to be expensive.
>
>> I would really like to hear people's thoughts/suggestions or "go see a shrink, you must be mad" statements ;)
>>
>> Kind regards,
>> Ramon Bastiaans
>
> I think it would be possible, but there are a lot of design issues with a system like this.
> You can't go into it thinking that you can design a multi billion row database the same way you would design a million row db.
>
> John
> =:->
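A sketch of the month-by-month union-all partitioning John describes. PostgreSQL 8.0 has no built-in partitioning, so the partitions are plain tables behind a view; all names here are hypothetical:

    CREATE TABLE readings_2005_02 (ts timestamp NOT NULL, value numeric);
    CREATE TABLE readings_2005_03 (ts timestamp NOT NULL, value numeric);
    CREATE INDEX readings_2005_02_ts ON readings_2005_02 (ts);
    CREATE INDEX readings_2005_03_ts ON readings_2005_03 (ts);

    -- Inserts go straight into the current month's table (small, fast index);
    -- queries go through the view, and recent-date queries mostly touch
    -- only the newest table's index:
    CREATE VIEW readings AS
        SELECT * FROM readings_2005_02
        UNION ALL
        SELECT * FROM readings_2005_03;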
Re: [PERFORM] MAIN vs. PLAIN
Dave Held <[EMAIL PROTECTED]> writes:
> I notice that by default, postgres sets numeric fields to storage MAIN. What exactly does that mean?

See http://developer.postgresql.org/docs/postgres/storage-toast.html

There isn't any amazingly strong reason why numeric defaults to MAIN rather than EXTENDED, which is the default for every other toastable datatype --- except that I thought it'd be a good idea to have at least one type that did so, just to exercise that code path in the tuple toaster. And numeric shouldn't ordinarily be large enough to need out-of-line storage anyway. It's unlikely even to need compression, really, but as long as it's a varlena type, the overhead to support toasting is nearly nil.

        regards, tom lane
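To experiment with the storage modes Tom describes, a column's strategy can be changed and inspected directly (real syntax; the table and column names are hypothetical):

    ALTER TABLE prices ALTER COLUMN amount SET STORAGE PLAIN;

    -- attstorage is 'p' = PLAIN, 'm' = MAIN, 'x' = EXTENDED, 'e' = EXTERNAL:
    SELECT attname, attstorage
      FROM pg_attribute
     WHERE attrelid = 'prices'::regclass AND attnum > 0;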