[SQL] select is fast, update based on same where clause is slow
I have a select statement that returns a set of 74,000+ results back in under a minute as follows: select s.sessid, s.membid, s.datetime from sessions2 s, (select min(datetime) as datetime, membid from sessions2 where membid is not null group by membid) as minsess where s.membid = minsess.membid and s.datetime = minsess.datetime; The final cost from explain for the above select is 22199.15 ... 24318.40 with rows = 5 and width = 28. Then I issue an update as follows (to update those 74,000+ rows): update sessions2 set sinceinception = 0 from sessions2 s, (select min(datetime) as datetime, membid from sessions2 group by membid) as mindate where s.membid = mindate.membid and s.datetime = mindate.datetime; The final cost from explain for the above update is 31112.11...98869.91 with rows = 2013209 and width=87. This update statement has been left running over night and does not complete. The ram usage on the machine is at about 3/4 capacity (800mb) during this time and CPU usage is near 100%. The machine has the -F option set and memory segments of 200mb and is running 7.1.2. What could be causing this update statement to not complete? Why are the costs so different since it seems to me that besides the cost of the update they are the same query? Any help would be great! Jeff Barrett ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [SQL] select is fast, update based on same where clause is slow
On Fri, 21 Sep 2001, Jeff Barrett wrote: > I have a select statement that returns a set of 74,000+ results back in > under a minute as follows: > > select s.sessid, s.membid, s.datetime > from sessions2 s, (select min(datetime) as datetime, membid > from sessions2 > where membid is not null > group by membid) as minsess > where s.membid = minsess.membid > and s.datetime = minsess.datetime; > > The final cost from explain for the above select is 22199.15 ... 24318.40 > with rows = 5 and width = 28. > > Then I issue an update as follows (to update those 74,000+ rows): > update sessions2 set sinceinception = 0 > from sessions2 s, (select min(datetime) as datetime, membid from sessions2 > group by membid) as mindate > where s.membid = mindate.membid > and s.datetime = mindate.datetime; > > The final cost from explain for the above update is 31112.11...98869.91 with > rows = 2013209 and width=87. > > This update statement has been left running over night and does not > complete. The ram usage on the machine is at about 3/4 capacity (800mb) > during this time and CPU usage is near 100%. The machine has the -F option > set and memory segments of 200mb and is running 7.1.2. > > What could be causing this update statement to not complete? > Why are the costs so different since it seems to me that besides the cost of > the update they are the same query? I thought that the updated table is always in your from list (implicitly), so you'd want: update sessions2 set sinceinception = 0 from (select min(datetime) as datetime, membid from sessions2 group by membid) as mindate where sessions2.membid=mindate.membid and sessions2.datetime=mindate.datetime; I think your query would be joining the s/mindate results against sessions2. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [SQL] select is fast, update based on same where clause is slow
Jeff, I think that you're running into a syntactical problem here: > Then I issue an update as follows (to update those 74,000+ rows): > update sessions2 set sinceinception = 0 > from sessions2 s, (select min(datetime) as datetime, membid from > sessions2 > group by membid) as mindate > where s.membid = mindate.membid > and s.datetime = mindate.datetime; In the query above, there is no linkage between the instance of sessions2 (sessions2) you are updating and the instance you are limiting (s). As a result, you are running an update on 74,000^2 rows (about 55 trillion), and if the query ever completed you would find that sinceinception would be set to 0 for all rows in sessions2. Your query should read: update sessions2 set sinceinception = 0 from (select min(datetime) as datetime, membid from sessions2 group by membid) as mindate where sessions2.membid = mindate.membid and sessions2.datetime = mindate.datetime; It's a peculiarity of Update queries that the name of the updated table cannot be aliased anywhere in the query. To add to the confusion, in MS Transact SQL, you are expected to repeat the name of the updated table in the FROM clause, while in Postgres such repetition is prohibited. This is mainly due to the fact that UPDATE ... FROM is not well-defined in the SQL 92 standard. -Josh __AGLIO DATABASE SOLUTIONS___ Josh Berkus Complete information technology [EMAIL PROTECTED] and data management solutions (415) 565-7293 for law firms, small businessesfax 621-2533 and non-profit organizations. San Francisco ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [SQL] Selecting latest value II
Hi back Carl van Tast wrote: > > Hi, Thurstan > > On Thu, 20 Sep 2001 17:30:46 +0100, "Thurstan R. McDougle" > <[EMAIL PROTECTED]> wrote: > > > [...] > >Carl van Tast had 2 good methods as follows > > > >SELECT userid, val > >FROM tbl > >WHERE NOT EXISTS (SELECT * FROM tbl AS t2 > > WHERE tbl.userid=t2.userid AND t2.ts > tbl.ts); > > > >or > > > >SELECT tbl.userid, tbl.val > >FROM tbl > > INNER JOIN > > (SELECT userid, max(ts) AS maxts > > FROM tbl > > GROUP BY userid) AS t2 > > ON (tbl.userid=t2.userid AND tbl.ts=t2.maxts); > > ... although I like Tom Lane's even better. This one should outperform > all others, especially my first one: > > > SELECT DISTINCT ON (userid) userid, val, ts FROM table > > ORDER BY userid, ts DESC; Indeed. I was looking for a 1 pass method but had not found one. I had thought of DISTINCT, but not DISTINCT ON as it I have not used it yet... Even though I had just been looking at the code for the DISTINCT/DISTINCT ON processing! > > That's the reason I'm here: learning by helping :-) Well, I think of helping more as paying my dues for the help received. > > Kind regards > Carl van Tast -- This is the identity that I use for NewsGroups. Email to this will just sit there. If you wish to email me replace the domain with knightpiesold . co . uk (no spaces). ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [SQL] select is fast, update based on same where clause is slow
That did the trick. Thank you for the quick detailed answer. It runs in about a minute now. Jeff Barrett "Stephan Szabo" <[EMAIL PROTECTED]> wrote in message [EMAIL PROTECTED]">news:[EMAIL PROTECTED]... > On Fri, 21 Sep 2001, Jeff Barrett wrote: > > > I have a select statement that returns a set of 74,000+ results back in > > under a minute as follows: > > > > select s.sessid, s.membid, s.datetime > > from sessions2 s, (select min(datetime) as datetime, membid > > from sessions2 > > where membid is not null > > group by membid) as minsess > > where s.membid = minsess.membid > > and s.datetime = minsess.datetime; > > > > The final cost from explain for the above select is 22199.15 ... 24318.40 > > with rows = 5 and width = 28. > > > > Then I issue an update as follows (to update those 74,000+ rows): > > update sessions2 set sinceinception = 0 > > from sessions2 s, (select min(datetime) as datetime, membid from sessions2 > > group by membid) as mindate > > where s.membid = mindate.membid > > and s.datetime = mindate.datetime; > > > > The final cost from explain for the above update is 31112.11...98869.91 with > > rows = 2013209 and width=87. > > > > This update statement has been left running over night and does not > > complete. The ram usage on the machine is at about 3/4 capacity (800mb) > > during this time and CPU usage is near 100%. The machine has the -F option > > set and memory segments of 200mb and is running 7.1.2. > > > > What could be causing this update statement to not complete? > > Why are the costs so different since it seems to me that besides the cost of > > the update they are the same query? > > I thought that the updated table is always in your from list (implicitly), > so you'd want: > update sessions2 set sinceinception = 0 > from (select min(datetime) as datetime, membid from sessions2 group by > membid) as mindate > where sessions2.membid=mindate.membid and > sessions2.datetime=mindate.datetime; > > I think your query would be joining the s/mindate results against > sessions2. > > > > ---(end of broadcast)--- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to [EMAIL PROTECTED] so that your > message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: Q on "Re: [SQL] select is fast, update based on same where clause is slow "
Bladvin, > I understand why did Jeff's original solution not work. I understand > why yours do. > > But could you please explain me how the 74000^2 can be calculated > for this original query?: Sure. In Jeff's original query, the updated instance of the table (sessions2) and the referenced instance (s) are seperate, without any JOIN or WHERE conditions to link them. Under these conditions, SQL engines do a CROSS JOIN, where every single one of the rows in the first table is matched against every single one of the rows in the second table. This gives you a result set of the number of rows in sessions2 * the number of rows in s. 74,000^2 Get it? Example: TableA: 1 A 2 B Table B 7 H 8 G SELECT * FROM TableA, TableB 1 A 7 H 1 A 8 G 2 B 7 H 2 B 8 G -Josh -Josh __AGLIO DATABASE SOLUTIONS___ Josh Berkus Complete information technology [EMAIL PROTECTED] and data management solutions (415) 565-7293 for law firms, small businessesfax 621-2533 and non-profit organizations. San Francisco ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
