On 4/13/15 7:01 PM, Pawel Veselov wrote:
    Cursors tend to make things slow. Avoid them if you can.


Is there an alternative to iterating over a number of rows, where a
direct update query is not an option?

I really doubt that the actual processing logic, including the use of
types, has anything to do with my problem. This is based on the fact that,
out of the tables being changed, only one exhibits the
problem. All of the involved tables have nearly the same structure and
have the same logical operations performed on them. I thought maybe the
"bad" table was slow because it was first in the list, and Postgres was
caching the function results, but I moved things around and the pattern is
the same.

I'm guessing that you're essentially processing a queue. Take a look at http://www.postgresql.org/message-id/552c750f.2010...@bluetreble.com for some ideas. Basically, not only do cursors have non-trivial overhead, but doing a ton of single-row queries carries non-trivial overhead of its own.
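
For illustration only, a minimal sketch of collapsing per-row updates into one set-based statement; the staging table "pending" and the counter column "cnt" are assumed names, not from your schema:

  -- apply all pending deltas in one statement instead of looping
  UPDATE r_agrio_total AS t
     SET cnt = t.cnt + s.cnt
    FROM (SELECT tagid, unitid, sum(cnt) AS cnt
            FROM pending
           GROUP BY tagid, unitid) AS s
   WHERE t.tagid = s.tagid
     AND t.unitid = s.unitid;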

    As for your specific question, I suggest you modify the plpgsql
    function so that it's doing an EXPLAIN ANALYZE on the slow table.
    EXPLAIN ANALYZE actually returns a recordset the same way a SELECT
    would, with a single column of type text. So you just need to do
    something with that output. The easiest thing would be to replace
    this in your function:

    UPDATE slow_table SET ...

    with this (untested):

    RETURN QUERY EXPLAIN ANALYZE UPDATE slow_table SET ...

    and change the function so it returns SETOF text instead of whatever
    it returns now.
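
A minimal sketch of such a function, assuming a hypothetical table "slow_table" with columns "cnt", "tagid", and "unitid"; EXECUTE ... USING is used here as a safe way to feed EXPLAIN through RETURN QUERY:

  CREATE OR REPLACE FUNCTION debug_slow_update(p_tagid numeric,
                                               p_unitid numeric)
  RETURNS SETOF text
  LANGUAGE plpgsql AS $$
  BEGIN
      -- note: EXPLAIN ANALYZE actually executes the UPDATE
      RETURN QUERY EXECUTE
          'EXPLAIN ANALYZE UPDATE slow_table SET cnt = cnt + 1'
          || ' WHERE tagid = $1 AND unitid = $2'
          USING p_tagid, p_unitid;
  END;
  $$;

  -- each plan line comes back as one text row:
  SELECT * FROM debug_slow_update(1002, 1002);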


Thank you, that made it a lot easier to see what's really going on.
But the outcome is much the same. The analysis of the "bad" table shows a
very high estimated cost and thousands of estimated rows, where the table
contains only 24 rows. This time, however, the actual run time is shown,
and one can see where the time is spent (previously I was just summing
clock_timestamp() deltas around the update statements to find the problem).
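
For reference, the manual timing mentioned above might look roughly like this (a sketch; the column "cnt" is a placeholder):

  DO $$
  DECLARE
      t0    timestamptz;
      spent interval := interval '0';
  BEGIN
      t0 := clock_timestamp();
      UPDATE r_agrio_total SET cnt = cnt + 1 WHERE tagid = 1002;
      spent := spent + (clock_timestamp() - t0);
      RAISE NOTICE 'time spent in updates: %', spent;
  END;
  $$;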

r_agrio_hourly - "good", r_agrio_total - "bad".

  Update on r_agrio_hourly  (cost=0.42..970.32 rows=250 width=329) (actual time=2.248..2.248 rows=0 loops=1)
    ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly  (cost=0.42..970.32 rows=250 width=329) (actual time=0.968..1.207 rows=1 loops=1)
          Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric) AND ((rowdate)::text = '2015-04-09T23'::text) AND (device_type = 3::numeric) AND (placement = 2::numeric))
  Total runtime: 2.281 ms

  Update on r_agrio_total  (cost=0.42..45052.56 rows=12068 width=321) (actual time=106.766..106.766 rows=0 loops=1)
    ->  Index Scan using u_r_agrio_total on r_agrio_total  (cost=0.42..45052.56 rows=12068 width=321) (actual time=0.936..32.626 rows=1 loops=1)
          Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric) AND (device_type = 3::numeric) AND (placement = 2::numeric))
  Total runtime: 106.793 ms

Keep in mind that the estimated cost is not terribly useful; it's the actual times that matter.

I suspect what's happening here is a combination of things. First, the hourly table is basically living in cache, but the total table is not. That means that when you go to find a row in the total table you're actually hitting the disk instead of pulling the data from memory.

Second, you may have a lot of dead rows in the total table. I suspect this because of the very large amount of time the index scan is taking. Unless you're running on an old 10MB MFM drive, you'd be pretty hard pressed for even 2 IO operations (one for the index leaf page and one for the heap page) to take 32ms. I suspect the index scan is having to read through many dead rows before it finds a live one, incurring multiple IOs along the way. Switching to EXPLAIN (ANALYZE, BUFFERS) would help confirm that.
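
A sketch of both checks; the SET clause is a placeholder, and the WHERE clause is copied from the plan above:

  -- per-node buffer counts; "read" = pages fetched from disk
  EXPLAIN (ANALYZE, BUFFERS)
  UPDATE r_agrio_total
     SET cnt = cnt + 1
   WHERE tagid = 1002 AND unitid = 1002
     AND device_type = 3 AND placement = 2;

  -- dead-tuple count according to the statistics collector:
  SELECT n_live_tup, n_dead_tup, last_autovacuum
    FROM pg_stat_user_tables
   WHERE relname = 'r_agrio_total';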

Third, I think something odd is happening with the update itself. I'm pretty sure that the index scan itself is visiting the heap pages, so each page should be in shared buffers by the time each tuple hits the update node. That makes me wonder what on earth is taking 60ms to update the tuple. I suspect the time is going into either finding a free buffer to put the new tuple on, or waiting to extend the relation. Selecting ctid from the freshly updated rows and comparing the first number to the total number of pages in the heap would show whether the new tuples are all ending up at the end of the heap.
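
A sketch of that check; ctid is a (page, item) pair, so the page numbers can be compared against the total heap size:

  -- pages where the matching tuples currently live:
  SELECT ctid
    FROM r_agrio_total
   WHERE tagid = 1002 AND unitid = 1002
     AND device_type = 3 AND placement = 2;

  -- total number of pages in the heap:
  SELECT pg_relation_size('r_agrio_total') /
         current_setting('block_size')::bigint AS heap_pages;

If the ctid page numbers keep landing near heap_pages - 1, the new tuple versions are all being appended at the end of the table.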
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

