On Sun, Apr 12, 2015 at 5:40 PM, Jim Nasby <jim.na...@bluetreble.com> wrote:

> On 4/9/15 6:18 PM, Pawel Veselov wrote:
>
>> Hi.
>>
>> I have a plpgsql procedure that updates a few similar tables.
>> for some reason, updates on one of the tables take a lot longer the
>> updates on the other ones. The difference is, say, 7 seconds vs. 80
>> milliseconds.
>>
>> the procedure uses cursors and record variables to do the updates. For
>> example:
>>
>>          update r_agrio_total set
>>            unserved = unserved + (agrow->>'unserved')::numeric(38),
>>            r_brkconn = mush_brk_conn(r_brkconn, q_item.r_brkconn),
>>          where
>>            tagid = _tagid and
>>            unitid = (akey->>'unitid')::numeric and
>>            placement = (akey->>'placement')::numeric and
>>            device_type = (akey->>'device_type')::numeric;
>>
>> There is another table (xq_agr) that is read record by record, and for
>> each of those records, such update is executed.
>>
>> I was trying to select analyze the updates to see where the time could
>> be spent.
>> There are only 24 row in the "bad" table, and 3,400 rows in "good"
>> table. So, for the "bad" table, most of the updates will be on the same
>> rows. The times were measured on processing 100 original records.
>>
>> When I'm analyzing pure update statements, I don't see anything strange.
>>
>> "bad" table: explain analyze update r_agrio_total set unconfirmed =
>> unconfirmed +0 where tagid = 1000 and unitid = 1000 and placement = 0
>> and device_type = 100;
>>
>> RESULT:
>>   Update on r_agrio_total  (cost=0.42..4.46 rows=1 width=321) (actual
>> time=0.253..0.253 rows=0 loops=1)
>>     ->  Index Scan using tag_r_agrio_total on r_agrio_total
>>   (cost=0.42..4.46 rows=1 width=321) (actual time=0.037..0.041 rows=1
>> loops=1)
>>           Index Cond: (tagid = 1000::numeric)
>>           Filter: ((unitid = 1000::numeric) AND (placement = 0::numeric)
>> AND (device_type = 100::numeric))
>>           Rows Removed by Filter: 7
>>   Total runtime: 0.282 ms
>>
>> "good" table: explain analyze update r_agrio_hourly set unconfirmed =
>> unconfirmed +0 where tagid = 1000 and unitid = 1000 and placement = 0
>> and device_type = 100 and rowdate = '2015-02-23T13';
>>
>> RESULT:
>>   Update on r_agrio_hourly  (cost=0.42..17.36 rows=6 width=329) (actual
>> time=0.102..0.102 rows=0 loops=1)
>>     ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
>>   (cost=0.42..17.36 rows=6 width=329) (actual time=0.047..0.048 rows=1
>> loops=1)
>>           Index Cond: ((tagid = 1000::numeric) AND (unitid =
>> 1000::numeric) AND ((rowdate)::text = '2015-02-23T13'::text) AND
>> (device_type = 100::numeric) AND (placement = 0::numeric))
>>   Total runtime: 0.135 ms
>>
>> When I try doing it with WITH statement (really, to apply the actual
>> data that the plpgsql function uses), there is something strange in the
>> "bad" table.
>>
>> explain analyze
>> with SRC as (select * from xq_agr where id = 914830)
>>          update r_agrio_total set
>>            unconfirmed = unconfirmed +
>> (SRC.r_agrio->>'unconfirmed')::numeric(38)
>>          from SRC
>>          where
>>            tagid = (SRC.r_agrio->'key'->>'tagid')::numeric and
>>            unitid = (SRC.r_agrio->'key'->>'unit')::numeric and
>>            placement = (SRC.r_agrio->'key'->>'placement')::numeric and
>>            device_type = (SRC.r_agrio->'key'->>'device_type')::numeric;
>>
>> RESULT:
>>   Update on r_agrio_total  (cost=8.91..32777.51 rows=19331 width=409)
>> (actual time=0.107..0.107 rows=0 loops=1)
>>     CTE src
>>       ->  Index Scan using xq_agr_pkey on xq_agr  (cost=0.42..8.44
>> rows=1 width=379) (actual time=0.026..0.027 rows=1 loops=1)
>>             Index Cond: (id = 914830)
>>     ->  Nested Loop  (cost=0.46..32769.07 rows=19331 width=409) (actual
>> time=0.107..0.107 rows=0 loops=1)
>>           ->  CTE Scan on src  (cost=0.00..0.02 rows=1 width=88) (actual
>> time=0.032..0.033 rows=1 loops=1)
>>           ->  Index Scan using u_r_agrio_total on r_agrio_total
>>   (*cost=0.46..32285.78 rows=19331* width=321) (actual time=0.001..0.001
>>
>> rows=0 loops=1)
>>                 Index Cond: ((tagid = (((src.r_agrio -> 'key'::text) ->>
>> 'tagid'::text))::numeric) AND (unitid = (((src.r_agrio -> 'key'::text)
>> ->> 'unit'::text))::numeric) AND (device_type = (((src.r_agrio ->
>> 'key'::text) ->> 'device_type'::text))::numeric) AND (placement =
>> (((src.r_agrio -> 'key'::text) ->> 'placement'::text))::numeric))
>>   Total runtime: 0.155 ms
>>
>> explain analyze
>> with SRC as (select * from xq_agr where id = 914830)
>>          update r_agrio_hourly set
>>            unconfirmed = unconfirmed +
>> (SRC.r_agrio->>'unconfirmed')::numeric(38)
>>          from SRC
>>          where
>>            tagid = (SRC.r_agrio->'key'->>'tagid')::numeric and
>>            unitid = (SRC.r_agrio->'key'->>'unit')::numeric and
>>            placement = (SRC.r_agrio->'key'->>'placement')::numeric and
>>            device_type = (SRC.r_agrio->'key'->>'device_type')::numeric
>> and
>>            rowdate = (SRC.r_agrio->'key'->>'rowdate');
>>
>> RESULT:
>>   Update on r_agrio_hourly  (cost=8.91..52.91 rows=20 width=417) (actual
>> time=0.123..0.123 rows=0 loops=1)
>>     CTE src
>>       ->  Index Scan using xq_agr_pkey on xq_agr  (cost=0.42..8.44
>> rows=1 width=379) (actual time=0.023..0.024 rows=1 loops=1)
>>             Index Cond: (id = 914830)
>>     ->  Nested Loop  (cost=0.47..44.47 rows=20 width=417) (actual
>> time=0.121..0.121 rows=0 loops=1)
>>           ->  CTE Scan on src  (cost=0.00..0.02 rows=1 width=88) (actual
>> time=0.030..0.031 rows=1 loops=1)
>>           ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
>>   (*cost=0.47..43.95 rows=20* width=329) (actual time=0.000..0.000
>>
>> rows=0 loops=1)
>>                 Index Cond: ((tagid = (((src.r_agrio -> 'key'::text) ->>
>> 'tagid'::text))::numeric) AND (unitid = (((src.r_agrio -> 'key'::text)
>> ->> 'unit'::text))::numeric) AND ((rowdate)::text = ((src.r_agrio ->
>> 'key'::text) ->> 'rowdate'::text)) AND (device_type = (((src.r_agrio ->
>> 'key'::text) ->> 'device_type'::text))::numeric) AND (placement =
>> (((src.r_agrio -> 'key'::text) ->> 'placement'::text))::numeric))
>>   Total runtime: 0.176 ms
>>
>> I've tried doing vacuum full analyze, rebuilding indexes. Vacuum did
>> help somewhat, but the time differential is still huge (that 7sec vs.
>> 80ms)
>>
>> The table structure (some fields thrown out just to save output length):
>>
>> => \d r_agrio_total
>>        Column       |     Type      |                         Modifiers
>> -------------------+---------------+------------------------
>> ------------------------------------
>>   id                | bigint        | not null default
>> nextval('r_agrio_total_id_seq'::regclass)
>>   tagid             | numeric(38,0) | not null
>>   unitid            | numeric(38,0) | not null
>>   device_type       | numeric(38,0) | not null
>>   placement         | numeric(38,0) | not null default 0
>>   unserved          | numeric(38,0) | not null default 0
>>   r_brkconn         | json          |
>> Indexes:
>>      "r_agrio_total_pkey" PRIMARY KEY, btree (id)
>>      "u_r_agrio_total" UNIQUE, btree (tagid, unitid, device_type,
>> placement)
>>      "unit_r_agrio_total" btree (unitid)
>>      "tag_r_agrio_total" btree (tagid)
>>
>> => \d r_agrio_hourly
>>        Column       |         Type          |
>>   Modifiers
>> -------------------+-----------------------+----------------
>> ---------------------------------------------
>>   id                | bigint                | not null default
>> nextval('r_agrio_hourly_id_seq'::regclass)
>>   tagid             | numeric(38,0)         | not null
>>   unitid            | numeric(38,0)         | not null
>>   rowdate           | character varying(15) | not null
>>   device_type       | numeric(38,0)         | not null
>>   placement         | numeric(38,0)         | not null default 0
>>   unserved          | numeric(38,0)         | not null default 0
>>   unconfirmed       | numeric(38,0)         | not null default 0
>>   r_brkconn         | json                  |
>> Indexes:
>>      "r_agrio_hourly_pkey" PRIMARY KEY, btree (id)
>>      "u_r_agrio_hourly" UNIQUE, btree (tagid, unitid, rowdate,
>> device_type, placement)
>>      "unit_r_agrio_hourly" btree (unitid)
>>      "rowdate_r_agrio_hourly" btree (rowdate)
>>      "tag_r_agrio_hourly" btree (tagid)
>>
>> Would appreciate any help on this, thank you!
>>
>
> What is the ->> operator? I'm not familiar with it.
>

It extracts JSON value as a string.


> numeric is FAR slower than int/bigint. Unless you *really* need 38 digits,
> use one of the other types.
>

Yeah, that's on purpose.


> Cursors tend to make things slow. Avoid them if you can.
>

Is there an alternative to iterating over a number of rows, where a direct
update query is not an option?

I really doubt that either the actual processing logic, including use of
types has anything to do with my problem. This is based on the fact that
out of the tables that are being changed, only one is exhibiting the
problem. All of the involved tables have nearly the same structure, and
have the same logical operations performed on them. I thought may be the
"bad" table is slow because it was first in the list, and Postgres was
caching the functions results, but I moved things around, and pattern is
the same.


> As for your specific question, I suggest you modify the plpgsql function
> so that it's doing an EXPLAIN ANALYZE on the slow table. EXPLAIN ANALYZE
> actually returns a recordset the same way a SELECT would, with a single
> column of type text. So you just need to do something with that output. The
> easiest thing would be to replace this in your function:
>
> UPDATE slow_table SET ...
>
> to this (untested)
>
> RETURN QUERY EXPLAIN ANALYZE UPDATE slow_table SET ...
>
> and change the function so it returns SETOF text instead of whatever it
> returns now.
>
>
Thank you, that made it a lot easier to see into what's really going on.
But the outcome is somewhat the same. The "bad" table analysis shows a very
high cost, and thousands of rows, where the table contains only 24 rows.
This time, however, the actual run time is shown, and one can see where the
time is spent (I was using just a sum of clock_time()s around the update
statements to see where the problem is).

r_agrio_hourly - "good", r_agrio_total - "bad".

 Update on r_agrio_hourly  (cost=0.42..970.32 rows=250 width=329) (actual
time=2.248..2.248 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
 (cost=0.42..970.32 rows=250 width=329) (actual time=0.968..1.207 rows=1
loops=1)
         Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric)
AND ((rowdate)::text = '2015-04-09T23'::text) AND (device_type =
3::numeric) AND (placement = 2::numeric))
 Total runtime: 2.281 ms
 Update on r_agrio_total  (cost=0.42..45052.56 rows=12068 width=321)
(actual time=106.766..106.766 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_total on r_agrio_total
 (cost=0.42..45052.56 rows=12068 width=321) (actual time=0.936..32.626
rows=1 loops=1)
         Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric)
AND (device_type = 3::numeric) AND (placement = 2::numeric))
 Total runtime: 106.793 ms
 Update on r_agrio_hourly  (cost=0.42..978.49 rows=252 width=329) (actual
time=1.796..1.796 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
 (cost=0.42..978.49 rows=252 width=329) (actual time=1.054..1.090 rows=1
loops=1)
         Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric)
AND ((rowdate)::text = '2015-04-09T23'::text) AND (device_type =
1::numeric) AND (placement = 2::numeric))
 Total runtime: 1.824 ms
 Update on r_agrio_total  (cost=0.42..11553.27 rows=3017 width=321) (actual
time=102.685..102.685 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_total on r_agrio_total
 (cost=0.42..11553.27 rows=3017 width=321) (actual time=0.675..33.375
rows=1 loops=1)
         Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric)
AND (device_type = 1::numeric) AND (placement = 2::numeric))
 Total runtime: 102.711 ms
 Update on r_agrio_hourly  (cost=0.42..1005.13 rows=259 width=329) (actual
time=1.962..1.962 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
 (cost=0.42..1005.13 rows=259 width=329) (actual time=0.978..1.124 rows=1
loops=1)
         Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric)
AND ((rowdate)::text = '2015-04-09T23'::text) AND (device_type =
2::numeric) AND (placement = 2::numeric))
 Total runtime: 1.992 ms
 Update on r_agrio_total  (cost=0.42..66541.26 rows=18103 width=321)
(actual time=108.574..108.574 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_total on r_agrio_total
 (cost=0.42..66541.26 rows=18103 width=321) (actual time=0.761..33.245
rows=1 loops=1)
         Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric)
AND (device_type = 2::numeric) AND (placement = 2::numeric))
 Total runtime: 108.601 ms

(and there is ~800 more, with the same pattern).


Thank you!

Reply via email to