Re: [HACKERS] insert/update performance

2016-01-27 Thread Jinhua Luo
>
> But what kind of rows would satisfy heap_page_prune() and what would not?
>
> In my case all updates are doing the same thing (there is no HOT
> updates, obviously), but why some updated rows are reported by
> heap_page_prune() but the others are not? And it's also a random
> issue. That means sometimes heap_page_prune() would report all
> removable rows, and sometimes it reports no rows.
>

I check the codes again.

The heap_page_prune() would skip items if ItemIdIsDead() returns true.

That means some obsoleted items are flagged dead before vacuum, and I
found 3 places:

1) heap_page_prune_opt() --> heap_page_prune() --> ItemIdSetDead()
2) _bt_check_unique() --> ItemIdMarkDead()
3) _bt_killitems() --> ItemIdMarkDead()

In my case, the first one happens most frequently.
And it's interesting that it's invoked from select statement!

 0x80ca000 : heap_page_prune_opt+0x0/0x1a0
 0x80d030d : index_fetch_heap+0x11d/0x140
 0x80d035e : index_getnext+0x2e/0x40
 0x81eec9b : IndexNext+0x3b/0x100
 0x81e4ddf : ExecScan+0x15f/0x290
 0x81eed8d : ExecIndexScan+0x2d/0x50
 0x81ddb20 : ExecProcNode+0x1f0/0x2a0
 0x81dac6c : standard_ExecutorRun+0xfc/0x160
 0x82d0503 : PortalRunSelect+0x183/0x200
 0x82d17da : PortalRun+0x26a/0x3c0
 0x82cf452 : PostgresMain+0x2282/0x2fc0
 0x8097f52 : ServerLoop+0xb1b/0xec2
 0x82793d7 : PostmasterMain+0x1237/0x13c0
 0x8098b6c : main+0x48c/0x4d4
 0xb754fa83 : __libc_start_main+0xf3/0x210
 0x8098bd5 : _start+0x21/0x2c


Regards,
Jinhua Luo


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] insert/update performance

2016-01-26 Thread Jinhua Luo
Ok, I found the vaccum output is correct.

I check the codes of lazy_scan_heap(), the rows to be removed are
reported in two parts, one is return of heap_page_prune(), the other
is ItemIdIsDead() when scanning the page.

After scanning all pages of the relation, those rows would be clean up in:

if (vacrelstats->num_dead_tuples > 0) {
...
lazy_vacuum_heap()
...
}

It would then output
> INFO:  "test": removed 6 row versions in 1 pages

The number of rows is correct.

But what kind of rows would satisfy heap_page_prune() and what would not?

In my case all updates are doing the same thing (there is no HOT
updates, obviously), but why some updated rows are reported by
heap_page_prune() but the others are not? And it's also a random
issue. That means sometimes heap_page_prune() would report all
removable rows, and sometimes it reports no rows.


Regards,
Jinhua Luo


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] insert/update performance

2016-01-23 Thread Amit Kapila
On Sat, Jan 23, 2016 at 1:16 PM, Jinhua Luo  wrote:
>
> Hi,
>
> The vacuum doesn't recycle the rows obsoleted by update?
>

It does free up the space which can be used by future inserts.

> I don't think
> so. In the above vacuum result, I do not delete any rows, but the
> vacuum still recycles a fraction of rows, obviously they're obsoleted
> by update.
>
> I know plain vacuum (without full option) do not reduce the size of
> the whole table file/segments, but it should refresh the fsm. In my
> case, the update statement did modify the index column, but is it
> related to the fsm? I think anyways, the update would obsolete
> previous versions, as long as they are not hold by any active
> transactions, they would be recycled and count in the fsm, right?

I also think it will be added to fsm.

>  I
> just cannot understand why the recycle ratio is not 50%.
>

At the moment, I am also not able to see why it is so.  You might
want to first try with a simple test (Can you extract insert/update
statements from application and run it manually for couple of times
and then run Vacuum to see the result).

By anychance have you set a value for vacuum_defer_cleanup_age?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] insert/update performance

2016-01-23 Thread Jinhua Luo
Hi,

2016-01-23 18:40 GMT+08:00 Amit Kapila :
> At the moment, I am also not able to see why it is so.  You might
> want to first try with a simple test (Can you extract insert/update
> statements from application and run it manually for couple of times
> and then run Vacuum to see the result).

I try to do it manually, the issue is the same. It's weird that for
the index, the number of removed rows is correct. Just the table
itself is wrong (Sometimes it's correct too, it seems that it's a
random issue, I'm so confused).

>
> By anychance have you set a value for vacuum_defer_cleanup_age?
>

No, I do not configure it.

Regards,
Jinhua Luo


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] insert/update performance

2016-01-23 Thread Tom Lane
Jinhua Luo  writes:
> I have a table with 70 columns, and 6 indexes. The data flow is a
> special OLTP model: frequent inserts (2000 tps), and each inserted row
> would be updated very soon (i.e. the number of inserts is equal to the
> number of updates).

Do those predictable updates change any of the indexed columns?

> I do a simple test: I truncate the table, disable the autovacuum, and
> run the application for a few minutes, then I invokes vacuum manually,
> it gives a strange output:
> found 598 removable, 25662 nonremovable row versions in 3476 pages
> DETAIL:  0 dead row versions cannot be removed yet
> As said before, the number of inserts is equal to the number of
> updates. So the bloat of the table should be 100%, and the number of
> removable rows should be equal to the number of nonremovable rows,
> which is the real number of inserts issued by the application.

What seems likely is that most of the updates are HOT (because they
don't change any indexed columns) and then the freed space is reclaimable
by subsequent updates on the same page without needing a VACUUM.

Watching the insert/update/hot-update counts in pg_stat_all_tables would
provide some evidence.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] insert/update performance

2016-01-23 Thread Jinhua Luo
And I also found the pg_stat_all_tables may be not so accurate.

testdb=# truncate test;
testdb=# select pg_stat_reset_single_table_counters(42515);
testdb=# select
n_tup_ins,n_tup_upd,n_tup_hot_upd,n_tup_del,n_live_tup,n_dead_tup from
pg_stat_all_tables where relid=42515;
 n_tup_ins | n_tup_upd | n_tup_hot_upd | n_tup_del | n_live_tup | n_dead_tup
---+---+---+---++
 0 | 0 | 0 | 0 |  0 |  0
(1 row)

# run application a while

testdb=# select
n_tup_ins,n_tup_upd,n_tup_hot_upd,n_tup_del,n_live_tup,n_dead_tup from
pg_stat_all_tables where relid=42515;
 n_tup_ins | n_tup_upd | n_tup_hot_upd | n_tup_del | n_live_tup | n_dead_tup
---+---+---+---++
 24829 | 24839 | 0 | 0 |  24829 |  24839
(1 row)

testdb=# select count(*) from test;
 count
---
 24780
(1 row)

testdb=# vacuum verbose test;
...
DETAIL:  24780 index row versions were removed.
...
INFO:  "test": found 863 removable, 24780 nonremovable row versions in
3148 out of 3148 pages


The n_tup_ins is bigger than actual rows, and the n_tup_upd is even
bigger than n_tup_ins!

Regards,
Jinhua Luo


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] insert/update performance

2016-01-23 Thread Jinhua Luo
2016-01-23 23:00 GMT+08:00 Tom Lane :
> Jinhua Luo  writes:
>> I have a table with 70 columns, and 6 indexes. The data flow is a
>> special OLTP model: frequent inserts (2000 tps), and each inserted row
>> would be updated very soon (i.e. the number of inserts is equal to the
>> number of updates).
>
> Do those predictable updates change any of the indexed columns?

The update statement itself do not modify the indexed columns, but the
before update trigger modifies two indexed columns: one is in
timestamp type, used to record the update time, the trigger would fill
CURRENT_TIMESTAMP into this column; the other is status, which would
be set to 'done'. These two columns are indexed in btree.

>
>> I do a simple test: I truncate the table, disable the autovacuum, and
>> run the application for a few minutes, then I invokes vacuum manually,
>> it gives a strange output:
>> found 598 removable, 25662 nonremovable row versions in 3476 pages
>> DETAIL:  0 dead row versions cannot be removed yet
>> As said before, the number of inserts is equal to the number of
>> updates. So the bloat of the table should be 100%, and the number of
>> removable rows should be equal to the number of nonremovable rows,
>> which is the real number of inserts issued by the application.
>
> What seems likely is that most of the updates are HOT (because they
> don't change any indexed columns) and then the freed space is reclaimable
> by subsequent updates on the same page without needing a VACUUM.
>
> Watching the insert/update/hot-update counts in pg_stat_all_tables would
> provide some evidence.

testdb=# truncate test;
TRUNCATE TABLE
testdb=# vacuum test;
testdb=# select pg_stat_reset_single_table_counters(42515);
 pg_stat_reset_single_table_counters
-

(1 row)

testdb=# select
n_tup_ins,n_tup_upd,n_tup_hot_upd,n_tup_del,n_live_tup,n_dead_tup from
pg_stat_all_tables where relid=42515;
 n_tup_ins | n_tup_upd | n_tup_hot_upd | n_tup_del | n_live_tup | n_dead_tup
---+---+---+---++
 0 | 0 | 0 | 0 |  0 |  0
(1 row)

#insert 6 rows

testdb=# select
n_tup_ins,n_tup_upd,n_tup_hot_upd,n_tup_del,n_live_tup,n_dead_tup from
pg_stat_all_tables where relid=42515;
 n_tup_ins | n_tup_upd | n_tup_hot_upd | n_tup_del | n_live_tup | n_dead_tup
---+---+---+---++
 6 | 6 | 0 | 0 |  6 |  6
(1 row)

testdb=# vacuum verbose test;
INFO:  vacuuming "public.test"
INFO:  scanned index "test_pkey" to remove 6 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "deliver_done_date_idx" to remove 6 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "deliver_task_queue_idx" to remove 6 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "message_id_idx" to remove 6 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "status_idx" to remove 6 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "status_report_done_date_idx" to remove 6 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "submit_done_date_idx" to remove 6 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "tp_scts_idx" to remove 6 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "test": removed 6 row versions in 1 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "test_pkey" now contains 6 row versions in 2 pages
DETAIL:  6 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "deliver_done_date_idx" now contains 6 row versions in 2 pages
DETAIL:  6 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "deliver_task_queue_idx" now contains 0 row versions in 2 pages
DETAIL:  6 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "message_id_idx" now contains 6 row versions in 2 pages
DETAIL:  6 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "status_idx" now contains 6 row versions in 2 pages
DETAIL:  6 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "status_report_done_date_idx" now contains 6 row versions
in 2 pages
DETAIL:  6 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "submit_done_date_idx" now contains 6 row 

[HACKERS] insert/update performance

2016-01-22 Thread Jinhua Luo
Hi All,

Here is my test environment:

postgresql 9.4.5, Ubuntu 14.04, 8 CPU core, 8GB ram, SCSI hard disk

I have a table with 70 columns, and 6 indexes. The data flow is a
special OLTP model: frequent inserts (2000 tps), and each inserted row
would be updated very soon (i.e. the number of inserts is equal to the
number of updates).

>From the start, the table (initially empty) would be extended bigger
and bigger (via mdextend(), which allocates a new page and zero out
the page to the disk). During this period, the query latency is ok
(Because most fsync of writes are queued to the next checkpoint, so
the writes only copy zero-filled data to the page cache of linux
kernel).

After a long time, the mdextend() disappears, instead, the mdread()
occupy almost all I/O requests, because the fsm indicates enough free
space within the old table segments. At this stage, the performance is
bad. Because most page cache of linux kernel is occupied by the latest
table segments and the index files (I check it via mincore() syscall).
The read from disk (in sync way, of course) slow down all subsequent
queries.

Why fsm is updated much slower than the query speed? If the fsm is
updated in time, then the free space would still possibly cached by
the linux kernel. I thought it's due to the autovacuum is not so
aggressive. So I try to reconfigure it, e.g. set cost_delay to 0,
lower down the vacuum threshold. But it doesn't help.

At last, I found it's not the problem of autovacuum.
I do a simple test: I truncate the table, disable the autovacuum, and
run the application for a few minutes, then I invokes vacuum manually,
it gives a strange output:
found 598 removable, 25662 nonremovable row versions in 3476 pages
DETAIL:  0 dead row versions cannot be removed yet

As said before, the number of inserts is equal to the number of
updates. So the bloat of the table should be 100%, and the number of
removable rows should be equal to the number of nonremovable rows,
which is the real number of inserts issued by the application.

But the result shows that the number of nonremovable rows is just a
small fraction. If it's true, then no wonder that the table would keep
extending for a long time, because the free space is almost small.

Why the free space after vacuum is not 50% of the original size in my case?

Please help. Thanks!


Regards,
Jinhua Luo


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] insert/update performance

2016-01-22 Thread Amit Kapila
On Sat, Jan 23, 2016 at 12:13 PM, Jinhua Luo  wrote:
>
> Hi All,
>
> Here is my test environment:
>
> postgresql 9.4.5, Ubuntu 14.04, 8 CPU core, 8GB ram, SCSI hard disk
>
> I have a table with 70 columns, and 6 indexes. The data flow is a
> special OLTP model: frequent inserts (2000 tps), and each inserted row
> would be updated very soon (i.e. the number of inserts is equal to the
> number of updates).
>
>
> At last, I found it's not the problem of autovacuum.
> I do a simple test: I truncate the table, disable the autovacuum, and
> run the application for a few minutes, then I invokes vacuum manually,
> it gives a strange output:
> found 598 removable, 25662 nonremovable row versions in 3476 pages
> DETAIL:  0 dead row versions cannot be removed yet
>
> As said before, the number of inserts is equal to the number of
> updates. So the bloat of the table should be 100%, and the number of
> removable rows should be equal to the number of nonremovable rows,
> which is the real number of inserts issued by the application.
>
> But the result shows that the number of nonremovable rows is just a
> small fraction. If it's true, then no wonder that the table would keep
> extending for a long time, because the free space is almost small.
>
> Why the free space after vacuum is not 50% of the original size in my
case?
>

Vacuum just removes the deleted rows (provided they are not visible to
any active transaction), it won't reduce the size which is already extended,
unless the empty space is at end of relation.

Are you updating any index column?

I think if you should once try with higher fill factor as you have lot
of updates.



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] insert/update performance

2016-01-22 Thread Jinhua Luo
Hi,

The vacuum doesn't recycle the rows obsoleted by update? I don't think
so. In the above vacuum result, I do not delete any rows, but the
vacuum still recycles a fraction of rows, obviously they're obsoleted
by update.

I know plain vacuum (without full option) do not reduce the size of
the whole table file/segments, but it should refresh the fsm. In my
case, the update statement did modify the index column, but is it
related to the fsm? I think anyways, the update would obsolete
previous versions, as long as they are not hold by any active
transactions, they would be recycled and count in the fsm, right?  I
just cannot understand why the recycle ratio is not 50%.


Regards,
Jinhua Luo

2016-01-23 15:13 GMT+08:00 Amit Kapila :
> On Sat, Jan 23, 2016 at 12:13 PM, Jinhua Luo  wrote:
>>
>> Hi All,
>>
>> Here is my test environment:
>>
>> postgresql 9.4.5, Ubuntu 14.04, 8 CPU core, 8GB ram, SCSI hard disk
>>
>> I have a table with 70 columns, and 6 indexes. The data flow is a
>> special OLTP model: frequent inserts (2000 tps), and each inserted row
>> would be updated very soon (i.e. the number of inserts is equal to the
>> number of updates).
>>
>>
>> At last, I found it's not the problem of autovacuum.
>> I do a simple test: I truncate the table, disable the autovacuum, and
>> run the application for a few minutes, then I invokes vacuum manually,
>> it gives a strange output:
>> found 598 removable, 25662 nonremovable row versions in 3476 pages
>> DETAIL:  0 dead row versions cannot be removed yet
>>
>> As said before, the number of inserts is equal to the number of
>> updates. So the bloat of the table should be 100%, and the number of
>> removable rows should be equal to the number of nonremovable rows,
>> which is the real number of inserts issued by the application.
>>
>> But the result shows that the number of nonremovable rows is just a
>> small fraction. If it's true, then no wonder that the table would keep
>> extending for a long time, because the free space is almost small.
>>
>> Why the free space after vacuum is not 50% of the original size in my
>> case?
>>
>
> Vacuum just removes the deleted rows (provided they are not visible to
> any active transaction), it won't reduce the size which is already extended,
> unless the empty space is at end of relation.
>
> Are you updating any index column?
>
> I think if you should once try with higher fill factor as you have lot
> of updates.
>
>
>
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers