Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-08-24 Thread Amit Kapila
On Wed, Aug 24, 2016 at 11:44 AM, Mark Kirkwood
 wrote:
> On 24/08/16 17:01, Mark Kirkwood wrote:
>>
>>
>> ...actually I was wrong there, only 2 of them were the same. So I've
>> attached a new log of bt's of them all.
>>
>>
>>
>
> And I can reproduce with only 1 session, figured that might be a helpful
> piece of the puzzle (trace attached).
>

Thanks.

I think I know the problem here.  Basically _hash_freeovflpage() is
trying to take a lock on a buffer previous to overflow page to update
the links and it is quite possible that the same buffer is already
locked for moving the tuples while squeezing the bucket.  I am working
on a fix for the same.

Coincidently, Ashutosh Sharma a colleague of mine who was also testing
this patch found the same issue by an attached sql script.  So we
might be able to inculcate a test case  in the regression suite as
well after fix.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


test_hash.sql
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-08-24 Thread Mark Kirkwood

On 24/08/16 17:01, Mark Kirkwood wrote:


...actually I was wrong there, only 2 of them were the same. So I've 
attached a new log of bt's of them all.






And I can reproduce with only 1 session, figured that might be a helpful 
piece of the puzzle (trace attached).



$ pgbench -c 1 -T600 bench

bench=# SELECT pid,state,now()-xact_start AS 
wait,wait_event_type,wait_event,query FROM pg_stat_activity WHERE 
datname='bench' ORDER BY wait DESC;
  pid  | state  |  wait   | wait_event_type |   wait_event   |  
 
query   
 

---++-+-++---
-

 17266 | active | 00:45:23.027555 | LWLockTranche   | buffer_content | INSERT 
INTO pgbenc
h_history (tid, bid, aid, delta, mtime) VALUES (352, 76, 1305123, -1604, 
CURRENT_TIMESTAM
P);
(1 row)

$ gdb -p 17266 
(gdb) bt
#0  0x7f7b4d533387 in semop () at ../sysdeps/unix/syscall-template.S:84
#1  0x00660331 in PGSemaphoreLock (sema=sema@entry=0x7f7b4cad1690)
at pg_sema.c:387
#2  0x006bde9b in LWLockAcquire (lock=0x7f7b44583fe4, mode=LW_EXCLUSIVE)
at lwlock.c:1288
#3  0x0049a625 in _hash_getbuf_with_strategy 
(rel=rel@entry=0x7f7b4e0c1908, 
blkno=blkno@entry=12, access=access@entry=2, flags=flags@entry=1, 
bstrategy=bstrategy@entry=0x0) at hashpage.c:252
#4  0x00498040 in _hash_freeovflpage (rel=rel@entry=0x7f7b4e0c1908, 
bucketbuf=bucketbuf@entry=12796, ovflbuf=ovflbuf@entry=236, 
wbuf=wbuf@entry=13166, 
itups=itups@entry=0x7ffc693fc800, 
itup_offsets=itup_offsets@entry=0x13067d0, 
tups_size=0x7ffc693fd4c0, nitups=34, bstrategy=0x0) at hashovfl.c:517
#5  0x00499985 in _hash_squeezebucket (rel=rel@entry=0x7f7b4e0c1908, 
bucket=bucket@entry=7, bucket_blkno=bucket_blkno@entry=10, 
bucket_buf=bucket_buf@entry=12796, bstrategy=bstrategy@entry=0x0) at 
hashovfl.c:1010
#6  0x00496caf in hashbucketcleanup (rel=rel@entry=0x7f7b4e0c1908, 
bucket_buf=bucket_buf@entry=12796, bucket_blkno=bucket_blkno@entry=10, 
bstrategy=bstrategy@entry=0x0, maxbucket=22, highmask=31, lowmask=15, 
tuples_removed=0x0, num_index_tuples=0x0, bucket_has_garbage=1 '\001', 
delay=0 '\000', callback=0x0, callback_state=0x0) at hash.c:937
#7  0x0049b353 in _hash_expandtable (rel=rel@entry=0x7f7b4e0c1908, 
metabuf=metabuf@entry=6179) at hashpage.c:785
#8  0x00497bae in _hash_doinsert (rel=rel@entry=0x7f7b4e0c1908, 
itup=itup@entry=0x13060b8) at hashinsert.c:313
#9  0x0049617f in hashinsert (rel=0x7f7b4e0c1908, values=, 
isnull=, ht_ctid=0x1305f7c, heapRel=, 
checkUnique=) at hash.c:247
#10 0x005bf060 in ExecInsertIndexTuples (slot=slot@entry=0x1304a70, 
tupleid=tupleid@entry=0x1305f7c, estate=estate@entry=0x13043b0, 
noDupErr=noDupErr@entry=0 '\000', specConflict=specConflict@entry=0x0, 
arbiterIndexes=arbiterIndexes@entry=0x0) at execIndexing.c:388
#11 0x005dd011 in ExecInsert (canSetTag=1 '\001', estate=0x13043b0, 
onconflict=, arbiterIndexes=0x0, planSlot=0x1304a70, 
slot=0x1304a70, 
mtstate=0x1304600) at nodeModifyTable.c:481
#12 ExecModifyTable (node=node@entry=0x1304600) at nodeModifyTable.c:1496
#13 0x005c3948 in ExecProcNode (node=node@entry=0x1304600) at 
execProcnode.c:396
#14 0x005bfdbf in ExecutePlan (dest=0x12f4bd8, direction=, 
numberTuples=0, sendTuples=, operation=CMD_INSERT, 
use_parallel_mode=, planstate=0x1304600, estate=0x13043b0)
at execMain.c:1567
#15 standard_ExecutorRun (queryDesc=0x1303fa0, direction=, 
count=0)
at execMain.c:338
#16 0x006ce669 in ProcessQuery (plan=, 
sourceText=0x12ccf50 "INSERT INTO pgbench_history (tid, bid, aid, delta, 
mtime) VALUES (352, 76, 1305123, -1604, CURRENT_TIMESTAMP);", params=0x0, 
dest=0x12f4bd8, 
completionTag=0x7ffc69400c30 "") at pquery.c:187
#17 0x006ce89c in PortalRunMulti (portal=portal@entry=0x1271fe0, 
isTopLevel=isTopLevel@entry=1 '\001', 
setHoldSnapshot=setHoldSnapshot@entry=0 '\000', dest=dest@entry=0x12f4bd8, 
---Type  to continue, or q  to quit---
altdest=altdest@entry=0x12f4bd8, 
completionTag=completionTag@entry=0x7ffc69400c30 "") at pquery.c:1303
#18 0x006cf34e in PortalRun (portal=portal@entry=0x1271fe0, 
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 
'\001', 
dest=dest@entry=0x12f4bd8, altdest=altdest@entry=0x12f4bd8, 
completionTag=completionTag@entry=0x7ffc69400c30 "") at pquery.c:815
#19 0x006cc45b in exec_simple_query (
query_string=0x12ccf50 "INSERT INTO pgbench_history (tid, bid, aid, delta, 
mtime) VALUES (352, 76, 1305123, -1604, CURRENT_TIMESTAMP);") at postgres.c:1094
#20 PostgresMain (argc=, 

Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-08-23 Thread Mark Kirkwood

On 24/08/16 16:52, Mark Kirkwood wrote:

On 24/08/16 16:33, Amit Kapila wrote:

On Wed, Aug 24, 2016 at 9:53 AM, Mark Kirkwood
 wrote:

On 24/08/16 15:36, Amit Kapila wrote:

On Wed, Aug 24, 2016 at 8:54 AM, Mark Kirkwood
 wrote:
Can you get the call stacks?


For every stuck backend? (just double checking)...

Yeah.



Cool,

I managed to reproduce with a reduced workload of 4 backends, then 
noticed that the traces for 3 of 'em were all the same. So I've 
attached the 2 unique ones, plus noted what pg_stat_activity thought 
the wait event was, in case that is useful.


...actually I was wrong there, only 2 of them were the same. So I've 
attached a new log of bt's of them all.



$ pgbench -c 4 -T600 bench

postgres=# SELECT pid, state,now()-xact_start AS 
wait,wait_event_type,wait_event,query FROM pg_stat_activity WHERE 
datname='bench' ORDER BY wait DESC;  pid  | state  |  wait   | 
wait_event_type |   wait_event   |  
   query   
   
---++-+-++-
---
 16549 | active | 00:06:00.252146 | LWLockTranche   | buffer_content | INSERT 
INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (190, 24, 9513890, 
-2566, C
URRENT_TIMESTAMP);
 16548 | active | 00:06:00.235003 | LWLockTranche   | buffer_content | INSERT 
INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (445, 3, 5688076, 
1021, CUR
RENT_TIMESTAMP);
 16547 | active | 00:06:00.218032 | LWLockTranche   | buffer_content | INSERT 
INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (803, 97, 6871273, 
-35, CUR
RENT_TIMESTAMP);
 16546 | active | 00:06:00.192543 | Lock| transactionid  | UPDATE 
pgbench_branches SET bbalance = bbalance + -2823 WHERE bid = 3;
(4 rows)

$ gdb 16546
(gdb) bt
#0  0x7f51daa3a133 in __epoll_wait_nocancel () at 
../sysdeps/unix/syscall-template.S:84
#1  0x006abdb6 in WaitEventSetWaitBlock (nevents=1, 
occurred_events=0x7fffd197d3a0, cur_timeout=-1, set=0xdc9288)
at latch.c:987
#2  WaitEventSetWait (set=set@entry=0xdc9288, timeout=timeout@entry=-1, 
occurred_events=occurred_events@entry=0x7fffd197d3a0, 
nevents=nevents@entry=1) at latch.c:941
#3  0x006ac1ca in WaitLatchOrSocket (latch=0x7f51d9fd969c, 
wakeEvents=wakeEvents@entry=1, sock=sock@entry=-1, 
timeout=-1, timeout@entry=0) at latch.c:347
#4  0x006ac27d in WaitLatch (latch=, 
wakeEvents=wakeEvents@entry=1, timeout=timeout@entry=0)
at latch.c:302
#5  0x006bb5e3 in ProcSleep (locallock=locallock@entry=0xd3d690, 
lockMethodTable=lockMethodTable@entry=0x9149a0 ) at 
proc.c:1219
#6  0x006b639d in WaitOnLock (locallock=locallock@entry=0xd3d690, 
owner=owner@entry=0xdc4cb0) at lock.c:1703
#7  0x006b7a77 in LockAcquireExtended 
(locktag=locktag@entry=0x7fffd197d700, lockmode=lockmode@entry=5, 
sessionLock=sessionLock@entry=0 '\000', dontWait=dontWait@entry=0 '\000', 
reportMemoryError=reportMemoryError@entry=1 '\001')
at lock.c:998
#8  0x006b7d51 in LockAcquire (locktag=locktag@entry=0x7fffd197d700, 
lockmode=lockmode@entry=5, 
sessionLock=sessionLock@entry=0 '\000', dontWait=dontWait@entry=0 '\000') 
at lock.c:688
#9  0x006b58fe in XactLockTableWait (xid=xid@entry=7667, 
rel=rel@entry=0x7f51db5c6398, ctid=ctid@entry=0x7fffd197d844, 
oper=oper@entry=XLTW_Update) at lmgr.c:587
#10 0x004a5d29 in heap_update (relation=relation@entry=0x7f51db5c6398, 
otid=otid@entry=0x7fffd197d9e0, 
newtup=newtup@entry=0xd4b3d0, cid=2, crosscheck=0x0, wait=wait@entry=1 
'\001', hufd=0x7fffd197d8f0, lockmode=0x7fffd197d8ec)
at heapam.c:3755
#11 0x005dbeaa in ExecUpdate (tupleid=tupleid@entry=0x7fffd197d9e0, 
oldtuple=oldtuple@entry=0x0, 
slot=slot@entry=0xd4adb0, planSlot=planSlot@entry=0xdc8ea0, 
epqstate=epqstate@entry=0xdc76c8, estate=estate@entry=0xdc73d0, 
canSetTag=1 '\001') at nodeModifyTable.c:922
#12 0x005dc633 in ExecModifyTable (node=node@entry=0xdc7620) at 
nodeModifyTable.c:1501
#13 0x005c3948 in ExecProcNode (node=node@entry=0xdc7620) at 
execProcnode.c:396
#14 0x005bfdbf in ExecutePlan (dest=0xdc9e60, direction=, numberTuples=0, sendTuples=, 
operation=CMD_UPDATE, use_parallel_mode=, 
planstate=0xdc7620, estate=0xdc73d0) at execMain.c:1567
#15 standard_ExecutorRun (queryDesc=0xdc6fc0, direction=, 
count=0) at execMain.c:338
#16 0x006ce669 in ProcessQuery (plan=, 
sourceText=0xd8ff70 "UPDATE pgbench_branches SET bbalance = bbalance + 
-2823 WHERE bid = 3;", params=0x0, dest=0xdc9e60, 
completionTag=0x7fffd197dfc0 "") at pquery.c:187
#17 0x006ce89c in PortalRunMulti (portal=portal@entry=0xd35000, 
isTopLevel=isTopLevel@entry=1 '\001', 

Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-08-23 Thread Amit Kapila
On Wed, Aug 24, 2016 at 2:37 AM, Jeff Janes  wrote:
> Hi Amit,
>
> Thanks for working on this.
>
> When building with --enable-cassert, I get compiler warning:
>
> hash.c: In function 'hashbucketcleanup':
> hash.c:722: warning: 'new_bucket' may be used uninitialized in this function
>

This warning is from concurrent index patch. I will fix it and post
the patch on that thread.

>
> After an intentionally created crash, I get an Assert triggering:
>
> TRAP: FailedAssertion("!(((freep)[(bitmapbit)/32] &
> (1<<((bitmapbit)%32", File: "hashovfl.c", Line: 553)
>
> freep[0] is zero and bitmapbit is 16.
>

Here what is happening is that when it tries to clear the bitmapbit,
it expects it to be set.  Now, I think the reason for why it didn't
find the bit as set could be that after the new overflow page is added
and the bit corresponding to it is set, you might have crashed the
system and the replay would not have set the bit.  Then while freeing
the overflow page it can hit the Assert as mentioned by you.  I think
the problem here could be that I am using REGBUF_STANDARD to log the
bitmap page updates which seems to be causing the issue.  As bitmap
page doesn't follow the standard page layout, it would have omitted
the actual contents while taking full page image and then during
replay, it would not have set the bit, because page doesn't need REDO.
I think here the fix is to use REGBUF_NO_IMAGE as we use for vm
buffers.

If you can send me the detailed steps for how you have produced the
problem, then I can verify after fixing whether you are seeing the
same problem or something else.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-08-23 Thread Mark Kirkwood

On 24/08/16 16:33, Amit Kapila wrote:

On Wed, Aug 24, 2016 at 9:53 AM, Mark Kirkwood
 wrote:

On 24/08/16 15:36, Amit Kapila wrote:

On Wed, Aug 24, 2016 at 8:54 AM, Mark Kirkwood
 wrote:
Can you get the call stacks?


For every stuck backend? (just double checking)...

Yeah.



Cool,

I managed to reproduce with a reduced workload of 4 backends, then 
noticed that the traces for 3 of 'em were all the same. So I've attached 
the 2 unique ones, plus noted what pg_stat_activity thought the wait 
event was, in case that is useful.


Cheers

Mark

$ pgbench -c 4 -T600 bench

postgres=# SELECT pid, state,now()-xact_start AS 
wait,wait_event_type,wait_event,query FROM pg_stat_activity WHERE 
datname='bench' ORDER BY wait DESC;  pid  | state  |  wait   | 
wait_event_type |   wait_event   |  
   query   
   
---++-+-++-
---
 16549 | active | 00:06:00.252146 | LWLockTranche   | buffer_content | INSERT 
INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (190, 24, 9513890, 
-2566, C
URRENT_TIMESTAMP);
 16548 | active | 00:06:00.235003 | LWLockTranche   | buffer_content | INSERT 
INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (445, 3, 5688076, 
1021, CUR
RENT_TIMESTAMP);
 16547 | active | 00:06:00.218032 | LWLockTranche   | buffer_content | INSERT 
INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (803, 97, 6871273, 
-35, CUR
RENT_TIMESTAMP);
 16546 | active | 00:06:00.192543 | Lock| transactionid  | UPDATE 
pgbench_branches SET bbalance = bbalance + -2823 WHERE bid = 3;
(4 rows)


postgres@zmori:~$ gdb -p 16546
0x7f51daa3a133 in __epoll_wait_nocancel ()
at ../sysdeps/unix/syscall-template.S:84
84  ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0  0x7f51daa3a133 in __epoll_wait_nocancel ()
at ../sysdeps/unix/syscall-template.S:84
#1  0x006abdb6 in WaitEventSetWaitBlock (nevents=1, 
occurred_events=0x7fffd197d3a0, cur_timeout=-1, set=0xdc9288)
at latch.c:987
#2  WaitEventSetWait (set=set@entry=0xdc9288, timeout=timeout@entry=-1, 
occurred_events=occurred_events@entry=0x7fffd197d3a0, 
nevents=nevents@entry=1) at latch.c:941
#3  0x006ac1ca in WaitLatchOrSocket (latch=0x7f51d9fd969c, 
wakeEvents=wakeEvents@entry=1, sock=sock@entry=-1, timeout=-1, 
timeout@entry=0) at latch.c:347
#4  0x006ac27d in WaitLatch (latch=, 
wakeEvents=wakeEvents@entry=1, timeout=timeout@entry=0) at latch.c:302
#5  0x006bb5e3 in ProcSleep (locallock=locallock@entry=0xd3d690, 
lockMethodTable=lockMethodTable@entry=0x9149a0 )
at proc.c:1219
#6  0x006b639d in WaitOnLock (locallock=locallock@entry=0xd3d690, 
owner=owner@entry=0xdc4cb0) at lock.c:1703
#7  0x006b7a77 in LockAcquireExtended (
locktag=locktag@entry=0x7fffd197d700, lockmode=lockmode@entry=5, 
sessionLock=sessionLock@entry=0 '\000', dontWait=dontWait@entry=0 '\000', 
reportMemoryError=reportMemoryError@entry=1 '\001') at lock.c:998
#8  0x006b7d51 in LockAcquire (locktag=locktag@entry=0x7fffd197d700, 
lockmode=lockmode@entry=5, sessionLock=sessionLock@entry=0 '\000', 
dontWait=dontWait@entry=0 '\000') at lock.c:688
#9  0x006b58fe in XactLockTableWait (xid=xid@entry=7667, 
rel=rel@entry=0x7f51db5c6398, ctid=ctid@entry=0x7fffd197d844, 
oper=oper@entry=XLTW_Update) at lmgr.c:587
#10 0x004a5d29 in heap_update (relation=relation@entry=0x7f51db5c6398, 
otid=otid@entry=0x7fffd197d9e0, newtup=newtup@entry=0xd4b3d0, cid=2, 
crosscheck=0x0, wait=wait@entry=1 '\001', hufd=0x7fffd197d8f0, 
lockmode=0x7fffd197d8ec) at heapam.c:3755
#11 0x005dbeaa in ExecUpdate (tupleid=tupleid@entry=0x7fffd197d9e0, 
oldtuple=oldtuple@entry=0x0, slot=slot@entry=0xd4adb0, 
planSlot=planSlot@entry=0xdc8ea0, epqstate=epqstate@entry=0xdc76c8, 
estate=estate@entry=0xdc73d0, canSetTag=1 '\001') at nodeModifyTable.c:922
#12 0x005dc633 in ExecModifyTable (node=node@entry=0xdc7620)
at nodeModifyTable.c:1501
#13 0x005c3948 in ExecProcNode (node=node@entry=0xdc7620)
at execProcnode.c:396
#14 0x005bfdbf in ExecutePlan (dest=0xdc9e60, 
direction=, numberTuples=0, sendTuples=, 
operation=CMD_UPDATE, use_parallel_mode=, 
planstate=0xdc7620, estate=0xdc73d0) at execMain.c:1567
#15 standard_ExecutorRun (queryDesc=0xdc6fc0, direction=, 
count=0) at execMain.c:338
#16 0x006ce669 in ProcessQuery (plan=, 
sourceText=0xd8ff70 "UPDATE pgbench_branches SET bbalance = bbalance + 
-2823 WHERE bid = 3;", params=0x0, dest=0xdc9e60, completionTag=0x7fffd197dfc0 
"")
---Type  to continue, or q  to quit--- 

Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-08-23 Thread Amit Kapila
On Wed, Aug 24, 2016 at 9:53 AM, Mark Kirkwood
 wrote:
> On 24/08/16 15:36, Amit Kapila wrote:
>>
>> On Wed, Aug 24, 2016 at 8:54 AM, Mark Kirkwood
>>  wrote:
>>>
>>
>> Can you get the call stacks?
>>
>
> For every stuck backend? (just double checking)...

Yeah.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-08-23 Thread Mark Kirkwood

On 24/08/16 15:36, Amit Kapila wrote:

On Wed, Aug 24, 2016 at 8:54 AM, Mark Kirkwood
 wrote:

On 24/08/16 12:09, Mark Kirkwood wrote:

On 23/08/16 15:24, Amit Kapila wrote:


Thoughts?

Note - To use this patch, first apply latest version of concurrent
hash index patch [2].

[1] - https://commitfest.postgresql.org/10/647/
[2] -
https://www.postgresql.org/message-id/caa4ek1lkq_udism-z2dq6cuvjh3db5fnfnnezzbpsrjw0ha...@mail.gmail.com



Firstly - really nice! Patches applied easily etc to latest version 10
checkout.

I thought I'd test by initializing pgbench schema, adding a standby, then
adding some hash indexes and running pgbench:

Looking on the standby:

bench=# \d pgbench_accounts
Table "public.pgbench_accounts"
   Column  | Type  | Modifiers
--+---+---
  aid  | integer   | not null
  bid  | integer   |
  abalance | integer   |
  filler   | character(84) |
Indexes:
 "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
 "pgbench_accounts_bid" hash (bid)<

bench=# \d pgbench_history
   Table "public.pgbench_history"
  Column |Type | Modifiers
+-+---
  tid| integer |
  bid| integer |
  aid| integer |
  delta  | integer |
  mtime  | timestamp without time zone |
  filler | character(22)   |
Indexes:
 "pgbench_history_bid" hash (bid) <=


they have been replicated there ok.

However I'm seeing a hang on the master after a while:

bench=# SELECT datname,application_name,state,now()-xact_start AS
wait,query FROM pg_stat_activity ORDER BY wait DESC;
  datname | application_name | state  |  wait | query

-+--++-+
  | walreceiver  | idle   | |
  bench   | pgbench  | active | 00:31:38.367467 | INSERT INTO
pgbench_history (tid, bid, aid, delta, mtime) VALUES (921, 38, 251973,
-3868, CURRENT_TIMESTAMP);
  bench   | pgbench  | active | 00:31:38.215378 | INSERT INTO
pgbench_history (tid, bid, aid, delta, mtime) VALUES (280, 95, 3954814,
2091, CURRENT_TIMESTAMP);
  bench   | pgbench  | active | 00:31:35.991056 | INSERT INTO
pgbench_history (tid, bid, aid, delta, mtime) VALUES (447, 33, 8355237,
3438, CURRENT_TIMESTAMP);
  bench   | pgbench  | active | 00:31:35.619798 | INSERT INTO
pgbench_history (tid, bid, aid, delta, mtime) VALUES (134, 16, 4839994,
-2443, CURRENT_TIMESTAMP);
  bench   | pgbench  | active | 00:31:35.544196 | INSERT INTO
pgbench_history (tid, bid, aid, delta, mtime) VALUES (37, 73, 9620119, 4053,
CURRENT_TIMESTAMP);
  bench   | pgbench  | active | 00:31:35.334504 | UPDATE
pgbench_branches SET bbalance = bbalance + -2954 WHERE bid = 33;
  bench   | pgbench  | active | 00:31:35.234112 | UPDATE
pgbench_branches SET bbalance = bbalance + -713 WHERE bid = 38;
  bench   | pgbench  | active | 00:31:34.434676 | UPDATE
pgbench_branches SET bbalance = bbalance + -921 WHERE bid = 33;
  bench   | psql | active | 00:00:00| SELECT
datname,application_name,state,now()-xact_start AS wait,query FROM
pg_stat_activity ORDER BY wait DESC;
(10 rows)

but no errors in the logs, any thoughts?

Can you get the call stacks?



For every stuck backend? (just double checking)...


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-08-23 Thread Amit Kapila
On Wed, Aug 24, 2016 at 8:54 AM, Mark Kirkwood
 wrote:
> On 24/08/16 12:09, Mark Kirkwood wrote:
>>
>> On 23/08/16 15:24, Amit Kapila wrote:
>>>
>>>
>>> Thoughts?
>>>
>>> Note - To use this patch, first apply latest version of concurrent
>>> hash index patch [2].
>>>
>>> [1] - https://commitfest.postgresql.org/10/647/
>>> [2] -
>>> https://www.postgresql.org/message-id/caa4ek1lkq_udism-z2dq6cuvjh3db5fnfnnezzbpsrjw0ha...@mail.gmail.com
>>>
>>>
>>
>> Firstly - really nice! Patches applied easily etc to latest version 10
>> checkout.
>>
>> I thought I'd test by initializing pgbench schema, adding a standby, then
>> adding some hash indexes and running pgbench:
>>
>> Looking on the standby:
>>
>> bench=# \d pgbench_accounts
>>Table "public.pgbench_accounts"
>>   Column  | Type  | Modifiers
>> --+---+---
>>  aid  | integer   | not null
>>  bid  | integer   |
>>  abalance | integer   |
>>  filler   | character(84) |
>> Indexes:
>> "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
>> "pgbench_accounts_bid" hash (bid)<
>>
>> bench=# \d pgbench_history
>>   Table "public.pgbench_history"
>>  Column |Type | Modifiers
>> +-+---
>>  tid| integer |
>>  bid| integer |
>>  aid| integer |
>>  delta  | integer |
>>  mtime  | timestamp without time zone |
>>  filler | character(22)   |
>> Indexes:
>> "pgbench_history_bid" hash (bid) <=
>>
>>
>> they have been replicated there ok.
>>
>> However I'm seeing a hang on the master after a while:
>>
>> bench=# SELECT datname,application_name,state,now()-xact_start AS
>> wait,query FROM pg_stat_activity ORDER BY wait DESC;
>>  datname | application_name | state  |  wait | query
>>
>> -+--++-+
>>  | walreceiver  | idle   | |
>>  bench   | pgbench  | active | 00:31:38.367467 | INSERT INTO
>> pgbench_history (tid, bid, aid, delta, mtime) VALUES (921, 38, 251973,
>> -3868, CURRENT_TIMESTAMP);
>>  bench   | pgbench  | active | 00:31:38.215378 | INSERT INTO
>> pgbench_history (tid, bid, aid, delta, mtime) VALUES (280, 95, 3954814,
>> 2091, CURRENT_TIMESTAMP);
>>  bench   | pgbench  | active | 00:31:35.991056 | INSERT INTO
>> pgbench_history (tid, bid, aid, delta, mtime) VALUES (447, 33, 8355237,
>> 3438, CURRENT_TIMESTAMP);
>>  bench   | pgbench  | active | 00:31:35.619798 | INSERT INTO
>> pgbench_history (tid, bid, aid, delta, mtime) VALUES (134, 16, 4839994,
>> -2443, CURRENT_TIMESTAMP);
>>  bench   | pgbench  | active | 00:31:35.544196 | INSERT INTO
>> pgbench_history (tid, bid, aid, delta, mtime) VALUES (37, 73, 9620119, 4053,
>> CURRENT_TIMESTAMP);
>>  bench   | pgbench  | active | 00:31:35.334504 | UPDATE
>> pgbench_branches SET bbalance = bbalance + -2954 WHERE bid = 33;
>>  bench   | pgbench  | active | 00:31:35.234112 | UPDATE
>> pgbench_branches SET bbalance = bbalance + -713 WHERE bid = 38;
>>  bench   | pgbench  | active | 00:31:34.434676 | UPDATE
>> pgbench_branches SET bbalance = bbalance + -921 WHERE bid = 33;
>>  bench   | psql | active | 00:00:00| SELECT
>> datname,application_name,state,now()-xact_start AS wait,query FROM
>> pg_stat_activity ORDER BY wait DESC;
>> (10 rows)
>>
>> but no errors in the logs, any thoughts?
>

Can you get the call stacks?

>
>
> FWIW, retesting with the wal logging patch removed (i.e leaving the
> concurrent hast one) works fine.
>

Okay, information noted.


Thanks for testing and showing interest in the patch.



-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-08-23 Thread Mark Kirkwood

On 24/08/16 12:09, Mark Kirkwood wrote:

On 23/08/16 15:24, Amit Kapila wrote:


Thoughts?

Note - To use this patch, first apply latest version of concurrent
hash index patch [2].

[1] - https://commitfest.postgresql.org/10/647/
[2] - 
https://www.postgresql.org/message-id/caa4ek1lkq_udism-z2dq6cuvjh3db5fnfnnezzbpsrjw0ha...@mail.gmail.com





Firstly - really nice! Patches applied easily etc to latest version 10 
checkout.


I thought I'd test by initializing pgbench schema, adding a standby, 
then adding some hash indexes and running pgbench:


Looking on the standby:

bench=# \d pgbench_accounts
   Table "public.pgbench_accounts"
  Column  | Type  | Modifiers
--+---+---
 aid  | integer   | not null
 bid  | integer   |
 abalance | integer   |
 filler   | character(84) |
Indexes:
"pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
"pgbench_accounts_bid" hash (bid)<

bench=# \d pgbench_history
  Table "public.pgbench_history"
 Column |Type | Modifiers
+-+---
 tid| integer |
 bid| integer |
 aid| integer |
 delta  | integer |
 mtime  | timestamp without time zone |
 filler | character(22)   |
Indexes:
"pgbench_history_bid" hash (bid) <=


they have been replicated there ok.

However I'm seeing a hang on the master after a while:

bench=# SELECT datname,application_name,state,now()-xact_start AS 
wait,query FROM pg_stat_activity ORDER BY wait DESC;

 datname | application_name | state  |  wait | query
-+--++-+ 


 | walreceiver  | idle   | |
 bench   | pgbench  | active | 00:31:38.367467 | INSERT INTO 
pgbench_history (tid, bid, aid, delta, mtime) VALUES (921, 38, 251973, 
-3868, CURRENT_TIMESTAMP);
 bench   | pgbench  | active | 00:31:38.215378 | INSERT INTO 
pgbench_history (tid, bid, aid, delta, mtime) VALUES (280, 95, 
3954814, 2091, CURRENT_TIMESTAMP);
 bench   | pgbench  | active | 00:31:35.991056 | INSERT INTO 
pgbench_history (tid, bid, aid, delta, mtime) VALUES (447, 33, 
8355237, 3438, CURRENT_TIMESTAMP);
 bench   | pgbench  | active | 00:31:35.619798 | INSERT INTO 
pgbench_history (tid, bid, aid, delta, mtime) VALUES (134, 16, 
4839994, -2443, CURRENT_TIMESTAMP);
 bench   | pgbench  | active | 00:31:35.544196 | INSERT INTO 
pgbench_history (tid, bid, aid, delta, mtime) VALUES (37, 73, 9620119, 
4053, CURRENT_TIMESTAMP);
 bench   | pgbench  | active | 00:31:35.334504 | UPDATE 
pgbench_branches SET bbalance = bbalance + -2954 WHERE bid = 33;
 bench   | pgbench  | active | 00:31:35.234112 | UPDATE 
pgbench_branches SET bbalance = bbalance + -713 WHERE bid = 38;
 bench   | pgbench  | active | 00:31:34.434676 | UPDATE 
pgbench_branches SET bbalance = bbalance + -921 WHERE bid = 33;
 bench   | psql | active | 00:00:00| SELECT 
datname,application_name,state,now()-xact_start AS wait,query FROM 
pg_stat_activity ORDER BY wait DESC;

(10 rows)

but no errors in the logs, any thoughts?



FWIW, retesting with the wal logging patch removed (i.e leaving the 
concurrent hast one) works fine.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-08-23 Thread Mark Kirkwood

On 23/08/16 15:24, Amit Kapila wrote:


Thoughts?

Note - To use this patch, first apply latest version of concurrent
hash index patch [2].

[1] - https://commitfest.postgresql.org/10/647/
[2] - 
https://www.postgresql.org/message-id/caa4ek1lkq_udism-z2dq6cuvjh3db5fnfnnezzbpsrjw0ha...@mail.gmail.com




Firstly - really nice! Patches applied easily etc to latest version 10 
checkout.


I thought I'd test by initializing pgbench schema, adding a standby, 
then adding some hash indexes and running pgbench:


Looking on the standby:

bench=# \d pgbench_accounts
   Table "public.pgbench_accounts"
  Column  | Type  | Modifiers
--+---+---
 aid  | integer   | not null
 bid  | integer   |
 abalance | integer   |
 filler   | character(84) |
Indexes:
"pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
"pgbench_accounts_bid" hash (bid)<

bench=# \d pgbench_history
  Table "public.pgbench_history"
 Column |Type | Modifiers
+-+---
 tid| integer |
 bid| integer |
 aid| integer |
 delta  | integer |
 mtime  | timestamp without time zone |
 filler | character(22)   |
Indexes:
"pgbench_history_bid" hash (bid) <=


they have been replicated there ok.

However I'm seeing a hang on the master after a while:

bench=# SELECT datname,application_name,state,now()-xact_start AS 
wait,query FROM pg_stat_activity ORDER BY wait DESC;

 datname | application_name | state  |  wait | query
-+--++-+
 | walreceiver  | idle   | |
 bench   | pgbench  | active | 00:31:38.367467 | INSERT INTO 
pgbench_history (tid, bid, aid, delta, mtime) VALUES (921, 38, 251973, 
-3868, CURRENT_TIMESTAMP);
 bench   | pgbench  | active | 00:31:38.215378 | INSERT INTO 
pgbench_history (tid, bid, aid, delta, mtime) VALUES (280, 95, 3954814, 
2091, CURRENT_TIMESTAMP);
 bench   | pgbench  | active | 00:31:35.991056 | INSERT INTO 
pgbench_history (tid, bid, aid, delta, mtime) VALUES (447, 33, 8355237, 
3438, CURRENT_TIMESTAMP);
 bench   | pgbench  | active | 00:31:35.619798 | INSERT INTO 
pgbench_history (tid, bid, aid, delta, mtime) VALUES (134, 16, 4839994, 
-2443, CURRENT_TIMESTAMP);
 bench   | pgbench  | active | 00:31:35.544196 | INSERT INTO 
pgbench_history (tid, bid, aid, delta, mtime) VALUES (37, 73, 9620119, 
4053, CURRENT_TIMESTAMP);
 bench   | pgbench  | active | 00:31:35.334504 | UPDATE 
pgbench_branches SET bbalance = bbalance + -2954 WHERE bid = 33;
 bench   | pgbench  | active | 00:31:35.234112 | UPDATE 
pgbench_branches SET bbalance = bbalance + -713 WHERE bid = 38;
 bench   | pgbench  | active | 00:31:34.434676 | UPDATE 
pgbench_branches SET bbalance = bbalance + -921 WHERE bid = 33;
 bench   | psql | active | 00:00:00| SELECT 
datname,application_name,state,now()-xact_start AS wait,query FROM 
pg_stat_activity ORDER BY wait DESC;

(10 rows)

but no errors in the logs, any thoughts?

Cheers

Mark







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-08-23 Thread Jeff Janes
Hi Amit,

Thanks for working on this.

When building with --enable-cassert, I get compiler warning:

hash.c: In function 'hashbucketcleanup':
hash.c:722: warning: 'new_bucket' may be used uninitialized in this function


After an intentionally created crash, I get an Assert triggering:

TRAP: FailedAssertion("!(((freep)[(bitmapbit)/32] &
(1<<((bitmapbit)%32", File: "hashovfl.c", Line: 553)

freep[0] is zero and bitmapbit is 16.

With this backtrace:

(gdb) bt
#0  0x003838c325e5 in raise (sig=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x003838c33dc5 in abort () at abort.c:92
#2  0x0081a8fd in ExceptionalCondition (conditionName=, errorType=, fileName=,
lineNumber=) at assert.c:54
#3  0x004a4199 in _hash_freeovflpage (rel=0x7f3f745d86b8,
bucketbuf=198, ovflbuf=199, wbuf=198, itups=0x7ffc258fa090,
itup_offsets=0x126e8a8,
tups_size=0x7ffc258f93d0, nitups=70, bstrategy=0x12ba320) at
hashovfl.c:553
#4  0x004a4c32 in _hash_squeezebucket (rel=,
bucket=38, bucket_blkno=56, bucket_buf=198, bstrategy=0x12ba320)
at hashovfl.c:1010
#5  0x004a042a in hashbucketcleanup (rel=0x7f3f745d86b8,
bucket_buf=198, bucket_blkno=56, bstrategy=0x12ba320, maxbucket=96,
highmask=127,
lowmask=63, tuples_removed=0x7ffc258fc1c8,
num_index_tuples=0x7ffc258fc1c0, bucket_has_garbage=0 '\000', delay=1
'\001',
callback=0x5e9bd0 , callback_state=0x126e248) at
hash.c:937
#6  0x004a07e7 in hashbulkdelete (info=0x7ffc258fc2b0, stats=0x0,
callback=0x5e9bd0 , callback_state=0x126e248) at hash.c:580
#7  0x005e98c5 in lazy_vacuum_index (indrel=0x7f3f745d86b8,
stats=0x126ecc0, vacrelstats=0x126e248) at vacuumlazy.c:1599
#8  0x005ea7f9 in lazy_scan_heap (onerel=,
options=, params=0x12ba290, bstrategy=)
at vacuumlazy.c:1291
#9  lazy_vacuum_rel (onerel=, options=, params=0x12ba290, bstrategy=) at vacuumlazy.c:255
#10 0x005e8939 in vacuum_rel (relid=17329, relation=0x7ffc258fcbd0,
options=99, params=0x12ba290) at vacuum.c:1399
#11 0x005e8d01 in vacuum (options=99, relation=0x7ffc258fcbd0,
relid=, params=0x12ba290, va_cols=0x0,
bstrategy=, isTopLevel=1 '\001') at vacuum.c:307
#12 0x006a07f1 in autovacuum_do_vac_analyze () at autovacuum.c:2823
#13 do_autovacuum () at autovacuum.c:2341
#14 0x006a0f9c in AutoVacWorkerMain (argc=,
argv=) at autovacuum.c:1656
#15 0x006a1116 in StartAutoVacWorker () at autovacuum.c:1461
#16 0x006afb00 in StartAutovacuumWorker (postgres_signal_arg=) at postmaster.c:5323
#17 sigusr1_handler (postgres_signal_arg=) at
postmaster.c:5009
#18 
#19 0x003838ce1503 in __select_nocancel () at
../sysdeps/unix/syscall-template.S:82
#20 0x006b0ec0 in ServerLoop (argc=,
argv=) at postmaster.c:1657
#21 PostmasterMain (argc=, argv=)
at postmaster.c:1301
#22 0x00632e88 in main (argc=4, argv=0x11f4d50) at main.c:228

Cheers,

Jeff


Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-08-23 Thread Ashutosh Sharma
Hi All,

Following are the steps that i have followed to verify the WAL Logging
of hash index,

1. I used Mithun's patch to improve coverage of hash index code [1] to
verify the WAL Logging of hash index. Firstly i have confirmed if all
the XLOG records associated with hash index are being covered or not
using this patch. In case if any of the XLOG record for hash index
operation is not being covered i have added a testcase for it. I have
found that one of the XLOG record 'XLOG_HASH_MOVE_PAGE_CONTENTS' was
not being covered and added a small testcase for the same. The patch
for this is available @ [2].

2. I executed the regression test suite and found all the hash indexes
that are getting created as a part of regression test suite using the
below query.

SELECT t.relname index_name, t.oid FROM pg_class t JOIN pg_am idx
ON idx.oid = t.relam WHERE idx.amname = 'hash';

3. Thirdly, I have calculated the number of pages associated with each
hash index and compared every page of hash index on master and standby
server using pg_filedump tool.  As for example if the number of pages
associated with 'con_hash_index' is 10 then here is what i did,

On master:
-
select pg_relation_filepath('con_hash_index');
 pg_relation_filepath
--
 base/16408/16433
(1 row)

./pg_filedump -if -R 0 9
/home/edb/git-clone-postgresql/postgresql/TMP/postgres/master/base/16408/16433
> /tmp/file1

On Slave:
---
select pg_relation_filepath('con_hash_index');
 pg_relation_filepath
--
 base/16408/16433
(1 row)

./pg_filedump -if -R 0 9
/home/edb/git-clone-postgresql/postgresql/TMP/postgres/standby/base/16408/16433
> /tmp/file2

compared file1 and file2 using some diff tool.

Following are the list of hash indexes that got created inside
regression database when regression test suite was executed on a
master server.

hash_i4_index
hash_name_index
hash_txt_index
hash_f8_index
con_hash_index
hash_idx

In short, this is all i did and found no issues during testing. Please
let me know if you need any further details.

I would like to Thank Amit for his support and guidance during the
testing phase.

[1] - 
https://www.postgresql.org/message-id/CAA4eK1JOBX%3DYU33631Qh-XivYXtPSALh514%2BjR8XeD7v%2BK3r_Q%40mail.gmail.com
[2] - 
https://www.postgresql.org/message-id/CAE9k0PkNjryhSiG53mjnKFhi%2BMipJMjSa%3DYkH-UeW3bfr1HPJQ%40mail.gmail.com

With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-08-22 Thread Amit Kapila
On Tue, Aug 23, 2016 at 8:54 AM, Amit Kapila  wrote:
> $SUBJECT will make hash indexes reliable and usable on standby.
> AFAIU, currently hash indexes are not recommended to be used in
> production mainly because they are not crash-safe and with this patch,
> I hope we can address that limitation and recommend them for use in
> production.
>
> This patch is built on my earlier patch [1] of making hash indexes
> concurrent.  The main reason for doing so is that the earlier patch
> allows to complete the split operation and used light-weight locking
> due to which operations can be logged at granular level.
>
> WAL for different operations:
>
> This has been explained in README as well, but I am again writing it
> here for the ease of people.
>
>
..
> One of the challenge in writing this patch was that the current code
> was not written with a mindset that we need to write WAL for different
> operations.  Typical example is _hash_addovflpage() where pages are
> modified across different function calls and all modifications needs
> to be done atomically, so I have to refactor some code so that the
> operations can be logged sensibly.
>

This patch has not done handling for OldSnapshot.  Previously, we
haven't done TestForOldSnapshot() checks in hash index as they were
not logged, but now with this patch, it makes sense to perform such
checks.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


<    1   2