subject:"\[HACKERS\] Moving relation extension locks out of heavyweight lock manager"

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-17 Thread Amit Kapila

On Tue, Mar 17, 2020 at 6:32 PM Dilip Kumar  wrote:
>
> Your changes look fine to me.  I have also verified all the test and
> everything works fine.
>

I have pushed the first patch.  I will push the others in coming days.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-17 Thread Dilip Kumar

On Tue, Mar 17, 2020 at 5:14 PM Amit Kapila  wrote:
>
> On Mon, Mar 16, 2020 at 3:24 PM Dilip Kumar  wrote:
> >
>
> +
> + /*
> + * Indicate that the lock is released for certain types of locks
> + */
> +#ifdef USE_ASSERT_CHECKING
> + CheckAndSetLockHeld(locallock, false);
> +#endif
>  }
>
>  /*
> @@ -1618,6 +1666,11 @@ GrantLockLocal(LOCALLOCK *locallock, ResourceOwner 
> owner)
>   locallock->numLockOwners++;
>   if (owner != NULL)
>   ResourceOwnerRememberLock(owner, locallock);
> +
> + /* Indicate that the lock is acquired for certain types of locks. */
> +#ifdef USE_ASSERT_CHECKING
> + CheckAndSetLockHeld(locallock, true);
> +#endif
>  }
>
> There is no need to sprinkle USE_ASSERT_CHECKING at so many places,
> having inside the new function is sufficient.  I have changed that,
> added few more comments and
> made minor changes.  See, what you think about attached?

Your changes look fine to me.  I have also verified all the test and
everything works fine.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-17 Thread Amit Kapila

On Mon, Mar 16, 2020 at 3:24 PM Dilip Kumar  wrote:
>

+
+ /*
+ * Indicate that the lock is released for certain types of locks
+ */
+#ifdef USE_ASSERT_CHECKING
+ CheckAndSetLockHeld(locallock, false);
+#endif
 }

 /*
@@ -1618,6 +1666,11 @@ GrantLockLocal(LOCALLOCK *locallock, ResourceOwner owner)
  locallock->numLockOwners++;
  if (owner != NULL)
  ResourceOwnerRememberLock(owner, locallock);
+
+ /* Indicate that the lock is acquired for certain types of locks. */
+#ifdef USE_ASSERT_CHECKING
+ CheckAndSetLockHeld(locallock, true);
+#endif
 }

There is no need to sprinkle USE_ASSERT_CHECKING at so many places,
having inside the new function is sufficient.  I have changed that,
added few more comments and
made minor changes.  See, what you think about attached?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


v12-0001-Assert-that-we-don-t-acquire-a-heavyweight-lock-on-a.patch
Description: Binary data


v12-0002-Add-assert-to-ensure-that-page-locks-don-t-participa.patch
Description: Binary data


v12-0003-Allow-relation-extension-lock-to-conflict-among-para.patch
Description: Binary data


v12-0004-Allow-page-lock-to-conflict-among-parallel-group-mem.patch
Description: Binary data

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-16 Thread Dilip Kumar

On Mon, Mar 16, 2020 at 11:56 AM Kuntal Ghosh
 wrote:
>
> On Mon, Mar 16, 2020 at 9:43 AM Dilip Kumar  wrote:
> > On Mon, Mar 16, 2020 at 8:57 AM Masahiko Sawada
> >  wrote:
> > > IsRelationExtensionLockHeld and IsPageLockHeld are used only when
> > > assertion is enabled. So how about making CheckAndSetLockHeld work
> > > only if USE_ASSERT_CHECKING to avoid overheads?
> >
> > That makes sense to me so updated the patch.
> +1
>
> In v10-0001-Assert-that-we-don-t-acquire-a-heavyweight-lock-.patch,
>
> + * Indicate that the lock is released for a particular type of locks.
> s/lock is/locks are

Done

> + /* Indicate that the lock is acquired for a certain type of locks. */
> s/lock is/locks are

Done

>
> In v10-0002-*.patch,
>
> + * Flag to indicate if the page lock is held by this backend.  We don't
> + * acquire any other heavyweight lock while holding the page lock except for
> + * relation extension.  However, these locks are never taken in reverse order
> + * which implies that page locks will also never participate in the deadlock
> + * cycle.
> s/while holding the page lock except for relation extension/while
> holding the page lock except for relation extension and page lock

Done

> + * We don't acquire any other heavyweight lock while holding the page lock
> + * except for relation extension lock.
> Same as above

Done

>
> Other than that, the patches look good to me. I've also done some
> testing after applying the Test-group-deadlock patch provided by Amit
> earlier in the thread. It works as expected.

Thanks for testing.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


v11-0001-Assert-that-we-don-t-acquire-a-heavyweight-lock-.patch
Description: Binary data


v11-0004-Allow-page-lock-to-conflict-among-parallel-group.patch
Description: Binary data


v11-0003-Allow-relation-extension-lock-to-conflict-among-.patch
Description: Binary data


v11-0002-Add-assert-to-ensure-that-page-locks-don-t-parti.patch
Description: Binary data

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-16 Thread Kuntal Ghosh

On Mon, Mar 16, 2020 at 9:43 AM Dilip Kumar  wrote:
> On Mon, Mar 16, 2020 at 8:57 AM Masahiko Sawada
>  wrote:
> > IsRelationExtensionLockHeld and IsPageLockHeld are used only when
> > assertion is enabled. So how about making CheckAndSetLockHeld work
> > only if USE_ASSERT_CHECKING to avoid overheads?
>
> That makes sense to me so updated the patch.
+1

In v10-0001-Assert-that-we-don-t-acquire-a-heavyweight-lock-.patch,

+ * Indicate that the lock is released for a particular type of locks.
s/lock is/locks are

+ /* Indicate that the lock is acquired for a certain type of locks. */
s/lock is/locks are

In v10-0002-*.patch,

+ * Flag to indicate if the page lock is held by this backend.  We don't
+ * acquire any other heavyweight lock while holding the page lock except for
+ * relation extension.  However, these locks are never taken in reverse order
+ * which implies that page locks will also never participate in the deadlock
+ * cycle.
s/while holding the page lock except for relation extension/while
holding the page lock except for relation extension and page lock

+ * We don't acquire any other heavyweight lock while holding the page lock
+ * except for relation extension lock.
Same as above

Other than that, the patches look good to me. I've also done some
testing after applying the Test-group-deadlock patch provided by Amit
earlier in the thread. It works as expected.

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-15 Thread Dilip Kumar

On Mon, Mar 16, 2020 at 8:15 AM Amit Kapila  wrote:
>
> On Sun, Mar 15, 2020 at 9:17 PM Dilip Kumar  wrote:
> >
> > On Sun, Mar 15, 2020 at 5:58 PM Amit Kapila  wrote:
> > >
> > >
> > > 1. Group members wait for page locks.  If you test that the leader
> > > acquires the page lock and then member also tries to acquire the same
> > > lock on the same index, it wouldn't block before the patch, but after
> > > the patch, the member should wait for the leader to release the lock.
> >
> > Okay, I will test this part.
> >
> > > 2. Try to hit Assert in LockAcquireExtended (a) by trying to
> > > re-acquire the page lock via the debugger,
> >
> > I am not sure whether it is true or not,  Because, if we are holding
> > the page lock and we try the same page lock then the lock will be
> > granted without reaching the code path.  However, I agree that this is
> > not intended instead this is a side effect of allowing relation
> > extension lock while holding the same relation extension lock.  So
> > basically, now the situation is that if the lock is directly granted
> > because we are holding the same lock then it will not go to the assert
> > code.  IMHO, we don't need to add extra code to make it behave
> > differently.  Please let me know what is your opinion on this.
> >
>
> I also don't think there is any reason to add code to prevent that.
> Actually, what I wanted to test was to somehow hit the Assert for the
> cases where it will actually hit if someone tomorrow tries to acquire
> any other type of lock.  Can we mimic such a situation by hacking code
> (say try to acquire some other type of heavyweight lock) or in some
> way to hit the newly added Assert?

I have hacked the code by calling another heavyweight lock and the
assert is hit.

>
> > (b) try to acquire the
> > > relation extension lock after page lock and it should be allowed
> > > (after acquiring page lock, we take relation extension lock in
> > > following code path:
> > > ginInsertCleanup->ginEntryInsert->ginFindLeafPage->ginPlaceToPage->GinNewBuffer).

I have tested this part and it works as expected i.e. assert is not hit.

--test case
create table gin_test_tbl(i int4[]) with (autovacuum_enabled = off);
create index gin_test_idx on gin_test_tbl using gin (i);
insert into gin_test_tbl select array[1, 2, g] from generate_series(1, 2) g;
select gin_clean_pending_list('gin_test_idx');

BTW, this test is already covered by the existing gin.sql file so we
don't need to add any new test.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-15 Thread Dilip Kumar

On Mon, Mar 16, 2020 at 8:57 AM Masahiko Sawada
 wrote:
>
> On Mon, 16 Mar 2020 at 00:54, Dilip Kumar  wrote:
> >
> > On Sun, Mar 15, 2020 at 6:20 PM Amit Kapila  wrote:
> > >
> > > On Sun, Mar 15, 2020 at 4:34 PM Dilip Kumar  wrote:
> > > >
> > > > I have modified 0001 and 0002 slightly,  Basically, instead of two
> > > > function CheckAndSetLockHeld and CheckAndReSetLockHeld, I have created
> > > > a one function.
> > > >
> > >
> > > +CheckAndSetLockHeld(LOCALLOCK *locallock, bool value)
> > >
> > > Can we rename the parameter as lock_held, acquired or something like
> > > that so that it indicates what it intends to do and probably add a
> > > comment for that variable atop of function?
> >
> > Done
> >
>
> I've looked at the patches and ISTM these work as expected.

Thanks for verifying.

> IsRelationExtensionLockHeld and IsPageLockHeld are used only when
> assertion is enabled. So how about making CheckAndSetLockHeld work
> only if USE_ASSERT_CHECKING to avoid overheads?

That makes sense to me so updated the patch.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


v10-0001-Assert-that-we-don-t-acquire-a-heavyweight-lock-.patch
Description: Binary data


v10-0003-Allow-relation-extension-lock-to-conflict-among-.patch
Description: Binary data


v10-0004-Allow-page-lock-to-conflict-among-parallel-group.patch
Description: Binary data


v10-0002-Add-assert-to-ensure-that-page-locks-don-t-parti.patch
Description: Binary data

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-15 Thread Masahiko Sawada

On Mon, 16 Mar 2020 at 00:54, Dilip Kumar  wrote:
>
> On Sun, Mar 15, 2020 at 6:20 PM Amit Kapila  wrote:
> >
> > On Sun, Mar 15, 2020 at 4:34 PM Dilip Kumar  wrote:
> > >
> > > I have modified 0001 and 0002 slightly,  Basically, instead of two
> > > function CheckAndSetLockHeld and CheckAndReSetLockHeld, I have created
> > > a one function.
> > >
> >
> > +CheckAndSetLockHeld(LOCALLOCK *locallock, bool value)
> >
> > Can we rename the parameter as lock_held, acquired or something like
> > that so that it indicates what it intends to do and probably add a
> > comment for that variable atop of function?
>
> Done
>

I've looked at the patches and ISTM these work as expected.
IsRelationExtensionLockHeld and IsPageLockHeld are used only when
assertion is enabled. So how about making CheckAndSetLockHeld work
only if USE_ASSERT_CHECKING to avoid overheads?

Regards,

--
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-15 Thread Amit Kapila

On Sun, Mar 15, 2020 at 9:17 PM Dilip Kumar  wrote:
>
> On Sun, Mar 15, 2020 at 5:58 PM Amit Kapila  wrote:
> >
> >
> > 1. Group members wait for page locks.  If you test that the leader
> > acquires the page lock and then member also tries to acquire the same
> > lock on the same index, it wouldn't block before the patch, but after
> > the patch, the member should wait for the leader to release the lock.
>
> Okay, I will test this part.
>
> > 2. Try to hit Assert in LockAcquireExtended (a) by trying to
> > re-acquire the page lock via the debugger,
>
> I am not sure whether it is true or not,  Because, if we are holding
> the page lock and we try the same page lock then the lock will be
> granted without reaching the code path.  However, I agree that this is
> not intended instead this is a side effect of allowing relation
> extension lock while holding the same relation extension lock.  So
> basically, now the situation is that if the lock is directly granted
> because we are holding the same lock then it will not go to the assert
> code.  IMHO, we don't need to add extra code to make it behave
> differently.  Please let me know what is your opinion on this.
>

I also don't think there is any reason to add code to prevent that.
Actually, what I wanted to test was to somehow hit the Assert for the
cases where it will actually hit if someone tomorrow tries to acquire
any other type of lock.  Can we mimic such a situation by hacking code
(say try to acquire some other type of heavyweight lock) or in some
way to hit the newly added Assert?

> (b) try to acquire the
> > relation extension lock after page lock and it should be allowed
> > (after acquiring page lock, we take relation extension lock in
> > following code path:
> > ginInsertCleanup->ginEntryInsert->ginFindLeafPage->ginPlaceToPage->GinNewBuffer).
>
> ok
>

Thanks.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-15 Thread Dilip Kumar

On Sun, Mar 15, 2020 at 6:20 PM Amit Kapila  wrote:
>
> On Sun, Mar 15, 2020 at 4:34 PM Dilip Kumar  wrote:
> >
> > I have modified 0001 and 0002 slightly,  Basically, instead of two
> > function CheckAndSetLockHeld and CheckAndReSetLockHeld, I have created
> > a one function.
> >
>
> +CheckAndSetLockHeld(LOCALLOCK *locallock, bool value)
>
> Can we rename the parameter as lock_held, acquired or something like
> that so that it indicates what it intends to do and probably add a
> comment for that variable atop of function?

Done

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


v9-0002-Add-assert-to-ensure-that-page-locks-don-t-partic.patch
Description: Binary data


v9-0003-Allow-relation-extension-lock-to-conflict-among-p.patch
Description: Binary data


v9-0001-Assert-that-we-don-t-acquire-a-heavyweight-lock-o.patch
Description: Binary data


v9-0004-Allow-page-lock-to-conflict-among-parallel-group-.patch
Description: Binary data

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-15 Thread Amit Kapila

On Sun, Mar 15, 2020 at 4:34 PM Dilip Kumar  wrote:
>
> I have modified 0001 and 0002 slightly,  Basically, instead of two
> function CheckAndSetLockHeld and CheckAndReSetLockHeld, I have created
> a one function.
>

+CheckAndSetLockHeld(LOCALLOCK *locallock, bool value)

Can we rename the parameter as lock_held, acquired or something like
that so that it indicates what it intends to do and probably add a
comment for that variable atop of function?

There is some work left related to testing some parts of the patch and
I can do some more review, but it started to look good to me, so I am
planning to push this in the coming week (say by Wednesday or so)
unless there are some major comments.  There are primarily two parts
of the patch-series (a) Assert that we don't acquire a heavyweight
lock on another object after relation extension lock. (b) Allow
relation extension lock to conflict among the parallel group members.
On similar lines there are two patches for page locks.

I think we have discussed in detail about LWLock approach and it seems
that it might be tricky than we initially thought especially with some
of the latest findings where we have noticed that there are multiple
cases where we can try to re-acquire the relation extension lock and
other things which we have discussed.  Also, all of us don't agree
with that idea.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-15 Thread Amit Kapila

On Sun, Mar 15, 2020 at 1:15 PM Dilip Kumar  wrote:
>
> On Sat, Mar 14, 2020 at 7:39 PM Amit Kapila  wrote:
> >
> > On Fri, Mar 13, 2020 at 7:02 PM Dilip Kumar  wrote:
> > >
> > > Apart from that, I have also extended the solution for the page lock.
> > > And, I have also broken down the 3rd patch in two parts for relation
> > > extension and for the page lock.
> > >
> >
> > Thanks, I have made a number of cosmetic changes and written
> > appropriate commit messages for all patches.  See the attached patch
> > series and let me know your opinion? BTW, did you get a chance to test
> > page locks by using the extension which I have posted above or by some
> > other way?  I think it is important to test page-lock related patches
> > now.
>
> I have reviewed the updated patches and looks fine to me.  Apart from
> this I have done testing for the Page Lock using group locking
> extension.
>
> --Setup
> create table gin_test_tbl(i int4[]) with (autovacuum_enabled = off);
> create index gin_test_idx on gin_test_tbl using gin (i);
> create table gin_test_tbl1(i int4[]) with (autovacuum_enabled = off);
> create index gin_test_idx1 on gin_test_tbl1 using gin (i);
>
> --session1:
> select become_lock_group_leader();
> select gin_clean_pending_list('gin_test_idx');
>
> --session2:
> select become_lock_group_member(session1_pid);
> select gin_clean_pending_list('gin_test_idx1');
>
> --session3:
> select become_lock_group_leader();
> select gin_clean_pending_list('gin_test_idx1');
>
> --session4:
> select become_lock_group_member(session3_pid);
> select gin_clean_pending_list('gin_test_idx');
>
> ERROR:  deadlock detected
> DETAIL:  Process 61953 waits for ExclusiveLock on page 0 of relation
> 16399 of database 13577; blocked by process 62197.
> Process 62197 waits for ExclusiveLock on page 0 of relation 16400 of
> database 13577; blocked by process 61953.
> HINT:  See server log for query details.
>
>
> Session1 and Session3 acquire the PageLock on two different index's
> meta-pages and blocked in gdb,  meanwhile, their member tries to
> acquire the page lock as shown in the above example and it detects the
> deadlock which is solved after applying the patch.
>

So, in this test, you have first performed the actions from Session-1
and Session-3 (blocked them via GDB after acquiring page lock) and
then performed the actions from Session-2 and Session-4, right?
Though this is not a very realistic case, it proves the point that
page locks don't participate in the deadlock cycle after the patch.  I
think we can do a few more tests that test other aspects of the patch.

1. Group members wait for page locks.  If you test that the leader
acquires the page lock and then member also tries to acquire the same
lock on the same index, it wouldn't block before the patch, but after
the patch, the member should wait for the leader to release the lock.
2. Try to hit Assert in LockAcquireExtended (a) by trying to
re-acquire the page lock via the debugger, (b) try to acquire the
relation extension lock after page lock and it should be allowed
(after acquiring page lock, we take relation extension lock in
following code path:
ginInsertCleanup->ginEntryInsert->ginFindLeafPage->ginPlaceToPage->GinNewBuffer).

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-15 Thread Dilip Kumar

On Sun, Mar 15, 2020 at 1:15 PM Dilip Kumar  wrote:
>
> On Sat, Mar 14, 2020 at 7:39 PM Amit Kapila  wrote:
> >
> > On Fri, Mar 13, 2020 at 7:02 PM Dilip Kumar  wrote:
> > >
> > > Apart from that, I have also extended the solution for the page lock.
> > > And, I have also broken down the 3rd patch in two parts for relation
> > > extension and for the page lock.
> > >
> >
> > Thanks, I have made a number of cosmetic changes and written
> > appropriate commit messages for all patches.  See the attached patch
> > series and let me know your opinion? BTW, did you get a chance to test
> > page locks by using the extension which I have posted above or by some
> > other way?  I think it is important to test page-lock related patches
> > now.
>
> I have reviewed the updated patches and looks fine to me.  Apart from
> this I have done testing for the Page Lock using group locking
> extension.
>
> --Setup
> create table gin_test_tbl(i int4[]) with (autovacuum_enabled = off);
> create index gin_test_idx on gin_test_tbl using gin (i);
> create table gin_test_tbl1(i int4[]) with (autovacuum_enabled = off);
> create index gin_test_idx1 on gin_test_tbl1 using gin (i);
>
> --session1:
> select become_lock_group_leader();
> select gin_clean_pending_list('gin_test_idx');
>
> --session2:
> select become_lock_group_member(session1_pid);
> select gin_clean_pending_list('gin_test_idx1');
>
> --session3:
> select become_lock_group_leader();
> select gin_clean_pending_list('gin_test_idx1');
>
> --session4:
> select become_lock_group_member(session3_pid);
> select gin_clean_pending_list('gin_test_idx');
>
> ERROR:  deadlock detected
> DETAIL:  Process 61953 waits for ExclusiveLock on page 0 of relation
> 16399 of database 13577; blocked by process 62197.
> Process 62197 waits for ExclusiveLock on page 0 of relation 16400 of
> database 13577; blocked by process 61953.
> HINT:  See server log for query details.
>
>
> Session1 and Session3 acquire the PageLock on two different index's
> meta-pages and blocked in gdb,  meanwhile, their member tries to
> acquire the page lock as shown in the above example and it detects the
> deadlock which is solved after applying the patch.

I have modified 0001 and 0002 slightly,  Basically, instead of two
function CheckAndSetLockHeld and CheckAndReSetLockHeld, I have created
a one function.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


v8-0001-Assert-that-we-don-t-acquire-a-heavyweight-lock-o.patch
Description: Binary data


v8-0003-Allow-relation-extension-lock-to-conflict-among-p.patch
Description: Binary data


v8-0004-Allow-page-lock-to-conflict-among-parallel-group-.patch
Description: Binary data


v8-0002-Add-assert-to-ensure-that-page-locks-don-t-partic.patch
Description: Binary data

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-15 Thread Dilip Kumar

On Sat, Mar 14, 2020 at 7:39 PM Amit Kapila  wrote:
>
> On Fri, Mar 13, 2020 at 7:02 PM Dilip Kumar  wrote:
> >
> > Apart from that, I have also extended the solution for the page lock.
> > And, I have also broken down the 3rd patch in two parts for relation
> > extension and for the page lock.
> >
>
> Thanks, I have made a number of cosmetic changes and written
> appropriate commit messages for all patches.  See the attached patch
> series and let me know your opinion? BTW, did you get a chance to test
> page locks by using the extension which I have posted above or by some
> other way?  I think it is important to test page-lock related patches
> now.

I have reviewed the updated patches and looks fine to me.  Apart from
this I have done testing for the Page Lock using group locking
extension.

--Setup
create table gin_test_tbl(i int4[]) with (autovacuum_enabled = off);
create index gin_test_idx on gin_test_tbl using gin (i);
create table gin_test_tbl1(i int4[]) with (autovacuum_enabled = off);
create index gin_test_idx1 on gin_test_tbl1 using gin (i);

--session1:
select become_lock_group_leader();
select gin_clean_pending_list('gin_test_idx');

--session2:
select become_lock_group_member(session1_pid);
select gin_clean_pending_list('gin_test_idx1');

--session3:
select become_lock_group_leader();
select gin_clean_pending_list('gin_test_idx1');

--session4:
select become_lock_group_member(session3_pid);
select gin_clean_pending_list('gin_test_idx');

ERROR:  deadlock detected
DETAIL:  Process 61953 waits for ExclusiveLock on page 0 of relation
16399 of database 13577; blocked by process 62197.
Process 62197 waits for ExclusiveLock on page 0 of relation 16400 of
database 13577; blocked by process 61953.
HINT:  See server log for query details.

Session1 and Session3 acquire the PageLock on two different index's
meta-pages and blocked in gdb,  meanwhile, their member tries to
acquire the page lock as shown in the above example and it detects the
deadlock which is solved after applying the patch.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-14 Thread Amit Kapila

On Fri, Mar 13, 2020 at 7:02 PM Dilip Kumar  wrote:
>
> Apart from that, I have also extended the solution for the page lock.
> And, I have also broken down the 3rd patch in two parts for relation
> extension and for the page lock.
>

Thanks, I have made a number of cosmetic changes and written
appropriate commit messages for all patches.  See the attached patch
series and let me know your opinion? BTW, did you get a chance to test
page locks by using the extension which I have posted above or by some
other way?  I think it is important to test page-lock related patches
now.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

v7-0001-Assert-that-we-don-t-acquire-a-heavyweight-lock-on-a.patch
Description: Binary data

v7-0002-Add-assert-to-ensure-that-page-locks-don-t-participa.patch
Description: Binary data

v7-0003-Allow-relation-extension-lock-to-conflict-among-para.patch
Description: Binary data

v7-0004-Allow-page-lock-to-conflict-among-parallel-group-mem.patch
Description: Binary data

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-13 Thread Dilip Kumar

On Fri, Mar 13, 2020 at 3:39 PM Amit Kapila  wrote:
>
> On Fri, Mar 13, 2020 at 8:37 AM Dilip Kumar  wrote:
> >
> > On Thu, Mar 12, 2020 at 5:28 PM Amit Kapila  wrote:
> > >
> > > On Thu, Mar 12, 2020 at 11:15 AM Dilip Kumar  
> > > wrote:
> > > >
> > > > I have fixed this in the attached patch set.
> > > >
> > >
> > > I have modified your
> > > v4-0003-Conflict-Extension-Page-lock-in-group-member patch.  The
> > > modifications are (a) Change src/backend/storage/lmgr/README to
> > > reflect new behaviour, (b) Introduce a new macro LOCK_LOCKTAG which
> > > slightly simplifies the code, (c) moved the deadlock.c check a few
> > > lines up and (d) changed a few comments.
> >
> > Changes look fine to me.
> >
>
> Today, while looking at this patch again, I realized that there is a
> where we sometimes allow group members to jump the wait queue.  This
> is primarily to avoid creating deadlocks (see ProcSleep).  Now,
> ideally, we don't need this for relation extension or page locks as
> those can never lead to deadlocks.  However, the current code will
> give group members more priority to acquire relation extension or page
> locks if any one of the members has held those locks.  Now, if we want
> we can prevent giving group members priority for these locks, but I am
> not sure how important is that case.  So, I have left that as it is by
> adding a few comments.  What do you think?
>
> Additionally, I have changed/added a few more sentences in README.

I have included all your changes in the latest patch set.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-13 Thread Dilip Kumar

On Fri, Mar 13, 2020 at 11:16 AM Dilip Kumar  wrote:
>
> On Fri, Mar 13, 2020 at 11:08 AM Amit Kapila  wrote:
> >
> > On Thu, Mar 12, 2020 at 3:04 PM Dilip Kumar  wrote:
> > >
> > > On Wed, Mar 11, 2020 at 2:36 PM Amit Kapila  
> > > wrote:
> > > >
> > > >
> > > > If we have no other choice, then I see a few downsides of adding a
> > > > special check in the LockRelease() call:
> > > >
> > > > 1. Instead of resetting/decrement the variable from specific APIs like
> > > > UnlockRelationForExtension or UnlockPage, we need to have it in
> > > > LockRelease. It will also look odd, if set variable in
> > > > LockRelationForExtension, but don't reset in the
> > > > UnlockRelationForExtension variant.  Now, maybe we can allow to reset
> > > > it at both places if it is a flag, but not if it is a counter
> > > > variable.
> > > >
> > > > 2. One can argue that adding extra instructions in a generic path
> > > > (like LockRelease) is not a good idea, especially if those are for an
> > > > Assert. I understand this won't add anything which we can measure by
> > > > standard benchmarks.
> > >
> > > I have just written a WIP patch for relation extension lock where
> > > instead of incrementing and decrementing the counter in
> > > LockRelationForExtension and UnlockRelationForExtension respectively.
> > > We can just set and reset the flag in LockAcquireExtended and
> > > LockRelease.  So this patch appears simple to me as we are not
> > > involving the transaction APIs to set and reset the flag.  However, we
> > > need to add an extra check as you have already mentioned.  I think we
> > > could measure the performance and see whether it has any impact or
> > > not?
> > >
> >
> > LockAcquireExtended()
> > {
> > ..
> > + if (locktag->locktag_type == LOCKTAG_RELATION_EXTEND)
> > + IsRelationExtensionLockHeld = true;
> > ..
> > }
> >
> > Can we move this check inside a function (CheckAndSetLockHeld or
> > something like that) as we need to add a similar thing for page lock?
>
> ok

Done

>
> > Also, how about moving the set and reset of these flags to
> > GrantLockLocal and RemoveLocalLock as that will further reduce the
> > number of places where we need to add such a check.
>
> Make sense to me.

Done
>

>  Another thing is
> > to see if it makes sense to have a macro like LOCALLOCK_LOCKMETHOD to
> > get the lock tag.
>
> ok

Done

Apart from that, I have also extended the solution for the page lock.
And, I have also broken down the 3rd patch in two parts for relation
extension and for the page lock.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


v6-0001-Add-assert-check-for-relation-extension-lock.patch
Description: Binary data


v6-0004-Page-lock-to-conflict-among-parallel-group-member.patch
Description: Binary data


v6-0002-Extend-the-assert-for-the-page-lock.patch
Description: Binary data


v6-0003-Relation-extension-lock-to-conflict-among-paralle.patch
Description: Binary data

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-13 Thread Dilip Kumar

On Fri, Mar 13, 2020 at 2:32 PM Kuntal Ghosh  wrote:
>
> On Fri, Mar 13, 2020 at 8:29 AM Amit Kapila  wrote:
> >
> > On Thu, Mar 12, 2020 at 7:50 PM Kuntal Ghosh  
> > wrote:
> > > I think moving them inside a macro is a good idea. Also, I think we
> > > should move all the Assert related code inside some debugging macro
> > > similar to this:
> > > #ifdef LOCK_DEBUG
> > > 
> > > #endif
> > >
> > If we move it under some macro, then those Asserts will be only
> > enabled when that macro is defined.  I think we want there Asserts to
> > be enabled always in assert enabled build, these will be like any
> > other Asserts in the code.  What is the advantage of doing those under
> > macro?
> >
> My concern is related to performance regression. We're using two
> static variables in hot-paths only for checking a few asserts. So, I'm
> not sure whether we should enable the same by default, specially when
> asserts are itself disabled.
> -ResetRelExtLockHeldCount()
> +ResetRelExtPageLockHeldCount()
>  {
>   RelationExtensionLockHeldCount = 0;
> + PageLockHeldCount = 0;
> +}
> Also, we're calling this method from frequently used functions like
> Commit/AbortTransaction. So, it's better these two static variables
> share the same cache line and reinitalize them with a single
> instruction.

In the recent version of the patch, instead of a counter, we have done
with a flag.  So I think now we can just keep a single variable and we
can just reset the bit in a single instruction.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-13 Thread Amit Kapila

On Fri, Mar 13, 2020 at 8:37 AM Dilip Kumar  wrote:
>
> On Thu, Mar 12, 2020 at 5:28 PM Amit Kapila  wrote:
> >
> > On Thu, Mar 12, 2020 at 11:15 AM Dilip Kumar  wrote:
> > >
> > > I have fixed this in the attached patch set.
> > >
> >
> > I have modified your
> > v4-0003-Conflict-Extension-Page-lock-in-group-member patch.  The
> > modifications are (a) Change src/backend/storage/lmgr/README to
> > reflect new behaviour, (b) Introduce a new macro LOCK_LOCKTAG which
> > slightly simplifies the code, (c) moved the deadlock.c check a few
> > lines up and (d) changed a few comments.
>
> Changes look fine to me.
>

Today, while looking at this patch again, I realized that there is a
where we sometimes allow group members to jump the wait queue.  This
is primarily to avoid creating deadlocks (see ProcSleep).  Now,
ideally, we don't need this for relation extension or page locks as
those can never lead to deadlocks.  However, the current code will
give group members more priority to acquire relation extension or page
locks if any one of the members has held those locks.  Now, if we want
we can prevent giving group members priority for these locks, but I am
not sure how important is that case.  So, I have left that as it is by
adding a few comments.  What do you think?

Additionally, I have changed/added a few more sentences in README.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

0002-Allow-relation-extension-and-page-locks-to-conflict-.v2.patch
Description: Binary data

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-13 Thread Kuntal Ghosh

On Fri, Mar 13, 2020 at 8:42 AM Dilip Kumar  wrote:
> > On Thu, Mar 12, 2020 at 7:50 PM Kuntal Ghosh  
> > wrote:
> >
> > > + /*
> > > + * The relation extension or page lock can never participate in actual
> > > + * deadlock cycle.  See Asserts in LockAcquireExtended.  So, there is
> > > + * no advantage in checking wait edges from it.
> > > + */
> > > + if ((LOCK_LOCKTAG(*lock) == LOCKTAG_RELATION_EXTEND) ||
> > > + (LOCK_LOCKTAG(*lock) == LOCKTAG_PAGE))
> > > + return false;
> > > +
> > > Since this is true, we can also avoid these kind of locks in
> > > ExpandConstraints, right?
>
> I am not sure I understand this part.  Because topological sort will
> work on the soft edges we have created when we found the cycle,  but
> for relation extension/page lock we are completely ignoring hard/soft
> edge then it will never participate in topo sort as well.  Am I
> missing something?
>
No, I think you're right. We only add constraints if we've detected a
cycle in the graph. Hence, you don't need the check here.


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-13 Thread Kuntal Ghosh

On Fri, Mar 13, 2020 at 8:29 AM Amit Kapila  wrote:
>
> On Thu, Mar 12, 2020 at 7:50 PM Kuntal Ghosh  
> wrote:
> > I think moving them inside a macro is a good idea. Also, I think we
> > should move all the Assert related code inside some debugging macro
> > similar to this:
> > #ifdef LOCK_DEBUG
> > 
> > #endif
> >
> If we move it under some macro, then those Asserts will be only
> enabled when that macro is defined.  I think we want there Asserts to
> be enabled always in assert enabled build, these will be like any
> other Asserts in the code.  What is the advantage of doing those under
> macro?
>
My concern is related to performance regression. We're using two
static variables in hot-paths only for checking a few asserts. So, I'm
not sure whether we should enable the same by default, specially when
asserts are itself disabled.
-ResetRelExtLockHeldCount()
+ResetRelExtPageLockHeldCount()
 {
  RelationExtensionLockHeldCount = 0;
+ PageLockHeldCount = 0;
+}
Also, we're calling this method from frequently used functions like
Commit/AbortTransaction. So, it's better these two static variables
share the same cache line and reinitalize them with a single
instruction.

>
> >   /*
> > + * The relation extension or page lock conflict even between the group
> > + * members.
> > + */
> > + if ((LOCK_LOCKTAG(*lock) == LOCKTAG_RELATION_EXTEND) ||
> > + (LOCK_LOCKTAG(*lock) == LOCKTAG_PAGE))
> > + {
> > + PROCLOCK_PRINT("LockCheckConflicts: conflicting (group)",
> > + proclock);
> > + return true;
> > + }
> > This check includes the heavyweight locks that conflict even under
> > same parallel group. It also has another property that they can never
> > participate in deadlock cycles. And, the number of locks under this
> > category is likely to increase in future with new parallel features.
> > Hence, it could be used in multiple places. Should we move the
> > condition inside a macro and just call it from here?
> >
>
> Right, this is what I have suggested upthread. Do you have any
> suggestions for naming such a macro or function?  I could think of
> something like LocksConflictAmongGroupMembers or
> LocksNotParticipateInDeadlock. The first one suits more for its usage
> in LockCheckConflicts and the second in the deadlock.c code. So none
> of those sound perfect to me.
>
Actually, I'm not able to come up with a good suggestion. I'm trying
to think of a generic name similar to strong or weak locks but with
the following properties:
a. Locks that don't participate in deadlock detection
b. Locks that conflicts in the same parallel group

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-12 Thread Dilip Kumar

On Fri, Mar 13, 2020 at 11:08 AM Amit Kapila  wrote:
>
> On Thu, Mar 12, 2020 at 3:04 PM Dilip Kumar  wrote:
> >
> > On Wed, Mar 11, 2020 at 2:36 PM Amit Kapila  wrote:
> > >
> > >
> > > If we have no other choice, then I see a few downsides of adding a
> > > special check in the LockRelease() call:
> > >
> > > 1. Instead of resetting/decrement the variable from specific APIs like
> > > UnlockRelationForExtension or UnlockPage, we need to have it in
> > > LockRelease. It will also look odd, if set variable in
> > > LockRelationForExtension, but don't reset in the
> > > UnlockRelationForExtension variant.  Now, maybe we can allow to reset
> > > it at both places if it is a flag, but not if it is a counter
> > > variable.
> > >
> > > 2. One can argue that adding extra instructions in a generic path
> > > (like LockRelease) is not a good idea, especially if those are for an
> > > Assert. I understand this won't add anything which we can measure by
> > > standard benchmarks.
> >
> > I have just written a WIP patch for relation extension lock where
> > instead of incrementing and decrementing the counter in
> > LockRelationForExtension and UnlockRelationForExtension respectively.
> > We can just set and reset the flag in LockAcquireExtended and
> > LockRelease.  So this patch appears simple to me as we are not
> > involving the transaction APIs to set and reset the flag.  However, we
> > need to add an extra check as you have already mentioned.  I think we
> > could measure the performance and see whether it has any impact or
> > not?
> >
>
> LockAcquireExtended()
> {
> ..
> + if (locktag->locktag_type == LOCKTAG_RELATION_EXTEND)
> + IsRelationExtensionLockHeld = true;
> ..
> }
>
> Can we move this check inside a function (CheckAndSetLockHeld or
> something like that) as we need to add a similar thing for page lock?

ok

> Also, how about moving the set and reset of these flags to
> GrantLockLocal and RemoveLocalLock as that will further reduce the
> number of places where we need to add such a check.

Make sense to me.

 Another thing is
> to see if it makes sense to have a macro like LOCALLOCK_LOCKMETHOD to
> get the lock tag.

ok

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-12 Thread Amit Kapila

On Thu, Mar 12, 2020 at 3:04 PM Dilip Kumar  wrote:
>
> On Wed, Mar 11, 2020 at 2:36 PM Amit Kapila  wrote:
> >
> >
> > If we have no other choice, then I see a few downsides of adding a
> > special check in the LockRelease() call:
> >
> > 1. Instead of resetting/decrement the variable from specific APIs like
> > UnlockRelationForExtension or UnlockPage, we need to have it in
> > LockRelease. It will also look odd, if set variable in
> > LockRelationForExtension, but don't reset in the
> > UnlockRelationForExtension variant.  Now, maybe we can allow to reset
> > it at both places if it is a flag, but not if it is a counter
> > variable.
> >
> > 2. One can argue that adding extra instructions in a generic path
> > (like LockRelease) is not a good idea, especially if those are for an
> > Assert. I understand this won't add anything which we can measure by
> > standard benchmarks.
>
> I have just written a WIP patch for relation extension lock where
> instead of incrementing and decrementing the counter in
> LockRelationForExtension and UnlockRelationForExtension respectively.
> We can just set and reset the flag in LockAcquireExtended and
> LockRelease.  So this patch appears simple to me as we are not
> involving the transaction APIs to set and reset the flag.  However, we
> need to add an extra check as you have already mentioned.  I think we
> could measure the performance and see whether it has any impact or
> not?
>

LockAcquireExtended()
{
..
+ if (locktag->locktag_type == LOCKTAG_RELATION_EXTEND)
+ IsRelationExtensionLockHeld = true;
..
}

Can we move this check inside a function (CheckAndSetLockHeld or
something like that) as we need to add a similar thing for page lock?
Also, how about moving the set and reset of these flags to
GrantLockLocal and RemoveLocalLock as that will further reduce the
number of places where we need to add such a check.  Another thing is
to see if it makes sense to have a macro like LOCALLOCK_LOCKMETHOD to
get the lock tag.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-12 Thread Dilip Kumar

On Fri, Mar 13, 2020 at 8:29 AM Amit Kapila  wrote:
>
> On Thu, Mar 12, 2020 at 7:50 PM Kuntal Ghosh  
> wrote:
> >
> > On Thu, Mar 12, 2020 at 5:28 PM Amit Kapila  wrote:
> > >
> > > On Thu, Mar 12, 2020 at 11:15 AM Dilip Kumar  
> > > wrote:
> > > >
> > > > I have fixed this in the attached patch set.
> > > >
> > >
> > > I have modified your
> > > v4-0003-Conflict-Extension-Page-lock-in-group-member patch.  The
> > > modifications are (a) Change src/backend/storage/lmgr/README to
> > > reflect new behaviour, (b) Introduce a new macro LOCK_LOCKTAG which
> > > slightly simplifies the code, (c) moved the deadlock.c check a few
> > > lines up and (d) changed a few comments.
> > >
> > > It might be better if we can move the checks related to extension and
> > > page lock in a separate API or macro.  What do you think?
> > >
> > I think moving them inside a macro is a good idea. Also, I think we
> > should move all the Assert related code inside some debugging macro
> > similar to this:
> > #ifdef LOCK_DEBUG
> > 
> > #endif
> >
>
> If we move it under some macro, then those Asserts will be only
> enabled when that macro is defined.  I think we want there Asserts to
> be enabled always in assert enabled build, these will be like any
> other Asserts in the code.  What is the advantage of doing those under
> macro?
>
> > + /*
> > + * The relation extension or page lock can never participate in actual
> > + * deadlock cycle.  See Asserts in LockAcquireExtended.  So, there is
> > + * no advantage in checking wait edges from it.
> > + */
> > + if ((LOCK_LOCKTAG(*lock) == LOCKTAG_RELATION_EXTEND) ||
> > + (LOCK_LOCKTAG(*lock) == LOCKTAG_PAGE))
> > + return false;
> > +
> > Since this is true, we can also avoid these kind of locks in
> > ExpandConstraints, right?
> >
>
> Yes, I had also thought about it but left it to avoid sprinkling such
> checks at more places than absolutely required.
>
> > It'll certainly reduce some complexity in
> > topological sort.
> >
>
> I think you mean to say TopoSort will have to look at fewer members in
> the wait queue, otherwise, there is nothing from the perspective of
> code which we can remove/change there. I think there will be hardly
> any chance that such locks will participate here because we take those
> for some work and release them (basically, they are unlike other
> heavyweight locks which can be released at the end).   Having said
> that, I am not against putting those checks at the place you are
> suggesting, it is just that I thought that it won't be of much use.

I am not sure I understand this part.  Because topological sort will
work on the soft edges we have created when we found the cycle,  but
for relation extension/page lock we are completely ignoring hard/soft
edge then it will never participate in topo sort as well.  Am I
missing something?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-12 Thread Dilip Kumar

On Thu, Mar 12, 2020 at 5:28 PM Amit Kapila  wrote:
>
> On Thu, Mar 12, 2020 at 11:15 AM Dilip Kumar  wrote:
> >
> > I have fixed this in the attached patch set.
> >
>
> I have modified your
> v4-0003-Conflict-Extension-Page-lock-in-group-member patch.  The
> modifications are (a) Change src/backend/storage/lmgr/README to
> reflect new behaviour, (b) Introduce a new macro LOCK_LOCKTAG which
> slightly simplifies the code, (c) moved the deadlock.c check a few
> lines up and (d) changed a few comments.

Changes look fine to me.

> It might be better if we can move the checks related to extension and
> page lock in a separate API or macro.  What do you think?

I feel it looks cleaner this way as well.  But, If we plan to move it
to common function/macro then we should use some common name such that
it can be used in FindLockCycleRecurseMember as well as in
LockCheckConflicts.

> I have also used an extension to test this patch.  This is the same
> extension that I have used to test the group locking patch.  It will
> allow backends to form a group as we do for parallel workers.  The
> extension is attached to this email.
>
> Test without patch:
> Session-1
> Create table t1(c1 int, c2 char(500));
> Select become_lock_group_leader();
>
> Insert into t1 values(generate_series(1,100),'aaa'); -- stop this
> after acquiring relation extension lock via GDB.
>
> Session-2
> Select  become_lock_group_member();
> Insert into t1 values(generate_series(101,200),'aaa');
> - Debug LockAcquire and found that it doesn't generate conflict for
> Relation Extension lock.
>
> The above experiment has shown that without patch group members can
> acquire relation extension lock if the group leader has that lock.
> After patch the second session waits for the first session to release
> the relation extension lock. I know this is not a perfect way to test,
> but it is better than nothing.  I think we need to do some more
> testing either using this extension or some other way for extension
> and page locks.

I have also tested the same and verified it.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-12 Thread Amit Kapila

On Thu, Mar 12, 2020 at 7:50 PM Kuntal Ghosh  wrote:
>
> On Thu, Mar 12, 2020 at 5:28 PM Amit Kapila  wrote:
> >
> > On Thu, Mar 12, 2020 at 11:15 AM Dilip Kumar  wrote:
> > >
> > > I have fixed this in the attached patch set.
> > >
> >
> > I have modified your
> > v4-0003-Conflict-Extension-Page-lock-in-group-member patch.  The
> > modifications are (a) Change src/backend/storage/lmgr/README to
> > reflect new behaviour, (b) Introduce a new macro LOCK_LOCKTAG which
> > slightly simplifies the code, (c) moved the deadlock.c check a few
> > lines up and (d) changed a few comments.
> >
> > It might be better if we can move the checks related to extension and
> > page lock in a separate API or macro.  What do you think?
> >
> I think moving them inside a macro is a good idea. Also, I think we
> should move all the Assert related code inside some debugging macro
> similar to this:
> #ifdef LOCK_DEBUG
> 
> #endif
>

If we move it under some macro, then those Asserts will be only
enabled when that macro is defined.  I think we want there Asserts to
be enabled always in assert enabled build, these will be like any
other Asserts in the code.  What is the advantage of doing those under
macro?

> + /*
> + * The relation extension or page lock can never participate in actual
> + * deadlock cycle.  See Asserts in LockAcquireExtended.  So, there is
> + * no advantage in checking wait edges from it.
> + */
> + if ((LOCK_LOCKTAG(*lock) == LOCKTAG_RELATION_EXTEND) ||
> + (LOCK_LOCKTAG(*lock) == LOCKTAG_PAGE))
> + return false;
> +
> Since this is true, we can also avoid these kind of locks in
> ExpandConstraints, right?
>

Yes, I had also thought about it but left it to avoid sprinkling such
checks at more places than absolutely required.

> It'll certainly reduce some complexity in
> topological sort.
>

I think you mean to say TopoSort will have to look at fewer members in
the wait queue, otherwise, there is nothing from the perspective of
code which we can remove/change there. I think there will be hardly
any chance that such locks will participate here because we take those
for some work and release them (basically, they are unlike other
heavyweight locks which can be released at the end).   Having said
that, I am not against putting those checks at the place you are
suggesting, it is just that I thought that it won't be of much use.

>   /*
> + * The relation extension or page lock conflict even between the group
> + * members.
> + */
> + if ((LOCK_LOCKTAG(*lock) == LOCKTAG_RELATION_EXTEND) ||
> + (LOCK_LOCKTAG(*lock) == LOCKTAG_PAGE))
> + {
> + PROCLOCK_PRINT("LockCheckConflicts: conflicting (group)",
> + proclock);
> + return true;
> + }
> This check includes the heavyweight locks that conflict even under
> same parallel group. It also has another property that they can never
> participate in deadlock cycles. And, the number of locks under this
> category is likely to increase in future with new parallel features.
> Hence, it could be used in multiple places. Should we move the
> condition inside a macro and just call it from here?
>

Right, this is what I have suggested upthread. Do you have any
suggestions for naming such a macro or function?  I could think of
something like LocksConflictAmongGroupMembers or
LocksNotParticipateInDeadlock. The first one suits more for its usage
in LockCheckConflicts and the second in the deadlock.c code. So none
of those sound perfect to me.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-12 Thread Kuntal Ghosh

On Thu, Mar 12, 2020 at 5:28 PM Amit Kapila  wrote:
>
> On Thu, Mar 12, 2020 at 11:15 AM Dilip Kumar  wrote:
> >
> > I have fixed this in the attached patch set.
> >
>
> I have modified your
> v4-0003-Conflict-Extension-Page-lock-in-group-member patch.  The
> modifications are (a) Change src/backend/storage/lmgr/README to
> reflect new behaviour, (b) Introduce a new macro LOCK_LOCKTAG which
> slightly simplifies the code, (c) moved the deadlock.c check a few
> lines up and (d) changed a few comments.
>
> It might be better if we can move the checks related to extension and
> page lock in a separate API or macro.  What do you think?
>
I think moving them inside a macro is a good idea. Also, I think we
should move all the Assert related code inside some debugging macro
similar to this:
#ifdef LOCK_DEBUG

#endif

+ /*
+ * The relation extension or page lock can never participate in actual
+ * deadlock cycle.  See Asserts in LockAcquireExtended.  So, there is
+ * no advantage in checking wait edges from it.
+ */
+ if ((LOCK_LOCKTAG(*lock) == LOCKTAG_RELATION_EXTEND) ||
+ (LOCK_LOCKTAG(*lock) == LOCKTAG_PAGE))
+ return false;
+
Since this is true, we can also avoid these kind of locks in
ExpandConstraints, right? It'll certainly reduce some complexity in
topological sort.

  /*
+ * The relation extension or page lock conflict even between the group
+ * members.
+ */
+ if ((LOCK_LOCKTAG(*lock) == LOCKTAG_RELATION_EXTEND) ||
+ (LOCK_LOCKTAG(*lock) == LOCKTAG_PAGE))
+ {
+ PROCLOCK_PRINT("LockCheckConflicts: conflicting (group)",
+ proclock);
+ return true;
+ }
This check includes the heavyweight locks that conflict even under
same parallel group. It also has another property that they can never
participate in deadlock cycles. And, the number of locks under this
category is likely to increase in future with new parallel features.
Hence, it could be used in multiple places. Should we move the
condition inside a macro and just call it from here?

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-12 Thread Amit Kapila

On Thu, Mar 12, 2020 at 11:15 AM Dilip Kumar  wrote:
>
> I have fixed this in the attached patch set.
>

I have modified your
v4-0003-Conflict-Extension-Page-lock-in-group-member patch.  The
modifications are (a) Change src/backend/storage/lmgr/README to
reflect new behaviour, (b) Introduce a new macro LOCK_LOCKTAG which
slightly simplifies the code, (c) moved the deadlock.c check a few
lines up and (d) changed a few comments.

It might be better if we can move the checks related to extension and
page lock in a separate API or macro.  What do you think?

I have also used an extension to test this patch.  This is the same
extension that I have used to test the group locking patch.  It will
allow backends to form a group as we do for parallel workers.  The
extension is attached to this email.

Test without patch:
Session-1
Create table t1(c1 int, c2 char(500));
Select become_lock_group_leader();

Insert into t1 values(generate_series(1,100),'aaa'); -- stop this
after acquiring relation extension lock via GDB.

Session-2
Select  become_lock_group_member();
Insert into t1 values(generate_series(101,200),'aaa');
- Debug LockAcquire and found that it doesn't generate conflict for
Relation Extension lock.

The above experiment has shown that without patch group members can
acquire relation extension lock if the group leader has that lock.
After patch the second session waits for the first session to release
the relation extension lock. I know this is not a perfect way to test,
but it is better than nothing.  I think we need to do some more
testing either using this extension or some other way for extension
and page locks.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

0001-Test-group-dead-locks.patch
Description: Binary data

0002-Allow-relation-extension-and-page-locks-to-conflict-.patch
Description: Binary data

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-12 Thread Dilip Kumar

On Wed, Mar 11, 2020 at 2:36 PM Amit Kapila  wrote:
>
> On Tue, Mar 10, 2020 at 6:39 PM Robert Haas  wrote:
> >
> > On Fri, Mar 6, 2020 at 11:27 PM Dilip Kumar  wrote:
> > > I think instead of the flag we need to keep the counter because we can
> > > acquire the same relation extension lock multiple times.  So
> > > basically, every time we acquire the lock we can increment the counter
> > > and while releasing we can decrement it.   During an error path, I
> > > think it is fine to set it to 0 in CommitTransaction/AbortTransaction.
> > > But, I am not sure that we can set to 0 or decrement it in
> > > AbortSubTransaction because we are not sure whether we have acquired
> > > the lock under this subtransaction or not.
> >
> > I think that CommitTransaction, AbortTransaction, and friends have
> > *zero* business touching this. I think the counter - or flag - should
> > track whether we've got a PROCLOCK entry for a relation extension
> > lock. We either do, or we do not, and that does not change because of
> > anything have to do with the transaction state. It changes because
> > somebody calls LockRelease() or LockReleaseAll().
> >
>
> Do we want to have a special check in the LockRelease() to identify
> whether we are releasing relation extension lock?  If not, then how we
> will identify that relation extension is released and we can reset it
> during subtransaction abort due to error?  During success paths, we
> know when we have released RelationExtension or Page Lock (via
> UnlockRelationForExtension or UnlockPage).  During the top-level
> transaction end, we know when we have released all the locks, so that
> will imply that RelationExtension and or Page locks must have been
> released by that time.
>
> If we have no other choice, then I see a few downsides of adding a
> special check in the LockRelease() call:
>
> 1. Instead of resetting/decrement the variable from specific APIs like
> UnlockRelationForExtension or UnlockPage, we need to have it in
> LockRelease. It will also look odd, if set variable in
> LockRelationForExtension, but don't reset in the
> UnlockRelationForExtension variant.  Now, maybe we can allow to reset
> it at both places if it is a flag, but not if it is a counter
> variable.
>
> 2. One can argue that adding extra instructions in a generic path
> (like LockRelease) is not a good idea, especially if those are for an
> Assert. I understand this won't add anything which we can measure by
> standard benchmarks.

I have just written a WIP patch for relation extension lock where
instead of incrementing and decrementing the counter in
LockRelationForExtension and UnlockRelationForExtension respectively.
We can just set and reset the flag in LockAcquireExtended and
LockRelease.  So this patch appears simple to me as we are not
involving the transaction APIs to set and reset the flag.  However, we
need to add an extra check as you have already mentioned.  I think we
could measure the performance and see whether it has any impact or
not?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From e49e483646f14a2e626190d5ef98f628668d025c Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Thu, 12 Mar 2020 13:16:45 +0530
Subject: [PATCH v5] WIP-Add assert check for relation extension lock

Add assert to check that we should not acquire any other lock if we are
already holding the relation extension lock.  Only exception is that if
we are trying to acquire the relation extension lock then we can hold the
same lock.
---
 src/backend/storage/lmgr/lock.c | 43 +
 1 file changed, 43 insertions(+)

diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 56dba09..f572dab 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -170,6 +170,15 @@ typedef struct TwoPhaseLockRecord
  */
 static int	FastPathLocalUseCount = 0;
 
+/*
+ * Flag is set if the relation extension lock is currently held by this backend.
+ * We need this flag so that we can ensure that while holding the relation
+ * extension lock we are not trying to acquire any other heavy weight lock.
+ * Basically, that will ensuring that the proc holding relation extension lock
+ * can not wait for any another lock which can lead to a deadlock.
+ */
+static bool	IsRelationExtensionLockHeld = false;
+
 /* Macros for manipulating proc->fpLockBits */
 #define FAST_PATH_BITS_PER_SLOT			3
 #define FAST_PATH_LOCKNUMBER_OFFSET		1
@@ -841,6 +850,14 @@ LockAcquireExtended(const LOCKTAG *locktag,
 	}
 
 	/*
+	 * We should not try to acquire any other heavyweight lock if we are already
+	 * holding the relation extension lock.  If we are trying to hold the same
+	 * relation extension lock then it should have been already granted so we
+	 * will not come here.
+	 */
+	Assert(!IsRelationExtensionLockHeld);
+
+	/*
 	 * Prepare to emit a WAL record if acquisition of this lock needs to be
 	 * replayed in a standby server.
 	 *
@@ -900,6 +917,11

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-11 Thread Dilip Kumar

On Wed, Mar 11, 2020 at 2:23 PM Amit Kapila  wrote:
>
> On Tue, Mar 10, 2020 at 8:53 AM Dilip Kumar  wrote:
> >
> > Please find the updated patch (summary of the changes)
> > - Instead of searching the lock hash table for assert, it maintains a 
> > counter.
> > - Also, handled the case where we can acquire the relation extension
> > lock while holding the relation extension lock on the same relation.
> > - Handled the error case.
> >
> > In addition to that prepared a WIP patch for handling the PageLock.
> > First, I thought that we can use the same counter for the PageLock and
> > the RelationExtensionLock because in assert we just need to check
> > whether we are trying to acquire any other heavyweight lock while
> > holding any of these locks.  But, the exceptional case where we
> > allowed to acquire a relation extension lock while holding any of
> > these locks is a bit different.  Because, if we are holding a relation
> > extension lock then we allowed to acquire the relation extension lock
> > on the same relation but it can not be any other relation otherwise it
> > can create a cycle.  But, the same is not true with the PageLock,
> > i.e. while holding the PageLock you can acquire the relation extension
> > lock on any relation and that will be safe because the relation
> > extension lock guarantee that, it will never create the cycle.
> > However, I agree that we don't have any such cases where we want to
> > acquire a relation extension lock on the different relations while
> > holding the PageLock.
> >
>
> Right, today, we don't have such cases where after acquiring relation
> extension or page lock for a particular relation, we need to acquire
> any of those for other relation and I am not able to offhand think of
> many cases where we might have such a need in the future.  The one
> theoretical possibility is to include fork_num in the lock tag while
> acquiring extension lock for fsm/vm, but that will also have the same
> relation.  Similarly one might say it is valid to acquire extension
> lock in share mode after we have acquired it exclusive mode.  I am not
> sure how much futuristic we want to make these Asserts.
>
> I feel we should cover the current possible cases (which I think will
> make the asserts more strict then required) and if there is a need to
> relax them in the future for any particular use case, then we will
> consider those.  In general, if we consider the way Mahendra has
> written a patch which is to find the entry via the local hash table to
> check for an Assert condition, then it will be a bit easier to extend
> the checks if required in future as that way we have more information
> about the particular lock. However, it will make the check more
> expensive which might be okay considering that it is only for Assert
> enabled builds.
>
> One minor comment:
> /*
> + * We should not acquire any other lock if we are already holding the
> + * relation extension lock.  Only exception is that if we are trying to
> + * acquire the relation extension lock then we can hold the relation
> + * extension on the same relation.
> + */
> + Assert(!IsRelExtLockHeld() ||
> +((locktag->locktag_type == LOCKTAG_RELATION_EXTEND) && found));

I have fixed this in the attached patch set.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


v4-0001-Add-assert-to-check-that-we-should-not-acquire-an.patch
Description: Binary data


v4-0003-Conflict-Extension-Page-lock-in-group-member.patch
Description: Binary data


v4-0002-WIP-Extend-the-patch-for-handling-PageLock.patch
Description: Binary data

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-11 Thread Dilip Kumar

On Tue, Mar 10, 2020 at 4:11 PM Amit Kapila  wrote:
>
> On Mon, Feb 24, 2020 at 3:38 PM Amit Kapila  wrote:
> >
> > On Thu, Feb 20, 2020 at 8:06 AM Andres Freund  wrote:
> > > What I'm advocating is that extension locks should continue to go
> > > through lock.c. And yes, that requires some changes to group locking,
> > > but I still don't see why they'd be complicated.
> > >
> >
> > Fair position, as per initial analysis, I think if we do below three
> > things, it should work out without changing to a new way of locking
> > for relation extension or page type locks.
> > a. As per the discussion above, ensure in code we will never try to
> > acquire another heavy-weight lock after acquiring relation extension
> > or page type locks (probably by having Asserts in code or maybe some
> > other way).
>
> I have done an analysis of the relation extension lock (which can be
> acquired via LockRelationForExtension or
> ConditionalLockRelationForExtension) and found that we don't acquire
> any other heavyweight lock after acquiring it. However, we do
> sometimes try to acquire it again in the places where we update FSM
> after extension, see points (e) and (f) described below.  The usage of
> this lock can be broadly divided into six categories and each one is
> explained as follows:
>
> a. Where after taking the relation extension lock we call ReadBuffer
> (or its variant) and then LockBuffer.  The LockBuffer internally calls
> either LWLock to acquire or release neither of which acquire another
> heavy-weight lock. It is quite obvious as well that while taking some
> lightweight lock, there is no reason to acquire another heavyweight
> lock on any object.  The specs/comments of ReadBufferExtended (which
> gets called from variants of ReadBuffer) API says that if the blknum
> requested is P_NEW, only one backend can call it at-a-time which
> indicates that we don't need to acquire any heavy-weight lock inside
> this API.  Otherwise, also, this API won't need a heavy-weight lock to
> read the existing block into shared buffer as two different backends
> are allowed to read the same block.  I have also gone through all the
> functions called/used in this path to ensure that we don't use
> heavy-weight locks inside it.
>
> The usage by APIs BloomNewBuffer, GinNewBuffer, gistNewBuffer,
> _bt_getbuf, and SpGistNewBuffer falls in this category.  Another API
> that falls under this category is revmap_physical_extend which uses
> ReadBuffer, LocakBuffer and ReleaseBuffer. The ReleaseBuffer API
> unpins aka decrement the reference count for buffer and disassociates
> a buffer from the resource owner.  None of that requires heavy-weight
> lock. T
>
> b. After taking relation extension lock, we call
> RelationGetNumberOfBlocks which primarily calls file-level functions
> to determine the size of the file. This doesn't acquire any other
> heavy-weight lock after relation extension lock.
>
> The usage by APIs ginvacuumcleanup, gistvacuumscan, btvacuumscan, and
> spgvacuumscan falls in this category.
>
> c. There is a usage in API brin_page_cleanup() where we just acquire
> and release the relation extension lock to avoid reinitializing the
> page. As there is no call in-between acquire and release, so there is
> no chance of another heavy-weight lock acquire after having relation
> extension lock.
>
> d. In fsm_extend() and vm_extend(), after acquiring relation extension
> lock, we perform various file-level operations like RelationOpenSmgr,
> smgrexists, smgrcreate, smgrnblocks, smgrextend.  First, from theory,
> we don't have any heavy-weight lock other than relation extension lock
> which can cover such operations and then I have verified it by going
> through these APIs that these don't acquire any other heavy-weight
> lock.  Then these APIs also call PageSetChecksumInplace computes a
> checksum of the page and sets the same in page header which is quite
> straight-forward and doesn't acquire any heavy-weight lock.
>
> In vm_extend, we additionally call CacheInvalidateSmgr to send a
> shared-inval message to force other backends to close any smgr
> references they may have for the relation for which we extending
> visibility map which has no reason to acquire any heavy-weight lock.
> I have checked the code path as well and I didn't find any
> heavy-weight lock call in that.
>
> e. In brin_getinsertbuffer, we call ReadBuffer() and LockBuffer(), the
> usage of which is the same as what is mentioned in (a).  In addition
> to that it calls brin_initialize_empty_new_buffer() which further
> calls RecordPageWithFreeSpace which can again acquire relation
> extension lock for same relation.  This usage is safe because we have
> a mechanism in heavy-weight lock manager that if we already hold a
> lock and a request came for the same lock and in same mode, the lock
> will be granted.
>
> f. In RelationGetBufferForTuple(), there are multiple APIs that get
> called and like (e), it can try to reacquire the relation extension
>

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-11 Thread Dilip Kumar

On Wed, Mar 11, 2020 at 2:23 PM Amit Kapila  wrote:
>
> On Tue, Mar 10, 2020 at 8:53 AM Dilip Kumar  wrote:
> >
> > Please find the updated patch (summary of the changes)
> > - Instead of searching the lock hash table for assert, it maintains a 
> > counter.
> > - Also, handled the case where we can acquire the relation extension
> > lock while holding the relation extension lock on the same relation.
> > - Handled the error case.
> >
> > In addition to that prepared a WIP patch for handling the PageLock.
> > First, I thought that we can use the same counter for the PageLock and
> > the RelationExtensionLock because in assert we just need to check
> > whether we are trying to acquire any other heavyweight lock while
> > holding any of these locks.  But, the exceptional case where we
> > allowed to acquire a relation extension lock while holding any of
> > these locks is a bit different.  Because, if we are holding a relation
> > extension lock then we allowed to acquire the relation extension lock
> > on the same relation but it can not be any other relation otherwise it
> > can create a cycle.  But, the same is not true with the PageLock,
> > i.e. while holding the PageLock you can acquire the relation extension
> > lock on any relation and that will be safe because the relation
> > extension lock guarantee that, it will never create the cycle.
> > However, I agree that we don't have any such cases where we want to
> > acquire a relation extension lock on the different relations while
> > holding the PageLock.
> >
>
> Right, today, we don't have such cases where after acquiring relation
> extension or page lock for a particular relation, we need to acquire
> any of those for other relation and I am not able to offhand think of
> many cases where we might have such a need in the future.  The one
> theoretical possibility is to include fork_num in the lock tag while
> acquiring extension lock for fsm/vm, but that will also have the same
> relation.  Similarly one might say it is valid to acquire extension
> lock in share mode after we have acquired it exclusive mode.  I am not
> sure how much futuristic we want to make these Asserts.
>
> I feel we should cover the current possible cases (which I think will
> make the asserts more strict then required) and if there is a need to
> relax them in the future for any particular use case, then we will
> consider those.  In general, if we consider the way Mahendra has
> written a patch which is to find the entry via the local hash table to
> check for an Assert condition, then it will be a bit easier to extend
> the checks if required in future as that way we have more information
> about the particular lock. However, it will make the check more
> expensive which might be okay considering that it is only for Assert
> enabled builds.
>
> One minor comment:
> /*
> + * We should not acquire any other lock if we are already holding the
> + * relation extension lock.  Only exception is that if we are trying to
> + * acquire the relation extension lock then we can hold the relation
> + * extension on the same relation.
> + */
> + Assert(!IsRelExtLockHeld() ||
> +((locktag->locktag_type == LOCKTAG_RELATION_EXTEND) && found));
>
> I think you don't need the second part of the check because if we have
> found the lock in the local lock table, we would return before this
> check.

Right.

  I think it will catch the case where if we have an extension
> lock on one relation, then it won't allow us to acquire it on another
> relation.

But, those will be caught even if we remove the second part right.
Basically, if we have Assert(!IsRelExtLockHeld(), that means by this
time you should not hold any relation extension lock.  The exceptional
case where we allow relation extension on the same relation will
anyway not reach here.  I think the second part of the Assert is just
useless.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-11 Thread Amit Kapila

On Tue, Mar 10, 2020 at 6:39 PM Robert Haas  wrote:
>
> On Fri, Mar 6, 2020 at 11:27 PM Dilip Kumar  wrote:
> > I think instead of the flag we need to keep the counter because we can
> > acquire the same relation extension lock multiple times.  So
> > basically, every time we acquire the lock we can increment the counter
> > and while releasing we can decrement it.   During an error path, I
> > think it is fine to set it to 0 in CommitTransaction/AbortTransaction.
> > But, I am not sure that we can set to 0 or decrement it in
> > AbortSubTransaction because we are not sure whether we have acquired
> > the lock under this subtransaction or not.
>
> I think that CommitTransaction, AbortTransaction, and friends have
> *zero* business touching this. I think the counter - or flag - should
> track whether we've got a PROCLOCK entry for a relation extension
> lock. We either do, or we do not, and that does not change because of
> anything have to do with the transaction state. It changes because
> somebody calls LockRelease() or LockReleaseAll().
>

Do we want to have a special check in the LockRelease() to identify
whether we are releasing relation extension lock?  If not, then how we
will identify that relation extension is released and we can reset it
during subtransaction abort due to error?  During success paths, we
know when we have released RelationExtension or Page Lock (via
UnlockRelationForExtension or UnlockPage).  During the top-level
transaction end, we know when we have released all the locks, so that
will imply that RelationExtension and or Page locks must have been
released by that time.

If we have no other choice, then I see a few downsides of adding a
special check in the LockRelease() call:

1. Instead of resetting/decrement the variable from specific APIs like
UnlockRelationForExtension or UnlockPage, we need to have it in
LockRelease. It will also look odd, if set variable in
LockRelationForExtension, but don't reset in the
UnlockRelationForExtension variant.  Now, maybe we can allow to reset
it at both places if it is a flag, but not if it is a counter
variable.

2. One can argue that adding extra instructions in a generic path
(like LockRelease) is not a good idea, especially if those are for an
Assert. I understand this won't add anything which we can measure by
standard benchmarks.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-11 Thread Amit Kapila

On Tue, Mar 10, 2020 at 8:53 AM Dilip Kumar  wrote:
>
> Please find the updated patch (summary of the changes)
> - Instead of searching the lock hash table for assert, it maintains a counter.
> - Also, handled the case where we can acquire the relation extension
> lock while holding the relation extension lock on the same relation.
> - Handled the error case.
>
> In addition to that prepared a WIP patch for handling the PageLock.
> First, I thought that we can use the same counter for the PageLock and
> the RelationExtensionLock because in assert we just need to check
> whether we are trying to acquire any other heavyweight lock while
> holding any of these locks.  But, the exceptional case where we
> allowed to acquire a relation extension lock while holding any of
> these locks is a bit different.  Because, if we are holding a relation
> extension lock then we allowed to acquire the relation extension lock
> on the same relation but it can not be any other relation otherwise it
> can create a cycle.  But, the same is not true with the PageLock,
> i.e. while holding the PageLock you can acquire the relation extension
> lock on any relation and that will be safe because the relation
> extension lock guarantee that, it will never create the cycle.
> However, I agree that we don't have any such cases where we want to
> acquire a relation extension lock on the different relations while
> holding the PageLock.
>

Right, today, we don't have such cases where after acquiring relation
extension or page lock for a particular relation, we need to acquire
any of those for other relation and I am not able to offhand think of
many cases where we might have such a need in the future.  The one
theoretical possibility is to include fork_num in the lock tag while
acquiring extension lock for fsm/vm, but that will also have the same
relation.  Similarly one might say it is valid to acquire extension
lock in share mode after we have acquired it exclusive mode.  I am not
sure how much futuristic we want to make these Asserts.

I feel we should cover the current possible cases (which I think will
make the asserts more strict then required) and if there is a need to
relax them in the future for any particular use case, then we will
consider those.  In general, if we consider the way Mahendra has
written a patch which is to find the entry via the local hash table to
check for an Assert condition, then it will be a bit easier to extend
the checks if required in future as that way we have more information
about the particular lock. However, it will make the check more
expensive which might be okay considering that it is only for Assert
enabled builds.

One minor comment:
/*
+ * We should not acquire any other lock if we are already holding the
+ * relation extension lock.  Only exception is that if we are trying to
+ * acquire the relation extension lock then we can hold the relation
+ * extension on the same relation.
+ */
+ Assert(!IsRelExtLockHeld() ||
+((locktag->locktag_type == LOCKTAG_RELATION_EXTEND) && found));

I think you don't need the second part of the check because if we have
found the lock in the local lock table, we would return before this
check.  I think it will catch the case where if we have an extension
lock on one relation, then it won't allow us to acquire it on another
relation. OTOH, it will also not allow cases where backend has
relation extension lock in Exclusive mode and it tries to acquire it
in Shared mode. So, not sure if it is a good idea.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-10 Thread Amit Kapila

On Tue, Mar 10, 2020 at 6:48 PM Robert Haas  wrote:
>
> On Sat, Mar 7, 2020 at 10:23 AM Tom Lane  wrote:
> > I continue to think that we'd be better off getting all of this
> > out of the heavyweight lock manager.  There is no reason why we
> > should need deadlock detection, or multiple holds of the same
> > lock, or pretty much anything that LWLocks don't give you.
>
> Well, that was my initial inclination too, but Andres didn't like it.
> I don't know whether it's better to take his advice or yours.
>
> The one facility that we need here which the heavyweight lock facility
> does provide and the lightweight lock facility does not is the ability
> to take locks on an effectively unlimited number of distinct objects.
> That is, we can't have a separate LWLock for every relation, because
> there ~2^32 relation OIDs per database, and ~2^32 database OIDs, and a
> patch that tried to allocate a tranche of 2^64 LWLocks would probably
> get shot down.
>

I think if we have to follow any LWLock based design, then we also
need to think about a case where if it is already acquired by the
backend (say in X mode), then it should be granted if the same backend
tries to acquire it in same mode (or mode that is compatible with the
mode in which it is already acquired).  As per my analysis above [1],
we do this at multiple places for relation extension lock.

[1] - 
https://www.postgresql.org/message-id/CAA4eK1%2BE8Vu%3D9PYZBZvMrga0Ynz_m6jmT3G_vJv-3L1PWv9Krg%40mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-10 Thread Robert Haas

On Sat, Mar 7, 2020 at 10:23 AM Tom Lane  wrote:
> I continue to think that we'd be better off getting all of this
> out of the heavyweight lock manager.  There is no reason why we
> should need deadlock detection, or multiple holds of the same
> lock, or pretty much anything that LWLocks don't give you.

Well, that was my initial inclination too, but Andres didn't like it.
I don't know whether it's better to take his advice or yours.

The one facility that we need here which the heavyweight lock facility
does provide and the lightweight lock facility does not is the ability
to take locks on an effectively unlimited number of distinct objects.
That is, we can't have a separate LWLock for every relation, because
there ~2^32 relation OIDs per database, and ~2^32 database OIDs, and a
patch that tried to allocate a tranche of 2^64 LWLocks would probably
get shot down.

The patch I wrote for this tried to work around this by having an
array of LWLocks and hashing  pairs onto array slots.
This produces some false sharing, though, which Andres didn't like
(and I can understand his concern). We could work around that problem
with a more complex design, where the LWLocks in the array do not
themselves represent the right to extend the relation, but only
protect the list of lockers. But at that point it starts to look like
you are reinventing the whole LOCK/PROCLOCK division.

So from my point of view we've got three possible approaches here, all
imperfect:

- Hash  pairs onto an array of LWLocks that represent the
right to extend the relation. Problem: false sharing for the whole
time the lock is held.

- Hash  pairs onto an array of LWLocks that protect a list of
lockers. Problem: looks like reinventing LOCK/PROCLOCK mechanism,
which is a fair amount of complexity to be duplicating.

- Adapt the heavyweight lock manager. Problem: Code is old, complex,
grotty, and doesn't need more weird special cases.

Whatever we choose, I think we ought to try to get Page locks and
Relation Extension locks into the same system. They're conceptually
the same kind of thing: you're not locking an SQL object, you
basically want an LWLock, but you can't use an LWLock because you want
to lock an OID not a piece of shared memory, so you can't have enough
LWLocks to use them in the regular way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-10 Thread Robert Haas

On Fri, Mar 6, 2020 at 11:27 PM Dilip Kumar  wrote:
> I think instead of the flag we need to keep the counter because we can
> acquire the same relation extension lock multiple times.  So
> basically, every time we acquire the lock we can increment the counter
> and while releasing we can decrement it.   During an error path, I
> think it is fine to set it to 0 in CommitTransaction/AbortTransaction.
> But, I am not sure that we can set to 0 or decrement it in
> AbortSubTransaction because we are not sure whether we have acquired
> the lock under this subtransaction or not.

I think that CommitTransaction, AbortTransaction, and friends have
*zero* business touching this. I think the counter - or flag - should
track whether we've got a PROCLOCK entry for a relation extension
lock. We either do, or we do not, and that does not change because of
anything have to do with the transaction state. It changes because
somebody calls LockRelease() or LockReleaseAll().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-10 Thread Amit Kapila

On Mon, Feb 24, 2020 at 3:38 PM Amit Kapila  wrote:
>
> On Thu, Feb 20, 2020 at 8:06 AM Andres Freund  wrote:
> > What I'm advocating is that extension locks should continue to go
> > through lock.c. And yes, that requires some changes to group locking,
> > but I still don't see why they'd be complicated.
> >
>
> Fair position, as per initial analysis, I think if we do below three
> things, it should work out without changing to a new way of locking
> for relation extension or page type locks.
> a. As per the discussion above, ensure in code we will never try to
> acquire another heavy-weight lock after acquiring relation extension
> or page type locks (probably by having Asserts in code or maybe some
> other way).

I have done an analysis of the relation extension lock (which can be
acquired via LockRelationForExtension or
ConditionalLockRelationForExtension) and found that we don't acquire
any other heavyweight lock after acquiring it. However, we do
sometimes try to acquire it again in the places where we update FSM
after extension, see points (e) and (f) described below.  The usage of
this lock can be broadly divided into six categories and each one is
explained as follows:

a. Where after taking the relation extension lock we call ReadBuffer
(or its variant) and then LockBuffer.  The LockBuffer internally calls
either LWLock to acquire or release neither of which acquire another
heavy-weight lock. It is quite obvious as well that while taking some
lightweight lock, there is no reason to acquire another heavyweight
lock on any object.  The specs/comments of ReadBufferExtended (which
gets called from variants of ReadBuffer) API says that if the blknum
requested is P_NEW, only one backend can call it at-a-time which
indicates that we don't need to acquire any heavy-weight lock inside
this API.  Otherwise, also, this API won't need a heavy-weight lock to
read the existing block into shared buffer as two different backends
are allowed to read the same block.  I have also gone through all the
functions called/used in this path to ensure that we don't use
heavy-weight locks inside it.

The usage by APIs BloomNewBuffer, GinNewBuffer, gistNewBuffer,
_bt_getbuf, and SpGistNewBuffer falls in this category.  Another API
that falls under this category is revmap_physical_extend which uses
ReadBuffer, LocakBuffer and ReleaseBuffer. The ReleaseBuffer API
unpins aka decrement the reference count for buffer and disassociates
a buffer from the resource owner.  None of that requires heavy-weight
lock. T

b. After taking relation extension lock, we call
RelationGetNumberOfBlocks which primarily calls file-level functions
to determine the size of the file. This doesn't acquire any other
heavy-weight lock after relation extension lock.

The usage by APIs ginvacuumcleanup, gistvacuumscan, btvacuumscan, and
spgvacuumscan falls in this category.

c. There is a usage in API brin_page_cleanup() where we just acquire
and release the relation extension lock to avoid reinitializing the
page. As there is no call in-between acquire and release, so there is
no chance of another heavy-weight lock acquire after having relation
extension lock.

d. In fsm_extend() and vm_extend(), after acquiring relation extension
lock, we perform various file-level operations like RelationOpenSmgr,
smgrexists, smgrcreate, smgrnblocks, smgrextend.  First, from theory,
we don't have any heavy-weight lock other than relation extension lock
which can cover such operations and then I have verified it by going
through these APIs that these don't acquire any other heavy-weight
lock.  Then these APIs also call PageSetChecksumInplace computes a
checksum of the page and sets the same in page header which is quite
straight-forward and doesn't acquire any heavy-weight lock.

In vm_extend, we additionally call CacheInvalidateSmgr to send a
shared-inval message to force other backends to close any smgr
references they may have for the relation for which we extending
visibility map which has no reason to acquire any heavy-weight lock.
I have checked the code path as well and I didn't find any
heavy-weight lock call in that.

e. In brin_getinsertbuffer, we call ReadBuffer() and LockBuffer(), the
usage of which is the same as what is mentioned in (a).  In addition
to that it calls brin_initialize_empty_new_buffer() which further
calls RecordPageWithFreeSpace which can again acquire relation
extension lock for same relation.  This usage is safe because we have
a mechanism in heavy-weight lock manager that if we already hold a
lock and a request came for the same lock and in same mode, the lock
will be granted.

f. In RelationGetBufferForTuple(), there are multiple APIs that get
called and like (e), it can try to reacquire the relation extension
lock in one of those APIs.  The main APIs it calls after acquiring
relation extension lock are described as follows:
   - GetPageWithFreeSpace: This tries to find a page in the given
relation with at least the specified amount of

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-09 Thread Dilip Kumar

On Mon, Mar 9, 2020 at 2:36 PM Amit Kapila  wrote:
>
> On Mon, Mar 9, 2020 at 2:09 PM Masahiko Sawada 
>  wrote:
>>
>> On Mon, 9 Mar 2020 at 15:50, Amit Kapila  wrote:
>> >
>> > On Mon, Mar 9, 2020 at 11:38 AM Masahiko Sawada 
>> >  wrote:
>> >>
>> >> On Mon, 9 Mar 2020 at 14:16, Amit Kapila  wrote:
>> >> >
>> >> > On Sun, Mar 8, 2020 at 7:58 AM Masahiko Sawada 
>> >> >  wrote:
>> >> >> >
>> >> >> > Fair position, as per initial analysis, I think if we do below three
>> >> >> > things, it should work out without changing to a new way of locking
>> >> >> > for relation extension or page type locks.
>> >> >> > a. As per the discussion above, ensure in code we will never try to
>> >> >> > acquire another heavy-weight lock after acquiring relation extension
>> >> >> > or page type locks (probably by having Asserts in code or maybe some
>> >> >> > other way).
>> >> >>
>> >> >> The current patch
>> >> >> (v02_0001-Added-assert-to-verify-that-we-never-try-to-take-any.patch)
>> >> >> doesn't check that acquiring a heavy-weight lock after page type lock,
>> >> >> is that right?
>> >> >
>> >> >
>> >> > No, it should do that.
>> >> >
>> >> >>
>> >> >> There is the path doing that: ginInsertCleanup() holds
>> >> >> a page lock and insert the pending list items, which might hold a
>> >> >> relation extension lock.
>> >> >
>> >> >
>> >> > Right, I could also see that, but do you see any problem with that?  I 
>> >> > agree that Assert should cover this case, but I don't see any 
>> >> > fundamental problem with that.
>> >>
>> >> I think that could be a problem if we change the group locking so that
>> >> it doesn't consider page lock type.
>> >
>> >
>> > I might be missing something, but won't that be a problem only when if 
>> > there is a case where we acquire page lock after acquiring a relation 
>> > extension lock?
>>
>> Yes, you're right.
>>
>> Well I meant that the reason why we need to make Assert should cover
>> page locks case is the same as the reason for extension lock type
>> case. If we change the group locking so that it doesn't consider
>> extension lock and change deadlock so that it doesn't make a wait edge
>> for it, we need to ensure that the same backend doesn't acquire
>> heavy-weight lock after holding relation extension lock. These are
>> already done in the current patch. Similarly, if we did the similar
>> change for page lock in the group locking and deadlock , we need to
>> ensure the same things for page lock.
>
>
> Agreed.
>
>>
>> But ISTM it doesn't necessarily
>> need to support page lock for now because currently we use it only for
>> cleanup pending list of gin index.
>>
>
> I agree, but I think it is better to have a patch for the same even if we 
> want to review/commit that separately.  That will help us to look at how the 
> complete solution looks.

Please find the updated patch (summary of the changes)
- Instead of searching the lock hash table for assert, it maintains a counter.
- Also, handled the case where we can acquire the relation extension
lock while holding the relation extension lock on the same relation.
- Handled the error case.

In addition to that prepared a WIP patch for handling the PageLock.
First, I thought that we can use the same counter for the PageLock and
the RelationExtensionLock because in assert we just need to check
whether we are trying to acquire any other heavyweight lock while
holding any of these locks.  But, the exceptional case where we
allowed to acquire a relation extension lock while holding any of
these locks is a bit different.  Because, if we are holding a relation
extension lock then we allowed to acquire the relation extension lock
on the same relation but it can not be any other relation otherwise it
can create a cycle.  But, the same is not true with the PageLock,
i.e. while holding the PageLock you can acquire the relation extension
lock on any relation and that will be safe because the relation
extension lock guarantee that, it will never create the cycle.
However, I agree that we don't have any such cases where we want to
acquire a relation extension lock on the different relations while
holding the PageLock.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


v3-0001-Add-assert-to-check-that-we-should-not-acquire-an.patch
Description: Binary data


v3-0003-Conflict-Extension-Page-lock-in-group-member.patch
Description: Binary data


v3-0002-WIP-Extend-the-patch-for-handling-PageLock.patch
Description: Binary data

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-09 Thread Amit Kapila

On Mon, Mar 9, 2020 at 2:09 PM Masahiko Sawada <
masahiko.saw...@2ndquadrant.com> wrote:

> On Mon, 9 Mar 2020 at 15:50, Amit Kapila  wrote:
> >
> > On Mon, Mar 9, 2020 at 11:38 AM Masahiko Sawada <
> masahiko.saw...@2ndquadrant.com> wrote:
> >>
> >> On Mon, 9 Mar 2020 at 14:16, Amit Kapila 
> wrote:
> >> >
> >> > On Sun, Mar 8, 2020 at 7:58 AM Masahiko Sawada <
> masahiko.saw...@2ndquadrant.com> wrote:
> >> >> >
> >> >> > Fair position, as per initial analysis, I think if we do below
> three
> >> >> > things, it should work out without changing to a new way of locking
> >> >> > for relation extension or page type locks.
> >> >> > a. As per the discussion above, ensure in code we will never try to
> >> >> > acquire another heavy-weight lock after acquiring relation
> extension
> >> >> > or page type locks (probably by having Asserts in code or maybe
> some
> >> >> > other way).
> >> >>
> >> >> The current patch
> >> >> (v02_0001-Added-assert-to-verify-that-we-never-try-to-take-any.patch)
> >> >> doesn't check that acquiring a heavy-weight lock after page type
> lock,
> >> >> is that right?
> >> >
> >> >
> >> > No, it should do that.
> >> >
> >> >>
> >> >> There is the path doing that: ginInsertCleanup() holds
> >> >> a page lock and insert the pending list items, which might hold a
> >> >> relation extension lock.
> >> >
> >> >
> >> > Right, I could also see that, but do you see any problem with that?
> I agree that Assert should cover this case, but I don't see any fundamental
> problem with that.
> >>
> >> I think that could be a problem if we change the group locking so that
> >> it doesn't consider page lock type.
> >
> >
> > I might be missing something, but won't that be a problem only when if
> there is a case where we acquire page lock after acquiring a relation
> extension lock?
>
> Yes, you're right.
>
> Well I meant that the reason why we need to make Assert should cover
> page locks case is the same as the reason for extension lock type
> case. If we change the group locking so that it doesn't consider
> extension lock and change deadlock so that it doesn't make a wait edge
> for it, we need to ensure that the same backend doesn't acquire
> heavy-weight lock after holding relation extension lock. These are
> already done in the current patch. Similarly, if we did the similar
> change for page lock in the group locking and deadlock , we need to
> ensure the same things for page lock.


Agreed.


> But ISTM it doesn't necessarily
> need to support page lock for now because currently we use it only for
> cleanup pending list of gin index.
>
>
I agree, but I think it is better to have a patch for the same even if we
want to review/commit that separately.  That will help us to look at how
the complete solution looks.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-09 Thread Masahiko Sawada

On Mon, 9 Mar 2020 at 15:50, Amit Kapila  wrote:
>
> On Mon, Mar 9, 2020 at 11:38 AM Masahiko Sawada 
>  wrote:
>>
>> On Mon, 9 Mar 2020 at 14:16, Amit Kapila  wrote:
>> >
>> > On Sun, Mar 8, 2020 at 7:58 AM Masahiko Sawada 
>> >  wrote:
>> >> >
>> >> > Fair position, as per initial analysis, I think if we do below three
>> >> > things, it should work out without changing to a new way of locking
>> >> > for relation extension or page type locks.
>> >> > a. As per the discussion above, ensure in code we will never try to
>> >> > acquire another heavy-weight lock after acquiring relation extension
>> >> > or page type locks (probably by having Asserts in code or maybe some
>> >> > other way).
>> >>
>> >> The current patch
>> >> (v02_0001-Added-assert-to-verify-that-we-never-try-to-take-any.patch)
>> >> doesn't check that acquiring a heavy-weight lock after page type lock,
>> >> is that right?
>> >
>> >
>> > No, it should do that.
>> >
>> >>
>> >> There is the path doing that: ginInsertCleanup() holds
>> >> a page lock and insert the pending list items, which might hold a
>> >> relation extension lock.
>> >
>> >
>> > Right, I could also see that, but do you see any problem with that?  I 
>> > agree that Assert should cover this case, but I don't see any fundamental 
>> > problem with that.
>>
>> I think that could be a problem if we change the group locking so that
>> it doesn't consider page lock type.
>
>
> I might be missing something, but won't that be a problem only when if there 
> is a case where we acquire page lock after acquiring a relation extension 
> lock?

Yes, you're right.

Well I meant that the reason why we need to make Assert should cover
page locks case is the same as the reason for extension lock type
case. If we change the group locking so that it doesn't consider
extension lock and change deadlock so that it doesn't make a wait edge
for it, we need to ensure that the same backend doesn't acquire
heavy-weight lock after holding relation extension lock. These are
already done in the current patch. Similarly, if we did the similar
change for page lock in the group locking and deadlock , we need to
ensure the same things for page lock. But ISTM it doesn't necessarily
need to support page lock for now because currently we use it only for
cleanup pending list of gin index.


Regards,

-- 
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-09 Thread Amit Kapila

On Mon, Mar 9, 2020 at 11:38 AM Masahiko Sawada <
masahiko.saw...@2ndquadrant.com> wrote:

> On Mon, 9 Mar 2020 at 14:16, Amit Kapila  wrote:
> >
> > On Sun, Mar 8, 2020 at 7:58 AM Masahiko Sawada <
> masahiko.saw...@2ndquadrant.com> wrote:
> >> >
> >> > Fair position, as per initial analysis, I think if we do below three
> >> > things, it should work out without changing to a new way of locking
> >> > for relation extension or page type locks.
> >> > a. As per the discussion above, ensure in code we will never try to
> >> > acquire another heavy-weight lock after acquiring relation extension
> >> > or page type locks (probably by having Asserts in code or maybe some
> >> > other way).
> >>
> >> The current patch
> >> (v02_0001-Added-assert-to-verify-that-we-never-try-to-take-any.patch)
> >> doesn't check that acquiring a heavy-weight lock after page type lock,
> >> is that right?
> >
> >
> > No, it should do that.
> >
> >>
> >> There is the path doing that: ginInsertCleanup() holds
> >> a page lock and insert the pending list items, which might hold a
> >> relation extension lock.
> >
> >
> > Right, I could also see that, but do you see any problem with that?  I
> agree that Assert should cover this case, but I don't see any fundamental
> problem with that.
>
> I think that could be a problem if we change the group locking so that
> it doesn't consider page lock type.
>

I might be missing something, but won't that be a problem only when if
there is a case where we acquire page lock after acquiring a relation
extension lock?  Can you please explain the scenario you have in mind which
can create a problem?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-09 Thread Masahiko Sawada

On Mon, 9 Mar 2020 at 14:16, Amit Kapila  wrote:
>
> On Sun, Mar 8, 2020 at 7:58 AM Masahiko Sawada 
>  wrote:
>>
>> On Mon, 24 Feb 2020 at 19:08, Amit Kapila  wrote:
>> >
>> > On Thu, Feb 20, 2020 at 8:06 AM Andres Freund  wrote:
>> > >
>> > > Hi,
>> > >
>> > > On 2020-02-19 11:12:18 +0530, Amit Kapila wrote:
>> > > > I think till we know the real need for changing group locking, going
>> > > > in the direction of what Tom suggested to use an array of LWLocks [1]
>> > > > to address the problems in hand is a good idea.
>> > >
>> > > -many
>> > >
>> > > I think that building yet another locking subsystem is the entirely
>> > > wrong idea - especially when there's imo no convincing architectural
>> > > reasons to do so.
>> > >
>> >
>> > Hmm, AFAIU, it will be done by having an array of LWLocks which we do
>> > at other places as well (like BufferIO locks).  I am not sure if we
>> > can call it as new locking subsystem, but if we decide to continue
>> > using lock.c and change group locking then I think we can do that as
>> > well, see my comments below regarding that.
>> >
>> > >
>> > > > It is not very clear to me that are we thinking to give up on Tom's
>> > > > idea [1] and change group locking even though it is not clear or at
>> > > > least nobody has proposed an idea/patch which requires that?  Or are
>> > > > we thinking that we can do what Tom suggested for relation extension
>> > > > lock and also plan to change group locking for future parallel
>> > > > operations that might require it?
>> > >
>> > > What I'm advocating is that extension locks should continue to go
>> > > through lock.c. And yes, that requires some changes to group locking,
>> > > but I still don't see why they'd be complicated.
>> > >
>> >
>> > Fair position, as per initial analysis, I think if we do below three
>> > things, it should work out without changing to a new way of locking
>> > for relation extension or page type locks.
>> > a. As per the discussion above, ensure in code we will never try to
>> > acquire another heavy-weight lock after acquiring relation extension
>> > or page type locks (probably by having Asserts in code or maybe some
>> > other way).
>>
>> The current patch
>> (v02_0001-Added-assert-to-verify-that-we-never-try-to-take-any.patch)
>> doesn't check that acquiring a heavy-weight lock after page type lock,
>> is that right?
>
>
> No, it should do that.
>
>>
>> There is the path doing that: ginInsertCleanup() holds
>> a page lock and insert the pending list items, which might hold a
>> relation extension lock.
>
>
> Right, I could also see that, but do you see any problem with that?  I agree 
> that Assert should cover this case, but I don't see any fundamental problem 
> with that.

I think that could be a problem if we change the group locking so that
it doesn't consider page lock type.

Regards,

--
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-08 Thread Amit Kapila

On Sun, Mar 8, 2020 at 7:58 AM Masahiko Sawada <
masahiko.saw...@2ndquadrant.com> wrote:

> On Mon, 24 Feb 2020 at 19:08, Amit Kapila  wrote:
> >
> > On Thu, Feb 20, 2020 at 8:06 AM Andres Freund 
> wrote:
> > >
> > > Hi,
> > >
> > > On 2020-02-19 11:12:18 +0530, Amit Kapila wrote:
> > > > I think till we know the real need for changing group locking, going
> > > > in the direction of what Tom suggested to use an array of LWLocks [1]
> > > > to address the problems in hand is a good idea.
> > >
> > > -many
> > >
> > > I think that building yet another locking subsystem is the entirely
> > > wrong idea - especially when there's imo no convincing architectural
> > > reasons to do so.
> > >
> >
> > Hmm, AFAIU, it will be done by having an array of LWLocks which we do
> > at other places as well (like BufferIO locks).  I am not sure if we
> > can call it as new locking subsystem, but if we decide to continue
> > using lock.c and change group locking then I think we can do that as
> > well, see my comments below regarding that.
> >
> > >
> > > > It is not very clear to me that are we thinking to give up on Tom's
> > > > idea [1] and change group locking even though it is not clear or at
> > > > least nobody has proposed an idea/patch which requires that?  Or are
> > > > we thinking that we can do what Tom suggested for relation extension
> > > > lock and also plan to change group locking for future parallel
> > > > operations that might require it?
> > >
> > > What I'm advocating is that extension locks should continue to go
> > > through lock.c. And yes, that requires some changes to group locking,
> > > but I still don't see why they'd be complicated.
> > >
> >
> > Fair position, as per initial analysis, I think if we do below three
> > things, it should work out without changing to a new way of locking
> > for relation extension or page type locks.
> > a. As per the discussion above, ensure in code we will never try to
> > acquire another heavy-weight lock after acquiring relation extension
> > or page type locks (probably by having Asserts in code or maybe some
> > other way).
>
> The current patch
> (v02_0001-Added-assert-to-verify-that-we-never-try-to-take-any.patch)
> doesn't check that acquiring a heavy-weight lock after page type lock,
> is that right?


No, it should do that.


> There is the path doing that: ginInsertCleanup() holds
> a page lock and insert the pending list items, which might hold a
> relation extension lock.
>

Right, I could also see that, but do you see any problem with that?  I
agree that Assert should cover this case, but I don't see any fundamental
problem with that.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-07 Thread Masahiko Sawada

On Mon, 24 Feb 2020 at 19:08, Amit Kapila  wrote:
>
> On Thu, Feb 20, 2020 at 8:06 AM Andres Freund  wrote:
> >
> > Hi,
> >
> > On 2020-02-19 11:12:18 +0530, Amit Kapila wrote:
> > > I think till we know the real need for changing group locking, going
> > > in the direction of what Tom suggested to use an array of LWLocks [1]
> > > to address the problems in hand is a good idea.
> >
> > -many
> >
> > I think that building yet another locking subsystem is the entirely
> > wrong idea - especially when there's imo no convincing architectural
> > reasons to do so.
> >
>
> Hmm, AFAIU, it will be done by having an array of LWLocks which we do
> at other places as well (like BufferIO locks).  I am not sure if we
> can call it as new locking subsystem, but if we decide to continue
> using lock.c and change group locking then I think we can do that as
> well, see my comments below regarding that.
>
> >
> > > It is not very clear to me that are we thinking to give up on Tom's
> > > idea [1] and change group locking even though it is not clear or at
> > > least nobody has proposed an idea/patch which requires that?  Or are
> > > we thinking that we can do what Tom suggested for relation extension
> > > lock and also plan to change group locking for future parallel
> > > operations that might require it?
> >
> > What I'm advocating is that extension locks should continue to go
> > through lock.c. And yes, that requires some changes to group locking,
> > but I still don't see why they'd be complicated.
> >
>
> Fair position, as per initial analysis, I think if we do below three
> things, it should work out without changing to a new way of locking
> for relation extension or page type locks.
> a. As per the discussion above, ensure in code we will never try to
> acquire another heavy-weight lock after acquiring relation extension
> or page type locks (probably by having Asserts in code or maybe some
> other way).

The current patch
(v02_0001-Added-assert-to-verify-that-we-never-try-to-take-any.patch)
doesn't check that acquiring a heavy-weight lock after page type lock,
is that right? There is the path doing that: ginInsertCleanup() holds
a page lock and insert the pending list items, which might hold a
relation extension lock.

Regards,

-- 
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-07 Thread Dilip Kumar

On Sat, Mar 7, 2020 at 8:53 PM Tom Lane  wrote:
>
> Dilip Kumar  writes:
> > I think instead of the flag we need to keep the counter because we can
> > acquire the same relation extension lock multiple times.
>
> Uh ... what?  How would that not be broken usage on its face?

Basically,  if we can ensure that while holding the relation extension
lock we will not wait for any other lock then we can ignore in the
deadlock detection path so that we don't detect the false deadlock due
to the group locking mechanism.  So if we are already holding the
relation extension lock and trying to acquire the same lock-in same
mode then it can never wait so this is safe.

> I continue to think that we'd be better off getting all of this
> out of the heavyweight lock manager.  There is no reason why we
> should need deadlock detection, or multiple holds of the same
> lock, or pretty much anything that LWLocks don't give you.

Right, we never need deadlock detection for this lock.  But, I think
there are quite a few cases where we have multiple holds at the same
time.  e.g, during RelationAddExtraBlocks, while holding the relation
extension lock we try to update the block in FSM and FSM might need to
add extra FSM block which will again try to acquire the same lock.

But, I think the main reason for not converting it to an LWLocks is
because Andres has a concern about inventing new lock mechanism as
discuss upthread[1]

[1] 
https://www.postgresql.org/message-id/20200220023612.c44ggploywxtlvmx%40alap3.anarazel.de

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-07 Thread Tom Lane

Dilip Kumar  writes:
> I think instead of the flag we need to keep the counter because we can
> acquire the same relation extension lock multiple times.

Uh ... what?  How would that not be broken usage on its face?

I continue to think that we'd be better off getting all of this
out of the heavyweight lock manager.  There is no reason why we
should need deadlock detection, or multiple holds of the same
lock, or pretty much anything that LWLocks don't give you.

regards, tom lane

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-07 Thread Dilip Kumar

On Sat, Mar 7, 2020 at 3:26 PM Amit Kapila  wrote:
>
> On Sat, Mar 7, 2020 at 11:14 AM Amit Kapila  wrote:
>>
>> On Sat, Mar 7, 2020 at 9:57 AM Dilip Kumar  wrote:
>>>
>>> On Fri, Mar 6, 2020 at 9:47 AM Amit Kapila  wrote:
>>> >
>>> > On Thu, Mar 5, 2020 at 1:54 PM Dilip Kumar  wrote:
>>> > >
>>> > > On Thu, Mar 5, 2020 at 12:15 PM Amit Kapila  
>>> > > wrote:
>>> > > >
>>> > > >
>>> > > > 5. I have also tried to think of another way to check if we already
>>> > > > hold lock type LOCKTAG_RELATION_EXTEND, but couldn't come up with a
>>> > > > cheaper way than this.  Basically, I think if we traverse the
>>> > > > MyProc->myProcLocks queue, we will get this information, but that
>>> > > > doesn't seem much cheaper than this.
>>> > >
>>> > > I think we can maintain a flag (rel_extlock_held).  And, we can set
>>> > > that true in LockRelationForExtension,
>>> > > ConditionalLockRelationForExtension functions and we can reset it in
>>> > > UnlockRelationForExtension or in the error path e.g. LockReleaseAll.
>>> > >
>>> >
>>> > I think if we reset it in LockReleaseAll during the error path, then
>>> > we need to find a way to reset it during LockReleaseCurrentOwner as
>>> > that is called during Subtransaction Abort which can be tricky as we
>>> > don't know if it belongs to the current owner.  How about resetting in
>>> > Abort(Sub)Transaction and CommitTransaction after we release locks via
>>> > ResourceOwnerRelease.
>>>
>>> I think instead of the flag we need to keep the counter because we can
>>> acquire the same relation extension lock multiple times.  So
>>> basically, every time we acquire the lock we can increment the counter
>>> and while releasing we can decrement it.   During an error path, I
>>> think it is fine to set it to 0 in CommitTransaction/AbortTransaction.
>>> But, I am not sure that we can set to 0 or decrement it in
>>> AbortSubTransaction because we are not sure whether we have acquired
>>> the lock under this subtransaction or not.
>>>
>>> Having said that,  I think there should not be any case that we are
>>> starting the sub-transaction while holding the relation extension
>>> lock.
>>
>>
>> Right, this is exactly the point.  I think we can mention this in comments 
>> to make it clear why setting it to zero is fine during subtransaction abort.
>
>
> Is there anything wrong with having an Assert during subtransaction start to 
> indicate that we don't have a relation extension lock?

Yes, I was planning to do that.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-07 Thread Amit Kapila

On Sat, Mar 7, 2020 at 11:14 AM Amit Kapila  wrote:

> On Sat, Mar 7, 2020 at 9:57 AM Dilip Kumar  wrote:
>
>> On Fri, Mar 6, 2020 at 9:47 AM Amit Kapila 
>> wrote:
>> >
>> > On Thu, Mar 5, 2020 at 1:54 PM Dilip Kumar 
>> wrote:
>> > >
>> > > On Thu, Mar 5, 2020 at 12:15 PM Amit Kapila 
>> wrote:
>> > > >
>> > > >
>> > > > 5. I have also tried to think of another way to check if we already
>> > > > hold lock type LOCKTAG_RELATION_EXTEND, but couldn't come up with a
>> > > > cheaper way than this.  Basically, I think if we traverse the
>> > > > MyProc->myProcLocks queue, we will get this information, but that
>> > > > doesn't seem much cheaper than this.
>> > >
>> > > I think we can maintain a flag (rel_extlock_held).  And, we can set
>> > > that true in LockRelationForExtension,
>> > > ConditionalLockRelationForExtension functions and we can reset it in
>> > > UnlockRelationForExtension or in the error path e.g. LockReleaseAll.
>> > >
>> >
>> > I think if we reset it in LockReleaseAll during the error path, then
>> > we need to find a way to reset it during LockReleaseCurrentOwner as
>> > that is called during Subtransaction Abort which can be tricky as we
>> > don't know if it belongs to the current owner.  How about resetting in
>> > Abort(Sub)Transaction and CommitTransaction after we release locks via
>> > ResourceOwnerRelease.
>>
>> I think instead of the flag we need to keep the counter because we can
>> acquire the same relation extension lock multiple times.  So
>> basically, every time we acquire the lock we can increment the counter
>> and while releasing we can decrement it.   During an error path, I
>> think it is fine to set it to 0 in CommitTransaction/AbortTransaction.
>> But, I am not sure that we can set to 0 or decrement it in
>> AbortSubTransaction because we are not sure whether we have acquired
>> the lock under this subtransaction or not.
>>
>> Having said that,  I think there should not be any case that we are
>> starting the sub-transaction while holding the relation extension
>> lock.
>>
>
> Right, this is exactly the point.  I think we can mention this in comments
> to make it clear why setting it to zero is fine during subtransaction
> abort.
>

Is there anything wrong with having an Assert during subtransaction start
to indicate that we don't have a relation extension lock?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-06 Thread Amit Kapila

On Sat, Mar 7, 2020 at 9:57 AM Dilip Kumar  wrote:

> On Fri, Mar 6, 2020 at 9:47 AM Amit Kapila 
> wrote:
> >
> > On Thu, Mar 5, 2020 at 1:54 PM Dilip Kumar 
> wrote:
> > >
> > > On Thu, Mar 5, 2020 at 12:15 PM Amit Kapila 
> wrote:
> > > >
> > > >
> > > > 5. I have also tried to think of another way to check if we already
> > > > hold lock type LOCKTAG_RELATION_EXTEND, but couldn't come up with a
> > > > cheaper way than this.  Basically, I think if we traverse the
> > > > MyProc->myProcLocks queue, we will get this information, but that
> > > > doesn't seem much cheaper than this.
> > >
> > > I think we can maintain a flag (rel_extlock_held).  And, we can set
> > > that true in LockRelationForExtension,
> > > ConditionalLockRelationForExtension functions and we can reset it in
> > > UnlockRelationForExtension or in the error path e.g. LockReleaseAll.
> > >
> >
> > I think if we reset it in LockReleaseAll during the error path, then
> > we need to find a way to reset it during LockReleaseCurrentOwner as
> > that is called during Subtransaction Abort which can be tricky as we
> > don't know if it belongs to the current owner.  How about resetting in
> > Abort(Sub)Transaction and CommitTransaction after we release locks via
> > ResourceOwnerRelease.
>
> I think instead of the flag we need to keep the counter because we can
> acquire the same relation extension lock multiple times.  So
> basically, every time we acquire the lock we can increment the counter
> and while releasing we can decrement it.   During an error path, I
> think it is fine to set it to 0 in CommitTransaction/AbortTransaction.
> But, I am not sure that we can set to 0 or decrement it in
> AbortSubTransaction because we are not sure whether we have acquired
> the lock under this subtransaction or not.
>
> Having said that,  I think there should not be any case that we are
> starting the sub-transaction while holding the relation extension
> lock.
>

Right, this is exactly the point.  I think we can mention this in comments
to make it clear why setting it to zero is fine during subtransaction
abort.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-06 Thread Dilip Kumar

On Fri, Mar 6, 2020 at 9:47 AM Amit Kapila  wrote:
>
> On Thu, Mar 5, 2020 at 1:54 PM Dilip Kumar  wrote:
> >
> > On Thu, Mar 5, 2020 at 12:15 PM Amit Kapila  wrote:
> > >
> > >
> > > 5. I have also tried to think of another way to check if we already
> > > hold lock type LOCKTAG_RELATION_EXTEND, but couldn't come up with a
> > > cheaper way than this.  Basically, I think if we traverse the
> > > MyProc->myProcLocks queue, we will get this information, but that
> > > doesn't seem much cheaper than this.
> >
> > I think we can maintain a flag (rel_extlock_held).  And, we can set
> > that true in LockRelationForExtension,
> > ConditionalLockRelationForExtension functions and we can reset it in
> > UnlockRelationForExtension or in the error path e.g. LockReleaseAll.
> >
>
> I think if we reset it in LockReleaseAll during the error path, then
> we need to find a way to reset it during LockReleaseCurrentOwner as
> that is called during Subtransaction Abort which can be tricky as we
> don't know if it belongs to the current owner.  How about resetting in
> Abort(Sub)Transaction and CommitTransaction after we release locks via
> ResourceOwnerRelease.

I think instead of the flag we need to keep the counter because we can
acquire the same relation extension lock multiple times.  So
basically, every time we acquire the lock we can increment the counter
and while releasing we can decrement it.   During an error path, I
think it is fine to set it to 0 in CommitTransaction/AbortTransaction.
But, I am not sure that we can set to 0 or decrement it in
AbortSubTransaction because we are not sure whether we have acquired
the lock under this subtransaction or not.

Having said that,  I think there should not be any case that we are
starting the sub-transaction while holding the relation extension
lock.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-06 Thread Robert Haas

On Thu, Mar 5, 2020 at 11:45 PM Amit Kapila  wrote:
> I think we can keep such a flag in TopTransactionState.   We free such
> locks after the work is done (except during error where we free them
> at transaction abort) rather than at transaction commit, so one might
> say it is better not to associate with transaction state, but not sure
> if there is other better place.  Do you have any suggestions?

I assumed it would be a global variable in lock.c.  lock.c has got to
know when any lock is required or released, so I don't know why we
need to involve xact.c in the bookkeeping.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-05 Thread Amit Kapila

On Fri, Mar 6, 2020 at 2:19 AM Robert Haas  wrote:
>
> On Thu, Mar 5, 2020 at 2:18 PM Mahendra Singh Thalor  
> wrote:
> > Here, attaching new patch set for review.
>
> I was kind of assuming that the way this would work is that it would
> set a flag or increment a counter or something when we acquire a
> relation extension lock, and then reverse the process when we release
> it. Then the Assert could just check the flag. Walking the whole
> LOCALLOCK table is expensive.
>

I think we can keep such a flag in TopTransactionState.   We free such
locks after the work is done (except during error where we free them
at transaction abort) rather than at transaction commit, so one might
say it is better not to associate with transaction state, but not sure
if there is other better place.  Do you have any suggestions?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-05 Thread Amit Kapila

On Thu, Mar 5, 2020 at 1:54 PM Dilip Kumar  wrote:
>
> On Thu, Mar 5, 2020 at 12:15 PM Amit Kapila  wrote:
> >
> >
> > 5. I have also tried to think of another way to check if we already
> > hold lock type LOCKTAG_RELATION_EXTEND, but couldn't come up with a
> > cheaper way than this.  Basically, I think if we traverse the
> > MyProc->myProcLocks queue, we will get this information, but that
> > doesn't seem much cheaper than this.
>
> I think we can maintain a flag (rel_extlock_held).  And, we can set
> that true in LockRelationForExtension,
> ConditionalLockRelationForExtension functions and we can reset it in
> UnlockRelationForExtension or in the error path e.g. LockReleaseAll.
>

I think if we reset it in LockReleaseAll during the error path, then
we need to find a way to reset it during LockReleaseCurrentOwner as
that is called during Subtransaction Abort which can be tricky as we
don't know if it belongs to the current owner.  How about resetting in
Abort(Sub)Transaction and CommitTransaction after we release locks via
ResourceOwnerRelease.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-05 Thread Robert Haas

On Thu, Mar 5, 2020 at 2:18 PM Mahendra Singh Thalor  wrote:
> Here, attaching new patch set for review.

I was kind of assuming that the way this would work is that it would
set a flag or increment a counter or something when we acquire a
relation extension lock, and then reverse the process when we release
it. Then the Assert could just check the flag. Walking the whole
LOCALLOCK table is expensive.

Also, spelling counts. This patch features "extention" multiple times,
plus also "hask," "beloging," "belog," and "whle", which is an awful
lot of typos for a 70-line patch. If you are using macOS, try opening
the patch in TextEdit. If you are inventing a new function name, spell
the words you include the same way they are spelled elsewhere.

Even aside from the typo, AssertAnyExtentionLockHeadByMe() is not a
very good function name. It sounds like it's asserting that we hold an
extension lock, rather than that we don't, and also, that's not
exactly what it checks anyway, because there's this special case for
when we're acquiring a relation extension lock we already hold.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-05 Thread Mahendra Singh Thalor

On Wed, 4 Mar 2020 at 12:03, Dilip Kumar  wrote:
>
> On Wed, Mar 4, 2020 at 11:45 AM Mahendra Singh Thalor
>  wrote:
> >
> > On Mon, 24 Feb 2020 at 15:39, Amit Kapila  wrote:
> > >
> > > On Thu, Feb 20, 2020 at 8:06 AM Andres Freund  wrote:
> > > >
> > > > Hi,
> > > >
> > > > On 2020-02-19 11:12:18 +0530, Amit Kapila wrote:
> > > > > I think till we know the real need for changing group locking, going
> > > > > in the direction of what Tom suggested to use an array of LWLocks [1]
> > > > > to address the problems in hand is a good idea.
> > > >
> > > > -many
> > > >
> > > > I think that building yet another locking subsystem is the entirely
> > > > wrong idea - especially when there's imo no convincing architectural
> > > > reasons to do so.
> > > >
> > >
> > > Hmm, AFAIU, it will be done by having an array of LWLocks which we do
> > > at other places as well (like BufferIO locks).  I am not sure if we
> > > can call it as new locking subsystem, but if we decide to continue
> > > using lock.c and change group locking then I think we can do that as
> > > well, see my comments below regarding that.
> > >
> > > >
> > > > > It is not very clear to me that are we thinking to give up on Tom's
> > > > > idea [1] and change group locking even though it is not clear or at
> > > > > least nobody has proposed an idea/patch which requires that?  Or are
> > > > > we thinking that we can do what Tom suggested for relation extension
> > > > > lock and also plan to change group locking for future parallel
> > > > > operations that might require it?
> > > >
> > > > What I'm advocating is that extension locks should continue to go
> > > > through lock.c. And yes, that requires some changes to group locking,
> > > > but I still don't see why they'd be complicated.
> > > >
> > >
> > > Fair position, as per initial analysis, I think if we do below three
> > > things, it should work out without changing to a new way of locking
> > > for relation extension or page type locks.
> > > a. As per the discussion above, ensure in code we will never try to
> > > acquire another heavy-weight lock after acquiring relation extension
> > > or page type locks (probably by having Asserts in code or maybe some
> > > other way).
> > > b. Change lock.c so that group locking is not considered for these two
> > > lock types. For ex. in LockCheckConflicts, along with the check (if
> > > (proclock->groupLeader == MyProc && MyProc->lockGroupLeader == NULL)),
> > > we also check lock->tag and call it a conflict for these two locks.
> > > c. The deadlock detector can ignore checking these two types of locks
> > > because point (a) ensures that those won't lead to deadlock.  One idea
> > > could be that FindLockCycleRecurseMember just ignores these two types
> > > of locks by checking the lock tag.
> >
> > Thanks Amit for summary.
> >
> > Based on above 3 points, here attaching 2 patches for review.
> >
> > 1. v01_0001-Conflict-EXTENTION-lock-in-group-member.patch (Patch by Dilip 
> > Kumar)
> > Basically this patch is for point b and c.
> >
> > 2. v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any.patch
> > (Patch by me)
> > This patch is for point a.
> >
> > After applying both the patches, make check-world is passing.
> >
> > We are testing both the patches and will post results.
> >
> > Thoughts?
>
> +static void AssertAnyExtentionLockHeadByMe(void);
>
> +/*
> + * AssertAnyExtentionLockHeadByMe -- test whether any EXTENSION lock held by
> + * this backend.  If any EXTENSION lock is hold by this backend, then assert
> + * will fail.  To use this function, assert should be enabled.
> + */
> +void AssertAnyExtentionLockHeadByMe()
> +{
>
> Some minor observations on 0002.
> 1. static is missing in a function definition.
> 2. Function name should start in new line after function return type
> in function definition, as per pg guideline.
> +void AssertAnyExtentionLockHeadByMe()
> ->
> void
> AssertAnyExtentionLockHeadByMe()

Thanks Dilip for review.

I have fixed above 2 points in v2 patch set.

-- 
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-05 Thread Mahendra Singh Thalor

On Thu, 5 Mar 2020 at 13:54, Dilip Kumar  wrote:
>
> On Thu, Mar 5, 2020 at 12:15 PM Amit Kapila  wrote:
> >
> > On Wed, Mar 4, 2020 at 11:45 AM Mahendra Singh Thalor
> >  wrote:
> > >
> > > On Mon, 24 Feb 2020 at 15:39, Amit Kapila  wrote:
> > > >
> > > > On Thu, Feb 20, 2020 at 8:06 AM Andres Freund  
> > > > wrote:
> > > > >
> > > > > What I'm advocating is that extension locks should continue to go
> > > > > through lock.c. And yes, that requires some changes to group locking,
> > > > > but I still don't see why they'd be complicated.
> > > > >
> > > >
> > > > Fair position, as per initial analysis, I think if we do below three
> > > > things, it should work out without changing to a new way of locking
> > > > for relation extension or page type locks.
> > > > a. As per the discussion above, ensure in code we will never try to
> > > > acquire another heavy-weight lock after acquiring relation extension
> > > > or page type locks (probably by having Asserts in code or maybe some
> > > > other way).
> > > > b. Change lock.c so that group locking is not considered for these two
> > > > lock types. For ex. in LockCheckConflicts, along with the check (if
> > > > (proclock->groupLeader == MyProc && MyProc->lockGroupLeader == NULL)),
> > > > we also check lock->tag and call it a conflict for these two locks.
> > > > c. The deadlock detector can ignore checking these two types of locks
> > > > because point (a) ensures that those won't lead to deadlock.  One idea
> > > > could be that FindLockCycleRecurseMember just ignores these two types
> > > > of locks by checking the lock tag.
> > >
> > > Thanks Amit for summary.
> > >
> > > Based on above 3 points, here attaching 2 patches for review.
> > >
> > > 1. v01_0001-Conflict-EXTENTION-lock-in-group-member.patch (Patch by Dilip 
> > > Kumar)
> > > Basically this patch is for point b and c.
> > >
> > > 2. v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any.patch
> > > (Patch by me)
> > > This patch is for point a.
> > >
> > > After applying both the patches, make check-world is passing.
> > >
> > > We are testing both the patches and will post results.
> > >
> >

Thanks Amit and Dilip for reviewing the patches.

> > I think we need to do detailed code review in the places where we are
> > taking Relation Extension Lock and see whether we are acquiring
> > another heavy-weight lock after that. It seems to me that in
> > brin_getinsertbuffer, after acquiring Relation Extension Lock, we
> > might again try to acquire the same lock.  See
> > brin_initialize_empty_new_buffer which is called after acquiring
> > Relation Extension Lock, in that function, we call
> > RecordPageWithFreeSpace and that can again try to acquire the same
> > lock if it needs to perform fsm_extend.  I think there will be similar
> > instances in the code.  I think it is fine if we again try to acquire
> > it, but the current assertion in your patch needs to be adjusted for
> > that.

I agree with you.  Dilip is doing code review and he will post
results.  As you mentioned that while holing Relation Extension Lock,
we might again try to acquire same Relation Extension Lock, so to
handle this in assert I did some changes in patch and attaching patch
for review. (I will test this scenario)

> >
> > Few other minor comments on
> > v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any:
> > 1. Ideally, this should be the first patch as we first need to ensure
> > that we don't take any heavy-weight locks after acquiring a relation
> > extension lock.

Fixed.

> > 2. I think it is better to add an Assert after initial error checks
> > (after RecoveryInProgress().. check)

I am not getting your points. Can you explain me, that which type of
assert you are suggesting?

> > 3.
> > + Assert (locallock->tag.lock.locktag_type != LOCKTAG_RELATION_EXTEND ||
> > + locallock->nLocks == 0);
> >
> > I think it is not possible that we have an entry in
> > LockMethodLocalHash and its value is zero.  Do you see any such
> > possibility, if not, then we might want to remove it?

Yes, this condition is not needed. Fixed.

> >
> > 4. We already have a macro for LOCALLOCK_LOCKMETHOD, can we write
> > another one tag type?  This will make the check look a bit cleaner and
> > probably if we need to extend it in future for Page type locks, then
> > also it will be good.

Good point. I added macros in this version.

Here, attaching new patch set for review.

Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com


v02_0001-Added-assert-to-verify-that-we-never-try-to-take-any.patch
Description: Binary data


v02_0002-Conflict-EXTENTION-lock-in-group-member.patch
Description: Binary data

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-05 Thread Dilip Kumar

On Thu, Mar 5, 2020 at 12:15 PM Amit Kapila  wrote:
>
> On Wed, Mar 4, 2020 at 11:45 AM Mahendra Singh Thalor
>  wrote:
> >
> > On Mon, 24 Feb 2020 at 15:39, Amit Kapila  wrote:
> > >
> > > On Thu, Feb 20, 2020 at 8:06 AM Andres Freund  wrote:
> > > >
> > > > What I'm advocating is that extension locks should continue to go
> > > > through lock.c. And yes, that requires some changes to group locking,
> > > > but I still don't see why they'd be complicated.
> > > >
> > >
> > > Fair position, as per initial analysis, I think if we do below three
> > > things, it should work out without changing to a new way of locking
> > > for relation extension or page type locks.
> > > a. As per the discussion above, ensure in code we will never try to
> > > acquire another heavy-weight lock after acquiring relation extension
> > > or page type locks (probably by having Asserts in code or maybe some
> > > other way).
> > > b. Change lock.c so that group locking is not considered for these two
> > > lock types. For ex. in LockCheckConflicts, along with the check (if
> > > (proclock->groupLeader == MyProc && MyProc->lockGroupLeader == NULL)),
> > > we also check lock->tag and call it a conflict for these two locks.
> > > c. The deadlock detector can ignore checking these two types of locks
> > > because point (a) ensures that those won't lead to deadlock.  One idea
> > > could be that FindLockCycleRecurseMember just ignores these two types
> > > of locks by checking the lock tag.
> >
> > Thanks Amit for summary.
> >
> > Based on above 3 points, here attaching 2 patches for review.
> >
> > 1. v01_0001-Conflict-EXTENTION-lock-in-group-member.patch (Patch by Dilip 
> > Kumar)
> > Basically this patch is for point b and c.
> >
> > 2. v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any.patch
> > (Patch by me)
> > This patch is for point a.
> >
> > After applying both the patches, make check-world is passing.
> >
> > We are testing both the patches and will post results.
> >
>
> I think we need to do detailed code review in the places where we are
> taking Relation Extension Lock and see whether we are acquiring
> another heavy-weight lock after that. It seems to me that in
> brin_getinsertbuffer, after acquiring Relation Extension Lock, we
> might again try to acquire the same lock.  See
> brin_initialize_empty_new_buffer which is called after acquiring
> Relation Extension Lock, in that function, we call
> RecordPageWithFreeSpace and that can again try to acquire the same
> lock if it needs to perform fsm_extend.  I think there will be similar
> instances in the code.  I think it is fine if we again try to acquire
> it, but the current assertion in your patch needs to be adjusted for
> that.
>
> Few other minor comments on
> v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any:
> 1. Ideally, this should be the first patch as we first need to ensure
> that we don't take any heavy-weight locks after acquiring a relation
> extension lock.
>
> 2. I think it is better to add an Assert after initial error checks
> (after RecoveryInProgress().. check)
>
> 3.
> + Assert (locallock->tag.lock.locktag_type != LOCKTAG_RELATION_EXTEND ||
> + locallock->nLocks == 0);
>
> I think it is not possible that we have an entry in
> LockMethodLocalHash and its value is zero.  Do you see any such
> possibility, if not, then we might want to remove it?
>
> 4. We already have a macro for LOCALLOCK_LOCKMETHOD, can we write
> another one tag type?  This will make the check look a bit cleaner and
> probably if we need to extend it in future for Page type locks, then
> also it will be good.
>
> 5. I have also tried to think of another way to check if we already
> hold lock type LOCKTAG_RELATION_EXTEND, but couldn't come up with a
> cheaper way than this.  Basically, I think if we traverse the
> MyProc->myProcLocks queue, we will get this information, but that
> doesn't seem much cheaper than this.

I think we can maintain a flag (rel_extlock_held).  And, we can set
that true in LockRelationForExtension,
ConditionalLockRelationForExtension functions and we can reset it in
UnlockRelationForExtension or in the error path e.g. LockReleaseAll.
I think, this way we will be able to elog and this will be much
cheaper.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-04 Thread Amit Kapila

On Wed, Mar 4, 2020 at 11:45 AM Mahendra Singh Thalor
 wrote:
>
> On Mon, 24 Feb 2020 at 15:39, Amit Kapila  wrote:
> >
> > On Thu, Feb 20, 2020 at 8:06 AM Andres Freund  wrote:
> > >
> > > What I'm advocating is that extension locks should continue to go
> > > through lock.c. And yes, that requires some changes to group locking,
> > > but I still don't see why they'd be complicated.
> > >
> >
> > Fair position, as per initial analysis, I think if we do below three
> > things, it should work out without changing to a new way of locking
> > for relation extension or page type locks.
> > a. As per the discussion above, ensure in code we will never try to
> > acquire another heavy-weight lock after acquiring relation extension
> > or page type locks (probably by having Asserts in code or maybe some
> > other way).
> > b. Change lock.c so that group locking is not considered for these two
> > lock types. For ex. in LockCheckConflicts, along with the check (if
> > (proclock->groupLeader == MyProc && MyProc->lockGroupLeader == NULL)),
> > we also check lock->tag and call it a conflict for these two locks.
> > c. The deadlock detector can ignore checking these two types of locks
> > because point (a) ensures that those won't lead to deadlock.  One idea
> > could be that FindLockCycleRecurseMember just ignores these two types
> > of locks by checking the lock tag.
>
> Thanks Amit for summary.
>
> Based on above 3 points, here attaching 2 patches for review.
>
> 1. v01_0001-Conflict-EXTENTION-lock-in-group-member.patch (Patch by Dilip 
> Kumar)
> Basically this patch is for point b and c.
>
> 2. v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any.patch
> (Patch by me)
> This patch is for point a.
>
> After applying both the patches, make check-world is passing.
>
> We are testing both the patches and will post results.
>

I think we need to do detailed code review in the places where we are
taking Relation Extension Lock and see whether we are acquiring
another heavy-weight lock after that. It seems to me that in
brin_getinsertbuffer, after acquiring Relation Extension Lock, we
might again try to acquire the same lock.  See
brin_initialize_empty_new_buffer which is called after acquiring
Relation Extension Lock, in that function, we call
RecordPageWithFreeSpace and that can again try to acquire the same
lock if it needs to perform fsm_extend.  I think there will be similar
instances in the code.  I think it is fine if we again try to acquire
it, but the current assertion in your patch needs to be adjusted for
that.

Few other minor comments on
v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any:
1. Ideally, this should be the first patch as we first need to ensure
that we don't take any heavy-weight locks after acquiring a relation
extension lock.

2. I think it is better to add an Assert after initial error checks
(after RecoveryInProgress().. check)

3.
+ Assert (locallock->tag.lock.locktag_type != LOCKTAG_RELATION_EXTEND ||
+ locallock->nLocks == 0);

I think it is not possible that we have an entry in
LockMethodLocalHash and its value is zero.  Do you see any such
possibility, if not, then we might want to remove it?

4. We already have a macro for LOCALLOCK_LOCKMETHOD, can we write
another one tag type?  This will make the check look a bit cleaner and
probably if we need to extend it in future for Page type locks, then
also it will be good.

5. I have also tried to think of another way to check if we already
hold lock type LOCKTAG_RELATION_EXTEND, but couldn't come up with a
cheaper way than this.  Basically, I think if we traverse the
MyProc->myProcLocks queue, we will get this information, but that
doesn't seem much cheaper than this.

6. Another thing that could be possible is to make this a test and
elog so that it can hit in production scenarios, but I think the cost
of that will be high unless we have a very simple way to write this
test condition.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-04 Thread Mahendra Singh Thalor

On Wed, 4 Mar 2020 at 12:03, Dilip Kumar  wrote:
>
> On Wed, Mar 4, 2020 at 11:45 AM Mahendra Singh Thalor
>  wrote:
> >
> > On Mon, 24 Feb 2020 at 15:39, Amit Kapila  wrote:
> > >
> > > On Thu, Feb 20, 2020 at 8:06 AM Andres Freund  wrote:
> > > >
> > > > Hi,
> > > >
> > > > On 2020-02-19 11:12:18 +0530, Amit Kapila wrote:
> > > > > I think till we know the real need for changing group locking, going
> > > > > in the direction of what Tom suggested to use an array of LWLocks [1]
> > > > > to address the problems in hand is a good idea.
> > > >
> > > > -many
> > > >
> > > > I think that building yet another locking subsystem is the entirely
> > > > wrong idea - especially when there's imo no convincing architectural
> > > > reasons to do so.
> > > >
> > >
> > > Hmm, AFAIU, it will be done by having an array of LWLocks which we do
> > > at other places as well (like BufferIO locks).  I am not sure if we
> > > can call it as new locking subsystem, but if we decide to continue
> > > using lock.c and change group locking then I think we can do that as
> > > well, see my comments below regarding that.
> > >
> > > >
> > > > > It is not very clear to me that are we thinking to give up on Tom's
> > > > > idea [1] and change group locking even though it is not clear or at
> > > > > least nobody has proposed an idea/patch which requires that?  Or are
> > > > > we thinking that we can do what Tom suggested for relation extension
> > > > > lock and also plan to change group locking for future parallel
> > > > > operations that might require it?
> > > >
> > > > What I'm advocating is that extension locks should continue to go
> > > > through lock.c. And yes, that requires some changes to group locking,
> > > > but I still don't see why they'd be complicated.
> > > >
> > >
> > > Fair position, as per initial analysis, I think if we do below three
> > > things, it should work out without changing to a new way of locking
> > > for relation extension or page type locks.
> > > a. As per the discussion above, ensure in code we will never try to
> > > acquire another heavy-weight lock after acquiring relation extension
> > > or page type locks (probably by having Asserts in code or maybe some
> > > other way).
> > > b. Change lock.c so that group locking is not considered for these two
> > > lock types. For ex. in LockCheckConflicts, along with the check (if
> > > (proclock->groupLeader == MyProc && MyProc->lockGroupLeader == NULL)),
> > > we also check lock->tag and call it a conflict for these two locks.
> > > c. The deadlock detector can ignore checking these two types of locks
> > > because point (a) ensures that those won't lead to deadlock.  One idea
> > > could be that FindLockCycleRecurseMember just ignores these two types
> > > of locks by checking the lock tag.
> >
> > Thanks Amit for summary.
> >
> > Based on above 3 points, here attaching 2 patches for review.
> >
> > 1. v01_0001-Conflict-EXTENTION-lock-in-group-member.patch (Patch by Dilip 
> > Kumar)
> > Basically this patch is for point b and c.
> >
> > 2. v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any.patch
> > (Patch by me)
> > This patch is for point a.
> >
> > After applying both the patches, make check-world is passing.
> >
> > We are testing both the patches and will post results.

Hi all,

I am planing to test below 3 points on v1 patch set:

1. We will check that new added assert can be hit by hacking code
(while holding extension lock, try to take any heavyweight lock)
2. In FindLockCycleRecurseMember, for testing purposes, we can put
additional loop to check that for all relext holders, there must not
be any outer edge.
3. Test that group members are not granted the lock for the relation
extension lock (group members should conflict).

Please let me know your thoughts to test this patch.

-- 
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-03 Thread Dilip Kumar

On Wed, Mar 4, 2020 at 11:45 AM Mahendra Singh Thalor
 wrote:
>
> On Mon, 24 Feb 2020 at 15:39, Amit Kapila  wrote:
> >
> > On Thu, Feb 20, 2020 at 8:06 AM Andres Freund  wrote:
> > >
> > > Hi,
> > >
> > > On 2020-02-19 11:12:18 +0530, Amit Kapila wrote:
> > > > I think till we know the real need for changing group locking, going
> > > > in the direction of what Tom suggested to use an array of LWLocks [1]
> > > > to address the problems in hand is a good idea.
> > >
> > > -many
> > >
> > > I think that building yet another locking subsystem is the entirely
> > > wrong idea - especially when there's imo no convincing architectural
> > > reasons to do so.
> > >
> >
> > Hmm, AFAIU, it will be done by having an array of LWLocks which we do
> > at other places as well (like BufferIO locks).  I am not sure if we
> > can call it as new locking subsystem, but if we decide to continue
> > using lock.c and change group locking then I think we can do that as
> > well, see my comments below regarding that.
> >
> > >
> > > > It is not very clear to me that are we thinking to give up on Tom's
> > > > idea [1] and change group locking even though it is not clear or at
> > > > least nobody has proposed an idea/patch which requires that?  Or are
> > > > we thinking that we can do what Tom suggested for relation extension
> > > > lock and also plan to change group locking for future parallel
> > > > operations that might require it?
> > >
> > > What I'm advocating is that extension locks should continue to go
> > > through lock.c. And yes, that requires some changes to group locking,
> > > but I still don't see why they'd be complicated.
> > >
> >
> > Fair position, as per initial analysis, I think if we do below three
> > things, it should work out without changing to a new way of locking
> > for relation extension or page type locks.
> > a. As per the discussion above, ensure in code we will never try to
> > acquire another heavy-weight lock after acquiring relation extension
> > or page type locks (probably by having Asserts in code or maybe some
> > other way).
> > b. Change lock.c so that group locking is not considered for these two
> > lock types. For ex. in LockCheckConflicts, along with the check (if
> > (proclock->groupLeader == MyProc && MyProc->lockGroupLeader == NULL)),
> > we also check lock->tag and call it a conflict for these two locks.
> > c. The deadlock detector can ignore checking these two types of locks
> > because point (a) ensures that those won't lead to deadlock.  One idea
> > could be that FindLockCycleRecurseMember just ignores these two types
> > of locks by checking the lock tag.
>
> Thanks Amit for summary.
>
> Based on above 3 points, here attaching 2 patches for review.
>
> 1. v01_0001-Conflict-EXTENTION-lock-in-group-member.patch (Patch by Dilip 
> Kumar)
> Basically this patch is for point b and c.
>
> 2. v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any.patch
> (Patch by me)
> This patch is for point a.
>
> After applying both the patches, make check-world is passing.
>
> We are testing both the patches and will post results.
>
> Thoughts?

+static void AssertAnyExtentionLockHeadByMe(void);

+/*
+ * AssertAnyExtentionLockHeadByMe -- test whether any EXTENSION lock held by
+ * this backend.  If any EXTENSION lock is hold by this backend, then assert
+ * will fail.  To use this function, assert should be enabled.
+ */
+void AssertAnyExtentionLockHeadByMe()
+{

Some minor observations on 0002.
1. static is missing in a function definition.
2. Function name should start in new line after function return type
in function definition, as per pg guideline.
+void AssertAnyExtentionLockHeadByMe()
->
void
AssertAnyExtentionLockHeadByMe()

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-03-03 Thread Mahendra Singh Thalor

On Mon, 24 Feb 2020 at 15:39, Amit Kapila  wrote:
>
> On Thu, Feb 20, 2020 at 8:06 AM Andres Freund  wrote:
> >
> > Hi,
> >
> > On 2020-02-19 11:12:18 +0530, Amit Kapila wrote:
> > > I think till we know the real need for changing group locking, going
> > > in the direction of what Tom suggested to use an array of LWLocks [1]
> > > to address the problems in hand is a good idea.
> >
> > -many
> >
> > I think that building yet another locking subsystem is the entirely
> > wrong idea - especially when there's imo no convincing architectural
> > reasons to do so.
> >
>
> Hmm, AFAIU, it will be done by having an array of LWLocks which we do
> at other places as well (like BufferIO locks).  I am not sure if we
> can call it as new locking subsystem, but if we decide to continue
> using lock.c and change group locking then I think we can do that as
> well, see my comments below regarding that.
>
> >
> > > It is not very clear to me that are we thinking to give up on Tom's
> > > idea [1] and change group locking even though it is not clear or at
> > > least nobody has proposed an idea/patch which requires that?  Or are
> > > we thinking that we can do what Tom suggested for relation extension
> > > lock and also plan to change group locking for future parallel
> > > operations that might require it?
> >
> > What I'm advocating is that extension locks should continue to go
> > through lock.c. And yes, that requires some changes to group locking,
> > but I still don't see why they'd be complicated.
> >
>
> Fair position, as per initial analysis, I think if we do below three
> things, it should work out without changing to a new way of locking
> for relation extension or page type locks.
> a. As per the discussion above, ensure in code we will never try to
> acquire another heavy-weight lock after acquiring relation extension
> or page type locks (probably by having Asserts in code or maybe some
> other way).
> b. Change lock.c so that group locking is not considered for these two
> lock types. For ex. in LockCheckConflicts, along with the check (if
> (proclock->groupLeader == MyProc && MyProc->lockGroupLeader == NULL)),
> we also check lock->tag and call it a conflict for these two locks.
> c. The deadlock detector can ignore checking these two types of locks
> because point (a) ensures that those won't lead to deadlock.  One idea
> could be that FindLockCycleRecurseMember just ignores these two types
> of locks by checking the lock tag.

Thanks Amit for summary.

Based on above 3 points, here attaching 2 patches for review.

1. v01_0001-Conflict-EXTENTION-lock-in-group-member.patch (Patch by Dilip Kumar)
Basically this patch is for point b and c.

2. v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any.patch
(Patch by me)
This patch is for point a.

After applying both the patches, make check-world is passing.

We are testing both the patches and will post results.

Thoughts?

-- 
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com


v01_0001-Conflict-EXTENTION-lock-in-group-member.patch
Description: Binary data


v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any.patch
Description: Binary data

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-24 Thread Amit Kapila

On Thu, Feb 20, 2020 at 8:06 AM Andres Freund  wrote:
>
> Hi,
>
> On 2020-02-19 11:12:18 +0530, Amit Kapila wrote:
> > I think till we know the real need for changing group locking, going
> > in the direction of what Tom suggested to use an array of LWLocks [1]
> > to address the problems in hand is a good idea.
>
> -many
>
> I think that building yet another locking subsystem is the entirely
> wrong idea - especially when there's imo no convincing architectural
> reasons to do so.
>

Hmm, AFAIU, it will be done by having an array of LWLocks which we do
at other places as well (like BufferIO locks).  I am not sure if we
can call it as new locking subsystem, but if we decide to continue
using lock.c and change group locking then I think we can do that as
well, see my comments below regarding that.

>
> > It is not very clear to me that are we thinking to give up on Tom's
> > idea [1] and change group locking even though it is not clear or at
> > least nobody has proposed an idea/patch which requires that?  Or are
> > we thinking that we can do what Tom suggested for relation extension
> > lock and also plan to change group locking for future parallel
> > operations that might require it?
>
> What I'm advocating is that extension locks should continue to go
> through lock.c. And yes, that requires some changes to group locking,
> but I still don't see why they'd be complicated.
>

Fair position, as per initial analysis, I think if we do below three
things, it should work out without changing to a new way of locking
for relation extension or page type locks.
a. As per the discussion above, ensure in code we will never try to
acquire another heavy-weight lock after acquiring relation extension
or page type locks (probably by having Asserts in code or maybe some
other way).
b. Change lock.c so that group locking is not considered for these two
lock types. For ex. in LockCheckConflicts, along with the check (if
(proclock->groupLeader == MyProc && MyProc->lockGroupLeader == NULL)),
we also check lock->tag and call it a conflict for these two locks.
c. The deadlock detector can ignore checking these two types of locks
because point (a) ensures that those won't lead to deadlock.  One idea
could be that FindLockCycleRecurseMember just ignores these two types
of locks by checking the lock tag.

It is possible that I might be missing something or we could achieve
this some other way as well.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-19 Thread Andres Freund

Hi,

On 2020-02-19 11:12:18 +0530, Amit Kapila wrote:
> I think till we know the real need for changing group locking, going
> in the direction of what Tom suggested to use an array of LWLocks [1]
> to address the problems in hand is a good idea.

-many

I think that building yet another locking subsystem is the entirely
wrong idea - especially when there's imo no convincing architectural
reasons to do so.

> It is not very clear to me that are we thinking to give up on Tom's
> idea [1] and change group locking even though it is not clear or at
> least nobody has proposed an idea/patch which requires that?  Or are
> we thinking that we can do what Tom suggested for relation extension
> lock and also plan to change group locking for future parallel
> operations that might require it?

What I'm advocating is that extension locks should continue to go
through lock.c. And yes, that requires some changes to group locking,
but I still don't see why they'd be complicated. And if there's concerns
about the cost of lock.c, I outlined a pretty long list of improvements
that'll help everyone, and I showed that the locking itself isn't
actually a large fraction of the scalability issues that extension has.

Regards,

Andres

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-18 Thread Amit Kapila

On Mon, Feb 17, 2020 at 2:42 AM Tom Lane  wrote:
>
> Andres Freund  writes:
> > On 2020-02-14 13:34:03 -0500, Robert Haas wrote:
> >> I think the group locking + deadlock detection things are more
> >> fundamental than you might be crediting, but I agree that having
> >> parallel mechanisms has its own set of pitfalls.
>
> > It's possible. But I'm also hesitant to believe that we'll not need
> > other lock types that conflict between leader/worker, but that still
> > need deadlock detection.  The more work we want to parallelize, the more
> > likely that imo will become.
>
> Yeah.  The concept that leader and workers can't conflict seems to me
> to be dependent, in a very fundamental way, on the assumption that
> we only need to parallelize read-only workloads.  I don't think that's
> going to have a long half-life.
>

Surely, someday, we need to solve that problem.  But it is not clear
when because if we see the operations for which we want to solve the
relation extension lock problem doesn't require that.  For example,
for a parallel copy or further improving parallel vacuum to allow
multiple workers to scan and process the heap and individual index, we
don't need to change anything in group locking as far as I understand.

Now, for parallel deletes/updates, I think it will depend on how we
choose to parallelize those operations.  I mean if we decide that each
worker will work on an independent set of pages like we do for a
sequential scan, we again might not need to change the group locking
unless I am missing something which is possible.

I think till we know the real need for changing group locking, going
in the direction of what Tom suggested to use an array of LWLocks [1]
to address the problems in hand is a good idea.  It is not very clear
to me that are we thinking to give up on Tom's idea [1] and change
group locking even though it is not clear or at least nobody has
proposed an idea/patch which requires that?  Or are we thinking that
we can do what Tom suggested for relation extension lock and also plan
to change group locking for future parallel operations that might
require it?

[1] - https://www.postgresql.org/message-id/19443.1581435793%40sss.pgh.pa.us

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-16 Thread Tom Lane

Andres Freund  writes:
> On 2020-02-14 13:34:03 -0500, Robert Haas wrote:
>> I think the group locking + deadlock detection things are more
>> fundamental than you might be crediting, but I agree that having
>> parallel mechanisms has its own set of pitfalls.

> It's possible. But I'm also hesitant to believe that we'll not need
> other lock types that conflict between leader/worker, but that still
> need deadlock detection.  The more work we want to parallelize, the more
> likely that imo will become.

Yeah.  The concept that leader and workers can't conflict seems to me
to be dependent, in a very fundamental way, on the assumption that
we only need to parallelize read-only workloads.  I don't think that's
going to have a long half-life.

regards, tom lane

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-16 Thread Andres Freund

Hi,

On 2020-02-14 13:34:03 -0500, Robert Haas wrote:
> On Fri, Feb 14, 2020 at 1:07 PM Andres Freund  wrote:
> > Yea, that seems possible.  I'm not really sure it's needed however? As
> > long as you're not teaching the locking mechanism new tricks that
> > influence the wait graph, why would the deadlock detector care? That's
> > quite different from the group locking case, where you explicitly needed
> > to teach it something fairly fundamental.
> 
> Well, you have to teach it that locks of certain types conflict even
> if they are in the same group, and that bleeds over pretty quickly
> into the whole area of deadlock detection, because lock waits are the
> edges in the graph that the deadlock detector processes.

Shouldn't this *theretically* be doable with changes mostly localized to
lock.c, by not using proc->lockGroupLeader but proc for lock types that
don't support group locking? I do see that deadlock.c largely looks at
->lockGroupLeader, but that kind of doesn't seem right to me.


> > It might still be a good idea independently to add the rule & enforce
> > that acquire heavyweight locks while holding certain classes of locks is
> > not allowed.
> 
> I think that's absolutely essential, if we're going to continue using
> the main lock manager for this. I remain somewhat unconvinced that
> doing so is the best way forward, but it is *a* way forward.

Seems like we should build this part independently of the lock.c/new
infra piece.


> > Right. For me that's *the* fundamental service that lock.c delivers. And
> > it's the fundamental bit this thread so far largely has been focusing
> > on.
> 
> For me, the deadlock detection is the far more complicated and problematic 
> bit.
> 
> > Isn't that mostly true to varying degrees for the majority of lock types
> > in lock.c? Sure, perhaps historically that's a misuse of lock.c, but
> > it's been pretty ingrained by now.  I just don't see where leaving out
> > any of these features is going to give us fundamental advantages
> > justifying a different locking infrastructure.
> 
> I think the group locking + deadlock detection things are more
> fundamental than you might be crediting, but I agree that having
> parallel mechanisms has its own set of pitfalls.

It's possible. But I'm also hesitant to believe that we'll not need
other lock types that conflict between leader/worker, but that still
need deadlock detection.  The more work we want to parallelize, the more
likely that imo will become.

Greetings,

Andres Freund

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-14 Thread Amit Kapila

On Fri, Feb 14, 2020 at 9:13 PM Tom Lane  wrote:
>
> Robert Haas  writes:
> > On Wed, Feb 12, 2020 at 11:53 AM Tom Lane  wrote:
>
> > That's an interesting idea, but it doesn't make the lock acquisition
> > itself interruptible, which seems pretty important to me in this case.
>
> Good point: if you think the contained operation might run too long to
> suit you, then you don't want other backends to be stuck behind it for
> the same amount of time.
>

It is not clear to me why we should add that as a requirement for this
patch when other places like WALWriteLock, etc. have similar coding
patterns and we haven't heard a ton of complaints about making it
interruptable or if there are then I am not aware.

> > I wonder if we could have an LWLockAcquireInterruptibly() or some such
> > that allows the lock acquisition itself to be interruptible. I think
> > that would require some rejiggering but it might be doable.
>
> Yeah, I had the impression from a brief look at LWLockAcquire that
> it was itself depending on not throwing errors partway through.
> But with careful and perhaps-a-shade-slower coding, we could probably
> make a version that didn't require that.
>

If this becomes a requirement to move this patch, then surely we can
do that.  BTW, what exactly we need to ensure for it?  Is it something
on the lines of ensuring that in error path the state of the lock is
cleared?  Are we worried that interrupt handler might do something
which will change the state of lock we are acquiring?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-14 Thread Amit Kapila

On Fri, Feb 14, 2020 at 8:12 PM Tom Lane  wrote:
>
> Amit Kapila  writes:
> > I think MaxBackends will generally limit the number of different
> > relations that can simultaneously extend, but maybe tables with many
> > partitions might change the situation.  You are right that some tests
> > might suggest a good number, let Mahendra write a patch and then we
> > can test it.  Do you have any better idea?
>
> In the first place, there certainly isn't more than one extension
> happening at a time per backend, else the entire premise of this
> thread is wrong.  Handwaving about partitions won't change that.
>

Having more number of partitions theoretically increases the chances
of false-sharing with the same number of concurrent sessions.  For ex.
two sessions operating on two relations vs. two sessions working on
two relations with 100 partitions each would increase the chances of
false-sharing.  Sawada-San and Mahendra have done many tests on
different systems and some monitoring with the previous patch that
with a decent number of fixed slots (1024), the false-sharing was very
less and even if it was there the effect was close to nothing.  So, in
short, this is not the point to worry about, but to ensure that we
don't create any significant regressions in this area.

> In the second place, it's ludicrous to expect that the underlying
> platform/filesystem can support an infinite number of concurrent
> file-extension operations.  At some level (e.g. where disk blocks
> are handed out, or where a record of the operation is written to
> a filesystem journal) it's quite likely that things are bottlenecked
> down to *one* such operation at a time per filesystem.  So I'm not
> that concerned about occasional false-sharing limiting our ability
> to issue concurrent requests.  There are probably worse restrictions
> at lower levels.
>

Agreed and what we have observed during the tests is what you have
said in this paragraph.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-14 Thread Robert Haas

On Fri, Feb 14, 2020 at 1:07 PM Andres Freund  wrote:
> Yea, that seems possible.  I'm not really sure it's needed however? As
> long as you're not teaching the locking mechanism new tricks that
> influence the wait graph, why would the deadlock detector care? That's
> quite different from the group locking case, where you explicitly needed
> to teach it something fairly fundamental.

Well, you have to teach it that locks of certain types conflict even
if they are in the same group, and that bleeds over pretty quickly
into the whole area of deadlock detection, because lock waits are the
edges in the graph that the deadlock detector processes.

> It might still be a good idea independently to add the rule & enforce
> that acquire heavyweight locks while holding certain classes of locks is
> not allowed.

I think that's absolutely essential, if we're going to continue using
the main lock manager for this. I remain somewhat unconvinced that
doing so is the best way forward, but it is *a* way forward.

> Right. For me that's *the* fundamental service that lock.c delivers. And
> it's the fundamental bit this thread so far largely has been focusing
> on.

For me, the deadlock detection is the far more complicated and problematic bit.

> Isn't that mostly true to varying degrees for the majority of lock types
> in lock.c? Sure, perhaps historically that's a misuse of lock.c, but
> it's been pretty ingrained by now.  I just don't see where leaving out
> any of these features is going to give us fundamental advantages
> justifying a different locking infrastructure.

I think the group locking + deadlock detection things are more
fundamental than you might be crediting, but I agree that having
parallel mechanisms has its own set of pitfalls.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-14 Thread Andres Freund

Hi,

On 2020-02-14 12:08:45 -0500, Robert Haas wrote:
> On Fri, Feb 14, 2020 at 11:40 AM Andres Freund  wrote:
> > IMO the right thing here is to extend lock.c so we can better represent
> > whether certain types of lockmethods (& levels ?) are [not] to be
> > shared.
> 
> The part that I find awkward about that is the whole thing with the
> deadlock detector. The deadlock detection code is old, crufty,
> complex, and very difficult to test (or at least I have found it so).
> A bug that I introduced when inventing group locking took like 5 years
> for somebody to find.

Oh, I agree, lock.c and surrounding code is pretty crufty. Doubtful that
just building up a largely parallel piece of infrastructure next to it
is a good answer though.

> One way of looking at the requirement that we have here is that
> certain kinds of locks need to be exempted from group locking.
> Basically, these are because they are a lower-level concept: a lock on
> a relation is more of a "logical" concept, and you hold the lock until
> eoxact, whereas a lock on an extend the relation is more of a
> "physical" concept, and you give it up as soon as you are done. Page
> locks are like relation extension locks in this regard. Unlike locks
> on SQL-level objects, these should not be shared between members of a
> lock group.
> 
> Now, if it weren't for the deadlock detector, that would be easy
> enough. But figuring out what to do with the deadlock detector seems
> really painful to me. I wonder if there's some way we can make an end
> run around that problem. For instance, if we could make (and enforce)
> a coding rule that you cannot acquire a heavyweight lock while holding
> a relation extension or page lock, then maybe we could somehow teach
> the deadlock detector to just ignore those kinds of locks, and teach
> the lock acquisition machinery that they conflict between lock group
> members.

Yea, that seems possible.  I'm not really sure it's needed however? As
long as you're not teaching the locking mechanism new tricks that
influence the wait graph, why would the deadlock detector care? That's
quite different from the group locking case, where you explicitly needed
to teach it something fairly fundamental.

It might still be a good idea independently to add the rule & enforce
that acquire heavyweight locks while holding certain classes of locks is
not allowed.

> On the other hand, I think you might also be understating the
> differences between these kinds of locks and other heavyweight locks.
> I suspect that the reason why we use lwlocks for buffers and
> heavyweight locks here is because there are a conceptually infinite
> number of relations, and lwlocks can't handle that.

Right. For me that's *the* fundamental service that lock.c delivers. And
it's the fundamental bit this thread so far largely has been focusing
on.

> The only mechanism we currently have that does handle that is the
> heavyweight lock mechanism, and from my point of view, somebody just
> beat it with a stick to make it fit this application. But the fact
> that it has been made to fit does not mean that it is really fit for
> purpose. We use 2 of 9 lock levels, we don't need deadlock detection,
> we need different behavior when group locking is in use, we release
> locks right away rather than at eoxact. I don't think it's crazy to
> think that those differences are significant enough to justify having
> a separate mechanism, even if the one that is currently on the table
> is not exactly what we want.

Isn't that mostly true to varying degrees for the majority of lock types
in lock.c? Sure, perhaps historically that's a misuse of lock.c, but
it's been pretty ingrained by now.  I just don't see where leaving out
any of these features is going to give us fundamental advantages
justifying a different locking infrastructure.

E.g. not needing to support "conceptually infinite" number of relations
IMO does provide a fundamental advantage - no need for a mapping.  I'm
not yet seeing anything equivalent for the extension vs. lock.c style
lock case.

Greetings,

Andres Freund

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-14 Thread Robert Haas

On Fri, Feb 14, 2020 at 11:40 AM Andres Freund  wrote:
> IMO the right thing here is to extend lock.c so we can better represent
> whether certain types of lockmethods (& levels ?) are [not] to be
> shared.

The part that I find awkward about that is the whole thing with the
deadlock detector. The deadlock detection code is old, crufty,
complex, and very difficult to test (or at least I have found it so).
A bug that I introduced when inventing group locking took like 5 years
for somebody to find.

One way of looking at the requirement that we have here is that
certain kinds of locks need to be exempted from group locking.
Basically, these are because they are a lower-level concept: a lock on
a relation is more of a "logical" concept, and you hold the lock until
eoxact, whereas a lock on an extend the relation is more of a
"physical" concept, and you give it up as soon as you are done. Page
locks are like relation extension locks in this regard. Unlike locks
on SQL-level objects, these should not be shared between members of a
lock group.

Now, if it weren't for the deadlock detector, that would be easy
enough. But figuring out what to do with the deadlock detector seems
really painful to me. I wonder if there's some way we can make an end
run around that problem. For instance, if we could make (and enforce)
a coding rule that you cannot acquire a heavyweight lock while holding
a relation extension or page lock, then maybe we could somehow teach
the deadlock detector to just ignore those kinds of locks, and teach
the lock acquisition machinery that they conflict between lock group
members.

On the other hand, I think you might also be understating the
differences between these kinds of locks and other heavyweight locks.
I suspect that the reason why we use lwlocks for buffers and
heavyweight locks here is because there are a conceptually infinite
number of relations, and lwlocks can't handle that. The only mechanism
we currently have that does handle that is the heavyweight lock
mechanism, and from my point of view, somebody just beat it with a
stick to make it fit this application. But the fact that it has been
made to fit does not mean that it is really fit for purpose. We use 2
of 9 lock levels, we don't need deadlock detection, we need different
behavior when group locking is in use, we release locks right away
rather than at eoxact. I don't think it's crazy to think that those
differences are significant enough to justify having a separate
mechanism, even if the one that is currently on the table is not
exactly what we want.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-14 Thread Andres Freund

Hi,

On 2020-02-14 09:42:40 -0500, Tom Lane wrote:
> In the second place, it's ludicrous to expect that the underlying
> platform/filesystem can support an infinite number of concurrent
> file-extension operations.  At some level (e.g. where disk blocks
> are handed out, or where a record of the operation is written to
> a filesystem journal) it's quite likely that things are bottlenecked
> down to *one* such operation at a time per filesystem.

That's probably true to some degree from a theoretical POV, but I think
it's so far from where we are at, that it's effectively wrong. I can
concurrently extend a few files at close to 10GB/s on a set of fast
devices below a *single* filesystem. Whereas postgres bottlenecks far
far before this.  Given that a lot of today's storage has latencies in
the 10-100s of microseconds, a journal flush doesn't necessarily cause
that much serialization - and OS journals do group commit like things
too.

Greetings,

Andres Freund

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-14 Thread Andres Freund

Hi,

On 2020-02-12 11:53:49 -0500, Tom Lane wrote:
> In general, if we think there are issues with LWLock, it seems to me
> we'd be better off to try to fix them, not to invent a whole new
> single-purpose lock manager that we'll have to debug and maintain.

My impression is that what's being discussed here is doing exactly that,
except with s/lwlock/heavyweight locks/. We're basically replacing the
lock.c lock mapping table with an ad-hoc implementation, and now we're
also reinventing interruptability etc.

I still find the performance arguments pretty ludicruous, to be honest -
I think the numbers I posted about how much time we spend with the locks
held, back that up.  I have a bit more understanding for the parallel
worker arguments, but only a bit:

I think if we develop a custom solution for the extension lock, we're
just going to end up having to develop another custom solution for a
bunch of other types of locks.  It seems quite likely that we'll end up
also wanting TUPLE and also SPECULATIVE and PAGE type locks that we
don't want to share between leader & workers.

IMO the right thing here is to extend lock.c so we can better represent
whether certain types of lockmethods (& levels ?) are [not] to be
shared.

Greetings,

Andres Freund

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-14 Thread Robert Haas

On Fri, Feb 14, 2020 at 10:43 AM Tom Lane  wrote:
> I remain unconvinced ... wouldn't both of those claims apply to any disk
> I/O request?  Are we going to try to ensure that no I/O ever happens
> while holding an LWLock, and if so how?  (Again, CheckpointLock is a
> counterexample, which has been that way for decades without reported
> problems.  But actually I think buffer I/O locks are an even more
> direct counterexample.)

Yes, that's a problem. I proposed a patch a few years ago that
replaced the buffer I/O locks with condition variables, and I think
that's a good idea for lots of reasons, including this one. I never
quite got around to pushing that through to commit, but I think we
should do that. Aside from fixing this problem, it also prevents
certain scenarios where we can currently busy-loop.

I do realize that we're unlikely to ever solve this problem
completely, but I don't think that should discourage us from making
incremental progress. Just as debuggability is a sticking point for
you, what I'm going to call operate-ability is a sticking point for
me. My work here at EnterpriseDB exposes me on a fairly regular basis
to real broken systems, and I'm therefore really sensitive to the
concerns that people have when trying to recover after a system has
become, for one reason or another, really broken.

Interruptibility may not be the #1 concern in that area, but it's very
high on the list. EnterpriseDB customers, as a rule, *really* hate
being told to restart the database because one session is stuck. It
causes a lot of disruption for them and the person who does the
restart gets yelled at by their boss, and maybe their bosses boss and
the boss above that. It means that their whole application, which may
be mission-critical, is down until the database finishes restarting,
and that is not always a quick process, especially after an immediate
shutdown. I don't think we can ever make everything that can get stuck
interruptible, but the more we can do the better.

The work you and others have done over the years to add
CHECK_FOR_INTERRUPTS() to more places pays real dividends. Making
sessions that are blocked on disk I/O interruptible in at least some
of the more common cases would be a huge win. Other people may well
have different experiences, but my experience is that the disk
deciding to conk out for a while or just respond very very slowly is a
very common problem even (and sometimes especially) on very expensive
hardware. Obviously that's not great and you're in lots of trouble,
but being able to hit ^C and get control back significantly improves
your chances of being able to understand what has happened and recover
from it.

> > That's an interesting idea, but it doesn't make the lock acquisition
> > itself interruptible, which seems pretty important to me in this case.
>
> Good point: if you think the contained operation might run too long to
> suit you, then you don't want other backends to be stuck behind it for
> the same amount of time.

Right.

> > I wonder if we could have an LWLockAcquireInterruptibly() or some such
> > that allows the lock acquisition itself to be interruptible. I think
> > that would require some rejiggering but it might be doable.
>
> Yeah, I had the impression from a brief look at LWLockAcquire that
> it was itself depending on not throwing errors partway through.
> But with careful and perhaps-a-shade-slower coding, we could probably
> make a version that didn't require that.

Yeah, that was my thought, too, but I didn't study it that carefully,
so somebody would need to do that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-14 Thread Tom Lane

Robert Haas  writes:
> On Wed, Feb 12, 2020 at 11:53 AM Tom Lane  wrote:
>> * I see no reason to think that a relation extension lock would ever
>> be held long enough for noninterruptibility to be a real issue.  Our
>> expectations for query cancel response time are in the tens to
>> hundreds of msec anyway.

> I don't agree, because (1) the time to perform a relation extension on
> a busy system can be far longer than that and (2) if the disk is
> failing, then it can be *really* long, or indefinite.

I remain unconvinced ... wouldn't both of those claims apply to any disk
I/O request?  Are we going to try to ensure that no I/O ever happens
while holding an LWLock, and if so how?  (Again, CheckpointLock is a
counterexample, which has been that way for decades without reported
problems.  But actually I think buffer I/O locks are an even more
direct counterexample.)

>> * There are other places where an LWLock can be held for a *long* time,
>> notably the CheckpointLock.  If we do think this is an issue, we could
>> devise a way to not insist on noninterruptibility.  The easiest fix
>> is just to do a matching RESUME_INTERRUPTS after getting the lock and
>> HOLD_INTERRUPTS again before releasing it; though maybe it'd be worth
>> offering some slightly cleaner way.  Point here is that LWLockAcquire
>> only does that because it's useful to the majority of callers, not
>> because it's graven in stone that it must be like that.

> That's an interesting idea, but it doesn't make the lock acquisition
> itself interruptible, which seems pretty important to me in this case.

Good point: if you think the contained operation might run too long to
suit you, then you don't want other backends to be stuck behind it for
the same amount of time.

> I wonder if we could have an LWLockAcquireInterruptibly() or some such
> that allows the lock acquisition itself to be interruptible. I think
> that would require some rejiggering but it might be doable.

Yeah, I had the impression from a brief look at LWLockAcquire that
it was itself depending on not throwing errors partway through.
But with careful and perhaps-a-shade-slower coding, we could probably
make a version that didn't require that.

regards, tom lane

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-14 Thread Robert Haas

On Wed, Feb 12, 2020 at 11:53 AM Tom Lane  wrote:
> Yeah.  I would say a couple more things:
>
> * I see no reason to think that a relation extension lock would ever
> be held long enough for noninterruptibility to be a real issue.  Our
> expectations for query cancel response time are in the tens to
> hundreds of msec anyway.

I don't agree, because (1) the time to perform a relation extension on
a busy system can be far longer than that and (2) if the disk is
failing, then it can be *really* long, or indefinite.

> * There are other places where an LWLock can be held for a *long* time,
> notably the CheckpointLock.  If we do think this is an issue, we could
> devise a way to not insist on noninterruptibility.  The easiest fix
> is just to do a matching RESUME_INTERRUPTS after getting the lock and
> HOLD_INTERRUPTS again before releasing it; though maybe it'd be worth
> offering some slightly cleaner way.  Point here is that LWLockAcquire
> only does that because it's useful to the majority of callers, not
> because it's graven in stone that it must be like that.

That's an interesting idea, but it doesn't make the lock acquisition
itself interruptible, which seems pretty important to me in this case.
I wonder if we could have an LWLockAcquireInterruptibly() or some such
that allows the lock acquisition itself to be interruptible. I think
that would require some rejiggering but it might be doable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-14 Thread Tom Lane

Amit Kapila  writes:
> I think MaxBackends will generally limit the number of different
> relations that can simultaneously extend, but maybe tables with many
> partitions might change the situation.  You are right that some tests
> might suggest a good number, let Mahendra write a patch and then we
> can test it.  Do you have any better idea?

In the first place, there certainly isn't more than one extension
happening at a time per backend, else the entire premise of this
thread is wrong.  Handwaving about partitions won't change that.

In the second place, it's ludicrous to expect that the underlying
platform/filesystem can support an infinite number of concurrent
file-extension operations.  At some level (e.g. where disk blocks
are handed out, or where a record of the operation is written to
a filesystem journal) it's quite likely that things are bottlenecked
down to *one* such operation at a time per filesystem.  So I'm not
that concerned about occasional false-sharing limiting our ability
to issue concurrent requests.  There are probably worse restrictions
at lower levels.

regards, tom lane

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-14 Thread Amit Kapila

On Fri, Feb 14, 2020 at 11:42 AM Masahiko Sawada
 wrote:
>
> On Thu, 13 Feb 2020 at 13:16, Amit Kapila  wrote:
> >
> > On Tue, Feb 11, 2020 at 9:13 PM Tom Lane  wrote:
> > >
> > > I took a brief look through this patch.  I agree with the fundamental
> > > idea that we shouldn't need to use the heavyweight lock manager for
> > > relation extension, since deadlock is not a concern and no backend
> > > should ever need to hold more than one such lock at once.  But it feels
> > > to me like this particular solution is rather seriously overengineered.
> > > I would like to suggest that we do something similar to Robert Haas'
> > > excellent hack (daa7527af) for the !HAVE_SPINLOCK case in lmgr/spin.c,
> > > that is,
> > >
> > > * Create some predetermined number N of LWLocks for relation extension.
> > > * When we want to extend some relation R, choose one of those locks
> > >   (say, R's relfilenode number mod N) and lock it.
> > >
> >
> > I am imagining something on the lines of BufferIOLWLockArray (here it
> > will be RelExtLWLockArray).  The size (N) could MaxBackends or some
> > percentage of it (depending on testing) and indexing into an array
> > could be as suggested (R's relfilenode number mod N).
>
> I'm not sure it's good that the contention of LWLock slot depends on
> MaxBackends. Because it means that the more MaxBackends is larger, the
> less the LWLock slot conflicts, even if the same number of backends
> actually connecting. Normally we don't want to increase unnecessarily
> MaxBackends for security reasons. In the current patch we defined a
> fixed length of array for extension lock but I agree that we need to
> determine what approach is the best depending on testing.
>

I think MaxBackends will generally limit the number of different
relations that can simultaneously extend, but maybe tables with many
partitions might change the situation.  You are right that some tests
might suggest a good number, let Mahendra write a patch and then we
can test it.  Do you have any better idea?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-13 Thread Masahiko Sawada

On Thu, 13 Feb 2020 at 13:16, Amit Kapila  wrote:
>
> On Tue, Feb 11, 2020 at 9:13 PM Tom Lane  wrote:
> >
> > I took a brief look through this patch.  I agree with the fundamental
> > idea that we shouldn't need to use the heavyweight lock manager for
> > relation extension, since deadlock is not a concern and no backend
> > should ever need to hold more than one such lock at once.  But it feels
> > to me like this particular solution is rather seriously overengineered.
> > I would like to suggest that we do something similar to Robert Haas'
> > excellent hack (daa7527af) for the !HAVE_SPINLOCK case in lmgr/spin.c,
> > that is,
> >
> > * Create some predetermined number N of LWLocks for relation extension.
> > * When we want to extend some relation R, choose one of those locks
> >   (say, R's relfilenode number mod N) and lock it.
> >
>
> I am imagining something on the lines of BufferIOLWLockArray (here it
> will be RelExtLWLockArray).  The size (N) could MaxBackends or some
> percentage of it (depending on testing) and indexing into an array
> could be as suggested (R's relfilenode number mod N).

I'm not sure it's good that the contention of LWLock slot depends on
MaxBackends. Because it means that the more MaxBackends is larger, the
less the LWLock slot conflicts, even if the same number of backends
actually connecting. Normally we don't want to increase unnecessarily
MaxBackends for security reasons. In the current patch we defined a
fixed length of array for extension lock but I agree that we need to
determine what approach is the best depending on testing.

Regards,

-- 
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-12 Thread Mahendra Singh Thalor

On Thu, 13 Feb 2020 at 09:46, Amit Kapila  wrote:
>
> On Tue, Feb 11, 2020 at 9:13 PM Tom Lane  wrote:
> >
> > I took a brief look through this patch.  I agree with the fundamental
> > idea that we shouldn't need to use the heavyweight lock manager for
> > relation extension, since deadlock is not a concern and no backend
> > should ever need to hold more than one such lock at once.  But it feels
> > to me like this particular solution is rather seriously overengineered.
> > I would like to suggest that we do something similar to Robert Haas'
> > excellent hack (daa7527af) for the !HAVE_SPINLOCK case in lmgr/spin.c,
> > that is,
> >
> > * Create some predetermined number N of LWLocks for relation extension.
> > * When we want to extend some relation R, choose one of those locks
> >   (say, R's relfilenode number mod N) and lock it.
> >
>
> I am imagining something on the lines of BufferIOLWLockArray (here it
> will be RelExtLWLockArray).  The size (N) could MaxBackends or some
> percentage of it (depending on testing) and indexing into an array
> could be as suggested (R's relfilenode number mod N).  We need to
> initialize this during shared memory initialization.  Then, to extend
> the relation with multiple blocks at-a-time (as we do in
> RelationAddExtraBlocks), we can either use the already proven
> technique of group clear xid mechanism (see ProcArrayGroupClearXid) or
> have an additional state in the RelExtLWLockArray which will keep the
> count of waiters (as done in latest patch of Sawada-san [1]).  We
> might want to experiment with both approaches and see which yields
> better results.

Thanks all for the suggestions. I have started working on the
implementation based on the suggestion.  I will post a patch for this
in few days.

-- 
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-12 Thread Dilip Kumar

On Thu, Feb 13, 2020 at 9:46 AM Amit Kapila  wrote:
>
> On Tue, Feb 11, 2020 at 9:13 PM Tom Lane  wrote:
> >
> > I took a brief look through this patch.  I agree with the fundamental
> > idea that we shouldn't need to use the heavyweight lock manager for
> > relation extension, since deadlock is not a concern and no backend
> > should ever need to hold more than one such lock at once.  But it feels
> > to me like this particular solution is rather seriously overengineered.
> > I would like to suggest that we do something similar to Robert Haas'
> > excellent hack (daa7527af) for the !HAVE_SPINLOCK case in lmgr/spin.c,
> > that is,
> >
> > * Create some predetermined number N of LWLocks for relation extension.
> > * When we want to extend some relation R, choose one of those locks
> >   (say, R's relfilenode number mod N) and lock it.
> >
>
> I am imagining something on the lines of BufferIOLWLockArray (here it
> will be RelExtLWLockArray).  The size (N) could MaxBackends or some
> percentage of it (depending on testing) and indexing into an array
> could be as suggested (R's relfilenode number mod N).  We need to
> initialize this during shared memory initialization.  Then, to extend
> the relation with multiple blocks at-a-time (as we do in
> RelationAddExtraBlocks), we can either use the already proven
> technique of group clear xid mechanism (see ProcArrayGroupClearXid) or
> have an additional state in the RelExtLWLockArray which will keep the
> count of waiters (as done in latest patch of Sawada-san [1]).  We
> might want to experiment with both approaches and see which yields
> better results.

IMHO, in this case, there is no point in using the "group clear" type
of mechanism mainly for two reasons 1) It will unnecessarily make
PGPROC structure heavy.
2) For our case, we don't need any specific pieces of information from
other waiters, we just need the count.


Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-12 Thread Amit Kapila

On Tue, Feb 11, 2020 at 9:13 PM Tom Lane  wrote:
>
> I took a brief look through this patch.  I agree with the fundamental
> idea that we shouldn't need to use the heavyweight lock manager for
> relation extension, since deadlock is not a concern and no backend
> should ever need to hold more than one such lock at once.  But it feels
> to me like this particular solution is rather seriously overengineered.
> I would like to suggest that we do something similar to Robert Haas'
> excellent hack (daa7527af) for the !HAVE_SPINLOCK case in lmgr/spin.c,
> that is,
>
> * Create some predetermined number N of LWLocks for relation extension.
> * When we want to extend some relation R, choose one of those locks
>   (say, R's relfilenode number mod N) and lock it.
>

I am imagining something on the lines of BufferIOLWLockArray (here it
will be RelExtLWLockArray).  The size (N) could MaxBackends or some
percentage of it (depending on testing) and indexing into an array
could be as suggested (R's relfilenode number mod N).  We need to
initialize this during shared memory initialization.  Then, to extend
the relation with multiple blocks at-a-time (as we do in
RelationAddExtraBlocks), we can either use the already proven
technique of group clear xid mechanism (see ProcArrayGroupClearXid) or
have an additional state in the RelExtLWLockArray which will keep the
count of waiters (as done in latest patch of Sawada-san [1]).  We
might want to experiment with both approaches and see which yields
better results.

[1] - 
https://www.postgresql.org/message-id/CAD21AoADkWhkLEB_%3DkjLZeZ_ML9_hSQqNBWz%2Bd821QHf%3DO9LJQ%40mail.gmail.com
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-12 Thread Amit Kapila

On Wed, Feb 12, 2020 at 10:23 PM Tom Lane  wrote:
>
> Amit Kapila  writes:
> > On Wed, Feb 12, 2020 at 7:36 AM Masahiko Sawada
> >  wrote:
> >> On Wed, 12 Feb 2020 at 00:43, Tom Lane  wrote:
> >>> I would like to suggest that we do something similar to Robert Haas'
> >>> excellent hack (daa7527af) for the !HAVE_SPINLOCK case in lmgr/spin.c,
>
> >> My original proposal used LWLocks and hash tables for relation
> >> extension but there was a discussion that using LWLocks is not good
> >> because it's not interruptible[1].
>
> > Hmm, but we use LWLocks for (a) WALWrite/Flush (see the usage of
> > WALWriteLock), (b) writing the shared buffer contents (see
> > io_in_progress lock and its usage in FlushBuffer) and might be for few
> > other similar stuff.  Many times those take more time than extending a
> > block in relation especially when we combine the WAL write for
> > multiple commits.  So, if this is a problem for relation extension
> > lock, then the same thing holds true there also.
>
> Yeah.  I would say a couple more things:
>
> * I see no reason to think that a relation extension lock would ever
> be held long enough for noninterruptibility to be a real issue.  Our
> expectations for query cancel response time are in the tens to
> hundreds of msec anyway.
>
> * There are other places where an LWLock can be held for a *long* time,
> notably the CheckpointLock.  If we do think this is an issue, we could
> devise a way to not insist on noninterruptibility.  The easiest fix
> is just to do a matching RESUME_INTERRUPTS after getting the lock and
> HOLD_INTERRUPTS again before releasing it; though maybe it'd be worth
> offering some slightly cleaner way.
>

Yeah, this sounds like the better answer for noninterruptibility
aspect of this design.  One idea that occurred to me was to pass a
parameter to LWLOCK acquire/release APIs to indicate whether to
hold/resume interrupts, but I don't know if that is any better than
doing it at the required place.  I am not sure if all places are
careful whether they really want to hold interrupts, so if we provide
a new parameter at least new users of API will think about it.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-12 Thread Tom Lane

Amit Kapila  writes:
> On Wed, Feb 12, 2020 at 7:36 AM Masahiko Sawada
>  wrote:
>> On Wed, 12 Feb 2020 at 00:43, Tom Lane  wrote:
>>> I would like to suggest that we do something similar to Robert Haas'
>>> excellent hack (daa7527af) for the !HAVE_SPINLOCK case in lmgr/spin.c,

>> My original proposal used LWLocks and hash tables for relation
>> extension but there was a discussion that using LWLocks is not good
>> because it's not interruptible[1].

> Hmm, but we use LWLocks for (a) WALWrite/Flush (see the usage of
> WALWriteLock), (b) writing the shared buffer contents (see
> io_in_progress lock and its usage in FlushBuffer) and might be for few
> other similar stuff.  Many times those take more time than extending a
> block in relation especially when we combine the WAL write for
> multiple commits.  So, if this is a problem for relation extension
> lock, then the same thing holds true there also.

Yeah.  I would say a couple more things:

* I see no reason to think that a relation extension lock would ever
be held long enough for noninterruptibility to be a real issue.  Our
expectations for query cancel response time are in the tens to
hundreds of msec anyway.

* There are other places where an LWLock can be held for a *long* time,
notably the CheckpointLock.  If we do think this is an issue, we could
devise a way to not insist on noninterruptibility.  The easiest fix
is just to do a matching RESUME_INTERRUPTS after getting the lock and
HOLD_INTERRUPTS again before releasing it; though maybe it'd be worth
offering some slightly cleaner way.  Point here is that LWLockAcquire
only does that because it's useful to the majority of callers, not
because it's graven in stone that it must be like that.

In general, if we think there are issues with LWLock, it seems to me
we'd be better off to try to fix them, not to invent a whole new
single-purpose lock manager that we'll have to debug and maintain.
I do not see anything about this problem that suggests that that would
provide a major win.  As Andres has noted, there are lots of other
aspects of it that are likely to be more useful to spend effort on.

regards, tom lane

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-12 Thread Amit Kapila

On Wed, Feb 12, 2020 at 10:24 AM Andres Freund  wrote:
>
> Hi,
>
> On 2020-02-11 08:01:34 +0530, Amit Kapila wrote:
> > I don't see much downside with the patch, rather there is a
> > performance increase of 3-9% in various scenarios.
>
> As I wrote in [1] I started to look at this patch. My problem with itis
> that it just seems like the wrong direction architecturally to
> me. There's two main aspects to this:
>
> 1) It basically builds a another, more lightweight but less capable, of
>a lock manager that can lock more objects than we can have distinct
>locks for.  It is faster because it uses *one* hashtable without
>conflict handling, because it has fewer lock modes, and because it
>doesn't support detecting deadlocks. And probably some other things.
>
> 2) A lot of the contention around file extension comes from us doing
>multiple expensive things under one lock (determining current
>relation size, searching victim buffer, extending file), and in tiny
>increments (growing a 1TB table by 8kb). This patch doesn't address
>that at all.
>

It seems to me both the two points try to address the performance
angle of the patch, but here our actual intention was to make this
lock block among parallel workers so that we can implement/improve
some of the parallel writes operations (like parallelly vacuuming the
heap or index, parallel bulk load, etc.).  Both independently are
worth accomplishing, but not w.r.t parallel writes.  Here, we were
doing some benchmarking to see if we haven't regressed performance in
any cases.

> I've focused on 1) in the email referenced above ([1]). Here I'll focus
> on 2).
>
>
>
> Which, I think, pretty clearly shows a few things:
>

I agree with all your below observations.

> 1) It's crucial to move acquiring a victim buffer to the outside of the
>extension lock, as for copy acquiring the victim buffer will commonly
>cause a buffer having to be written out, due to the ringbuffer. This
>is even more crucial when using a logged table, as the writeout then
>also will often also trigger a WAL flush.
>
>While doing so will sometimes add a round of acquiring the buffer
>mapping locks, having to do the FlushBuffer while holding the
>extension lock is a huge problem.
>
>This'd also move a good bit of the cost of finding (i.e. clock sweep
>/ ringbuffer replacement) and invalidating the old buffer mapping out
>of the lock.
>

I think this mostly because of the way currently code is arranged to
extend a block via ReadBuffer* API.  IIUC, currently the main
operations under relation extension lock are as follows:
a. get the block number for extension via smgrnblocks.
b. find victim buffer
c. associate buffer with the block no. found in step-a.
d. initialize the block with zeros
e. write the block
f.  PageInit

I think if we can rearrange such that steps b and c can be done after
e or f, then we don't need to hold the extension lock to find the
victim buffer.

> 2) We need to make the smgrwrite more efficient, it is costing a lot of
>time. A small additional experiment shows the cost of doing 8kb
>writes:
>
>I wrote a small program that just iteratively writes a 32GB file:
>
..
>
>
>So it looks like extending the file with posix_fallocate() might be a
>winner, but only if we actually can do so in larger chunks than 8kb
>at once.
>

A good experiment and sounds like worth doing.

>
>
> 3) We should move the PageInit() that's currently done with the
>extension lock held, to the outside. Since we get the buffer with
>RBM_ZERO_AND_LOCK these days, that should be safe.  Also, we don't
>need to zero the entire buffer both in RelationGetBufferForTuple()'s
>PageInit(), and in ReadBuffer_common() before calling smgrextend().
>

Agreed.

I feel all three are independent improvements and can be done separately.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-12 Thread Amit Kapila

On Wed, Feb 12, 2020 at 7:36 AM Masahiko Sawada
 wrote:
>
> On Wed, 12 Feb 2020 at 00:43, Tom Lane  wrote:
> >
> > I took a brief look through this patch.  I agree with the fundamental
> > idea that we shouldn't need to use the heavyweight lock manager for
> > relation extension, since deadlock is not a concern and no backend
> > should ever need to hold more than one such lock at once.  But it feels
> > to me like this particular solution is rather seriously overengineered.
> > I would like to suggest that we do something similar to Robert Haas'
> > excellent hack (daa7527af) for the !HAVE_SPINLOCK case in lmgr/spin.c,
> > that is,
> >
> > * Create some predetermined number N of LWLocks for relation extension.
>
> My original proposal used LWLocks and hash tables for relation
> extension but there was a discussion that using LWLocks is not good
> because it's not interruptible[1]. Because of this reason and that we
> don't need to have two lock level (shared, exclusive) for relation
> extension lock we ended up with implementing dedicated lock manager
> for extension lock. I think we will have that problem if we use LWLocks.
>

Hmm, but we use LWLocks for (a) WALWrite/Flush (see the usage of
WALWriteLock), (b) writing the shared buffer contents (see
io_in_progress lock and its usage in FlushBuffer) and might be for few
other similar stuff.  Many times those take more time than extending a
block in relation especially when we combine the WAL write for
multiple commits.  So, if this is a problem for relation extension
lock, then the same thing holds true there also.  Now, there are cases
like when we extend the relation with multiple blocks, finding victim
buffer under this lock, etc. where this can be also equally or more
costly, but I think we can improve some of those cases (some of this
is even pointed by Andres in his email) if we agree on a fundamental
idea of using LWLocks as proposed by Tom.   I am not telling that we
implement Tom's idea without weighing its pros and cons, but it has an
appeal due to its simplicity.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-11 Thread Masahiko Sawada

On Tue, 11 Feb 2020 at 11:31, Amit Kapila  wrote:
>
> On Wed, Feb 5, 2020 at 12:07 PM Masahiko Sawada  wrote:
> >
> >
> > Unfortunately the environment I used for performance verification is
> > no longer available.
> >
> > I agree to run this test in a different environment. I've attached the
> > rebased version patch. I'm measuring the performance with/without
> > patch, so will share the results.
> >
>
> Did you get a chance to run these tests?  Lately, Mahendra has done a
> lot of performance testing of this patch and shared his results.  I
> don't see much downside with the patch, rather there is a performance
> increase of 3-9% in various scenarios.

I've done performance tests on my laptop while changing the number of
partitions. 4 clients concurrently insert 32 tuples to randomly
selected partitions in a transaction. Therefore by changing the number
of partition the contention of relation extension lock would also be
changed. All tables are unlogged tables and N_RELEXTLOCK_ENTS is 1024.

Here is my test results:

* HEAD
nchilds = 64 tps = 33135
nchilds = 128 tps = 31249
nchilds = 256 tps = 29356

* Patched
nchilds = 64 tps = 32057
nchilds = 128 tps = 32426
nchilds = 256 tps = 29483

The performance has been slightly improved by the patch in two cases.
I've also attached the shell script I used to test.

When I set N_RELEXTLOCK_ENTS to 1 so that all relation locks conflicts
the result is:

nchilds = 64 tps = 30887
nchilds = 128 tps = 30015
nchilds = 256 tps = 27837

Regards,

-- 
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
NDATA=32
NTRY=2
NCLIENTS=4
TIME=120

for childs in 64 128 256
do
pg_ctl stop -mi
rm -r $PGDATA
initdb -E UTF8 --no-locale
cat <> $PGDATA/postgresql.conf
shared_buffers = 512MB
max_wal_size = 20GB
checkpoint_timeout = 1h
EOF
pg_ctl start

psql -c "create unlogged table parent (c int) partition by list(c)"

cat < /dev/null
select 'create unlogged table p' || i || ' partition of parent for values in (' || i || ')' from generate_series(0,$childs) i; \gexec
EOF

echo "insert into parent select ((random() * 1000)::int % $childs) from generate_series(1,$NDATA)" > data.sql

pgbench -i -n postgres
avg=0
total=0
for t in `seq 1 $NTRY`
do
	tps=`bin/pgbench -T $TIME -c ${NCLIENTS} -f data.sql -n  postgres | grep "excluding" | cut -d " " -f 3`
	echo "CHILDS = $childs, TRIS = $t, TPS = $tps"
	total=$(echo "$tps + $total" | bc)
done
avg=$(echo "$total / $NTRY" | bc)
echo "nchilds = $childs tps = $avg" >> result_${1}.txt
done

pg_ctl stop

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-11 Thread Andres Freund

Hi,

On 2020-02-11 08:01:34 +0530, Amit Kapila wrote:
> I don't see much downside with the patch, rather there is a
> performance increase of 3-9% in various scenarios.

As I wrote in [1] I started to look at this patch. My problem with itis
that it just seems like the wrong direction architecturally to
me. There's two main aspects to this:

1) It basically builds a another, more lightweight but less capable, of
   a lock manager that can lock more objects than we can have distinct
   locks for.  It is faster because it uses *one* hashtable without
   conflict handling, because it has fewer lock modes, and because it
   doesn't support detecting deadlocks. And probably some other things.

2) A lot of the contention around file extension comes from us doing
   multiple expensive things under one lock (determining current
   relation size, searching victim buffer, extending file), and in tiny
   increments (growing a 1TB table by 8kb). This patch doesn't address
   that at all.

I've focused on 1) in the email referenced above ([1]). Here I'll focus
on 2).

To quantify my concerns I instrumented postgres to measure the time for
various operations that are part of extending a file (all per
process). The hardware is a pretty fast nvme, with unlogged tables, on a
20/40 core/threads machine. The workload is copying a scale 10
pgbench_accounts into an unindexed, unlogged table using pgbench.

Here are the instrumentations for various client counts, when just
measuring 20s:

1 client:
LOG:  extension time: lock wait: 0.00 lock held: 3.19 filesystem: 1.29 
buffersearch: 1.58

2 clients:
LOG:  extension time: lock wait: 0.47 lock held: 2.99 filesystem: 1.24 
buffersearch: 1.43
LOG:  extension time: lock wait: 0.60 lock held: 3.05 filesystem: 1.23 
buffersearch: 1.50

4 clients:
LOG:  extension time: lock wait: 3.92 lock held: 2.69 filesystem: 1.10 
buffersearch: 1.29
LOG:  extension time: lock wait: 4.40 lock held: 2.02 filesystem: 0.81 
buffersearch: 0.93
LOG:  extension time: lock wait: 3.86 lock held: 2.59 filesystem: 1.06 
buffersearch: 1.22
LOG:  extension time: lock wait: 4.00 lock held: 2.65 filesystem: 1.08 
buffersearch: 1.26

8 clients:
LOG:  extension time: lock wait: 6.94 lock held: 1.74 filesystem: 0.70 
buffersearch: 0.80
LOG:  extension time: lock wait: 7.16 lock held: 1.81 filesystem: 0.73 
buffersearch: 0.82
LOG:  extension time: lock wait: 6.93 lock held: 1.95 filesystem: 0.80 
buffersearch: 0.89
LOG:  extension time: lock wait: 7.08 lock held: 1.87 filesystem: 0.76 
buffersearch: 0.86
LOG:  extension time: lock wait: 6.95 lock held: 1.95 filesystem: 0.80 
buffersearch: 0.89
LOG:  extension time: lock wait: 6.88 lock held: 2.01 filesystem: 0.83 
buffersearch: 0.93
LOG:  extension time: lock wait: 6.94 lock held: 2.02 filesystem: 0.82 
buffersearch: 0.93
LOG:  extension time: lock wait: 7.02 lock held: 1.95 filesystem: 0.80 
buffersearch: 0.89

16 clients:
LOG:  extension time: lock wait: 10.37 lock held: 0.88 filesystem: 0.36 
buffersearch: 0.39
LOG:  extension time: lock wait: 10.53 lock held: 0.90 filesystem: 0.37 
buffersearch: 0.40
LOG:  extension time: lock wait: 10.72 lock held: 1.01 filesystem: 0.42 
buffersearch: 0.45
LOG:  extension time: lock wait: 10.45 lock held: 1.25 filesystem: 0.52 
buffersearch: 0.55
LOG:  extension time: lock wait: 10.66 lock held: 0.94 filesystem: 0.38 
buffersearch: 0.41
LOG:  extension time: lock wait: 10.50 lock held: 1.27 filesystem: 0.53 
buffersearch: 0.56
LOG:  extension time: lock wait: 10.53 lock held: 1.19 filesystem: 0.49 
buffersearch: 0.53
LOG:  extension time: lock wait: 10.57 lock held: 1.22 filesystem: 0.50 
buffersearch: 0.53
LOG:  extension time: lock wait: 10.72 lock held: 1.17 filesystem: 0.48 
buffersearch: 0.52
LOG:  extension time: lock wait: 10.67 lock held: 1.32 filesystem: 0.55 
buffersearch: 0.58
LOG:  extension time: lock wait: 10.95 lock held: 0.92 filesystem: 0.38 
buffersearch: 0.40
LOG:  extension time: lock wait: 10.81 lock held: 1.24 filesystem: 0.51 
buffersearch: 0.56
LOG:  extension time: lock wait: 10.62 lock held: 1.27 filesystem: 0.53 
buffersearch: 0.56
LOG:  extension time: lock wait: 11.14 lock held: 0.94 filesystem: 0.38 
buffersearch: 0.41
LOG:  extension time: lock wait: 11.20 lock held: 0.96 filesystem: 0.39 
buffersearch: 0.42
LOG:  extension time: lock wait: 10.75 lock held: 1.41 filesystem: 0.58 
buffersearch: 0.63
0.88 + 0.90 + 1.01 + 1.25 + 0.94 + 1.27 + 1.19 + 1.22 + 1.17 + 1.32 + 0.92 + 
1.24 + 1.27 + 0.94 + 0.96 + 1.41
in *none* of these cases the drive gets even close to being
saturated. Like not even 1/3.

If you consider the total time with the lock held, and the total time of
the test, it becomes very quickly obvious that pretty quickly we spend
the majority of the total time with the lock held.
client count 1: 3.18/20 = 0.16
client count 2: 6.04/20 = 0.30
client count 4: 9.95/20 = 0.50
client count 8: 15.30/20 = 0.76
client count 16: 17.89/20 = 0.89

In other words, the reason that relation extension

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-11 Thread Masahiko Sawada

On Wed, 12 Feb 2020 at 00:43, Tom Lane  wrote:
>
> I took a brief look through this patch.  I agree with the fundamental
> idea that we shouldn't need to use the heavyweight lock manager for
> relation extension, since deadlock is not a concern and no backend
> should ever need to hold more than one such lock at once.  But it feels
> to me like this particular solution is rather seriously overengineered.
> I would like to suggest that we do something similar to Robert Haas'
> excellent hack (daa7527af) for the !HAVE_SPINLOCK case in lmgr/spin.c,
> that is,
>
> * Create some predetermined number N of LWLocks for relation extension.

My original proposal used LWLocks and hash tables for relation
extension but there was a discussion that using LWLocks is not good
because it's not interruptible[1]. Because of this reason and that we
don't need to have two lock level (shared, exclusive) for relation
extension lock we ended up with implementing dedicated lock manager
for extension lock. I think we will have that problem if we use LWLocks.

Regards,

[1] 
https://www.postgresql.org/message-id/CA%2BTgmoZnWYQvmeqeGyY%2B0j-Tfmx8cTzRadfxJQwK9A-nCQ7GkA%40mail.gmail.com

--
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-11 Thread Tom Lane

I took a brief look through this patch.  I agree with the fundamental
idea that we shouldn't need to use the heavyweight lock manager for
relation extension, since deadlock is not a concern and no backend
should ever need to hold more than one such lock at once.  But it feels
to me like this particular solution is rather seriously overengineered.
I would like to suggest that we do something similar to Robert Haas'
excellent hack (daa7527af) for the !HAVE_SPINLOCK case in lmgr/spin.c,
that is,

* Create some predetermined number N of LWLocks for relation extension.
* When we want to extend some relation R, choose one of those locks
  (say, R's relfilenode number mod N) and lock it.

1. As long as all backends agree on the relation-to-lock mapping, this
provides full security against concurrent extensions of the same
relation.

2. Occasionally a backend will be blocked when it doesn't need to be,
because of false sharing of a lock between two relations that need to
be extended at the same time.  But as long as N is large enough (and
I doubt that it needs to be very large), that will be a negligible
penalty.

3. Aside from being a lot simpler than the proposed extension_lock.c,
this approach involves absolutely negligible overhead beyond the raw
LWLockAcquire and LWLockRelease calls.  I suspect therefore that in
typical noncontended cases it will be faster.  It also does not require
any new resource management overhead, thus eliminating this patch's
small but real penalty on transaction exit/cleanup.

We'd need to do a bit of performance testing to choose a good value
for N.  I think that with N comparable to MaxBackends, the odds of
false sharing being a problem would be quite negligible ... but it
could be that we could get away with a lot less than that.

regards, tom lane

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-10 Thread Amit Kapila

On Wed, Feb 5, 2020 at 12:07 PM Masahiko Sawada  wrote:
>
>
> Unfortunately the environment I used for performance verification is
> no longer available.
>
> I agree to run this test in a different environment. I've attached the
> rebased version patch. I'm measuring the performance with/without
> patch, so will share the results.
>

Did you get a chance to run these tests?  Lately, Mahendra has done a
lot of performance testing of this patch and shared his results.  I
don't see much downside with the patch, rather there is a performance
increase of 3-9% in various scenarios.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-10 Thread Amit Kapila

On Mon, Feb 10, 2020 at 10:28 PM Mahendra Singh Thalor
 wrote:
>
> On Sat, 8 Feb 2020 at 00:27, Mahendra Singh Thalor  wrote:
> >
> > On Thu, 6 Feb 2020 at 09:44, Amit Kapila  wrote:
> > >
> > >
> > > The number at 56 and 74 client count seem slightly suspicious.   Can
> > > you please repeat those tests?  Basically, I am not able to come up
> > > with a theory why at 56 clients the performance with the patch is a
> > > bit lower and then at 74 it is higher.
> >
> > Okay. I will repeat test.
>
> I re-tested in different machine because in previous machine, results are  
> in-consistent
>

Thanks for doing detailed tests.

> My testing machine:
> $ lscpu
> Architecture:  ppc64le
> Byte Order:Little Endian
> CPU(s):192
> On-line CPU(s) list:   0-191
> Thread(s) per core:8
> Core(s) per socket:1
> Socket(s): 24
> NUMA node(s):  4
> Model: IBM,8286-42A
> L1d cache: 64K
> L1i cache: 32K
> L2 cache:  512K
> L3 cache:  8192K
> NUMA node0 CPU(s): 0-47
> NUMA node1 CPU(s): 48-95
> NUMA node2 CPU(s): 96-143
> NUMA node3 CPU(s): 144-191
>
> ./pgbench -c $threads -j $threads -T 180 -f insert1.sql@1 -f insert2.sql@1 -f 
> insert3.sql@1 -f insert4.sql@1 postgres
>
> ClientsHEAD(tps)With v14 patch(tps)%change   
> (time: 180s)
> 1 41.491486  41.375532  -0.27%
> 32  335.138568330.028739 -1.52%
> 56 353.783930 360.883710  +2.00%
> 60 341.741925 359.028041 +5.05%
> 64 338.521730 356.511423  +5.13%
> 66 339.838921 352.761766  +3.80%
> 70339.305454  353.658425+4.23%
> 74332.016217  348.809042 +5.05%
>
> From above results, it seems that there is very little regression with the 
> patch(+-5%) that can be run to run variation.
>

Hmm, I don't see 5% regression, rather it is a performance gain of ~5%
with the patch?  When we use regression, that indicates with the patch
performance (TPS) is reduced, but I don't see that in the above
numbers.  Kindly clarify.

> >
> > >
> > > > I want to test extension lock by blocking use of fsm(use_fsm=false in 
> > > > code).  I think, if we block use of fsm, then load will increase into 
> > > > extension lock.  Is this correct way to test?
> > > >
> > >
> > > Hmm, I think instead of directly hacking the code, you might want to
> > > use the operation (probably cluster or vacuum full) where we set
> > > HEAP_INSERT_SKIP_FSM.  I think along with this you can try with
> > > unlogged tables because that might stress the extension lock.
> >
> > Okay. I will test.
>
> I tested with unlogged tables also.  There also I was getting 3-6% gain in 
> tps.
>
> >
> > >
> > > In the above test, you might want to test with a higher number of
> > > partitions (say up to 100) as well.  Also, see if you want to use the
> > > Copy command.
> >
> > Okay. I will test.
>
> I tested with 500, 1000, 2000 paratitions. I observed max +5% regress in the 
> tps and there was no performace degradation.
>

Again, I am not sure if you see performance dip here.  I think your
usage of the word 'regression' is not correct or at least confusing.

> For example:
> I created a table with 2000 paratitions and then I checked false sharing.
> Slot NumberSlot Freq.Slot NumberSlot Freq.Slot NumberSlot Freq.
> 156139731144610
> 62713521048810
> 782121031050110
> 812121131070110
> 192111751073710
> 221112351075410
> 367112541078110
> 546113141079010
> 814114191083310
> 917114241088810
>
> From above table, we can see that total 13 child tables are falling in same 
> backet (slot 156) so I did bulk-loading only in those 13 child tables to 
> check tps in false sharing but I noticed that there was no performance 
> degradation.
>

Okay.  Is it possible to share these numbers and scripts?

Thanks for doing the detailed tests for this patch.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-10 Thread Mahendra Singh Thalor

On Sat, 8 Feb 2020 at 00:27, Mahendra Singh Thalor 
wrote:
>
> On Thu, 6 Feb 2020 at 09:44, Amit Kapila  wrote:
> >
> > On Thu, Feb 6, 2020 at 1:57 AM Mahendra Singh Thalor 
wrote:
> > >
> > > On Wed, 5 Feb 2020 at 12:07, Masahiko Sawada 
wrote:
> > > >
> > > > On Mon, Feb 3, 2020 at 8:03 PM Amit Kapila 
wrote:
> > > > >
> > > > > On Tue, Jun 26, 2018 at 12:47 PM Masahiko Sawada <
sawada.m...@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Apr 27, 2018 at 4:25 AM, Robert Haas <
robertmh...@gmail.com> wrote:
> > > > > > > On Thu, Apr 26, 2018 at 3:10 PM, Andres Freund <
and...@anarazel.de> wrote:
> > > > > > >>> I think the real question is whether the scenario is common
enough to
> > > > > > >>> worry about.  In practice, you'd have to be extremely
unlucky to be
> > > > > > >>> doing many bulk loads at the same time that all happened to
hash to
> > > > > > >>> the same bucket.
> > > > > > >>
> > > > > > >> With a bunch of parallel bulkloads into partitioned tables
that really
> > > > > > >> doesn't seem that unlikely?
> > > > > > >
> > > > > > > It increases the likelihood of collisions, but probably
decreases the
> > > > > > > number of cases where the contention gets really bad.
> > > > > > >
> > > > > > > For example, suppose each table has 100 partitions and you are
> > > > > > > bulk-loading 10 of them at a time.  It's virtually certain
that you
> > > > > > > will have some collisions, but the amount of contention
within each
> > > > > > > bucket will remain fairly low because each backend spends
only 1% of
> > > > > > > its time in the bucket corresponding to any given partition.
> > > > > > >
> > > > > >
> > > > > > I share another result of performance evaluation between
current HEAD
> > > > > > and current HEAD with v13 patch(N_RELEXTLOCK_ENTS = 1024).
> > > > > >
> > > > > > Type of table: normal table, unlogged table
> > > > > > Number of child tables : 16, 64 (all tables are located on the
same tablespace)
> > > > > > Number of clients : 32
> > > > > > Number of trials : 100
> > > > > > Duration: 180 seconds for each trials
> > > > > >
> > > > > > The hardware spec of server is Intel Xeon 2.4GHz (HT 160cores),
256GB
> > > > > > RAM, NVMe SSD 1.5TB.
> > > > > > Each clients load 10kB random data across all partitioned
tables.
> > > > > >
> > > > > > Here is the result.
> > > > > >
> > > > > >  childs |   type   | target  |  avg_tps   | diff with HEAD
> > > > > > +--+-++--
> > > > > >  16 | normal   | HEAD|   1643.833 |
> > > > > >  16 | normal   | Patched |  1619.5404 |  0.985222
> > > > > >  16 | unlogged | HEAD|  9069.3543 |
> > > > > >  16 | unlogged | Patched |  9368.0263 |  1.032932
> > > > > >  64 | normal   | HEAD|   1598.698 |
> > > > > >  64 | normal   | Patched |  1587.5906 |  0.993052
> > > > > >  64 | unlogged | HEAD|  9629.7315 |
> > > > > >  64 | unlogged | Patched | 10208.2196 |  1.060073
> > > > > > (8 rows)
> > > > > >
> > > > > > For normal tables, loading tps decreased 1% ~ 2% with this patch
> > > > > > whereas it increased 3% ~ 6% for unlogged tables. There were
> > > > > > collisions at 0 ~ 5 relation extension lock slots between 2
relations
> > > > > > in the 64 child tables case but it didn't seem to affect the
tps.
> > > > > >
> > > > >
> > > > > AFAIU, this resembles the workload that Andres was worried about.
  I
> > > > > think we should once run this test in a different environment, but
> > > > > considering this to be correct and repeatable, where do we go with
> > > > > this patch especially when we know it improves many workloads [1]
as
> > > > > well.  We know that on a pathological case constructed by Mithun
[2],
> > > > > this causes regression as well.  I am not sure if the test done by
> > > > > Mithun really mimics any real-world workload as he has tested by
> > > > > making N_RELEXTLOCK_ENTS = 1 to hit the worst case.
> > > > >
> > > > > Sawada-San, if you have a script or data for the test done by you,
> > > > > then please share it so that others can also try to reproduce it.
> > > >
> > > > Unfortunately the environment I used for performance verification is
> > > > no longer available.
> > > >
> > > > I agree to run this test in a different environment. I've attached
the
> > > > rebased version patch. I'm measuring the performance with/without
> > > > patch, so will share the results.
> > > >
> > >
> > > Thanks Sawada-san for patch.
> > >
> > > From last few days, I was reading this thread and was reviewing v13
patch.  To debug and test, I did re-base of v13 patch. I compared my
re-based patch and v14 patch. I think,  ordering of header files is not
alphabetically in v14 patch. (I haven't reviewed v14 patch fully because
before review, I wanted to test false sharing).  While debugging, I didn't
noticed any hang or lock related issue.
> > >
> > > I did some testing to test false sharing(bulk insert, COPY data, bulk
insert into partitions

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-07 Thread Mahendra Singh Thalor

On Thu, 6 Feb 2020 at 09:44, Amit Kapila  wrote:
>
> On Thu, Feb 6, 2020 at 1:57 AM Mahendra Singh Thalor 
wrote:
> >
> > On Wed, 5 Feb 2020 at 12:07, Masahiko Sawada 
wrote:
> > >
> > > On Mon, Feb 3, 2020 at 8:03 PM Amit Kapila 
wrote:
> > > >
> > > > On Tue, Jun 26, 2018 at 12:47 PM Masahiko Sawada <
sawada.m...@gmail.com> wrote:
> > > > >
> > > > > On Fri, Apr 27, 2018 at 4:25 AM, Robert Haas <
robertmh...@gmail.com> wrote:
> > > > > > On Thu, Apr 26, 2018 at 3:10 PM, Andres Freund <
and...@anarazel.de> wrote:
> > > > > >>> I think the real question is whether the scenario is common
enough to
> > > > > >>> worry about.  In practice, you'd have to be extremely unlucky
to be
> > > > > >>> doing many bulk loads at the same time that all happened to
hash to
> > > > > >>> the same bucket.
> > > > > >>
> > > > > >> With a bunch of parallel bulkloads into partitioned tables
that really
> > > > > >> doesn't seem that unlikely?
> > > > > >
> > > > > > It increases the likelihood of collisions, but probably
decreases the
> > > > > > number of cases where the contention gets really bad.
> > > > > >
> > > > > > For example, suppose each table has 100 partitions and you are
> > > > > > bulk-loading 10 of them at a time.  It's virtually certain that
you
> > > > > > will have some collisions, but the amount of contention within
each
> > > > > > bucket will remain fairly low because each backend spends only
1% of
> > > > > > its time in the bucket corresponding to any given partition.
> > > > > >
> > > > >
> > > > > I share another result of performance evaluation between current
HEAD
> > > > > and current HEAD with v13 patch(N_RELEXTLOCK_ENTS = 1024).
> > > > >
> > > > > Type of table: normal table, unlogged table
> > > > > Number of child tables : 16, 64 (all tables are located on the
same tablespace)
> > > > > Number of clients : 32
> > > > > Number of trials : 100
> > > > > Duration: 180 seconds for each trials
> > > > >
> > > > > The hardware spec of server is Intel Xeon 2.4GHz (HT 160cores),
256GB
> > > > > RAM, NVMe SSD 1.5TB.
> > > > > Each clients load 10kB random data across all partitioned tables.
> > > > >
> > > > > Here is the result.
> > > > >
> > > > >  childs |   type   | target  |  avg_tps   | diff with HEAD
> > > > > +--+-++--
> > > > >  16 | normal   | HEAD|   1643.833 |
> > > > >  16 | normal   | Patched |  1619.5404 |  0.985222
> > > > >  16 | unlogged | HEAD|  9069.3543 |
> > > > >  16 | unlogged | Patched |  9368.0263 |  1.032932
> > > > >  64 | normal   | HEAD|   1598.698 |
> > > > >  64 | normal   | Patched |  1587.5906 |  0.993052
> > > > >  64 | unlogged | HEAD|  9629.7315 |
> > > > >  64 | unlogged | Patched | 10208.2196 |  1.060073
> > > > > (8 rows)
> > > > >
> > > > > For normal tables, loading tps decreased 1% ~ 2% with this patch
> > > > > whereas it increased 3% ~ 6% for unlogged tables. There were
> > > > > collisions at 0 ~ 5 relation extension lock slots between 2
relations
> > > > > in the 64 child tables case but it didn't seem to affect the tps.
> > > > >
> > > >
> > > > AFAIU, this resembles the workload that Andres was worried about.
I
> > > > think we should once run this test in a different environment, but
> > > > considering this to be correct and repeatable, where do we go with
> > > > this patch especially when we know it improves many workloads [1] as
> > > > well.  We know that on a pathological case constructed by Mithun
[2],
> > > > this causes regression as well.  I am not sure if the test done by
> > > > Mithun really mimics any real-world workload as he has tested by
> > > > making N_RELEXTLOCK_ENTS = 1 to hit the worst case.
> > > >
> > > > Sawada-San, if you have a script or data for the test done by you,
> > > > then please share it so that others can also try to reproduce it.
> > >
> > > Unfortunately the environment I used for performance verification is
> > > no longer available.
> > >
> > > I agree to run this test in a different environment. I've attached the
> > > rebased version patch. I'm measuring the performance with/without
> > > patch, so will share the results.
> > >
> >
> > Thanks Sawada-san for patch.
> >
> > From last few days, I was reading this thread and was reviewing v13
patch.  To debug and test, I did re-base of v13 patch. I compared my
re-based patch and v14 patch. I think,  ordering of header files is not
alphabetically in v14 patch. (I haven't reviewed v14 patch fully because
before review, I wanted to test false sharing).  While debugging, I didn't
noticed any hang or lock related issue.
> >
> > I did some testing to test false sharing(bulk insert, COPY data, bulk
insert into partitions tables).  Below is the testing summary.
> >
> > Test setup(Bulk insert into partition tables):
> > autovacuum=off
> > shared_buffers=512MB -c max_wal_size=20GB -c checkpoint_timeout=12min
> >
> > Basically, I created a table with 13

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-05 Thread Masahiko Sawada

On Thu, 6 Feb 2020 at 13:16, Amit Kapila  wrote:
>
> On Tue, Jun 26, 2018 at 12:47 PM Masahiko Sawada  
> wrote:
> >
> > Type of table: normal table, unlogged table
> > Number of child tables : 16, 64 (all tables are located on the same 
> > tablespace)
> > Number of clients : 32
> > Number of trials : 100
> > Duration: 180 seconds for each trials
> >
> > The hardware spec of server is Intel Xeon 2.4GHz (HT 160cores), 256GB
> > RAM, NVMe SSD 1.5TB.
> > Each clients load 10kB random data across all partitioned tables.
> >
> > Here is the result.
> >
> >  childs |   type   | target  |  avg_tps   | diff with HEAD
> > +--+-++--
> >  16 | normal   | HEAD|   1643.833 |
> >  16 | normal   | Patched |  1619.5404 |  0.985222
> >  16 | unlogged | HEAD|  9069.3543 |
> >  16 | unlogged | Patched |  9368.0263 |  1.032932
> >  64 | normal   | HEAD|   1598.698 |
> >  64 | normal   | Patched |  1587.5906 |  0.993052
> >  64 | unlogged | HEAD|  9629.7315 |
> >  64 | unlogged | Patched | 10208.2196 |  1.060073
> > (8 rows)
> >
> > For normal tables, loading tps decreased 1% ~ 2% with this patch
> > whereas it increased 3% ~ 6% for unlogged tables. There were
> > collisions at 0 ~ 5 relation extension lock slots between 2 relations
> > in the 64 child tables case but it didn't seem to affect the tps.
> >
>
> How did you measure the collisions in this test?  I think it is better
> if Mahendra can also use the same technique in measuring that count.
>

I created a created a SQL function that returns the hash value of the
lock tag, which is tag_hash(locktag, sizeof(RelExtLockTag)) %
N_RELEXTLOCK_ENTS. And examined the hash values of all tables.

Regards,

--
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-05 Thread Amit Kapila

On Tue, Jun 26, 2018 at 12:47 PM Masahiko Sawada  wrote:
>
> Type of table: normal table, unlogged table
> Number of child tables : 16, 64 (all tables are located on the same 
> tablespace)
> Number of clients : 32
> Number of trials : 100
> Duration: 180 seconds for each trials
>
> The hardware spec of server is Intel Xeon 2.4GHz (HT 160cores), 256GB
> RAM, NVMe SSD 1.5TB.
> Each clients load 10kB random data across all partitioned tables.
>
> Here is the result.
>
>  childs |   type   | target  |  avg_tps   | diff with HEAD
> +--+-++--
>  16 | normal   | HEAD|   1643.833 |
>  16 | normal   | Patched |  1619.5404 |  0.985222
>  16 | unlogged | HEAD|  9069.3543 |
>  16 | unlogged | Patched |  9368.0263 |  1.032932
>  64 | normal   | HEAD|   1598.698 |
>  64 | normal   | Patched |  1587.5906 |  0.993052
>  64 | unlogged | HEAD|  9629.7315 |
>  64 | unlogged | Patched | 10208.2196 |  1.060073
> (8 rows)
>
> For normal tables, loading tps decreased 1% ~ 2% with this patch
> whereas it increased 3% ~ 6% for unlogged tables. There were
> collisions at 0 ~ 5 relation extension lock slots between 2 relations
> in the 64 child tables case but it didn't seem to affect the tps.
>

How did you measure the collisions in this test?  I think it is better
if Mahendra can also use the same technique in measuring that count.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-05 Thread Amit Kapila

On Thu, Feb 6, 2020 at 1:57 AM Mahendra Singh Thalor  wrote:
>
> On Wed, 5 Feb 2020 at 12:07, Masahiko Sawada  wrote:
> >
> > On Mon, Feb 3, 2020 at 8:03 PM Amit Kapila  wrote:
> > >
> > > On Tue, Jun 26, 2018 at 12:47 PM Masahiko Sawada  
> > > wrote:
> > > >
> > > > On Fri, Apr 27, 2018 at 4:25 AM, Robert Haas  
> > > > wrote:
> > > > > On Thu, Apr 26, 2018 at 3:10 PM, Andres Freund  
> > > > > wrote:
> > > > >>> I think the real question is whether the scenario is common enough 
> > > > >>> to
> > > > >>> worry about.  In practice, you'd have to be extremely unlucky to be
> > > > >>> doing many bulk loads at the same time that all happened to hash to
> > > > >>> the same bucket.
> > > > >>
> > > > >> With a bunch of parallel bulkloads into partitioned tables that 
> > > > >> really
> > > > >> doesn't seem that unlikely?
> > > > >
> > > > > It increases the likelihood of collisions, but probably decreases the
> > > > > number of cases where the contention gets really bad.
> > > > >
> > > > > For example, suppose each table has 100 partitions and you are
> > > > > bulk-loading 10 of them at a time.  It's virtually certain that you
> > > > > will have some collisions, but the amount of contention within each
> > > > > bucket will remain fairly low because each backend spends only 1% of
> > > > > its time in the bucket corresponding to any given partition.
> > > > >
> > > >
> > > > I share another result of performance evaluation between current HEAD
> > > > and current HEAD with v13 patch(N_RELEXTLOCK_ENTS = 1024).
> > > >
> > > > Type of table: normal table, unlogged table
> > > > Number of child tables : 16, 64 (all tables are located on the same 
> > > > tablespace)
> > > > Number of clients : 32
> > > > Number of trials : 100
> > > > Duration: 180 seconds for each trials
> > > >
> > > > The hardware spec of server is Intel Xeon 2.4GHz (HT 160cores), 256GB
> > > > RAM, NVMe SSD 1.5TB.
> > > > Each clients load 10kB random data across all partitioned tables.
> > > >
> > > > Here is the result.
> > > >
> > > >  childs |   type   | target  |  avg_tps   | diff with HEAD
> > > > +--+-++--
> > > >  16 | normal   | HEAD|   1643.833 |
> > > >  16 | normal   | Patched |  1619.5404 |  0.985222
> > > >  16 | unlogged | HEAD|  9069.3543 |
> > > >  16 | unlogged | Patched |  9368.0263 |  1.032932
> > > >  64 | normal   | HEAD|   1598.698 |
> > > >  64 | normal   | Patched |  1587.5906 |  0.993052
> > > >  64 | unlogged | HEAD|  9629.7315 |
> > > >  64 | unlogged | Patched | 10208.2196 |  1.060073
> > > > (8 rows)
> > > >
> > > > For normal tables, loading tps decreased 1% ~ 2% with this patch
> > > > whereas it increased 3% ~ 6% for unlogged tables. There were
> > > > collisions at 0 ~ 5 relation extension lock slots between 2 relations
> > > > in the 64 child tables case but it didn't seem to affect the tps.
> > > >
> > >
> > > AFAIU, this resembles the workload that Andres was worried about.   I
> > > think we should once run this test in a different environment, but
> > > considering this to be correct and repeatable, where do we go with
> > > this patch especially when we know it improves many workloads [1] as
> > > well.  We know that on a pathological case constructed by Mithun [2],
> > > this causes regression as well.  I am not sure if the test done by
> > > Mithun really mimics any real-world workload as he has tested by
> > > making N_RELEXTLOCK_ENTS = 1 to hit the worst case.
> > >
> > > Sawada-San, if you have a script or data for the test done by you,
> > > then please share it so that others can also try to reproduce it.
> >
> > Unfortunately the environment I used for performance verification is
> > no longer available.
> >
> > I agree to run this test in a different environment. I've attached the
> > rebased version patch. I'm measuring the performance with/without
> > patch, so will share the results.
> >
>
> Thanks Sawada-san for patch.
>
> From last few days, I was reading this thread and was reviewing v13 patch.  
> To debug and test, I did re-base of v13 patch. I compared my re-based patch 
> and v14 patch. I think,  ordering of header files is not alphabetically in 
> v14 patch. (I haven't reviewed v14 patch fully because before review, I 
> wanted to test false sharing).  While debugging, I didn't noticed any hang or 
> lock related issue.
>
> I did some testing to test false sharing(bulk insert, COPY data, bulk insert 
> into partitions tables).  Below is the testing summary.
>
> Test setup(Bulk insert into partition tables):
> autovacuum=off
> shared_buffers=512MB -c max_wal_size=20GB -c checkpoint_timeout=12min
>
> Basically, I created a table with 13 partitions. Using pgbench, I inserted 
> bulk data. I used below pgbench command:
> ./pgbench -c $threads -j $threads -T 180 -f insert1.sql@1 -f insert2.sql@1 -f 
> insert3.sql@1 -f insert4.sql@1 postgres
>
> I took scripts from

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-05 Thread Mahendra Singh Thalor

On Wed, 5 Feb 2020 at 12:07, Masahiko Sawada  wrote:
>
> On Mon, Feb 3, 2020 at 8:03 PM Amit Kapila 
wrote:
> >
> > On Tue, Jun 26, 2018 at 12:47 PM Masahiko Sawada 
wrote:
> > >
> > > On Fri, Apr 27, 2018 at 4:25 AM, Robert Haas 
wrote:
> > > > On Thu, Apr 26, 2018 at 3:10 PM, Andres Freund 
wrote:
> > > >>> I think the real question is whether the scenario is common
enough to
> > > >>> worry about.  In practice, you'd have to be extremely unlucky to
be
> > > >>> doing many bulk loads at the same time that all happened to hash
to
> > > >>> the same bucket.
> > > >>
> > > >> With a bunch of parallel bulkloads into partitioned tables that
really
> > > >> doesn't seem that unlikely?
> > > >
> > > > It increases the likelihood of collisions, but probably decreases
the
> > > > number of cases where the contention gets really bad.
> > > >
> > > > For example, suppose each table has 100 partitions and you are
> > > > bulk-loading 10 of them at a time.  It's virtually certain that you
> > > > will have some collisions, but the amount of contention within each
> > > > bucket will remain fairly low because each backend spends only 1% of
> > > > its time in the bucket corresponding to any given partition.
> > > >
> > >
> > > I share another result of performance evaluation between current HEAD
> > > and current HEAD with v13 patch(N_RELEXTLOCK_ENTS = 1024).
> > >
> > > Type of table: normal table, unlogged table
> > > Number of child tables : 16, 64 (all tables are located on the same
tablespace)
> > > Number of clients : 32
> > > Number of trials : 100
> > > Duration: 180 seconds for each trials
> > >
> > > The hardware spec of server is Intel Xeon 2.4GHz (HT 160cores), 256GB
> > > RAM, NVMe SSD 1.5TB.
> > > Each clients load 10kB random data across all partitioned tables.
> > >
> > > Here is the result.
> > >
> > >  childs |   type   | target  |  avg_tps   | diff with HEAD
> > > +--+-++--
> > >  16 | normal   | HEAD|   1643.833 |
> > >  16 | normal   | Patched |  1619.5404 |  0.985222
> > >  16 | unlogged | HEAD|  9069.3543 |
> > >  16 | unlogged | Patched |  9368.0263 |  1.032932
> > >  64 | normal   | HEAD|   1598.698 |
> > >  64 | normal   | Patched |  1587.5906 |  0.993052
> > >  64 | unlogged | HEAD|  9629.7315 |
> > >  64 | unlogged | Patched | 10208.2196 |  1.060073
> > > (8 rows)
> > >
> > > For normal tables, loading tps decreased 1% ~ 2% with this patch
> > > whereas it increased 3% ~ 6% for unlogged tables. There were
> > > collisions at 0 ~ 5 relation extension lock slots between 2 relations
> > > in the 64 child tables case but it didn't seem to affect the tps.
> > >
> >
> > AFAIU, this resembles the workload that Andres was worried about.   I
> > think we should once run this test in a different environment, but
> > considering this to be correct and repeatable, where do we go with
> > this patch especially when we know it improves many workloads [1] as
> > well.  We know that on a pathological case constructed by Mithun [2],
> > this causes regression as well.  I am not sure if the test done by
> > Mithun really mimics any real-world workload as he has tested by
> > making N_RELEXTLOCK_ENTS = 1 to hit the worst case.
> >
> > Sawada-San, if you have a script or data for the test done by you,
> > then please share it so that others can also try to reproduce it.
>
> Unfortunately the environment I used for performance verification is
> no longer available.
>
> I agree to run this test in a different environment. I've attached the
> rebased version patch. I'm measuring the performance with/without
> patch, so will share the results.
>

Thanks Sawada-san for patch.

>From last few days, I was reading this thread and was reviewing v13 patch.
To debug and test, I did re-base of v13 patch. I compared my re-based patch
and v14 patch. I think,  ordering of header files is not alphabetically in
v14 patch. (I haven't reviewed v14 patch fully because before review, I
wanted to test false sharing).  While debugging, I didn't noticed any hang
or lock related issue.

I did some testing to test false sharing(bulk insert, COPY data, bulk
insert into partitions tables).  Below is the testing summary.


*Test setup(Bulk insert into partition tables):*
autovacuum=off
shared_buffers=512MB -c max_wal_size=20GB -c checkpoint_timeout=12min

Basically, I created a table with 13 partitions. Using pgbench, I inserted
bulk data. I used below pgbench command:
*./pgbench -c $threads -j $threads -T 180 -f insert1.sql@1 -f insert2.sql@1
-f insert3.sql@1 -f insert4.sql@1 postgres*

I took scripts from previews mails and modified. For reference, I am
attaching test scripts.  I tested with default 1024 slots(N_RELEXTLOCK_ENTS
= 1024).


*Clients  HEAD (tps) With v14 patch (tps)
%change  (time: 180s)*
192.979796
100.877446 +8.49 %
32

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2020-02-04 Thread Masahiko Sawada

On Mon, Feb 3, 2020 at 8:03 PM Amit Kapila  wrote:
>
> On Tue, Jun 26, 2018 at 12:47 PM Masahiko Sawada  
> wrote:
> >
> > On Fri, Apr 27, 2018 at 4:25 AM, Robert Haas  wrote:
> > > On Thu, Apr 26, 2018 at 3:10 PM, Andres Freund  wrote:
> > >>> I think the real question is whether the scenario is common enough to
> > >>> worry about.  In practice, you'd have to be extremely unlucky to be
> > >>> doing many bulk loads at the same time that all happened to hash to
> > >>> the same bucket.
> > >>
> > >> With a bunch of parallel bulkloads into partitioned tables that really
> > >> doesn't seem that unlikely?
> > >
> > > It increases the likelihood of collisions, but probably decreases the
> > > number of cases where the contention gets really bad.
> > >
> > > For example, suppose each table has 100 partitions and you are
> > > bulk-loading 10 of them at a time.  It's virtually certain that you
> > > will have some collisions, but the amount of contention within each
> > > bucket will remain fairly low because each backend spends only 1% of
> > > its time in the bucket corresponding to any given partition.
> > >
> >
> > I share another result of performance evaluation between current HEAD
> > and current HEAD with v13 patch(N_RELEXTLOCK_ENTS = 1024).
> >
> > Type of table: normal table, unlogged table
> > Number of child tables : 16, 64 (all tables are located on the same 
> > tablespace)
> > Number of clients : 32
> > Number of trials : 100
> > Duration: 180 seconds for each trials
> >
> > The hardware spec of server is Intel Xeon 2.4GHz (HT 160cores), 256GB
> > RAM, NVMe SSD 1.5TB.
> > Each clients load 10kB random data across all partitioned tables.
> >
> > Here is the result.
> >
> >  childs |   type   | target  |  avg_tps   | diff with HEAD
> > +--+-++--
> >  16 | normal   | HEAD|   1643.833 |
> >  16 | normal   | Patched |  1619.5404 |  0.985222
> >  16 | unlogged | HEAD|  9069.3543 |
> >  16 | unlogged | Patched |  9368.0263 |  1.032932
> >  64 | normal   | HEAD|   1598.698 |
> >  64 | normal   | Patched |  1587.5906 |  0.993052
> >  64 | unlogged | HEAD|  9629.7315 |
> >  64 | unlogged | Patched | 10208.2196 |  1.060073
> > (8 rows)
> >
> > For normal tables, loading tps decreased 1% ~ 2% with this patch
> > whereas it increased 3% ~ 6% for unlogged tables. There were
> > collisions at 0 ~ 5 relation extension lock slots between 2 relations
> > in the 64 child tables case but it didn't seem to affect the tps.
> >
>
> AFAIU, this resembles the workload that Andres was worried about.   I
> think we should once run this test in a different environment, but
> considering this to be correct and repeatable, where do we go with
> this patch especially when we know it improves many workloads [1] as
> well.  We know that on a pathological case constructed by Mithun [2],
> this causes regression as well.  I am not sure if the test done by
> Mithun really mimics any real-world workload as he has tested by
> making N_RELEXTLOCK_ENTS = 1 to hit the worst case.
>
> Sawada-San, if you have a script or data for the test done by you,
> then please share it so that others can also try to reproduce it.

Unfortunately the environment I used for performance verification is
no longer available.

I agree to run this test in a different environment. I've attached the
rebased version patch. I'm measuring the performance with/without
patch, so will share the results.

Regards,

--
Masahiko Sawada  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


v14-0001-Move-relation-extension-locks-out-of-heavyweigth.patch
Description: Binary data

1 2 >

1 - 100 of 155 matches

Mail list logo