Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-28 Thread Amit Kapila
On Thu, Feb 28, 2019 at 9:59 AM John Naylor  wrote:
>
> On Thu, Feb 28, 2019 at 10:25 AM Amit Kapila  wrote:
> >
> > Here's an updated patch based on comments by you.  I will proceed with
> > this unless you have any more comments.
>
> Looks good to me. I would just adjust the grammar in the comment, from
> "This prevents us to use the map", to "This prevents us from using the
> map". Perhaps also a comma after "first".
>

Okay, pushed after making changes suggested by you.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-27 Thread John Naylor
On Thu, Feb 28, 2019 at 10:25 AM Amit Kapila  wrote:
>
> Here's an updated patch based on comments by you.  I will proceed with
> this unless you have any more comments.

Looks good to me. I would just adjust the grammar in the comment, from
"This prevents us to use the map", to "This prevents us from using the
map". Perhaps also a comma after "first".

-- 
John Naylorhttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-27 Thread Amit Kapila
On Thu, Feb 28, 2019 at 8:10 AM Amit Kapila  wrote:
>
> On Thu, Feb 28, 2019 at 7:45 AM John Naylor  
> wrote:
> >
> > On Thu, Feb 28, 2019 at 7:24 AM Amit Kapila  wrote:
> > > > The flaw in my thinking was treating extension too similarly too
> > > > finding an existing block. With your patch clearing the local map in
> > > > the correct place, it seems the call at hio.c:682 is now superfluous?
> > >
> > > What if get some valid block from the first call to
> > > GetPageWithFreeSpace via local map which has required space?  I think
> > > in that case we will need the call at hio.c:682.  Am I missing
> > > something?
> >
> > Are you referring to the call at line 393? Then the map will be
> > cleared on line 507 before we return that buffer.
> >
>
> That's correct, I haven't looked at line number very carefully and
> assumed that you are saying to remove the call at 507.  I also think
> that the call at line 682 doesn't seem to be required, but I would
> prefer to keep it for the sake of consistency in this function and
> safety purpose.  However, I think we should add a comment on the lines
> suggested by you.  I will send the patch after making these
> modifications.
>

Here's an updated patch based on comments by you.  I will proceed with
this unless you have any more comments.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fix_loc_map_clear_3.patch
Description: Binary data


Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-27 Thread Amit Kapila
On Thu, Feb 28, 2019 at 7:45 AM John Naylor  wrote:
>
> On Thu, Feb 28, 2019 at 7:24 AM Amit Kapila  wrote:
> > > The flaw in my thinking was treating extension too similarly too
> > > finding an existing block. With your patch clearing the local map in
> > > the correct place, it seems the call at hio.c:682 is now superfluous?
> >
> > What if get some valid block from the first call to
> > GetPageWithFreeSpace via local map which has required space?  I think
> > in that case we will need the call at hio.c:682.  Am I missing
> > something?
>
> Are you referring to the call at line 393? Then the map will be
> cleared on line 507 before we return that buffer.
>

That's correct, I haven't looked at line number very carefully and
assumed that you are saying to remove the call at 507.  I also think
that the call at line 682 doesn't seem to be required, but I would
prefer to keep it for the sake of consistency in this function and
safety purpose.  However, I think we should add a comment on the lines
suggested by you.  I will send the patch after making these
modifications.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-27 Thread John Naylor
On Thu, Feb 28, 2019 at 7:24 AM Amit Kapila  wrote:
> > The flaw in my thinking was treating extension too similarly too
> > finding an existing block. With your patch clearing the local map in
> > the correct place, it seems the call at hio.c:682 is now superfluous?
>
> What if get some valid block from the first call to
> GetPageWithFreeSpace via local map which has required space?  I think
> in that case we will need the call at hio.c:682.  Am I missing
> something?

Are you referring to the call at line 393? Then the map will be
cleared on line 507 before we return that buffer.

-- 
John Naylorhttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-27 Thread Amit Kapila
On Wed, Feb 27, 2019 at 11:07 AM John Naylor
 wrote:
>
> On Wed, Feb 27, 2019 at 5:06 AM Amit Kapila  wrote:
> > I have tried this test many times (more than 1000 times) by varying
> > thread count, but couldn't reproduce it.  My colleague, Kuntal has
> > tried a similar test overnight, but the issue didn't reproduce which
> > is not surprising to me seeing the nature of the problem.  As I could
> > reproduce it via the debugger, I think we can go ahead with the fix.
> > I have improved the comments in the attached patch and I have also
> > tried to address Tom's concern w.r.t comments by adding additional
> > comments atop of data-structure used to maintain the local map.
>
> The flaw in my thinking was treating extension too similarly too
> finding an existing block. With your patch clearing the local map in
> the correct place, it seems the call at hio.c:682 is now superfluous?
>

What if get some valid block from the first call to
GetPageWithFreeSpace via local map which has required space?  I think
in that case we will need the call at hio.c:682.  Am I missing
something?

> It wasn't sufficient before, so should this call be removed so as not
> to mislead? Or maybe keep it to be safe, with a comment like "This
> should already be cleared by now, but make sure it is."
>
> + * To maintain good performance, we only mark every other page as available
> + * in this map.
>
> I think this implementation detail is not needed to comment on the
> struct. I think the pointer to the README is sufficient.
>

Fair enough, I will remove that part of a comment once we agree on the
above point.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-26 Thread John Naylor
On Wed, Feb 27, 2019 at 5:06 AM Amit Kapila  wrote:
> I have tried this test many times (more than 1000 times) by varying
> thread count, but couldn't reproduce it.  My colleague, Kuntal has
> tried a similar test overnight, but the issue didn't reproduce which
> is not surprising to me seeing the nature of the problem.  As I could
> reproduce it via the debugger, I think we can go ahead with the fix.
> I have improved the comments in the attached patch and I have also
> tried to address Tom's concern w.r.t comments by adding additional
> comments atop of data-structure used to maintain the local map.

The flaw in my thinking was treating extension too similarly too
finding an existing block. With your patch clearing the local map in
the correct place, it seems the call at hio.c:682 is now superfluous?
It wasn't sufficient before, so should this call be removed so as not
to mislead? Or maybe keep it to be safe, with a comment like "This
should already be cleared by now, but make sure it is."

+ * To maintain good performance, we only mark every other page as available
+ * in this map.

I think this implementation detail is not needed to comment on the
struct. I think the pointer to the README is sufficient.

-- 
John Naylorhttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-26 Thread John Naylor
On Tue, Feb 26, 2019 at 10:28 AM Amit Kapila  wrote:
> John, others, can you review my findings and patch?

I can confirm your findings with your debugging steps. Nice work!

-- 
John Naylorhttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-26 Thread Amit Kapila
On Tue, Feb 26, 2019 at 2:58 PM Amit Kapila  wrote:
>
> On Mon, Feb 25, 2019 at 10:32 PM Tom Lane  wrote:
> >
>
> To fix this symptom, we can ensure that once we didn't get any block
> from local map, we must clear it.  See the attached patch.  I will try
> to evaluate this code path to see if there is any similar race
> condition and will also try to generate a reproducer.  I don't have
> any great idea to write a reproducer for this issue apart from trying
> some thing similar to ecpg/test/thread/prep.pgc, if you can think of
> any, I am all ears.
>

I have tried this test many times (more than 1000 times) by varying
thread count, but couldn't reproduce it.  My colleague, Kuntal has
tried a similar test overnight, but the issue didn't reproduce which
is not surprising to me seeing the nature of the problem.  As I could
reproduce it via the debugger, I think we can go ahead with the fix.
I have improved the comments in the attached patch and I have also
tried to address Tom's concern w.r.t comments by adding additional
comments atop of data-structure used to maintain the local map.

Let me know what you think?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fix_loc_map_clear_2.patch
Description: Binary data


Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-26 Thread Amit Kapila
On Mon, Feb 25, 2019 at 10:32 PM Tom Lane  wrote:
>
> Amit Kapila  writes:
> > Avoid creation of the free space map for small heap relations, take 2.
>
> I think this patch still has some issues.  Note the following two
> recent buildfarm failures:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura=2019-02-20%2004%3A20%3A01
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet=2019-02-24%2022%3A48%3A02
>
> Both of them failed in an assertion added by this patch:
>
> TRAP: FailedAssertion("!((rel->rd_rel->relkind == 'r' || rel->rd_rel->relkind 
> == 't') && fsm_local_map.map[oldPage] == 0x01)", File: 
> "/home/bf/build/buildfarm-petalura/HEAD/pgsql.build/../pgsql/src/backend/storage/freespace/freespace.c",
>  Line: 229)
> ...
> 2019-02-20 06:23:26.660 CET [5c6ce440.3bd2:4] LOG:  server process (PID 
> 17981) was terminated by signal 6: Aborted
> 2019-02-20 06:23:26.660 CET [5c6ce440.3bd2:5] DETAIL:  Failed process was 
> running: INSERT INTO T VALUES ( $1 )
>
> I can even offer you a theory as to why we're managing to get through all
> the core regression tests and then failing in ecpg: the failing test is
> ecpg/test/thread/prep.pgc, and I think that may be the only place in our
> tests where we stress concurrent insertions into a single small table.
> So I think you've got race-condition issues here.
>

Right, I think I have some theory what is going on here.   There seems
to be a narrow window during which this problem can occur (with
concurrent insertions) where we will try to use the local map for
block which doesn't exist in that map.

During insertion (say by session-1) once we have exhausted all the
pages in local map and didn't find available space, we try to extend
the relation.  Note, by this time we haven't cleared the local map.
Now, say we didn't got the conditional relation extension lock (in
function RelationGetBufferForTuple, line 561) and before we get the
relation extension lock, the other backend (say session-2) acquired it
and extended the relation in bulk (via RelationAddExtraBlocks) and
updated the FSM (as the total number of blocks exceeds
HEAP_FSM_CREATION_THRESHOLD).   After that, session-1 got the lock and
found the page from FSM which has some freespace. Then before it tries
to validate whether the page has enough space to insert a new tuple,
session-2 again inserted few more rows such that the page which
session-1 has got from FSM doesn't have free space.  After that,
session-1 will try to record the freespace for this page and find a
new page from FSM, however as our local map is still not cleared, it
will hit the assertion shown above.

Based on above theory, I have tried to reproduce this problem via
debugger and here are the steps, I have followed.

Session-1
-
1. Create table fsm_1(c1 int, c2 char(1500));
2. Attach debugger, set breakpoint on line no. 535 in hio.c ( aka
targetBlock = RecordAndGetPageWithFreeSpace(relation,)
3. insert into fsm_1 values(generate_series(1,10),'a');
4. Now, when you debug the code at line 535, you must get
InvalidBlockNumber and you can check that local map will still exist
(value of fsm_local_map.nblocks must be greater than 0).
5. Debug till line 561 (else if
(!ConditionalLockRelationForExtension(relation, ExclusiveLock))).

Session-2
--
1. Attach debugger, set breakpoint on line no. 599 in hio.c (aka
buffer = ReadBufferBI(relation, P_NEW, RBM_ZERO_AND_LOCK, bistate);)
2. insert into fsm_1 values(generate_series(11,37),'a');
3. Now leave the debugger of this session at this stage, this ensures
that Session-2 has RelationExtensionLock

Session-1

6. Step over line 561.
7. Before acquiring RelationExtension Lock in in line 564
(LockRelationForExtension(relation, ExclusiveLock);), allow Session-2
to complete.

Session-2
---
4. continue
5. vacuum fsm_1; --This will allow to create FSM for this table.

Session-1

8. Step through debugger, you will get the block with freespace from
FSM via GetPageWithFreeSpace.
9. Now, it will allow to try that block (via goto loop).
10. Let the debugger in the same state and allow Session-2 to execute
some statements so that the space in this block is used up.

Session-2

6. insert into fsm_1 values(generate_series(41,50),'a');

Session-1

11.  Step through debugger and it should hit Assertion in
RecordAndGetPageWithFreeSpace (line 227, freespace.c)

I am looking at code as of commit -
2ab23445bc6af517bddb40d8429146d8ff8d7ff4.  I have also tried to
reproduce it via the ecpg test and my colleague Kuntal Ghosh has
helped me to run that ecpg test by varying THREADS and REPEATS in
ecpg/test/thread/prep.pgc on different machines, but no luck till now.

To fix this symptom, we can ensure that once we didn't get any block
from local map, we must clear it.  See the attached patch.  I will try
to evaluate this code path to see if there is any similar race
condition and will 

Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread Amit Kapila
On Tue, Feb 26, 2019 at 8:38 AM John Naylor  wrote:
>
> On Tue, Feb 26, 2019 at 8:59 AM Amit Kapila  wrote:
> > I will look into it today and respond back with my findings.  John,
> > see if you also get time to investigate this.
>
> Will do, but it will likely be tomorrow
>

No problem, thanks!

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread Amit Kapila
On Mon, Feb 25, 2019 at 11:14 PM Tom Lane  wrote:
>
> I wrote:
> > Amit Kapila  writes:
> >> Avoid creation of the free space map for small heap relations, take 2.
>
> > I think this patch still has some issues.
>
> Just out of curiosity ... how can it possibly be even a little bit sane
> that fsm_local_map is a single static data structure, without even any
> indication of which table it is for?
>

It is just for finding a free block among a few blocks of relation.
We clear this when we have found a block with enough free space, when
we extend the relation, or on transaction abort.  So, I think we don't
need any per table information.

> If, somehow, there's a rational argument for that design, why is it
> not explained in freespace.c?  The most charitable interpretation of
> what I see there is that it's fatally undercommented.
>

There is some explanation in freespace.c and in README
(src/backend/storage/freespace/README).  I think we should add some
more information where this data structure (FSMLocalMap) is declared.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread John Naylor
On Tue, Feb 26, 2019 at 8:59 AM Amit Kapila  wrote:
> I will look into it today and respond back with my findings.  John,
> see if you also get time to investigate this.

Will do, but it will likely be tomorrow

-- 
John Naylorhttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread Amit Kapila
On Mon, Feb 25, 2019 at 10:32 PM Tom Lane  wrote:
>
> Amit Kapila  writes:
> > Avoid creation of the free space map for small heap relations, take 2.
>
> I think this patch still has some issues.
>

I will look into it today and respond back with my findings.  John,
see if you also get time to investigate this.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread Tom Lane
I wrote:
> Amit Kapila  writes:
>> Avoid creation of the free space map for small heap relations, take 2.

> I think this patch still has some issues.

Just out of curiosity ... how can it possibly be even a little bit sane
that fsm_local_map is a single static data structure, without even any
indication of which table it is for?

If, somehow, there's a rational argument for that design, why is it
not explained in freespace.c?  The most charitable interpretation of
what I see there is that it's fatally undercommented.

regards, tom lane



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread Tom Lane
Amit Kapila  writes:
> Avoid creation of the free space map for small heap relations, take 2.

I think this patch still has some issues.  Note the following two
recent buildfarm failures:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura=2019-02-20%2004%3A20%3A01
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet=2019-02-24%2022%3A48%3A02

Both of them failed in an assertion added by this patch:

TRAP: FailedAssertion("!((rel->rd_rel->relkind == 'r' || rel->rd_rel->relkind 
== 't') && fsm_local_map.map[oldPage] == 0x01)", File: 
"/home/bf/build/buildfarm-petalura/HEAD/pgsql.build/../pgsql/src/backend/storage/freespace/freespace.c",
 Line: 229)
...
2019-02-20 06:23:26.660 CET [5c6ce440.3bd2:4] LOG:  server process (PID 17981) 
was terminated by signal 6: Aborted
2019-02-20 06:23:26.660 CET [5c6ce440.3bd2:5] DETAIL:  Failed process was 
running: INSERT INTO T VALUES ( $1 )

I can even offer you a theory as to why we're managing to get through all
the core regression tests and then failing in ecpg: the failing test is
ecpg/test/thread/prep.pgc, and I think that may be the only place in our
tests where we stress concurrent insertions into a single small table.
So I think you've got race-condition issues here.

It might or might not be significant that these two animals are part
of Andres' JIT-enabled menagerie.  I could see that affecting test
timing to the point of showing a symptom that's otherwise hard to hit.

regards, tom lane