Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-28 Thread Amit Kapila
On Thu, Feb 28, 2019 at 9:59 AM John Naylor  wrote:
>
> On Thu, Feb 28, 2019 at 10:25 AM Amit Kapila  wrote:
> >
> > Here's an updated patch based on comments by you.  I will proceed with
> > this unless you have any more comments.
>
> Looks good to me. I would just adjust the grammar in the comment, from
> "This prevents us to use the map", to "This prevents us from using the
> map". Perhaps also a comma after "first".
>

Okay, pushed after making changes suggested by you.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-27 Thread John Naylor
On Thu, Feb 28, 2019 at 10:25 AM Amit Kapila  wrote:
>
> Here's an updated patch based on comments by you.  I will proceed with
> this unless you have any more comments.

Looks good to me. I would just adjust the grammar in the comment, from
"This prevents us to use the map", to "This prevents us from using the
map". Perhaps also a comma after "first".

-- 
John Naylorhttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-27 Thread Amit Kapila
On Thu, Feb 28, 2019 at 8:10 AM Amit Kapila  wrote:
>
> On Thu, Feb 28, 2019 at 7:45 AM John Naylor  
> wrote:
> >
> > On Thu, Feb 28, 2019 at 7:24 AM Amit Kapila  wrote:
> > > > The flaw in my thinking was treating extension too similarly too
> > > > finding an existing block. With your patch clearing the local map in
> > > > the correct place, it seems the call at hio.c:682 is now superfluous?
> > >
> > > What if get some valid block from the first call to
> > > GetPageWithFreeSpace via local map which has required space?  I think
> > > in that case we will need the call at hio.c:682.  Am I missing
> > > something?
> >
> > Are you referring to the call at line 393? Then the map will be
> > cleared on line 507 before we return that buffer.
> >
>
> That's correct, I haven't looked at line number very carefully and
> assumed that you are saying to remove the call at 507.  I also think
> that the call at line 682 doesn't seem to be required, but I would
> prefer to keep it for the sake of consistency in this function and
> safety purpose.  However, I think we should add a comment on the lines
> suggested by you.  I will send the patch after making these
> modifications.
>

Here's an updated patch based on comments by you.  I will proceed with
this unless you have any more comments.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fix_loc_map_clear_3.patch
Description: Binary data


Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-27 Thread Amit Kapila
On Thu, Feb 28, 2019 at 7:45 AM John Naylor  wrote:
>
> On Thu, Feb 28, 2019 at 7:24 AM Amit Kapila  wrote:
> > > The flaw in my thinking was treating extension too similarly too
> > > finding an existing block. With your patch clearing the local map in
> > > the correct place, it seems the call at hio.c:682 is now superfluous?
> >
> > What if get some valid block from the first call to
> > GetPageWithFreeSpace via local map which has required space?  I think
> > in that case we will need the call at hio.c:682.  Am I missing
> > something?
>
> Are you referring to the call at line 393? Then the map will be
> cleared on line 507 before we return that buffer.
>

That's correct, I haven't looked at line number very carefully and
assumed that you are saying to remove the call at 507.  I also think
that the call at line 682 doesn't seem to be required, but I would
prefer to keep it for the sake of consistency in this function and
safety purpose.  However, I think we should add a comment on the lines
suggested by you.  I will send the patch after making these
modifications.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-27 Thread John Naylor
On Thu, Feb 28, 2019 at 7:24 AM Amit Kapila  wrote:
> > The flaw in my thinking was treating extension too similarly too
> > finding an existing block. With your patch clearing the local map in
> > the correct place, it seems the call at hio.c:682 is now superfluous?
>
> What if get some valid block from the first call to
> GetPageWithFreeSpace via local map which has required space?  I think
> in that case we will need the call at hio.c:682.  Am I missing
> something?

Are you referring to the call at line 393? Then the map will be
cleared on line 507 before we return that buffer.

-- 
John Naylorhttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-27 Thread Amit Kapila
On Wed, Feb 27, 2019 at 11:07 AM John Naylor
 wrote:
>
> On Wed, Feb 27, 2019 at 5:06 AM Amit Kapila  wrote:
> > I have tried this test many times (more than 1000 times) by varying
> > thread count, but couldn't reproduce it.  My colleague, Kuntal has
> > tried a similar test overnight, but the issue didn't reproduce which
> > is not surprising to me seeing the nature of the problem.  As I could
> > reproduce it via the debugger, I think we can go ahead with the fix.
> > I have improved the comments in the attached patch and I have also
> > tried to address Tom's concern w.r.t comments by adding additional
> > comments atop of data-structure used to maintain the local map.
>
> The flaw in my thinking was treating extension too similarly too
> finding an existing block. With your patch clearing the local map in
> the correct place, it seems the call at hio.c:682 is now superfluous?
>

What if get some valid block from the first call to
GetPageWithFreeSpace via local map which has required space?  I think
in that case we will need the call at hio.c:682.  Am I missing
something?

> It wasn't sufficient before, so should this call be removed so as not
> to mislead? Or maybe keep it to be safe, with a comment like "This
> should already be cleared by now, but make sure it is."
>
> + * To maintain good performance, we only mark every other page as available
> + * in this map.
>
> I think this implementation detail is not needed to comment on the
> struct. I think the pointer to the README is sufficient.
>

Fair enough, I will remove that part of a comment once we agree on the
above point.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-26 Thread John Naylor
On Wed, Feb 27, 2019 at 5:06 AM Amit Kapila  wrote:
> I have tried this test many times (more than 1000 times) by varying
> thread count, but couldn't reproduce it.  My colleague, Kuntal has
> tried a similar test overnight, but the issue didn't reproduce which
> is not surprising to me seeing the nature of the problem.  As I could
> reproduce it via the debugger, I think we can go ahead with the fix.
> I have improved the comments in the attached patch and I have also
> tried to address Tom's concern w.r.t comments by adding additional
> comments atop of data-structure used to maintain the local map.

The flaw in my thinking was treating extension too similarly too
finding an existing block. With your patch clearing the local map in
the correct place, it seems the call at hio.c:682 is now superfluous?
It wasn't sufficient before, so should this call be removed so as not
to mislead? Or maybe keep it to be safe, with a comment like "This
should already be cleared by now, but make sure it is."

+ * To maintain good performance, we only mark every other page as available
+ * in this map.

I think this implementation detail is not needed to comment on the
struct. I think the pointer to the README is sufficient.

-- 
John Naylorhttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-26 Thread John Naylor
On Tue, Feb 26, 2019 at 10:28 AM Amit Kapila  wrote:
> John, others, can you review my findings and patch?

I can confirm your findings with your debugging steps. Nice work!

-- 
John Naylorhttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-26 Thread Amit Kapila
On Tue, Feb 26, 2019 at 2:58 PM Amit Kapila  wrote:
>
> On Mon, Feb 25, 2019 at 10:32 PM Tom Lane  wrote:
> >
>
> To fix this symptom, we can ensure that once we didn't get any block
> from local map, we must clear it.  See the attached patch.  I will try
> to evaluate this code path to see if there is any similar race
> condition and will also try to generate a reproducer.  I don't have
> any great idea to write a reproducer for this issue apart from trying
> some thing similar to ecpg/test/thread/prep.pgc, if you can think of
> any, I am all ears.
>

I have tried this test many times (more than 1000 times) by varying
thread count, but couldn't reproduce it.  My colleague, Kuntal has
tried a similar test overnight, but the issue didn't reproduce which
is not surprising to me seeing the nature of the problem.  As I could
reproduce it via the debugger, I think we can go ahead with the fix.
I have improved the comments in the attached patch and I have also
tried to address Tom's concern w.r.t comments by adding additional
comments atop of data-structure used to maintain the local map.

Let me know what you think?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fix_loc_map_clear_2.patch
Description: Binary data


Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-26 Thread Amit Kapila
On Mon, Feb 25, 2019 at 10:32 PM Tom Lane  wrote:
>
> Amit Kapila  writes:
> > Avoid creation of the free space map for small heap relations, take 2.
>
> I think this patch still has some issues.  Note the following two
> recent buildfarm failures:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura=2019-02-20%2004%3A20%3A01
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet=2019-02-24%2022%3A48%3A02
>
> Both of them failed in an assertion added by this patch:
>
> TRAP: FailedAssertion("!((rel->rd_rel->relkind == 'r' || rel->rd_rel->relkind 
> == 't') && fsm_local_map.map[oldPage] == 0x01)", File: 
> "/home/bf/build/buildfarm-petalura/HEAD/pgsql.build/../pgsql/src/backend/storage/freespace/freespace.c",
>  Line: 229)
> ...
> 2019-02-20 06:23:26.660 CET [5c6ce440.3bd2:4] LOG:  server process (PID 
> 17981) was terminated by signal 6: Aborted
> 2019-02-20 06:23:26.660 CET [5c6ce440.3bd2:5] DETAIL:  Failed process was 
> running: INSERT INTO T VALUES ( $1 )
>
> I can even offer you a theory as to why we're managing to get through all
> the core regression tests and then failing in ecpg: the failing test is
> ecpg/test/thread/prep.pgc, and I think that may be the only place in our
> tests where we stress concurrent insertions into a single small table.
> So I think you've got race-condition issues here.
>

Right, I think I have some theory what is going on here.   There seems
to be a narrow window during which this problem can occur (with
concurrent insertions) where we will try to use the local map for
block which doesn't exist in that map.

During insertion (say by session-1) once we have exhausted all the
pages in local map and didn't find available space, we try to extend
the relation.  Note, by this time we haven't cleared the local map.
Now, say we didn't got the conditional relation extension lock (in
function RelationGetBufferForTuple, line 561) and before we get the
relation extension lock, the other backend (say session-2) acquired it
and extended the relation in bulk (via RelationAddExtraBlocks) and
updated the FSM (as the total number of blocks exceeds
HEAP_FSM_CREATION_THRESHOLD).   After that, session-1 got the lock and
found the page from FSM which has some freespace. Then before it tries
to validate whether the page has enough space to insert a new tuple,
session-2 again inserted few more rows such that the page which
session-1 has got from FSM doesn't have free space.  After that,
session-1 will try to record the freespace for this page and find a
new page from FSM, however as our local map is still not cleared, it
will hit the assertion shown above.

Based on above theory, I have tried to reproduce this problem via
debugger and here are the steps, I have followed.

Session-1
-
1. Create table fsm_1(c1 int, c2 char(1500));
2. Attach debugger, set breakpoint on line no. 535 in hio.c ( aka
targetBlock = RecordAndGetPageWithFreeSpace(relation,)
3. insert into fsm_1 values(generate_series(1,10),'a');
4. Now, when you debug the code at line 535, you must get
InvalidBlockNumber and you can check that local map will still exist
(value of fsm_local_map.nblocks must be greater than 0).
5. Debug till line 561 (else if
(!ConditionalLockRelationForExtension(relation, ExclusiveLock))).

Session-2
--
1. Attach debugger, set breakpoint on line no. 599 in hio.c (aka
buffer = ReadBufferBI(relation, P_NEW, RBM_ZERO_AND_LOCK, bistate);)
2. insert into fsm_1 values(generate_series(11,37),'a');
3. Now leave the debugger of this session at this stage, this ensures
that Session-2 has RelationExtensionLock

Session-1

6. Step over line 561.
7. Before acquiring RelationExtension Lock in in line 564
(LockRelationForExtension(relation, ExclusiveLock);), allow Session-2
to complete.

Session-2
---
4. continue
5. vacuum fsm_1; --This will allow to create FSM for this table.

Session-1

8. Step through debugger, you will get the block with freespace from
FSM via GetPageWithFreeSpace.
9. Now, it will allow to try that block (via goto loop).
10. Let the debugger in the same state and allow Session-2 to execute
some statements so that the space in this block is used up.

Session-2

6. insert into fsm_1 values(generate_series(41,50),'a');

Session-1

11.  Step through debugger and it should hit Assertion in
RecordAndGetPageWithFreeSpace (line 227, freespace.c)

I am looking at code as of commit -
2ab23445bc6af517bddb40d8429146d8ff8d7ff4.  I have also tried to
reproduce it via the ecpg test and my colleague Kuntal Ghosh has
helped me to run that ecpg test by varying THREADS and REPEATS in
ecpg/test/thread/prep.pgc on different machines, but no luck till now.

To fix this symptom, we can ensure that once we didn't get any block
from local map, we must clear it.  See the attached patch.  I will try
to evaluate this code path to see if there is any similar race
condition and will 

Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread Amit Kapila
On Tue, Feb 26, 2019 at 8:38 AM John Naylor  wrote:
>
> On Tue, Feb 26, 2019 at 8:59 AM Amit Kapila  wrote:
> > I will look into it today and respond back with my findings.  John,
> > see if you also get time to investigate this.
>
> Will do, but it will likely be tomorrow
>

No problem, thanks!

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread Amit Kapila
On Mon, Feb 25, 2019 at 11:14 PM Tom Lane  wrote:
>
> I wrote:
> > Amit Kapila  writes:
> >> Avoid creation of the free space map for small heap relations, take 2.
>
> > I think this patch still has some issues.
>
> Just out of curiosity ... how can it possibly be even a little bit sane
> that fsm_local_map is a single static data structure, without even any
> indication of which table it is for?
>

It is just for finding a free block among a few blocks of relation.
We clear this when we have found a block with enough free space, when
we extend the relation, or on transaction abort.  So, I think we don't
need any per table information.

> If, somehow, there's a rational argument for that design, why is it
> not explained in freespace.c?  The most charitable interpretation of
> what I see there is that it's fatally undercommented.
>

There is some explanation in freespace.c and in README
(src/backend/storage/freespace/README).  I think we should add some
more information where this data structure (FSMLocalMap) is declared.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread John Naylor
On Tue, Feb 26, 2019 at 8:59 AM Amit Kapila  wrote:
> I will look into it today and respond back with my findings.  John,
> see if you also get time to investigate this.

Will do, but it will likely be tomorrow

-- 
John Naylorhttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread Amit Kapila
On Mon, Feb 25, 2019 at 10:32 PM Tom Lane  wrote:
>
> Amit Kapila  writes:
> > Avoid creation of the free space map for small heap relations, take 2.
>
> I think this patch still has some issues.
>

I will look into it today and respond back with my findings.  John,
see if you also get time to investigate this.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread Tom Lane
I wrote:
> Amit Kapila  writes:
>> Avoid creation of the free space map for small heap relations, take 2.

> I think this patch still has some issues.

Just out of curiosity ... how can it possibly be even a little bit sane
that fsm_local_map is a single static data structure, without even any
indication of which table it is for?

If, somehow, there's a rational argument for that design, why is it
not explained in freespace.c?  The most charitable interpretation of
what I see there is that it's fatally undercommented.

regards, tom lane



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread Tom Lane
Amit Kapila  writes:
> Avoid creation of the free space map for small heap relations, take 2.

I think this patch still has some issues.  Note the following two
recent buildfarm failures:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura=2019-02-20%2004%3A20%3A01
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet=2019-02-24%2022%3A48%3A02

Both of them failed in an assertion added by this patch:

TRAP: FailedAssertion("!((rel->rd_rel->relkind == 'r' || rel->rd_rel->relkind 
== 't') && fsm_local_map.map[oldPage] == 0x01)", File: 
"/home/bf/build/buildfarm-petalura/HEAD/pgsql.build/../pgsql/src/backend/storage/freespace/freespace.c",
 Line: 229)
...
2019-02-20 06:23:26.660 CET [5c6ce440.3bd2:4] LOG:  server process (PID 17981) 
was terminated by signal 6: Aborted
2019-02-20 06:23:26.660 CET [5c6ce440.3bd2:5] DETAIL:  Failed process was 
running: INSERT INTO T VALUES ( $1 )

I can even offer you a theory as to why we're managing to get through all
the core regression tests and then failing in ecpg: the failing test is
ecpg/test/thread/prep.pgc, and I think that may be the only place in our
tests where we stress concurrent insertions into a single small table.
So I think you've got race-condition issues here.

It might or might not be significant that these two animals are part
of Andres' JIT-enabled menagerie.  I could see that affecting test
timing to the point of showing a symptom that's otherwise hard to hit.

regards, tom lane



Re: pgsql: Avoid creation of the free space map for small heap relations.

2019-02-03 Thread Michael Paquier
On Mon, Jan 28, 2019 at 07:20:29AM +0100, John Naylor wrote:
> That particular test could be removed -- it's just verifying behavior
> that's already been there for years and is a direct consquence of
> normal truncation combined with the addressing scheme of the FSM
> logic.

Moved to next CF, please feel free to continue this work, this is
interesting stuff.
--
Michael


signature.asc
Description: PGP signature


Re: pgsql: Avoid creation of the free space map for small heap relations.

2019-01-27 Thread John Naylor
On Mon, Jan 28, 2019 at 6:58 AM Amit Kapila  wrote:
>
> On Mon, Jan 28, 2019 at 11:10 AM Andrew Gierth
> you just need to create free space on a page that didn't have enough
> > before? It might be worth tweaking the fillfactor rather than trying to
> > delete anything.
> >
>
> No, it also depends on vacuum to remove rows and then truncate the
> relation accordingly.

That particular test could be removed -- it's just verifying behavior
that's already been there for years and is a direct consquence of
normal truncation combined with the addressing scheme of the FSM
logic.

-- 
John Naylorhttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: Avoid creation of the free space map for small heap relations.

2019-01-27 Thread Tom Lane
Amit Kapila  writes:
>> I don't know what the common thread is here, but you don't get to leave
>> the buildfarm broken this badly while you figure it out.

> Sure, but I am wondering why none of this ever shown in local tests,
> as we have done quite some testing related to pgbench as well.

Not sure, but I notice that two of the three critters I just pointed to
use force_parallel_mode.  Dromedary does not, but it's old and slow.
Kind of smells like a timing problem ...

regards, tom lane



Re: pgsql: Avoid creation of the free space map for small heap relations.

2019-01-27 Thread Amit Kapila
On Mon, Jan 28, 2019 at 11:10 AM Andrew Gierth
 wrote:
>
> > "Amit" == Amit Kapila  writes:
>
>  Amit> Yes, so this could be the cause of the problem. I think we need
>  Amit> to change the tests added by the patch such that they don't rely
>  Amit> on vacuum to remove dead-row versions? Do you or anybody else see
>  Amit> any better way to fix this?
>
> Do you just need to create free space on a page that didn't have enough
> before? It might be worth tweaking the fillfactor rather than trying to
> delete anything.
>

No, it also depends on vacuum to remove rows and then truncate the
relation accordingly.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations.

2019-01-27 Thread Amit Kapila
On Mon, Jan 28, 2019 at 11:18 AM Tom Lane  wrote:
>
> Amit Kapila  writes:
> > Yes, so this could be the cause of the problem.  I think we need to
> > change the tests added by the patch such that they don't rely on
> > vacuum to remove dead-row versions?  Do you or anybody else see any
> > better way to fix this?
>
> To be blunt, this patch needs to be reverted immediately.

Okay, I will do it.

>  The failures
> that are showing up are not just "the fsm test is not portable" problems.
> See for example
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mantid=2019-01-28%2005%3A07%3A06
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dromedary=2019-01-28%2003%3A07%3A39
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lapwing=2019-01-28%2003%3A20%3A02
>
> I don't know what the common thread is here, but you don't get to leave
> the buildfarm broken this badly while you figure it out.
>

Sure, but I am wondering why none of this ever shown in local tests,
as we have done quite some testing related to pgbench as well.
Anyway, I think we need figure that out seprately.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations.

2019-01-27 Thread Tom Lane
Amit Kapila  writes:
> Yes, so this could be the cause of the problem.  I think we need to
> change the tests added by the patch such that they don't rely on
> vacuum to remove dead-row versions?  Do you or anybody else see any
> better way to fix this?

To be blunt, this patch needs to be reverted immediately.  The failures
that are showing up are not just "the fsm test is not portable" problems.
See for example

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mantid=2019-01-28%2005%3A07%3A06

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dromedary=2019-01-28%2003%3A07%3A39

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lapwing=2019-01-28%2003%3A20%3A02

I don't know what the common thread is here, but you don't get to leave
the buildfarm broken this badly while you figure it out.

regards, tom lane



Re: pgsql: Avoid creation of the free space map for small heap relations.

2019-01-27 Thread Andrew Gierth
> "Amit" == Amit Kapila  writes:

 Amit> Yes, so this could be the cause of the problem. I think we need
 Amit> to change the tests added by the patch such that they don't rely
 Amit> on vacuum to remove dead-row versions? Do you or anybody else see
 Amit> any better way to fix this?

Do you just need to create free space on a page that didn't have enough
before? It might be worth tweaking the fillfactor rather than trying to
delete anything.

-- 
Andrew (irc:RhodiumToad)



Re: pgsql: Avoid creation of the free space map for small heap relations.

2019-01-27 Thread Amit Kapila
On Mon, Jan 28, 2019 at 10:25 AM Andrew Gierth
 wrote:
>
> > "Amit" == Amit Kapila  writes:
>
>  Amit> One possibility is that autovacuum has triggered to perform
>  Amit> truncation of some other relation (remove pages at the end) which
>  Amit> doesn't allow the FSM test to remove the rows/perform truncation
>  Amit> and thus let to the failure. Can there be anything else which can
>  Amit> start a transaction when a regression test is running? Still
>  Amit> thinking, but inputs are welcome.
>
> I've bumped into issues (cf. commit 64ae420b2) with regression tests of
> this kind caused by concurrent (auto-)ANALYZE (not VACUUM);
>

Yes, so this could be the cause of the problem.  I think we need to
change the tests added by the patch such that they don't rely on
vacuum to remove dead-row versions?  Do you or anybody else see any
better way to fix this?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations.

2019-01-27 Thread Andrew Gierth
> "Amit" == Amit Kapila  writes:

 Amit> One possibility is that autovacuum has triggered to perform
 Amit> truncation of some other relation (remove pages at the end) which
 Amit> doesn't allow the FSM test to remove the rows/perform truncation
 Amit> and thus let to the failure. Can there be anything else which can
 Amit> start a transaction when a regression test is running? Still
 Amit> thinking, but inputs are welcome.

I've bumped into issues (cf. commit 64ae420b2) with regression tests of
this kind caused by concurrent (auto-)ANALYZE (not VACUUM); VACUUM's
snapshot is ignored for testing for whether row versions can be removed,
but ANALYZE's is not.

-- 
Andrew (irc:RhodiumToad)



Re: pgsql: Avoid creation of the free space map for small heap relations.

2019-01-27 Thread Amit Kapila
On Mon, Jan 28, 2019 at 8:49 AM Amit Kapila  wrote:
>
> On Mon, Jan 28, 2019 at 8:17 AM Amit Kapila  wrote:
> >
> > Avoid creation of the free space map for small heap relations.
> >
>
> It seems there is some failure due to this on build farm machines. I
> will investigate!
>

The failure is as below:

@@ -26,7 +26,7 @@
 pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
  heap_size | fsm_size
 ---+--
- 24576 |0
+ 32768 |0
 (1 row)

 -- Extend table with enough blocks to exceed the FSM threshold
@@ -56,7 +56,7 @@
 SELECT pg_relation_size('fsm_check_size', 'fsm') AS fsm_size;
  fsm_size
 --
-16384
+24576
 (1 row)

So here in the above tests, we are deleting some rows, performing
vacuum and then checking the size of heap and or vacuum.  It seems to
me sometimes the vacuum is *not* able to remove the rows and truncate
the relation/FSM as per tests expectation.  One possible theory is
that there is some parallel transaction running which prevents a
vacuum from doing so.  We have tried to avoid that by not allowing to
run the FSM test in parallel with other tests, but I think still
something seems to have run in parallel to the FSM test.  One
possibility is that autovacuum has triggered to perform truncation of
some other relation (remove pages at the end) which doesn't allow the
FSM test to remove the rows/perform truncation and thus let to the
failure.  Can there be anything else which can start a transaction
when a regression test is running?  Still thinking, but inputs are
welcome.

If my theory is correct, then in the newly added tests by this patch,
we can't rely on the vacuum to truncate the relation pages at the end
and hence can't rely on heap/FSM size.

Thoughts?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com