Re: pgsql: Avoid creation of the free space map for small heap relations, t
On Thu, Feb 28, 2019 at 9:59 AM John Naylor wrote: > > On Thu, Feb 28, 2019 at 10:25 AM Amit Kapila wrote: > > > > Here's an updated patch based on comments by you. I will proceed with > > this unless you have any more comments. > > Looks good to me. I would just adjust the grammar in the comment, from > "This prevents us to use the map", to "This prevents us from using the > map". Perhaps also a comma after "first". > Okay, pushed after making changes suggested by you. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pgsql: Avoid creation of the free space map for small heap relations, t
On Thu, Feb 28, 2019 at 10:25 AM Amit Kapila wrote: > > Here's an updated patch based on comments by you. I will proceed with > this unless you have any more comments. Looks good to me. I would just adjust the grammar in the comment, from "This prevents us to use the map", to "This prevents us from using the map". Perhaps also a comma after "first". -- John Naylorhttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: pgsql: Avoid creation of the free space map for small heap relations, t
On Thu, Feb 28, 2019 at 8:10 AM Amit Kapila wrote: > > On Thu, Feb 28, 2019 at 7:45 AM John Naylor > wrote: > > > > On Thu, Feb 28, 2019 at 7:24 AM Amit Kapila wrote: > > > > The flaw in my thinking was treating extension too similarly too > > > > finding an existing block. With your patch clearing the local map in > > > > the correct place, it seems the call at hio.c:682 is now superfluous? > > > > > > What if get some valid block from the first call to > > > GetPageWithFreeSpace via local map which has required space? I think > > > in that case we will need the call at hio.c:682. Am I missing > > > something? > > > > Are you referring to the call at line 393? Then the map will be > > cleared on line 507 before we return that buffer. > > > > That's correct, I haven't looked at line number very carefully and > assumed that you are saying to remove the call at 507. I also think > that the call at line 682 doesn't seem to be required, but I would > prefer to keep it for the sake of consistency in this function and > safety purpose. However, I think we should add a comment on the lines > suggested by you. I will send the patch after making these > modifications. > Here's an updated patch based on comments by you. I will proceed with this unless you have any more comments. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com fix_loc_map_clear_3.patch Description: Binary data
Re: pgsql: Avoid creation of the free space map for small heap relations, t
On Thu, Feb 28, 2019 at 7:45 AM John Naylor wrote: > > On Thu, Feb 28, 2019 at 7:24 AM Amit Kapila wrote: > > > The flaw in my thinking was treating extension too similarly too > > > finding an existing block. With your patch clearing the local map in > > > the correct place, it seems the call at hio.c:682 is now superfluous? > > > > What if get some valid block from the first call to > > GetPageWithFreeSpace via local map which has required space? I think > > in that case we will need the call at hio.c:682. Am I missing > > something? > > Are you referring to the call at line 393? Then the map will be > cleared on line 507 before we return that buffer. > That's correct, I haven't looked at line number very carefully and assumed that you are saying to remove the call at 507. I also think that the call at line 682 doesn't seem to be required, but I would prefer to keep it for the sake of consistency in this function and safety purpose. However, I think we should add a comment on the lines suggested by you. I will send the patch after making these modifications. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pgsql: Avoid creation of the free space map for small heap relations, t
On Thu, Feb 28, 2019 at 7:24 AM Amit Kapila wrote: > > The flaw in my thinking was treating extension too similarly too > > finding an existing block. With your patch clearing the local map in > > the correct place, it seems the call at hio.c:682 is now superfluous? > > What if get some valid block from the first call to > GetPageWithFreeSpace via local map which has required space? I think > in that case we will need the call at hio.c:682. Am I missing > something? Are you referring to the call at line 393? Then the map will be cleared on line 507 before we return that buffer. -- John Naylorhttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: pgsql: Avoid creation of the free space map for small heap relations, t
On Wed, Feb 27, 2019 at 11:07 AM John Naylor wrote: > > On Wed, Feb 27, 2019 at 5:06 AM Amit Kapila wrote: > > I have tried this test many times (more than 1000 times) by varying > > thread count, but couldn't reproduce it. My colleague, Kuntal has > > tried a similar test overnight, but the issue didn't reproduce which > > is not surprising to me seeing the nature of the problem. As I could > > reproduce it via the debugger, I think we can go ahead with the fix. > > I have improved the comments in the attached patch and I have also > > tried to address Tom's concern w.r.t comments by adding additional > > comments atop of data-structure used to maintain the local map. > > The flaw in my thinking was treating extension too similarly too > finding an existing block. With your patch clearing the local map in > the correct place, it seems the call at hio.c:682 is now superfluous? > What if get some valid block from the first call to GetPageWithFreeSpace via local map which has required space? I think in that case we will need the call at hio.c:682. Am I missing something? > It wasn't sufficient before, so should this call be removed so as not > to mislead? Or maybe keep it to be safe, with a comment like "This > should already be cleared by now, but make sure it is." > > + * To maintain good performance, we only mark every other page as available > + * in this map. > > I think this implementation detail is not needed to comment on the > struct. I think the pointer to the README is sufficient. > Fair enough, I will remove that part of a comment once we agree on the above point. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pgsql: Avoid creation of the free space map for small heap relations, t
On Wed, Feb 27, 2019 at 5:06 AM Amit Kapila wrote: > I have tried this test many times (more than 1000 times) by varying > thread count, but couldn't reproduce it. My colleague, Kuntal has > tried a similar test overnight, but the issue didn't reproduce which > is not surprising to me seeing the nature of the problem. As I could > reproduce it via the debugger, I think we can go ahead with the fix. > I have improved the comments in the attached patch and I have also > tried to address Tom's concern w.r.t comments by adding additional > comments atop of data-structure used to maintain the local map. The flaw in my thinking was treating extension too similarly too finding an existing block. With your patch clearing the local map in the correct place, it seems the call at hio.c:682 is now superfluous? It wasn't sufficient before, so should this call be removed so as not to mislead? Or maybe keep it to be safe, with a comment like "This should already be cleared by now, but make sure it is." + * To maintain good performance, we only mark every other page as available + * in this map. I think this implementation detail is not needed to comment on the struct. I think the pointer to the README is sufficient. -- John Naylorhttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: pgsql: Avoid creation of the free space map for small heap relations, t
On Tue, Feb 26, 2019 at 10:28 AM Amit Kapila wrote: > John, others, can you review my findings and patch? I can confirm your findings with your debugging steps. Nice work! -- John Naylorhttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: pgsql: Avoid creation of the free space map for small heap relations, t
On Tue, Feb 26, 2019 at 2:58 PM Amit Kapila wrote: > > On Mon, Feb 25, 2019 at 10:32 PM Tom Lane wrote: > > > > To fix this symptom, we can ensure that once we didn't get any block > from local map, we must clear it. See the attached patch. I will try > to evaluate this code path to see if there is any similar race > condition and will also try to generate a reproducer. I don't have > any great idea to write a reproducer for this issue apart from trying > some thing similar to ecpg/test/thread/prep.pgc, if you can think of > any, I am all ears. > I have tried this test many times (more than 1000 times) by varying thread count, but couldn't reproduce it. My colleague, Kuntal has tried a similar test overnight, but the issue didn't reproduce which is not surprising to me seeing the nature of the problem. As I could reproduce it via the debugger, I think we can go ahead with the fix. I have improved the comments in the attached patch and I have also tried to address Tom's concern w.r.t comments by adding additional comments atop of data-structure used to maintain the local map. Let me know what you think? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com fix_loc_map_clear_2.patch Description: Binary data
Re: pgsql: Avoid creation of the free space map for small heap relations, t
On Mon, Feb 25, 2019 at 10:32 PM Tom Lane wrote: > > Amit Kapila writes: > > Avoid creation of the free space map for small heap relations, take 2. > > I think this patch still has some issues. Note the following two > recent buildfarm failures: > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura=2019-02-20%2004%3A20%3A01 > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet=2019-02-24%2022%3A48%3A02 > > Both of them failed in an assertion added by this patch: > > TRAP: FailedAssertion("!((rel->rd_rel->relkind == 'r' || rel->rd_rel->relkind > == 't') && fsm_local_map.map[oldPage] == 0x01)", File: > "/home/bf/build/buildfarm-petalura/HEAD/pgsql.build/../pgsql/src/backend/storage/freespace/freespace.c", > Line: 229) > ... > 2019-02-20 06:23:26.660 CET [5c6ce440.3bd2:4] LOG: server process (PID > 17981) was terminated by signal 6: Aborted > 2019-02-20 06:23:26.660 CET [5c6ce440.3bd2:5] DETAIL: Failed process was > running: INSERT INTO T VALUES ( $1 ) > > I can even offer you a theory as to why we're managing to get through all > the core regression tests and then failing in ecpg: the failing test is > ecpg/test/thread/prep.pgc, and I think that may be the only place in our > tests where we stress concurrent insertions into a single small table. > So I think you've got race-condition issues here. > Right, I think I have some theory what is going on here. There seems to be a narrow window during which this problem can occur (with concurrent insertions) where we will try to use the local map for block which doesn't exist in that map. During insertion (say by session-1) once we have exhausted all the pages in local map and didn't find available space, we try to extend the relation. Note, by this time we haven't cleared the local map. Now, say we didn't got the conditional relation extension lock (in function RelationGetBufferForTuple, line 561) and before we get the relation extension lock, the other backend (say session-2) acquired it and extended the relation in bulk (via RelationAddExtraBlocks) and updated the FSM (as the total number of blocks exceeds HEAP_FSM_CREATION_THRESHOLD). After that, session-1 got the lock and found the page from FSM which has some freespace. Then before it tries to validate whether the page has enough space to insert a new tuple, session-2 again inserted few more rows such that the page which session-1 has got from FSM doesn't have free space. After that, session-1 will try to record the freespace for this page and find a new page from FSM, however as our local map is still not cleared, it will hit the assertion shown above. Based on above theory, I have tried to reproduce this problem via debugger and here are the steps, I have followed. Session-1 - 1. Create table fsm_1(c1 int, c2 char(1500)); 2. Attach debugger, set breakpoint on line no. 535 in hio.c ( aka targetBlock = RecordAndGetPageWithFreeSpace(relation,) 3. insert into fsm_1 values(generate_series(1,10),'a'); 4. Now, when you debug the code at line 535, you must get InvalidBlockNumber and you can check that local map will still exist (value of fsm_local_map.nblocks must be greater than 0). 5. Debug till line 561 (else if (!ConditionalLockRelationForExtension(relation, ExclusiveLock))). Session-2 -- 1. Attach debugger, set breakpoint on line no. 599 in hio.c (aka buffer = ReadBufferBI(relation, P_NEW, RBM_ZERO_AND_LOCK, bistate);) 2. insert into fsm_1 values(generate_series(11,37),'a'); 3. Now leave the debugger of this session at this stage, this ensures that Session-2 has RelationExtensionLock Session-1 6. Step over line 561. 7. Before acquiring RelationExtension Lock in in line 564 (LockRelationForExtension(relation, ExclusiveLock);), allow Session-2 to complete. Session-2 --- 4. continue 5. vacuum fsm_1; --This will allow to create FSM for this table. Session-1 8. Step through debugger, you will get the block with freespace from FSM via GetPageWithFreeSpace. 9. Now, it will allow to try that block (via goto loop). 10. Let the debugger in the same state and allow Session-2 to execute some statements so that the space in this block is used up. Session-2 6. insert into fsm_1 values(generate_series(41,50),'a'); Session-1 11. Step through debugger and it should hit Assertion in RecordAndGetPageWithFreeSpace (line 227, freespace.c) I am looking at code as of commit - 2ab23445bc6af517bddb40d8429146d8ff8d7ff4. I have also tried to reproduce it via the ecpg test and my colleague Kuntal Ghosh has helped me to run that ecpg test by varying THREADS and REPEATS in ecpg/test/thread/prep.pgc on different machines, but no luck till now. To fix this symptom, we can ensure that once we didn't get any block from local map, we must clear it. See the attached patch. I will try to evaluate this code path to see if there is any similar race condition and will
Re: pgsql: Avoid creation of the free space map for small heap relations, t
On Tue, Feb 26, 2019 at 8:38 AM John Naylor wrote: > > On Tue, Feb 26, 2019 at 8:59 AM Amit Kapila wrote: > > I will look into it today and respond back with my findings. John, > > see if you also get time to investigate this. > > Will do, but it will likely be tomorrow > No problem, thanks! -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pgsql: Avoid creation of the free space map for small heap relations, t
On Mon, Feb 25, 2019 at 11:14 PM Tom Lane wrote: > > I wrote: > > Amit Kapila writes: > >> Avoid creation of the free space map for small heap relations, take 2. > > > I think this patch still has some issues. > > Just out of curiosity ... how can it possibly be even a little bit sane > that fsm_local_map is a single static data structure, without even any > indication of which table it is for? > It is just for finding a free block among a few blocks of relation. We clear this when we have found a block with enough free space, when we extend the relation, or on transaction abort. So, I think we don't need any per table information. > If, somehow, there's a rational argument for that design, why is it > not explained in freespace.c? The most charitable interpretation of > what I see there is that it's fatally undercommented. > There is some explanation in freespace.c and in README (src/backend/storage/freespace/README). I think we should add some more information where this data structure (FSMLocalMap) is declared. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pgsql: Avoid creation of the free space map for small heap relations, t
On Tue, Feb 26, 2019 at 8:59 AM Amit Kapila wrote: > I will look into it today and respond back with my findings. John, > see if you also get time to investigate this. Will do, but it will likely be tomorrow -- John Naylorhttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: pgsql: Avoid creation of the free space map for small heap relations, t
On Mon, Feb 25, 2019 at 10:32 PM Tom Lane wrote: > > Amit Kapila writes: > > Avoid creation of the free space map for small heap relations, take 2. > > I think this patch still has some issues. > I will look into it today and respond back with my findings. John, see if you also get time to investigate this. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pgsql: Avoid creation of the free space map for small heap relations, t
I wrote: > Amit Kapila writes: >> Avoid creation of the free space map for small heap relations, take 2. > I think this patch still has some issues. Just out of curiosity ... how can it possibly be even a little bit sane that fsm_local_map is a single static data structure, without even any indication of which table it is for? If, somehow, there's a rational argument for that design, why is it not explained in freespace.c? The most charitable interpretation of what I see there is that it's fatally undercommented. regards, tom lane
Re: pgsql: Avoid creation of the free space map for small heap relations, t
Amit Kapila writes: > Avoid creation of the free space map for small heap relations, take 2. I think this patch still has some issues. Note the following two recent buildfarm failures: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura=2019-02-20%2004%3A20%3A01 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet=2019-02-24%2022%3A48%3A02 Both of them failed in an assertion added by this patch: TRAP: FailedAssertion("!((rel->rd_rel->relkind == 'r' || rel->rd_rel->relkind == 't') && fsm_local_map.map[oldPage] == 0x01)", File: "/home/bf/build/buildfarm-petalura/HEAD/pgsql.build/../pgsql/src/backend/storage/freespace/freespace.c", Line: 229) ... 2019-02-20 06:23:26.660 CET [5c6ce440.3bd2:4] LOG: server process (PID 17981) was terminated by signal 6: Aborted 2019-02-20 06:23:26.660 CET [5c6ce440.3bd2:5] DETAIL: Failed process was running: INSERT INTO T VALUES ( $1 ) I can even offer you a theory as to why we're managing to get through all the core regression tests and then failing in ecpg: the failing test is ecpg/test/thread/prep.pgc, and I think that may be the only place in our tests where we stress concurrent insertions into a single small table. So I think you've got race-condition issues here. It might or might not be significant that these two animals are part of Andres' JIT-enabled menagerie. I could see that affecting test timing to the point of showing a symptom that's otherwise hard to hit. regards, tom lane