Re: Rethinking MemoryContext creation
Andres Freund writes:
> On 2017-12-12 14:50:37 -0500, Robert Haas wrote:
>> It strikes me that a way to optimize these cases even more would be to
>> postpone creating the context until it's actually needed.  That might
>> not always be a reasonable plan -- in particular, it occurs to me to
>> think that CurTransactionContext is probably so widely used that
>> changing anything about how it works would probably be really painful
>> -- but it might be possible in some cases.

> That's not generally easy without slowing things down though - e.g. we
> don't want to check for ExprContext's existence before every use, there
> can be billions of usages in a single analytics query.  The branches (yea
> yea ;)) would show up as being noticeable.

Yeah.  Also, in most of these cases what we're doing with the context is
installing it as CurrentMemoryContext while we execute some random code
that might or might not need to palloc anything.  We can't just set
CurrentMemoryContext to null - for one thing, that would leave no trace
of how the context should get set up if it's needed.  You could imagine
installing some sort of placeholder, but really that's the mechanism
we've already got, in the case where we just make a context header and
no blocks.

>> Another idea I have is that perhaps we could arrange to reuse contexts
>> instead of destroying them and recreating them.

> I'm a bit doubtful that's going to help, maintaining that list isn't
> going to be free, and the lifetime and number of those contexts aren't
> always going to match up.

Actually, this seems like a really promising idea to me.  To the extent
that an empty context has standard parameters, they're all
interchangeable, so you could imagine that AllocSetDelete shoves it onto
a freelist after resetting it, instead of just giving it back to libc.
I'm slightly worried about creating allocation islands that way, but
that problem is probably surmountable, if it's real at all.
However, that seems like a different patch from what I'm working on here.

			regards, tom lane
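The freelist Tom sketches above - AllocSetDelete resetting a context with standard parameters and pushing it on a freelist instead of returning it to libc, with creation popping from that list - could look roughly like the following simplified stand-alone model.  The `DemoSet` type, sizes, and function names are illustrative stand-ins, not PostgreSQL's actual structures:

```c
#include <stdlib.h>
#include <stddef.h>

/* Simplified model of a recyclable allocation-set context. */
typedef struct DemoSet
{
	struct DemoSet *next;		/* freelist link */
	size_t		used;			/* bytes handed out from this block */
	char		mem[8192];		/* the context's memory block */
} DemoSet;

static DemoSet *freelist = NULL;

DemoSet *
demo_set_create(void)
{
	DemoSet    *set = freelist;

	if (set != NULL)
		freelist = set->next;	/* reuse a recycled context */
	else
		set = malloc(sizeof(DemoSet));
	if (set == NULL)
		return NULL;
	set->next = NULL;
	set->used = 0;
	return set;
}

void
demo_set_delete(DemoSet *set)
{
	set->used = 0;				/* "reset" the context ... */
	set->next = freelist;		/* ... and recycle it instead of free() */
	freelist = set;
}
```

The point of the model is that a create immediately after a delete costs only a couple of pointer assignments, no malloc at all; the "allocation islands" worry is that recycled blocks pin down address ranges that libc could otherwise coalesce and return.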
Re: Rethinking MemoryContext creation
On 2017-12-12 14:50:37 -0500, Robert Haas wrote:
> On Tue, Dec 12, 2017 at 2:30 PM, Tom Lane wrote:
> >   379 CurTransactionContext was never used
> >    24 CurTransactionContext was used
> > 66978 ExprContext was never used
> > 17364 ExprContext was used
> > 11139 SPI Exec was never used
> >  2421 SPI Exec was used
> > 38386 ginPlaceToPage temporary context was never used
> >  3348 ginPlaceToPage temporary context was used

> It strikes me that a way to optimize these cases even more would be to
> postpone creating the context until it's actually needed.  That might
> not always be a reasonable plan -- in particular, it occurs to me to
> think that CurTransactionContext is probably so widely used that
> changing anything about how it works would probably be really painful
> -- but it might be possible in some cases.

That's not generally easy without slowing things down though - e.g. we
don't want to check for ExprContext's existence before every use, there
can be billions of usages in a single analytics query.  The branches (yea
yea ;)) would show up as being noticeable.

There are a few places where we could probably reliably *detect* that
they're not needed, however.  E.g. plenty of executor nodes only need an
ExprContext when either a qual or a projection is needed; both are
pretty cheap to detect at ExecInitNode() time.

> Another idea I have is that perhaps we could arrange to reuse contexts
> instead of destroying them and recreating them.  For example, when
> asked to delete a context, we could instead push it on a linked list
> of old contexts, but only if the list isn't too long already, and when
> asked to create one, we could pop from the list.  Or we could keep
> around an array of, say, 1024 contexts that are never freed and only
> allocated dynamically when we run out.

I'm a bit doubtful that's going to help, maintaining that list isn't
going to be free, and the lifetime and number of those contexts aren't
always going to match up.
I think you're somewhat on to something however: I do think that
especially executor startup would have a good chance of avoiding
noticeable overhead by batching the allocation of all these tiny
contexts together - I just don't quite know how, given how our current
initialization works.  The best thing to do would probably be to do two
walks during executor initialization, one to compute memory sizes, and a
second to initialize pre-requested memory that's laid out serially...
But obviously that's not a small change.

Greetings,

Andres Freund
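The two-walk scheme Andres describes - one pass summing up per-node memory requirements, a second pass handing each node a slice of a single serial allocation - could be sketched like this.  `DemoNode` and the walk functions are hypothetical stand-ins, not the real executor structures:

```c
#include <stdlib.h>
#include <stddef.h>

/* Stand-in for a plan-tree node that wants some context memory. */
typedef struct DemoNode
{
	struct DemoNode *next;
	size_t		ctxsize;		/* context space this node requests */
	void	   *ctxmem;			/* filled in by the second walk */
} DemoNode;

/* First walk: compute the total memory the whole tree needs. */
size_t
walk_compute_sizes(DemoNode *plan)
{
	size_t		total = 0;

	for (DemoNode *n = plan; n != NULL; n = n->next)
		total += n->ctxsize;
	return total;
}

/* Second walk: carve each node's context out of one serial arena. */
void
walk_assign_memory(DemoNode *plan, char *arena)
{
	for (DemoNode *n = plan; n != NULL; n = n->next)
	{
		n->ctxmem = arena;
		arena += n->ctxsize;	/* contexts laid out back to back */
	}
}
```

One malloc for the whole tree replaces one per node, and the serial layout is also friendlier to the cache during execution - at the cost of having to know every node's requirement up front, which is exactly the part that doesn't fit the current one-pass initialization.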
Re: Rethinking MemoryContext creation
On Tue, Dec 12, 2017 at 2:30 PM, Tom Lane wrote:
>   379 CurTransactionContext was never used
>    24 CurTransactionContext was used
> 66978 ExprContext was never used
> 17364 ExprContext was used
> 11139 SPI Exec was never used
>  2421 SPI Exec was used
> 38386 ginPlaceToPage temporary context was never used
>  3348 ginPlaceToPage temporary context was used

It strikes me that a way to optimize these cases even more would be to
postpone creating the context until it's actually needed.  That might
not always be a reasonable plan -- in particular, it occurs to me to
think that CurTransactionContext is probably so widely used that
changing anything about how it works would probably be really painful
-- but it might be possible in some cases.

Another idea I have is that perhaps we could arrange to reuse contexts
instead of destroying them and recreating them.  For example, when
asked to delete a context, we could instead push it on a linked list
of old contexts, but only if the list isn't too long already, and when
asked to create one, we could pop from the list.  Or we could keep
around an array of, say, 1024 contexts that are never freed and only
allocated dynamically when we run out.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Rethinking MemoryContext creation
I wrote:
> I've not done any benchmarking on this yet, just confirmed that it
> compiles and passes check-world.

So once I'd done some benchmarking, I was a bit disappointed: I could
not measure any difference anymore in "pgbench -S", and poking into
other scenarios found cases that were actually slower, like

do $$
begin
  for i in 1..1000 loop
    declare x int;
    begin
      x := 42;
    exception when others then null;
    end;
  end loop;
end$$;

which went from about 12.4 seconds in HEAD to about 13.6 with my v2
patch.

When I looked into the reason, I found that my earlier blithe
pronouncement that "optimizing for the unused-context case seems like
the wrong thing" was too simple.  In this example we create and delete
two memory contexts per loop (a subtransaction context and an
ExprContext) and neither of them receives any palloc requests.  Some of
the contexts created in a "pgbench -S" scenario are the same way.  So in
these examples, we were on the losing side of the
replace-a-palloc-with-a-malloc trade.

However, if we can predict which contexts are more likely not to receive
traffic, we can fix this by doing things the old way for those contexts.
I instrumented AllocSetDelete to log whether the target context had
received any requests in its lifespan, and aggregated the reports over a
run of the core regression tests.
I found these cases where there were significantly more reports of
"context was never used" than "context was used":

  379 CurTransactionContext was never used
   24 CurTransactionContext was used
66978 ExprContext was never used
17364 ExprContext was used
  993 HashTableContext was never used
   25 HashTableContext was used
   88 Logical Replication Launcher sublist was never used
    2 Logical Replication Launcher sublist was used
11139 SPI Exec was never used
 2421 SPI Exec was used
   36 SetOp was never used
    2 SetOp was used
  185 Statistics snapshot was never used
  104 Subplan HashTable Context was never used
   44 Subplan HashTable Context was used
  148 Subplan HashTable Temp Context was never used
 1481 Table function arguments was never used
   45 Table function arguments was used
   22 TableFunc per value context was never used
   52 Unique was never used
    4 Unique was used
  137 WindowAgg Aggregates was never used
    2 WindowAgg Aggregates was used
  127 WindowAgg Partition was never used
   12 WindowAgg Partition was used
  288 _bt_pagedel was never used
   14 _bt_pagedel was used
   35 event trigger context was never used
    3 event trigger context was used
38386 ginPlaceToPage temporary context was never used
 3348 ginPlaceToPage temporary context was used
  454 ident parser context was never used
  229 tSRF function arguments was never used
   46 tSRF function arguments was used

A lot of these probably aren't worth optimizing, because they just don't
occur often enough to move the needle.  (And in workloads where that
wasn't true, maybe the usage statistics would be different anyway.)  But
it looked to me like CurTransactionContext, ExprContext,
HashTableContext, SPI Exec, and the ginPlaceToPage temporary context
would be worth doing it the old way for.

Accordingly, attached is a v3 patch that adds another flag to
AllocSetContextCreateExtended telling it to do things the old way with
the context header in TopMemoryContext.  (We can still optimize context
name copying as before.)
This fixes the subtransaction case I showed above, bringing it to 10.7
sec, which is noticeably better than HEAD.  I now see maybe a 1 or 2
percent improvement in "pgbench -S", which isn't much, but it turns out
that that test case only involves half a dozen context
creations/deletions per transaction.  So probably we aren't going to get
any noticeable movement on that benchmark, and it'd be smarter to look
for test cases where more contexts are involved.

			regards, tom lane

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 868c14e..adbbc44 100644
*** a/contrib/amcheck/verify_nbtree.c
--- b/contrib/amcheck/verify_nbtree.c
*** bt_check_every_level(Relation rel, bool
*** 295,303 ****
  	/* Create context for page */
  	state->targetcontext = AllocSetContextCreate(CurrentMemoryContext,
  												 "amcheck context",
! 												 ALLOCSET_DEFAULT_MINSIZE,
! 												 ALLOCSET_DEFAULT_INITSIZE,
! 												 ALLOCSET_DEFAULT_MAXSIZE);
  	state->checkstrategy = GetAccessStrategy(BAS_BULKREAD);

  	/* Get true root block from meta-page */
--- 295,301 ----
  	/* Create context for page */
  	state->targetcontext = AllocSetContextCreate(CurrentMemoryContext,
  												 "amcheck context",
! 												 ALLOCSET_DEFAULT_SIZES);
  	state->checkstrategy = GetAccessStrategy(BAS_BULKREAD);

  	/* Get true root block from meta-page */
diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index
Re: Rethinking MemoryContext creation
On 11 December 2017 at 17:38, Tom Lane wrote:
> Simon Riggs writes:
>> On 11 December 2017 at 16:27, Tom Lane wrote:
>>> For a *very* large majority of the callers of AllocSetContextCreate,
>>> the context name is a simple C string constant, so we could just store
>>> the pointer to it and save the space and cycles required to copy it.

>> Why have the string at all in that case?

> Try reading a MemoryContextStats dump without it ...

Understood.  I thought you were suggesting removing it in favour of a
pointer.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Rethinking MemoryContext creation
Robert Haas writes:
> On Mon, Dec 11, 2017 at 12:36 PM, Tom Lane wrote:
>> [ thinks... ]  If we wanted to go that way, one thing we could do to
>> help extension authors (and ourselves) is to define the proposed
>> AllocSetContextCreate macro to include
>>
>> 	StaticAssertExpr(__builtin_constant_p(name))
>>
>> on compilers that have __builtin_constant_p.  Now, that only helps
>> people using gcc and gcc-alikes, but that's a large fraction of
>> developers I should think.  (I tested this and it does seem to
>> correctly recognize string literals as constants.)

> I like that idea.  I think that would provide good protection not only
> for third-party developers but for core developers.

It turns out this is slightly more painful than I'd anticipated.  I
tried to #define AllocSetContextCreate with five parameters, but all of
the call sites that use the size abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) blew up, because as far as they
were concerned there were only three parameters, since the abstraction
macros hadn't gotten expanded yet.  We can make it work by #defining
AllocSetContextCreate with three parameters:

	#define AllocSetContextCreate(parent, name, allocparams) ...

This approach means that you *must* use an abstraction macro when going
through AllocSetContextCreate; if you want to write out the parameters
longhand, you have to call AllocSetContextCreateExtended.  I do not feel
that this is a big loss, but there were half a dozen sites in our code
that were doing it the old way.  More significantly, since we only
introduced those macros in 9.6, I suspect that most extensions are still
doing it the old way and will get broken by this change.  It's not hard
to fix, but the annoyance factor will probably be real.  I see no good
way around it though: we can't use a static inline function instead,
because that would almost certainly break the __builtin_constant_p test.
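The preprocessor behavior Tom ran into hinges on when argument counting happens: arguments are delimited at the invocation site before any macro expansion, so a size-abstraction macro counts as one argument there, and only expands into three inside the wrapper's body.  A minimal stand-alone model (the `demo_*` names are illustrative, not PostgreSQL's):

```c
#include <stddef.h>

/* Stand-in for AllocSetContextCreateExtended; echoes one parameter so
 * the expansion can be observed. */
static size_t
demo_create_extended(const char *name, size_t minsz, size_t initsz,
					 size_t maxsz)
{
	(void) name;
	(void) minsz;
	(void) maxsz;
	return initsz;
}

/* Stand-in for ALLOCSET_DEFAULT_SIZES: expands to THREE arguments. */
#define DEMO_DEFAULT_SIZES	0, 8192, (8 * 1024 * 1024)

/* The wrapper nominally takes three parameters.  At the call site,
 * DEMO_DEFAULT_SIZES is a single unexpanded token and so counts as one
 * argument; it only becomes three arguments when substituted into the
 * body.  A five-parameter wrapper would fail here with "too few
 * arguments", exactly as described above. */
#define demo_create(name, allocparams) \
	demo_create_extended(name, allocparams)
```

This also shows why longhand callers lose out: `demo_create("x", 0, 8192, ...)` supplies four arguments to a two-parameter macro and fails to compile, so they must call the Extended function directly.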
I did not bother with compatibility macros for SlabContext or
GenerationContext --- I really doubt any extension code is using the
former yet, and they couldn't be using the latter since it's new.

I've not done any benchmarking on this yet, just confirmed that it
compiles and passes check-world.

			regards, tom lane

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 868c14e..adbbc44 100644
*** a/contrib/amcheck/verify_nbtree.c
--- b/contrib/amcheck/verify_nbtree.c
*** bt_check_every_level(Relation rel, bool
*** 295,303 ****
  	/* Create context for page */
  	state->targetcontext = AllocSetContextCreate(CurrentMemoryContext,
  												 "amcheck context",
! 												 ALLOCSET_DEFAULT_MINSIZE,
! 												 ALLOCSET_DEFAULT_INITSIZE,
! 												 ALLOCSET_DEFAULT_MAXSIZE);
  	state->checkstrategy = GetAccessStrategy(BAS_BULKREAD);

  	/* Get true root block from meta-page */
--- 295,301 ----
  	/* Create context for page */
  	state->targetcontext = AllocSetContextCreate(CurrentMemoryContext,
  												 "amcheck context",
! 												 ALLOCSET_DEFAULT_SIZES);
  	state->checkstrategy = GetAccessStrategy(BAS_BULKREAD);

  	/* Get true root block from meta-page */
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 046898c..e93d740 100644
*** a/src/backend/access/transam/xact.c
--- b/src/backend/access/transam/xact.c
*** AtStart_Memory(void)
*** 997,1007 ****
  	 */
  	if (TransactionAbortContext == NULL)
  		TransactionAbortContext =
! 			AllocSetContextCreate(TopMemoryContext,
! 								  "TransactionAbortContext",
! 								  32 * 1024,
! 								  32 * 1024,
! 								  32 * 1024);

  	/*
  	 * We shouldn't have a transaction context already.
--- 997,1008 ----
  	 */
  	if (TransactionAbortContext == NULL)
  		TransactionAbortContext =
! 			AllocSetContextCreateExtended(TopMemoryContext,
! 										  "TransactionAbortContext",
! 										  0,
! 										  32 * 1024,
! 										  32 * 1024,
! 										  32 * 1024);

  	/*
  	 * We shouldn't have a transaction context already.
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index dd4a8d3..6e27856 100644
*** a/src/backend/catalog/partition.c
--- b/src/backend/catalog/partition.c
*** RelationBuildPartitionDesc(Relation rel)
*** 513,521 ****
  	}

  	/* Now build the actual relcache partition descriptor */
! 	rel->rd_pdcxt = AllocSetContextCreate(CacheMemoryContext,
! 										  RelationGetRelationName(rel),
! 										  ALLOCSET_DEFAULT_SIZES);
  	oldcxt = MemoryContextSwitchTo(rel->rd_pdcxt);

  	result = (PartitionDescData *) palloc0(sizeof(PartitionDescData));
--- 513,522 ----
  	}

  	/* Now build the actual relcache partition descriptor */
! 	rel->rd_pdcxt = AllocSetContextCreateExtended(CacheMemoryContext,
! 												  RelationGetRelationName(rel),
! 												  MEMCONTEXT_OPTION_COPY_NAME,
!
Re: Rethinking MemoryContext creation
On 2017-12-11 13:09:42 -0500, Tom Lane wrote:
> Tomas Vondra writes:
>> On 12/10/2017 04:42 PM, Tom Lane wrote:
>>> Peter Geoghegan writes:
>>>> On Sat, Dec 9, 2017 at 5:53 PM, Tom Lane wrote:
>>>>> Overall I'm seeing about a 5% improvement in a "pgbench -S" scenario,
>>>>> although that number is a bit shaky since the run-to-run variation
>>>>> is a few percent anyway.
>
>> FWIW I've done some measurements, and while there is an improvement,
>> it's far from 5%. ...
>> So that's about 1.3% and 1.2% improvement.  It seems fairly consistent,
>> but it might easily be due to differences in layout of the binaries.
>
> Thanks for checking.  With these sorts of small-percentage improvements,
> I would not be surprised for platform-to-platform results to be
> different.  At least you do see some improvement too.
>
> Let me code up the change to avoid copying constant name strings,
> and then we can see if that helps any.

FWIW:

+    5.37%  postgres  postgres      [.] hash_search_with_hash_value
+    2.94%  postgres  postgres      [.] AllocSetAlloc
+    2.68%  postgres  postgres      [.] _bt_compare
+    2.36%  postgres  postgres      [.] LWLockAcquire
+    2.13%  postgres  postgres      [.] PostgresMain
+    1.47%  postgres  postgres      [.] LWLockRelease
+    1.32%  postgres  libc-2.25.so  [.] _int_malloc
+    1.23%  postgres  libc-2.25.so  [.] vfprintf
+    1.22%  postgres  postgres      [.] LockAcquire
+    1.20%  postgres  postgres      [.] _bt_first
-    1.11%  postgres  libc-2.25.so  [.] strlen
   - strlen
      + 32.24% MemoryContextCreate
      + 16.97% pq_getmsgstring
      + 14.63% set_ps_display

So I'd expect it to help a small amount.

- Andres
Re: Rethinking MemoryContext creation
Tomas Vondra writes:
> On 12/10/2017 04:42 PM, Tom Lane wrote:
>> Peter Geoghegan writes:
>>> On Sat, Dec 9, 2017 at 5:53 PM, Tom Lane wrote:
>>>> Overall I'm seeing about a 5% improvement in a "pgbench -S" scenario,
>>>> although that number is a bit shaky since the run-to-run variation
>>>> is a few percent anyway.

> FWIW I've done some measurements, and while there is an improvement,
> it's far from 5%. ...
> So that's about 1.3% and 1.2% improvement.  It seems fairly consistent,
> but it might easily be due to differences in layout of the binaries.

Thanks for checking.  With these sorts of small-percentage improvements,
I would not be surprised for platform-to-platform results to be
different.  At least you do see some improvement too.

Let me code up the change to avoid copying constant name strings,
and then we can see if that helps any.

			regards, tom lane
Re: Rethinking MemoryContext creation
On Mon, Dec 11, 2017 at 12:36 PM, Tom Lane wrote:
> [ thinks... ]  If we wanted to go that way, one thing we could do to
> help extension authors (and ourselves) is to define the proposed
> AllocSetContextCreate macro to include
>
> 	StaticAssertExpr(__builtin_constant_p(name))
>
> on compilers that have __builtin_constant_p.  Now, that only helps
> people using gcc and gcc-alikes, but that's a large fraction of
> developers I should think.  (I tested this and it does seem to
> correctly recognize string literals as constants.)

I like that idea.  I think that would provide good protection not only
for third-party developers but for core developers.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
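The compile-time check being endorsed here can be modeled in a stand-alone way.  `ASSERT_LITERAL_NAME` below is a stand-in using the classic negative-array-size trick, not PostgreSQL's actual StaticAssertExpr; `demo_register` is a hypothetical placeholder for the real creation call:

```c
/* On GCC-alikes, reject context names that are not compile-time
 * constants: a non-constant name makes the array size -1, which is a
 * compile-time error.  Elsewhere, the check is a no-op. */
#if defined(__GNUC__)
#define ASSERT_LITERAL_NAME(name) \
	((void) sizeof(char[__builtin_constant_p(name) ? 1 : -1]))
#else
#define ASSERT_LITERAL_NAME(name) ((void) 0)
#endif

/* Hypothetical stand-in for the real context-creation function. */
static const char *
demo_register(const char *name)
{
	return name;
}

/* Wrapper macro: checks the name, then forwards it. */
#define demo_register_checked(name) \
	(ASSERT_LITERAL_NAME(name), demo_register(name))
```

With this in place, `demo_register_checked("foo")` compiles cleanly, while passing a runtime-built string fails to compile on gcc/clang - which is exactly the protection wanted for callers who should be using the name-copying API instead.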
Re: Rethinking MemoryContext creation
Robert Haas writes:
> On Mon, Dec 11, 2017 at 11:59 AM, Tom Lane wrote:
>> Tomas Vondra writes:
>>> On 12/11/2017 05:27 PM, Tom Lane wrote:
>>>> However, unless we want to run around and touch all the ~150 calls
>>>> with constant arguments, we'd have to set things up so that the default
>>>> behavior for AllocSetContextCreate is to not copy.  This risks breaking
>>>> callers in extensions.  Not entirely sure if it's worth that --- any
>>>> thoughts?

>>> I don't think silently breaking extensions is a particularly attractive
>>> option, so I guess we'll have to run around and tweak the ~150 calls.

>> Meh.  I suppose that of the ~150 call sites, there are probably only
>> a dozen or two where it would actually make a performance difference,
>> so maybe this needn't be quite as invasive as I first thought.

> I think changing only a subset of the call sites is unappealing
> because, even though it may not make a measurable performance
> difference in other cases, it may get cargo-culted into some place
> where it does make a difference.

Would it be acceptable to only get this optimisation on compilers that
support __builtin_constant_p or similar?  If so, the wrapper macro could
use that to automatically pass the no-copy flag when called with a
literal string.

- ilmari

-- 
"I use RMS as a guide in the same way that a boat captain would use
a lighthouse.  It's good to know where it is, but you generally
don't want to find yourself in the same spot." - Tollef Fog Heen
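Ilmari's suggestion - have the wrapper decide the copy-or-not flag itself, based on whether the name argument folds to a compile-time constant - could look like this.  The flag name and `demo_*` functions are illustrative stand-ins, not the proposed PostgreSQL API:

```c
/* Hypothetical "copy the name into the context" flag. */
#define DEMO_COPY_NAME	0x01

/* Stand-in for the real creation function; echoes the flags so the
 * macro's decision can be observed. */
static int
demo_create_flags(const char *name, int flags)
{
	(void) name;
	return flags;
}

#if defined(__GNUC__)
/* A string literal folds to a constant, so its pointer stays valid for
 * the context's whole lifetime and need not be copied; anything else
 * gets the copy flag automatically. */
#define demo_create_auto(name) \
	demo_create_flags((name), \
					  __builtin_constant_p(name) ? 0 : DEMO_COPY_NAME)
#else
/* Without the builtin, always copy - the safe default. */
#define demo_create_auto(name) demo_create_flags((name), DEMO_COPY_NAME)
#endif
```

The appeal is that no call sites need touching and extensions can't get it wrong; the cost is that the optimization silently disappears on compilers without `__builtin_constant_p`.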
Re: Rethinking MemoryContext creation
Simon Riggs writes:
> On 11 December 2017 at 16:27, Tom Lane wrote:
>> For a *very* large majority of the callers of AllocSetContextCreate,
>> the context name is a simple C string constant, so we could just store
>> the pointer to it and save the space and cycles required to copy it.

> Why have the string at all in that case?

Try reading a MemoryContextStats dump without it ...

			regards, tom lane
Re: Rethinking MemoryContext creation
On 12/11/2017 06:22 PM, Robert Haas wrote:
> On Mon, Dec 11, 2017 at 11:59 AM, Tom Lane wrote:
>> Tomas Vondra writes:
>>> On 12/11/2017 05:27 PM, Tom Lane wrote:
>>>> However, unless we want to run around and touch all the ~150 calls
>>>> with constant arguments, we'd have to set things up so that the default
>>>> behavior for AllocSetContextCreate is to not copy.  This risks breaking
>>>> callers in extensions.  Not entirely sure if it's worth that --- any
>>>> thoughts?

>>> I don't think silently breaking extensions is a particularly attractive
>>> option, so I guess we'll have to run around and tweak the ~150 calls.

>> Meh.  I suppose that of the ~150 call sites, there are probably only
>> a dozen or two where it would actually make a performance difference,
>> so maybe this needn't be quite as invasive as I first thought.

> I think changing only a subset of the call sites is unappealing
> because, even though it may not make a measurable performance
> difference in other cases, it may get cargo-culted into some place
> where it does make a difference.

Not sure.  On the one hand, it might certainly be somewhat confusing
when we use two different methods to create memory contexts with C
string constants.  On the other hand, I'm sure we have other similar
"local" optimizations.  I'd say "let's just tweak all the calls to use
the new function" but I'm not the person doing the leg work ...

> I also don't think silently breaking extensions is a terribly big deal
> in this case.  It seems likely that most extensions use static names
> just as most of our internal stuff does.  I'm going to guess that the
> number of extensions that will actually break as a result of a change
> in this area is probably very small - conceivably zero, and likely
> less than five.  I don't think we should be willing to uglify the core
> code too much for that level of breakage.

I don't know how many extensions you maintain, but in my experience this
type of silent breakage is extremely annoying.
I definitely prefer a clean compile-time API breakage, for example.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Rethinking MemoryContext creation
On 11 December 2017 at 16:27, Tom Lane wrote:
> For a *very* large majority of the callers of AllocSetContextCreate,
> the context name is a simple C string constant, so we could just store
> the pointer to it and save the space and cycles required to copy it.

Why have the string at all in that case?

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Rethinking MemoryContext creation
On Mon, Dec 11, 2017 at 11:59 AM, Tom Lane wrote:
> Tomas Vondra writes:
>> On 12/11/2017 05:27 PM, Tom Lane wrote:
>>> However, unless we want to run around and touch all the ~150 calls
>>> with constant arguments, we'd have to set things up so that the default
>>> behavior for AllocSetContextCreate is to not copy.  This risks breaking
>>> callers in extensions.  Not entirely sure if it's worth that --- any
>>> thoughts?

>> I don't think silently breaking extensions is a particularly attractive
>> option, so I guess we'll have to run around and tweak the ~150 calls.

> Meh.  I suppose that of the ~150 call sites, there are probably only
> a dozen or two where it would actually make a performance difference,
> so maybe this needn't be quite as invasive as I first thought.

I think changing only a subset of the call sites is unappealing
because, even though it may not make a measurable performance
difference in other cases, it may get cargo-culted into some place
where it does make a difference.

I also don't think silently breaking extensions is a terribly big deal
in this case.  It seems likely that most extensions use static names
just as most of our internal stuff does.  I'm going to guess that the
number of extensions that will actually break as a result of a change
in this area is probably very small - conceivably zero, and likely
less than five.  I don't think we should be willing to uglify the core
code too much for that level of breakage.

But those are just my opinions.  I am glad you are working on this; I
noticed this problem before and thought of trying to do something about
it, but ran out of time and ideas.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Rethinking MemoryContext creation
Tomas Vondra writes:
> On 12/11/2017 05:27 PM, Tom Lane wrote:
>> However, unless we want to run around and touch all the ~150 calls
>> with constant arguments, we'd have to set things up so that the default
>> behavior for AllocSetContextCreate is to not copy.  This risks breaking
>> callers in extensions.  Not entirely sure if it's worth that --- any
>> thoughts?

> I don't think silently breaking extensions is a particularly attractive
> option, so I guess we'll have to run around and tweak the ~150 calls.

Meh.  I suppose that of the ~150 call sites, there are probably only
a dozen or two where it would actually make a performance difference,
so maybe this needn't be quite as invasive as I first thought.

			regards, tom lane
Re: Rethinking MemoryContext creation
On 12/11/2017 05:27 PM, Tom Lane wrote:
> I wrote:
>> While fooling around with a different problem, I got annoyed at how slow
>> MemoryContext creation and deletion is.
>
> Looking at this some more, there's another micro-optimization we could
> make, which is to try to get rid of the strlen+strcpy operations needed
> for the context name.  (And yes, I'm seeing those show up in profiles,
> to the tune of a couple percent of total runtime in some examples.)
>
> For a *very* large majority of the callers of AllocSetContextCreate,
> the context name is a simple C string constant, so we could just store
> the pointer to it and save the space and cycles required to copy it.
> This would require providing a separate API for the few callers that are
> actually passing transient strings, but that's easy enough.  I envision
> AllocSetContextCreate becoming a wrapper macro for
> "AllocSetContextCreateExtended", which'd take a flag argument specifying
> whether the context name needs to be copied.
>
> However, unless we want to run around and touch all the ~150 calls
> with constant arguments, we'd have to set things up so that the default
> behavior for AllocSetContextCreate is to not copy.  This risks breaking
> callers in extensions.  Not entirely sure if it's worth that --- any
> thoughts?

I don't think silently breaking extensions is a particularly attractive
option, so I guess we'll have to run around and tweak the ~150 calls.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Rethinking MemoryContext creation
Peter Geoghegan writes:
> On Sat, Dec 9, 2017 at 5:53 PM, Tom Lane wrote:
>> Overall I'm seeing about a 5% improvement in a "pgbench -S" scenario,
>> although that number is a bit shaky since the run-to-run variation
>> is a few percent anyway.

> Is that with "-M prepared", too?

No, I didn't use that.

			regards, tom lane
Rethinking MemoryContext creation
While fooling around with a different problem, I got annoyed at how slow
MemoryContext creation and deletion is.  A big chunk of that seems to be
the palloc/pfree cycle needed to allocate the context header block in
TopMemoryContext.  After thinking about it for a bit, I decided that
it'd be smarter to include the context header in the context's first
malloc() allocation, thereby eliminating the palloc/pfree cycle
completely.  This is not completely cost-free, because it means that we
must allocate a context's first block ("keeper" block) immediately, even
if no pallocs ever get done in it.  However, optimizing for the
unused-context case seems like the wrong thing.

To do this, we need to move the responsibility for allocating and
releasing the context header into the individual context-type-specific
modules, which leads to a certain amount of code duplication, but not a
lot.  A big advantage is that we can simplify the overly-complicated
creation API that had the context module calling into mcxt.c, which then
recursed back to the context module.  Further down the pike there might
be some additional win, since this gives the context module more control
over where to put the header (shared memory maybe?).

Another point worth making is that getting the context headers out of
TopMemoryContext reduces the cost of transiently having a lot of
contexts.  Since TopMemoryContext is never reset, and aset.c doesn't
give memory back to libc short of a reset, creating a lot of contexts
permanently bloats TopMemoryContext even if they all get deleted later.

I didn't spend a lot of effort on slab.c or generation.c, just having
them do a separate malloc and free for the context header.  If anyone's
really excited about that, they could be improved later.

To cut to the chase: I experimented with simply creating and deleting an
aset.c memory context in a tight loop, optionally with one palloc in the
context before deleting it.
On my machine, the time per loop looked like

				HEAD		with patch

no palloc at all		 9.2 us		10.3 us
palloc 100 bytes		18.5 us		11.6 us
palloc 10000 bytes		17.2 us		18.5 us

So the unused-context case does get a bit slower, as we trade off a
malloc/free cycle to save a palloc/pfree cycle.  The case where we've
made some use of the context gets considerably quicker though.  Also,
the last line shows a situation where the lone palloc request does not
fit in the context's initial allocation, so that we're forced to make
two malloc requests not one.  This is a bit slower too, but not by much,
and I think it's really an unusual case.  (As soon as you've made any
requests that do fit in the initial allocation, you'd be back to the
patch winning.)

Overall I'm seeing about a 5% improvement in a "pgbench -S" scenario,
although that number is a bit shaky since the run-to-run variation
is a few percent anyway.

Thoughts, objections?  Anyone want to do some other benchmarking?

			regards, tom lane

diff --git a/src/backend/utils/mmgr/README b/src/backend/utils/mmgr/README
index 296fa19..7f08f5d 100644
*** a/src/backend/utils/mmgr/README
--- b/src/backend/utils/mmgr/README
*** every other context is a direct or indir
*** 177,184 ****
  here is essentially the same as "malloc", because this context will never
  be reset or deleted.  This is for stuff that should live forever, or for
  stuff that the controlling module will take care of deleting at the
! appropriate time.  An example is fd.c's tables of open files, as well as
! the context management nodes for memory contexts themselves.  Avoid
  allocating stuff here unless really necessary, and especially avoid
  running with CurrentMemoryContext pointing here.
--- 177,183 ----
  here is essentially the same as "malloc", because this context will never
  be reset or deleted.  This is for stuff that should live forever, or for
  stuff that the controlling module will take care of deleting at the
! appropriate time.  An example is fd.c's tables of open files.
  Avoid
  allocating stuff here unless really necessary, and especially avoid
  running with CurrentMemoryContext pointing here.
*** a maximum block size.  Selecting smaller
*** 420,430 ****
  space in contexts that aren't expected to hold very much (an example is
  the relcache's per-relation contexts).

! Also, it is possible to specify a minimum context size.  If this
! value is greater than zero then a block of that size will be grabbed
! immediately upon context creation, and cleared but not released during
! context resets.  This feature is needed for ErrorContext (see above),
! but will most likely not be used for other contexts.

  We expect that per-tuple contexts will be reset frequently and typically
  will not allocate very much space per tuple cycle.  To make this usage
--- 419,428 ----
  space in contexts that aren't expected to hold very much (an example is
  the
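Stepping back to the core idea of the root patch - carving the context header out of the front of the context's first ("keeper") malloc block, so creation costs one malloc instead of a palloc in TopMemoryContext plus a malloc - a minimal stand-alone model might look like this.  The `DemoContext` layout and sizes are illustrative, not aset.c's actual structures:

```c
#include <stdlib.h>
#include <stddef.h>

#define KEEPER_SIZE 8192		/* size of the single initial allocation */

/* The header lives at the front of the keeper block itself. */
typedef struct DemoContext
{
	size_t		blocksize;		/* total size of the malloc'd block */
	size_t		used;			/* bytes consumed, including this header */
} DemoContext;

DemoContext *
demo_context_create(void)
{
	/* One malloc covers both the header and the keeper block. */
	DemoContext *ctx = malloc(KEEPER_SIZE);

	if (ctx == NULL)
		return NULL;
	ctx->blocksize = KEEPER_SIZE;
	ctx->used = sizeof(DemoContext);
	return ctx;
}

void *
demo_alloc(DemoContext *ctx, size_t size)
{
	char	   *p;

	if (ctx->used + size > ctx->blocksize)
		return NULL;			/* a real allocator would chain a new block */
	p = (char *) ctx + ctx->used;
	ctx->used += size;
	return p;
}

void
demo_context_delete(DemoContext *ctx)
{
	free(ctx);					/* header and keeper block go together */
}
```

This makes the trade-off in the benchmark table concrete: a context that never sees a palloc still pays for the full keeper block up front, but any context that does get used has its first allocations satisfied from memory that was already grabbed at creation time.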