Re: crash with sql language partition support function
Hi, On 2018-04-13 15:08:30 -0300, Alvaro Herrera wrote: > I think this is a good improvement. On top of that, I propose a new > file partitioning/partdefs.h with the following approximate contents. > This reduces cross-inclusion of headers to the minimum. I'm dealing > with the fallout from this now, will post a complete patch shortly. It'd be good to adjust the thread topic - this surely isn't about the crash anymore. And it's good, especially given we're past the feature freeze etc, if the subject conveyed what's happening? - Andres
Re: crash with sql language partition support function
I think this is a good improvement. On top of that, I propose a new file partitioning/partdefs.h with the following approximate contents. This reduces cross-inclusion of headers to the minimum. I'm dealing with the fallout from this now, will post a complete patch shortly. /*- * * partdefs.h * Base definitions for partitioned table handling * * Copyright (c) 2007-2018, PostgreSQL Global Development Group * * src/include/partitioning/partdefs.h * *- */ #ifndef PARTDEFS_H #define PARTDEFS_H typedef enum PartitionRangeDatumKind { PARTITION_RANGE_DATUM_MINVALUE = -1,/* less than any other value */ PARTITION_RANGE_DATUM_VALUE = 0,/* a specific (bounded) value */ PARTITION_RANGE_DATUM_MAXVALUE = 1 /* greater than any other value */ } PartitionRangeDatumKind; typedef struct PartitionBoundInfoData *PartitionBoundInfo; typedef struct PartitionKeyData *PartitionKey; typedef struct PartitionBoundSpec PartitionBoundSpec; typedef struct PartitionDescData *PartitionDesc; #endif /* PARTDEFS_H */ -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: crash with sql language partition support function
Amit Langote wrote: > On 2018/04/13 6:58, Alvaro Herrera wrote: > > After going over your patch, I think you went slightly overboard here. > > Or maybe not, but this patch is so large that it's hard to form an > > opinion about it. > > It's mostly code movement, but there are some other changes as well as > described below. Hmm can you please split out the code changes into a separate patch? I can commit the code movement quickly, but the other stuff requres more creful review. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: crash with sql language partition support function
On 2018/04/13 3:10, Alvaro Herrera wrote: > Alvaro Herrera wrote: > >> I'm dealing with this now -- will push shortly. The sane thing to do is >> backpatch my previous memcxt fixes, since your patch introduces a >> problem that we discussed with that other patch, namely that you would >> leak the whole memory context if there is a problem while running the >> function. I also noticed here that copy_partition_key is doing memcpy() >> on the fmgr_info, which is bogus -- it should be using fmgr_info_copy. >> Rather than patching that one piece it seems better to replace it >> wholesale, since I bet this won't be the last time we'll hear about this >> routine, and I would prefer not to deal with differently broken code in >> the older branch. > > Pushed. Thanks for fixing that. Regards, Amit
Re: crash with sql language partition support function
I wonder what prompted people to #include "catalog/partition.h" in executor.h. Amit Langote wrote: > Anyway, after reading your replies, I thought of taking a stab at unifying > the partitioning information that's cached by relcache.c. After going over your patch, I think you went slightly overboard here. Or maybe not, but this patch is so large that it's hard to form an opinion about it. Some of these cleanups we should probably adopt per discussion upthread, but I think we're at a point where we have to work in smaller steps to avoid destabilizing the code. I'm not sure about the new PartitionInfo that you propose. I see you had to add partitioning/partbounds.h to rel.h -- not a fan of that. I was thinking of a simpler fix there, just remove one of the two memcxts in RelationData and put both structs in the remaining one. Maybe I'm not ambitious enough. Here's what I would propose: create partitioning/partbounds.c to deal with partition bounds (i.e. mostly PartitionBoundInfoData and PartitionBoundSpec), and have utils/cache/partcache.c (with corresponding utils/partcache.h) for RelationGetPartitionDesc and RelationGetPartitionKey (not much else, it seems). Maybe also move get_partition_for_tuple to execPartition.c? Preliminarly, I would say that the following functions would be in partbounds.c (in roughly this order): Exported: get_qual_from_partbound partition_bounds_equal partition_bounds_copy check_new_partition_bound check_default_allows_bound get_hash_partition_greatest_modulus make_one_range_bound qsort_partition_list_value_cmp partition_rbound_cmp partition_rbound_datum_cmp qsort_partition_hbound_cmp partition_hbound_cmp partition_list_bsearch partition_range_bsearch partition_range_datum_bsearch partition_hash_bsearch static: get_partition_bound_num_indexes make_partition_op_expr get_partition_operator get_qual_for_hash get_qual_for_list get_qual_for_range get_range_key_properties get_range_nulltest Unsure yet about compute_hash_value and satisfies_hash_partition. The rest would remain in catalog/partition.c, which should hopefully not be a lot. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: crash with sql language partition support function
Robert Haas wrote: > On Thu, Apr 12, 2018 at 8:55 AM, Alvaro Herrera > wrote: > > Amit Langote wrote: > >> Anyway, after reading your replies, I thought of taking a stab at unifying > >> the partitioning information that's cached by relcache.c. > > > > Wow. Now that's one large patch. I'm going to run with this for HEAD, > > but I think we should do a minimal fix for PG10. > > Is there really a compelling reason to do more than minimal fixes in > HEAD? IMO there is ample reason for restructuring the lot of code that currently lives in catalog/partition.c. We're going to have to support this code for a minimum of six years -- and that's only if we get around to reorganizing it during pg12. I don't have a lot of faith that I'll be motivated to reorganize it in pg12, but I do know that I am motivated to reorganize it now. If we do happen to reorganize it in pg12, then any bugs we find afterwards will cost double in terms of backpatching the fixes than if we reorganize it now. I don't want my future self to curse my current self for not doing it when it was appropriate -- i.e. when it was fresh in my mind and freshly written. > We are (or should be) trying to stabilize this branch so we can > ship it and start the next one, Yes, that is what I am trying to do -- I want us to have a sane base upon which to do our work for years to come. Do we need more than minimal fixes in the memory context, FmgrInfo, and miscellaneous other fixes that Amit is proposing in this patch? That I don't know. I hope to have an answer for this later, and I think this is the reason for your final comment: > and the chances that heavy hacking on this will delay that process > seem better than average. Maybe if there are no bugs and it's just ugly coding, it is best left alone. But as I said, I don't know the answer yet. I will make sure to propose any functional code changes separately from code movement. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: crash with sql language partition support function
Alvaro Herrera wrote: > I'm dealing with this now -- will push shortly. The sane thing to do is > backpatch my previous memcxt fixes, since your patch introduces a > problem that we discussed with that other patch, namely that you would > leak the whole memory context if there is a problem while running the > function. I also noticed here that copy_partition_key is doing memcpy() > on the fmgr_info, which is bogus -- it should be using fmgr_info_copy. > Rather than patching that one piece it seems better to replace it > wholesale, since I bet this won't be the last time we'll hear about this > routine, and I would prefer not to deal with differently broken code in > the older branch. Pushed. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: crash with sql language partition support function
On Thu, Apr 12, 2018 at 8:55 AM, Alvaro Herrera wrote: > Amit Langote wrote: >> Anyway, after reading your replies, I thought of taking a stab at unifying >> the partitioning information that's cached by relcache.c. > > Wow. Now that's one large patch. I'm going to run with this for HEAD, > but I think we should do a minimal fix for PG10. Is there really a compelling reason to do more than minimal fixes in HEAD? We are (or should be) trying to stabilize this branch so we can ship it and start the next one, and the chances that heavy hacking on this will delay that process seem better than average. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: crash with sql language partition support function
Amit Langote wrote: > Since this bug also exists in the released PG 10 branch, I also created a > patch for that. It's slightly different than the one for PG 11dev, > because there were some changes recently to how the memory context is > manipulated in RelationBuildPartitionKey. That's > v1-PG10-0001-Fix-a-memory-context-bug-in-RelationBuildPartitio.patch. Here's a repro for pg10, which doesn't have hash partitioning: create function my_int4_sort(int4,int4) returns int language sql as $$ select case when $1 = $2 then 0 when $1 > $2 then 1 else -1 end; $$; create operator class test_int4_ops for type int4 using btree as operator 1 < (int4,int4), operator 2 <= (int4,int4), operator 3 = (int4,int4), operator 4 >= (int4,int4), operator 5 > (int4,int4), function 1 my_int4_sort(int4,int4); create table t (a int4) partition by range (a test_int4_ops); create table t1 partition of t for values from (0) to (1000); insert into t values (100); insert into t values (200); -- *boom* I'm dealing with this now -- will push shortly. The sane thing to do is backpatch my previous memcxt fixes, since your patch introduces a problem that we discussed with that other patch, namely that you would leak the whole memory context if there is a problem while running the function. I also noticed here that copy_partition_key is doing memcpy() on the fmgr_info, which is bogus -- it should be using fmgr_info_copy. Rather than patching that one piece it seems better to replace it wholesale, since I bet this won't be the last time we'll hear about this routine, and I would prefer not to deal with differently broken code in the older branch. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: crash with sql language partition support function
Amit Langote wrote: > Anyway, after reading your replies, I thought of taking a stab at unifying > the partitioning information that's cached by relcache.c. Wow. Now that's one large patch. I'm going to run with this for HEAD, but I think we should do a minimal fix for PG10. Did you detect any further bugs, while doing all this rework, apart from the one that started this thread? If not, I would prefer to do commit the minimal fix at start of thread to both branches, then apply the larger restructuring patch to HEAD only. For the record, I don't like the amount of code that this is putting in relcache.c. I am thinking that most of that code will go to src/backend/partitioning/partbounds.c instead. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: crash with sql language partition support function
Alvaro Herrera writes: > Ashutosh Bapat wrote: >> On Tue, Apr 10, 2018 at 1:44 PM, Amit Langote >> wrote: >>> Attached fixes it. It teaches RelationBuildPartitionKey() to use >>> fmgr_info_cxt and pass rd_partkeycxt to it. >> The patch is using partkeycxt and not rd_partkeycxt. Probably a typo >> in the mail. No, it's correct as written, because rd_partkeycxt hasn't been set yet. > Good point. Yeah, it looks like it should cause problems. I think it > would be better to have RelationGetPartitionKey() return a copy in the > caller's context of the data of the relcache data, rather than the > relcache data itself, no? Seems like this would not fail if there never > is a CCI between the RelationGetPartitionKey call and the last access of > the returned key struct ... but if this is the explanation, then it's a > pretty brittle setup and we should stop relying on that. Yeah, all of the callers of RelationGetPartitionKey seem to assume that the pointer they get is guaranteed valid and will stay so forever. This is pretty dangerous, independently of the bug Amit mentions. However, I'm not sure that copy-on-read is a good solution here, because of exactly the point at stake that the FmgrInfos may have infrastructure. We don't have a way to copy that, and even if we did, copying on every reference would be really expensive. We might try to make this work like the relcache infrastructure for indexes, which also contains FmgrInfos. However, in the index case we may safely assume that that stuff never changes for the life of the index. I'm afraid that's not true for the partitioning data is it? How much does it actually buy us to store FmgrInfos here rather than, say, function OIDs? If we backed off to that, then the data structure might be simple enough that copy-on-read would work. Otherwise we might need some kind of refcount mechanism. We built one for domain constraints in the typcache, and it's not horrid, but it is a fair amount of code. > Finally: I don't quite understand why we need two memory contexts there, > one for the partkey and another for the partdesc. Surely one context > for both is sufficient. It'd only matter if there were a reason to delete/rebuild one but not the other within a particular relcache entry, which probably there isn't. So using one context for both would likely be a bit simpler and more efficient. BTW, another thing in the same general area that I was planning to get on the warpath about sooner or later is that the code managing rd_partcheck is really cruddy. It spews a lot of random data structure into the CacheMemoryContext with no way to release it at relcache inval, resulting in a session-lifespan memory leak. (pfree'ing just the List header, as RelationDestroyRelation does, is laughably inadequate.) Perhaps that could be fixed by likewise storing it in a sub-context used for partition information. regards, tom lane
Re: crash with sql language partition support function
Ashutosh Bapat wrote: > On Tue, Apr 10, 2018 at 1:44 PM, Amit Langote > wrote: > > > > Attached fixes it. It teaches RelationBuildPartitionKey() to use > > fmgr_info_cxt and pass rd_partkeycxt to it. > > The patch is using partkeycxt and not rd_partkeycxt. Probably a typo > in the mail. But a wider question, why that context? I guess that > cache context will vanish when that cache entry is thrown away. That's > the reason we have to copy partition key information in > find_partition_scheme() instead of just pointing to it and also use > fmgr_info_copy() there. But if that's the case, buildfarm animal run > with -DCLOBBER_CACHE_ALWAYS should show failure. I am missing > something here. Good point. Yeah, it looks like it should cause problems. I think it would be better to have RelationGetPartitionKey() return a copy in the caller's context of the data of the relcache data, rather than the relcache data itself, no? Seems like this would not fail if there never is a CCI between the RelationGetPartitionKey call and the last access of the returned key struct ... but if this is the explanation, then it's a pretty brittle setup and we should stop relying on that. I wonder why do we RelationBuildPartitionDesc and RelationBuildPartitionKey directly in RelationBuildDesc. Wouldn't it be better to delay constructing those until the first access to them? Finally: I don't quite understand why we need two memory contexts there, one for the partkey and another for the partdesc. Surely one context for both is sufficient. (And while at it, why not have the whole RelationData inside the same memory context, so that it can be blown away without so much retail pfree'ing?) -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: crash with sql language partition support function
On Tue, Apr 10, 2018 at 1:44 PM, Amit Langote wrote: > > Attached fixes it. It teaches RelationBuildPartitionKey() to use > fmgr_info_cxt and pass rd_partkeycxt to it. The patch is using partkeycxt and not rd_partkeycxt. Probably a typo in the mail. But a wider question, why that context? I guess that cache context will vanish when that cache entry is thrown away. That's the reason we have to copy partition key information in find_partition_scheme() instead of just pointing to it and also use fmgr_info_copy() there. But if that's the case, buildfarm animal run with -DCLOBBER_CACHE_ALWAYS should show failure. I am missing something here. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: crash with sql language partition support function
I have added this in the Older Bugs section of open items page. https://wiki.postgresql.org/wiki/PostgreSQL_11_Open_Items#Older_Bugs Thanks, Amit