Re: [HACKERS] Extending opfamilies for GIN indexes
Tom Lane writes:
> Actually the other way around. An opclass is the subset of an opfamily
> that is tightly bound to an index. The "build" methods have to be
> associatable with an index, so they're part of the index's opclass.
> The "query" methods could be loose in the opfamily.

I had understood your proposal to change that for GIN. Thinking again
now, keeping opfamily and opclass as they are now: an opclass is the
code we run to build and scan the index; an opfamily is a way to use the
same index data and code in more contexts than those strictly covered by
an opclass.

> The planner's not the problem here --- what's missing is the rule for
> the index AM to look up the right support functions to call at runtime.
>
> The trick is to associate the proper query support methods with any
> given query operator (which'd also be loose in the family, probably).
> The existing schema for pg_amop and pg_amproc is built on the assumption
> that the amoplefttype/amoprighttype are sufficient for making this
> association; but that seems to fall down if we would like to allow
> contrib modules to add new query operators that coincidentally take the
> same input types as an existing opfamily member.

Well, the opfamily machinery allows giving query support to any index
whose opclass is in the family. That is, the same set of operators is
covered by more than one opclass. What we want to add is that more than
one set of operators can find data support in more than one "index
kind". But you still want to run specific search code here.

So it seems to me we shouldn't attack the problem at the operator left
and right type level, but rather model the fact that we need another
level of flexibility, separating somewhat the building and maintenance
of the index data from the code that's used to access it.

The example that we're working from seems to be covered if we are able
to instruct PostgreSQL that a set of opclasses are "binary coercible"; I
think that's the term here.

Then the idea would be to have PostgreSQL able to figure out that a
given index can be used with any binary coercible opclass, rather than
only the one used to maintain it. What do you think?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extending opfamilies for GIN indexes
Dimitri Fontaine writes:
> For the GIN indexes, we have 2 methods for building the index and 3
> others to search it to solve the query. You're proposing that the 2
> former methods would be in the opfamily and the 3 latter in the opclass.

Actually the other way around. An opclass is the subset of an opfamily
that is tightly bound to an index. The "build" methods have to be
associatable with an index, so they're part of the index's opclass.
The "query" methods could be loose in the opfamily.

> So we would want the planner to know that in the GIN case an index built
> with any opclass of a given opfamily can help answer a query that would
> need any opclass of the opfamily. Right?

The planner's not the problem here --- what's missing is the rule for
the index AM to look up the right support functions to call at runtime.

The trick is to associate the proper query support methods with any
given query operator (which'd also be loose in the family, probably).
The existing schema for pg_amop and pg_amproc is built on the assumption
that the amoplefttype/amoprighttype are sufficient for making this
association; but that seems to fall down if we would like to allow
contrib modules to add new query operators that coincidentally take the
same input types as an existing opfamily member.

regards, tom lane
Re: [HACKERS] Extending opfamilies for GIN indexes
Tom Lane writes:
> I think you missed the point: right now, to use both the core and
> intarray operators on an integer[] column, you have to create *two*
> GIN indexes, which will have exactly identical contents. I'm looking
> for a way to let intarray extend the core opfamily definition so that
> one index can serve.

That I think I understood, but then I mixed up opfamily and opclasses
badly. Let's try again.

For the GIN indexes, we have 2 methods for building the index and 3
others to search it to solve the query. You're proposing that the 2
former methods would be in the opfamily and the 3 latter in the opclass.

We'd like to be able to use the same index (whose building depends on
the opfamily) for solving different kinds of queries, for which we can
use different traversal and search algorithms: that's the opclass.

So we would want the planner to know that in the GIN case an index built
with any opclass of a given opfamily can help answer a query that would
need any opclass of the opfamily. Right?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Extending opfamilies for GIN indexes
Robert Haas writes:
> On Wed, Jan 19, 2011 at 1:33 PM, Tom Lane wrote:
>> AFAICS that means integrating contrib/intarray into core. Independently
>> of whether that's a good idea or not, PG is supposed to be an extensible
>> system, so it would be nice to have a solution that supported add-on
>> extensions.

> Yeah, I'm just wondering if it's worth the effort, especially in view
> of a rather large patch queue we seem to have outstanding at the
> moment.

Oh, maybe we're not on the same page here: I wasn't really proposing to
do this right now, it's more of a TODO item.

Offhand the only reason to do it now would be if we settled on something
that required a layout change in pg_amop/pg_amproc. Since we already
have one such change in 9.1, getting the additional change done in the
same release would be valuable to reduce the number of distinct cases
for pg_dump and other clients to support.

regards, tom lane
Re: [HACKERS] Extending opfamilies for GIN indexes
On Wed, Jan 19, 2011 at 1:33 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Wed, Jan 19, 2011 at 12:29 PM, Tom Lane wrote:
>>> I think you missed the point: right now, to use both the core and
>>> intarray operators on an integer[] column, you have to create *two*
>>> GIN indexes, which will have exactly identical contents. I'm looking
>>> for a way to let intarray extend the core opfamily definition so that
>>> one index can serve.
>
>> Maybe this is a dumb question, but why not just put whatever stuff
>> intarray[] adds directly into the core opfamily?
>
> AFAICS that means integrating contrib/intarray into core. Independently
> of whether that's a good idea or not, PG is supposed to be an extensible
> system, so it would be nice to have a solution that supported add-on
> extensions.

Yeah, I'm just wondering if it's worth the effort, especially in view
of a rather large patch queue we seem to have outstanding at the
moment.

> The subtext here is that GIN, unlike the other index AMs, uses a
> representation that seems pretty amenable to supporting a wide variety
> of query types with a single index. contrib/intarray's "query_int"
> operators are not at all like the subset-inclusion-testing operators
> that the core opclass supports, and it's not very hard to think of
> additional cases that could be of interest to somebody (example: find
> all arrays that contain some/all entries within a given integer range).
> I think we're going to come up against similar situations over and over
> until we find a solution.

Interesting.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Extending opfamilies for GIN indexes
Robert Haas writes:
> On Wed, Jan 19, 2011 at 12:29 PM, Tom Lane wrote:
>> I think you missed the point: right now, to use both the core and
>> intarray operators on an integer[] column, you have to create *two*
>> GIN indexes, which will have exactly identical contents. I'm looking
>> for a way to let intarray extend the core opfamily definition so that
>> one index can serve.

> Maybe this is a dumb question, but why not just put whatever stuff
> intarray[] adds directly into the core opfamily?

AFAICS that means integrating contrib/intarray into core. Independently
of whether that's a good idea or not, PG is supposed to be an extensible
system, so it would be nice to have a solution that supported add-on
extensions.

The subtext here is that GIN, unlike the other index AMs, uses a
representation that seems pretty amenable to supporting a wide variety
of query types with a single index. contrib/intarray's "query_int"
operators are not at all like the subset-inclusion-testing operators
that the core opclass supports, and it's not very hard to think of
additional cases that could be of interest to somebody (example: find
all arrays that contain some/all entries within a given integer range).
I think we're going to come up against similar situations over and over
until we find a solution.

regards, tom lane
Re: [HACKERS] Extending opfamilies for GIN indexes
On Wed, Jan 19, 2011 at 12:29 PM, Tom Lane wrote:
> Dimitri Fontaine writes:
>> Tom Lane writes:
>>> Oh, wait a minute: there's a bad restriction there, namely that a
>>> contrib module could only add "loose" operators that had different
>>> declared input types from the ones known to the core opclass.
>
>> I would have thought that such contrib would then need to offer their own
>> opfamily and opclasses, and users would have to use the specific opclass
>> manually like they do e.g. for text_pattern_ops. Can't it work that way?
>
> I think you missed the point: right now, to use both the core and
> intarray operators on an integer[] column, you have to create *two*
> GIN indexes, which will have exactly identical contents. I'm looking
> for a way to let intarray extend the core opfamily definition so that
> one index can serve.

Maybe this is a dumb question, but why not just put whatever stuff
intarray[] adds directly into the core opfamily?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Extending opfamilies for GIN indexes
Dimitri Fontaine writes:
> Tom Lane writes:
>> Oh, wait a minute: there's a bad restriction there, namely that a
>> contrib module could only add "loose" operators that had different
>> declared input types from the ones known to the core opclass.

> I would have thought that such contrib would then need to offer their own
> opfamily and opclasses, and users would have to use the specific opclass
> manually like they do e.g. for text_pattern_ops. Can't it work that way?

I think you missed the point: right now, to use both the core and
intarray operators on an integer[] column, you have to create *two*
GIN indexes, which will have exactly identical contents. I'm looking
for a way to let intarray extend the core opfamily definition so that
one index can serve.

regards, tom lane
Re: [HACKERS] Extending opfamilies for GIN indexes
Tom Lane writes:
> Oh, wait a minute: there's a bad restriction there, namely that a
> contrib module could only add "loose" operators that had different
> declared input types from the ones known to the core opclass. Otherwise
> there'd be a conflict with the contrib module and core needing to insert
> similarly-keyed support functions. This would actually be enough for
> contrib/intarray (because the core operator entries are for "anyarray"
> not for "integer[]") but it is easy to foresee cases where that wouldn't
> be good enough. Seems like we'd need an additional key column in
> pg_amproc to really make this cover all cases.

I would have thought that such a contrib module would then need to offer
its own opfamily and opclasses, and users would have to use the specific
opclass manually like they do e.g. for text_pattern_ops. Can't it work
that way?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Extending opfamilies for GIN indexes
I wrote:
> So here's what I'm thinking: we could redefine a GIN opclass, per se, as
> needing only compare() and extractValue() procs to be bound into it.
> The other three procs, as well as the query operators, could be "loose"
> in the containing opfamily. The index AM would choose which set of the
> other support procedures to use for a specific query by matching their
> amproclefttype/amprocrighttype to the declared input types of the query
> operator, much as btree does.

> Having done that, contrib/intarray could work by adding "loose"
> operators and support procs to the core opfamily for integer[].

Oh, wait a minute: there's a bad restriction there, namely that a
contrib module could only add "loose" operators that had different
declared input types from the ones known to the core opclass. Otherwise
there'd be a conflict with the contrib module and core needing to insert
similarly-keyed support functions. This would actually be enough for
contrib/intarray (because the core operator entries are for "anyarray"
not for "integer[]") but it is easy to foresee cases where that wouldn't
be good enough. Seems like we'd need an additional key column in
pg_amproc to really make this cover all cases.

regards, tom lane
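The keying conflict described in the message above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PostgreSQL code: the catalog is modelled as a dictionary, the current key is taken to be (amprocfamily, amproclefttype, amprocrighttype, amprocnum), and the extra column `procset`, the proc names, and the helper `can_register` are all hypothetical names invented for the example.

```python
# Toy model of a catalog with a unique key; not PostgreSQL code.
class Catalog:
    def __init__(self, key_columns):
        self.key_columns = tuple(key_columns)
        self.rows = {}

    def insert(self, row):
        # Reject a row whose key columns match an existing row.
        key = tuple(row[c] for c in self.key_columns)
        if key in self.rows:
            raise ValueError("duplicate key: %r" % (key,))
        self.rows[key] = row


def can_register(key_columns, entries):
    """Return True if all entries fit in one catalog with this key."""
    catalog = Catalog(key_columns)
    try:
        for entry in entries:
            catalog.insert(entry)
    except ValueError:
        return False
    return True


CURRENT_KEY = ("amprocfamily", "amproclefttype", "amprocrighttype", "amprocnum")
# Hypothetical: one more key column, as the message suggests might be needed.
EXTENDED_KEY = CURRENT_KEY + ("procset",)

# Support proc 4 is GIN's consistent(); all proc names here are made up.
core_entry = dict(amprocfamily="array_ops", amproclefttype="anyarray",
                  amprocrighttype="anyarray", amprocnum=4,
                  amproc="core_consistent", procset="core")
intarray_entry = dict(amprocfamily="array_ops", amproclefttype="integer[]",
                      amprocrighttype="integer[]", amprocnum=4,
                      amproc="intarray_consistent", procset="intarray")
clash_entry = dict(amprocfamily="array_ops", amproclefttype="anyarray",
                   amprocrighttype="anyarray", amprocnum=4,
                   amproc="other_consistent", procset="other_contrib")

# intarray gets away with it because its declared types differ from core's.
intarray_ok = can_register(CURRENT_KEY, [core_entry, intarray_entry])
# A module using the same declared types as core collides under the current key.
clash_ok = can_register(CURRENT_KEY, [core_entry, clash_entry])
# With the hypothetical extra key column, both entries can coexist.
clash_ok_extended = can_register(EXTENDED_KEY, [core_entry, clash_entry])
```

Under these assumptions only the middle case fails, which is the "similarly-keyed support functions" conflict; what the additional key column would actually contain is left open in the thread.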
[HACKERS] Extending opfamilies for GIN indexes
I just got annoyed by the fact that contrib/intarray has support for
queries on GIN indexes on integer[] columns, but they only work if you
use the intarray-provided opclass, not the core-provided GIN opclass for
integer[] columns. In general, of course, two different GIN opclasses
aren't compatible, but here there is precious little reason why not:
the contents of the index are the same both ways, ie, all the individual
integer keys in the arrays. It would be a real usability improvement,
and would eliminate a foot-gun, if contrib/intarray could somehow be an
extension to the core opclass instead of an independent thing.

It seems to me that this should be possible within the opfamily/opclass
data structure. Right now, there isn't any real application for
opfamilies for GIN (or GiST) indexes, because both of those AMs pay
attention only to the "default" support procs that are bound into the
opclass for an index. But that could change.

In particular, only two of the five support procs used by GIN are
actually associated with "the index", in the sense of having some impact
on what's stored in the index: the compare() and extractValue() procs.
The other three are more associated with queries, though they do depend
on having knowledge about the behavior of the compare and extractValue
procs.

So here's what I'm thinking: we could redefine a GIN opclass, per se, as
needing only compare() and extractValue() procs to be bound into it.
The other three procs, as well as the query operators, could be "loose"
in the containing opfamily. The index AM would choose which set of the
other support procedures to use for a specific query by matching their
amproclefttype/amprocrighttype to the declared input types of the query
operator, much as btree does.

Having done that, contrib/intarray could work by adding "loose"
operators and support procs to the core opfamily for integer[].

It's possible that this scheme would also make it really useful to have
multiple opclasses within one GIN opfamily; though offhand I'm not sure
of an application for that. (Right now, the only reason to do that is
if you want to give opclasses for different types the same name, as we
do with the core "array_ops".)

Perhaps the same could be done with GiST, although I'm less sure about
the possible usefulness there.

Comments?

BTW, this idea means that amproc entries would no longer be tightly
associated with specific GIN opclasses, so the contentious patch for
getObjectDescription should indeed get applied.

regards, tom lane
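To make the proposed split concrete, here is a minimal Python sketch of the idea in the message above: the opclass carries only the two procs that determine index contents, while the query-side procs sit "loose" in the family and are picked by the operator's declared input types. It is a toy model under stated assumptions, not how GIN is implemented: the classes, the `search` helper, and the `"3|5"` query syntax (a stand-in for contrib/intarray's query_int, here meaning "contains 3 or 5") are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

TypePair = Tuple[str, str]  # (left, right) declared input types of an operator


@dataclass
class OpClass:
    """Procs bound to the index: they decide what is stored in it."""
    name: str
    compare: Callable[[int, int], int]
    extract_value: Callable[[List[int]], List[int]]


@dataclass
class QueryProcs:
    """Query-side procs, loose in the family (hypothetical container)."""
    extract_query: Callable
    consistent: Callable
    compare_partial: Optional[Callable] = None  # unused in this sketch


@dataclass
class OpFamily:
    name: str
    loose: Dict[TypePair, QueryProcs] = field(default_factory=dict)

    def procs_for(self, left: str, right: str) -> QueryProcs:
        # Choose support procs by the operator's declared input types.
        return self.loose[(left, right)]


def build_index(opclass, rows):
    # Inverted index: key -> set of row ids; uses only the bound procs.
    index = {}
    for rowid, arr in rows.items():
        for key in opclass.extract_value(arr):
            index.setdefault(key, set()).add(rowid)
    return index


def search(family, index, rows, left, right, query):
    procs = family.procs_for(left, right)
    keys = procs.extract_query(query)
    hits = []
    for rowid in sorted(rows):
        present = [rowid in index.get(k, set()) for k in keys]
        if procs.consistent(present):
            hits.append(rowid)
    return hits


core = OpClass("array_ops",
               compare=lambda a, b: (a > b) - (a < b),
               extract_value=lambda arr: sorted(set(arr)))
family = OpFamily("array_ops")
# Core "contains all of these" operator: every query key must be present.
family.loose[("anyarray", "anyarray")] = QueryProcs(list, all)
# Contrib-style operator added later, with different declared input types.
family.loose[("integer[]", "query_int")] = QueryProcs(
    lambda q: [int(t) for t in q.split("|")], any)

rows = {1: [1, 2, 3], 2: [2, 4], 3: [5]}
index = build_index(core, rows)  # built once, with the core opclass only
contains = search(family, index, rows, "anyarray", "anyarray", [2, 4])
either = search(family, index, rows, "integer[]", "query_int", "3|5")
```

Both searches run against the same index, which is the point of the proposal: the index is built once from compare() and extractValue(), and additional query operators only contribute extra entries to the family.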