Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On Tue, 2011-03-01 at 22:56 +, Simon Riggs wrote: > On Tue, 2011-03-01 at 12:42 -0800, Josh Berkus wrote: > > > That response is just dodging the hard question, so whatever. Tom's > > > cleanup is not going to break things, or at least it's going to fix > > > more than it breaks on net. Sync rep, on the other hand, is going to > > > do the opposite, probably by a large margin. > > > > Well, we need to at least give Simon and Fujii a chance to respond > > during their respective daytime hours. > > Friday is a reasonable deadline. I'm not going to be committing Sync Rep today. It's technically in very good shape, but some of the conceptual issues aren't clear enough on a Friday night to make it sensible for me to commit to PostgreSQL, given we must support it for 5 years if I do. I believe those things are resolvable for 9.1, probably fairly soon, but I'm not going to rush a commit just to hit a deadline. That isn't the Way of Quality that we have followed for so long. I'll let y'all discuss what to do next. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Thanks, Tom ! Oleg On Wed, 2 Mar 2011, Tom Lane wrote: Teodor Sigaev writes: [ builtin_knngist_contrib_btree_gist-0.12 patch ] Applied with some corrections --- mostly, that the upgrade script was all wet. I added some documentation too. regards, tom lane Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Teodor Sigaev writes: > [ builtin_knngist_contrib_btree_gist-0.12 patch ] Applied with some corrections --- mostly, that the upgrade script was all wet. I added some documentation too. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On Tue, 2011-03-01 at 12:42 -0800, Josh Berkus wrote: > > That response is just dodging the hard question, so whatever. Tom's > > cleanup is not going to break things, or at least it's going to fix > > more than it breaks on net. Sync rep, on the other hand, is going to > > do the opposite, probably by a large margin. > > Well, we need to at least give Simon and Fujii a chance to respond > during their respective daytime hours. Friday is a reasonable deadline. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On Tue, Mar 1, 2011 at 4:58 PM, Robert Haas wrote: > > So where is the new patch, and the test reports from all the people > who found problems in the last version? > Simon do this in his public repo here: git://github.com/simon2ndQuadrant/postgres.git It has been just one day from this so i expect he will post a patch soon, meanwhile i'm following his repo and testing and seems i'm not the only one because Yeb Havinga has added two columns in pg_stat_replication to know which standby is the synch standby and which ones are potential synch standbys -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte y capacitación de PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On Tue, Mar 1, 2011 at 4:55 PM, Jaime Casanova wrote: > On Tue, Mar 1, 2011 at 3:40 PM, Robert Haas wrote: >> Sync rep, on the other hand, is going to >> do the opposite, probably by a large margin. >> > > Are you sure of that? the code is very centralized and the bugs we > found last week weren't that difficult to address, the prove for that > is that Simon fixed them in one day... So where is the new patch, and the test reports from all the people who found problems in the last version? > And for the open items, most of them are definitions about the behaviour... Like what? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On Tue, Mar 1, 2011 at 3:40 PM, Robert Haas wrote: > Sync rep, on the other hand, is going to > do the opposite, probably by a large margin. > Are you sure of that? the code is very centralized and the bugs we found last week weren't that difficult to address, the prove for that is that Simon fixed them in one day... And for the open items, most of them are definitions about the behaviour... -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte y capacitación de PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
> That response is just dodging the hard question, so whatever. Tom's > cleanup is not going to break things, or at least it's going to fix > more than it breaks on net. Sync rep, on the other hand, is going to > do the opposite, probably by a large margin. Well, we need to at least give Simon and Fujii a chance to respond during their respective daytime hours. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
Robert Haas writes: > Bruce has been going through the open items for the past several weeks > (at least) and tells me that he hasn't found very much. I'm not sure > what your thought is on what's required to get us from here to beta, > but I am thinking it could be done in a few weeks. With a concerted > effort and some sustained focus, I don't see why we could get this > release out the door in, say, three months. Taking in a feature > that's going to take another month to sort out is going to push that > out, and I am really not excited about another round of > spend-all-summer-waiting-for-people-to-get-back-from-vacation-and-release-in-September. Yeah, it would be really nice to get the release out in June rather than September. If we wait any longer for Sync Rep I'm pretty sure it's going to be the latter not the former. See my nearby message for a start at a list of what we "must" do to get to alpha4. Any features we want to cram in at this stage go on top of that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On Tue, Mar 1, 2011 at 3:36 PM, Josh Berkus wrote: > >> That's still missing the point, which is that the code is unlikely to >> be up to our usual standards at that point. > > I was talking about "when do we close the CF and start building an > alpha", not synch rep particularly. Tom said he wants until Friday > anyway just for some cleanup. That response is just dodging the hard question, so whatever. Tom's cleanup is not going to break things, or at least it's going to fix more than it breaks on net. Sync rep, on the other hand, is going to do the opposite, probably by a large margin. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
> That's still missing the point, which is that the code is unlikely to > be up to our usual standards at that point. I was talking about "when do we close the CF and start building an alpha", not synch rep particularly. Tom said he wants until Friday anyway just for some cleanup. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On Tue, Mar 1, 2011 at 3:01 PM, Josh Berkus wrote: > On 3/1/11 11:58 AM, Tom Lane wrote: >> If we do hold up the release, I'll probably go back and reopen the >> postgresql_fdw patch as well as btree_gist. So I won't run out of >> things to do. But I'm not terribly satisfied with the decision-making >> process here. > > OK, well, we gave it an extra 15 days, which was all we promised. > > I'm ok with closing things as of the end of the 15 days, say Thursday or > Friday. That's still missing the point, which is that the code is unlikely to be up to our usual standards at that point. So far I don't hear anyone arguing strongly that we should accept sync rep in 9.1, and several people arguing the reverse. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On Tue, 2011-03-01 at 14:26 -0500, Robert Haas wrote: > On Tue, Mar 1, 2011 at 2:12 PM, Josh Berkus wrote: > > > >>> Since we appear to be still holding the commitfest open for Sync Rep, > >>> I guess this ought to get reviewed. > >> > >> Or else we should close the CommitFest and cut alpha4. Anyone have an > >> opinion on which way to go? > > > > I think we can give Sync Rep until the 15th, given the pace of work on > > it. It is a major feature, and a complicated one. > > Sure, but there are other features, major and minor, that we have > postponed to 9.2. In the normal course of events, sync rep would have > been marked Returned with Feedback a month ago. I like the feature, > but I have to say I'm not very pleased that we seem to have fallen > into a pattern of believing that some major features are somehow > exempted from the scheduling deadline and others are not. Neither am I, I mean, we were actively fixing and bringing Fk-Locks up to date and we got pushed but sync rep gets in? We may have had fk-locks baked by now. I want syncrep as much as the next person, but this isn't really fair to any of the other submitters. JD -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On Tue, Mar 1, 2011 at 2:58 PM, Tom Lane wrote: > Robert Haas writes: >> On Tue, Mar 1, 2011 at 2:45 PM, Tom Lane wrote: >>> I'd say that if there's a plausible chance that Sync Rep will be >>> committable by the end of *this* week (and I mean Friday not Sunday), >>> I'm willing to wait that long for it. Otherwise, it's 9.2 material. > >> I am quite sure that Simon will be able to get something committed >> ahead of whatever deadline we choose to set. Whether that commit will >> be up to our usual standards is another question altogether. The last >> version posted to the list was trivial to break, and that was several >> weeks ago. > > Yeah, there's that. It's difficult to believe that anything committed > in the very short term wouldn't be rushed to completion rather than > really ready. That holds whether the deadline is this week or two > weeks out. Yep. > The other issue is that, as Robert says, we have already cut Sync Rep > more slack than any other patch in the commitfest. It does not seem > fair to hold up the release process another week or two for it, even > assuming that we get a high-quality feature at the end of that. Yep. > If we do hold up the release, I'll probably go back and reopen the > postgresql_fdw patch as well as btree_gist. So I won't run out of > things to do. But I'm not terribly satisfied with the decision-making > process here. Well, we haven't actually made a decision here yet. We're just talking about what decision we ought to make. Frankly, I avoided trying to mark Sync Rep Returned with Feedback mostly for the reason that I knew Simon would object, and his commit bit gives him a certain degree of latitude to ignore the CF process anyway. But I am really not that keen on having Sync Rep go in and then spending another month fixing all the bugs, and that's what I think will happen. Bruce has been going through the open items for the past several weeks (at least) and tells me that he hasn't found very much. I'm not sure what your thought is on what's required to get us from here to beta, but I am thinking it could be done in a few weeks. With a concerted effort and some sustained focus, I don't see why we could get this release out the door in, say, three months. Taking in a feature that's going to take another month to sort out is going to push that out, and I am really not excited about another round of spend-all-summer-waiting-for-people-to-get-back-from-vacation-and-release-in-September. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On Tue, Mar 1, 2011 at 3:08 PM, Tom Lane wrote: > Robert Haas writes: >> On Tue, Mar 1, 2011 at 2:52 PM, Josh Berkus wrote: >>> Ideally, we want to have some binaries/packages for the "final alpha". >>> Those broaden testing considerably. > >> When we have a version that needs that treatment, we can simply call >> it beta1. If it's too half-baked for that, then I don't see the point >> in going to a lot of trouble to build packages. > > We (or more precisely EDB) made Windows installers for alpha1: > http://www.enterprisedb.com/products-services-training/pgdevdownload > And IIRC they did installers for alphas in the 9.0 cycle too. And > certainly Devrim and others have been building binary packages for > alphas. If this alpha is so much less baked than the previous ones > that that's not worthwhile, there's something very wrong with the > process. The last alpha ought to be in testable condition. Oh, really? OK. I wasn't aware that alphas got installers. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
Robert Haas writes: > On Tue, Mar 1, 2011 at 2:52 PM, Josh Berkus wrote: >> Ideally, we want to have some binaries/packages for the "final alpha". >> Those broaden testing considerably. > When we have a version that needs that treatment, we can simply call > it beta1. If it's too half-baked for that, then I don't see the point > in going to a lot of trouble to build packages. We (or more precisely EDB) made Windows installers for alpha1: http://www.enterprisedb.com/products-services-training/pgdevdownload And IIRC they did installers for alphas in the 9.0 cycle too. And certainly Devrim and others have been building binary packages for alphas. If this alpha is so much less baked than the previous ones that that's not worthwhile, there's something very wrong with the process. The last alpha ought to be in testable condition. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On 3/1/11 11:58 AM, Tom Lane wrote: > If we do hold up the release, I'll probably go back and reopen the > postgresql_fdw patch as well as btree_gist. So I won't run out of > things to do. But I'm not terribly satisfied with the decision-making > process here. OK, well, we gave it an extra 15 days, which was all we promised. I'm ok with closing things as of the end of the 15 days, say Thursday or Friday. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
Robert Haas writes: > On Tue, Mar 1, 2011 at 2:45 PM, Tom Lane wrote: >> I'd say that if there's a plausible chance that Sync Rep will be >> committable by the end of *this* week (and I mean Friday not Sunday), >> I'm willing to wait that long for it. Otherwise, it's 9.2 material. > I am quite sure that Simon will be able to get something committed > ahead of whatever deadline we choose to set. Whether that commit will > be up to our usual standards is another question altogether. The last > version posted to the list was trivial to break, and that was several > weeks ago. Yeah, there's that. It's difficult to believe that anything committed in the very short term wouldn't be rushed to completion rather than really ready. That holds whether the deadline is this week or two weeks out. The other issue is that, as Robert says, we have already cut Sync Rep more slack than any other patch in the commitfest. It does not seem fair to hold up the release process another week or two for it, even assuming that we get a high-quality feature at the end of that. If we do hold up the release, I'll probably go back and reopen the postgresql_fdw patch as well as btree_gist. So I won't run out of things to do. But I'm not terribly satisfied with the decision-making process here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On Tue, Mar 1, 2011 at 2:52 PM, Josh Berkus wrote: > Ideally, we want to have some binaries/packages for the "final alpha". > Those broaden testing considerably. When we have a version that needs that treatment, we can simply call it beta1. If it's too half-baked for that, then I don't see the point in going to a lot of trouble to build packages. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
> As I understand it, it requires only the steps described here: > > http://wiki.postgresql.org/wiki/Alpha_release_process > > That all looks pretty straightforward, assuming you can log into > developer.postgresql.org, which I can't. I think I remember having an > ftp account at some point, but it's not accepting connections on port > 22, so there is doubtless some secret sauce I am missing here. Ideally, we want to have some binaries/packages for the "final alpha". Those broaden testing considerably. We don't need them for all platforms, of course. Really, the critical ones for testing are Windows and OSX. Linux/BSD/Solaris users are pretty good at make/make install. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On Tue, Mar 1, 2011 at 2:38 PM, Magnus Hagander wrote: >> Frankly, I think we should be aiming to get a beta out in April, not >> another alpha. > > That would be good, but in order to get there, +1 for cutting a new > alpha even if syncrep isn't ready for it yet. That way we can at least > get some more testing on all the non-syncrep code. > > That is assuming that cutting an alpha release isn't all that much > work, but IIRC it's not. (Hey, I know it's not much work for me, but > someone who actually does the work should comment on the total > amount...) As I understand it, it requires only the steps described here: http://wiki.postgresql.org/wiki/Alpha_release_process That all looks pretty straightforward, assuming you can log into developer.postgresql.org, which I can't. I think I remember having an ftp account at some point, but it's not accepting connections on port 22, so there is doubtless some secret sauce I am missing here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On Tue, Mar 1, 2011 at 2:45 PM, Tom Lane wrote: > Robert Haas writes: >> On Tue, Mar 1, 2011 at 2:12 PM, Josh Berkus wrote: >>> I think we can give Sync Rep until the 15th, given the pace of work on >>> it. It is a major feature, and a complicated one. > >> Sure, but there are other features, major and minor, that we have >> postponed to 9.2. In the normal course of events, sync rep would have >> been marked Returned with Feedback a month ago. I like the feature, >> but I have to say I'm not very pleased that we seem to have fallen >> into a pattern of believing that some major features are somehow >> exempted from the scheduling deadline and others are not. > > Yes. What are the rest of us supposed to do for the next two weeks, > twiddle our thumbs? > > Personally I've got a couple of days' worth of cleanup tasks before I'd > want to see us cut an alpha anyway, especially if we're going to try > to accept the btree_gist KNNgist patch. Two weeks is too much though. > > I'd say that if there's a plausible chance that Sync Rep will be > committable by the end of *this* week (and I mean Friday not Sunday), > I'm willing to wait that long for it. Otherwise, it's 9.2 material. I am quite sure that Simon will be able to get something committed ahead of whatever deadline we choose to set. Whether that commit will be up to our usual standards is another question altogether. The last version posted to the list was trivial to break, and that was several weeks ago. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
Robert Haas writes: > On Tue, Mar 1, 2011 at 2:12 PM, Josh Berkus wrote: >> I think we can give Sync Rep until the 15th, given the pace of work on >> it. It is a major feature, and a complicated one. > Sure, but there are other features, major and minor, that we have > postponed to 9.2. In the normal course of events, sync rep would have > been marked Returned with Feedback a month ago. I like the feature, > but I have to say I'm not very pleased that we seem to have fallen > into a pattern of believing that some major features are somehow > exempted from the scheduling deadline and others are not. Yes. What are the rest of us supposed to do for the next two weeks, twiddle our thumbs? Personally I've got a couple of days' worth of cleanup tasks before I'd want to see us cut an alpha anyway, especially if we're going to try to accept the btree_gist KNNgist patch. Two weeks is too much though. I'd say that if there's a plausible chance that Sync Rep will be committable by the end of *this* week (and I mean Friday not Sunday), I'm willing to wait that long for it. Otherwise, it's 9.2 material. > Frankly, I think we should be aiming to get a beta out in April, not > another alpha. Quite. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On Tue, Mar 1, 2011 at 20:26, Robert Haas wrote: > On Tue, Mar 1, 2011 at 2:12 PM, Josh Berkus wrote: >> Since we appear to be still holding the commitfest open for Sync Rep, I guess this ought to get reviewed. >>> >>> Or else we should close the CommitFest and cut alpha4. Anyone have an >>> opinion on which way to go? >> >> I think we can give Sync Rep until the 15th, given the pace of work on >> it. It is a major feature, and a complicated one. > > Sure, but there are other features, major and minor, that we have > postponed to 9.2. In the normal course of events, sync rep would have > been marked Returned with Feedback a month ago. I like the feature, > but I have to say I'm not very pleased that we seem to have fallen > into a pattern of believing that some major features are somehow > exempted from the scheduling deadline and others are not. I am sure > there are plenty of other people who would have liked a six week > extension of the usual CommitFest deadlines too, but they didn't get > it (and for the most part, were pretty gracious about that). > >> We could even cut a pre-synch-rep Alpha4 *now*, and follow that with a >> post-synch-rep Alpha5 sometime around April 1. >> >> That'll put us in a good position for beta, and also to see what >> specific issue SynR adds. > > Frankly, I think we should be aiming to get a beta out in April, not > another alpha. That would be good, but in order to get there, +1 for cutting a new alpha even if syncrep isn't ready for it yet. That way we can at least get some more testing on all the non-syncrep code. That is assuming that cutting an alpha release isn't all that much work, but IIRC it's not. (Hey, I know it's not much work for me, but someone who actually does the work should comment on the total amount...) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On Tue, Mar 1, 2011 at 2:12 PM, Josh Berkus wrote: > >>> Since we appear to be still holding the commitfest open for Sync Rep, >>> I guess this ought to get reviewed. >> >> Or else we should close the CommitFest and cut alpha4. Anyone have an >> opinion on which way to go? > > I think we can give Sync Rep until the 15th, given the pace of work on > it. It is a major feature, and a complicated one. Sure, but there are other features, major and minor, that we have postponed to 9.2. In the normal course of events, sync rep would have been marked Returned with Feedback a month ago. I like the feature, but I have to say I'm not very pleased that we seem to have fallen into a pattern of believing that some major features are somehow exempted from the scheduling deadline and others are not. I am sure there are plenty of other people who would have liked a six week extension of the usual CommitFest deadlines too, but they didn't get it (and for the most part, were pretty gracious about that). > We could even cut a pre-synch-rep Alpha4 *now*, and follow that with a > post-synch-rep Alpha5 sometime around April 1. > > That'll put us in a good position for beta, and also to see what > specific issue SynR adds. Frankly, I think we should be aiming to get a beta out in April, not another alpha. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
>> Since we appear to be still holding the commitfest open for Sync Rep, >> I guess this ought to get reviewed. > > Or else we should close the CommitFest and cut alpha4. Anyone have an > opinion on which way to go? I think we can give Sync Rep until the 15th, given the pace of work on it. It is a major feature, and a complicated one. We could even cut a pre-synch-rep Alpha4 *now*, and follow that with a post-synch-rep Alpha5 sometime around April 1. That'll put us in a good position for beta, and also to see what specific issue SynR adds. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)
On Tue, Mar 1, 2011 at 1:35 PM, Tom Lane wrote: > Robert Haas writes: >> On Mon, Feb 28, 2011 at 2:25 PM, Tom Lane wrote: >>> Given that it is a contrib module, I personally wouldn't object to it >>> getting patched later, like during alpha or beta. But somebody's got >>> to do the work, and I've got a dozen higher-priority problems right now. > >> Well, we can argue about whether it's too late for 9.1 if and when a >> patch shows up. Right now we don't have that problem. > > We do now ... > http://archives.postgresql.org/pgsql-hackers/2011-03/msg00038.php > > Since we appear to be still holding the commitfest open for Sync Rep, > I guess this ought to get reviewed. Or else we should close the CommitFest and cut alpha4. Anyone have an opinion on which way to go? I think it's fair to say that Simon is working pretty actively on Sync Rep and that the bug count is probably dropping rapidly. It seems a shame to push sync rep out to 9.2 in that context. On the other hand, the patch wasn't done at the beginning of the CommitFest, it wasn't done at the scheduled end of the CommitFest, and it's still not done now two weeks after the scheduled end of the CommitFest. If it gets committed O(now), it's probably going to still have bugs and design problems that will take at least a few more weeks to shake out, which will directly add to the length of time that it takes to actually get the release out the door. I could go either way on this one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Robert Haas writes: > On Mon, Feb 28, 2011 at 2:25 PM, Tom Lane wrote: >> Given that it is a contrib module, I personally wouldn't object to it >> getting patched later, like during alpha or beta. But somebody's got >> to do the work, and I've got a dozen higher-priority problems right now. > Well, we can argue about whether it's too late for 9.1 if and when a > patch shows up. Right now we don't have that problem. We do now ... http://archives.postgresql.org/pgsql-hackers/2011-03/msg00038.php Since we appear to be still holding the commitfest open for Sync Rep, I guess this ought to get reviewed. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
I did a quick look at this patch. The major problem with it is of course that it needs to be fixed for the recent extension-related changes. I transposed the .sql.in changes into additions to btree_gist--1.0.sql (attached), but haven't really sanity-checked them beyond checking that the regression tests pass. The same mods would need to be made in btree_gist--unpackaged--1.0.sql. Fixed 1. oid_dist() returns oid ... really? Oid is unsigned. I'd be inclined to argue though that distance between Oids is a meaningless concept, so Hmm, oid is often used as unsigned int. you should remove this not just mess with the result type. Anybody who actually wants to form a distance between Oids should have to cast them to an arithmetic type first. Let the user figure out how wraparound cases should be handled. Distance between unsigned 32-bit integers could not be more than 2^32. 2. Beyond that, none of the distance routines have given any thought to avoiding overflow. For instance, dist_int2 had better return something wider than int2, and so on up. It looks to me like the internal gist Just like other operations: # select 32000::smallint + 32000::smallint; ERROR: smallint out of range distance functions also suffer overflow risks, in that they tend to form the difference first (in the source datatype) and only afterwards cast to float8. fixed 3. I was surprised that there wasn't a distance implementation for numeric. I suppose that this might be difficult to do without risking overflow in conversion to float8, though. Exactly 4. I didn't much care for changing the result type of gbt_num_consistent from bool to float8; that's just messy, and I don't see any compensating advantage. I suggest you leave gbt_num_consistent and its callers alone, and add a separate gbt_num_distance routine that only handles the KNNDistance case. Done -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ builtin_knngist_contrib_btree_gist-0.12.gz Description: Unix tar archive -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Mon, Feb 28, 2011 at 2:25 PM, Tom Lane wrote: > Robert Haas writes: >> On Mon, Feb 28, 2011 at 1:53 PM, Josh Berkus wrote: Since no one has stepped up to fix these issues, I have marked this patch Returned with Feedback. > >>> This is just contrib/btree_GIST, yes? > >> Yes, core KNN was committed by Tom during the November CommitFest. > > Right. However, it's disappointing that this isn't in, because the > number of use cases for KNN-gist in core isn't very large. We really > need support for KNN in btree_gist to make it useful. > > Given that it is a contrib module, I personally wouldn't object to it > getting patched later, like during alpha or beta. But somebody's got > to do the work, and I've got a dozen higher-priority problems right now. Well, we can argue about whether it's too late for 9.1 if and when a patch shows up. Right now we don't have that problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Robert Haas writes: > On Mon, Feb 28, 2011 at 1:53 PM, Josh Berkus wrote: >>> Since no one has stepped up to fix these issues, I have marked this >>> patch Returned with Feedback. >> This is just contrib/btree_GIST, yes? > Yes, core KNN was committed by Tom during the November CommitFest. Right. However, it's disappointing that this isn't in, because the number of use cases for KNN-gist in core isn't very large. We really need support for KNN in btree_gist to make it useful. Given that it is a contrib module, I personally wouldn't object to it getting patched later, like during alpha or beta. But somebody's got to do the work, and I've got a dozen higher-priority problems right now. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Mon, Feb 28, 2011 at 1:53 PM, Josh Berkus wrote: > >> Since no one has stepped up to fix these issues, I have marked this >> patch Returned with Feedback. > > This is just contrib/btree_GIST, yes? Yes, core KNN was committed by Tom during the November CommitFest. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
> Since no one has stepped up to fix these issues, I have marked this > patch Returned with Feedback. This is just contrib/btree_GIST, yes? -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Fri, Feb 18, 2011 at 1:07 AM, Tom Lane wrote: > There might be more issues, I haven't read the patch in detail. > But anyway, I'm going to set it to Waiting on Author. I think it > needs at least a day or so's work, and I can't put in that kind of > time on it now. Since no one has stepped up to fix these issues, I have marked this patch Returned with Feedback. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Teodor Sigaev writes: >> I've applied all of this, and written documentation for all of it, > Thank you a lot >> except for the contrib/btree_gist additions which still need to be >> redone for the revised API (and then documented!). My patience ran out > Done, btree_gist is reworked for a new API. I did a quick look at this patch. The major problem with it is of course that it needs to be fixed for the recent extension-related changes. I transposed the .sql.in changes into additions to btree_gist--1.0.sql (attached), but haven't really sanity-checked them beyond checking that the regression tests pass. The same mods would need to be made in btree_gist--unpackaged--1.0.sql. However, I feel that this is not ready to apply even given those fixes. Problems yet to solve: 1. oid_dist() returns oid ... really? Oid is unsigned. I'd be inclined to argue though that distance between Oids is a meaningless concept, so you should remove this not just mess with the result type. Anybody who actually wants to form a distance between Oids should have to cast them to an arithmetic type first. Let the user figure out how wraparound cases should be handled. 2. Beyond that, none of the distance routines have given any thought to avoiding overflow. For instance, dist_int2 had better return something wider than int2, and so on up. It looks to me like the internal gist distance functions also suffer overflow risks, in that they tend to form the difference first (in the source datatype) and only afterwards cast to float8. 3. I was surprised that there wasn't a distance implementation for numeric. I suppose that this might be difficult to do without risking overflow in conversion to float8, though. 4. I didn't much care for changing the result type of gbt_num_consistent from bool to float8; that's just messy, and I don't see any compensating advantage. I suggest you leave gbt_num_consistent and its callers alone, and add a separate gbt_num_distance routine that only handles the KNNDistance case. There might be more issues, I haven't read the patch in detail. But anyway, I'm going to set it to Waiting on Author. I think it needs at least a day or so's work, and I can't put in that kind of time on it now. regards, tom lane diff --git a/contrib/btree_gist/btree_gist--1.0.sql b/contrib/btree_gist/btree_gist--1.0.sql index 1ea5fa2db418d06f991f362a956fec07ccfba9e9..dd428995c185a09519ed65433081752b83b71bb7 100644 *** a/contrib/btree_gist/btree_gist--1.0.sql --- b/contrib/btree_gist/btree_gist--1.0.sql *** CREATE TYPE gbtreekey_var ( *** 81,86 --- 81,231 STORAGE = EXTENDED ); + --distance operators + + CREATE FUNCTION cash_dist(money, money) + RETURNS money + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = money, + RIGHTARG = money, + PROCEDURE = cash_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION date_dist(date, date) + RETURNS int4 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = date, + RIGHTARG = date, + PROCEDURE = date_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION float4_dist(float4, float4) + RETURNS float4 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = float4, + RIGHTARG = float4, + PROCEDURE = float4_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION float8_dist(float8, float8) + RETURNS float8 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = float8, + RIGHTARG = float8, + PROCEDURE = float8_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION int2_dist(int2, int2) + RETURNS int2 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = int2, + RIGHTARG = int2, + PROCEDURE = int2_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION int4_dist(int4, int4) + RETURNS int4 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = int4, + RIGHTARG = int4, + PROCEDURE = int4_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION int8_dist(int8, int8) + RETURNS int8 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = int8, + RIGHTARG = int8, + PROCEDURE = int8_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION interval_dist(interval, interval) + RETURNS interval + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = interval, + RIGHTARG = interval, + PROCEDURE = interval_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION oid_dist(oid, oid) + RETURNS oid + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = oid, + RIGHTARG = oid, + PROCEDURE = oid_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION time_dist(time, time) + RETURNS interval + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = time, + RIGHTARG
Re: [HACKERS] knngist - 0.8
I've applied all of this, and written documentation for all of it, Thank you a lot except for the contrib/btree_gist additions which still need to be redone for the revised API (and then documented!). My patience ran out Done, btree_gist is reworked for a new API. I'm very sorry, but I'm rather busy now and will be accessible only after January, 10. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ builtin_knngist_contrib_btree_gist-0.9.gz Description: Unix tar archive -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Martijn van Oosterhout writes: > On Sun, Dec 26, 2010 at 08:13:40PM -0500, Tom Lane wrote: >> [ thinks for a bit... ] One reason for having a different structure >> would be if we needed to represent abstract semantics for some operators >> that couldn't be associated with a btree opclass. > One thing that comes to mind is the operators used for hash indexes, > namely the hash() function. The hash opclasses handle that fine. I cannot conceive of any reason for shoehorning hash functions into btree opclasses. > With respect to the collation of strings I have thought it useful to be > able to define a sortkey() function, which would map the input space to > a 8 byte integer and satisfies the rule: > sortkey(a) < sortkey(b) implies a < b I'm pretty dubious about the workability of that one, but again, there isn't any obvious reason why we'd need a new catalog structure to support it. If we did want it, it could be an optional support function in btree opclasses. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Sun, Dec 26, 2010 at 08:13:40PM -0500, Tom Lane wrote: > [ thinks for a bit... ] One reason for having a different structure > would be if we needed to represent abstract semantics for some operators > that couldn't be associated with a btree opclass. This is clearly not > an issue for what RANGE needs, since anything you can order by will > surely have a btree opclass for that, and in fact we probably need to > tie those operators to specific orderings if a datatype has more than > one sort ordering. But maybe there could be some other situation where > we'd need to describe operator behavior for a datatype independently of > any sort ordering. Can anyone come up with a plausible example? One thing that comes to mind is the operators used for hash indexes, namely the hash() function. It is closely related to the collation but isn't actually used for sorting. For every btree class you can make a hash class with the same equality operator and an appropriate hash function. I've had the idea of defining a parent object and deriving the btree and hash operator classes from that, but it gets messy once you get into cross-type operators (i.e. operator families). With respect to the collation of strings I have thought it useful to be able to define a sortkey() function, which would map the input space to a 8 byte integer and satisfies the rule: sortkey(a) < sortkey(b) implies a < b The idea being that you can use this as an efficient first step to speed up sorting strings, since adding it to the sort list implicitly before the actual column doesn't change the result. In the case of strings the sortkey() could be generated with strxfrm(). A similar idea could be used with other expensive comparison operations, but I can't think of any at the moment. Actually, perhaps you could use it with ints/floats/etc as well, since you could skip the function call overhead. You'd be trading (n log n int compares + n sortkeys) with (n log n comparisions). Just some thoughts, Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > Patriotism is when love of your own people comes first; nationalism, > when hate for people other than your own comes first. > - Charles de Gaulle signature.asc Description: Digital signature
Re: [HACKERS] knngist - 0.8
Hitoshi Harada writes: > 2010/12/27 Robert Haas : >> As far as window functions go, we clearly need some kind of type >> interface feature, but I am unclear whether we should sandwhich it >> into the btree opclass machinery or whether it might be better to >> create a whole separate concept just for this purpose. > I believe we should use btree opclass machinery to represent > add/subtract interfaces because in the RANGE case you eventually need > add constant to value and compare it with some boundary by using > btree. Or, btree operator set would be shared in the type interface > machinery. We will want to avoid duplication between them. The thing is, these operators have no arguable application in the context of actual btree index searches. We could certainly stick them into pg_amop anyway, using a third amoppurpose category to distinguish them from searchable or orderable operators. But I share Robert's discomfort with this --- it's unarguably an abuse of the original intention of operator classes. OTOH, operator classes are what we've got to represent the abstract semantics of operators, and it's not real clear to me what we'd gain by inventing an independent catalog structure to do more or less the same sort of thing. One good reason *not* to do that is it'll represent even more stuff that has to be cached during backend startup. [ thinks for a bit... ] One reason for having a different structure would be if we needed to represent abstract semantics for some operators that couldn't be associated with a btree opclass. This is clearly not an issue for what RANGE needs, since anything you can order by will surely have a btree opclass for that, and in fact we probably need to tie those operators to specific orderings if a datatype has more than one sort ordering. But maybe there could be some other situation where we'd need to describe operator behavior for a datatype independently of any sort ordering. Can anyone come up with a plausible example? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
2010/12/27 Robert Haas : > On Sun, Dec 26, 2010 at 2:41 PM, Tom Lane wrote: >> Hitoshi Harada writes: >>> Catching up tonight, I wonder I could propose to add ordering >>> operators in btree, not in gist, for basic types. >> >> I thought about that for a bit while working on the knngist patch, but >> couldn't really see any useful application. In particular, I don't see >> a way to shoehorn + and - in there as ordering operators. They don't >> match the structure. The RANGE problem wants to add operators that are >> somehow related to a btree's operators, but they're not related in the >> way that knngist uses. > > As far as window functions go, we clearly need some kind of type > interface feature, but I am unclear whether we should sandwhich it > into the btree opclass machinery or whether it might be better to > create a whole separate concept just for this purpose. Range types > might also like to have some of the same information (addition and > subtraction operators for a type, and perhaps also the identity and > unit if those exist). I believe we should use btree opclass machinery to represent add/subtract interfaces because in the RANGE case you eventually need add constant to value and compare it with some boundary by using btree. Or, btree operator set would be shared in the type interface machinery. We will want to avoid duplication between them. Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Sun, Dec 26, 2010 at 2:41 PM, Tom Lane wrote: > Hitoshi Harada writes: >> Catching up tonight, I wonder I could propose to add ordering >> operators in btree, not in gist, for basic types. > > I thought about that for a bit while working on the knngist patch, but > couldn't really see any useful application. In particular, I don't see > a way to shoehorn + and - in there as ordering operators. They don't > match the structure. The RANGE problem wants to add operators that are > somehow related to a btree's operators, but they're not related in the > way that knngist uses. It's superficially syntactically similar (value op constant) but as you say it's really a different problem. The KNNGIST case could occur in combination with this case, too: ORDER BY (point-column + zero-vector-constant) distance-from point-constant. What's really going on here is that there's a class of operations which can be applied to an ORDER-BY column without actually changing the ordering - adding or subtracting a constant, multiplying by a positive value, concatenating with the empty string, etc. In theory, we could try to recognize such cases and avoid a sort, but it seems like a lot of work in proportion to the likely benefit. A far more valuable case to optimize would be the one where you have ORDER BY a, b and can obtain input pre-sorted by a but not by a, b. It'd be extremely useful to be able to take the data sorted by just a and sort each group on b. As far as window functions go, we clearly need some kind of type interface feature, but I am unclear whether we should sandwhich it into the btree opclass machinery or whether it might be better to create a whole separate concept just for this purpose. Range types might also like to have some of the same information (addition and subtraction operators for a type, and perhaps also the identity and unit if those exist). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Hitoshi Harada writes: > Catching up tonight, I wonder I could propose to add ordering > operators in btree, not in gist, for basic types. I thought about that for a bit while working on the knngist patch, but couldn't really see any useful application. In particular, I don't see a way to shoehorn + and - in there as ordering operators. They don't match the structure. The RANGE problem wants to add operators that are somehow related to a btree's operators, but they're not related in the way that knngist uses. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
2010/12/4 Tom Lane : > What we have at this point (pending contrib/btree_gist fixes) is > nearest-neighbor searching capability for point columns. And > trigram-based nearest-neighbor for text strings, if you install > contrib/pg_trgm. That doesn't seem like a lot of return for the > amount of work that went into it. Are there plans to add KNN support > for any other standard types? Catching up tonight, I wonder I could propose to add ordering operators in btree, not in gist, for basic types. So far, we couldn't optimize simple examples like: regression=# create index ti on t using btree(i); CREATE INDEX regression=# explain select * from t order by i - 10 limit 10; QUERY PLAN Limit (cost=405.10..405.12 rows=10 width=24) -> Sort (cost=405.10..430.10 rows=1 width=24) Sort Key: ((i - 10)) -> Seq Scan on t (cost=0.00..189.00 rows=1 width=24) (4 rows) While this looks too stupid at a glance, adding 2 strategies into btree, "addition" and "subtraction", will help RANGE concept on data types; we were stacked around how to implement RANGE in both of window functions' frame and PARTITION because of lack of "addition" and "subtraction" idea. If we have 2 more strategies in btree, they can be solved IMHO. I know addition and subtraction is never relevant to btree indexing but avoiding unnecessary seq scan and supporting RANGE concept may buy somehow. Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Wed, Dec 22, 2010 at 8:04 PM, Bruce Momjian wrote: > Tom Lane wrote: >> Robert Haas writes: >> > 2010/9/13 Teodor Sigaev : >> >> [updated patch] >> >> > I realize I'm repeating myself, but... this patch needs >> > documentation. It's not optional. >> >> I've applied all of this, and written documentation for all of it, >> except for the contrib/btree_gist additions which still need to be >> redone for the revised API (and then documented!). My patience ran out >> somewhere around there, so I'm marking that part as returned with >> feedback. >> >> What we have at this point (pending contrib/btree_gist fixes) is >> nearest-neighbor searching capability for point columns. And >> trigram-based nearest-neighbor for text strings, if you install >> contrib/pg_trgm. That doesn't seem like a lot of return for the >> amount of work that went into it. Are there plans to add KNN support >> for any other standard types? > > I was thinking /contrib/fuzzystrmatch could use it to find the words > that mostly closely match a string. Seems unlikely to be feasible. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Tom Lane wrote: > Robert Haas writes: > > 2010/9/13 Teodor Sigaev : > >> [updated patch] > > > I realize I'm repeating myself, but... this patch needs > > documentation. It's not optional. > > I've applied all of this, and written documentation for all of it, > except for the contrib/btree_gist additions which still need to be > redone for the revised API (and then documented!). My patience ran out > somewhere around there, so I'm marking that part as returned with > feedback. > > What we have at this point (pending contrib/btree_gist fixes) is > nearest-neighbor searching capability for point columns. And > trigram-based nearest-neighbor for text strings, if you install > contrib/pg_trgm. That doesn't seem like a lot of return for the > amount of work that went into it. Are there plans to add KNN support > for any other standard types? I was thinking /contrib/fuzzystrmatch could use it to find the words that mostly closely match a string. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Sat, 4 Dec 2010, Greg Stark wrote: On Sat, Dec 4, 2010 at 6:07 PM, Oleg Bartunov wrote: We'll sync contrib/btree_gist with current API. Also, we're thinking about distance for ltree, boxes, circles. I'm not sure about polygons, though. So, we'll have rather complete set of knn-fied data types. I kind of assumed the natural client for KNN-gist was the tsearch full text search indexes handling sorting by relevance. For example if I search for "Postgres DBA" I should find documents where those words appear adjacent first and documents where the two words appear far apart in the document sorted further down. Is that not on the list of operators supported or planned to be supported? We'll start thinking about this once we know how to store coordinate information in index :) Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Dimitri Fontaine writes: > Greg Stark writes: >> I kind of assumed the natural client for KNN-gist was the tsearch full >> text search indexes handling sorting by relevance. For example if I >> search for "Postgres DBA" I should find documents where those words >> appear adjacent first and documents where the two words appear far >> apart in the document sorted further down. Is that not on the list of >> operators supported or planned to be supported? > From the presentation I've seen, the typical use case is more searching > "PostgreSQL DBA" at 100 km around a known location. Or more typical yet, > Pizza restaurants around a known place :) Right offhand I don't see how KNNGIST could usefully be applied to the problem Greg is thinking about. A KNNGIST search is only going to be fast if the target items can be found in a reasonably small part of the index. Nearest-neighbor in a geometrically organized index qualifies, but I don't see how Greg's problem matches the structure of a tsearch index. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Greg Stark writes: > I kind of assumed the natural client for KNN-gist was the tsearch full > text search indexes handling sorting by relevance. For example if I > search for "Postgres DBA" I should find documents where those words > appear adjacent first and documents where the two words appear far > apart in the document sorted further down. Is that not on the list of > operators supported or planned to be supported? From the presentation I've seen, the typical use case is more searching "PostgreSQL DBA" at 100 km around a known location. Or more typical yet, Pizza restaurants around a known place :) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Sat, Dec 4, 2010 at 6:07 PM, Oleg Bartunov wrote: > We'll sync contrib/btree_gist with current API. Also, we're thinking > about distance for ltree, boxes, circles. I'm not sure about polygons, > though. > So, we'll have rather complete set of knn-fied data types. I kind of assumed the natural client for KNN-gist was the tsearch full text search indexes handling sorting by relevance. For example if I search for "Postgres DBA" I should find documents where those words appear adjacent first and documents where the two words appear far apart in the document sorted further down. Is that not on the list of operators supported or planned to be supported? -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Thanks to all for patience ! We'll sync contrib/btree_gist with current API. Also, we're thinking about distance for ltree, boxes, circles. I'm not sure about polygons, though. So, we'll have rather complete set of knn-fied data types. Oleg On Sat, 4 Dec 2010, Tom Lane wrote: Robert Haas writes: 2010/9/13 Teodor Sigaev : [updated patch] I realize I'm repeating myself, but... this patch needs documentation. It's not optional. I've applied all of this, and written documentation for all of it, except for the contrib/btree_gist additions which still need to be redone for the revised API (and then documented!). My patience ran out somewhere around there, so I'm marking that part as returned with feedback. What we have at this point (pending contrib/btree_gist fixes) is nearest-neighbor searching capability for point columns. And trigram-based nearest-neighbor for text strings, if you install contrib/pg_trgm. That doesn't seem like a lot of return for the amount of work that went into it. Are there plans to add KNN support for any other standard types? regards, tom lane Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Robert Haas writes: > 2010/9/13 Teodor Sigaev : >> [updated patch] > I realize I'm repeating myself, but... this patch needs > documentation. It's not optional. I've applied all of this, and written documentation for all of it, except for the contrib/btree_gist additions which still need to be redone for the revised API (and then documented!). My patience ran out somewhere around there, so I'm marking that part as returned with feedback. What we have at this point (pending contrib/btree_gist fixes) is nearest-neighbor searching capability for point columns. And trigram-based nearest-neighbor for text strings, if you install contrib/pg_trgm. That doesn't seem like a lot of return for the amount of work that went into it. Are there plans to add KNN support for any other standard types? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Tue, Nov 23, 2010 at 11:37 AM, Tom Lane wrote: > Robert Haas writes: >> On Tue, Nov 23, 2010 at 11:12 AM, Tom Lane wrote: >>> It would be the first, because simply assigning another strategy number >>> only satisfies one of the unique constraints on pg_amop. To allow >>> arbitrary flexibility here, we would have to include all components of >>> the ordering specification in the unique constraint that's presently >>> just (amopopr, amopfamily) and is proposed to become >>> (amopopr, amopfamily, amoppurpose). I think that's an undue amount of >>> complexity to support something that's most likely physically impossible >>> from the index's standpoint anyway. > >> Or, you'd need to pass these details separately to the AM, which seems >> like a more more flexible way of doing it. > > A more flexible way of doing what? The first requirement here is that > the catalog entries provide sufficient information to determine the > semantics. You can't just say "this opclass supports ordering", you > have to specify what that ordering is. Punting to the index AM helps > not at all, unless your proposal is to hard-wire this in GIST rather > than in the core planner. Eh, let's just do it the way you want to do it. It's probably going to have to be redone the next time somebody wants to make an enhancement in this area, but I guess it's going to be easy to do that then than to figure where to go with it now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
I wrote: > We will probably *also* want to pass these details explicitly to the > index AM, but that doesn't solve the problem that some catalog somewhere > has to say what it is that the opclass can do. ... although having said that, the obvious question is why that catalog has to be pg_amop. Maybe we should leave pg_amop alone (so that it represents only search operators) and invent a new catalog pg_amorderop. I envision it having the same columns as pg_amop, plus an ordering opclass OID (and maybe we might as well stick in asc/desc and nulls_first). The reason to do this instead of just adding those columns to pg_amop is that then we can have a different set of unique indexes. I'm thinking about (amopfamily, amoplefttype, amoprighttype, amopstrategy), which would be the same as in pg_amop, plus (amopopr, amopfamily, amopstrategy). This would allow a single operator to be registered under multiple strategy numbers, which presumably would carry different payload sort-order-specification columns. This still seems like overkill to me, because I don't actually believe that it's practical for an index to support multiple sort orders. But if anyone would like to make an argument why that's not pie in the sky, this might be the way to represent it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Robert Haas writes: > On Tue, Nov 23, 2010 at 11:12 AM, Tom Lane wrote: >> It would be the first, because simply assigning another strategy number >> only satisfies one of the unique constraints on pg_amop. To allow >> arbitrary flexibility here, we would have to include all components of >> the ordering specification in the unique constraint that's presently >> just (amopopr, amopfamily) and is proposed to become >> (amopopr, amopfamily, amoppurpose). I think that's an undue amount of >> complexity to support something that's most likely physically impossible >> from the index's standpoint anyway. > Or, you'd need to pass these details separately to the AM, which seems > like a more more flexible way of doing it. A more flexible way of doing what? The first requirement here is that the catalog entries provide sufficient information to determine the semantics. You can't just say "this opclass supports ordering", you have to specify what that ordering is. Punting to the index AM helps not at all, unless your proposal is to hard-wire this in GIST rather than in the core planner. We will probably *also* want to pass these details explicitly to the index AM, but that doesn't solve the problem that some catalog somewhere has to say what it is that the opclass can do. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Tue, Nov 23, 2010 at 11:12 AM, Tom Lane wrote: > Robert Haas writes: >> On Mon, Nov 22, 2010 at 11:05 PM, Tom Lane wrote: >>> I'm satisfied to say that only one sort order can be associated with a >>> particular operator in a particular opclass, which is what would be >>> implied by using AMOP_SEARCH/AMOP_ORDER as the unique key component. > >> Does that imply that KNNGIST would only be able to support one >> ordering per AMOP_ORDER-operator, or does it imply that each such >> ordering would require a separate strategy number? The second might >> be OK, but the first sounds bad. > > It would be the first, because simply assigning another strategy number > only satisfies one of the unique constraints on pg_amop. To allow > arbitrary flexibility here, we would have to include all components of > the ordering specification in the unique constraint that's presently > just (amopopr, amopfamily) and is proposed to become > (amopopr, amopfamily, amoppurpose). I think that's an undue amount of > complexity to support something that's most likely physically impossible > from the index's standpoint anyway. Or, you'd need to pass these details separately to the AM, which seems like a more more flexible way of doing it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Robert Haas writes: > On Mon, Nov 22, 2010 at 11:05 PM, Tom Lane wrote: >> I'm satisfied to say that only one sort order can be associated with a >> particular operator in a particular opclass, which is what would be >> implied by using AMOP_SEARCH/AMOP_ORDER as the unique key component. > Does that imply that KNNGIST would only be able to support one > ordering per AMOP_ORDER-operator, or does it imply that each such > ordering would require a separate strategy number? The second might > be OK, but the first sounds bad. It would be the first, because simply assigning another strategy number only satisfies one of the unique constraints on pg_amop. To allow arbitrary flexibility here, we would have to include all components of the ordering specification in the unique constraint that's presently just (amopopr, amopfamily) and is proposed to become (amopopr, amopfamily, amoppurpose). I think that's an undue amount of complexity to support something that's most likely physically impossible from the index's standpoint anyway. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Mon, Nov 22, 2010 at 11:05 PM, Tom Lane wrote: > Robert Haas writes: >> On Mon, Nov 22, 2010 at 8:32 PM, Tom Lane wrote: >>> On balance I'm inclined to leave the unique key as per previous proposal >>> (with a "purpose" column) and add the which-sort-order-is-that >>> information as payload columns that aren't part of the key. > >> This is probably OK too, although I confess I'm a lot less happy about >> it now that you've pointed out the need for those payload columns. > > The reason I said "columns" is that I can foresee eventually wanting to > specify a pathkey in its entirety --- opfamily, asc/desc, nulls_first, > and whatever we come up with for collation. We don't currently need to > store more than the opfamily, since the others can never need to have > non-default values given the current implementation of KNNGIST. But the > idea that they might all be there eventually makes me feel that we don't > want to try to incorporate this data in pg_amop's unique key. I'm > satisfied to say that only one sort order can be associated with a > particular operator in a particular opclass, which is what would be > implied by using AMOP_SEARCH/AMOP_ORDER as the unique key component. Does that imply that KNNGIST would only be able to support one ordering per AMOP_ORDER-operator, or does it imply that each such ordering would require a separate strategy number? The second might be OK, but the first sounds bad. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Robert Haas writes: > On Mon, Nov 22, 2010 at 8:32 PM, Tom Lane wrote: >> On balance I'm inclined to leave the unique key as per previous proposal >> (with a "purpose" column) and add the which-sort-order-is-that >> information as payload columns that aren't part of the key. > This is probably OK too, although I confess I'm a lot less happy about > it now that you've pointed out the need for those payload columns. The reason I said "columns" is that I can foresee eventually wanting to specify a pathkey in its entirety --- opfamily, asc/desc, nulls_first, and whatever we come up with for collation. We don't currently need to store more than the opfamily, since the others can never need to have non-default values given the current implementation of KNNGIST. But the idea that they might all be there eventually makes me feel that we don't want to try to incorporate this data in pg_amop's unique key. I'm satisfied to say that only one sort order can be associated with a particular operator in a particular opclass, which is what would be implied by using AMOP_SEARCH/AMOP_ORDER as the unique key component. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Mon, Nov 22, 2010 at 8:32 PM, Tom Lane wrote: > The reason I bring this up now is that it affects the decision as to > what the unique key for pg_amop ought to be. Instead of having an > enum "purpose" column, maybe we should consider that the unique key > is (operator oid, opfamily oid, order-by-oid), where order-by-oid > is zero for a search operator and the OID of the btree opclass or sort > operator for an ordering operator. This would be of value if we > imagined that a single opclass could support ordering by more than one > distance ordering operator; which seems a bit far-fetched but perhaps > not impossible. On the other side of the coin it'd mean we aren't > leaving room for other sorts of operator "purposes". Since the need for additional purposes is mostly hypothetical, this wouldn't bother me any. > On balance I'm inclined to leave the unique key as per previous proposal > (with a "purpose" column) and add the which-sort-order-is-that > information as payload columns that aren't part of the key. This is probably OK too, although I confess I'm a lot less happy about it now that you've pointed out the need for those payload columns. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
I wrote: > As far as the syntax of CREATE/ALTER OPERATOR CLASS/FAMILY is concerned, > I like Robert's proposal (OPERATOR ... FOR ORDER or FOR SEARCH) better > than Teodor's, though we don't need the multiple-purposes-per-entry > aspect of it. This is mainly because it looks more easily extensible > if we ever do get around to having more purposes. Although having said that ... I was poking around a bit more and noticed how entirely bogus the current patch's matching of pathkeys is. It pays no attention at all to which pk_opfamily is specified, which means it will fail given a query like SELECT ... ORDER BY indexable_col <-> constant USING <<< where <<< represents some sort order that has nothing to do with the order that GIST is expecting. The query will be considered to match the index and then it will proceed to produce results in the wrong order. We could perhaps kluge around that by insisting that the pathkey opclass be the default btree opclass for the relevant datatype; but that's fairly expensive to check for in match_pathkey_to_index, and it's getting even further away from having a general-purpose solution. I'm inclined to think that instead of just specifying "FOR ORDER", the opclass definition ought to specify "FOR ORDER BY something", where "something" could perhaps be a pre-existing btree opclass name. Or maybe it could be a suitable sort operator, though then we'd have to do a bit of reverse-engineering in match_pathkey_to_index, so that's still no fun from a lookup-speed point of view. In any case what we'd be specifying is a sort order for the datatype that the opclass member operator yields, not the indexed datatype (so in practice it'd usually be float8_ops or float8 <, no doubt). The reason I bring this up now is that it affects the decision as to what the unique key for pg_amop ought to be. Instead of having an enum "purpose" column, maybe we should consider that the unique key is (operator oid, opfamily oid, order-by-oid), where order-by-oid is zero for a search operator and the OID of the btree opclass or sort operator for an ordering operator. This would be of value if we imagined that a single opclass could support ordering by more than one distance ordering operator; which seems a bit far-fetched but perhaps not impossible. On the other side of the coin it'd mean we aren't leaving room for other sorts of operator "purposes". On balance I'm inclined to leave the unique key as per previous proposal (with a "purpose" column) and add the which-sort-order-is-that information as payload columns that aren't part of the key. Thoughts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
I've done some *very* preliminary review of this patch now. I think that the way to move forward is to first commit the work surrounding changes to pg_amop, including suitable changes to CREATE/ALTER OPERATOR CLASS/FAMILY so that there's a way to put the new info into the table. The system won't initially *do* anything useful with ordering-operator entries, but this is necessary infrastructure before we can start to do anything interesting. After reviewing both Teodor's and Robert's patches for the issue, I believe that Teodor has a better basic approach: if an operator is useful for both searching and ordering, the way to handle that is to make two pg_amop entries for it, with different strategy numbers. This means in particular that we can continue to require strategy numbers to be unique within an opclass, so right offhand I see no need to widen pg_amop_fam_strat_index to five columns. (Which means we don't need that patch to allow five-key syscaches.) What we need instead is to add the "purpose" column to the existing unique index on (amopopr, amopfamily). In English that means that instead of allowing an operator to have only one entry per opfamily, it can now have one entry per opfamily per purpose. I do however concur with Robert that it's a bit silly to go to all this work and not leave room in the design for additional operator "purposes" later. So rather than using a boolean amoporder column, I think we want an enumerated-type column "amoppurpose". Per our usual system catalog conventions for "poor man's enums", I suggest declaring the column as "char" with the two allowed values AMOP_SEARCH 's' AMOP_ORDER 'o' Aside from leaving room for possible extension, I think that using AMOP_SEARCH/AMOP_ORDER rather than true/false in the code will be more readable and less error-prone. As far as the syntax of CREATE/ALTER OPERATOR CLASS/FAMILY is concerned, I like Robert's proposal (OPERATOR ... FOR ORDER or FOR SEARCH) better than Teodor's, though we don't need the multiple-purposes-per-entry aspect of it. This is mainly because it looks more easily extensible if we ever do get around to having more purposes. There's a lot more to be done after this, but if there are no objections to this much, I'll go merge the relevant parts of the two patches and commit. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Robert Haas writes: > That doesn't seem very hard on its face. The trick is what to do with > that information once you've got it. As far as I can tell, you need > to drill some kind of hole that lets you pass "additional details > about the desired sort order" to the index AM. We clearly need to add additional information to IndexScan plan nodes to tell the index AM which sort order is required. Up to now, an indexscan has only had one possible resultant sort order (two if you count backwards scan, but as I said I don't think generalizing that particular feature is the way to approach this). I would imagine that the best way to handle that is to add a PathKey list or something equivalent to it, and add that to the arguments passed to ambeginscan. The other issue is how the planner can figure out what the possible orderings are when it's considering an index. You seem to be contemplating adding a new index AM function that the planner would call at the right point; but I'm not sure that that adds much of anything, because the index AM can't have hard-wired behavior either. We really have to have enough information in the system catalog entries about an opclass to allow the possible orderings to be determined. Given that, I think it makes more sense for the core planner to know what to do than to put possibly duplicative code into multiple AMs. I guess a third alternative would be to create per-opclass hook functions for the planner to call, but I'm not thrilled with that idea; it would still be largely duplicative code, and in a lot more places. I think it would also bind our hands with respect to making internal planner changes in future, because the data structures representing pathkeys would be pretty well locked down by such a choice. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Sun, Nov 21, 2010 at 5:24 PM, Tom Lane wrote: > I haven't spent any time on this patch yet (hope to start looking at it > next week). As for your specific question above, I don't have a big > problem with reusing ScanKey this way, but I do agree that using > RestrictInfo for this would be a crock. ISTM what we ought to have is > just the ability to match PathKeys with expressions of the form > "indexedcol op constant" to an index. That doesn't seem very hard on its face. The trick is what to do with that information once you've got it. As far as I can tell, you need to drill some kind of hole that lets you pass "additional details about the desired sort order" to the index AM. What I'd sort of like to be able to do is throw the PathKeys at the index AM and say "you want these?". Short of that, we're probably going to have to resign ourselves to the core code basically knowing exactly what the capabilities of KNNGIST are, making the index API pretty porous - not that it already isn't. There's really nothing special about the subset of the problem space KNNGIST happens to attack except that it makes the GIS guys drool; the next problem someone wants to attack in this area is as likely as not to look completely different. > This example leaves me totally cold, not least because it assumes a > specific storage implementation for nulls in an index. It is also, > I think, misunderstanding what ScanDirection is for. That's only > intended to allow executor plans to be run forward and then backed up > in the same fashion that fetching backwards from a cursor would do; > which is not a btree-specific concept, indeed not even index-specific. Ah, OK. > If there is sufficient interest in doing what you suggest, what we'd > want to do is pass the PathKey representation to the index and let the > index AM figure out what it has to do to produce that sort order. But > that is way way down my priority list. Yeah, this is basically what I'm wondering whether we can reasonably do for KNNGIST; with hopes of later reuse. But it may be unworkable. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Robert Haas writes: > 2010/11/12 Teodor Sigaev : >> My variants informs GiST by SK_ORDER flags and consistentFn looks at >> strategy number (strategy numbers are different for different purposes). > Yeah. At ten thousand feet, I think the open design question here is > to what extent it's OK to rely on the fact that the ORDER BY clauses > we wish to optimize happen to look a lot like the WHERE clauses we > already know how to optimize: namely, they're both binary opclauses of > the form . Your patch manages to > reuse a LOT of existing machinery by shoving ordering expressions > through the same code paths that quals take. Code reuse is generally > a good thing, but here's we're forming RestrictInfo and ScanKey > objects out of things that are neither restrictions nor keys, which > might lead to maintainability problems down the road. I'd like to get > some input from Tom on how he feels about that, and any alternatives > he sees. I haven't spent any time on this patch yet (hope to start looking at it next week). As for your specific question above, I don't have a big problem with reusing ScanKey this way, but I do agree that using RestrictInfo for this would be a crock. ISTM what we ought to have is just the ability to match PathKeys with expressions of the form "indexedcol op constant" to an index. I'm undecided about the big-picture question of how much extra generality ought to be put into the system along with this patch. The argument not to is that with no candidate uses of additional generality on the horizon, it's a waste of time to design something more general, because we'll probably get it wrong anyway. I'm not a fan of designing APIs without use-cases in mind. On the other hand, there's an argument *for* doing something more general, which is basically Polya's paradox: the more general problem may be easier to solve. To support that argument, we'd need a design that is clearly cleaner than bolting KNNGIST on according to the current patch. AIUI we don't have that at the moment, but I still think it's worth spending a bit more time looking for one. > It seems to me that our concept of ScanDirection is really woefully > under-expressive. For example, given: > CREATE TABLE foo ( > id integer NOT NULL, > name character varying NOT NULL, > PRIMARY KEY (id) > ); > We use the index for the first of these but not the second: > select * from foo order by id nulls last; > select * from foo order by id nulls first; > In an ideal world, we'd like to handle the second one by finding the > first non-NULL entry in the index, scanning away from the NULLs, and > then, when we run off the end, looping back around to spit out the > NULL entries. This example leaves me totally cold, not least because it assumes a specific storage implementation for nulls in an index. It is also, I think, misunderstanding what ScanDirection is for. That's only intended to allow executor plans to be run forward and then backed up in the same fashion that fetching backwards from a cursor would do; which is not a btree-specific concept, indeed not even index-specific. If there is sufficient interest in doing what you suggest, what we'd want to do is pass the PathKey representation to the index and let the index AM figure out what it has to do to produce that sort order. But that is way way down my priority list. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
2010/11/12 Teodor Sigaev : >> Slightly-more-fleshed out proof of concept patch attached, with actual >> syntax, documentation, and pg_dump support added. This might be >> thought of as a subset of the builtin_knngist_core patch, without the >> parts that make it actually do something useful (which is mostly >> match_pathkey_to_index - which I'm still rather hoping to abstract in >> some way via the access method interface, though I'm currently unsure >> what the best way to do that is). > > I don't see in your patch how to propagate knowledge of kind of operation > (AMOP_SEARCH or AMOP_ORDER) to GiST and consistent method. For both of them > they aren't distinguishable. That's not acceptably for both, because GiST > needs to choose right traversal algorithm, consistentFn needs role to decide > return infinity or false (-1) value. My version of the patch is suffering from a bad case of not being done. It sort of started out as a proof-of-concept to show what the syntax might look like and seems to be gradually growing into something more real. I'm not sure what we'll end up with. > My variants informs GiST by SK_ORDER flags and consistentFn looks at > strategy number (strategy numbers are different for different purposes). Yeah. At ten thousand feet, I think the open design question here is to what extent it's OK to rely on the fact that the ORDER BY clauses we wish to optimize happen to look a lot like the WHERE clauses we already know how to optimize: namely, they're both binary opclauses of the form . Your patch manages to reuse a LOT of existing machinery by shoving ordering expressions through the same code paths that quals take. Code reuse is generally a good thing, but here's we're forming RestrictInfo and ScanKey objects out of things that are neither restrictions nor keys, which might lead to maintainability problems down the road. I'd like to get some input from Tom on how he feels about that, and any alternatives he sees. It seems to me that our concept of ScanDirection is really woefully under-expressive. For example, given: CREATE TABLE foo ( id integer NOT NULL, name character varying NOT NULL, PRIMARY KEY (id) ); We use the index for the first of these but not the second: select * from foo order by id nulls last; select * from foo order by id nulls first; In an ideal world, we'd like to handle the second one by finding the first non-NULL entry in the index, scanning away from the NULLs, and then, when we run off the end, looping back around to spit out the NULL entries. Or, similarly, if we have an index on (id, name), we can handle the first two of the following with the index, but not the second two: select * from foo order by id, name; select * from foo order by id desc, name desc; select * from foo order by id, name desc; select * from foo order by id, name nulls last; (can't handle this last one even if name is marked NOT NULL!) It seems like it might even be possible to handle these (albeit maybe not real efficiently) via a sufficiently complicated scanning order, but even if we had the code to do the scan, we have no way of telling the index what scan direction we want: forward, backward, and no movement don't really cut it. The idea that forward and backward mean anything at all is really pretty btree-centric. The trick, of course, is to come up with something better. I have a vague notion of using something like an array or list of objects of the form . That could easily be extended with collation or whatever as needed. The trick is - you need to create this object and then ask the index - hey, is this something you can support? And the index needs to then respond by saying whether it can do that (and maybe doing some sort of translation into what instructions should be provided at execution time). Trouble is, that sounds like a lot of work on code I'm not altogether comfortable with, and I'd really like to get KNNGIST in shape for 9.1 if possible. I'm not real sure where to draw the line between, on the one hand, not wanting to commit something that is excessively crufty and, on the other hand, wanting to finish in finite time. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Slightly-more-fleshed out proof of concept patch attached, with actual syntax, documentation, and pg_dump support added. This might be thought of as a subset of the builtin_knngist_core patch, without the parts that make it actually do something useful (which is mostly match_pathkey_to_index - which I'm still rather hoping to abstract in some way via the access method interface, though I'm currently unsure what the best way to do that is). I don't see in your patch how to propagate knowledge of kind of operation (AMOP_SEARCH or AMOP_ORDER) to GiST and consistent method. For both of them they aren't distinguishable. That's not acceptably for both, because GiST needs to choose right traversal algorithm, consistentFn needs role to decide return infinity or false (-1) value. My variants informs GiST by SK_ORDER flags and consistentFn looks at strategy number (strategy numbers are different for different purposes). I notice that builtin_knngist_core checks whether the return type of an ordering operator has a built-in btree opclass. I'm not sure Actually, GiST doesn't use that knowledge, check is done only to be sure that operation returns orderable data type. whether we should bother checking that, because even if it's true I don't think there's anything preventing it from becoming false later. I think it's probably sufficient to just check this condition at plan time and silently skip trying to build knn-type index paths if it's not met. It's already do it: you can not ORDER BY over non-orderable data type. That check just make diagnostic earlier. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Robert, it is great you are taking this on. This is really a well-known area of the code for you, but not so much for Teodor and Oleg, so I am sure they appreciate your assistance. --- Robert Haas wrote: > On Sat, Oct 16, 2010 at 9:54 PM, Robert Haas wrote: > >> Thinking about it that way, perhaps we could add an integer column > >> amop_whats_it_good_for that gets used as a bit field. ?That wouldn't > >> require changing the index structure, although it might break some > >> other things. > > > > I gave this a shot (though I called it amoppurpose rather than > > amop_whats_it_good_for) and I think it's a reasonable way to proceed. > > Proof-of-concept patch attached. ?This just adds the column (using the > > existing padding space), defines AMOP_SEARCH and AMOP_ORDER, and makes > > just about everything ignore anything not marked AMOP_SEARCH, > > attached. ?This would obviously need some more hacking to pay > > attention to AMOP_ORDER where relevant, etc. and to create some actual > > syntax around it. ?Currently CREATE OPERATOR CLASS / ALTER OPERATOR > > FAMILY have this bit: > > > > OPERATOR strategy_number ( op_type [ , op_type ] ) > > > > knngist-0.9 implements this: > > > > [ORDER] OPERATOR strategy_number ( op_type [, op_type ] ) > > > > ...but with the design proposed above that's not quite what we'd want, > > because amoppurpose is a bit field, so you could have one or both of > > the two possible purposes. ?Perhaps: > > > > OPERATOR strategy_number ( op_type [ , op_type ] ) [ FOR { SEARCH | > > ORDER } [, ...] ] > > > > With the default being FOR SEARCH. > > Slightly-more-fleshed out proof of concept patch attached, with actual > syntax, documentation, and pg_dump support added. This might be > thought of as a subset of the builtin_knngist_core patch, without the > parts that make it actually do something useful (which is mostly > match_pathkey_to_index - which I'm still rather hoping to abstract in > some way via the access method interface, though I'm currently unsure > what the best way to do that is). > > I notice that builtin_knngist_core checks whether the return type of > an ordering operator has a built-in btree opclass. I'm not sure > whether we should bother checking that, because even if it's true I > don't think there's anything preventing it from becoming false later. > I think it's probably sufficient to just check this condition at plan > time and silently skip trying to build knn-type index paths if it's > not met. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company [ Attachment, skipping... ] > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
2010/10/22 Teodor Sigaev : >> You can define the additional argument as providing all of the extra >> info about how the operator is being used, and, if it's being used for >> ordering, the details of the requested order. What is your thinking >> on the matter? > > Maby be useful, but it seems to me to be a bit overengineering for now. GiST > will not support knn-search with different than ASC NULLS LAST order in > foresee future, because it stores NULLs in unmaintainable manner for > different ordering: NULLs could be everywhere in tree. So, implementation of > ASC NULLS FIRST requires to read whole tree to find all NULLs value first. I > can imagine, how to reorganize tree to store NULLs together for > single-column index, but not for multi-column index. Orders DESC NULLS > LAST/FIRST requires to introduce two more "infinite" values: one to notion > of distance more than +INF and another, non-negative, but less than zero > distance. I don't think it's overengineering, just because making core changes to the planner is such a pain in the neck that we don't want to have to come back and do it again any too soon, or at least we'd like them to be as minor as possible. It's better to have one API break and be done with it than maybe have to come back and do a second round of code changes. And from what I can see it's not going to be that ugly. I'm trying to drum up some time to hack on it... > Again, it could be useful for modification of GiST known as ordered GiST. > But ordered GiST uses completely different tree traversal algorithm for > search and insert, and it hasn't any benefits comparing with B-Tree and > GiST. It's looks like an autogiro which successfully combines disadvantages > of airplanes and helicopters :) Heh. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
You can define the additional argument as providing all of the extra info about how the operator is being used, and, if it's being used for ordering, the details of the requested order. What is your thinking on the matter? Maby be useful, but it seems to me to be a bit overengineering for now. GiST will not support knn-search with different than ASC NULLS LAST order in foresee future, because it stores NULLs in unmaintainable manner for different ordering: NULLs could be everywhere in tree. So, implementation of ASC NULLS FIRST requires to read whole tree to find all NULLs value first. I can imagine, how to reorganize tree to store NULLs together for single-column index, but not for multi-column index. Orders DESC NULLS LAST/FIRST requires to introduce two more "infinite" values: one to notion of distance more than +INF and another, non-negative, but less than zero distance. Again, it could be useful for modification of GiST known as ordered GiST. But ordered GiST uses completely different tree traversal algorithm for search and insert, and it hasn't any benefits comparing with B-Tree and GiST. It's looks like an autogiro which successfully combines disadvantages of airplanes and helicopters :) -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
2010/10/19 Teodor Sigaev : >>> Thinking about it that way, perhaps we could add an integer column >>> amop_whats_it_good_for that gets used as a bit field. That wouldn't >>> require changing the index structure, although it might break some >>> other things. > >> OPERATOR strategy_number ( op_type [ , op_type ] ) [ FOR { SEARCH | >> ORDER } [, ...] ] > > It's very desirable thing to be able to distinguish roles in consistent > method of GiST: computation of distance could be very expensive and. The > single way to provide it in current GiST interface is a strategy number. Of > course, we could add 6-th argument to consistent to point role, but I don't > think that's good decision. To me, adding an additional argument (or maybe providing a whole separate support function, separate from consistent) seems quite natural, because now you have a way to pass all the other little bits that might matter... ASC/DESC, NULLS FIRST/LAST, collation OID, etc. You can define the additional argument as providing all of the extra info about how the operator is being used, and, if it's being used for ordering, the details of the requested order. What is your thinking on the matter? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
combination of parameters is one that it can handle. I'm thinking perhaps in lieu of a boolean, we can add another indexam method which, if not InvalidOid, gets called when we're wondering about whether a given clause is something that the index can order by. Although knngist focuses on a the ORDER BY col OP constant case, which Hmm, interesting idea. To be more general, amcanorder (without byop suffix) could be eliminated too. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Thinking about it that way, perhaps we could add an integer column amop_whats_it_good_for that gets used as a bit field. That wouldn't require changing the index structure, although it might break some other things. OPERATOR strategy_number ( op_type [ , op_type ] ) [ FOR { SEARCH | ORDER } [, ...] ] It's very desirable thing to be able to distinguish roles in consistent method of GiST: computation of distance could be very expensive and. The single way to provide it in current GiST interface is a strategy number. Of course, we could add 6-th argument to consistent to point role, but I don't think that's good decision. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
3) 3-rd boolean column (amopopr, amopfamily, amoporder) - could be two records per operator + operator could be used in both roles + strategy number could be different for different roles How can #3 work at all? It's ignoring a couple of critical index columns. In particular, I believe the sticking point here is this unique index: "pg_amop_fam_strat_index" UNIQUE, btree (amopfamily, amoplefttype, amoprighttype, amopstrategy) I believe, that columns (amoplefttype, amoprighttype) are fixed for operation and could not be changed. So, two operations in one opfamily should be differ in strategy number for different roles. It also gives a direct way to transfer knowledge abot role to consistent method of GiST. I'm not terribly thrilled with using a boolean here in any case. Now that we have two "roles" an operator might play in an opclass, who's to say there might not be more roles someday? We should use a column type that will support more than two roles without basic rejiggering. It's easy to change to char type, for now only 's'search and 'o'order characters will be allowed. BTW, have we discussed the idea of embedding the role in the strategy number? For example, require regular operators to have strategy In this schema, one operator could not be used in more than one role. Right now it's not strict limitation, because the we still don't have an example of boolean distance. Anyhow, it needs to distinguish roles while IndexPath is built. So, op_in_opfamily/get_op_opfamily_strategy/get_op_opfamily_properties and friends will accept extra argument pointing to interested role. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Mon, Oct 18, 2010 at 04:13:04PM +0100, Mark Cave-Ayland wrote: > David Fetter wrote: > > >>For my vote, I'd prefer either the Oid of a custom type or an array > >>of Oid, Datum pairs - i.e. something we can extend in the future if > >>required. > > > >This sounds a lot like a foreign key to another table. Are you not > >proposing doing that because of performance considerations? > > > >Cheers, > >David. > > Well, in PostGIS a typmod contains 3 pieces of information: > > 1) the SRID > 2) the dimension > 3) the geometry type > > The SRID is technically already a foreign key into another table, > with dimension and SRID as other information. At the moment, we > bit-pack the dimension and geometry type into the SRID and use that > as the typmod but this only leaves 21 bits IIRC for the SRID. The > additional complication is that SRIDs at the higher end of the range > are allowed for anyone to use, and so people may have their own > customised spheroids defined in this region of the table. Sounds like you need space for all of these separately, and once you have that, you'll probably start thinking about other possible pieces of information you could store, especially as you start to store new data types and have new operators on them. It sounds to me as thought the typemod, or something like it, needs to be able to store, in general, completely arbitrary information, although not likely much over a page or two of memory. Cf. Robert Haas's recent comment about Klingon. While we could make this a completely opaque structure from the SQL level, I'm not sure it's a good idea. > If we had a foreign key into another table, we'd need to ensure that > no one could tamper with it as otherwise all chaos would break lose, > e.g. breaking the geometry type constraint on a column. Our system catalog tables have the nice property that no one can accidentally modify them by hand, and "all chaos," as you put it, can reasonably be expected to occur should they choose to do so. > Heck, we even have people deleting the geometry_columns table > sometimes because they are not aware of what it does. By storing > this information in the PG catalog then this can't happen, plus the > information is available easily in Form_pg_attribute without having > to implement our own cache, with its own related problems such as > how/when to invalidate etc. :) > There is also a chance that we'd want to include additional > information in the future related to geometry validity, for example, > which would mean further reducing the range allowed within the > spatial_ref_sys table in its existing form. Right. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Robert Haas writes: > On Mon, Oct 18, 2010 at 3:40 PM, Peter Eisentraut wrote: >> Frankly, I have no idea what knngist has to do with any of this, as I >> haven't followed that topic at all. > Me neither, except for the fact that the PostGIS guys are interested > in both things. I think this thread has been hijacked OT. We got there by discussing how ordering information could be passed around for knngist operators. Robert said something about side channels, and I mentioned that maybe collation information should pass through a side channel as well instead of trying to make it work like typmods, and then it drifted to rethinking typmods as long as we were thinking about having to touch everyplace that deals in typmods. I agree the thread is far OT --- most of the recent messages belong in discussion of the collation patch more than they do with knngist. But there is some overlap if you think about whether these problems could be solved together. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Mon, Oct 18, 2010 at 3:40 PM, Peter Eisentraut wrote: > On mån, 2010-10-18 at 15:36 -0400, Robert Haas wrote: >> On Mon, Oct 18, 2010 at 3:33 PM, Peter Eisentraut wrote: >> > On mån, 2010-10-18 at 11:41 +0100, Mark Cave-Ayland wrote: >> >> Paul Ramsey wrote: >> >> >> >> >> So what kind of data structure would you like for a typmod? >> >> > >> >> > I'm a primitive enough beast that just having 64-bits would make me >> >> > happy. As a general matter though, a bytea? >> >> > >> >> > P >> >> >> >> For my vote, I'd prefer either the Oid of a custom type or an array of >> >> Oid, Datum pairs - i.e. something we can extend in the future if required. >> > >> > I think if we really wanted to design this generally, we'd give a type >> > function arguments. So, numeric would get (int default = 0, int default >> > = 0). That can easily get very complicated, of course. >> > >> > In any case, for the shorter term, it's clear that refactoring the >> > passing around of type + typmod would help this endeavor, so I'm going >> > to give it a try. >> >> By "this endeavor" do you mean KNNGIST, or per-column collation? > > No, I was referring to the talk about a different/better/more flexible > typmod. Per-column collation could also benefit, as you have observed. > Frankly, I have no idea what knngist has to do with any of this, as I > haven't followed that topic at all. Me neither, except for the fact that the PostGIS guys are interested in both things. I think this thread has been hijacked OT. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On mån, 2010-10-18 at 15:36 -0400, Robert Haas wrote: > On Mon, Oct 18, 2010 at 3:33 PM, Peter Eisentraut wrote: > > On mån, 2010-10-18 at 11:41 +0100, Mark Cave-Ayland wrote: > >> Paul Ramsey wrote: > >> > >> >> So what kind of data structure would you like for a typmod? > >> > > >> > I'm a primitive enough beast that just having 64-bits would make me > >> > happy. As a general matter though, a bytea? > >> > > >> > P > >> > >> For my vote, I'd prefer either the Oid of a custom type or an array of > >> Oid, Datum pairs - i.e. something we can extend in the future if required. > > > > I think if we really wanted to design this generally, we'd give a type > > function arguments. So, numeric would get (int default = 0, int default > > = 0). That can easily get very complicated, of course. > > > > In any case, for the shorter term, it's clear that refactoring the > > passing around of type + typmod would help this endeavor, so I'm going > > to give it a try. > > By "this endeavor" do you mean KNNGIST, or per-column collation? No, I was referring to the talk about a different/better/more flexible typmod. Per-column collation could also benefit, as you have observed. Frankly, I have no idea what knngist has to do with any of this, as I haven't followed that topic at all. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Mon, Oct 18, 2010 at 3:33 PM, Peter Eisentraut wrote: > On mån, 2010-10-18 at 11:41 +0100, Mark Cave-Ayland wrote: >> Paul Ramsey wrote: >> >> >> So what kind of data structure would you like for a typmod? >> > >> > I'm a primitive enough beast that just having 64-bits would make me >> > happy. As a general matter though, a bytea? >> > >> > P >> >> For my vote, I'd prefer either the Oid of a custom type or an array of >> Oid, Datum pairs - i.e. something we can extend in the future if required. > > I think if we really wanted to design this generally, we'd give a type > function arguments. So, numeric would get (int default = 0, int default > = 0). That can easily get very complicated, of course. > > In any case, for the shorter term, it's clear that refactoring the > passing around of type + typmod would help this endeavor, so I'm going > to give it a try. By "this endeavor" do you mean KNNGIST, or per-column collation? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On mån, 2010-10-18 at 11:41 +0100, Mark Cave-Ayland wrote: > Paul Ramsey wrote: > > >> So what kind of data structure would you like for a typmod? > > > > I'm a primitive enough beast that just having 64-bits would make me > > happy. As a general matter though, a bytea? > > > > P > > For my vote, I'd prefer either the Oid of a custom type or an array of > Oid, Datum pairs - i.e. something we can extend in the future if required. I think if we really wanted to design this generally, we'd give a type function arguments. So, numeric would get (int default = 0, int default = 0). That can easily get very complicated, of course. In any case, for the shorter term, it's clear that refactoring the passing around of type + typmod would help this endeavor, so I'm going to give it a try. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
David Fetter wrote: For my vote, I'd prefer either the Oid of a custom type or an array of Oid, Datum pairs - i.e. something we can extend in the future if required. This sounds a lot like a foreign key to another table. Are you not proposing doing that because of performance considerations? Cheers, David. Well, in PostGIS a typmod contains 3 pieces of information: 1) the SRID 2) the dimension 3) the geometry type The SRID is technically already a foreign key into another table, with dimension and SRID as other information. At the moment, we bit-pack the dimension and geometry type into the SRID and use that as the typmod but this only leaves 21 bits IIRC for the SRID. The additional complication is that SRIDs at the higher end of the range are allowed for anyone to use, and so people may have their own customised spheroids defined in this region of the table. If we had a foreign key into another table, we'd need to ensure that no one could tamper with it as otherwise all chaos would break lose, e.g. breaking the geometry type constraint on a column. Heck, we even have people deleting the geometry_columns table sometimes because they are not aware of what it does. By storing this information in the PG catalog then this can't happen, plus the information is available easily in Form_pg_attribute without having to implement our own cache, with its own related problems such as how/when to invalidate etc. There is also a chance that we'd want to include additional information in the future related to geometry validity, for example, which would mean further reducing the range allowed within the spatial_ref_sys table in its existing form. ATB, Mark. -- Mark Cave-Ayland - Senior Technical Architect PostgreSQL - PostGIS Sirius Corporation plc - control through freedom http://www.siriusit.co.uk t: +44 870 608 0063 Sirius Labs: http://www.siriusit.co.uk/labs -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Mon, Oct 18, 2010 at 11:41:06AM +0100, Mark Cave-Ayland wrote: > Paul Ramsey wrote: > > >>So what kind of data structure would you like for a typmod? > > > >I'm a primitive enough beast that just having 64-bits would make me > >happy. As a general matter though, a bytea? > > > >P > > For my vote, I'd prefer either the Oid of a custom type or an array > of Oid, Datum pairs - i.e. something we can extend in the future if > required. This sounds a lot like a foreign key to another table. Are you not proposing doing that because of performance considerations? Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Paul Ramsey wrote: So what kind of data structure would you like for a typmod? I'm a primitive enough beast that just having 64-bits would make me happy. As a general matter though, a bytea? P For my vote, I'd prefer either the Oid of a custom type or an array of Oid, Datum pairs - i.e. something we can extend in the future if required. ATB, Mark. -- Mark Cave-Ayland - Senior Technical Architect PostgreSQL - PostGIS Sirius Corporation plc - control through freedom http://www.siriusit.co.uk t: +44 870 608 0063 Sirius Labs: http://www.siriusit.co.uk/labs -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Fri, Oct 15, 2010 at 7:10 PM, Tom Lane wrote: > Robert Haas writes: >> I still feel vaguely uneasy about the fact that the proposed patch >> can't handle ASC/DESC or NULLS FIRST/LAST, and that unease grew a bit >> more last night when I read Peter's patch to add collation support. > > Good point. > >> We could possibly cram ASC/DESC and NULLS FIRST/LAST in by defining >> four new categories of operator strategies rather than one, but >> there's no way that's going to work for collations. Is there some >> other way to approach this problem? Can we leave pg_amop as it is, >> and pass the context (sort vs. qual, ASC/DESC, NULLS FIRST/LAST, >> collation, whatever...) to the index via some sort of side channel? > > Well, we cannot avoid changing pg_amop, or at least changing its > interpretation, because the current scheme simply can't represent > indexable operators that are used for anything except simple boolean > WHERE tests. I agree though that we do *not* want pg_amop involved > in the details of exactly what sort ordering is referenced by a sortable > operator. Somehow that needs to be passed in a side channel. I spent some time tonight looking at how this is handled in builtin_knngist_core-0.9. It looks like the requested sort, if there is one, just gets tossed into the indexquals list and eventually transformed into a scankey with a new flag SK_ORDER set. I think what we should do instead is add an additional IndexPath field which can point to a node indicating the desired ordering properties. This node might look similar to a scan key but having it be a separate node type will allow us to add in whatever, ahem, miscellaneous crap we want to pass: currently ASC/DESC and NULLS FIRST/LAST, and eventually collation and so forth. That can then get propagated into the IndexScan and passed to index_beginscan, where it can be passed to the AM's ambeginscan function (i.e. we'll add an additional argument to the signatures of those functions). There's a second problem, too. If we assume that we want to treat this problem with some degree of generality - that is, we really do care about things like ASC/DESC and NULLS FIRST/LAST and eventually collation - then the proposed amcanorderbyop flag isn't really sufficient. The AM will need to be able to indicate whether a given combination of parameters is one that it can handle. I'm thinking perhaps in lieu of a boolean, we can add another indexam method which, if not InvalidOid, gets called when we're wondering about whether a given clause is something that the index can order by. Although knngist focuses on a the ORDER BY col OP constant case, which certainly seems like the most useful one, there's no theoretical reason why an AM couldn't allow ordering by some more complex clause. I don't know whether anyone will ever feel like writing code to do something like that, but if we set the API up this way then it should be at least theoretically possible. Comments? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Fri, Oct 15, 2010 at 9:39 PM, Robert Haas wrote: > On Fri, Oct 15, 2010 at 8:45 PM, Robert Haas wrote: >> On Fri, Oct 15, 2010 at 8:10 PM, Tom Lane wrote: Perhaps we should think of pg_amop not so much as a way to tell the AM what to do, but just a way to tell it what operator is logically involved without relying on the name or OID. >>> >>> I already think of it that way ... >> >> OK. > > Thinking about it that way, perhaps we could add an integer column > amop_whats_it_good_for that gets used as a bit field. That wouldn't > require changing the index structure, although it might break some > other things. I gave this a shot (though I called it amoppurpose rather than amop_whats_it_good_for) and I think it's a reasonable way to proceed. Proof-of-concept patch attached. This just adds the column (using the existing padding space), defines AMOP_SEARCH and AMOP_ORDER, and makes just about everything ignore anything not marked AMOP_SEARCH, attached. This would obviously need some more hacking to pay attention to AMOP_ORDER where relevant, etc. and to create some actual syntax around it. Currently CREATE OPERATOR CLASS / ALTER OPERATOR FAMILY have this bit: OPERATOR strategy_number ( op_type [ , op_type ] ) knngist-0.9 implements this: [ORDER] OPERATOR strategy_number ( op_type [, op_type ] ) ...but with the design proposed above that's not quite what we'd want, because amoppurpose is a bit field, so you could have one or both of the two possible purposes. Perhaps: OPERATOR strategy_number ( op_type [ , op_type ] ) [ FOR { SEARCH | ORDER } [, ...] ] With the default being FOR SEARCH. Comments? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Sat, Oct 16, 2010 at 6:13 PM, Paul Ramsey wrote: > On Sat, Oct 16, 2010 at 10:17 AM, Peter Eisentraut wrote: >> On lör, 2010-10-16 at 09:23 -0700, Paul Ramsey wrote: >>> >> (And, if we are going to break everything >>> > in sight, now would be a good time to think about changing typmod to >>> > something more flexible than one int32.) >>> >>> As someone who is jamming geometry type, spatial reference number and >>> dimensionality into said 32bit typmod, let me say emphatically ... >>> Amen! >> >> So what kind of data structure would you like for a typmod? > > I'm a primitive enough beast that just having 64-bits would make me > happy. As a general matter though, a bytea? Yeah. It strikes me that there are three main kinds of things people might want to represent: 1. An integer. e.g. for a numeric, precision or scale; for a varchar, length; for an array, number of dimensions. 2. An OID. e.g. for varchar or text, a collation OID. 3. Recursive structure. So you might have an array (which is one-dimensional) containing strings (which are limited to 80 characters and collated in Klingon). You want to hold onto all of those details somehow. There might be use cases for even crazier things - like packing all the field names and types for a record object in there... but maybe that's too crazy to be workable. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Sat, Oct 16, 2010 at 10:17 AM, Peter Eisentraut wrote: > On lör, 2010-10-16 at 09:23 -0700, Paul Ramsey wrote: >> >> (And, if we are going to break everything >> > in sight, now would be a good time to think about changing typmod to >> > something more flexible than one int32.) >> >> As someone who is jamming geometry type, spatial reference number and >> dimensionality into said 32bit typmod, let me say emphatically ... >> Amen! > > So what kind of data structure would you like for a typmod? I'm a primitive enough beast that just having 64-bits would make me happy. As a general matter though, a bytea? P -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On lör, 2010-10-16 at 09:23 -0700, Paul Ramsey wrote: > >> (And, if we are going to break everything > > in sight, now would be a good time to think about changing typmod to > > something more flexible than one int32.) > > As someone who is jamming geometry type, spatial reference number and > dimensionality into said 32bit typmod, let me say emphatically ... > Amen! So what kind of data structure would you like for a typmod? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
>> (And, if we are going to break everything > in sight, now would be a good time to think about changing typmod to > something more flexible than one int32.) As someone who is jamming geometry type, spatial reference number and dimensionality into said 32bit typmod, let me say emphatically ... Amen! P
Re: [HACKERS] knngist - 0.8
On Fri, Oct 15, 2010 at 08:45:26PM -0400, Robert Haas wrote: > But I'm also not sure how far this gets us with KNNGIST, where the > issue is not the typmods but the auxilliary information about the > context of the sort and/or whether this is a sort or qual. ISTM there are two issues here. With Btree you have the issue where the index is of a particular collation and at planning time you're given a collation and either it's compatable or it's not. With Gist you have the situation where index isn't of a specific collation but it can be used in different ways to get different collection as output. (I'm here thinking of that Gist can in some circumstance be asked to return tuples in a certain order.) Actually, NULLS FIRST/LAST, ASC/DESC are just variations on collations and my original patch many years back for each locale made *four* collation IDs, one for each of the combinations of the above. Not terribly scalable, but I never did get the planning code to work. :) What you're going for here I think is a sort of cross-collation operators, or something, that tells the planner how to get the particular collation out of the given index. So you have something like: Input: en_US, NULLS FIRST, ASC Output: en_US, NULLS_LAST, DESC -> Reverse index scan Input: en_US, NULLS FIRST, ASC Output: en_US, NULLS LAST, ASC -> Index scan, add x IS NOT NULL test -> Append output of x IS NULL. Input: Gist Output: KnnGist -> Index scan with scan type 1 Input: Gist Output: Gist -> Index scan with scan type 0 Obviously this doesn't solve the problem of being able to represent the required collation in the first place, and if this is at all compatable with what the SQL standard calls a collation. I just don't think you can make hard and fast rules about how to make this work and that perhaps we should be looking for a way to push that to the index implementation code, with the default rule being: same collection yes, different no. Just some thoughts, Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > Patriotism is when love of your own people comes first; nationalism, > when hate for people other than your own comes first. > - Charles de Gaulle signature.asc Description: Digital signature
Re: [HACKERS] knngist - 0.8
On Fri, Oct 15, 2010 at 8:45 PM, Robert Haas wrote: > On Fri, Oct 15, 2010 at 8:10 PM, Tom Lane wrote: >>> Perhaps we should think of pg_amop not so much >>> as a way to tell the AM what to do, but just a way to tell it what >>> operator is logically involved without relying on the name or OID. >> >> I already think of it that way ... > > OK. Thinking about it that way, perhaps we could add an integer column amop_whats_it_good_for that gets used as a bit field. That wouldn't require changing the index structure, although it might break some other things. The core KNNGIST patch is actually quite small, actually. Excluding a lot of not-very-interesting changes to pg_amop.h, it's just over 300 lines of new code, about half of which is in indxpath.c. If we could get this issue of how to structure the catalogs resolved, it seems to me that we might be able to see our way to committing that part of this work fairly quickly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Fri, Oct 15, 2010 at 8:10 PM, Tom Lane wrote: >> Perhaps we should think of pg_amop not so much >> as a way to tell the AM what to do, but just a way to tell it what >> operator is logically involved without relying on the name or OID. > > I already think of it that way ... OK. >>> Maybe we should think in terms of a side channel for Peter's patch >>> as well. I share your feeling that trying to propagate collation >>> the same way we now propagate typmod is a recipe for serious pain. > >> I'm not sure what you're thinking of here. > > I'm not either. I'm dissatisfied with the direction he's heading > because of the amount of code it's going to break, but I don't have a > better idea. It may well be impossible to have this feature without > breaking everything in sight. (And, if we are going to break everything > in sight, now would be a good time to think about changing typmod to > something more flexible than one int32.) I assume I don't need to tell you my vote on that. But I'm also not sure how far this gets us with KNNGIST, where the issue is not the typmods but the auxilliary information about the context of the sort and/or whether this is a sort or qual. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Robert Haas writes: > On Fri, Oct 15, 2010 at 7:10 PM, Tom Lane wrote: >> Well, we cannot avoid changing pg_amop, or at least changing its >> interpretation, because the current scheme simply can't represent >> indexable operators that are used for anything except simple boolean >> WHERE tests. > What exactly do you mean by that? > It has always seemed to me that the operator class mechanism is a > complicated abstraction mechanism that actually adds only a very small > amount of abstraction. Instead of referring to operators by name or > OID we refer to them by a number that maps onto a name or OID. Well, the amount of abstraction might be minimal, but the point is to be able to understand which operators are related to an index and exactly what the relationship is. There might be a better way to represent "this operator acts as <= for btree int4 indexes" than "this operator has strategy number 2 for btree int4 indexes", but it doesn't seem to me that there's a lot of win available there. C code certainly finds it more convenient to work with numbers than names, so I'm not excited about replacing the strategy numbers with strategy names, if that's what you're thinking. > Perhaps we should think of pg_amop not so much > as a way to tell the AM what to do, but just a way to tell it what > operator is logically involved without relying on the name or OID. I already think of it that way ... >> Maybe we should think in terms of a side channel for Peter's patch >> as well. I share your feeling that trying to propagate collation >> the same way we now propagate typmod is a recipe for serious pain. > I'm not sure what you're thinking of here. I'm not either. I'm dissatisfied with the direction he's heading because of the amount of code it's going to break, but I don't have a better idea. It may well be impossible to have this feature without breaking everything in sight. (And, if we are going to break everything in sight, now would be a good time to think about changing typmod to something more flexible than one int32.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Fri, Oct 15, 2010 at 7:10 PM, Tom Lane wrote: > Robert Haas writes: >> I still feel vaguely uneasy about the fact that the proposed patch >> can't handle ASC/DESC or NULLS FIRST/LAST, and that unease grew a bit >> more last night when I read Peter's patch to add collation support. > > Good point. > >> We could possibly cram ASC/DESC and NULLS FIRST/LAST in by defining >> four new categories of operator strategies rather than one, but >> there's no way that's going to work for collations. Is there some >> other way to approach this problem? Can we leave pg_amop as it is, >> and pass the context (sort vs. qual, ASC/DESC, NULLS FIRST/LAST, >> collation, whatever...) to the index via some sort of side channel? > > Well, we cannot avoid changing pg_amop, or at least changing its > interpretation, because the current scheme simply can't represent > indexable operators that are used for anything except simple boolean > WHERE tests. What exactly do you mean by that? It has always seemed to me that the operator class mechanism is a complicated abstraction mechanism that actually adds only a very small amount of abstraction. Instead of referring to operators by name or OID we refer to them by a number that maps onto a name or OID. That allows the user to change the name or OID without breaking anything, but that's about it. Perhaps we should think of pg_amop not so much as a way to tell the AM what to do, but just a way to tell it what operator is logically involved without relying on the name or OID. > I agree though that we do *not* want pg_amop involved > in the details of exactly what sort ordering is referenced by a sortable > operator. Somehow that needs to be passed in a side channel. Yeah. > Maybe we should think in terms of a side channel for Peter's patch > as well. I share your feeling that trying to propagate collation > the same way we now propagate typmod is a recipe for serious pain. I'm not sure what you're thinking of here. I think we can have the idea of a FullySpecifiedType = , but that's not so much a side channel as an abstraction layer. It absolutely won't work to stuff the collations in a global variable or something like that, if that's what you're imagining. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Robert Haas writes: > I still feel vaguely uneasy about the fact that the proposed patch > can't handle ASC/DESC or NULLS FIRST/LAST, and that unease grew a bit > more last night when I read Peter's patch to add collation support. Good point. > We could possibly cram ASC/DESC and NULLS FIRST/LAST in by defining > four new categories of operator strategies rather than one, but > there's no way that's going to work for collations. Is there some > other way to approach this problem? Can we leave pg_amop as it is, > and pass the context (sort vs. qual, ASC/DESC, NULLS FIRST/LAST, > collation, whatever...) to the index via some sort of side channel? Well, we cannot avoid changing pg_amop, or at least changing its interpretation, because the current scheme simply can't represent indexable operators that are used for anything except simple boolean WHERE tests. I agree though that we do *not* want pg_amop involved in the details of exactly what sort ordering is referenced by a sortable operator. Somehow that needs to be passed in a side channel. Maybe we should think in terms of a side channel for Peter's patch as well. I share your feeling that trying to propagate collation the same way we now propagate typmod is a recipe for serious pain. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Fri, Oct 15, 2010 at 5:35 PM, Robert Haas wrote: > On Fri, Oct 15, 2010 at 12:02 PM, Tom Lane wrote: >> BTW, have we discussed the idea of embedding the role in the strategy >> number? For example, require regular operators to have strategy >> numbers 1-, while ordering operators have numbers 1-1, >> leaving room for a couple more roles before we have to rethink the >> assignment or widen amopstrategy to an int. That's a bit ugly but >> might be better than adding a separate role column. > > Yeah, we talked about it. > > http://archives.postgresql.org/pgsql-hackers/2010-02/msg01075.php > http://archives.postgresql.org/pgsql-hackers/2010-02/msg01079.php Having said that, I'm not wild on having 5-key syscaches, even though the patch is ready to go (modulo whatever rebasing is needed, which probably isn't much). So if you can figure out a way to avoid it, let's do that. I still feel vaguely uneasy about the fact that the proposed patch can't handle ASC/DESC or NULLS FIRST/LAST, and that unease grew a bit more last night when I read Peter's patch to add collation support. We could possibly cram ASC/DESC and NULLS FIRST/LAST in by defining four new categories of operator strategies rather than one, but there's no way that's going to work for collations. Is there some other way to approach this problem? Can we leave pg_amop as it is, and pass the context (sort vs. qual, ASC/DESC, NULLS FIRST/LAST, collation, whatever...) to the index via some sort of side channel? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Fri, Oct 15, 2010 at 12:02 PM, Tom Lane wrote: > BTW, have we discussed the idea of embedding the role in the strategy > number? For example, require regular operators to have strategy > numbers 1-, while ordering operators have numbers 1-1, > leaving room for a couple more roles before we have to rethink the > assignment or widen amopstrategy to an int. That's a bit ugly but > might be better than adding a separate role column. Yeah, we talked about it. http://archives.postgresql.org/pgsql-hackers/2010-02/msg01075.php http://archives.postgresql.org/pgsql-hackers/2010-02/msg01079.php -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Teodor Sigaev writes: > As I remember, there were several suggested designs: > 1) 5-th boolean column (amopfamily, amoplefttype, amoprighttype, > amopstrategy, > amoporder) to point kind of operator (search or order) >+ saves one record for operator in pg_amop >- operator could not be used in both roles >- increase number of arguments for syscache machinery > 2) 5-th combined column, which contains some kind of flag for each role >+ saves one record for operator in pg_amop >+ operator could be used in both roles >- strategy number for operator is the same for both roles, it's > unacceptable > because GiST's consistentFn will not have information about role. GiST > itself could distinguish them by invented SK_ORDER flag. So, this > requires to introduce one more argument for consistentFn, while it > already has 5 arguments. >- increase number of arguments for syscache machinery > 3) 3-rd boolean column (amopopr, amopfamily, amoporder) >- could be two records per operator >+ operator could be used in both roles >+ strategy number could be different for different roles How can #3 work at all? It's ignoring a couple of critical index columns. In particular, I believe the sticking point here is this unique index: "pg_amop_fam_strat_index" UNIQUE, btree (amopfamily, amoplefttype, amoprighttype, amopstrategy) #3 doesn't explain what to do about this index. If we extend it to five columns, as we'd logically have to do to preserve uniqueness, then the associated syscache has to have five key columns as well, and now we're at solution #1. I'm not terribly thrilled with using a boolean here in any case. Now that we have two "roles" an operator might play in an opclass, who's to say there might not be more roles someday? We should use a column type that will support more than two roles without basic rejiggering. BTW, have we discussed the idea of embedding the role in the strategy number? For example, require regular operators to have strategy numbers 1-, while ordering operators have numbers 1-1, leaving room for a couple more roles before we have to rethink the assignment or widen amopstrategy to an int. That's a bit ugly but might be better than adding a separate role column. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
forward. Tom and I laid out a technical design back in January and I still think it's a good one, even though I know you think it's silly. We may just have to agree to disagree on this point. As I remember, there were several suggested designs: 1) 5-th boolean column (amopfamily, amoplefttype, amoprighttype, amopstrategy, amoporder) to point kind of operator (search or order) + saves one record for operator in pg_amop - operator could not be used in both roles - increase number of arguments for syscache machinery 2) 5-th combined column, which contains some kind of flag for each role + saves one record for operator in pg_amop + operator could be used in both roles - strategy number for operator is the same for both roles, it's unacceptable because GiST's consistentFn will not have information about role. GiST itself could distinguish them by invented SK_ORDER flag. So, this requires to introduce one more argument for consistentFn, while it already has 5 arguments. - increase number of arguments for syscache machinery 3) 3-rd boolean column (amopopr, amopfamily, amoporder) - could be two records per operator + operator could be used in both roles + strategy number could be different for different roles All three options require to add flag of role op_in_opfamily/get_op_opfamily_strategy/get_op_opfamily_properties to check applicability of operation in current code path. First two options could do not change of interface of op_in_opfamily/get_op_opfamily_strategy but it will be needed to check actual role of operator later. Basing on this comparison, I think, that 2) is worse and 3) is better. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Tue, Oct 5, 2010 at 5:05 PM, Oleg Bartunov wrote: > Are you sure postgres doesn't have limitation at all ? There are a lot ot > them. Of course, there are limitation and limitation and we should avoid > limitations, which will require incompatible changes in future in user's > code, or which prevent future improvements (getting rid limitation). We > suggested implementation of k-nn search, which has limitations, but in my > opinion, they are rather small and doesn't prevent in future to > write a descent patch, based on your five-key syscaches patches, > which will touch a lot of places in the pg source. > > Who need boolean distance ? Ok, you insisted, we now support it. Teodor > wrote arguments > (http://archives.postgresql.org/message-id/4c8e2590.6040...@sigaev.ru) > why two fields solution doesn't really helped. > > You want "the points most distant from the bright center of the universe?", > sorry, for now. I think, this is a limitation we can live with for now. It's > k-nearest neighbourhood search, and not k-furthest outliers search. > > You want documentation for review, I don't believe a reviewer can't review > without users documentation like CREATE/ALTER OPERATOR CLASS, etc. Andrew > Gierth was true, > that GiST documentation needed to be rewritten and he suggested to do that > if I understand him. I'd love to help, but I don't have any message from > him. > If he changed his mind, I'll describe these changes. > > We're not full-time pg-employers and it's not easy for us to follow new > cleaner-way. I think there is a risk, that there are will be lesser big > features introduced by non-pg employers in the future and eventually, pg > will be open-source database developed by pg-employers with some amount > cosmetic changes and fixes. > > I suggest to find a sensible consensus on implementation, taking into > account Pareto Rule, and leave place for future improvement. I am sorry my review pissed you off, as it seems to have done. Looking back on it, I realize now that it was phrased more harshly than it needed to be. That having been said, it seems to me that we really are at an impasse and I am not sure what to propose as a way forward. Tom and I laid out a technical design back in January and I still think it's a good one, even though I know you think it's silly. We may just have to agree to disagree on this point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Robert, Are you sure postgres doesn't have limitation at all ? There are a lot ot them. Of course, there are limitation and limitation and we should avoid limitations, which will require incompatible changes in future in user's code, or which prevent future improvements (getting rid limitation). We suggested implementation of k-nn search, which has limitations, but in my opinion, they are rather small and doesn't prevent in future to write a descent patch, based on your five-key syscaches patches, which will touch a lot of places in the pg source. Who need boolean distance ? Ok, you insisted, we now support it. Teodor wrote arguments (http://archives.postgresql.org/message-id/4c8e2590.6040...@sigaev.ru) why two fields solution doesn't really helped. You want "the points most distant from the bright center of the universe?", sorry, for now. I think, this is a limitation we can live with for now. It's k-nearest neighbourhood search, and not k-furthest outliers search. You want documentation for review, I don't believe a reviewer can't review without users documentation like CREATE/ALTER OPERATOR CLASS, etc. Andrew Gierth was true, that GiST documentation needed to be rewritten and he suggested to do that if I understand him. I'd love to help, but I don't have any message from him. If he changed his mind, I'll describe these changes. We're not full-time pg-employers and it's not easy for us to follow new cleaner-way. I think there is a risk, that there are will be lesser big features introduced by non-pg employers in the future and eventually, pg will be open-source database developed by pg-employers with some amount cosmetic changes and fixes. I suggest to find a sensible consensus on implementation, taking into account Pareto Rule, and leave place for future improvement. Oleg On Tue, 14 Sep 2010, Robert Haas wrote: 2010/9/13 Teodor Sigaev : [updated patch] I realize I'm repeating myself, but... this patch needs documentation. It's not optional. It seems to me that you need to do something about AMOPSTRATEGY. Hence the five-key syscaches patches I wrote that is sitting in queue. And also LookupOpClassInfo(). Am I confused? Is there a reason we're using a boolean 'amoporder' distinguish ordering operators from regular old operators? Tom and I were talking about some kind of an integer field, in case we need more categories later. You know, like 'amopcategory' or something like that. It could be just one byte, perhaps, but one bit seems unduly narrow. You could define constants OPCAT_QUAL and OPCAT_ORDER, or something like that. This error message seems like it ought to be complaining about the access method, not the index: + if (!pg_am->amcanorderbyop) + ereport(ERROR, + (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), +errmsg("index doesn't support ordering operations"))); I sort of understand the reasons behind the following restriction, but is this truly the best we can do? + /* +* Currently, only descending and nulls last ordering is supported +*/ + if (!(pathkey->pk_strategy == BTLessStrategyNumber && pathkey->pk_nulls_first == false)) + return; What happens if we have an index strategy that can efficiently return the points most distant from the bright center of the universe? By the way: gistproc.c: In function ?gist_point_consistent?: gistproc.c:1023: error: ?distance? may be used uninitialized in this function Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers