subject:"\[HACKERS\] knngist \- 0.8"

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-04 Thread Simon Riggs

On Tue, 2011-03-01 at 22:56 +, Simon Riggs wrote:
> On Tue, 2011-03-01 at 12:42 -0800, Josh Berkus wrote:
> > > That response is just dodging the hard question, so whatever.  Tom's
> > > cleanup is not going to break things, or at least it's going to fix
> > > more than it breaks on net.  Sync rep, on the other hand, is going to
> > > do the opposite, probably by a large margin.
> > 
> > Well, we need to at least give Simon and Fujii a chance to respond
> > during their respective daytime hours.
> 
> Friday is a reasonable deadline.

I'm not going to be committing Sync Rep today.

It's technically in very good shape, but some of the conceptual issues
aren't clear enough on a Friday night to make it sensible for me to
commit to PostgreSQL, given we must support it for 5 years if I do.

I believe those things are resolvable for 9.1, probably fairly soon, but
I'm not going to rush a commit just to hit a deadline. That isn't the
Way of Quality that we have followed for so long.

I'll let y'all discuss what to do next.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2011-03-03 Thread Oleg Bartunov


Thanks, Tom !


Oleg
On Wed, 2 Mar 2011, Tom Lane wrote:


Teodor Sigaev  writes:

[ builtin_knngist_contrib_btree_gist-0.12 patch ]


Applied with some corrections --- mostly, that the upgrade script was
all wet.  I added some documentation too.

regards, tom lane




Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2011-03-02 Thread Tom Lane

Teodor Sigaev  writes:
> [ builtin_knngist_contrib_btree_gist-0.12 patch ]

Applied with some corrections --- mostly, that the upgrade script was
all wet.  I added some documentation too.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Simon Riggs

On Tue, 2011-03-01 at 12:42 -0800, Josh Berkus wrote:
> > That response is just dodging the hard question, so whatever.  Tom's
> > cleanup is not going to break things, or at least it's going to fix
> > more than it breaks on net.  Sync rep, on the other hand, is going to
> > do the opposite, probably by a large margin.
> 
> Well, we need to at least give Simon and Fujii a chance to respond
> during their respective daytime hours.

Friday is a reasonable deadline.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Jaime Casanova

On Tue, Mar 1, 2011 at 4:58 PM, Robert Haas  wrote:
>
> So where is the new patch, and the test reports from all the people
> who found problems in the last version?
>

Simon do this in his public repo here:
git://github.com/simon2ndQuadrant/postgres.git

It has been just one day from this so i expect he will post a patch
soon, meanwhile i'm following his repo and testing and seems i'm not
the only one because Yeb Havinga has added two columns in
pg_stat_replication to know which standby is the synch standby and
which ones are potential synch standbys

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte y capacitación de PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Robert Haas

On Tue, Mar 1, 2011 at 4:55 PM, Jaime Casanova  wrote:
> On Tue, Mar 1, 2011 at 3:40 PM, Robert Haas  wrote:
>> Sync rep, on the other hand, is going to
>> do the opposite, probably by a large margin.
>>
>
> Are you sure of that? the code is very centralized and the bugs we
> found last week weren't that difficult to address, the prove for that
> is that Simon fixed them in one day...

So where is the new patch, and the test reports from all the people
who found problems in the last version?

> And for the open items, most of them are definitions about the behaviour...

Like what?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Jaime Casanova

On Tue, Mar 1, 2011 at 3:40 PM, Robert Haas  wrote:
> Sync rep, on the other hand, is going to
> do the opposite, probably by a large margin.
>

Are you sure of that? the code is very centralized and the bugs we
found last week weren't that difficult to address, the prove for that
is that Simon fixed them in one day...

And for the open items, most of them are definitions about the behaviour...

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte y capacitación de PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Josh Berkus


> That response is just dodging the hard question, so whatever.  Tom's
> cleanup is not going to break things, or at least it's going to fix
> more than it breaks on net.  Sync rep, on the other hand, is going to
> do the opposite, probably by a large margin.

Well, we need to at least give Simon and Fujii a chance to respond
during their respective daytime hours.


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Tom Lane

Robert Haas  writes:
> Bruce has been going through the open items for the past several weeks
> (at least) and tells me that he hasn't found very much.  I'm not sure
> what your thought is on what's required to get us from here to beta,
> but I am thinking it could be done in a few weeks.  With a concerted
> effort and some sustained focus, I don't see why we could get this
> release out the door in, say, three months.  Taking in a feature
> that's going to take another month to sort out is going to push that
> out, and I am really not excited about another round of
> spend-all-summer-waiting-for-people-to-get-back-from-vacation-and-release-in-September.

Yeah, it would be really nice to get the release out in June rather than
September.  If we wait any longer for Sync Rep I'm pretty sure it's
going to be the latter not the former.

See my nearby message for a start at a list of what we "must" do to
get to alpha4.  Any features we want to cram in at this stage go on
top of that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Robert Haas

On Tue, Mar 1, 2011 at 3:36 PM, Josh Berkus  wrote:
>
>> That's still missing the point, which is that the code is unlikely to
>> be up to our usual standards at that point.
>
> I was talking about "when do we close the CF and start building an
> alpha", not synch rep particularly.  Tom said he wants until Friday
> anyway just for some cleanup.

That response is just dodging the hard question, so whatever.  Tom's
cleanup is not going to break things, or at least it's going to fix
more than it breaks on net.  Sync rep, on the other hand, is going to
do the opposite, probably by a large margin.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Josh Berkus


> That's still missing the point, which is that the code is unlikely to
> be up to our usual standards at that point.

I was talking about "when do we close the CF and start building an
alpha", not synch rep particularly.  Tom said he wants until Friday
anyway just for some cleanup.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Robert Haas

On Tue, Mar 1, 2011 at 3:01 PM, Josh Berkus  wrote:
> On 3/1/11 11:58 AM, Tom Lane wrote:
>> If we do hold up the release, I'll probably go back and reopen the
>> postgresql_fdw patch as well as btree_gist.  So I won't run out of
>> things to do.  But I'm not terribly satisfied with the decision-making
>> process here.
>
> OK, well, we gave it an extra 15 days, which was all we promised.
>
> I'm ok with closing things as of the end of the 15 days, say Thursday or
> Friday.

That's still missing the point, which is that the code is unlikely to
be up to our usual standards at that point.

So far I don't hear anyone arguing strongly that we should accept sync
rep in 9.1, and several people arguing the reverse.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Joshua D. Drake

On Tue, 2011-03-01 at 14:26 -0500, Robert Haas wrote:
> On Tue, Mar 1, 2011 at 2:12 PM, Josh Berkus  wrote:
> >
> >>> Since we appear to be still holding the commitfest open for Sync Rep,
> >>> I guess this ought to get reviewed.
> >>
> >> Or else we should close the CommitFest and cut alpha4.  Anyone have an
> >> opinion on which way to go?
> >
> > I think we can give Sync Rep until the 15th, given the pace of work on
> > it.  It is a major feature, and a complicated one.
> 
> Sure, but there are other features, major and minor, that we have
> postponed to 9.2.  In the normal course of events, sync rep would have
> been marked Returned with Feedback a month ago.  I like the feature,
> but I have to say I'm not very pleased that we seem to have fallen
> into a pattern of believing that some major features are somehow
> exempted from the scheduling deadline and others are not. 

Neither am I, I mean, we were actively fixing and bringing Fk-Locks up
to date and we got pushed but sync rep gets in? We may have had fk-locks
baked by now. I want syncrep as much as the next person, but this isn't
really fair to any of the other submitters.

JD

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Robert Haas

On Tue, Mar 1, 2011 at 2:58 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Mar 1, 2011 at 2:45 PM, Tom Lane  wrote:
>>> I'd say that if there's a plausible chance that Sync Rep will be
>>> committable by the end of *this* week (and I mean Friday not Sunday),
>>> I'm willing to wait that long for it.  Otherwise, it's 9.2 material.
>
>> I am quite sure that Simon will be able to get something committed
>> ahead of whatever deadline we choose to set.  Whether that commit will
>> be up to our usual standards is another question altogether.  The last
>> version posted to the list was trivial to break, and that was several
>> weeks ago.
>
> Yeah, there's that.  It's difficult to believe that anything committed
> in the very short term wouldn't be rushed to completion rather than
> really ready.  That holds whether the deadline is this week or two
> weeks out.

Yep.

> The other issue is that, as Robert says, we have already cut Sync Rep
> more slack than any other patch in the commitfest.  It does not seem
> fair to hold up the release process another week or two for it, even
> assuming that we get a high-quality feature at the end of that.

Yep.

> If we do hold up the release, I'll probably go back and reopen the
> postgresql_fdw patch as well as btree_gist.  So I won't run out of
> things to do.  But I'm not terribly satisfied with the decision-making
> process here.

Well, we haven't actually made a decision here yet.  We're just
talking about what decision we ought to make.  Frankly, I avoided
trying to mark Sync Rep Returned with Feedback mostly for the reason
that I knew Simon would object, and his commit bit gives him a certain
degree of latitude to ignore the CF process anyway.  But I am really
not that keen on having Sync Rep go in and then spending another month
fixing all the bugs, and that's what I think will happen.

Bruce has been going through the open items for the past several weeks
(at least) and tells me that he hasn't found very much.  I'm not sure
what your thought is on what's required to get us from here to beta,
but I am thinking it could be done in a few weeks.  With a concerted
effort and some sustained focus, I don't see why we could get this
release out the door in, say, three months.  Taking in a feature
that's going to take another month to sort out is going to push that
out, and I am really not excited about another round of
spend-all-summer-waiting-for-people-to-get-back-from-vacation-and-release-in-September.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Robert Haas

On Tue, Mar 1, 2011 at 3:08 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Mar 1, 2011 at 2:52 PM, Josh Berkus  wrote:
>>> Ideally, we want to have some binaries/packages for the "final alpha".
>>> Those broaden testing considerably.
>
>> When we have a version that needs that treatment, we can simply call
>> it beta1.  If it's too half-baked for that, then I don't see the point
>> in going to a lot of trouble to build packages.
>
> We (or more precisely EDB) made Windows installers for alpha1:
> http://www.enterprisedb.com/products-services-training/pgdevdownload
> And IIRC they did installers for alphas in the 9.0 cycle too.  And
> certainly Devrim and others have been building binary packages for
> alphas.  If this alpha is so much less baked than the previous ones
> that that's not worthwhile, there's something very wrong with the
> process.  The last alpha ought to be in testable condition.

Oh, really?  OK.  I wasn't aware that alphas got installers.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Tom Lane

Robert Haas  writes:
> On Tue, Mar 1, 2011 at 2:52 PM, Josh Berkus  wrote:
>> Ideally, we want to have some binaries/packages for the "final alpha".
>> Those broaden testing considerably.

> When we have a version that needs that treatment, we can simply call
> it beta1.  If it's too half-baked for that, then I don't see the point
> in going to a lot of trouble to build packages.

We (or more precisely EDB) made Windows installers for alpha1:
http://www.enterprisedb.com/products-services-training/pgdevdownload
And IIRC they did installers for alphas in the 9.0 cycle too.  And
certainly Devrim and others have been building binary packages for
alphas.  If this alpha is so much less baked than the previous ones
that that's not worthwhile, there's something very wrong with the
process.  The last alpha ought to be in testable condition.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Josh Berkus

On 3/1/11 11:58 AM, Tom Lane wrote:
> If we do hold up the release, I'll probably go back and reopen the
> postgresql_fdw patch as well as btree_gist.  So I won't run out of
> things to do.  But I'm not terribly satisfied with the decision-making
> process here.

OK, well, we gave it an extra 15 days, which was all we promised.

I'm ok with closing things as of the end of the 15 days, say Thursday or
Friday.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Tom Lane

Robert Haas  writes:
> On Tue, Mar 1, 2011 at 2:45 PM, Tom Lane  wrote:
>> I'd say that if there's a plausible chance that Sync Rep will be
>> committable by the end of *this* week (and I mean Friday not Sunday),
>> I'm willing to wait that long for it.  Otherwise, it's 9.2 material.

> I am quite sure that Simon will be able to get something committed
> ahead of whatever deadline we choose to set.  Whether that commit will
> be up to our usual standards is another question altogether.  The last
> version posted to the list was trivial to break, and that was several
> weeks ago.

Yeah, there's that.  It's difficult to believe that anything committed
in the very short term wouldn't be rushed to completion rather than
really ready.  That holds whether the deadline is this week or two
weeks out.

The other issue is that, as Robert says, we have already cut Sync Rep
more slack than any other patch in the commitfest.  It does not seem
fair to hold up the release process another week or two for it, even
assuming that we get a high-quality feature at the end of that.

If we do hold up the release, I'll probably go back and reopen the
postgresql_fdw patch as well as btree_gist.  So I won't run out of
things to do.  But I'm not terribly satisfied with the decision-making
process here.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Robert Haas

On Tue, Mar 1, 2011 at 2:52 PM, Josh Berkus  wrote:
> Ideally, we want to have some binaries/packages for the "final alpha".
> Those broaden testing considerably.

When we have a version that needs that treatment, we can simply call
it beta1.  If it's too half-baked for that, then I don't see the point
in going to a lot of trouble to build packages.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Josh Berkus


> As I understand it, it requires only the steps described here:
> 
> http://wiki.postgresql.org/wiki/Alpha_release_process
> 
> That all looks pretty straightforward, assuming you can log into
> developer.postgresql.org, which I can't.  I think I remember having an
> ftp account at some point, but it's not accepting connections on port
> 22, so there is doubtless some secret sauce I am missing here.

Ideally, we want to have some binaries/packages for the "final alpha".
Those broaden testing considerably.

We don't need them for all platforms, of course.  Really, the critical
ones for testing are Windows and OSX.  Linux/BSD/Solaris users are
pretty good at make/make install.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Robert Haas

On Tue, Mar 1, 2011 at 2:38 PM, Magnus Hagander  wrote:
>> Frankly, I think we should be aiming to get a beta out in April, not
>> another alpha.
>
> That would be good, but in order to get there, +1 for cutting a new
> alpha even if syncrep isn't ready for it yet. That way we can at least
> get some more testing on all the non-syncrep code.
>
> That is assuming that cutting an alpha release isn't all that much
> work, but IIRC it's not. (Hey, I know it's not much work for me, but
> someone who actually does the work should comment on the total
> amount...)

As I understand it, it requires only the steps described here:

http://wiki.postgresql.org/wiki/Alpha_release_process

That all looks pretty straightforward, assuming you can log into
developer.postgresql.org, which I can't.  I think I remember having an
ftp account at some point, but it's not accepting connections on port
22, so there is doubtless some secret sauce I am missing here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Robert Haas

On Tue, Mar 1, 2011 at 2:45 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Mar 1, 2011 at 2:12 PM, Josh Berkus  wrote:
>>> I think we can give Sync Rep until the 15th, given the pace of work on
>>> it.  It is a major feature, and a complicated one.
>
>> Sure, but there are other features, major and minor, that we have
>> postponed to 9.2.  In the normal course of events, sync rep would have
>> been marked Returned with Feedback a month ago.  I like the feature,
>> but I have to say I'm not very pleased that we seem to have fallen
>> into a pattern of believing that some major features are somehow
>> exempted from the scheduling deadline and others are not.
>
> Yes.  What are the rest of us supposed to do for the next two weeks,
> twiddle our thumbs?
>
> Personally I've got a couple of days' worth of cleanup tasks before I'd
> want to see us cut an alpha anyway, especially if we're going to try
> to accept the btree_gist KNNgist patch.  Two weeks is too much though.
>
> I'd say that if there's a plausible chance that Sync Rep will be
> committable by the end of *this* week (and I mean Friday not Sunday),
> I'm willing to wait that long for it.  Otherwise, it's 9.2 material.

I am quite sure that Simon will be able to get something committed
ahead of whatever deadline we choose to set.  Whether that commit will
be up to our usual standards is another question altogether.  The last
version posted to the list was trivial to break, and that was several
weeks ago.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Tom Lane

Robert Haas  writes:
> On Tue, Mar 1, 2011 at 2:12 PM, Josh Berkus  wrote:
>> I think we can give Sync Rep until the 15th, given the pace of work on
>> it.  It is a major feature, and a complicated one.

> Sure, but there are other features, major and minor, that we have
> postponed to 9.2.  In the normal course of events, sync rep would have
> been marked Returned with Feedback a month ago.  I like the feature,
> but I have to say I'm not very pleased that we seem to have fallen
> into a pattern of believing that some major features are somehow
> exempted from the scheduling deadline and others are not.

Yes.  What are the rest of us supposed to do for the next two weeks,
twiddle our thumbs?

Personally I've got a couple of days' worth of cleanup tasks before I'd
want to see us cut an alpha anyway, especially if we're going to try
to accept the btree_gist KNNgist patch.  Two weeks is too much though.

I'd say that if there's a plausible chance that Sync Rep will be
committable by the end of *this* week (and I mean Friday not Sunday),
I'm willing to wait that long for it.  Otherwise, it's 9.2 material.

> Frankly, I think we should be aiming to get a beta out in April, not
> another alpha.

Quite.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Magnus Hagander

On Tue, Mar 1, 2011 at 20:26, Robert Haas  wrote:
> On Tue, Mar 1, 2011 at 2:12 PM, Josh Berkus  wrote:
>>
 Since we appear to be still holding the commitfest open for Sync Rep,
 I guess this ought to get reviewed.
>>>
>>> Or else we should close the CommitFest and cut alpha4.  Anyone have an
>>> opinion on which way to go?
>>
>> I think we can give Sync Rep until the 15th, given the pace of work on
>> it.  It is a major feature, and a complicated one.
>
> Sure, but there are other features, major and minor, that we have
> postponed to 9.2.  In the normal course of events, sync rep would have
> been marked Returned with Feedback a month ago.  I like the feature,
> but I have to say I'm not very pleased that we seem to have fallen
> into a pattern of believing that some major features are somehow
> exempted from the scheduling deadline and others are not.  I am sure
> there are plenty of other people who would have liked a six week
> extension of the usual CommitFest deadlines too, but they didn't get
> it (and for the most part, were pretty gracious about that).
>
>> We could even cut a pre-synch-rep Alpha4 *now*, and follow that with a
>> post-synch-rep Alpha5 sometime around April 1.
>>
>> That'll put us in a good position for beta, and also to see what
>> specific issue SynR adds.
>
> Frankly, I think we should be aiming to get a beta out in April, not
> another alpha.

That would be good, but in order to get there, +1 for cutting a new
alpha even if syncrep isn't ready for it yet. That way we can at least
get some more testing on all the non-syncrep code.

That is assuming that cutting an alpha release isn't all that much
work, but IIRC it's not. (Hey, I know it's not much work for me, but
someone who actually does the work should comment on the total
amount...)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Robert Haas

On Tue, Mar 1, 2011 at 2:12 PM, Josh Berkus  wrote:
>
>>> Since we appear to be still holding the commitfest open for Sync Rep,
>>> I guess this ought to get reviewed.
>>
>> Or else we should close the CommitFest and cut alpha4.  Anyone have an
>> opinion on which way to go?
>
> I think we can give Sync Rep until the 15th, given the pace of work on
> it.  It is a major feature, and a complicated one.

Sure, but there are other features, major and minor, that we have
postponed to 9.2.  In the normal course of events, sync rep would have
been marked Returned with Feedback a month ago.  I like the feature,
but I have to say I'm not very pleased that we seem to have fallen
into a pattern of believing that some major features are somehow
exempted from the scheduling deadline and others are not.  I am sure
there are plenty of other people who would have liked a six week
extension of the usual CommitFest deadlines too, but they didn't get
it (and for the most part, were pretty gracious about that).

> We could even cut a pre-synch-rep Alpha4 *now*, and follow that with a
> post-synch-rep Alpha5 sometime around April 1.
>
> That'll put us in a good position for beta, and also to see what
> specific issue SynR adds.

Frankly, I think we should be aiming to get a beta out in April, not
another alpha.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Josh Berkus


>> Since we appear to be still holding the commitfest open for Sync Rep,
>> I guess this ought to get reviewed.
> 
> Or else we should close the CommitFest and cut alpha4.  Anyone have an
> opinion on which way to go?

I think we can give Sync Rep until the 15th, given the pace of work on
it.  It is a major feature, and a complicated one.

We could even cut a pre-synch-rep Alpha4 *now*, and follow that with a
post-synch-rep Alpha5 sometime around April 1.

That'll put us in a good position for beta, and also to see what
specific issue SynR adds.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

wrapping up this CommitFest (was Re: [HACKERS] knngist - 0.8)

2011-03-01 Thread Robert Haas

On Tue, Mar 1, 2011 at 1:35 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Feb 28, 2011 at 2:25 PM, Tom Lane  wrote:
>>> Given that it is a contrib module, I personally wouldn't object to it
>>> getting patched later, like during alpha or beta.  But somebody's got
>>> to do the work, and I've got a dozen higher-priority problems right now.
>
>> Well, we can argue about whether it's too late for 9.1 if and when a
>> patch shows up.  Right now we don't have that problem.
>
> We do now ...
> http://archives.postgresql.org/pgsql-hackers/2011-03/msg00038.php
>
> Since we appear to be still holding the commitfest open for Sync Rep,
> I guess this ought to get reviewed.

Or else we should close the CommitFest and cut alpha4.  Anyone have an
opinion on which way to go?

I think it's fair to say that Simon is working pretty actively on Sync
Rep and that the bug count is probably dropping rapidly.  It seems a
shame to push sync rep out to 9.2 in that context.  On the other hand,
the patch wasn't done at the beginning of the CommitFest, it wasn't
done at the scheduled end of the CommitFest, and it's still not done
now two weeks after the scheduled end of the CommitFest.  If it gets
committed O(now), it's probably going to still have bugs and design
problems that will take at least a few more weeks to shake out, which
will directly add to the length of time that it takes to actually get
the release out the door.

I could go either way on this one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2011-03-01 Thread Tom Lane

Robert Haas  writes:
> On Mon, Feb 28, 2011 at 2:25 PM, Tom Lane  wrote:
>> Given that it is a contrib module, I personally wouldn't object to it
>> getting patched later, like during alpha or beta.  But somebody's got
>> to do the work, and I've got a dozen higher-priority problems right now.

> Well, we can argue about whether it's too late for 9.1 if and when a
> patch shows up.  Right now we don't have that problem.

We do now ...
http://archives.postgresql.org/pgsql-hackers/2011-03/msg00038.php

Since we appear to be still holding the commitfest open for Sync Rep,
I guess this ought to get reviewed.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2011-03-01 Thread Teodor Sigaev


I did a quick look at this patch.  The major problem with it is of
course that it needs to be fixed for the recent extension-related
changes.  I transposed the .sql.in changes into additions to
btree_gist--1.0.sql (attached), but haven't really sanity-checked
them beyond checking that the regression tests pass.  The same mods
would need to be made in btree_gist--unpackaged--1.0.sql.


Fixed


1. oid_dist() returns oid ... really?  Oid is unsigned.  I'd be inclined
to argue though that distance between Oids is a meaningless concept, so

Hmm, oid is often used as unsigned int.


you should remove this not just mess with the result type.  Anybody who
actually wants to form a distance between Oids should have to cast them
to an arithmetic type first.  Let the user figure out how wraparound
cases should be handled.


Distance between unsigned 32-bit integers could not be more than 2^32.



2. Beyond that, none of the distance routines have given any thought to
avoiding overflow.  For instance, dist_int2 had better return something
wider than int2, and so on up.  It looks to me like the internal gist

Just like other operations:
# select 32000::smallint + 32000::smallint;
ERROR:  smallint out of range


distance functions also suffer overflow risks, in that they tend to form
the difference first (in the source datatype) and only afterwards cast
to float8.

fixed


3. I was surprised that there wasn't a distance implementation for
numeric.  I suppose that this might be difficult to do without risking
overflow in conversion to float8, though.

Exactly


4. I didn't much care for changing the result type of gbt_num_consistent
from bool to float8; that's just messy, and I don't see any compensating
advantage.  I suggest you leave gbt_num_consistent and its callers
alone, and add a separate gbt_num_distance routine that only handles the
KNNDistance case.

Done

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


builtin_knngist_contrib_btree_gist-0.12.gz
Description: Unix tar archive

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2011-02-28 Thread Robert Haas

On Mon, Feb 28, 2011 at 2:25 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Feb 28, 2011 at 1:53 PM, Josh Berkus  wrote:
 Since no one has stepped up to fix these issues, I have marked this
 patch Returned with Feedback.
>
>>> This is just contrib/btree_GIST, yes?
>
>> Yes, core KNN was committed by Tom during the November CommitFest.
>
> Right.  However, it's disappointing that this isn't in, because the
> number of use cases for KNN-gist in core isn't very large.  We really
> need support for KNN in btree_gist to make it useful.
>
> Given that it is a contrib module, I personally wouldn't object to it
> getting patched later, like during alpha or beta.  But somebody's got
> to do the work, and I've got a dozen higher-priority problems right now.

Well, we can argue about whether it's too late for 9.1 if and when a
patch shows up.  Right now we don't have that problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2011-02-28 Thread Tom Lane

Robert Haas  writes:
> On Mon, Feb 28, 2011 at 1:53 PM, Josh Berkus  wrote:
>>> Since no one has stepped up to fix these issues, I have marked this
>>> patch Returned with Feedback.

>> This is just contrib/btree_GIST, yes?

> Yes, core KNN was committed by Tom during the November CommitFest.

Right.  However, it's disappointing that this isn't in, because the
number of use cases for KNN-gist in core isn't very large.  We really
need support for KNN in btree_gist to make it useful.

Given that it is a contrib module, I personally wouldn't object to it
getting patched later, like during alpha or beta.  But somebody's got
to do the work, and I've got a dozen higher-priority problems right now.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2011-02-28 Thread Robert Haas

On Mon, Feb 28, 2011 at 1:53 PM, Josh Berkus  wrote:
>
>> Since no one has stepped up to fix these issues, I have marked this
>> patch Returned with Feedback.
>
> This is just contrib/btree_GIST, yes?

Yes, core KNN was committed by Tom during the November CommitFest.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2011-02-28 Thread Josh Berkus


> Since no one has stepped up to fix these issues, I have marked this
> patch Returned with Feedback.

This is just contrib/btree_GIST, yes?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2011-02-28 Thread Robert Haas

On Fri, Feb 18, 2011 at 1:07 AM, Tom Lane  wrote:
> There might be more issues, I haven't read the patch in detail.
> But anyway, I'm going to set it to Waiting on Author.  I think it
> needs at least a day or so's work, and I can't put in that kind of
> time on it now.

Since no one has stepped up to fix these issues, I have marked this
patch Returned with Feedback.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2011-02-17 Thread Tom Lane

Teodor Sigaev  writes:
>> I've applied all of this, and written documentation for all of it,
> Thank you a lot
>> except for the contrib/btree_gist additions which still need to be
>> redone for the revised API (and then documented!).  My patience ran out

> Done, btree_gist is reworked for a new API.

I did a quick look at this patch.  The major problem with it is of
course that it needs to be fixed for the recent extension-related
changes.  I transposed the .sql.in changes into additions to 
btree_gist--1.0.sql (attached), but haven't really sanity-checked
them beyond checking that the regression tests pass.  The same mods
would need to be made in btree_gist--unpackaged--1.0.sql.

However, I feel that this is not ready to apply even given those fixes.
Problems yet to solve:

1. oid_dist() returns oid ... really?  Oid is unsigned.  I'd be inclined
to argue though that distance between Oids is a meaningless concept, so
you should remove this not just mess with the result type.  Anybody who
actually wants to form a distance between Oids should have to cast them
to an arithmetic type first.  Let the user figure out how wraparound
cases should be handled.

2. Beyond that, none of the distance routines have given any thought to
avoiding overflow.  For instance, dist_int2 had better return something
wider than int2, and so on up.  It looks to me like the internal gist
distance functions also suffer overflow risks, in that they tend to form
the difference first (in the source datatype) and only afterwards cast
to float8.

3. I was surprised that there wasn't a distance implementation for
numeric.  I suppose that this might be difficult to do without risking
overflow in conversion to float8, though.

4. I didn't much care for changing the result type of gbt_num_consistent
from bool to float8; that's just messy, and I don't see any compensating
advantage.  I suggest you leave gbt_num_consistent and its callers
alone, and add a separate gbt_num_distance routine that only handles the
KNNDistance case.

There might be more issues, I haven't read the patch in detail.
But anyway, I'm going to set it to Waiting on Author.  I think it
needs at least a day or so's work, and I can't put in that kind of
time on it now.

regards, tom lane

diff --git a/contrib/btree_gist/btree_gist--1.0.sql b/contrib/btree_gist/btree_gist--1.0.sql
index 1ea5fa2db418d06f991f362a956fec07ccfba9e9..dd428995c185a09519ed65433081752b83b71bb7 100644
*** a/contrib/btree_gist/btree_gist--1.0.sql
--- b/contrib/btree_gist/btree_gist--1.0.sql
*** CREATE TYPE gbtreekey_var (
*** 81,86 
--- 81,231 
  	STORAGE = EXTENDED
  );

+ --distance operators
+ 
+ CREATE FUNCTION cash_dist(money, money)
+ RETURNS money
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C IMMUTABLE STRICT;
+ 
+ CREATE OPERATOR <-> (
+ 	LEFTARG = money,
+ 	RIGHTARG = money,
+ 	PROCEDURE = cash_dist,
+ 	COMMUTATOR = '<->'
+ );
+ 
+ CREATE FUNCTION date_dist(date, date)
+ RETURNS int4
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C IMMUTABLE STRICT;
+ 
+ CREATE OPERATOR <-> (
+ 	LEFTARG = date,
+ 	RIGHTARG = date,
+ 	PROCEDURE = date_dist,
+ 	COMMUTATOR = '<->'
+ );
+ 
+ CREATE FUNCTION float4_dist(float4, float4)
+ RETURNS float4
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C IMMUTABLE STRICT;
+ 
+ CREATE OPERATOR <-> (
+ 	LEFTARG = float4,
+ 	RIGHTARG = float4,
+ 	PROCEDURE = float4_dist,
+ 	COMMUTATOR = '<->'
+ );
+ 
+ CREATE FUNCTION float8_dist(float8, float8)
+ RETURNS float8
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C IMMUTABLE STRICT;
+ 
+ CREATE OPERATOR <-> (
+ 	LEFTARG = float8,
+ 	RIGHTARG = float8,
+ 	PROCEDURE = float8_dist,
+ 	COMMUTATOR = '<->'
+ );
+ 
+ CREATE FUNCTION int2_dist(int2, int2)
+ RETURNS int2
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C IMMUTABLE STRICT;
+ 
+ CREATE OPERATOR <-> (
+ 	LEFTARG = int2,
+ 	RIGHTARG = int2,
+ 	PROCEDURE = int2_dist,
+ 	COMMUTATOR = '<->'
+ );
+ 
+ CREATE FUNCTION int4_dist(int4, int4)
+ RETURNS int4
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C IMMUTABLE STRICT;
+ 
+ CREATE OPERATOR <-> (
+ 	LEFTARG = int4,
+ 	RIGHTARG = int4,
+ 	PROCEDURE = int4_dist,
+ 	COMMUTATOR = '<->'
+ );
+ 
+ CREATE FUNCTION int8_dist(int8, int8)
+ RETURNS int8
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C IMMUTABLE STRICT;
+ 
+ CREATE OPERATOR <-> (
+ 	LEFTARG = int8,
+ 	RIGHTARG = int8,
+ 	PROCEDURE = int8_dist,
+ 	COMMUTATOR = '<->'
+ );
+ 
+ CREATE FUNCTION interval_dist(interval, interval)
+ RETURNS interval
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C IMMUTABLE STRICT;
+ 
+ CREATE OPERATOR <-> (
+ 	LEFTARG = interval,
+ 	RIGHTARG = interval,
+ 	PROCEDURE = interval_dist,
+ 	COMMUTATOR = '<->'
+ );
+ 
+ CREATE FUNCTION oid_dist(oid, oid)
+ RETURNS oid
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C IMMUTABLE STRICT;
+ 
+ CREATE OPERATOR <-> (
+ 	LEFTARG = oid,
+ 	RIGHTARG = oid,
+ 	PROCEDURE = oid_dist,
+ 	COMMUTATOR = '<->'
+ );
+ 
+ CREATE FUNCTION time_dist(time, time)
+ RETURNS interval
+ AS 'MODULE_PATHNAME'
+ LANGUAGE C IMMUTABLE STRICT;
+ 
+ CREATE OPERATOR <-> (
+ 	LEFTARG = time,
+ 	RIGHTARG

Re: [HACKERS] knngist - 0.8

2010-12-28 Thread Teodor Sigaev


I've applied all of this, and written documentation for all of it,


Thank you a lot

except for the contrib/btree_gist additions which still need to be
redone for the revised API (and then documented!).  My patience ran out


Done, btree_gist is reworked for a new API.

I'm very sorry, but I'm rather busy now and will be accessible only after 
January, 10.



--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


builtin_knngist_contrib_btree_gist-0.9.gz
Description: Unix tar archive

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-12-28 Thread Tom Lane

Martijn van Oosterhout  writes:
> On Sun, Dec 26, 2010 at 08:13:40PM -0500, Tom Lane wrote:
>> [ thinks for a bit... ]  One reason for having a different structure
>> would be if we needed to represent abstract semantics for some operators
>> that couldn't be associated with a btree opclass.

> One thing that comes to mind is the operators used for hash indexes,
> namely the hash() function.

The hash opclasses handle that fine.  I cannot conceive of any reason
for shoehorning hash functions into btree opclasses.

> With respect to the collation of strings I have thought it useful to be
> able to define a sortkey() function, which would map the input space to
> a 8 byte integer and satisfies the rule:
> sortkey(a) < sortkey(b)  implies a < b

I'm pretty dubious about the workability of that one, but again, there
isn't any obvious reason why we'd need a new catalog structure to
support it.  If we did want it, it could be an optional support function
in btree opclasses.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-12-28 Thread Martijn van Oosterhout

On Sun, Dec 26, 2010 at 08:13:40PM -0500, Tom Lane wrote:
> [ thinks for a bit... ]  One reason for having a different structure
> would be if we needed to represent abstract semantics for some operators
> that couldn't be associated with a btree opclass.  This is clearly not
> an issue for what RANGE needs, since anything you can order by will
> surely have a btree opclass for that, and in fact we probably need to
> tie those operators to specific orderings if a datatype has more than
> one sort ordering.  But maybe there could be some other situation where
> we'd need to describe operator behavior for a datatype independently of
> any sort ordering.  Can anyone come up with a plausible example?

One thing that comes to mind is the operators used for hash indexes,
namely the hash() function. It is closely related to the collation but
isn't actually used for sorting. For every btree class you can make a
hash class with the same equality operator and an appropriate hash
function.

I've had the idea of defining a parent object and deriving the btree
and hash operator classes from that, but it gets messy once you get
into cross-type operators (i.e. operator families).

With respect to the collation of strings I have thought it useful to be
able to define a sortkey() function, which would map the input space to
a 8 byte integer and satisfies the rule:

sortkey(a) < sortkey(b)  implies a < b

The idea being that you can use this as an efficient first step to
speed up sorting strings, since adding it to the sort list implicitly
before the actual column doesn't change the result. In the case of
strings the sortkey() could be generated with strxfrm(). 

A similar idea could be used with other expensive comparison
operations, but I can't think of any at the moment. Actually, perhaps
you could use it with ints/floats/etc as well, since you could skip the
function call overhead. You'd be trading (n log n int compares + n
sortkeys) with (n log n comparisions).

Just some thoughts, 

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first. 
>   - Charles de Gaulle

signature.asc
Description: Digital signature

Re: [HACKERS] knngist - 0.8

2010-12-26 Thread Tom Lane

Hitoshi Harada  writes:
> 2010/12/27 Robert Haas :
>> As far as window functions go, we clearly need some kind of type
>> interface feature, but I am unclear whether we should sandwhich it
>> into the btree opclass machinery or whether it might be better to
>> create a whole separate concept just for this purpose.

> I believe we should use btree opclass machinery to represent
> add/subtract interfaces because in the RANGE case you eventually need
> add constant to value and compare it with some boundary by using
> btree. Or, btree operator set would be shared in the type interface
> machinery. We will want to avoid duplication between them.

The thing is, these operators have no arguable application in the
context of actual btree index searches.  We could certainly stick them
into pg_amop anyway, using a third amoppurpose category to distinguish
them from searchable or orderable operators.  But I share Robert's
discomfort with this --- it's unarguably an abuse of the original
intention of operator classes.

OTOH, operator classes are what we've got to represent the abstract
semantics of operators, and it's not real clear to me what we'd gain by
inventing an independent catalog structure to do more or less the same
sort of thing.  One good reason *not* to do that is it'll represent even
more stuff that has to be cached during backend startup.

[ thinks for a bit... ]  One reason for having a different structure
would be if we needed to represent abstract semantics for some operators
that couldn't be associated with a btree opclass.  This is clearly not
an issue for what RANGE needs, since anything you can order by will
surely have a btree opclass for that, and in fact we probably need to
tie those operators to specific orderings if a datatype has more than
one sort ordering.  But maybe there could be some other situation where
we'd need to describe operator behavior for a datatype independently of
any sort ordering.  Can anyone come up with a plausible example?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-12-26 Thread Hitoshi Harada

2010/12/27 Robert Haas :
> On Sun, Dec 26, 2010 at 2:41 PM, Tom Lane  wrote:
>> Hitoshi Harada  writes:
>>> Catching up tonight, I wonder I could propose to add ordering
>>> operators in btree, not in gist, for basic types.
>>
>> I thought about that for a bit while working on the knngist patch, but
>> couldn't really see any useful application.  In particular, I don't see
>> a way to shoehorn + and - in there as ordering operators.  They don't
>> match the structure.  The RANGE problem wants to add operators that are
>> somehow related to a btree's operators, but they're not related in the
>> way that knngist uses.
>
> As far as window functions go, we clearly need some kind of type
> interface feature, but I am unclear whether we should sandwhich it
> into the btree opclass machinery or whether it might be better to
> create a whole separate concept just for this purpose.  Range types
> might also like to have some of the same information (addition and
> subtraction operators for a type, and perhaps also the identity and
> unit if those exist).

I believe we should use btree opclass machinery to represent
add/subtract interfaces because in the RANGE case you eventually need
add constant to value and compare it with some boundary by using
btree. Or, btree operator set would be shared in the type interface
machinery. We will want to avoid duplication between them.

Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-12-26 Thread Robert Haas

On Sun, Dec 26, 2010 at 2:41 PM, Tom Lane  wrote:
> Hitoshi Harada  writes:
>> Catching up tonight, I wonder I could propose to add ordering
>> operators in btree, not in gist, for basic types.
>
> I thought about that for a bit while working on the knngist patch, but
> couldn't really see any useful application.  In particular, I don't see
> a way to shoehorn + and - in there as ordering operators.  They don't
> match the structure.  The RANGE problem wants to add operators that are
> somehow related to a btree's operators, but they're not related in the
> way that knngist uses.

It's superficially syntactically similar (value op constant) but as
you say it's really a different problem.  The KNNGIST case could occur
in combination with this case, too: ORDER BY (point-column +
zero-vector-constant) distance-from point-constant.  What's really
going on here is that there's a class of operations which can be
applied to an ORDER-BY column without actually changing the ordering -
adding or subtracting a constant, multiplying by a positive value,
concatenating with the empty string, etc.  In theory, we could try to
recognize such cases and avoid a sort, but it seems like a lot of work
in proportion to the likely benefit.

A far more valuable case to optimize would be the one where you have
ORDER BY a, b and can obtain input pre-sorted by a but not by a, b.
It'd be extremely useful to be able to take the data sorted by just a
and sort each group on b.

As far as window functions go, we clearly need some kind of type
interface feature, but I am unclear whether we should sandwhich it
into the btree opclass machinery or whether it might be better to
create a whole separate concept just for this purpose.  Range types
might also like to have some of the same information (addition and
subtraction operators for a type, and perhaps also the identity and
unit if those exist).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-12-26 Thread Tom Lane

Hitoshi Harada  writes:
> Catching up tonight, I wonder I could propose to add ordering
> operators in btree, not in gist, for basic types.

I thought about that for a bit while working on the knngist patch, but
couldn't really see any useful application.  In particular, I don't see
a way to shoehorn + and - in there as ordering operators.  They don't
match the structure.  The RANGE problem wants to add operators that are
somehow related to a btree's operators, but they're not related in the
way that knngist uses.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-12-26 Thread Hitoshi Harada

2010/12/4 Tom Lane :
> What we have at this point (pending contrib/btree_gist fixes) is
> nearest-neighbor searching capability for point columns.  And
> trigram-based nearest-neighbor for text strings, if you install
> contrib/pg_trgm.  That doesn't seem like a lot of return for the
> amount of work that went into it.  Are there plans to add KNN support
> for any other standard types?

Catching up tonight, I wonder I could propose to add ordering
operators in btree, not in gist, for basic types. So far, we couldn't
optimize simple examples like:

regression=# create index ti on t using btree(i);
CREATE INDEX
regression=# explain select * from t order by i - 10 limit 10;
 QUERY PLAN

 Limit  (cost=405.10..405.12 rows=10 width=24)
   ->  Sort  (cost=405.10..430.10 rows=1 width=24)
 Sort Key: ((i - 10))
 ->  Seq Scan on t  (cost=0.00..189.00 rows=1 width=24)
(4 rows)

While this looks too stupid at a glance, adding 2 strategies into
btree, "addition" and "subtraction", will help RANGE concept on data
types; we were stacked around how to implement RANGE in both of window
functions' frame and PARTITION because of lack of "addition" and
"subtraction" idea. If we have 2 more strategies in btree, they can be
solved IMHO. I know addition and subtraction is never relevant to
btree indexing but avoiding unnecessary seq scan and supporting RANGE
concept may buy somehow.

Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-12-22 Thread Robert Haas

On Wed, Dec 22, 2010 at 8:04 PM, Bruce Momjian  wrote:
> Tom Lane wrote:
>> Robert Haas  writes:
>> > 2010/9/13 Teodor Sigaev :
>> >> [updated patch]
>>
>> > I realize I'm repeating myself, but...  this patch needs
>> > documentation.  It's not optional.
>>
>> I've applied all of this, and written documentation for all of it,
>> except for the contrib/btree_gist additions which still need to be
>> redone for the revised API (and then documented!).  My patience ran out
>> somewhere around there, so I'm marking that part as returned with
>> feedback.
>>
>> What we have at this point (pending contrib/btree_gist fixes) is
>> nearest-neighbor searching capability for point columns.  And
>> trigram-based nearest-neighbor for text strings, if you install
>> contrib/pg_trgm.  That doesn't seem like a lot of return for the
>> amount of work that went into it.  Are there plans to add KNN support
>> for any other standard types?
>
> I was thinking /contrib/fuzzystrmatch could use it to find the words
> that mostly closely match a string.

Seems unlikely to be feasible.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-12-22 Thread Bruce Momjian

Tom Lane wrote:
> Robert Haas  writes:
> > 2010/9/13 Teodor Sigaev :
> >> [updated patch]
> 
> > I realize I'm repeating myself, but...  this patch needs
> > documentation.  It's not optional.
> 
> I've applied all of this, and written documentation for all of it,
> except for the contrib/btree_gist additions which still need to be
> redone for the revised API (and then documented!).  My patience ran out
> somewhere around there, so I'm marking that part as returned with
> feedback.
> 
> What we have at this point (pending contrib/btree_gist fixes) is
> nearest-neighbor searching capability for point columns.  And
> trigram-based nearest-neighbor for text strings, if you install
> contrib/pg_trgm.  That doesn't seem like a lot of return for the
> amount of work that went into it.  Are there plans to add KNN support
> for any other standard types?

I was thinking /contrib/fuzzystrmatch could use it to find the words
that mostly closely match a string.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-12-06 Thread Oleg Bartunov


On Sat, 4 Dec 2010, Greg Stark wrote:


On Sat, Dec 4, 2010 at 6:07 PM, Oleg Bartunov  wrote:

We'll sync contrib/btree_gist with current API. Also, we're thinking
about distance for ltree, boxes, circles. I'm not sure about polygons,
though.
So, we'll have rather complete set of knn-fied data types.


I kind of assumed the natural client for KNN-gist was the tsearch full
text search indexes handling sorting by relevance. For example if I
search for "Postgres DBA" I should find documents where those words
appear adjacent first and documents where the two words appear far
apart in the document sorted further down. Is that not on the list of
operators supported or planned to be supported?


We'll start thinking about this once we know how to store coordinate 
information in index :)


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-12-04 Thread Tom Lane

Dimitri Fontaine  writes:
> Greg Stark  writes:
>> I kind of assumed the natural client for KNN-gist was the tsearch full
>> text search indexes handling sorting by relevance. For example if I
>> search for "Postgres DBA" I should find documents where those words
>> appear adjacent first and documents where the two words appear far
>> apart in the document sorted further down. Is that not on the list of
>> operators supported or planned to be supported?

> From the presentation I've seen, the typical use case is more searching
> "PostgreSQL DBA" at 100 km around a known location. Or more typical yet,
> Pizza restaurants around a known placeÂ :)

Right offhand I don't see how KNNGIST could usefully be applied to the
problem Greg is thinking about.  A KNNGIST search is only going to be
fast if the target items can be found in a reasonably small part of the
index.  Nearest-neighbor in a geometrically organized index qualifies,
but I don't see how Greg's problem matches the structure of a tsearch
index.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-12-04 Thread Dimitri Fontaine

Greg Stark  writes:
> I kind of assumed the natural client for KNN-gist was the tsearch full
> text search indexes handling sorting by relevance. For example if I
> search for "Postgres DBA" I should find documents where those words
> appear adjacent first and documents where the two words appear far
> apart in the document sorted further down. Is that not on the list of
> operators supported or planned to be supported?

From the presentation I've seen, the typical use case is more searching
"PostgreSQL DBA" at 100 km around a known location. Or more typical yet,
Pizza restaurants around a known place :)

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-12-04 Thread Greg Stark

On Sat, Dec 4, 2010 at 6:07 PM, Oleg Bartunov  wrote:
> We'll sync contrib/btree_gist with current API. Also, we're thinking
> about distance for ltree, boxes, circles. I'm not sure about polygons,
> though.
> So, we'll have rather complete set of knn-fied data types.

I kind of assumed the natural client for KNN-gist was the tsearch full
text search indexes handling sorting by relevance. For example if I
search for "Postgres DBA" I should find documents where those words
appear adjacent first and documents where the two words appear far
apart in the document sorted further down. Is that not on the list of
operators supported or planned to be supported?

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-12-04 Thread Oleg Bartunov


Thanks to all for patience !

We'll sync contrib/btree_gist with current API. Also, we're thinking
about distance for ltree, boxes, circles. I'm not sure about polygons, though.
So, we'll have rather complete set of knn-fied data types.

Oleg
On Sat, 4 Dec 2010, Tom Lane wrote:


Robert Haas  writes:

2010/9/13 Teodor Sigaev :

[updated patch]



I realize I'm repeating myself, but...  this patch needs
documentation.  It's not optional.


I've applied all of this, and written documentation for all of it,
except for the contrib/btree_gist additions which still need to be
redone for the revised API (and then documented!).  My patience ran out
somewhere around there, so I'm marking that part as returned with
feedback.

What we have at this point (pending contrib/btree_gist fixes) is
nearest-neighbor searching capability for point columns.  And
trigram-based nearest-neighbor for text strings, if you install
contrib/pg_trgm.  That doesn't seem like a lot of return for the
amount of work that went into it.  Are there plans to add KNN support
for any other standard types?

regards, tom lane



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-12-03 Thread Tom Lane

Robert Haas  writes:
> 2010/9/13 Teodor Sigaev :
>> [updated patch]

> I realize I'm repeating myself, but...  this patch needs
> documentation.  It's not optional.

I've applied all of this, and written documentation for all of it,
except for the contrib/btree_gist additions which still need to be
redone for the revised API (and then documented!).  My patience ran out
somewhere around there, so I'm marking that part as returned with
feedback.

What we have at this point (pending contrib/btree_gist fixes) is
nearest-neighbor searching capability for point columns.  And
trigram-based nearest-neighbor for text strings, if you install
contrib/pg_trgm.  That doesn't seem like a lot of return for the
amount of work that went into it.  Are there plans to add KNN support
for any other standard types?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-11-23 Thread Robert Haas

On Tue, Nov 23, 2010 at 11:37 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Nov 23, 2010 at 11:12 AM, Tom Lane  wrote:
>>> It would be the first, because simply assigning another strategy number
>>> only satisfies one of the unique constraints on pg_amop.  To allow
>>> arbitrary flexibility here, we would have to include all components of
>>> the ordering specification in the unique constraint that's presently
>>> just (amopopr, amopfamily) and is proposed to become
>>> (amopopr, amopfamily, amoppurpose).  I think that's an undue amount of
>>> complexity to support something that's most likely physically impossible
>>> from the index's standpoint anyway.
>
>> Or, you'd need to pass these details separately to the AM, which seems
>> like a more more flexible way of doing it.
>
> A more flexible way of doing what?  The first requirement here is that
> the catalog entries provide sufficient information to determine the
> semantics.  You can't just say "this opclass supports ordering", you
> have to specify what that ordering is.  Punting to the index AM helps
> not at all, unless your proposal is to hard-wire this in GIST rather
> than in the core planner.

Eh, let's just do it the way you want to do it.  It's probably going
to have to be redone the next time somebody wants to make an
enhancement in this area, but I guess it's going to be easy to do that
then than to figure where to go with it now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-11-23 Thread Tom Lane

I wrote:
> We will probably *also* want to pass these details explicitly to the
> index AM, but that doesn't solve the problem that some catalog somewhere
> has to say what it is that the opclass can do.

... although having said that, the obvious question is why that catalog
has to be pg_amop.  Maybe we should leave pg_amop alone (so that it
represents only search operators) and invent a new catalog pg_amorderop.
I envision it having the same columns as pg_amop, plus an ordering
opclass OID (and maybe we might as well stick in asc/desc and nulls_first).
The reason to do this instead of just adding those columns to pg_amop
is that then we can have a different set of unique indexes.  I'm
thinking about (amopfamily, amoplefttype, amoprighttype, amopstrategy),
which would be the same as in pg_amop, plus
(amopopr, amopfamily, amopstrategy).  This would allow a single operator
to be registered under multiple strategy numbers, which presumably would
carry different payload sort-order-specification columns.

This still seems like overkill to me, because I don't actually believe
that it's practical for an index to support multiple sort orders.
But if anyone would like to make an argument why that's not pie in the
sky, this might be the way to represent it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-11-23 Thread Tom Lane

Robert Haas  writes:
> On Tue, Nov 23, 2010 at 11:12 AM, Tom Lane  wrote:
>> It would be the first, because simply assigning another strategy number
>> only satisfies one of the unique constraints on pg_amop.  To allow
>> arbitrary flexibility here, we would have to include all components of
>> the ordering specification in the unique constraint that's presently
>> just (amopopr, amopfamily) and is proposed to become
>> (amopopr, amopfamily, amoppurpose).  I think that's an undue amount of
>> complexity to support something that's most likely physically impossible
>> from the index's standpoint anyway.

> Or, you'd need to pass these details separately to the AM, which seems
> like a more more flexible way of doing it.

A more flexible way of doing what?  The first requirement here is that
the catalog entries provide sufficient information to determine the
semantics.  You can't just say "this opclass supports ordering", you
have to specify what that ordering is.  Punting to the index AM helps
not at all, unless your proposal is to hard-wire this in GIST rather
than in the core planner.

We will probably *also* want to pass these details explicitly to the
index AM, but that doesn't solve the problem that some catalog somewhere
has to say what it is that the opclass can do.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-11-23 Thread Robert Haas

On Tue, Nov 23, 2010 at 11:12 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Nov 22, 2010 at 11:05 PM, Tom Lane  wrote:
>>> I'm satisfied to say that only one sort order can be associated with a
>>> particular operator in a particular opclass, which is what would be
>>> implied by using AMOP_SEARCH/AMOP_ORDER as the unique key component.
>
>> Does that imply that KNNGIST would only be able to support one
>> ordering per AMOP_ORDER-operator, or does it imply that each such
>> ordering would require a separate strategy number?  The second might
>> be OK, but the first sounds bad.
>
> It would be the first, because simply assigning another strategy number
> only satisfies one of the unique constraints on pg_amop.  To allow
> arbitrary flexibility here, we would have to include all components of
> the ordering specification in the unique constraint that's presently
> just (amopopr, amopfamily) and is proposed to become
> (amopopr, amopfamily, amoppurpose).  I think that's an undue amount of
> complexity to support something that's most likely physically impossible
> from the index's standpoint anyway.

Or, you'd need to pass these details separately to the AM, which seems
like a more more flexible way of doing it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-11-23 Thread Tom Lane

Robert Haas  writes:
> On Mon, Nov 22, 2010 at 11:05 PM, Tom Lane  wrote:
>> I'm satisfied to say that only one sort order can be associated with a
>> particular operator in a particular opclass, which is what would be
>> implied by using AMOP_SEARCH/AMOP_ORDER as the unique key component.

> Does that imply that KNNGIST would only be able to support one
> ordering per AMOP_ORDER-operator, or does it imply that each such
> ordering would require a separate strategy number?  The second might
> be OK, but the first sounds bad.

It would be the first, because simply assigning another strategy number
only satisfies one of the unique constraints on pg_amop.  To allow
arbitrary flexibility here, we would have to include all components of
the ordering specification in the unique constraint that's presently
just (amopopr, amopfamily) and is proposed to become
(amopopr, amopfamily, amoppurpose).  I think that's an undue amount of
complexity to support something that's most likely physically impossible
from the index's standpoint anyway.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-11-23 Thread Robert Haas

On Mon, Nov 22, 2010 at 11:05 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Nov 22, 2010 at 8:32 PM, Tom Lane  wrote:
>>> On balance I'm inclined to leave the unique key as per previous proposal
>>> (with a "purpose" column) and add the which-sort-order-is-that
>>> information as payload columns that aren't part of the key.
>
>> This is probably OK too, although I confess I'm a lot less happy about
>> it now that you've pointed out the need for those payload columns.
>
> The reason I said "columns" is that I can foresee eventually wanting to
> specify a pathkey in its entirety --- opfamily, asc/desc, nulls_first,
> and whatever we come up with for collation.  We don't currently need to
> store more than the opfamily, since the others can never need to have
> non-default values given the current implementation of KNNGIST.  But the
> idea that they might all be there eventually makes me feel that we don't
> want to try to incorporate this data in pg_amop's unique key.  I'm
> satisfied to say that only one sort order can be associated with a
> particular operator in a particular opclass, which is what would be
> implied by using AMOP_SEARCH/AMOP_ORDER as the unique key component.

Does that imply that KNNGIST would only be able to support one
ordering per AMOP_ORDER-operator, or does it imply that each such
ordering would require a separate strategy number?  The second might
be OK, but the first sounds bad.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-11-22 Thread Tom Lane

Robert Haas  writes:
> On Mon, Nov 22, 2010 at 8:32 PM, Tom Lane  wrote:
>> On balance I'm inclined to leave the unique key as per previous proposal
>> (with a "purpose" column) and add the which-sort-order-is-that
>> information as payload columns that aren't part of the key.

> This is probably OK too, although I confess I'm a lot less happy about
> it now that you've pointed out the need for those payload columns.

The reason I said "columns" is that I can foresee eventually wanting to
specify a pathkey in its entirety --- opfamily, asc/desc, nulls_first,
and whatever we come up with for collation.  We don't currently need to
store more than the opfamily, since the others can never need to have
non-default values given the current implementation of KNNGIST.  But the
idea that they might all be there eventually makes me feel that we don't
want to try to incorporate this data in pg_amop's unique key.  I'm
satisfied to say that only one sort order can be associated with a
particular operator in a particular opclass, which is what would be
implied by using AMOP_SEARCH/AMOP_ORDER as the unique key component.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-11-22 Thread Robert Haas

On Mon, Nov 22, 2010 at 8:32 PM, Tom Lane  wrote:
> The reason I bring this up now is that it affects the decision as to
> what the unique key for pg_amop ought to be.  Instead of having an
> enum "purpose" column, maybe we should consider that the unique key
> is (operator oid, opfamily oid, order-by-oid), where order-by-oid
> is zero for a search operator and the OID of the btree opclass or sort
> operator for an ordering operator.  This would be of value if we
> imagined that a single opclass could support ordering by more than one
> distance ordering operator; which seems a bit far-fetched but perhaps
> not impossible.  On the other side of the coin it'd mean we aren't
> leaving room for other sorts of operator "purposes".

Since the need for additional purposes is mostly hypothetical, this
wouldn't bother me any.

> On balance I'm inclined to leave the unique key as per previous proposal
> (with a "purpose" column) and add the which-sort-order-is-that
> information as payload columns that aren't part of the key.

This is probably OK too, although I confess I'm a lot less happy about
it now that you've pointed out the need for those payload columns.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-11-22 Thread Tom Lane

I wrote:
> As far as the syntax of CREATE/ALTER OPERATOR CLASS/FAMILY is concerned,
> I like Robert's proposal (OPERATOR ... FOR ORDER or FOR SEARCH) better
> than Teodor's, though we don't need the multiple-purposes-per-entry
> aspect of it.  This is mainly because it looks more easily extensible
> if we ever do get around to having more purposes.

Although having said that ... I was poking around a bit more and noticed
how entirely bogus the current patch's matching of pathkeys is.  It pays
no attention at all to which pk_opfamily is specified, which means it
will fail given a query like
SELECT ... ORDER BY indexable_col <-> constant USING <<<
where <<< represents some sort order that has nothing to do with the
order that GIST is expecting.  The query will be considered to match the
index and then it will proceed to produce results in the wrong order.

We could perhaps kluge around that by insisting that the pathkey opclass
be the default btree opclass for the relevant datatype; but that's
fairly expensive to check for in match_pathkey_to_index, and it's
getting even further away from having a general-purpose solution.

I'm inclined to think that instead of just specifying "FOR ORDER",
the opclass definition ought to specify "FOR ORDER BY something",
where "something" could perhaps be a pre-existing btree opclass name.
Or maybe it could be a suitable sort operator, though then we'd have
to do a bit of reverse-engineering in match_pathkey_to_index, so that's
still no fun from a lookup-speed point of view.  In any case what we'd
be specifying is a sort order for the datatype that the opclass member
operator yields, not the indexed datatype (so in practice it'd usually
be float8_ops or float8 <, no doubt).

The reason I bring this up now is that it affects the decision as to
what the unique key for pg_amop ought to be.  Instead of having an
enum "purpose" column, maybe we should consider that the unique key
is (operator oid, opfamily oid, order-by-oid), where order-by-oid
is zero for a search operator and the OID of the btree opclass or sort
operator for an ordering operator.  This would be of value if we
imagined that a single opclass could support ordering by more than one
distance ordering operator; which seems a bit far-fetched but perhaps
not impossible.  On the other side of the coin it'd mean we aren't
leaving room for other sorts of operator "purposes".

On balance I'm inclined to leave the unique key as per previous proposal
(with a "purpose" column) and add the which-sort-order-is-that
information as payload columns that aren't part of the key.

Thoughts?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-11-22 Thread Tom Lane

I've done some *very* preliminary review of this patch now.  I think
that the way to move forward is to first commit the work surrounding
changes to pg_amop, including suitable changes to CREATE/ALTER OPERATOR
CLASS/FAMILY so that there's a way to put the new info into the table.
The system won't initially *do* anything useful with ordering-operator
entries, but this is necessary infrastructure before we can start to
do anything interesting.

After reviewing both Teodor's and Robert's patches for the issue,
I believe that Teodor has a better basic approach: if an operator
is useful for both searching and ordering, the way to handle that
is to make two pg_amop entries for it, with different strategy
numbers.  This means in particular that we can continue to require
strategy numbers to be unique within an opclass, so right offhand
I see no need to widen pg_amop_fam_strat_index to five columns.
(Which means we don't need that patch to allow five-key syscaches.)
What we need instead is to add the "purpose" column to the existing
unique index on (amopopr, amopfamily).  In English that means that
instead of allowing an operator to have only one entry per opfamily,
it can now have one entry per opfamily per purpose.

I do however concur with Robert that it's a bit silly to go to all this
work and not leave room in the design for additional operator "purposes"
later.  So rather than using a boolean amoporder column, I think we want
an enumerated-type column "amoppurpose".  Per our usual system catalog
conventions for "poor man's enums", I suggest declaring the column as
"char" with the two allowed values
AMOP_SEARCH 's'
AMOP_ORDER  'o'
Aside from leaving room for possible extension, I think that using
AMOP_SEARCH/AMOP_ORDER rather than true/false in the code will be more
readable and less error-prone.

As far as the syntax of CREATE/ALTER OPERATOR CLASS/FAMILY is concerned,
I like Robert's proposal (OPERATOR ... FOR ORDER or FOR SEARCH) better
than Teodor's, though we don't need the multiple-purposes-per-entry
aspect of it.  This is mainly because it looks more easily extensible
if we ever do get around to having more purposes.

There's a lot more to be done after this, but if there are no objections
to this much, I'll go merge the relevant parts of the two patches and
commit.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-11-21 Thread Tom Lane

Robert Haas  writes:
> That doesn't seem very hard on its face.  The trick is what to do with
> that information once you've got it.  As far as I can tell, you need
> to drill some kind of hole that lets you pass "additional details
> about the desired sort order" to the index AM.

We clearly need to add additional information to IndexScan plan nodes
to tell the index AM which sort order is required.  Up to now, an
indexscan has only had one possible resultant sort order (two if you
count backwards scan, but as I said I don't think generalizing that
particular feature is the way to approach this).  I would imagine
that the best way to handle that is to add a PathKey list or something
equivalent to it, and add that to the arguments passed to ambeginscan.

The other issue is how the planner can figure out what the possible
orderings are when it's considering an index.  You seem to be
contemplating adding a new index AM function that the planner would call
at the right point; but I'm not sure that that adds much of anything,
because the index AM can't have hard-wired behavior either.  We really
have to have enough information in the system catalog entries about an
opclass to allow the possible orderings to be determined.  Given that,
I think it makes more sense for the core planner to know what to do than
to put possibly duplicative code into multiple AMs.

I guess a third alternative would be to create per-opclass hook
functions for the planner to call, but I'm not thrilled with that
idea; it would still be largely duplicative code, and in a lot more
places.  I think it would also bind our hands with respect to making
internal planner changes in future, because the data structures
representing pathkeys would be pretty well locked down by such a choice.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-11-21 Thread Robert Haas

On Sun, Nov 21, 2010 at 5:24 PM, Tom Lane  wrote:
> I haven't spent any time on this patch yet (hope to start looking at it
> next week).  As for your specific question above, I don't have a big
> problem with reusing ScanKey this way, but I do agree that using
> RestrictInfo for this would be a crock.  ISTM what we ought to have is
> just the ability to match PathKeys with expressions of the form
> "indexedcol op constant" to an index.

That doesn't seem very hard on its face.  The trick is what to do with
that information once you've got it.  As far as I can tell, you need
to drill some kind of hole that lets you pass "additional details
about the desired sort order" to the index AM.  What I'd sort of like
to be able to do is throw the PathKeys at the index AM and say "you
want these?".  Short of that, we're probably going to have to resign
ourselves to the core code basically knowing exactly what the
capabilities of KNNGIST are, making the index API pretty porous - not
that it already isn't.  There's really nothing special about the
subset of the problem space KNNGIST happens to attack except that it
makes the GIS guys drool; the next problem someone wants to attack in
this area is as likely as not to look completely different.

> This example leaves me totally cold, not least because it assumes a
> specific storage implementation for nulls in an index.  It is also,
> I think, misunderstanding what ScanDirection is for.  That's only
> intended to allow executor plans to be run forward and then backed up
> in the same fashion that fetching backwards from a cursor would do;
> which is not a btree-specific concept, indeed not even index-specific.

Ah, OK.

> If there is sufficient interest in doing what you suggest, what we'd
> want to do is pass the PathKey representation to the index and let the
> index AM figure out what it has to do to produce that sort order.  But
> that is way way down my priority list.

Yeah, this is basically what I'm wondering whether we can reasonably
do for KNNGIST; with hopes of later reuse.  But it may be unworkable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-11-21 Thread Tom Lane

Robert Haas  writes:
> 2010/11/12 Teodor Sigaev :
>> My variants informs GiST by SK_ORDER flags and consistentFn looks at
>> strategy number (strategy numbers are different for different purposes).

> Yeah.  At ten thousand feet, I think the open design question here is
> to what extent it's OK to rely on the fact that the ORDER BY clauses
> we wish to optimize happen to look a lot like the WHERE clauses we
> already know how to optimize: namely, they're both binary opclauses of
> the form   .  Your patch manages to
> reuse a LOT of existing machinery by shoving ordering expressions
> through the same code paths that quals take.  Code reuse is generally
> a good thing, but here's we're forming RestrictInfo and ScanKey
> objects out of things that are neither restrictions nor keys, which
> might lead to maintainability problems down the road.  I'd like to get
> some input from Tom on how he feels about that, and any alternatives
> he sees.

I haven't spent any time on this patch yet (hope to start looking at it
next week).  As for your specific question above, I don't have a big
problem with reusing ScanKey this way, but I do agree that using
RestrictInfo for this would be a crock.  ISTM what we ought to have is
just the ability to match PathKeys with expressions of the form
"indexedcol op constant" to an index.

I'm undecided about the big-picture question of how much extra
generality ought to be put into the system along with this patch.
The argument not to is that with no candidate uses of additional
generality on the horizon, it's a waste of time to design something
more general, because we'll probably get it wrong anyway.  I'm not
a fan of designing APIs without use-cases in mind.  On the other hand,
there's an argument *for* doing something more general, which is
basically Polya's paradox: the more general problem may be easier to
solve.  To support that argument, we'd need a design that is clearly
cleaner than bolting KNNGIST on according to the current patch.
AIUI we don't have that at the moment, but I still think it's worth
spending a bit more time looking for one.

> It seems to me that our concept of ScanDirection is really woefully
> under-expressive.  For example, given:

> CREATE TABLE foo (
> id integer NOT NULL,
> name character varying NOT NULL,
> PRIMARY KEY (id)
> );

> We use the index for the first of these but not the second:

> select * from foo order by id nulls last;
> select * from foo order by id nulls first;

> In an ideal world, we'd like to handle the second one by finding the
> first non-NULL entry in the index, scanning away from the NULLs, and
> then, when we run off the end, looping back around to spit out the
> NULL entries.

This example leaves me totally cold, not least because it assumes a
specific storage implementation for nulls in an index.  It is also,
I think, misunderstanding what ScanDirection is for.  That's only
intended to allow executor plans to be run forward and then backed up
in the same fashion that fetching backwards from a cursor would do;
which is not a btree-specific concept, indeed not even index-specific.
If there is sufficient interest in doing what you suggest, what we'd
want to do is pass the PathKey representation to the index and let the
index AM figure out what it has to do to produce that sort order.  But
that is way way down my priority list.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-11-20 Thread Robert Haas

2010/11/12 Teodor Sigaev :
>> Slightly-more-fleshed out proof of concept patch attached, with actual
>> syntax, documentation, and pg_dump support added.  This might be
>> thought of as a subset of the builtin_knngist_core patch, without the
>> parts that make it actually do something useful (which is mostly
>> match_pathkey_to_index - which I'm still rather hoping to abstract in
>> some way via the access method interface, though I'm currently unsure
>> what the best way to do that is).
>
> I don't see in your patch how to propagate knowledge of kind of operation
> (AMOP_SEARCH or AMOP_ORDER) to GiST and consistent method. For both of them
> they aren't distinguishable. That's not acceptably for both, because GiST
> needs to choose right traversal algorithm, consistentFn needs role to decide
> return infinity or false (-1) value.

My version of the patch is suffering from a bad case of not being
done.  It sort of started out as a proof-of-concept to show what the
syntax might look like and seems to be gradually growing into
something more real.  I'm not sure what we'll end up with.

> My variants informs GiST by SK_ORDER flags and consistentFn looks at
> strategy number (strategy numbers are different for different purposes).

Yeah.  At ten thousand feet, I think the open design question here is
to what extent it's OK to rely on the fact that the ORDER BY clauses
we wish to optimize happen to look a lot like the WHERE clauses we
already know how to optimize: namely, they're both binary opclauses of
the form   .  Your patch manages to
reuse a LOT of existing machinery by shoving ordering expressions
through the same code paths that quals take.  Code reuse is generally
a good thing, but here's we're forming RestrictInfo and ScanKey
objects out of things that are neither restrictions nor keys, which
might lead to maintainability problems down the road.  I'd like to get
some input from Tom on how he feels about that, and any alternatives
he sees.

It seems to me that our concept of ScanDirection is really woefully
under-expressive.  For example, given:

CREATE TABLE foo (
id integer NOT NULL,
name character varying NOT NULL,
PRIMARY KEY (id)
);

We use the index for the first of these but not the second:

select * from foo order by id nulls last;
select * from foo order by id nulls first;

In an ideal world, we'd like to handle the second one by finding the
first non-NULL entry in the index, scanning away from the NULLs, and
then, when we run off the end, looping back around to spit out the
NULL entries.  Or, similarly, if we have an index on (id, name), we
can handle the first two of the following with the index, but not the
second two:

select * from foo order by id, name;
select * from foo order by id desc, name desc;
select * from foo order by id, name desc;
select * from foo order by id, name nulls last;
(can't handle this last one even if name is marked NOT NULL!)

It seems like it might even be possible to handle these (albeit maybe
not real efficiently) via a sufficiently complicated scanning order,
but even if we had the code to do the scan, we have no way of telling
the index what scan direction we want: forward, backward, and no
movement don't really cut it.  The idea that forward and backward mean
anything at all is really pretty btree-centric.

The trick, of course, is to come up with something better.  I have a
vague notion of using something like an array or list of objects of
the form .  That could easily be extended with collation or
whatever as needed.  The trick is - you need to create this object and
then ask the index - hey, is this something you can support?  And the
index needs to then respond by saying whether it can do that (and
maybe doing some sort of translation into what instructions should be
provided at execution time).

Trouble is, that sounds like a lot of work on code I'm not altogether
comfortable with, and I'd really like to get KNNGIST in shape for 9.1
if possible.  I'm not real sure where to draw the line between, on the
one hand, not wanting to commit something that is excessively crufty
and, on the other hand, wanting to finish in finite time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-11-12 Thread Teodor Sigaev


Slightly-more-fleshed out proof of concept patch attached, with actual
syntax, documentation, and pg_dump support added.  This might be
thought of as a subset of the builtin_knngist_core patch, without the
parts that make it actually do something useful (which is mostly
match_pathkey_to_index - which I'm still rather hoping to abstract in
some way via the access method interface, though I'm currently unsure
what the best way to do that is).


I don't see in your patch how to propagate knowledge of kind of operation 
(AMOP_SEARCH or AMOP_ORDER) to GiST and consistent method. For both of them they 
aren't distinguishable. That's not acceptably for both, because GiST needs to 
choose right traversal algorithm, consistentFn needs role to decide return 
infinity or false (-1) value.


My variants informs GiST by SK_ORDER flags and consistentFn looks at strategy 
number (strategy numbers are different for different purposes).




I notice that builtin_knngist_core checks whether the return type of
an ordering operator has a built-in btree opclass.  I'm not sure
Actually, GiST doesn't use that knowledge, check is done only to be sure that 
operation returns orderable data type.



whether we should bother checking that, because even if it's true I
don't think there's anything preventing it from becoming false later.
I think it's probably sufficient to just check this condition at plan
time and silently skip trying to build knn-type index paths if it's
not met.


It's already do it: you can not ORDER BY over non-orderable data type. That 
check just make diagnostic earlier.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-11-12 Thread Bruce Momjian


Robert, it is great you are taking this on.  This is really a well-known
area of the code for you, but not so much for Teodor and Oleg, so I am
sure they appreciate your assistance.

---

Robert Haas wrote:
> On Sat, Oct 16, 2010 at 9:54 PM, Robert Haas  wrote:
> >> Thinking about it that way, perhaps we could add an integer column
> >> amop_whats_it_good_for that gets used as a bit field. ?That wouldn't
> >> require changing the index structure, although it might break some
> >> other things.
> >
> > I gave this a shot (though I called it amoppurpose rather than
> > amop_whats_it_good_for) and I think it's a reasonable way to proceed.
> > Proof-of-concept patch attached. ?This just adds the column (using the
> > existing padding space), defines AMOP_SEARCH and AMOP_ORDER, and makes
> > just about everything ignore anything not marked AMOP_SEARCH,
> > attached. ?This would obviously need some more hacking to pay
> > attention to AMOP_ORDER where relevant, etc. and to create some actual
> > syntax around it. ?Currently CREATE OPERATOR CLASS / ALTER OPERATOR
> > FAMILY have this bit:
> >
> > OPERATOR strategy_number ( op_type [ , op_type ] )
> >
> > knngist-0.9 implements this:
> >
> > [ORDER] OPERATOR strategy_number ( op_type [, op_type ] )
> >
> > ...but with the design proposed above that's not quite what we'd want,
> > because amoppurpose is a bit field, so you could have one or both of
> > the two possible purposes. ?Perhaps:
> >
> > OPERATOR strategy_number ( op_type [ , op_type ] ) [ FOR { SEARCH |
> > ORDER } [, ...] ]
> >
> > With the default being FOR SEARCH.
> 
> Slightly-more-fleshed out proof of concept patch attached, with actual
> syntax, documentation, and pg_dump support added.  This might be
> thought of as a subset of the builtin_knngist_core patch, without the
> parts that make it actually do something useful (which is mostly
> match_pathkey_to_index - which I'm still rather hoping to abstract in
> some way via the access method interface, though I'm currently unsure
> what the best way to do that is).
> 
> I notice that builtin_knngist_core checks whether the return type of
> an ordering operator has a built-in btree opclass.  I'm not sure
> whether we should bother checking that, because even if it's true I
> don't think there's anything preventing it from becoming false later.
> I think it's probably sufficient to just check this condition at plan
> time and silently skip trying to build knn-type index paths if it's
> not met.
> 
> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

[ Attachment, skipping... ]

> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-24 Thread Robert Haas

2010/10/22 Teodor Sigaev :
>> You can define the additional argument as providing all of the extra
>> info about how the operator is being used, and, if it's being used for
>> ordering, the details of the requested order.  What is your thinking
>> on the matter?
>
> Maby be useful, but it seems to me to be a bit overengineering for now. GiST
> will not support knn-search with different than ASC NULLS LAST order in
> foresee future, because it stores NULLs in unmaintainable manner for
> different ordering: NULLs could be everywhere in tree. So, implementation of
> ASC NULLS FIRST requires to read whole tree to find all NULLs value first. I
> can imagine, how to reorganize tree to store NULLs together for
> single-column index, but not for multi-column index. Orders DESC NULLS
> LAST/FIRST requires to introduce two more "infinite" values: one to notion
> of distance more than +INF and another, non-negative, but less than zero
> distance.

I don't think it's overengineering, just because making core changes
to the planner is such a pain in the neck that we don't want to have
to come back and do it again any too soon, or at least we'd like them
to be as minor as possible.  It's better to have one API break and be
done with it than maybe have to come back and do a second round of
code changes.  And from what I can see it's not going to be that ugly.
 I'm trying to drum up some time to hack on it...

> Again, it could be useful for modification of GiST known as ordered GiST.
> But ordered GiST uses completely different tree traversal algorithm for
> search and insert, and it hasn't any benefits comparing with B-Tree and
> GiST. It's looks like an autogiro which successfully combines disadvantages
> of airplanes and helicopters :)

Heh.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-22 Thread Teodor Sigaev


You can define the additional argument as providing all of the extra
info about how the operator is being used, and, if it's being used for
ordering, the details of the requested order.  What is your thinking
on the matter?


Maby be useful, but it seems to me to be a bit overengineering for now. GiST 
will not support knn-search with different than ASC NULLS LAST order in foresee 
future, because it stores NULLs in unmaintainable manner for different ordering: 
NULLs could be everywhere in tree. So, implementation of ASC NULLS FIRST 
requires to read whole tree to find all NULLs value first. I can imagine, how to 
reorganize tree to store NULLs together for single-column index, but not for 
multi-column index. Orders DESC NULLS LAST/FIRST requires to introduce two more 
"infinite" values: one to notion of distance more than +INF and another, 
non-negative, but less than zero distance.


Again, it could be useful for modification of GiST known as ordered GiST. But 
ordered GiST uses completely different tree traversal algorithm for search and 
insert, and it hasn't any benefits comparing with B-Tree and GiST. It's looks 
like an autogiro which successfully combines disadvantages of airplanes and 
helicopters :)

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-19 Thread Robert Haas

2010/10/19 Teodor Sigaev :
>>> Thinking about it that way, perhaps we could add an integer column
>>> amop_whats_it_good_for that gets used as a bit field.  That wouldn't
>>> require changing the index structure, although it might break some
>>> other things.
>
>> OPERATOR strategy_number ( op_type [ , op_type ] ) [ FOR { SEARCH |
>> ORDER } [, ...] ]
>
> It's very desirable thing to be able to distinguish roles in consistent
> method of GiST: computation of distance could be very expensive and. The
> single way to provide it in current GiST interface is a strategy number. Of
> course, we could add 6-th argument to consistent to point role, but I don't
> think that's good decision.

To me, adding an additional argument (or maybe providing a whole
separate support function, separate from consistent) seems quite
natural, because now you have a way to pass all the other little bits
that might matter... ASC/DESC, NULLS FIRST/LAST, collation OID, etc.
You can define the additional argument as providing all of the extra
info about how the operator is being used, and, if it's being used for
ordering, the details of the requested order.  What is your thinking
on the matter?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-19 Thread Teodor Sigaev


combination of parameters is one that it can handle.  I'm thinking
perhaps in lieu of a boolean, we can add another indexam method which,
if not InvalidOid, gets called when we're wondering about whether a
given clause is something that the index can order by.  Although
knngist focuses on a the ORDER BY col OP constant case, which


Hmm, interesting idea. To be more general, amcanorder (without byop suffix) 
could be eliminated too.

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-19 Thread Teodor Sigaev


Thinking about it that way, perhaps we could add an integer column
amop_whats_it_good_for that gets used as a bit field.  That wouldn't
require changing the index structure, although it might break some
other things.



OPERATOR strategy_number ( op_type [ , op_type ] ) [ FOR { SEARCH |
ORDER } [, ...] ]


It's very desirable thing to be able to distinguish roles in consistent method 
of GiST: computation of distance could be very expensive and. The single way to 
provide it in current GiST interface is a strategy number. Of course, we could 
add 6-th argument to consistent to point role, but I don't think that's good 
decision.



--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-19 Thread Teodor Sigaev


3) 3-rd boolean column (amopopr, amopfamily, amoporder)
- could be two records per operator
+ operator could be used in both roles
+ strategy number could be different for different roles



How can #3 work at all?  It's ignoring a couple of critical index
columns.  In particular, I believe the sticking point here is this
unique index:

 "pg_amop_fam_strat_index" UNIQUE, btree (amopfamily, amoplefttype, 
amoprighttype, amopstrategy)

I believe, that columns (amoplefttype, amoprighttype) are fixed for operation 
and could not be changed. So, two operations in one opfamily should be differ in 
strategy number for different roles. It also gives a direct way to transfer 
knowledge abot role to consistent method of GiST.



I'm not terribly thrilled with using a boolean here in any case.
Now that we have two "roles" an operator might play in an opclass,
who's to say there might not be more roles someday?  We should use
a column type that will support more than two roles without basic
rejiggering.


It's easy to change to char type, for now only 's'search and 'o'order characters 
will be allowed.



BTW, have we discussed the idea of embedding the role in the strategy
number?  For example, require regular operators to have strategy


In this schema, one operator could not be used in more than one role. Right now 
it's not strict limitation, because the we still don't have an example of 
boolean distance.


Anyhow, it needs to distinguish roles while IndexPath is built. So,
op_in_opfamily/get_op_opfamily_strategy/get_op_opfamily_properties and friends 
will accept extra argument pointing to interested role.

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-19 Thread David Fetter

On Mon, Oct 18, 2010 at 04:13:04PM +0100, Mark Cave-Ayland wrote:
> David Fetter wrote:
> 
> >>For my vote, I'd prefer either the Oid of a custom type or an array
> >>of Oid, Datum pairs - i.e. something we can extend in the future if
> >>required.
> >
> >This sounds a lot like a foreign key to another table.  Are you not
> >proposing doing that because of performance considerations?
> >
> >Cheers,
> >David.
> 
> Well, in PostGIS a typmod contains 3 pieces of information:
> 
> 1) the SRID
> 2) the dimension
> 3) the geometry type
> 
> The SRID is technically already a foreign key into another table,
> with dimension and SRID as other information. At the moment, we
> bit-pack the dimension and geometry type into the SRID and use that
> as the typmod but this only leaves 21 bits IIRC for the SRID. The
> additional complication is that SRIDs at the higher end of the range
> are allowed for anyone to use, and so people may have their own
> customised spheroids defined in this region of the table.

Sounds like you need space for all of these separately, and once you
have that, you'll probably start thinking about other possible pieces
of information you could store, especially as you start to store new
data types and have new operators on them.

It sounds to me as thought the typemod, or something like it, needs to
be able to store, in general, completely arbitrary information,
although not likely much over a page or two of memory.  Cf.  Robert
Haas's recent comment about Klingon.  While we could make this a
completely opaque structure from the SQL level, I'm not sure it's a
good idea.

> If we had a foreign key into another table, we'd need to ensure that
> no one could tamper with it as otherwise all chaos would break lose,
> e.g. breaking the geometry type constraint on a column.

Our system catalog tables have the nice property that no one can
accidentally modify them by hand, and "all chaos," as you put it, can
reasonably be expected to occur should they choose to do so.

> Heck, we even have people deleting the geometry_columns table
> sometimes because they are not aware of what it does.  By storing
> this information in the PG catalog then this can't happen, plus the
> information is available easily in Form_pg_attribute without having
> to implement our own cache, with its own related problems such as
> how/when to invalidate etc.

:)

> There is also a chance that we'd want to include additional
> information in the future related to geometry validity, for example,
> which would mean further reducing the range allowed within the
> spatial_ref_sys table in its existing form.

Right.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-18 Thread Tom Lane

Robert Haas  writes:
> On Mon, Oct 18, 2010 at 3:40 PM, Peter Eisentraut  wrote:
>> Frankly, I have no idea what knngist has to do with any of this, as I
>> haven't followed that topic at all.

> Me neither, except for the fact that the PostGIS guys are interested
> in both things.  I think this thread has been hijacked OT.

We got there by discussing how ordering information could be passed
around for knngist operators.  Robert said something about side
channels, and I mentioned that maybe collation information should
pass through a side channel as well instead of trying to make it work
like typmods, and then it drifted to rethinking typmods as long as we
were thinking about having to touch everyplace that deals in typmods.
I agree the thread is far OT --- most of the recent messages belong
in discussion of the collation patch more than they do with knngist.
But there is some overlap if you think about whether these problems
could be solved together.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-18 Thread Robert Haas

On Mon, Oct 18, 2010 at 3:40 PM, Peter Eisentraut  wrote:
> On mån, 2010-10-18 at 15:36 -0400, Robert Haas wrote:
>> On Mon, Oct 18, 2010 at 3:33 PM, Peter Eisentraut  wrote:
>> > On mån, 2010-10-18 at 11:41 +0100, Mark Cave-Ayland wrote:
>> >> Paul Ramsey wrote:
>> >>
>> >> >> So what kind of data structure would you like for a typmod?
>> >> >
>> >> > I'm a primitive enough beast that just having 64-bits would make me
>> >> > happy. As a general matter though, a bytea?
>> >> >
>> >> > P
>> >>
>> >> For my vote, I'd prefer either the Oid of a custom type or an array of
>> >> Oid, Datum pairs - i.e. something we can extend in the future if required.
>> >
>> > I think if we really wanted to design this generally, we'd give a type
>> > function arguments.  So, numeric would get (int default = 0, int default
>> > = 0).  That can easily get very complicated, of course.
>> >
>> > In any case, for the shorter term, it's clear that refactoring the
>> > passing around of type + typmod would help this endeavor, so I'm going
>> > to give it a try.
>>
>> By "this endeavor" do you mean KNNGIST, or per-column collation?
>
> No, I was referring to the talk about a different/better/more flexible
> typmod.  Per-column collation could also benefit, as you have observed.
> Frankly, I have no idea what knngist has to do with any of this, as I
> haven't followed that topic at all.

Me neither, except for the fact that the PostGIS guys are interested
in both things.  I think this thread has been hijacked OT.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-18 Thread Peter Eisentraut

On mån, 2010-10-18 at 15:36 -0400, Robert Haas wrote:
> On Mon, Oct 18, 2010 at 3:33 PM, Peter Eisentraut  wrote:
> > On mån, 2010-10-18 at 11:41 +0100, Mark Cave-Ayland wrote:
> >> Paul Ramsey wrote:
> >>
> >> >> So what kind of data structure would you like for a typmod?
> >> >
> >> > I'm a primitive enough beast that just having 64-bits would make me
> >> > happy. As a general matter though, a bytea?
> >> >
> >> > P
> >>
> >> For my vote, I'd prefer either the Oid of a custom type or an array of
> >> Oid, Datum pairs - i.e. something we can extend in the future if required.
> >
> > I think if we really wanted to design this generally, we'd give a type
> > function arguments.  So, numeric would get (int default = 0, int default
> > = 0).  That can easily get very complicated, of course.
> >
> > In any case, for the shorter term, it's clear that refactoring the
> > passing around of type + typmod would help this endeavor, so I'm going
> > to give it a try.
> 
> By "this endeavor" do you mean KNNGIST, or per-column collation?

No, I was referring to the talk about a different/better/more flexible
typmod.  Per-column collation could also benefit, as you have observed.
Frankly, I have no idea what knngist has to do with any of this, as I
haven't followed that topic at all.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-18 Thread Robert Haas

On Mon, Oct 18, 2010 at 3:33 PM, Peter Eisentraut  wrote:
> On mån, 2010-10-18 at 11:41 +0100, Mark Cave-Ayland wrote:
>> Paul Ramsey wrote:
>>
>> >> So what kind of data structure would you like for a typmod?
>> >
>> > I'm a primitive enough beast that just having 64-bits would make me
>> > happy. As a general matter though, a bytea?
>> >
>> > P
>>
>> For my vote, I'd prefer either the Oid of a custom type or an array of
>> Oid, Datum pairs - i.e. something we can extend in the future if required.
>
> I think if we really wanted to design this generally, we'd give a type
> function arguments.  So, numeric would get (int default = 0, int default
> = 0).  That can easily get very complicated, of course.
>
> In any case, for the shorter term, it's clear that refactoring the
> passing around of type + typmod would help this endeavor, so I'm going
> to give it a try.

By "this endeavor" do you mean KNNGIST, or per-column collation?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-18 Thread Peter Eisentraut

On mån, 2010-10-18 at 11:41 +0100, Mark Cave-Ayland wrote:
> Paul Ramsey wrote:
> 
> >> So what kind of data structure would you like for a typmod?
> > 
> > I'm a primitive enough beast that just having 64-bits would make me
> > happy. As a general matter though, a bytea?
> > 
> > P
> 
> For my vote, I'd prefer either the Oid of a custom type or an array of 
> Oid, Datum pairs - i.e. something we can extend in the future if required.

I think if we really wanted to design this generally, we'd give a type
function arguments.  So, numeric would get (int default = 0, int default
= 0).  That can easily get very complicated, of course.

In any case, for the shorter term, it's clear that refactoring the
passing around of type + typmod would help this endeavor, so I'm going
to give it a try. 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-18 Thread Mark Cave-Ayland


David Fetter wrote:


For my vote, I'd prefer either the Oid of a custom type or an array
of Oid, Datum pairs - i.e. something we can extend in the future if
required.


This sounds a lot like a foreign key to another table.  Are you not
proposing doing that because of performance considerations?

Cheers,
David.


Well, in PostGIS a typmod contains 3 pieces of information:

1) the SRID
2) the dimension
3) the geometry type

The SRID is technically already a foreign key into another table, with 
dimension and SRID as other information. At the moment, we bit-pack the 
dimension and geometry type into the SRID and use that as the typmod but 
this only leaves 21 bits IIRC for the SRID. The additional complication 
is that SRIDs at the higher end of the range are allowed for anyone to 
use, and so people may have their own customised spheroids defined in 
this region of the table.


If we had a foreign key into another table, we'd need to ensure that no 
one could tamper with it as otherwise all chaos would break lose, e.g. 
breaking the geometry type constraint on a column. Heck, we even have 
people deleting the geometry_columns table sometimes because they are 
not aware of what it does. By storing this information in the PG catalog 
then this can't happen, plus the information is available easily in 
Form_pg_attribute without having to implement our own cache, with its 
own related problems such as how/when to invalidate etc.


There is also a chance that we'd want to include additional information 
in the future related to geometry validity, for example, which would 
mean further reducing the range allowed within the spatial_ref_sys table 
in its existing form.



ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-18 Thread David Fetter

On Mon, Oct 18, 2010 at 11:41:06AM +0100, Mark Cave-Ayland wrote:
> Paul Ramsey wrote:
> 
> >>So what kind of data structure would you like for a typmod?
> >
> >I'm a primitive enough beast that just having 64-bits would make me
> >happy. As a general matter though, a bytea?
> >
> >P
> 
> For my vote, I'd prefer either the Oid of a custom type or an array
> of Oid, Datum pairs - i.e. something we can extend in the future if
> required.

This sounds a lot like a foreign key to another table.  Are you not
proposing doing that because of performance considerations?

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-18 Thread Mark Cave-Ayland


Paul Ramsey wrote:


So what kind of data structure would you like for a typmod?


I'm a primitive enough beast that just having 64-bits would make me
happy. As a general matter though, a bytea?

P


For my vote, I'd prefer either the Oid of a custom type or an array of 
Oid, Datum pairs - i.e. something we can extend in the future if required.


ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-16 Thread Robert Haas

On Fri, Oct 15, 2010 at 7:10 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I still feel vaguely uneasy about the fact that the proposed patch
>> can't handle ASC/DESC or NULLS FIRST/LAST, and that unease grew a bit
>> more last night when I read Peter's patch to add collation support.
>
> Good point.
>
>> We could possibly cram ASC/DESC and NULLS FIRST/LAST in by defining
>> four new categories of operator strategies rather than one, but
>> there's no way that's going to work for collations.  Is there some
>> other way to approach this problem?  Can we leave pg_amop as it is,
>> and pass the context (sort vs. qual, ASC/DESC, NULLS FIRST/LAST,
>> collation, whatever...) to the index via some sort of side channel?
>
> Well, we cannot avoid changing pg_amop, or at least changing its
> interpretation, because the current scheme simply can't represent
> indexable operators that are used for anything except simple boolean
> WHERE tests.  I agree though that we do *not* want pg_amop involved
> in the details of exactly what sort ordering is referenced by a sortable
> operator.  Somehow that needs to be passed in a side channel.

I spent some time tonight looking at how this is handled in
builtin_knngist_core-0.9.  It looks like the requested sort, if there
is one, just gets tossed into the indexquals list and eventually
transformed into a scankey with a new flag SK_ORDER set.  I think what
we should do instead is add an additional IndexPath field which can
point to a node indicating the desired ordering properties.  This node
might look similar to a scan key but having it be a separate node type
will allow us to add in whatever, ahem, miscellaneous crap we want to
pass: currently ASC/DESC and NULLS FIRST/LAST, and eventually
collation and so forth.  That can then get propagated into the
IndexScan and passed to index_beginscan, where it can be passed to the
AM's ambeginscan function (i.e. we'll add an additional argument to
the signatures of those functions).

There's a second problem, too.  If we assume that we want to treat
this problem with some degree of generality - that is, we really do
care about things like ASC/DESC and NULLS FIRST/LAST and eventually
collation - then the proposed amcanorderbyop flag isn't really
sufficient.  The AM will need to be able to indicate whether a given
combination of parameters is one that it can handle.  I'm thinking
perhaps in lieu of a boolean, we can add another indexam method which,
if not InvalidOid, gets called when we're wondering about whether a
given clause is something that the index can order by.  Although
knngist focuses on a the ORDER BY col OP constant case, which
certainly seems like the most useful one, there's no theoretical
reason why an AM couldn't allow ordering by some more complex clause.
I don't know whether anyone will ever feel like writing code to do
something like that, but if we set the API up this way then it should
be at least theoretically possible.

Comments?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-16 Thread Robert Haas

On Fri, Oct 15, 2010 at 9:39 PM, Robert Haas  wrote:
> On Fri, Oct 15, 2010 at 8:45 PM, Robert Haas  wrote:
>> On Fri, Oct 15, 2010 at 8:10 PM, Tom Lane  wrote:
 Perhaps we should think of pg_amop not so much
 as a way to tell the AM what to do, but just a way to tell it what
 operator is logically involved without relying on the name or OID.
>>>
>>> I already think of it that way ...
>>
>> OK.
>
> Thinking about it that way, perhaps we could add an integer column
> amop_whats_it_good_for that gets used as a bit field.  That wouldn't
> require changing the index structure, although it might break some
> other things.

I gave this a shot (though I called it amoppurpose rather than
amop_whats_it_good_for) and I think it's a reasonable way to proceed.
Proof-of-concept patch attached.  This just adds the column (using the
existing padding space), defines AMOP_SEARCH and AMOP_ORDER, and makes
just about everything ignore anything not marked AMOP_SEARCH,
attached.  This would obviously need some more hacking to pay
attention to AMOP_ORDER where relevant, etc. and to create some actual
syntax around it.  Currently CREATE OPERATOR CLASS / ALTER OPERATOR
FAMILY have this bit:

OPERATOR strategy_number ( op_type [ , op_type ] )

knngist-0.9 implements this:

[ORDER] OPERATOR strategy_number ( op_type [, op_type ] )

...but with the design proposed above that's not quite what we'd want,
because amoppurpose is a bit field, so you could have one or both of
the two possible purposes.  Perhaps:

OPERATOR strategy_number ( op_type [ , op_type ] ) [ FOR { SEARCH |
ORDER } [, ...] ]

With the default being FOR SEARCH.

Comments?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-16 Thread Robert Haas

On Sat, Oct 16, 2010 at 6:13 PM, Paul Ramsey  wrote:
> On Sat, Oct 16, 2010 at 10:17 AM, Peter Eisentraut  wrote:
>> On lör, 2010-10-16 at 09:23 -0700, Paul Ramsey wrote:
>>> >>   (And, if we are going to break everything
>>> > in sight, now would be a good time to think about changing typmod to
>>> > something more flexible than one int32.)
>>>
>>> As someone who is jamming geometry type, spatial reference number and
>>> dimensionality into said 32bit typmod, let me say emphatically ...
>>> Amen!
>>
>> So what kind of data structure would you like for a typmod?
>
> I'm a primitive enough beast that just having 64-bits would make me
> happy. As a general matter though, a bytea?

Yeah.  It strikes me that there are three main kinds of things people
might want to represent:

1. An integer.  e.g. for a numeric, precision or scale; for a varchar,
length; for an array, number of dimensions.
2. An OID.  e.g. for varchar or text, a collation OID.
3. Recursive structure.  So you might have an array (which is
one-dimensional) containing strings (which are limited to 80
characters and collated in Klingon).  You want to hold onto all of
those details somehow.

There might be use cases for even crazier things - like packing all
the field names and types for a record object in there...  but maybe
that's too crazy to be workable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-16 Thread Paul Ramsey

On Sat, Oct 16, 2010 at 10:17 AM, Peter Eisentraut  wrote:
> On lör, 2010-10-16 at 09:23 -0700, Paul Ramsey wrote:
>> >>   (And, if we are going to break everything
>> > in sight, now would be a good time to think about changing typmod to
>> > something more flexible than one int32.)
>>
>> As someone who is jamming geometry type, spatial reference number and
>> dimensionality into said 32bit typmod, let me say emphatically ...
>> Amen!
>
> So what kind of data structure would you like for a typmod?

I'm a primitive enough beast that just having 64-bits would make me
happy. As a general matter though, a bytea?

P

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-16 Thread Peter Eisentraut

On lör, 2010-10-16 at 09:23 -0700, Paul Ramsey wrote:
> >>   (And, if we are going to break everything
> > in sight, now would be a good time to think about changing typmod to
> > something more flexible than one int32.)
> 
> As someone who is jamming geometry type, spatial reference number and
> dimensionality into said 32bit typmod, let me say emphatically ...
> Amen!

So what kind of data structure would you like for a typmod?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-16 Thread Paul Ramsey


>>   (And, if we are going to break everything
> in sight, now would be a good time to think about changing typmod to
> something more flexible than one int32.)

As someone who is jamming geometry type, spatial reference number and 
dimensionality into said 32bit typmod, let me say emphatically ... Amen!

P

Re: [HACKERS] knngist - 0.8

2010-10-16 Thread Martijn van Oosterhout

On Fri, Oct 15, 2010 at 08:45:26PM -0400, Robert Haas wrote:
> But I'm also not sure how far this gets us with KNNGIST, where the
> issue is not the typmods but the auxilliary information about the
> context of the sort and/or whether this is a sort or qual.

ISTM there are two issues here. With Btree you have the issue where the
index is of a particular collation and at planning time you're given a
collation and either it's compatable or it's not.

With Gist you have the situation where index isn't of a specific
collation but it can be used in different ways to get different
collection as output. (I'm here thinking of that Gist can in some
circumstance be asked to return tuples in a certain order.)

Actually, NULLS FIRST/LAST, ASC/DESC are just variations on collations
and my original patch many years back for each locale made *four*
collation IDs, one for each of the combinations of the above. Not
terribly scalable, but I never did get the planning code to work. :)

What you're going for here I think is a sort of cross-collation
operators, or something, that tells the planner how to get the
particular collation out of the given index. So you have something
like:

Input: en_US, NULLS FIRST, ASC
Output: en_US, NULLS_LAST, DESC
-> Reverse index scan

Input: en_US, NULLS FIRST, ASC
Output: en_US, NULLS LAST, ASC
-> Index scan, add x IS NOT NULL test
-> Append output of x IS NULL.

Input: Gist
Output: KnnGist
-> Index scan with scan type 1

Input: Gist
Output: Gist
-> Index scan with scan type 0

Obviously this doesn't solve the problem of being able to represent the
required collation in the first place, and if this is at all compatable
with what the SQL standard calls a collation. I just don't think you
can make hard and fast rules about how to make this work and that
perhaps we should be looking for a way to push that to the index
implementation code, with the default rule being: same collection yes,
different no.

Just some thoughts,

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first. 
>   - Charles de Gaulle

signature.asc
Description: Digital signature

Re: [HACKERS] knngist - 0.8

2010-10-15 Thread Robert Haas

On Fri, Oct 15, 2010 at 8:45 PM, Robert Haas  wrote:
> On Fri, Oct 15, 2010 at 8:10 PM, Tom Lane  wrote:
>>> Perhaps we should think of pg_amop not so much
>>> as a way to tell the AM what to do, but just a way to tell it what
>>> operator is logically involved without relying on the name or OID.
>>
>> I already think of it that way ...
>
> OK.

Thinking about it that way, perhaps we could add an integer column
amop_whats_it_good_for that gets used as a bit field.  That wouldn't
require changing the index structure, although it might break some
other things.

The core KNNGIST patch is actually quite small, actually.  Excluding a
lot of not-very-interesting changes to pg_amop.h, it's just over 300
lines of new code, about half of which is in indxpath.c.  If we could
get this issue of how to structure the catalogs resolved, it seems to
me that we might be able to see our way to committing that part of
this work fairly quickly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-15 Thread Robert Haas

On Fri, Oct 15, 2010 at 8:10 PM, Tom Lane  wrote:
>> Perhaps we should think of pg_amop not so much
>> as a way to tell the AM what to do, but just a way to tell it what
>> operator is logically involved without relying on the name or OID.
>
> I already think of it that way ...

OK.

>>> Maybe we should think in terms of a side channel for Peter's patch
>>> as well.  I share your feeling that trying to propagate collation
>>> the same way we now propagate typmod is a recipe for serious pain.
>
>> I'm not sure what you're thinking of here.
>
> I'm not either.  I'm dissatisfied with the direction he's heading
> because of the amount of code it's going to break, but I don't have a
> better idea.  It may well be impossible to have this feature without
> breaking everything in sight.  (And, if we are going to break everything
> in sight, now would be a good time to think about changing typmod to
> something more flexible than one int32.)

I assume I don't need to tell you my vote on that.

But I'm also not sure how far this gets us with KNNGIST, where the
issue is not the typmods but the auxilliary information about the
context of the sort and/or whether this is a sort or qual.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-15 Thread Tom Lane

Robert Haas  writes:
> On Fri, Oct 15, 2010 at 7:10 PM, Tom Lane  wrote:
>> Well, we cannot avoid changing pg_amop, or at least changing its
>> interpretation, because the current scheme simply can't represent
>> indexable operators that are used for anything except simple boolean
>> WHERE tests.

> What exactly do you mean by that?

> It has always seemed to me that the operator class mechanism is a
> complicated abstraction mechanism that actually adds only a very small
> amount of abstraction.  Instead of referring to operators by name or
> OID we refer to them by a number that maps onto a name or OID.

Well, the amount of abstraction might be minimal, but the point is to be
able to understand which operators are related to an index and exactly
what the relationship is.  There might be a better way to represent
"this operator acts as <= for btree int4 indexes" than "this operator
has strategy number 2 for btree int4 indexes", but it doesn't seem to me
that there's a lot of win available there.  C code certainly finds it
more convenient to work with numbers than names, so I'm not excited
about replacing the strategy numbers with strategy names, if that's what
you're thinking.

> Perhaps we should think of pg_amop not so much
> as a way to tell the AM what to do, but just a way to tell it what
> operator is logically involved without relying on the name or OID.

I already think of it that way ...

>> Maybe we should think in terms of a side channel for Peter's patch
>> as well.  I share your feeling that trying to propagate collation
>> the same way we now propagate typmod is a recipe for serious pain.

> I'm not sure what you're thinking of here.

I'm not either.  I'm dissatisfied with the direction he's heading
because of the amount of code it's going to break, but I don't have a
better idea.  It may well be impossible to have this feature without
breaking everything in sight.  (And, if we are going to break everything
in sight, now would be a good time to think about changing typmod to
something more flexible than one int32.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-15 Thread Robert Haas

On Fri, Oct 15, 2010 at 7:10 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I still feel vaguely uneasy about the fact that the proposed patch
>> can't handle ASC/DESC or NULLS FIRST/LAST, and that unease grew a bit
>> more last night when I read Peter's patch to add collation support.
>
> Good point.
>
>> We could possibly cram ASC/DESC and NULLS FIRST/LAST in by defining
>> four new categories of operator strategies rather than one, but
>> there's no way that's going to work for collations.  Is there some
>> other way to approach this problem?  Can we leave pg_amop as it is,
>> and pass the context (sort vs. qual, ASC/DESC, NULLS FIRST/LAST,
>> collation, whatever...) to the index via some sort of side channel?
>
> Well, we cannot avoid changing pg_amop, or at least changing its
> interpretation, because the current scheme simply can't represent
> indexable operators that are used for anything except simple boolean
> WHERE tests.

What exactly do you mean by that?

It has always seemed to me that the operator class mechanism is a
complicated abstraction mechanism that actually adds only a very small
amount of abstraction.  Instead of referring to operators by name or
OID we refer to them by a number that maps onto a name or OID.  That
allows the user to change the name or OID without breaking anything,
but that's about it.  Perhaps we should think of pg_amop not so much
as a way to tell the AM what to do, but just a way to tell it what
operator is logically involved without relying on the name or OID.

> I agree though that we do *not* want pg_amop involved
> in the details of exactly what sort ordering is referenced by a sortable
> operator.  Somehow that needs to be passed in a side channel.

Yeah.

> Maybe we should think in terms of a side channel for Peter's patch
> as well.  I share your feeling that trying to propagate collation
> the same way we now propagate typmod is a recipe for serious pain.

I'm not sure what you're thinking of here.  I think we can have the
idea of a FullySpecifiedType = , but
that's not so much a side channel as an abstraction layer.  It
absolutely won't work to stuff the collations in a global variable or
something like that, if that's what you're imagining.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-15 Thread Tom Lane

Robert Haas  writes:
> I still feel vaguely uneasy about the fact that the proposed patch
> can't handle ASC/DESC or NULLS FIRST/LAST, and that unease grew a bit
> more last night when I read Peter's patch to add collation support.

Good point.

> We could possibly cram ASC/DESC and NULLS FIRST/LAST in by defining
> four new categories of operator strategies rather than one, but
> there's no way that's going to work for collations.  Is there some
> other way to approach this problem?  Can we leave pg_amop as it is,
> and pass the context (sort vs. qual, ASC/DESC, NULLS FIRST/LAST,
> collation, whatever...) to the index via some sort of side channel?

Well, we cannot avoid changing pg_amop, or at least changing its
interpretation, because the current scheme simply can't represent
indexable operators that are used for anything except simple boolean
WHERE tests.  I agree though that we do *not* want pg_amop involved
in the details of exactly what sort ordering is referenced by a sortable
operator.  Somehow that needs to be passed in a side channel.

Maybe we should think in terms of a side channel for Peter's patch
as well.  I share your feeling that trying to propagate collation
the same way we now propagate typmod is a recipe for serious pain.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-15 Thread Robert Haas

On Fri, Oct 15, 2010 at 5:35 PM, Robert Haas  wrote:
> On Fri, Oct 15, 2010 at 12:02 PM, Tom Lane  wrote:
>> BTW, have we discussed the idea of embedding the role in the strategy
>> number?  For example, require regular operators to have strategy
>> numbers 1-, while ordering operators have numbers 1-1,
>> leaving room for a couple more roles before we have to rethink the
>> assignment or widen amopstrategy to an int.  That's a bit ugly but
>> might be better than adding a separate role column.
>
> Yeah, we talked about it.
>
> http://archives.postgresql.org/pgsql-hackers/2010-02/msg01075.php
> http://archives.postgresql.org/pgsql-hackers/2010-02/msg01079.php

Having said that, I'm not wild on having 5-key syscaches, even though
the patch is ready to go (modulo whatever rebasing is needed, which
probably isn't much).  So if you can figure out a way to avoid it,
let's do that.

I still feel vaguely uneasy about the fact that the proposed patch
can't handle ASC/DESC or NULLS FIRST/LAST, and that unease grew a bit
more last night when I read Peter's patch to add collation support.
We could possibly cram ASC/DESC and NULLS FIRST/LAST in by defining
four new categories of operator strategies rather than one, but
there's no way that's going to work for collations.  Is there some
other way to approach this problem?  Can we leave pg_amop as it is,
and pass the context (sort vs. qual, ASC/DESC, NULLS FIRST/LAST,
collation, whatever...) to the index via some sort of side channel?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-15 Thread Robert Haas

On Fri, Oct 15, 2010 at 12:02 PM, Tom Lane  wrote:
> BTW, have we discussed the idea of embedding the role in the strategy
> number?  For example, require regular operators to have strategy
> numbers 1-, while ordering operators have numbers 1-1,
> leaving room for a couple more roles before we have to rethink the
> assignment or widen amopstrategy to an int.  That's a bit ugly but
> might be better than adding a separate role column.

Yeah, we talked about it.

http://archives.postgresql.org/pgsql-hackers/2010-02/msg01075.php
http://archives.postgresql.org/pgsql-hackers/2010-02/msg01079.php

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-15 Thread Tom Lane

Teodor Sigaev  writes:
> As I remember, there were several suggested designs:
> 1) 5-th boolean column (amopfamily, amoplefttype, amoprighttype, 
> amopstrategy, 
> amoporder) to point kind of operator (search or order)
>+ saves one record for operator in pg_amop
>- operator could not be used in both roles
>- increase number of arguments for syscache machinery
> 2) 5-th combined column, which contains some kind of flag for each role
>+ saves one record for operator in pg_amop
>+ operator could be used in both roles
>- strategy number for operator is the same for both roles, it's 
> unacceptable
>  because GiST's consistentFn will not have information about role. GiST
>  itself could distinguish them by invented SK_ORDER flag. So, this
>  requires to introduce one more argument for consistentFn, while it
>  already has 5 arguments.
>- increase number of arguments for syscache machinery
> 3) 3-rd boolean column (amopopr, amopfamily, amoporder)
>- could be two records per operator
>+ operator could be used in both roles
>+ strategy number could be different for different roles

How can #3 work at all?  It's ignoring a couple of critical index
columns.  In particular, I believe the sticking point here is this
unique index:

"pg_amop_fam_strat_index" UNIQUE, btree (amopfamily, amoplefttype, 
amoprighttype, amopstrategy)

#3 doesn't explain what to do about this index.  If we extend it to
five columns, as we'd logically have to do to preserve uniqueness,
then the associated syscache has to have five key columns as well,
and now we're at solution #1.

I'm not terribly thrilled with using a boolean here in any case.
Now that we have two "roles" an operator might play in an opclass,
who's to say there might not be more roles someday?  We should use
a column type that will support more than two roles without basic
rejiggering.

BTW, have we discussed the idea of embedding the role in the strategy
number?  For example, require regular operators to have strategy
numbers 1-, while ordering operators have numbers 1-1,
leaving room for a couple more roles before we have to rethink the
assignment or widen amopstrategy to an int.  That's a bit ugly but
might be better than adding a separate role column.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-15 Thread Teodor Sigaev




forward.  Tom and I laid out a technical design back in January and I
still think it's a good one, even though I know you think it's silly.
We may just have to agree to disagree on this point.

As I remember, there were several suggested designs:
1) 5-th boolean column (amopfamily, amoplefttype, amoprighttype, amopstrategy, 
amoporder) to point kind of operator (search or order)

  + saves one record for operator in pg_amop
  - operator could not be used in both roles
  - increase number of arguments for syscache machinery
2) 5-th combined column, which contains some kind of flag for each role
  + saves one record for operator in pg_amop
  + operator could be used in both roles
  - strategy number for operator is the same for both roles, it's unacceptable
because GiST's consistentFn will not have information about role. GiST
itself could distinguish them by invented SK_ORDER flag. So, this
requires to introduce one more argument for consistentFn, while it
already has 5 arguments.
  - increase number of arguments for syscache machinery
3) 3-rd boolean column (amopopr, amopfamily, amoporder)
  - could be two records per operator
  + operator could be used in both roles
  + strategy number could be different for different roles

All three options require to add flag of role 
op_in_opfamily/get_op_opfamily_strategy/get_op_opfamily_properties to check 
applicability of operation in current code path. First two options could do not 
change of interface of op_in_opfamily/get_op_opfamily_strategy but it will be 
needed to check actual role of operator later.


Basing on this comparison, I think, that 2) is worse and 3) is better.
--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist - 0.8

2010-10-05 Thread Robert Haas

On Tue, Oct 5, 2010 at 5:05 PM, Oleg Bartunov  wrote:
> Are you sure postgres doesn't have limitation at all ? There are a lot ot
> them. Of course, there are limitation and limitation and we should avoid
> limitations, which will require incompatible changes in future in user's
> code, or which prevent future improvements (getting rid limitation). We
> suggested implementation of k-nn search, which has limitations, but in my
> opinion, they are rather small and doesn't prevent in future to
> write a descent patch, based on your five-key syscaches patches,
> which will touch a lot of places in the pg source.
>
> Who need boolean distance ? Ok, you insisted,  we now support it. Teodor
> wrote arguments
> (http://archives.postgresql.org/message-id/4c8e2590.6040...@sigaev.ru)
> why two fields solution doesn't really helped.
>
> You want "the points most distant from the bright center of the universe?",
> sorry, for now. I think, this is a limitation we can live with for now. It's
> k-nearest neighbourhood search, and not k-furthest outliers search.
>
> You want documentation for review, I don't believe a reviewer can't review
> without users documentation like CREATE/ALTER OPERATOR CLASS, etc. Andrew
> Gierth was true,
> that GiST documentation needed to be rewritten and he suggested to do that
> if I understand him. I'd love to help, but I don't have any message from
> him.
> If he changed his mind, I'll describe these changes.
>
> We're not full-time pg-employers and it's not easy for us to follow new
> cleaner-way. I think there is a risk, that  there are will be lesser big
> features introduced by non-pg employers in the future and eventually, pg
> will be open-source database developed by pg-employers with some amount
> cosmetic changes and fixes.
>
> I suggest to find a sensible consensus on implementation, taking into
> account Pareto Rule, and leave place for future improvement.

I am sorry my review pissed you off, as it seems to have done.
Looking back on it, I realize now that it was phrased more harshly
than it needed to be.  That having been said, it seems to me that we
really are at an impasse and I am not sure what to propose as a way
forward.  Tom and I laid out a technical design back in January and I
still think it's a good one, even though I know you think it's silly.
We may just have to agree to disagree on this point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Are you sure postgres doesn't have limitation at all ? There are a lot ot them.
Of course, there are limitation and limitation and we should avoid limitations,
which will require incompatible changes in future in user's code, or which
prevent future improvements (getting rid limitation). We suggested
implementation of k-nn search, which has limitations, but in my opinion,
they are rather small and doesn't prevent in future to

write a descent patch, based on your five-key syscaches patches,
which will touch a lot of places in the pg source.

Who need boolean distance ? Ok, you insisted, we now support it.
Teodor wrote arguments (http://archives.postgresql.org/message-id/4c8e2590.6040...@sigaev.ru)

why two fields solution doesn't really helped.

You want "the points most distant from the bright center of the universe?",
sorry, for now. I think, this is a limitation we can live with for now.
It's k-nearest neighbourhood search, and not k-furthest outliers search.

You want documentation for review, I don't believe a reviewer can't review
without users documentation like CREATE/ALTER OPERATOR CLASS, etc.
Andrew Gierth was true,
that GiST documentation needed to be rewritten and he suggested to do that if
I understand him. I'd love to help, but I don't have any message from him.

If he changed his mind, I'll describe these changes.

We're not full-time pg-employers and it's not easy for us to follow
new cleaner-way. I think there is a risk, that there are will be lesser big
features introduced by non-pg employers in the future and eventually,
pg will be open-source database developed by pg-employers with some
amount cosmetic changes and fixes.

I suggest to find a sensible consensus on implementation, taking into
account Pareto Rule, and leave place for future improvement.

Oleg

On Tue, 14 Sep 2010, Robert Haas wrote:

2010/9/13 Teodor Sigaev :

[updated patch]

I realize I'm repeating myself, but... this patch needs
documentation. It's not optional.

It seems to me that you need to do something about AMOPSTRATEGY.
Hence the five-key syscaches patches I wrote that is sitting in queue.
And also LookupOpClassInfo(). Am I confused?

Is there a reason we're using a boolean 'amoporder' distinguish
ordering operators from regular old operators? Tom and I were talking
about some kind of an integer field, in case we need more categories
later. You know, like 'amopcategory' or something like that. It
could be just one byte, perhaps, but one bit seems unduly narrow. You
could define constants OPCAT_QUAL and OPCAT_ORDER, or something like
that.

This error message seems like it ought to be complaining about the
access method, not the index:

+ if (!pg_am->amcanorderbyop)
+ ereport(ERROR,
+
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+errmsg("index doesn't support
ordering operations")));

I sort of understand the reasons behind the following restriction, but
is this truly the best we can do?

+ /*
+* Currently, only descending and nulls last ordering
is supported
+*/
+ if (!(pathkey->pk_strategy == BTLessStrategyNumber &&
pathkey->pk_nulls_first == false))
+ return;

What happens if we have an index strategy that can efficiently return
the points most distant from the bright center of the universe?

By the way:

gistproc.c: In function ?gist_point_consistent?:
gistproc.c:1023: error: ?distance? may be used uninitialized in this function

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

1 2 >

1 - 100 of 113 matches

Mail list logo