Re: [HACKERS] GiST support for inet datatypes

2014-04-09 Thread Emre Hasegeli
Tom Lane t...@sss.pgh.pa.us:
 Committed with some additional documentation work.  Thanks for
 submitting this!

Thank you for committing. I had not thought of using different structure
for the index. It works faster with my test case, too.

I am sending rebased version of the consistent operator names patch
in case you would like to include it to the same release. This is what
I wrote about this change before:

 That is why I prepared it as a separate patch on top of the others. It is
 not only consistency with the range types: @ and @ symbols used for
 containment everywhere except the inet data types, particularly on
 the geometric types, arrays; cube, hstore, intaray, ltree extensions.
 The patch does not just change the operator names, it leaves the old ones,
 adds new operators with GiST support and changes the documentation, like
 your commit ba920e1c9182eac55d5f1327ab0d29b721154277 back in 2006. I could
 not find why did you leave the inet operators unchanged on that commit,
 in the mailing list archives [1]. GiST support will be a promotion for
 users to switch to the new operators, if we make this change with it.

 This change will also indirectly deprecate the undocumented non-transparent
 btree index support that works sometimes for some of the subnet inclusion
 operators [2].

 [1] http://www.postgresql.org/message-id/14277.1157304...@sss.pgh.pa.us

 [2] http://www.postgresql.org/message-id/389.1042992...@sss.pgh.pa.us


inet-operator-v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-04-08 Thread Tom Lane
I wrote:
 [ inet-gist-v6.patch ]

Committed with some additional documentation work.  Thanks for
submitting this!

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-04-08 Thread Robert Haas
On Tue, Apr 8, 2014 at 3:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wrote:
 [ inet-gist-v6.patch ]

 Committed with some additional documentation work.  Thanks for
 submitting this!

NICE.  I'd like to tell you how excited I am about this part:

# It also handles a new network
# operator inet  inet (overlaps, a/k/a is supernet or subnet of),
# which is expected to be useful in exclusion constraints.

...but I can't, because my mouth is too full of drool.  Wouldn't you
really want that more for cidr than for inet, though?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-04-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 NICE.  I'd like to tell you how excited I am about this part:

 # It also handles a new network
 # operator inet  inet (overlaps, a/k/a is supernet or subnet of),
 # which is expected to be useful in exclusion constraints.

 ...but I can't, because my mouth is too full of drool.  Wouldn't you
 really want that more for cidr than for inet, though?

Probably, but it works with either since they're binary-compatible.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-04-08 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Tue, Apr 8, 2014 at 3:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  I wrote:
  [ inet-gist-v6.patch ]
 
  Committed with some additional documentation work.  Thanks for
  submitting this!
 
 NICE.  I'd like to tell you how excited I am about this part:
 
 # It also handles a new network
 # operator inet  inet (overlaps, a/k/a is supernet or subnet of),
 # which is expected to be useful in exclusion constraints.
 
 ...but I can't, because my mouth is too full of drool.  Wouldn't you
 really want that more for cidr than for inet, though?

You realize ip4r has had all this and more for, like, forever, right?
It's also more efficient wrt storage and generally more 'sane' regarding
operators, etc..

Just thought I'd share..  If you have a use-case, ip4r is what
everyone's been using for quite some time.  Also, yes, it supports both
ipv4 and ipv6, despite the name.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] GiST support for inet datatypes

2014-04-04 Thread Andres Freund
Hi,

On 2014-03-08 23:40:31 +0200, Emre Hasegeli wrote:
 Fourth version of the patch attached. It is rebased to the HEAD (8879fa0).
 Operator name formatting patch rebased on top of it. I will put
 the selectivity estimation patch to the next commit fest.
 
 This version of the patch does not touch to the btree_gist extension,
 does not set the operator class as the default. It adds support to
 the not equals operator to make the operator class compatible with
 the btree_gist extension.

This patch looks like it can be applied much more realistically, but it
looks too late for 9.4. I suggest moving it to the next CF?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-04-04 Thread Andreas Karlsson

On 04/04/2014 04:01 PM, Andres Freund wrote:

This patch looks like it can be applied much more realistically, but it
looks too late for 9.4. I suggest moving it to the next CF?


If it does not change the default operator class I do not see anything 
preventing it from being applied to 9.4, as long as the committers have 
the time to look at this. My review is done and I think the first patch 
is ok and useful by itself.


--
Andreas


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-04-04 Thread Andres Freund
On 2014-04-04 16:50:36 +0200, Andreas Karlsson wrote:
 On 04/04/2014 04:01 PM, Andres Freund wrote:
 This patch looks like it can be applied much more realistically, but it
 looks too late for 9.4. I suggest moving it to the next CF?
 
 If it does not change the default operator class I do not see anything
 preventing it from being applied to 9.4, as long as the committers have the
 time to look at this. My review is done and I think the first patch is ok
 and useful by itself.

Well, the patch was marked as needs review, not ready for committer. So,
if that's your position, re-check the latest version and mark it as
ready.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-04-04 Thread Andreas Karlsson

On 03/08/2014 10:40 PM, Emre Hasegeli wrote:

Fourth version of the patch attached. It is rebased to the HEAD (8879fa0).
Operator name formatting patch rebased on top of it. I will put
the selectivity estimation patch to the next commit fest.

This version of the patch does not touch to the btree_gist extension,
does not set the operator class as the default. It adds support to
the not equals operator to make the operator class compatible with
the btree_gist extension.


The patch looks good but it does not apply anymore against master. If 
you could fix that and the duplicate OIDs I think this is ready for a 
committer.


Andreas



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-03-03 Thread Robert Haas
On Thu, Feb 27, 2014 at 12:30 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Florian Pflug f...@phlo.org writes:
 On Feb27, 2014, at 17:56 , Tom Lane t...@sss.pgh.pa.us wrote:
 That's not a bug, it's a feature, for much the same reasons that pg_dump
 tries to minimize explicit schema-qualification.

 I fail to see the value in this for opclasses. It's certainly nice for
 schema qualifications, because dumping one schema and restoring into a
 different schema is probably quite common.

 The value in it is roughly the same as the reason we don't include a
 version number when dumping CREATE EXTENSION.  If you had a default
 opclass in the source database, you probably want a default opclass
 in the destination, even if that's not bitwise the same as what you
 had before.  The implication is that you want best practice for the
 current PG version.

I don't think that argument holds a lot of water in this instance.
The whole reason for having multiple opclasses that each one can
implement different comparison behavior.  It's unlikely that you want
an upgrade to change the comparison behavior you had before; you'd be
sad if, for example, the dump/restore process failed to preserve your
existing collation settings.

But even if that were desirable in general, suppressing it for binary
upgrade dumps certainly seems more than sane.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-03-03 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Feb 27, 2014 at 12:30 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The value in it is roughly the same as the reason we don't include a
 version number when dumping CREATE EXTENSION.  If you had a default
 opclass in the source database, you probably want a default opclass
 in the destination, even if that's not bitwise the same as what you
 had before.  The implication is that you want best practice for the
 current PG version.

 I don't think that argument holds a lot of water in this instance.
 The whole reason for having multiple opclasses that each one can
 implement different comparison behavior.

Well, I doubt we'd accept a proposal to change the default opclass
of a datatype to something that had incompatible behavior --- but
we might well accept one to change it to something that had improved
behavior, such as more operators.

The first couple dozen lines in GetIndexOpClass() make for interesting
reading in this context.  That approach to cross-version compatibility
probably doesn't work in the pg_upgrade universe, of course; but what I'd
like to point out here is that those kluges wouldn't have been necessary
in the first place if pg_dump had had the suppress-default-opclasses
behavior at the time.  (No, it didn't always do that; cf commits
e5bbf1965 and 1929a90b6.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-28 Thread Greg Stark
On Thu, Feb 27, 2014 at 5:30 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Indeed.  The root of the problem here is that we've never really thought
 about changing a type's default opclass before.  I don't think that a
 two-line change in pg_dump fixes all the issues this will bring up.

I think we did actually do this once. When you replaced rtree with
gist as the default opclass for the rtree method. IIRC you did it by
making rtree a synonym for gist and since the opclass wasn't
specified the default gist opclass kicked in automatically.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-28 Thread Emre Hasegeli
2014-02-27 18:15, Florian Pflug f...@phlo.org:
 It can be possible to update the new operator class in the new cluster
 as not default, before restore. After the restore, pg_upgrade can upgrade
 the btree_gist extension and reset the operator class as the default.
 pg_upgrade suggests to re-initdb the new cluster if it fails, anyway. Do
 you think it is a better solution? I think it will be more complicated
 to do in pg_dump when in binary upgrade mode.

 Maybe I'm missing something, but isn't the gist of the problem here that
 pg_dump won't explicitly state the operator class if it's the default?

No, the problem is pg_upgrade. We can even benefit from this behavior of
pg_dump, if we would like to remove the operator classes on btree_gist.
Because of that behavior; users, who would upgrade by dump and restore,
will upgrade to the better default operator class without noticing. I am
not sure not notifying is the best think to do, though.

The problem is that pg_dump --binary-upgrade dumps objects in the extension
on the old cluster, not just the CREATE EXTENSION statement. pg_upgrade
fails to restore them, if the new operator class already exists on the new
cluster as the default. It effects all users who use the extension, even
if they are not using the inet and cidr operator classes in it.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-28 Thread Florian Pflug
On Feb28, 2014, at 15:45 , Emre Hasegeli e...@hasegeli.com wrote:
 The problem is that pg_dump --binary-upgrade dumps objects in the extension
 on the old cluster, not just the CREATE EXTENSION statement. pg_upgrade
 fails to restore them, if the new operator class already exists on the new
 cluster as the default. It effects all users who use the extension, even
 if they are not using the inet and cidr operator classes in it.

Hm, what if we put the new opclass into an extension of its own, say inet_gist,
instead of into core? 

Couldn't we then remove the inet support from the latest version of btree_gist
(the one we'd ship with 9.4)? People who don't use the old inet opclass could
then simply upgrade the extension after running pg_upgrade to get rid of the
old, broken version. People who *do* use the old inet opclass would need to
drop their indices before doing that, then install the extension inet_gist,
and finally re-create their indices. 

People who do nothing would continue to use the old inet opclass.

inet_gist's SQL script could check whether btree_gist has been upgrade, and
if not fail with an error like btree_gist must be upgraded to at least version
x.y before inet_gist can be installed. That would avoid failing with a rather
cryptic error later.

best regards,
Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-28 Thread Emre Hasegeli
2014-02-28 17:30, Florian Pflug f...@phlo.org:
 Hm, what if we put the new opclass into an extension of its own, say 
 inet_gist,
 instead of into core?

It will work but I do not think it is better than adding it in core as
not default.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-27 Thread Emre Hasegeli
2014-02-24 17:55, Bruce Momjian br...@momjian.us:

 pg_upgrade has _never_ modified the old cluster, and I am hesitant to do
 that now.  Can we make the changes in the new cluster, or in pg_dump
 when in binary upgrade mode?

It can be possible to update the new operator class in the new cluster
as not default, before restore. After the restore, pg_upgrade can upgrade
the btree_gist extension and reset the operator class as the default.
pg_upgrade suggests to re-initdb the new cluster if it fails, anyway. Do
you think it is a better solution? I think it will be more complicated
to do in pg_dump when in binary upgrade mode.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-27 Thread Florian Pflug
On Feb27, 2014, at 11:39 , Emre Hasegeli e...@hasegeli.com wrote:
 2014-02-24 17:55, Bruce Momjian br...@momjian.us:
 
 pg_upgrade has _never_ modified the old cluster, and I am hesitant to do
 that now.  Can we make the changes in the new cluster, or in pg_dump
 when in binary upgrade mode?
 
 It can be possible to update the new operator class in the new cluster
 as not default, before restore. After the restore, pg_upgrade can upgrade
 the btree_gist extension and reset the operator class as the default.
 pg_upgrade suggests to re-initdb the new cluster if it fails, anyway. Do
 you think it is a better solution? I think it will be more complicated
 to do in pg_dump when in binary upgrade mode.

Maybe I'm missing something, but isn't the gist of the problem here that
pg_dump won't explicitly state the operator class if it's the default?

If so, can't we just make pg_dump always spell out the operator class
explicitly? Then changing the default will never change the meaning of
database dumps, so upgraded clusters will simply continue to use the
old (now non-default) opclass, no?

best regards,
Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-27 Thread Tom Lane
Florian Pflug f...@phlo.org writes:
 Maybe I'm missing something, but isn't the gist of the problem here that
 pg_dump won't explicitly state the operator class if it's the default?

That's not a bug, it's a feature, for much the same reasons that pg_dump
tries to minimize explicit schema-qualification.

At least, it's a feature for ordinary dumps.  I'm not sure whether it
might be a good idea for binary_upgrade dumps.  That would depend on
our policy about whether a new opclass has to have a new (and necessarily
weird) name.  If we try to make the new opclass still have the nicest
name then it won't help at all for pg_dump to dump the old name.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-27 Thread Florian Pflug
On Feb27, 2014, at 17:56 , Tom Lane t...@sss.pgh.pa.us wrote:
 Florian Pflug f...@phlo.org writes:
 Maybe I'm missing something, but isn't the gist of the problem here that
 pg_dump won't explicitly state the operator class if it's the default?
 
 That's not a bug, it's a feature, for much the same reasons that pg_dump
 tries to minimize explicit schema-qualification.

I fail to see the value in this for opclasses. It's certainly nice for
schema qualifications, because dumping one schema and restoring into a
different schema is probably quite common. But how often does anyone dump
a database and restore it into a database with a different default opclass
for some type?

Or is the idea just to keep the dump as readable as possible?

 At least, it's a feature for ordinary dumps.  I'm not sure whether it
 might be a good idea for binary_upgrade dumps.  That would depend on
 our policy about whether a new opclass has to have a new (and necessarily
 weird) name.  If we try to make the new opclass still have the nicest
 name then it won't help at all for pg_dump to dump the old name.

Well, given the choice between not ever being able to change the default
opclass of a type, and not being able to re-use an old opclass' name,
I'd pick the latter. Especially because for default opclasses, users don't
usually have to know the name anyway.

best regards,
Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-27 Thread Tom Lane
Florian Pflug f...@phlo.org writes:
 On Feb27, 2014, at 17:56 , Tom Lane t...@sss.pgh.pa.us wrote:
 That's not a bug, it's a feature, for much the same reasons that pg_dump
 tries to minimize explicit schema-qualification.

 I fail to see the value in this for opclasses. It's certainly nice for
 schema qualifications, because dumping one schema and restoring into a
 different schema is probably quite common.

The value in it is roughly the same as the reason we don't include a
version number when dumping CREATE EXTENSION.  If you had a default
opclass in the source database, you probably want a default opclass
in the destination, even if that's not bitwise the same as what you
had before.  The implication is that you want best practice for the
current PG version.

Another reason for not doing it is that unique-constraint syntax doesn't
even support it.  Constraints *have to* use the default opclass.

 But how often does anyone dump
 a database and restore it into a database with a different default opclass
 for some type?

Indeed.  The root of the problem here is that we've never really thought
about changing a type's default opclass before.  I don't think that a
two-line change in pg_dump fixes all the issues this will bring up.

I remind you also that even if you had a 100% bulletproof argument for
changing the behavior now, it won't affect what's in existing dump files.
The only real wiggle room we have is to change the --binary-upgrade
behavior, since we can plausibly insist that people use a current pg_dump
while doing an upgrade.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-26 Thread Emre Hasegeli
2014-02-24 02:44, Andreas Karlsson andr...@proxel.se:

 Note: The patches do not apply anymore due to changes to
 src/backend/utils/adt/Makefile.

I will fix it on the next version.

 I see, thanks for the explanation. But I am still not very fond of how that
 code is written since I find it hard to verify the correctness of it, but
 have no better suggestions.

We can split the selectivity estimation patch from the GiST support, add
it to the next commit fest for more review. In the mean time, I can
improve it with join selectivity estimation. Testing with different datasets
would help the most to reveal how true the algorithm is.

 Do you think the new index is useful even if you use the basic geo_selfuncs?
 Or should we wait with committing the patches until all selfuncs are
 implemented?

My motivation was to be able to use the cidr datatype with exclusion
constraints. I think the index would also be useful without proper
selectivity estimation functions. Range types also use geo_selfuncs for
join selectivity estimation.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-24 Thread Bruce Momjian
On Wed, Feb 19, 2014 at 06:37:42PM -0500, Tom Lane wrote:
 Emre Hasegeli e...@hasegeli.com writes:
  [ cites bug #5705 ]
 
 Hm.  I had forgotten how thoroughly broken btree_gist's inet and cidr
 opclasses are.  There was discussion at the time of just ripping them
 out despite the compatibility hit.  We didn't do it, but if we had
 then life would be simpler now.
 
 Perhaps it would be acceptable to drop the btree_gist implementation
 and teach pg_upgrade to refuse to upgrade if the old database contains
 any such indexes.  I'm not sure that solves the problem, though, because
 I think pg_upgrade will still fail if the opclass is in the old database
 even though unused (because you'll get the complaint about multiple
 default opclasses anyway).  The only simple way to avoid that is to not
 have btree_gist loaded at all in the old DB, and I doubt that's an
 acceptable upgrade restriction.  It would require dropping indexes of
 the other types supported by btree_gist, which would be a pretty hard
 sell since those aren't broken.
 
 Really what we'd need here is for btree_gist to be upgraded to a version
 that doesn't define gist_inet_ops (or at least doesn't mark it as default)
 *before* pg_upgrading to a server version containing the proposed new
 implementation.  Not sure how workable that is.  Could we get away with
 having pg_upgrade unset the default flag in the old database?
 (Ick, but there are no very good alternatives here ...)

pg_upgrade has _never_ modified the old cluster, and I am hesitant to do
that now.  Can we make the changes in the new cluster, or in pg_dump
when in binary upgrade mode?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-23 Thread Andreas Karlsson

On 02/06/2014 06:14 PM, Emre Hasegeli wrote:

Third versions of the patches attached. They are rebased to the HEAD. In
this versions, the bitncommon function is changed. sys/socket.h included
to network_gist.c to be able to compile it on FreeBSD. Geometric mean
calculation for partial bucket match on the function
inet_hist_inclusion_selectivity
reverted back. It was something I changed without enough testing on
the second revision of the patch. This version uses the maximum divider
calculated from the boundaries of the bucket, like the first version. It is
simpler and more reliable.


Thanks for the updated patch.

About the discussions about upgrading PostgreSQL, extensions and 
defaults I do not have any strong opinion. I think that this patch is 
useful even if it does not end up the default, but it would be a pity 
since the BTree GiST index is broken.


Note: The patches do not apply anymore due to changes to 
src/backend/utils/adt/Makefile.



I am not convinced of your approach to calculating the selectivity from the
histogram. The thing I have the problem with is the clever trickery involved
with how you handle different operator types. I prefer the clearer code of
the range types with how calc_hist_selectivity_scalar is used. Is there any
reason for why that approach would not work here or result in worse code?


Currently we do not have histogram of the lower and upper bounds as
the range types. Current histogram can be used nicely as the lower bound,
but not the upper bound because the comparison is first on the common bits
of the network part, then on the length of the network part. For example,
10.0/16 is defined as greater than 10/8.

Using the histogram as the lower bounds of the networks is not enough to
calculate selectivity for any of these operators. Using it also as the upper
bounds is still not enough for the inclusion operators. The lengths of
the network parts should taken into consideration in a way and it is
what this patch does. Using separate histograms for the lower bounds,
the upper bounds and the lengths of the network parts can solve all of these
problems, but it is a lot of work.


I see, thanks for the explanation. But I am still not very fond of how 
that code is written since I find it hard to verify the correctness of 
it, but have no better suggestions.



I see from the tests that you still are missing selectivity functions for
operators, what is your plan for this?


This was because the join selectivity estimation functions. I set
the geo_selfuncs for the missing ones. All tests pass with them. I want
to develop the join selectivity function too, but not for this commit fest.


All tests pass now. Excellent!

Do you think the new index is useful even if you use the basic 
geo_selfuncs? Or should we wait with committing the patches until all 
selfuncs are implemented?


--
Andreas Karlsson


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-20 Thread Emre Hasegeli
2014-02-20 01:37, Tom Lane t...@sss.pgh.pa.us:

 Perhaps it would be acceptable to drop the btree_gist implementation
 and teach pg_upgrade to refuse to upgrade if the old database contains
 any such indexes.  I'm not sure that solves the problem, though, because
 I think pg_upgrade will still fail if the opclass is in the old database
 even though unused (because you'll get the complaint about multiple
 default opclasses anyway).  The only simple way to avoid that is to not
 have btree_gist loaded at all in the old DB, and I doubt that's an
 acceptable upgrade restriction.  It would require dropping indexes of
 the other types supported by btree_gist, which would be a pretty hard
 sell since those aren't broken.

 Really what we'd need here is for btree_gist to be upgraded to a version
 that doesn't define gist_inet_ops (or at least doesn't mark it as default)
 *before* pg_upgrading to a server version containing the proposed new
 implementation.  Not sure how workable that is.  Could we get away with
 having pg_upgrade unset the default flag in the old database?
 (Ick, but there are no very good alternatives here ...)

Upgrading btree_gist on the old installation would be almost impossible
for the majority of the users who use package managers, in my opinion.
I cannot think of a better solution than your suggestion. I can try to
prepare a patch to execute the following query on pg_upgrade before
dumping the old database, if that is the final decision.

UPDATE pg_opclass SET opcdefault = false
WHERE opcname IN ('gist_inet_ops', 'gist_cidr_ops');

 BTW, I'd not been paying any attention to this thread, but now that
 I scan through it, I think the proposal to change the longstanding
 names of the inet operators has zero chance of being accepted.
 Consistency with the names chosen for range operators is a mighty
 weak argument compared to backwards compatibility.

That is why I prepared it as a separate patch on top of the others. It is
not only consistency with the range types: @ and @ symbols used for
containment everywhere except the inet data types, particularly on
the geometric types, arrays; cube, hstore, intaray, ltree extensions.
The patch does not just change the operator names, it leaves the old ones,
adds new operators with GiST support and changes the documentation, like
your commit ba920e1c9182eac55d5f1327ab0d29b721154277 back in 2006. I could
not find why did you leave the inet operators unchanged on that commit,
in the mailing list archives [1]. GiST support will be a promotion for
users to switch to the new operators, if we make this change with it.

This change will also indirectly deprecate the undocumented non-transparent
btree index support that works sometimes for some of the subnet inclusion
operators [2].

[1] http://www.postgresql.org/message-id/14277.1157304...@sss.pgh.pa.us

[2] http://www.postgresql.org/message-id/389.1042992...@sss.pgh.pa.us


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-19 Thread Robert Haas
On Mon, Feb 17, 2014 at 3:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Emre Hasegeli e...@hasegeli.com writes:
 How about only removing the inet and the cidr operator classes
 from btree_gist. btree-gist-drop-inet-v2.patch does that.

 I'm not sure which part of no you didn't understand, but to be
 clear: you don't get to break existing installations.

 Assuming that this opclass is sufficiently better than the existing one,
 it would sure be nice if it could become the default; but I've not seen
 any proposal in this thread that would allow that without serious upgrade
 problems.  I think the realistic alternatives so far are (1) new opclass
 is not the default, or (2) this patch gets rejected.

 We should probably expend some thought on a general approach to
 replacing the default opclass for a datatype, because I'm sure this
 will come up again.  Right now I don't see a feasible way.

It seems odd for default to be a property of the opclass rather than
a separate knob.  What comes to mind is something like:

ALTER TYPE sniffle SET DEFAULT OPERATOR CLASS FOR btree TO achoo_ops;

Not sure how much that helps.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-19 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Feb 17, 2014 at 3:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 We should probably expend some thought on a general approach to
 replacing the default opclass for a datatype, because I'm sure this
 will come up again.  Right now I don't see a feasible way.

 It seems odd for default to be a property of the opclass rather than
 a separate knob.  What comes to mind is something like:
 ALTER TYPE sniffle SET DEFAULT OPERATOR CLASS FOR btree TO achoo_ops;
 Not sure how much that helps.

Not at all, AFAICS.  If it were okay to decide that some formerly-default
opclass is no longer default, then having such a command would be better
than manually manipulating pg_opclass.opcdefault --- but extension upgrade
scripts could certainly do the latter if they had to.  The problem here
is whether it's upgrade-safe to make such a change at all; having
syntactic sugar doesn't make it safer.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-19 Thread Emre Hasegeli
2014-02-19 16:52, Tom Lane t...@sss.pgh.pa.us:
 Not at all, AFAICS.  If it were okay to decide that some formerly-default
 opclass is no longer default, then having such a command would be better
 than manually manipulating pg_opclass.opcdefault --- but extension upgrade
 scripts could certainly do the latter if they had to.  The problem here
 is whether it's upgrade-safe to make such a change at all; having
 syntactic sugar doesn't make it safer.

It is not hard to support != operator on the new operator class. That would
make it functionally compatible with the ones on btree_gist. That way,
changing the default would not break pg_dump dumps, it would only upgrade
the index to the new one.

pg_upgrade dumps current objects in the extension. It fails to restore them,
if the new operator class exists as the default. I do not really understand
how pg_upgrade handle extension upgrades. I do not have a solution to this.

It would be sad if we cannot make the new operator class default because of
the btree_gist implementation which is not useful for inet data types. You
wrote on 2010-10-11 about it [1]:

 Well, actually the btree_gist implementation for inet is a completely
 broken piece of junk: it thinks that convert_network_to_scalar is 100%
 trustworthy and can be used as a substitute for the real comparison
 functions, which isn't even approximately true.  I'm not sure why
 Teodor implemented it like that instead of using the type's real
 comparison functions, but it's pretty much useless if you want the
 same sort order that the type itself defines.

[1] http://www.postgresql.org/message-id/8973.1286841...@sss.pgh.pa.us


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-19 Thread Tom Lane
Emre Hasegeli e...@hasegeli.com writes:
 [ cites bug #5705 ]

Hm.  I had forgotten how thoroughly broken btree_gist's inet and cidr
opclasses are.  There was discussion at the time of just ripping them
out despite the compatibility hit.  We didn't do it, but if we had
then life would be simpler now.

Perhaps it would be acceptable to drop the btree_gist implementation
and teach pg_upgrade to refuse to upgrade if the old database contains
any such indexes.  I'm not sure that solves the problem, though, because
I think pg_upgrade will still fail if the opclass is in the old database
even though unused (because you'll get the complaint about multiple
default opclasses anyway).  The only simple way to avoid that is to not
have btree_gist loaded at all in the old DB, and I doubt that's an
acceptable upgrade restriction.  It would require dropping indexes of
the other types supported by btree_gist, which would be a pretty hard
sell since those aren't broken.

Really what we'd need here is for btree_gist to be upgraded to a version
that doesn't define gist_inet_ops (or at least doesn't mark it as default)
*before* pg_upgrading to a server version containing the proposed new
implementation.  Not sure how workable that is.  Could we get away with
having pg_upgrade unset the default flag in the old database?
(Ick, but there are no very good alternatives here ...)

BTW, I'd not been paying any attention to this thread, but now that
I scan through it, I think the proposal to change the longstanding
names of the inet operators has zero chance of being accepted.
Consistency with the names chosen for range operators is a mighty
weak argument compared to backwards compatibility.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-17 Thread Andres Freund
On 2014-02-17 14:40:07 +0200, Emre Hasegeli wrote:
 2014-02-07 22:41, Robert Haas robertmh...@gmail.com:
 
  Generally, modifying already-release .sql files for extensions is a no-no...
 
 I prepared separate patches for btree_gist extension with more options.
 First one (btree-gist-drop-default-inet-v1.patch) removes DEFAULT keyword
 only from the inet and the cidr operator classes. Second one
 (btree-gist-drop-default-all-v1.patch) removes DEFAULT keyword for all
 operator classes. I think it is more consistent to remove it from all.
 Third one (btree-gist-drop-inet-v1.patch) removes the inet and the cidr
 operator classes altogether. It is suggested by Tom Lane [1] on bug #5705.
 The new GiST operator class includes basic comparison operators except !=
 so it may be the right time to remove support from btree_gist. Fourth one
 (btree-gist-drop-inet-and-default-v1.patch) is the second one and the third
 one together.
 
 [1] http://www.postgresql.org/message-id/10183.1287526...@sss.pgh.pa.us

 diff --git a/contrib/btree_gist/Makefile b/contrib/btree_gist/Makefile
 index ba4af14..d5b1fd7 100644
 --- a/contrib/btree_gist/Makefile
 +++ b/contrib/btree_gist/Makefile
 @@ -9,7 +9,7 @@ OBJS =  btree_gist.o btree_utils_num.o btree_utils_var.o 
 btree_int2.o \
  btree_numeric.o
  
  EXTENSION = btree_gist
 -DATA = btree_gist--1.0.sql btree_gist--unpackaged--1.0.sql
 +DATA = btree_gist--1.1.sql btree_gist--unpackaged--1.0.sql

You need to add a file for going from 1.0 to 1.1.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-17 Thread Tom Lane
Emre Hasegeli e...@hasegeli.com writes:
 2014-02-17 14:54, Andres Freund and...@2ndquadrant.com:
 You need to add a file for going from 1.0 to 1.1.

 Thank you for the notice. I added them to the patches which touch only two
 of the operator classes. It drops and re-creates operator classes as there
 is not ALTER OPERATOR CLASS DROP DEFAULT command.

Dropping an operator class is quite unacceptable, as it will cause indexes
based on that class to go away (or more likely, cause the upgrade to fail,
if you didn't use CASCADE).  What we've done in the past for changes that
are nominally unsupported is to make the upgrade scripts tweak the system
catalogs directly.

More generally, it doesn't look to me like these upgrade scripts are
complete; shouldn't they be creating some new objects, not just replacing
old ones?

We need to have a discussion as to whether it's actually sane for an
upgrade to remove the DEFAULT marking on a pre-existing opclass.  It
strikes me that this would for instance break pg_dump dumps, in the sense
that the reloaded index would probably now have a different opclass
than before (since pg_dump would see no need to have put an explicit
opclass name into CREATE INDEX if it was the default in the old database).
Even if the new improved opclass is in all ways better, that would be
catastrophic for pg_upgrade I suspect.  Unless the new opclass is
on-disk-compatible with the old; in which case we shouldn't be creating
a new opclass at all, but just modifying the definition of the old one.

In short we probably need to think a bit harder about what this patch is
proposing to do.  It seems fairly likely to me that some other approach
would be a better idea.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-17 Thread Emre Hasegeli
2014-02-17 22:16, Tom Lane t...@sss.pgh.pa.us:

 More generally, it doesn't look to me like these upgrade scripts are
 complete; shouldn't they be creating some new objects, not just replacing
 old ones?

The actual patches are on the previous mail [1]. I was just trying
to solve the problem that btree_gist cannot be loaded because of
the new operator class.

 In short we probably need to think a bit harder about what this patch is
 proposing to do.  It seems fairly likely to me that some other approach
 would be a better idea.

How about only removing the inet and the cidr operator classes
from btree_gist. btree-gist-drop-inet-v2.patch does that.

[1] 
http://www.postgresql.org/message-id/cae2gyzxc0fxewe59sfduznq24c+frbdmgxwwvbyvmeanate...@mail.gmail.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-17 Thread Tom Lane
Emre Hasegeli e...@hasegeli.com writes:
 How about only removing the inet and the cidr operator classes
 from btree_gist. btree-gist-drop-inet-v2.patch does that.

I'm not sure which part of no you didn't understand, but to be
clear: you don't get to break existing installations.

Assuming that this opclass is sufficiently better than the existing one,
it would sure be nice if it could become the default; but I've not seen
any proposal in this thread that would allow that without serious upgrade
problems.  I think the realistic alternatives so far are (1) new opclass
is not the default, or (2) this patch gets rejected.

We should probably expend some thought on a general approach to
replacing the default opclass for a datatype, because I'm sure this
will come up again.  Right now I don't see a feasible way.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-07 Thread Robert Haas
On Thu, Feb 6, 2014 at 12:14 PM, Emre Hasegeli e...@hasegeli.com wrote:
 I have misread the name, rename was not necessary. I removed the DEFAULT
 keywords for the inet and the cidr operator classes from btree_gist--1.0.sql
 on the inet-gist patch.

Generally, modifying already-release .sql files for extensions is a no-no...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-06 Thread Emre Hasegeli
2014-01-19 12:10, Emre Hasegeli e...@hasegeli.com:
 2014-01-19 Andreas Karlsson andr...@proxel.se:

 I am a bit suspicious about your memcmp based optimization in bitncommon,
 but it could be good. Have you benchmarked it compared to doing the same
 thing with a loop?

 I did, when I was writing that part. I will be happy to do it again. I will
 post the results.

I was testing it by creating GiST indexes. I realized that these test are
inconsistent when BUFFERING = AUTO. I repeated them with BUFFERING = ON.
The function without memcmp was faster in this case. I will change
the function in the next version of the patch.

The test case:

Create table Network as
select (a || '.' || b || '.' || c || '/24')::cidr
from generate_series(0, 255) as a,
generate_series(0, 255) as b,
generate_series(0, 255) as c;

Drop index if exists N;
Create index N on Network using gist(cidr) with (buffering = on);

Create table Network6 as
select ('::' || to_hex(a) || ':' || to_hex(b))::inet
from generate_series(0, 255) as a,
generate_series(0, 65535) as b;

Drop index if exists N6;
Create index N6 on Network6 using gist(inet) with (buffering = on);

What I could not understand is the tests with IP version 6 was much faster.
The row count is same, the index size is bigger.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-01-30 Thread Andreas Karlsson

On 01/23/2014 11:22 PM, Emre Hasegeli wrote:
 inet-gist
 -
 I realized that create extension btree_gist fails with the patch.


ERROR:  could not make operator class gist_inet_ops be default for type inet
DETAIL:  Operator class inet_ops already is the default.

I think making the new operator class default makes more sense. Name conflict
can be a bigger problem. Should I rename the new operator class?


I agree that the new operator class should be the default one, it is 
more useful and also not buggy. No idea about the naming.



The only thing is that I think your index would do poorly on the case where
all values share a prefix before the netmask but have different values after
the netmask, since gist_union and gist_penalty does not care about the bits
after the netmask. Am I correct? Have you thought anything about this?


Yes, I have tried to construct the index with ip_maxbits() instead of ip_bits()
but I could not make it work with the cidr type. It can be done by adding
operator as a parameter for union, penalty and picksplit functions on the
GiST framework. I am not sure it worths the trouble. It would only help
the basic comparison operators and it would slow down other operators because
network length comparison before network address would not be possible for
the intermediate nodes. I mentioned this behavior of the operator class on
the paragraph added to the documentation.


I think this is fine since it does not seem like a very useful case to 
support to me. Would be worth doing only if it is easy to do.


A separate concern: we still have not come to a decision about operators 
for inet here. I do not like the fact that the operators differ between 
ranges and inet. But I feel this might be out of scope for this patch.


Does any third party want to chime in with an opinion?

Current inet/cidr
is contained within
=   is contained within or equals
contains
=   contains or equals

Range types
@   contains range
@   range is contained by
  overlap (have points in common)
strictly left of
strictly right of


inet-selfuncs
-

Here I have to honestly admit I am not sure if I can tell if your
selectivity function is correct. Your explanation sounds convincing and the
general idea of looking at the histogram is correct. I am just have no idea
if the details are right.


I tried to improve it in this version. I hope it is more readable now. I used
the word inclusion instead of overlap, made some changes on the algorithm
to make it more suitable to the other operators.

Using the histogram for inclusion which is originally for basic comparison,
is definitely not correct. It is an empirical approach. I have tried several
versions of it, tested them with different data sets and thought it is better
than nothing.


Thanks for the updates. The code in this version is cleaner and easier 
to follow.


I am not convinced of your approach to calculating the selectivity from 
the histogram. The thing I have the problem with is the clever trickery 
involved with how you handle different operator types. I prefer the 
clearer code of the range types with how calc_hist_selectivity_scalar is 
used. Is there any reason for why that approach would not work here or 
result in worse code?


I have attached a patch where I improved the language of your comment 
describing the workings of the selectivity estimation. Could you please 
check it so I did not change the meaning of any of it?


I see from the tests that you still are missing selectivity functions 
for operators, what is your plan for this?


--
Andreas Karlsson
diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
new file mode 100644
index 87a7390..3b99afe
*** a/src/backend/utils/adt/network_selfuncs.c
--- b/src/backend/utils/adt/network_selfuncs.c
*** inet_opr_order(Oid operator)
*** 172,181 
   *
   * Calculates histogram selectivity for the subnet inclusion operators of
   * the inet type. In the normal case, the return value is between 0 and 1.
!  * It should be corrected with MVC selectivity and null fraction. If
   * the constant is less than the first element or greater than the last
   * element of the histogram the return value will be 0. If the histogram
!  * does not available, the return value will be -1.
   *
   * The histogram is originally for the basic comparison operators. Only
   * the common bits of the network part and the lenght of the network part
--- 172,181 
   *
   * Calculates histogram selectivity for the subnet inclusion operators of
   * the inet type. In the normal case, the return value is between 0 and 1.
!  * It should be corrected with the MVC selectivity and null fraction. If
   * the constant is less than the first element or greater than the last
   * element of the histogram the return value will be 0. If the histogram
!  * is not available, the return value will be -1.
   *
   * The 

Re: [HACKERS] GiST support for inet datatypes

2014-01-19 Thread Emre Hasegeli
2014-01-19 Andreas Karlsson andr...@proxel.se:
 Hi,

 I will review your two patches (gist support + selectivity). This is part 1
 of my review. I will look more into the actual GiST implementation in a
 couple of days, but thought I could provide you with my initial input right
 away.

Thank you for looking at it.


 inet-gist
 -

 General:

 I like the idea of the patch and think the  operator is useful for
 exclusion constraints, and that indexing the contains operator is useful for
 IP-address lookups. There is an extension, ip4r, which adds a GiST indexed
 type for IP ranges but since we have the inet type in core I think it should
 have GiST indexes.

 I am not convinced an adjacent operator is useful for the inet type, but if
 it is included it should be indexed just like -|- of ranges. We should try
 to keep these lists of indexed operators the same.

I added it just not to leave negotor field empty. It can also be useful with
exclusion constraints but not with GiST support. I did not add GiST support
for it and the not equals operator because they do not fit the index
structure. I can just remove the operator for now.


 Compilation:

 Compiled without errors.

 Regression tests:

 One of the broken regression tests seems unrelated to the selectivity
 functions.

 -- Make a list of all the distinct operator names being used in particular
 -- strategy slots.

 I think it would be fine just to add the newly indexed operators here, but
 the more indexed operators we get in the core the less useful this test
 becomes.

I did not add the new operators to the list because I do not feel right
about supporting , =, , = symbols for these operators.
They should be @, @=, @, @= to be consistent with other types.


 I am a bit suspicious about your memcmp based optimization in bitncommon,
 but it could be good. Have you benchmarked it compared to doing the same
 thing with a loop?

I did, when I was writing that part. I will be happy to do it again. I will
post the results.


 Documentation:

 Please use examples in the operator table which will evaluate to true, and
 for the  case an example where not both sides are the same.

I will change in the next version.


 I have not found a place either in the documentation where it is documented
 which operators are documented. I would suggest just adding a short note
 after the operators table.

I will add in the next version.


 inet-selfuncs
 -

 Compilation:

 There were some new warnings caused by this patch (with gcc 4.8.2). The
 warnings were all about the use of uninitialized variables (right,
 right_min_bits, right_order) in inet_hist_overlap_selectivity. Looking at
 the code I see that they are harmless but you should still rewrite the loop
 to silence the warnings.

I will fix in the next version.


 Regression tests:

 There are two broken

 -- Insist that all built-in pg_proc entries have descriptions

 Here you should just add descriptions for the new functions in pg_proc.h.

I will add in the next version.


 -- Check that all opclass search operators have selectivity estimators.

 Fails due to missing selectivity functions for the operators. I think you
 should at least for now do like the range types and use areajoinsel,
 contjoinsel, etc for the join selectivity estimation.

I did not support them in the first version because I could not decide
the way. I have better options than using the the geo_selfuncs for these
operators. The options are from simple to complex:

0) just use network_overlap_selectivity
1) fix network_overlap_selectivity with a constant between 0 and 1
2) calculate this constant by using masklens of all buckets of the histogram
3) calculate this constant by using masklens of matching buckets of
   the histogram
4) store STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM for this
   types, calculate this constant with it
5) create another kind of histogram for masklens, calculate this constant
   with it

I do not know how to do 4 or 5, so I will try 3 for the next version. Do you
think it is a good idea?


 Source code:

 The selectivity estimation functions, network_overlap_selectivity and
 network_adjacent_selectivity, should follow the naming conventions of the
 other selectivity estimation functions. They are named things like
 tsmatchsel, tsmatchjoinsel, and rangesel.

I will rename it to inetoverlapsel, then.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-01-19 Thread Andreas Karlsson

On 01/19/2014 11:10 AM, Emre Hasegeli wrote:

I am not convinced an adjacent operator is useful for the inet type, but if
it is included it should be indexed just like -|- of ranges. We should try
to keep these lists of indexed operators the same.


I added it just not to leave negotor field empty. It can also be useful with
exclusion constraints but not with GiST support. I did not add GiST support
for it and the not equals operator because they do not fit the index
structure. I can just remove the operator for now.


Sounds good. None of the other  implementations have a negator.


I think it would be fine just to add the newly indexed operators here, but
the more indexed operators we get in the core the less useful this test
becomes.


I did not add the new operators to the list because I do not feel right
about supporting , =, , = symbols for these operators.
They should be @, @=, @, @= to be consistent with other types.


I agree, but I am not sure if it is ok to break backward compatibility here.


-- Check that all opclass search operators have selectivity estimators.

Fails due to missing selectivity functions for the operators. I think you
should at least for now do like the range types and use areajoinsel,
contjoinsel, etc for the join selectivity estimation.


I did not support them in the first version because I could not decide
the way. I have better options than using the the geo_selfuncs for these
operators. The options are from simple to complex:

0) just use network_overlap_selectivity
1) fix network_overlap_selectivity with a constant between 0 and 1
2) calculate this constant by using masklens of all buckets of the histogram
3) calculate this constant by using masklens of matching buckets of
the histogram
4) store STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM for this
types, calculate this constant with it
5) create another kind of histogram for masklens, calculate this constant
with it

I do not know how to do 4 or 5, so I will try 3 for the next version. Do you
think it is a good idea?


Solutions 0 and 1 does not really sound better than using geo_selfuncs.

--
Andreas Karlsson


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-01-19 Thread Andreas Karlsson

Here comes part 2 of 2 of the review.

inet-gist
-

In general the code looks good and I think your approach makes sense. 
Not an expert on GiST though so I would like a second opinion on this. 
Maybe there is some clever trick which would make the index more 
efficient, but the design I see is simple and logical. Like this much 
more than the incorrect GiST index for inet in btree_gist.


The only thing is that I think your index would do poorly on the case 
where all values share a prefix before the netmask but have different 
values after the netmask, since gist_union and gist_penalty does not 
care about the bits after the netmask. Am I correct? Have you thought 
anything about this?


To create such data:

CREATE TABLE a (ip inet);
INSERT INTO a SELECT '127.0.0.0/16'::inet + s FROM generate_series(1, 
pow(2, 16)::int) s;

CREATE INDEX aidx ON a USING gist (ip);

Saw also some minor things too.

Typo in comment, weird sentence:

 * The main question to calculate the union is that they have how many
 * bits in common. [...]

I do not like how you called the result key i inet_union_gist() tmp. 
Something like result or result_ip would be better.


Typo in comment, reverse should be inverse:

 * Penalty is reverse of the common IP bits of the two addresses.

Typo in comment, missing be:

 * Values bigger than 1 are used when the common IP bits cannot
 * calculated.

Language can be improved in this comment. Both ways to split are by the 
union of the keys, it is more of a split by address prefix.


 * The second and the important way is to split by the union of the keys.
 * Union of the keys calculated same way with the inet_gist_union function.
 * The first and the last biggest subnets created from the calculated
 * union. In this case addresses contained by the first subnet will be put
 * to the left bucket, address contained by the last subnet will be put to
 * the right bucket.

Could you not just use memcmp in inet_gist_same() instead of bitncmp() 
since it only cares about equality?


There is an extra empty line at the end of network_gist.c

inet-selfuncs
-

Here I have to honestly admit I am not sure if I can tell if your 
selectivity function is correct. Your explanation sounds convincing and 
the general idea of looking at the histogram is correct. I am just have 
no idea if the details are right.


DEFAULT_NETWORK_OVERLAP_SELECTIVITY should be renamed 
DEFAULT_NETWORK_OVERLAP_SEL and moved to the .c file.


Typo in comment, constrant - constant.

There is an extra empty line at the end of network_selfuncs.c

--
Andreas Karlsson


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-01-18 Thread Andreas Karlsson

Hi,

I will review your two patches (gist support + selectivity). This is 
part 1 of my review. I will look more into the actual GiST 
implementation in a couple of days, but thought I could provide you with 
my initial input right away.


inet-gist
-

General:

I like the idea of the patch and think the  operator is useful for 
exclusion constraints, and that indexing the contains operator is useful 
for IP-address lookups. There is an extension, ip4r, which adds a GiST 
indexed type for IP ranges but since we have the inet type in core I 
think it should have GiST indexes.


I am not convinced an adjacent operator is useful for the inet type, but 
if it is included it should be indexed just like -|- of ranges. We 
should try to keep these lists of indexed operators the same.


Compilation:

Compiled without errors.

Regression tests:

One of the broken regression tests seems unrelated to the selectivity 
functions.


-- Make a list of all the distinct operator names being used in particular
-- strategy slots.

I think it would be fine just to add the newly indexed operators here, 
but the more indexed operators we get in the core the less useful this 
test becomes.


Source code:

The is adjacent to operator, -|-, does not seem to be doing the right 
thing. In the code it is implemented as the inverse of  which is not 
true. The below comparison should return false.


SELECT '10.0.0.1'::inet -|- '127.0.0.1'::inet;
 ?column?
--
 t
(1 row)

I am a bit suspicious about your memcmp based optimization in 
bitncommon, but it could be good. Have you benchmarked it compared to 
doing the same thing with a loop?


Documentation:

Please use examples in the operator table which will evaluate to true, 
and for the  case an example where not both sides are the same.


I have not found a place either in the documentation where it is 
documented which operators are documented. I would suggest just adding a 
short note after the operators table.


inet-selfuncs
-

Compilation:

There were some new warnings caused by this patch (with gcc 4.8.2). The 
warnings were all about the use of uninitialized variables (right, 
right_min_bits, right_order) in inet_hist_overlap_selectivity. Looking 
at the code I see that they are harmless but you should still rewrite 
the loop to silence the warnings.


Regression tests:

There are two broken

-- Insist that all built-in pg_proc entries have descriptions

Here you should just add descriptions for the new functions in pg_proc.h.

-- Check that all opclass search operators have selectivity estimators.

Fails due to missing selectivity functions for the operators. I think 
you should at least for now do like the range types and use areajoinsel, 
contjoinsel, etc for the join selectivity estimation.


Source code:

The selectivity estimation functions, network_overlap_selectivity and 
network_adjacent_selectivity, should follow the naming conventions of 
the other selectivity estimation functions. They are named things like 
tsmatchsel, tsmatchjoinsel, and rangesel.


--
Andreas Karlsson


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-01-05 Thread Emre Hasegeli
2013-12-17 Emre Hasegeli e...@hasegeli.com:
 Query planner never chooses to use the index for the operators which
 the index is particularly useful because selectivity estimation functions
 are missing. I am planning to work on them.

Attached patch adds selectivity estimation functions for the overlap and
adjacent operators. Other operators need a bit more work. I want to send it
before to get some feedback.


inet-selfuncs-v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2013-12-22 Thread David Fetter
On Tue, Dec 17, 2013 at 08:58:13PM +0200, Emre Hasegeli wrote:
 Hi,
 
 Attached patch adds GiST support to the inet datatypes with two new
 operators. Overlaps operator can be used with exclusion constraints.
 Is adjacent to operator is just the negator of it. Index uses only
 the network bits of the addresses. Except for the new operators and
 is contained within, contains; basic comparison operators are also
 supported.

Please add this patch to the upcoming Commitfest at
https://commitfest.postgresql.org/action/commitfest_view?id=21

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers