Re: [HACKERS] GiST support for inet datatypes

Andreas Karlsson Sun, 19 Jan 2014 15:18:31 -0800

Here comes part 2 of 2 of the review.

inet-gist
---------

In general the code looks good and I think your approach makes sense.Not an expert on GiST though so I would like a second opinion on this.Maybe there is some clever trick which would make the index moreefficient, but the design I see is simple and logical. Like this muchmore than the incorrect GiST index for inet in btree_gist.

The only thing is that I think your index would do poorly on the casewhere all values share a prefix before the netmask but have differentvalues after the netmask, since gist_union and gist_penalty does notcare about the bits after the netmask. Am I correct? Have you thoughtanything about this?


To create such data:

CREATE TABLE a (ip inet);

INSERT INTO a SELECT '127.0.0.0/16'::inet + s FROM generate_series(1,pow(2, 16)::int) s;

CREATE INDEX aidx ON a USING gist (ip);

Saw also some minor things too.

Typo in comment, weird sentence:

 * The main question to calculate the union is that they have how many
 * bits in common. [...]

I do not like how you called the result key i inet_union_gist() "tmp".Something like "result" or "result_ip" would be better.


Typo in comment, "reverse" should be "inverse":

 * Penalty is reverse of the common IP bits of the two addresses.

Typo in comment, missing "be":

 * Values bigger than 1 are used when the common IP bits cannot
 * calculated.

Language can be improved in this comment. Both ways to split are by theunion of the keys, it is more of a split by address prefix.


 * The second and the important way is to split by the union of the keys.
 * Union of the keys calculated same way with the inet_gist_union function.
 * The first and the last biggest subnets created from the calculated
 * union. In this case addresses contained by the first subnet will be put
 * to the left bucket, address contained by the last subnet will be put to
 * the right bucket.

Could you not just use memcmp in inet_gist_same() instead of bitncmp()since it only cares about equality?


There is an extra empty line at the end of network_gist.c

inet-selfuncs
-------------

Here I have to honestly admit I am not sure if I can tell if yourselectivity function is correct. Your explanation sounds convincing andthe general idea of looking at the histogram is correct. I am just haveno idea if the details are right.

DEFAULT_NETWORK_OVERLAP_SELECTIVITY should be renamedDEFAULT_NETWORK_OVERLAP_SEL and moved to the .c file.


Typo in comment, "constrant" -> "constant".

There is an extra empty line at the end of network_selfuncs.c

--
Andreas Karlsson


--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] GiST support for inet datatypes

Reply via email to