2014-01-19 Andreas Karlsson <andr...@proxel.se>: > Hi, > > I will review your two patches (gist support + selectivity). This is part 1 > of my review. I will look more into the actual GiST implementation in a > couple of days, but thought I could provide you with my initial input right > away.
Thank you for looking at it. > > inet-gist > --------- > > General: > > I like the idea of the patch and think the && operator is useful for > exclusion constraints, and that indexing the contains operator is useful for > IP-address lookups. There is an extension, ip4r, which adds a GiST indexed > type for IP ranges but since we have the inet type in core I think it should > have GiST indexes. > > I am not convinced an adjacent operator is useful for the inet type, but if > it is included it should be indexed just like -|- of ranges. We should try > to keep these lists of indexed operators the same. I added it just not to leave negotor field empty. It can also be useful with exclusion constraints but not with GiST support. I did not add GiST support for it and the not equals operator because they do not fit the index structure. I can just remove the operator for now. > > Compilation: > > Compiled without errors. > > Regression tests: > > One of the broken regression tests seems unrelated to the selectivity > functions. > > -- Make a list of all the distinct operator names being used in particular > -- strategy slots. > > I think it would be fine just to add the newly indexed operators here, but > the more indexed operators we get in the core the less useful this test > becomes. I did not add the new operators to the list because I do not feel right about supporting <<, <<=, >>, >>= symbols for these operators. They should be <@, <@=, @>, @>= to be consistent with other types. > > I am a bit suspicious about your memcmp based optimization in bitncommon, > but it could be good. Have you benchmarked it compared to doing the same > thing with a loop? I did, when I was writing that part. I will be happy to do it again. I will post the results. > > Documentation: > > Please use examples in the operator table which will evaluate to true, and > for the && case an example where not both sides are the same. I will change in the next version. > > I have not found a place either in the documentation where it is documented > which operators are documented. I would suggest just adding a short note > after the operators table. I will add in the next version. > > inet-selfuncs > ------------- > > Compilation: > > There were some new warnings caused by this patch (with gcc 4.8.2). The > warnings were all about the use of uninitialized variables (right, > right_min_bits, right_order) in inet_hist_overlap_selectivity. Looking at > the code I see that they are harmless but you should still rewrite the loop > to silence the warnings. I will fix in the next version. > > Regression tests: > > There are two broken > > -- Insist that all built-in pg_proc entries have descriptions > > Here you should just add descriptions for the new functions in pg_proc.h. I will add in the next version. > > -- Check that all opclass search operators have selectivity estimators. > > Fails due to missing selectivity functions for the operators. I think you > should at least for now do like the range types and use areajoinsel, > contjoinsel, etc for the join selectivity estimation. I did not support them in the first version because I could not decide the way. I have better options than using the the geo_selfuncs for these operators. The options are from simple to complex: 0) just use network_overlap_selectivity 1) fix network_overlap_selectivity with a constant between 0 and 1 2) calculate this constant by using masklens of all buckets of the histogram 3) calculate this constant by using masklens of matching buckets of the histogram 4) store STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM for this types, calculate this constant with it 5) create another kind of histogram for masklens, calculate this constant with it I do not know how to do 4 or 5, so I will try 3 for the next version. Do you think it is a good idea? > > Source code: > > The selectivity estimation functions, network_overlap_selectivity and > network_adjacent_selectivity, should follow the naming conventions of the > other selectivity estimation functions. They are named things like > tsmatchsel, tsmatchjoinsel, and rangesel. I will rename it to inetoverlapsel, then. -- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers