I am trying to understand the rangetypes spgist code and its interaction
with empty ranges. (Slightly embarrassing, because I reviewed the code.)
I see an old email here:
http://www.postgresql.org/message-id/50145a9c.7080...@enterprisedb.com
But still don't have a clear picture.
What I don't understand is:
1. Under what conditions might a tuple have no prefix (centroid), yet
also *not* be allTheSame?
2. Why would any tuple have 2 nodes? If there are some non-empty ranges,
why not make a centroid and have 4 or 5 nodes?
I added a bunch of assertions that seem reasonable to me based on my
understanding of the structure, and I attached them as a patch. They all
pass regression, which has a non-trivial set of input ranges, so there
is a reasonable chance the assertions are generally true. But if they
are true, some refactoring is in order.
It seems like the structure should be something like:
* Empty and non-empty ranges are only mixed in the root (level == 0).
* If a tuple has any non-empty ranges, there's a prefix (centroid).
AllTheSame may or may not be set (ordinarily not).
* If a tuple has only empty ranges, there's no prefix, and allTheSame is
set.
* The root tuple may have 5 nodes if there are any non-empty ranges,
otherwise 1 node (and allTheSame is set).
* Inner tuples may have either:
- one node if it contains only empty ranges, in which case it is
allTheSame, and must be in the 5th node (quadrant) of the root node, and
be at level 1
- four nodes if there are any non-empty ranges, in which case it has
*no* empty ranges
Note: I am using allTheSame in the logical sense; perhaps the flag is
an internal optimization that we can't rely on.
Regards,
Jeff Davis
*** a/src/backend/utils/adt/rangetypes_spgist.c
--- b/src/backend/utils/adt/rangetypes_spgist.c
***
*** 104,109 getQuadrant(TypeCacheEntry *typcache, RangeType *centroid, RangeType *tst)
--- 104,112
range_deserialize(typcache, centroid, centroidLower, centroidUpper,
centroidEmpty);
+
+ Assert(!centroidEmpty);
+
range_deserialize(typcache, tst, lower, upper, empty);
if (empty)
***
*** 138,143 spg_range_quad_choose(PG_FUNCTION_ARGS)
--- 141,158
int16 quadrant;
TypeCacheEntry *typcache;
+ /* if there is no prefix, they must be all the same */
+ Assert(in-hasPrefix || in-allTheSame);
+ /*
+ * If there are two nodes, it must either be the root or at level 1, and
+ * there must be no prefix.
+ */
+ Assert(in-nNodes != 2 ||
+ (!in-hasPrefix (in-level == 0 || in-level == 1)));
+ /* if there are 4 or 5 nodes, there must be a prefix */
+ Assert(in-nNodes != 4 || in-hasPrefix);
+ Assert(in-nNodes != 5 || (in-hasPrefix in-level == 0));
+
if (in-allTheSame)
{
out-resultType = spgMatchNode;
***
*** 156,161 spg_range_quad_choose(PG_FUNCTION_ARGS)
--- 171,178
*/
if (!in-hasPrefix)
{
+ Assert(false);
+
out-resultType = spgMatchNode;
if (RangeIsEmpty(inRange))
out-result.matchNode.nodeN = 0;
***
*** 232,237 spg_range_quad_picksplit(PG_FUNCTION_ARGS)
--- 249,261
nonEmptyCount = j;
/*
+ * Only the root mixes empty and non-empty tuples.
+ */
+ Assert(nonEmptyCount == 0 ||
+ nonEmptyCount == in-nTuples ||
+ in-level == 0);
+
+ /*
* All the ranges are empty. The best we can do is to construct an inner
* node with no centroid, and put all ranges into node 0. If non-empty
* ranges are added later, they will be routed to node 1.
***
*** 315,320 spg_range_quad_inner_consistent(PG_FUNCTION_ARGS)
--- 339,356
*/
bool needPrevious = false;
+ /* if there is no prefix, they must be all the same */
+ Assert(in-hasPrefix || in-allTheSame);
+ /*
+ * If there are two nodes, it must either be the root or at level 1, and
+ * there must be no prefix.
+ */
+ Assert(in-nNodes != 2 ||
+ (!in-hasPrefix (in-level == 0 || in-level == 1)));
+ /* if there are 4 or 5 nodes, there must be a prefix */
+ Assert(in-nNodes != 4 || in-hasPrefix);
+ Assert(in-nNodes != 5 || (in-hasPrefix in-level == 0));
+
if (in-allTheSame)
{
/* Report that all nodes should be visited */
***
*** 327,332 spg_range_quad_inner_consistent(PG_FUNCTION_ARGS)
--- 363,370
if (!in-hasPrefix)
{
+ Assert(false);
+
/*
* No centroid on this inner node. Such a node has two child nodes,
* the first for empty ranges, and the second for non-empty ones.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers