I am trying to understand the rangetypes spgist code and its interaction
with empty ranges. (Slightly embarrassing, because I reviewed the code.)
I see an old email here:
http://www.postgresql.org/message-id/[email protected]
But still don't have a clear picture.
What I don't understand is:
1. Under what conditions might a tuple have no prefix (centroid), yet
also *not* be allTheSame?
2. Why would any tuple have 2 nodes? If there are some non-empty ranges,
why not make a centroid and have 4 or 5 nodes?
I added a bunch of assertions that seem reasonable to me based on my
understanding of the structure, and I attached them as a patch. They all
pass regression, which has a non-trivial set of input ranges, so there
is a reasonable chance the assertions are generally true. But if they
are true, some refactoring is in order.
It seems like the structure should be something like:
* Empty and non-empty ranges are only mixed in the root (level == 0).
* If a tuple has any non-empty ranges, there's a prefix (centroid).
AllTheSame may or may not be set (ordinarily not).
* If a tuple has only empty ranges, there's no prefix, and allTheSame is
set.
* The root tuple may have 5 nodes if there are any non-empty ranges,
otherwise 1 node (and allTheSame is set).
* Inner tuples may have either:
- one node if it contains only empty ranges, in which case it is
allTheSame, and must be in the 5th node (quadrant) of the root node, and
be at level 1
- four nodes if there are any non-empty ranges, in which case it has
*no* empty ranges
Note: I am using "allTheSame" in the logical sense; perhaps the flag is
an internal optimization that we can't rely on.
Regards,
Jeff Davis
*** a/src/backend/utils/adt/rangetypes_spgist.c
--- b/src/backend/utils/adt/rangetypes_spgist.c
***************
*** 104,109 **** getQuadrant(TypeCacheEntry *typcache, RangeType *centroid, RangeType *tst)
--- 104,112 ----
range_deserialize(typcache, centroid, ¢roidLower, ¢roidUpper,
¢roidEmpty);
+
+ Assert(!centroidEmpty);
+
range_deserialize(typcache, tst, &lower, &upper, &empty);
if (empty)
***************
*** 138,143 **** spg_range_quad_choose(PG_FUNCTION_ARGS)
--- 141,158 ----
int16 quadrant;
TypeCacheEntry *typcache;
+ /* if there is no prefix, they must be all the same */
+ Assert(in->hasPrefix || in->allTheSame);
+ /*
+ * If there are two nodes, it must either be the root or at level 1, and
+ * there must be no prefix.
+ */
+ Assert(in->nNodes != 2 ||
+ (!in->hasPrefix && (in->level == 0 || in->level == 1)));
+ /* if there are 4 or 5 nodes, there must be a prefix */
+ Assert(in->nNodes != 4 || in->hasPrefix);
+ Assert(in->nNodes != 5 || (in->hasPrefix && in->level == 0));
+
if (in->allTheSame)
{
out->resultType = spgMatchNode;
***************
*** 156,161 **** spg_range_quad_choose(PG_FUNCTION_ARGS)
--- 171,178 ----
*/
if (!in->hasPrefix)
{
+ Assert(false);
+
out->resultType = spgMatchNode;
if (RangeIsEmpty(inRange))
out->result.matchNode.nodeN = 0;
***************
*** 232,237 **** spg_range_quad_picksplit(PG_FUNCTION_ARGS)
--- 249,261 ----
nonEmptyCount = j;
/*
+ * Only the root mixes empty and non-empty tuples.
+ */
+ Assert(nonEmptyCount == 0 ||
+ nonEmptyCount == in->nTuples ||
+ in->level == 0);
+
+ /*
* All the ranges are empty. The best we can do is to construct an inner
* node with no centroid, and put all ranges into node 0. If non-empty
* ranges are added later, they will be routed to node 1.
***************
*** 315,320 **** spg_range_quad_inner_consistent(PG_FUNCTION_ARGS)
--- 339,356 ----
*/
bool needPrevious = false;
+ /* if there is no prefix, they must be all the same */
+ Assert(in->hasPrefix || in->allTheSame);
+ /*
+ * If there are two nodes, it must either be the root or at level 1, and
+ * there must be no prefix.
+ */
+ Assert(in->nNodes != 2 ||
+ (!in->hasPrefix && (in->level == 0 || in->level == 1)));
+ /* if there are 4 or 5 nodes, there must be a prefix */
+ Assert(in->nNodes != 4 || in->hasPrefix);
+ Assert(in->nNodes != 5 || (in->hasPrefix && in->level == 0));
+
if (in->allTheSame)
{
/* Report that all nodes should be visited */
***************
*** 327,332 **** spg_range_quad_inner_consistent(PG_FUNCTION_ARGS)
--- 363,370 ----
if (!in->hasPrefix)
{
+ Assert(false);
+
/*
* No centroid on this inner node. Such a node has two child nodes,
* the first for empty ranges, and the second for non-empty ones.
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers