On Tue, Mar 2, 2021, at 01:12, Mark Dilger wrote: > I like the idea so I did a bit of testing. I think the following should not > error, but does: > > +SELECT regexp_positions('foObARbEqUEbAz', $re$(?=beque)$re$, 'i'); > +ERROR: range lower bound must be less than or equal to range upper bound
Thanks for testing! The bug is due to using an (inclusive,inclusive) range, so for the example the code tried to construct a (7,6,'[]') range. When trying to fix, I think I've found a general problem with ranges: I'll use int4range() to demonstrate the problem: First the expected error for what the patch tries to do internally using make_range(). This is all good: # SELECT int4range(7,6,'[]'); ERROR: range lower bound must be less than or equal to range upper bound I tried to fix this like this: @ src/backend/utils/adt/regexp.c - lower.val = Int32GetDatum(so + 1); + lower.val = Int32GetDatum(so); lower.infinite = false; - lower.inclusive = true; + lower.inclusive = false; lower.lower = true; Which would give the same result as doing: SELECT int4range(6,6,'(]'); int4range ----------- empty (1 row) Hmm. This "empty" value what surprise to me. I would instead have assumed the canonical form "[7,7)". If I wanted to know if the range is empty or not, I would have guessed I should use the isempty() function, like this: SELECT isempty(int4range(6,6,'(]')); isempty --------- t (1 row) Since we have this isempty() function, I don't see the point of discarding the lower/upper vals, since they contain possibly interesting information on where the empty range exists. I find it strange two ranges of zero-length with different bounds are considered equal: SELECT '[7,7)'::int4range = '[8,8)'::int4range; ?column? ---------- t (1 row) This seems like a bug to me. What am I missing here? Unless fixed, then the way I see it, I don't think we can use int4range[] for regexp_positions(), if we want to allow returning the positions for zero-length matches, which would be nice. /Joel