> On Mar 5, 2021, at 11:46 AM, Joel Jacobson <j...@compiler.org> wrote:
> 
> 
> /Joel
> <range.sql><0003-regexp-positions.patch>

I did a bit more testing:

+SELECT regexp_positions('foobarbequebaz', 'b', 'g');
+ regexp_positions 
+------------------
+ {"[3,5)"}
+ {"[6,8)"}
+ {"[11,13)"}
+(3 rows)
+

I understand that these ranges are intended to be read as one-character 
matches starting at positions 3, 6, and 11, but they look like they match 
either two or three characters, depending on whether you read the upper bound 
as exclusive or inclusive, and users will likely be confused by that.
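
To make the two readings concrete, here is a small sketch using only the 
built-in range functions (nothing from the patch itself). The literal '[3,5)' 
is copied from the output above; the inclusive form '[3,5]' is only my guess 
at how a user might misread it.

```sql
-- Half-open reading: the range covers offsets 3 and 4, i.e. two characters.
SELECT upper('[3,5)'::int4range) - lower('[3,5)'::int4range);  -- 2

-- Inclusive misreading: offsets 3, 4 and 5, i.e. three characters.
-- PostgreSQL canonicalizes the inclusive form to a half-open one.
SELECT '[3,5]'::int4range;                                     -- [3,6)
```

Neither reading gives the one-character match that was actually found.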

+SELECT regexp_positions('foobarbequebaz', '(?=beque)', 'g');
+ regexp_positions 
+------------------
+ {"[6,7)"}
+(1 row)
+

This is a zero-length match.  As above, it might be confusing that a 
zero-length match is reported as a range that appears to span one character.

+SELECT regexp_positions('foobarbequebaz', '(?<=z)', 'g');
+ regexp_positions 
+------------------
+ {"[14,15)"}
+(1 row)
+

Same here, except this time position 15 is referenced, which is beyond the end 
of the string.

I think a zero-length match at the end of this string should read as 
{"[14,14)"}.  You have been forced to add one to avoid that collapsing down 
to "empty", but I'd rather you found a different datatype than abuse the 
definition of int4range.
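
For reference, the collapse I mean is ordinary int4range behavior and can be 
seen without the patch. This is just a sketch with the built-in constructor 
and accessor functions:

```sql
-- A half-open range with equal bounds contains no integers, so it is
-- normalized to the special value 'empty' and the position 14 is lost.
SELECT int4range(14, 14);           -- empty
SELECT isempty(int4range(14, 14));  -- t
SELECT lower(int4range(14, 14));    -- NULL

-- By contrast, a one-character match at offset 3 fits the type naturally.
SELECT int4range(3, 4);             -- [3,4)
```

So an honest [start,end) encoding cannot say where a zero-length match 
occurred, which is why the type seems like a poor fit here.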

It seems that you may have reached a similar conclusion down-thread?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




