Hi Tom,

> Perhaps the docs are a bit unclear about that, but it's not
> restricted to ASCII alphanumerics.  AFAICS the code will accept
> whatever iswalpha() and iswdigit() will accept in the database's
> default locale.

Sorry, but I don't think that is correct. Here is the single
definition of what constitutes a valid character:
https://github.com/postgres/postgres/blob/c3315a7da57be720222b119385ed0f7ad7c15268/contrib/ltree/ltree.h#L129

As you can see, there are no `is_*` calls at all. Where in this contrib
package do you see `iswalpha`? Perhaps I missed it.
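
To make the disagreement concrete, here is a minimal sketch of the two
readings of "valid label character". This is not the ltree source: both
function names are hypothetical, and the second one only assumes the
standard `iswalpha()`/`iswdigit()` behaviour you describe, whose result
depends on the active `LC_CTYPE` locale.

```c
#include <stdbool.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not from ltree): the ASCII-only reading of
 * "alphanumeric characters, plus _".
 */
static bool
label_char_ascii_only(wchar_t c)
{
    return (c >= L'a' && c <= L'z') ||
           (c >= L'A' && c <= L'Z') ||
           (c >= L'0' && c <= L'9') ||
           c == L'_';
}

/*
 * Hypothetical helper (not from ltree): the locale-aware reading.
 * Whether a non-ASCII letter is accepted depends on the locale that
 * was selected with setlocale(LC_CTYPE, ...).
 */
static bool
label_char_locale_aware(wchar_t c)
{
    return iswalpha((wint_t) c) || iswdigit((wint_t) c) || c == L'_';
}
```

Under either reading, `-`, `#` and `;` are rejected; the two only differ
on non-ASCII letters, and then only when the locale classifies them as
alphabetic.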

> That seems really pretty random.

Ok. I am trying to avoid a situation where other users may wish to use
delimiters other than `-`, due to its commonplace presence in words
(e.g., compound ones).

On Wed, Oct 5, 2022 at 2:59 PM Tom Lane <t...@sss.pgh.pa.us> wrote:

> Garen Torikian <gjtorik...@gmail.com> writes:
> > I am submitting a patch to expand the label requirements for ltree.
>
> > The current format is restricted to alphanumeric characters, plus _.
> > Unfortunately, for non-English labels, this set is insufficient.
>
> Hm?  Perhaps the docs are a bit unclear about that, but it's not
> restricted to ASCII alphanumerics.  AFAICS the code will accept
> whatever iswalpha() and iswdigit() will accept in the database's
> default locale.  There's certainly work that could/should be done
> to allow use of not-so-default locales, but that's not specific
> to ltree.  I'm not sure that doing an application-side encoding
> is attractive compared to just using that ability directly.
>
> If you do want to do application-side encoding, I'm unsure why
> punycode would be the choice anyway, as opposed to something
> that can fit in the existing restrictions.
>
> > On top of this, I added support for two more characters: # and ;,
> > which are used for HTML entities.
>
> That seems really pretty random.
>
>                         regards, tom lane
>
