Re: Pre-proposal: unicode normalized text

Peter Eisentraut Mon, 09 Oct 2023 23:47:51 -0700

On 06.10.23 19:22, Jeff Davis wrote:

On Fri, 2023-10-06 at 09:58 +0200, Peter Eisentraut wrote:

If you want to be rigid about it, you also need to consider whether
the Unicode version used by the ICU library in use matches the one
used by the in-core tables.

What problem are you concerned about here? I thought about it and I
didn't see an obvious issue.


If the ICU unicode version is ahead of the Postgres unicode version,
and no unassigned code points are used according to the Postgres
version, then there's no problem.

And in the other direction, there might be some code points that are
assigned according to the postgres unicode version but unassigned
according to the ICU version. But that would be tracked by the
collation version as you pointed out earlier, so upgrading ICU would be
like any other ICU upgrade (with the same risks). Right?

It might be alright in this particular combination of circumstances.
But in general if we rely on these tables for correctness (e.g., check
that a string is normalized before passing it to a function that
requires it to be normalized), we would need to consider this. The
correct fix would then probably be to not use our own tables but use
some ICU function to achieve the desired task.
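
To make the concern concrete, here is a rough sketch in Python. It is
only an illustration: Python's unicodedata module stands in for the
in-core tables, and the helper names (uses_only_assigned,
checked_normalization) are made up for this example. It is not
Postgres or ICU code. The idea is that a normalization check made with
one set of tables can only be trusted by a consumer with different
tables if the string contains no code points that are unassigned
according to the tables doing the check.

```python
import unicodedata


def uses_only_assigned(text: str) -> bool:
    """Return True if no code point in text has general category Cn
    (unassigned or noncharacter) according to the local Unicode tables."""
    return all(unicodedata.category(ch) != "Cn" for ch in text)


def checked_normalization(text: str, form: str = "NFC") -> str:
    """Classify text as 'normalized', 'not normalized', or 'unknown'.

    'unknown' means the local tables cannot vouch for the answer,
    because a newer Unicode version might assign one of the code
    points and give it a decomposition or a combining class.
    """
    if not uses_only_assigned(text):
        return "unknown"
    if unicodedata.is_normalized(form, text):
        return "normalized"
    return "not normalized"


if __name__ == "__main__":
    # The Unicode version of the local tables (the stand-in for the
    # in-core tables in this sketch).
    print("local Unicode version:", unicodedata.unidata_version)
    for sample in ("\u00e9", "e\u0301", "a\u0378"):
        print(ascii(sample), checked_normalization(sample))
```

If the consumer of the string is an ICU function, the more robust
variant would be to ask ICU itself whether the string is normalized,
so that the check and the consumer always agree on the Unicode
version.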
