On Wed, 2023-03-08 at 07:21 +0100, Peter Eisentraut wrote:
> On 04.03.23 19:29, Jeff Davis wrote:
> > It looks like the way you've handled this is by inserting the
> > collation with collprovider=icu even if built without ICU support.
> > I think that's a new case, so we need to make sure it throws
> > reasonable user-facing errors.
>
> It would look like this:
>
> => select * from t1 order by b collate unicode;
> ERROR: 0A000: ICU is not supported in this build

Right, the error looks good. I'm just pointing out that before this
patch, having provider='i' in a build without ICU was a configuration
mistake; whereas afterward every database will have a collation with
provider='i' whether it has ICU support or not. I think that's fine,
I'm just double-checking.

Why is "unicode" only provided for the UTF-8 encoding? For "ucs_basic"
that makes some sense, because the implementation only works in UTF-8.
But here we are using ICU, and the "und" locale should work for any
ICU-supported encoding. I suggest that we use collencoding=-1 for
"unicode", and the docs can just add a note next to "ucs_basic" that it
only works for UTF-8, because that's the weird case.

For the docs, I suggest that you clarify that "ucs_basic" has the same
behavior as the C locale does *in the UTF-8 encoding*. Not all users
might pick up on the subtlety that the C locale has different behaviors
in different encodings.

Other than that, it looks good.

--
Jeff Davis
PostgreSQL Contributor Team - AWS