Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-10-10 Thread Michael Paquier
On Tue, Oct 10, 2023 at 10:51:10AM -0400, Evan Jones wrote: > Here is a quick demonstration of this issue, showing that the quoting > behavior is different between these two. Mac OS X with the "default" locale > includes quotes because ą includes 0x85 in its UTF-8 encoding: Ugh. rowtypes.c has r

Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-10-10 Thread Evan Jones
Thanks for bringing this up! I just looked at the uses if isspace() in that file. It looks like it is the usual thing: it is allowing leading or trailing whitespace when parsing values, or for this "needs quoting" logic on output. The fix would be the same: this *should* be using scanner_isspace. T

Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-10-09 Thread Thomas Munro
FTR I ran into a benign case of the phenomenon in this thread when dealing with row types. In rowtypes.c, we double-quote stuff containing spaces, but we detect them by passing individual bytes of UTF-8 sequences to isspace(). Like macOS, Windows thinks that 0xa0 is a space when you do that, so f

Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-20 Thread Michael Paquier
On Tue, Jun 20, 2023 at 11:39:31PM -0400, Tom Lane wrote: > I'd be okay with adding \v to the set of whitespace characters in > scan.l and scanner_isspace (and other affected places) for v17. > Don't want to back-patch it though. Okay. No idea where this will lead, but for now I have sent a patch

Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-20 Thread Tom Lane
Michael Paquier writes: > As a whole, I'd like to think that this is an improvement even for > stable branches with these weird isspace() handlings, so I'm OK with > the current status in all the branches. Sounds like we're all content with that. > There's an argument about \v, > IMO, but I won'

Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-20 Thread Michael Paquier
On Tue, Jun 20, 2023 at 09:04:26AM -0400, Evan Jones wrote: > My one last suggestion: We *could* revert the backpatching if we are > concerned about this change, but I'm not personally sure that is necessary. > As we discussed, this is an unusual corner case in an "extension" type that > many won't

Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-20 Thread Evan Jones
Thanks for the detailed discussion. To confirm that I've understood everything: * Michael's proposed patch to add hstore_isspace() would be a potential fix: it resolves my original bug, and does not change the behavior of '\v'. * We believe the change to '\v' is not a problem, and may be an improv

Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-19 Thread Michael Paquier
On Sun, Jun 18, 2023 at 09:10:59PM -0400, Tom Lane wrote: > What have you got in mind? We should already have validated encoding > correctness before the text ever gets to hstore_in, and I'm not clear > what additional checks would be useful. I was staring at the hstore parsing code and got the i

Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-18 Thread Tom Lane
Michael Paquier writes: > Another thing that I was wondering, though.. Do you think that there > would be an argument in being stricter in the hstore code regarding > the handling of multi-byte characters with some checks based on > IS_HIGHBIT_SET() when parsing the keys and values? What have yo

Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-18 Thread Michael Paquier
On Sun, Jun 18, 2023 at 12:38:12PM -0400, Tom Lane wrote: > FWIW, I think the status quo is fine. Having hstore do something that > is neither its historical behavior nor aligned with the core parser > doesn't seem like a great idea. Okay. Fine by me. > I don't buy this argument that > somebody

Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-18 Thread Tom Lane
Michael Paquier writes: > At the end, no need to do that. I have been able to hack the > attached, that shows the difference of treatment for \v when running > in macOS. Evan, what do you think? FWIW, I think the status quo is fine. Having hstore do something that is neither its historical beh

Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-18 Thread Michael Paquier
On Sun, Jun 18, 2023 at 10:50:16AM +0900, Michael Paquier wrote: > The difference between scanner_isspace() and array_isspace() is that > the former matches with what scan.l stores as rules for whitespace > characters, but the latter works on values. For hstore, we want the > latter, with somethin

Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-17 Thread Michael Paquier
On Sat, Jun 17, 2023 at 10:57:05AM -0400, Evan Jones wrote: > However, if we think this change could be a problem, one fix would be to > switch scanner_isspace() to array_isspace(), which returns true for these > *six* ASCII characters. I am happy to submit a patch to do this. The difference betwe

Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-17 Thread Evan Jones
Unfortunately I just noticed a possible "bug" with this change. The scanner_isspace() function only recognizes *five* ASCII space characters: ' ' \t \n \r \f. It *excludes* VTAB \v, which the C standard function isspace() includes. This means this patch changed the behavior of hstore parsing for so

Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-11 Thread Michael Paquier
On Tue, Jun 06, 2023 at 10:16:09AM -0400, Evan Jones wrote: > I did a quick look at the places found with "git grep isspace" yesterday. I > agree with the comment from commit 9ae2661: "I've left alone isspace() > calls in places that aren't really expecting any non-ASCII input > characters, such as

Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-06 Thread Evan Jones
On Tue, Jun 6, 2023 at 7:37 AM Michael Paquier wrote: > Indeed. It looks like 9ae2661 missed this spot. > I didn't think to look for a previous fix, thanks for finding this commit id! I did a quick look at the places found with "git grep isspace" yesterday. I agree with the comment from commit

Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-06 Thread Michael Paquier
On Mon, Jun 05, 2023 at 11:26:56AM -0400, Evan Jones wrote: > This patch fixes a rare parsing bug with unicode characters on Mac OS X. > The problem is that isspace() on Mac OS X changes its behaviour with the > locale. Use scanner_isspace instead, which only returns true for ASCII > whitespace. It

[PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-05 Thread Evan Jones
This patch fixes a rare parsing bug with unicode characters on Mac OS X. The problem is that isspace() on Mac OS X changes its behaviour with the locale. Use scanner_isspace instead, which only returns true for ASCII whitespace. It appears other places in the Postgres code have already run into thi