On Tue, Oct 10, 2023 at 10:51:10AM -0400, Evan Jones wrote:
> Here is a quick demonstration of this issue, showing that the quoting
> behavior is different between these two. Mac OS X with the "default" locale
> includes quotes because ą includes 0x85 in its UTF-8 encoding:
Ugh. rowtypes.c has r
Thanks for bringing this up! I just looked at the uses if isspace() in that
file. It looks like it is the usual thing: it is allowing leading or
trailing whitespace when parsing values, or for this "needs quoting" logic
on output. The fix would be the same: this *should* be
using scanner_isspace. T
FTR I ran into a benign case of the phenomenon in this thread when
dealing with row types. In rowtypes.c, we double-quote stuff
containing spaces, but we detect them by passing individual bytes of
UTF-8 sequences to isspace(). Like macOS, Windows thinks that 0xa0 is
a space when you do that, so f
On Tue, Jun 20, 2023 at 11:39:31PM -0400, Tom Lane wrote:
> I'd be okay with adding \v to the set of whitespace characters in
> scan.l and scanner_isspace (and other affected places) for v17.
> Don't want to back-patch it though.
Okay. No idea where this will lead, but for now I have sent a patch
Michael Paquier writes:
> As a whole, I'd like to think that this is an improvement even for
> stable branches with these weird isspace() handlings, so I'm OK with
> the current status in all the branches.
Sounds like we're all content with that.
> There's an argument about \v,
> IMO, but I won'
On Tue, Jun 20, 2023 at 09:04:26AM -0400, Evan Jones wrote:
> My one last suggestion: We *could* revert the backpatching if we are
> concerned about this change, but I'm not personally sure that is necessary.
> As we discussed, this is an unusual corner case in an "extension" type that
> many won't
Thanks for the detailed discussion. To confirm that I've understood
everything:
* Michael's proposed patch to add hstore_isspace() would be a potential
fix: it resolves my original bug, and does not change the behavior of '\v'.
* We believe the change to '\v' is not a problem, and may be an improv
On Sun, Jun 18, 2023 at 09:10:59PM -0400, Tom Lane wrote:
> What have you got in mind? We should already have validated encoding
> correctness before the text ever gets to hstore_in, and I'm not clear
> what additional checks would be useful.
I was staring at the hstore parsing code and got the i
Michael Paquier writes:
> Another thing that I was wondering, though.. Do you think that there
> would be an argument in being stricter in the hstore code regarding
> the handling of multi-byte characters with some checks based on
> IS_HIGHBIT_SET() when parsing the keys and values?
What have yo
On Sun, Jun 18, 2023 at 12:38:12PM -0400, Tom Lane wrote:
> FWIW, I think the status quo is fine. Having hstore do something that
> is neither its historical behavior nor aligned with the core parser
> doesn't seem like a great idea.
Okay. Fine by me.
> I don't buy this argument that
> somebody
Michael Paquier writes:
> At the end, no need to do that. I have been able to hack the
> attached, that shows the difference of treatment for \v when running
> in macOS. Evan, what do you think?
FWIW, I think the status quo is fine. Having hstore do something that
is neither its historical beh
On Sun, Jun 18, 2023 at 10:50:16AM +0900, Michael Paquier wrote:
> The difference between scanner_isspace() and array_isspace() is that
> the former matches with what scan.l stores as rules for whitespace
> characters, but the latter works on values. For hstore, we want the
> latter, with somethin
On Sat, Jun 17, 2023 at 10:57:05AM -0400, Evan Jones wrote:
> However, if we think this change could be a problem, one fix would be to
> switch scanner_isspace() to array_isspace(), which returns true for these
> *six* ASCII characters. I am happy to submit a patch to do this.
The difference betwe
Unfortunately I just noticed a possible "bug" with this change. The
scanner_isspace() function only recognizes *five* ASCII space characters: '
' \t \n \r \f. It *excludes* VTAB \v, which the C standard function
isspace() includes. This means this patch changed the behavior of hstore
parsing for so
On Tue, Jun 06, 2023 at 10:16:09AM -0400, Evan Jones wrote:
> I did a quick look at the places found with "git grep isspace" yesterday. I
> agree with the comment from commit 9ae2661: "I've left alone isspace()
> calls in places that aren't really expecting any non-ASCII input
> characters, such as
On Tue, Jun 6, 2023 at 7:37 AM Michael Paquier wrote:
> Indeed. It looks like 9ae2661 missed this spot.
>
I didn't think to look for a previous fix, thanks for finding this commit
id!
I did a quick look at the places found with "git grep isspace" yesterday. I
agree with the comment from commit
On Mon, Jun 05, 2023 at 11:26:56AM -0400, Evan Jones wrote:
> This patch fixes a rare parsing bug with unicode characters on Mac OS X.
> The problem is that isspace() on Mac OS X changes its behaviour with the
> locale. Use scanner_isspace instead, which only returns true for ASCII
> whitespace. It
This patch fixes a rare parsing bug with unicode characters on Mac OS X.
The problem is that isspace() on Mac OS X changes its behaviour with the
locale. Use scanner_isspace instead, which only returns true for ASCII
whitespace. It appears other places in the Postgres code have already run
into thi
18 matches
Mail list logo