On Sun, Jan 16, 2022 at 03:41:24PM +0000, Gavin Smith wrote:
> On Sun, Jan 16, 2022 at 04:13:22PM +0100, [email protected] wrote:
> > Actually, there is a difference between the C parser and the Perl
> > parser: the Perl parser considers those spaces as space too, which is
> > not surprising given that \s is used a lot in regexps, where the C
> > parser probably checks the characters... I will investigate.
>
> The change didn't touch the parser, only HTML.pm, so it's unexpected
> that there should be a difference there.

There is a difference because the Perl parser was always incorrect (at
least since \s started to include non-ASCII spaces). In the XS parser,
whitespace characters are listed explicitly; in the Perl parser it is up
to Perl's definition of \s, which nowadays is in general different unless
/a is used (if I understand correctly).

So the Perl parser has been incorrect for some time, but no example
showed a difference until now, since this is the first example with a
non-ASCII space. Non-ASCII spaces here and there would show the
difference between the Perl parser and the XS parser.

Anyway, I will commit a fix shortly.

--
Pat
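
The mismatch described above can be sketched outside of Texinfo. The following is only an analogy in Python, not code from either parser: Python's re.ASCII flag plays roughly the role that /a plays in Perl, and EXPLICIT_WHITESPACE is a hypothetical stand-in for the list hard-coded in the XS parser (the real list may differ). The function names are made up for illustration.

```python
import re

# Hypothetical explicit whitespace list, standing in for the one
# hard-coded in the XS parser. The real list may differ.
EXPLICIT_WHITESPACE = " \t\n\r\f\v"


def leading_space_explicit(text):
    """Count leading characters that appear in the explicit list."""
    count = 0
    for ch in text:
        if ch not in EXPLICIT_WHITESPACE:
            break
        count += 1
    return count


def leading_space_regex(text, ascii_only):
    """Count leading characters matched by \\s.

    With ascii_only=False, \\s follows the Unicode definition of
    whitespace; with ascii_only=True (re.ASCII) it is restricted to
    ASCII whitespace, similar in spirit to Perl's /a modifier.
    """
    flags = re.ASCII if ascii_only else 0
    return len(re.match(r"\s*", text, flags).group(0))


# An ASCII space followed by U+00A0 NO-BREAK SPACE, then a word.
sample = " \u00a0word"

print(leading_space_explicit(sample))                 # explicit list
print(leading_space_regex(sample, ascii_only=False))  # Unicode-aware \s
print(leading_space_regex(sample, ascii_only=True))   # ASCII-only \s
```

On input without non-ASCII spaces the three counts agree, which is consistent with the discrepancy going unnoticed until the first example containing such a space.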
