Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex

2020-07-31 Thread Kyle Evans
On Fri, Jul 31, 2020 at 8:39 AM Li-Wen Hsu  wrote:
>
> On Fri, Jul 31, 2020 at 9:50 AM Kyle Evans  wrote:
> >
> > On Thu, Jul 30, 2020 at 8:47 PM Kyle Evans  wrote:
> > >
> > > On Wed, Jul 29, 2020 at 10:53 PM Li-Wen Hsu  wrote:
> > > >
> > > > On Thu, Jul 30, 2020 at 7:22 AM Kyle Evans  wrote:
> > > > >
> > > > > Author: kevans
> > > > > Date: Wed Jul 29 23:21:56 2020
> > > > > New Revision: 363679
> > > > > URL: https://svnweb.freebsd.org/changeset/base/363679
> > > > >
> > > > > Log:
> > > > >   regex(3): Interpret many escaped ordinary characters as EESCAPE
> > > > >
> > > > >   In IEEE 1003.1-2008 [1] and earlier revisions, BRE/ERE grammar 
> > > > > allows for
> > > > >   any character to be escaped, but "ORD_CHAR preceded by an unescaped
> > > > >   <backslash> character [gives undefined results]".
> > > > >
> > > > >   Historically, we've interpreted an escaped ordinary character as the
> > > > >   ordinary character itself. This becomes problematic when some 
> > > > > extensions
> > > > >   give special meanings to an otherwise ordinary character
> > > > >   (e.g. GNU's \b, \s, \w), meaning we may have two different valid
> > > > >   interpretations of the same sequence.
> > > > >
> > > > >   To make this easier to deal with and given that the standard calls 
> > > > > this
> > > > >   undefined, we should throw an error (EESCAPE) if we run into this 
> > > > > scenario
> > > > >   to ease transition into a state where some escaped ordinaries are 
> > > > > blessed
> > > > >   with a special meaning -- it will either error out or have extended
> > > > >   behavior, rather than have two entirely different versions of 
> > > > > undefined
> > > > >   behavior that leave the consumer of regex(3) guessing as to what 
> > > > > behavior
> > > > >   will be used or leaving them with false impressions.
> > > > >
> > > > >   This change bumps the symbol version of regcomp to FBSD_1.6 and 
> > > > > provides the
> > > > >   old escape semantics for legacy applications, just in case one has 
> > > > > an older
> > > > >   application that would immediately turn into a pumpkin because of an
> > > > >   extraneous escape that's embedded or otherwise critical to its 
> > > > > operation.
> > > > >
> > > > >   This is the final piece needed before enhancing libregex with GNU 
> > > > > extensions
> > > > >   and flipping the switch on bsdgrep.
> > > > >
> > > > >   [1] http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/
> > > > >
> > > > >   PR:   229925 (exp-run, courtesy of antoine)
> > > > >   Differential Revision:https://reviews.freebsd.org/D10510
> > > > >
> > > > > Modified:
> > > > >   head/contrib/netbsd-tests/lib/libc/regex/data/meta.in
> > > > >   head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in
> > > > >   head/lib/libc/regex/Symbol.map
> > > > >   head/lib/libc/regex/regcomp.c
> > > >
> > > > I think there are 3 test cases need to be modified after this change:
> > > >
> > > > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/lib.googletest.gtest_main/googletest-port-test/main/
> > > > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.diff/diff_test/side_by_side/
> > > > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.sed/sed2_test/hex_subst/
> > > >
> > >
> > > CC'ing asomers@ and ngie@, because ISTR they have some googletest stock.
> > >
> > > Testing my libregex GNU extensions revealed that I'm really not ready
> > > to commit that just yet. We have two options here for googletest:
> > >
> > > 1. Disable it and create a PR to be fixed when my changes are done,
> > > hopefully by the end of the week, or
> > > 2. Fix the expressions in
> > > contrib/googletest/googletest/test/googletest-port-test.cc to be POSIX
> > > compliant and upstream that.
> > >
> > > #2 is generally a replacement of \w -> [[:alnum:]] and \W ->
> > > [^[:alnum:]] and maybe \s -> [[:space:]].
> > >
> >
> > Sorry, to be more precise: disable it meaning expect failure of that
> > specific test or something similar.
>
> I think there's no need to let a known issue generate lots of failure
> reports for more than 24 hours, I suggest let's go with 1) first. For
> 2), It's also good that both libregex and googletest can aware the
> difference between POSIX and GNU extensions, but I am not sure how
> upstream thinks about this. Still worth trying, though.
>

Sure- if you have time and no one objects, please proceed with #1 (no
time at the moment myself) and I'll get it fixed this weekend, even if
I have to hold back implementation of some of the GNU extensions to
nab the few that googletest's tests care about.

Thanks,

Kyle Evans
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex

2020-07-31 Thread Li-Wen Hsu
On Fri, Jul 31, 2020 at 9:50 AM Kyle Evans  wrote:
>
> On Thu, Jul 30, 2020 at 8:47 PM Kyle Evans  wrote:
> >
> > On Wed, Jul 29, 2020 at 10:53 PM Li-Wen Hsu  wrote:
> > >
> > > On Thu, Jul 30, 2020 at 7:22 AM Kyle Evans  wrote:
> > > >
> > > > Author: kevans
> > > > Date: Wed Jul 29 23:21:56 2020
> > > > New Revision: 363679
> > > > URL: https://svnweb.freebsd.org/changeset/base/363679
> > > >
> > > > Log:
> > > >   regex(3): Interpret many escaped ordinary characters as EESCAPE
> > > >
> > > >   In IEEE 1003.1-2008 [1] and earlier revisions, BRE/ERE grammar allows 
> > > > for
> > > >   any character to be escaped, but "ORD_CHAR preceded by an unescaped
> > > >   <backslash> character [gives undefined results]".
> > > >
> > > >   Historically, we've interpreted an escaped ordinary character as the
> > > >   ordinary character itself. This becomes problematic when some 
> > > > extensions
> > > >   give special meanings to an otherwise ordinary character
> > > >   (e.g. GNU's \b, \s, \w), meaning we may have two different valid
> > > >   interpretations of the same sequence.
> > > >
> > > >   To make this easier to deal with and given that the standard calls 
> > > > this
> > > >   undefined, we should throw an error (EESCAPE) if we run into this 
> > > > scenario
> > > >   to ease transition into a state where some escaped ordinaries are 
> > > > blessed
> > > >   with a special meaning -- it will either error out or have extended
> > > >   behavior, rather than have two entirely different versions of 
> > > > undefined
> > > >   behavior that leave the consumer of regex(3) guessing as to what 
> > > > behavior
> > > >   will be used or leaving them with false impressions.
> > > >
> > > >   This change bumps the symbol version of regcomp to FBSD_1.6 and 
> > > > provides the
> > > >   old escape semantics for legacy applications, just in case one has an 
> > > > older
> > > >   application that would immediately turn into a pumpkin because of an
> > > >   extraneous escape that's embedded or otherwise critical to its 
> > > > operation.
> > > >
> > > >   This is the final piece needed before enhancing libregex with GNU 
> > > > extensions
> > > >   and flipping the switch on bsdgrep.
> > > >
> > > >   [1] http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/
> > > >
> > > >   PR:   229925 (exp-run, courtesy of antoine)
> > > >   Differential Revision:https://reviews.freebsd.org/D10510
> > > >
> > > > Modified:
> > > >   head/contrib/netbsd-tests/lib/libc/regex/data/meta.in
> > > >   head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in
> > > >   head/lib/libc/regex/Symbol.map
> > > >   head/lib/libc/regex/regcomp.c
> > >
> > > I think there are 3 test cases need to be modified after this change:
> > >
> > > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/lib.googletest.gtest_main/googletest-port-test/main/
> > > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.diff/diff_test/side_by_side/
> > > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.sed/sed2_test/hex_subst/
> > >
> >
> > CC'ing asomers@ and ngie@, because ISTR they have some googletest stock.
> >
> > Testing my libregex GNU extensions revealed that I'm really not ready
> > to commit that just yet. We have two options here for googletest:
> >
> > 1. Disable it and create a PR to be fixed when my changes are done,
> > hopefully by the end of the week, or
> > 2. Fix the expressions in
> > contrib/googletest/googletest/test/googletest-port-test.cc to be POSIX
> > compliant and upstream that.
> >
> > #2 is generally a replacement of \w -> [[:alnum:]] and \W ->
> > [^[:alnum:]] and maybe \s -> [[:space:]].
> >
>
> Sorry, to be more precise: disable it meaning expect failure of that
> specific test or something similar.

I think there's no need to let a known issue generate lots of failure
reports for more than 24 hours, so I suggest we go with 1) first. As for
2), it's also good for both libregex and googletest to be aware of the
difference between POSIX and GNU extensions, but I am not sure what
upstream thinks about this. Still worth trying, though.

Best,
Li-Wen


Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex

2020-07-30 Thread Kyle Evans
On Thu, Jul 30, 2020 at 8:47 PM Kyle Evans  wrote:
>
> On Wed, Jul 29, 2020 at 10:53 PM Li-Wen Hsu  wrote:
> >
> > On Thu, Jul 30, 2020 at 7:22 AM Kyle Evans  wrote:
> > >
> > > Author: kevans
> > > Date: Wed Jul 29 23:21:56 2020
> > > New Revision: 363679
> > > URL: https://svnweb.freebsd.org/changeset/base/363679
> > >
> > > Log:
> > >   regex(3): Interpret many escaped ordinary characters as EESCAPE
> > >
> > >   In IEEE 1003.1-2008 [1] and earlier revisions, BRE/ERE grammar allows 
> > > for
> > >   any character to be escaped, but "ORD_CHAR preceded by an unescaped
> > >   <backslash> character [gives undefined results]".
> > >
> > >   Historically, we've interpreted an escaped ordinary character as the
> > >   ordinary character itself. This becomes problematic when some extensions
> > >   give special meanings to an otherwise ordinary character
> > >   (e.g. GNU's \b, \s, \w), meaning we may have two different valid
> > >   interpretations of the same sequence.
> > >
> > >   To make this easier to deal with and given that the standard calls this
> > >   undefined, we should throw an error (EESCAPE) if we run into this 
> > > scenario
> > >   to ease transition into a state where some escaped ordinaries are 
> > > blessed
> > >   with a special meaning -- it will either error out or have extended
> > >   behavior, rather than have two entirely different versions of undefined
> > >   behavior that leave the consumer of regex(3) guessing as to what 
> > > behavior
> > >   will be used or leaving them with false impressions.
> > >
> > >   This change bumps the symbol version of regcomp to FBSD_1.6 and 
> > > provides the
> > >   old escape semantics for legacy applications, just in case one has an 
> > > older
> > >   application that would immediately turn into a pumpkin because of an
> > >   extraneous escape that's embedded or otherwise critical to its
> > > operation.
> > >
> > >   This is the final piece needed before enhancing libregex with GNU 
> > > extensions
> > >   and flipping the switch on bsdgrep.
> > >
> > >   [1] http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/
> > >
> > >   PR:   229925 (exp-run, courtesy of antoine)
> > >   Differential Revision:https://reviews.freebsd.org/D10510
> > >
> > > Modified:
> > >   head/contrib/netbsd-tests/lib/libc/regex/data/meta.in
> > >   head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in
> > >   head/lib/libc/regex/Symbol.map
> > >   head/lib/libc/regex/regcomp.c
> >
> > I think there are 3 test cases need to be modified after this change:
> >
> > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/lib.googletest.gtest_main/googletest-port-test/main/
> > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.diff/diff_test/side_by_side/
> > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.sed/sed2_test/hex_subst/
> >
>
> CC'ing asomers@ and ngie@, because ISTR they have some googletest stock.
>
> Testing my libregex GNU extensions revealed that I'm really not ready
> to commit that just yet. We have two options here for googletest:
>
> 1. Disable it and create a PR to be fixed when my changes are done,
> hopefully by the end of the week, or
> 2. Fix the expressions in
> contrib/googletest/googletest/test/googletest-port-test.cc to be POSIX
> compliant and upstream that.
>
> #2 is generally a replacement of \w -> [[:alnum:]] and \W ->
> [^[:alnum:]] and maybe \s -> [[:space:]].
>

Sorry, to be more precise: disable it meaning expect failure of that
specific test or something similar.


Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex

2020-07-30 Thread Kyle Evans
On Wed, Jul 29, 2020 at 10:53 PM Li-Wen Hsu  wrote:
>
> On Thu, Jul 30, 2020 at 7:22 AM Kyle Evans  wrote:
> >
> > Author: kevans
> > Date: Wed Jul 29 23:21:56 2020
> > New Revision: 363679
> > URL: https://svnweb.freebsd.org/changeset/base/363679
> >
> > Log:
> >   regex(3): Interpret many escaped ordinary characters as EESCAPE
> >
> >   In IEEE 1003.1-2008 [1] and earlier revisions, BRE/ERE grammar allows for
> >   any character to be escaped, but "ORD_CHAR preceded by an unescaped
> >   <backslash> character [gives undefined results]".
> >
> >   Historically, we've interpreted an escaped ordinary character as the
> >   ordinary character itself. This becomes problematic when some extensions
> >   give special meanings to an otherwise ordinary character
> >   (e.g. GNU's \b, \s, \w), meaning we may have two different valid
> >   interpretations of the same sequence.
> >
> >   To make this easier to deal with and given that the standard calls this
> >   undefined, we should throw an error (EESCAPE) if we run into this scenario
> >   to ease transition into a state where some escaped ordinaries are blessed
> >   with a special meaning -- it will either error out or have extended
> >   behavior, rather than have two entirely different versions of undefined
> >   behavior that leave the consumer of regex(3) guessing as to what behavior
> >   will be used or leaving them with false impressions.
> >
> >   This change bumps the symbol version of regcomp to FBSD_1.6 and provides 
> > the
> >   old escape semantics for legacy applications, just in case one has an 
> > older
> >   application that would immediately turn into a pumpkin because of an
> >   extraneous escape that's embedded or otherwise critical to its operation.
> >
> >   This is the final piece needed before enhancing libregex with GNU 
> > extensions
> >   and flipping the switch on bsdgrep.
> >
> >   [1] http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/
> >
> >   PR:   229925 (exp-run, courtesy of antoine)
> >   Differential Revision:https://reviews.freebsd.org/D10510
> >
> > Modified:
> >   head/contrib/netbsd-tests/lib/libc/regex/data/meta.in
> >   head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in
> >   head/lib/libc/regex/Symbol.map
> >   head/lib/libc/regex/regcomp.c
>
> I think there are 3 test cases need to be modified after this change:
>
> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/lib.googletest.gtest_main/googletest-port-test/main/
> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.diff/diff_test/side_by_side/
> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.sed/sed2_test/hex_subst/
>

CC'ing asomers@ and ngie@, because ISTR they have some googletest stock.

Testing my libregex GNU extensions revealed that I'm really not ready
to commit that just yet. We have two options here for googletest:

1. Disable it and create a PR to be fixed when my changes are done,
hopefully by the end of the week, or
2. Fix the expressions in
contrib/googletest/googletest/test/googletest-port-test.cc to be POSIX
compliant and upstream that.

#2 is generally a replacement of \w -> [[:alnum:]] and \W ->
[^[:alnum:]] and maybe \s -> [[:space:]].
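
As a rough illustration of what option 2 amounts to (a sketch only, not the
actual googletest change): the GNU shorthands can be spelled as POSIX bracket
expressions that any conforming regcomp(3) accepts. The helper ere_matches()
and the POSIX_* macro names are made up for this example. Note that GNU's \w
also matches an underscore, so [[:alnum:]_] is the closer equivalent where
that matters.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* POSIX bracket expressions standing in for the GNU shorthands. */
#define	POSIX_WORD	"[[:alnum:]_]"	/* GNU \w (includes underscore) */
#define	POSIX_NONWORD	"[^[:alnum:]_]"	/* GNU \W */
#define	POSIX_SPACE	"[[:space:]]"	/* GNU \s */

/*
 * Return 1 if `text` matches the POSIX ERE `pattern`, 0 if it does not,
 * and -1 if the pattern fails to compile.
 */
static int
ere_matches(const char *pattern, const char *text)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && text != NULL);
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return (-1);
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return (rc == 0 ? 1 : 0);
}
```

A caller would write ere_matches("^" POSIX_WORD "+$", name) where it might
previously have used "^\\w+$"; the former does not depend on how the
implementation treats an escaped ordinary character.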

Thoughts?

Thanks,

Kyle Evans


Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex

2020-07-30 Thread Stefan Eßer
On 30.07.20 at 13:54, Kyle Evans wrote:
> On Thu, Jul 30, 2020 at 6:48 AM Gordon Bergling  wrote:
>> I got the same error this morning and was able to solve it by doing a full
>> buildworld without NO_CLEAN=yes.
>>
>> You may want to try this in case you are using NO_CLEAN=yes.
>>
> 
> This is interesting; there shouldn't be any NO_CLEAN implications with
> this change. There were no dependency changes, libc should definitely
> get rebuilt because regcomp.c changed and thus, the libc in your
> objdir should have the symbol. The binary referenced above is one that
> we symlink into OBJDIR from the host system.
> 
> I think it's also likely your problem was just fixed by the second
> installworld. The first one will manage to get libc installed, but not
> before you get errors from all the other stuff.

This appears to be true: after completing installworld once with
WITHOUT_TESTS=yes, the build and installation also succeed in
subsequent runs with WITH_TESTS=yes.

My guess is that "make install" in tests tries to link against the
base system version of the library, and the freshly built one with
the correct symbol version has not been installed yet.

Regards, STefan


Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex

2020-07-30 Thread Stefan Eßer
On 30.07.20 at 13:48, Gordon Bergling wrote:
> On Thu, Jul 30, 2020 at 01:26:46PM +0200, Stefan Eßer wrote:
>> On 30.07.20 at 01:21, Kyle Evans wrote:
>> [...]
>>>   This change bumps the symbol version of regcomp to FBSD_1.6 and provides 
>>> the
>>>   old escape semantics for legacy applications, just in case one has an 
>>> older
>>>   application that would immediately turn into a pumpkin because of an
>>>   extraneous escape that's embedded or otherwise critical to its operation.
>>
>> I get an error during make buildworld with option WITH_TESTS=yes:
>>
>> ===> usr.bin/bmake/tests (install)
>> ld-elf.so.1: /usr/src/amd64.amd64/tmp/legacy/usr/sbin/make: Undefined
>> symbol "regcomp@FBSD_1.6"
>>
>> Regards, STefan
> 
> I got the same error this morning and was able to solve it by doing a full
> buildworld without NO_CLEAN=yes.
> 
> You may want to try this in case you are using NO_CLEAN=yes.

Too late ... but thanks for the hint ...

I have restarted make buildworld installworld on an unmodified
source tree from when the error occurred and it just finished,
without error this time.

Maybe it will work with WITH_TESTS too, now - I'll start another
build/install cycle and will report back. If it does not work,
I'll try without NO_CLEAN (I normally build with META_MODE
enabled).

Regards, STefan


Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex

2020-07-30 Thread Kyle Evans
On Thu, Jul 30, 2020 at 6:48 AM Gordon Bergling  wrote:
>
> On Thu, Jul 30, 2020 at 01:26:46PM +0200, Stefan Eßer wrote:
> > On 30.07.20 at 01:21, Kyle Evans wrote:
> > [...]
> > >   This change bumps the symbol version of regcomp to FBSD_1.6 and 
> > > provides the
> > >   old escape semantics for legacy applications, just in case one has an 
> > > older
> > >   application that would immediately turn into a pumpkin because of an
> > >   extraneous escape that's embedded or otherwise critical to its 
> > > operation.
> >
> > I get an error during make buildworld with option WITH_TESTS=yes:
> >
> > ===> usr.bin/bmake/tests (install)
> > ld-elf.so.1: /usr/src/amd64.amd64/tmp/legacy/usr/sbin/make: Undefined
> > symbol "regcomp@FBSD_1.6"
> >
> > Regards, STefan
>
> I got the same error this morning and was able to solve it by doing a full
> buildworld without NO_CLEAN=yes.
>
> You may want to try this in case you are using NO_CLEAN=yes.
>

This is interesting; there shouldn't be any NO_CLEAN implications with
this change. There were no dependency changes, libc should definitely
get rebuilt because regcomp.c changed and thus, the libc in your
objdir should have the symbol. The binary referenced above is one that
we symlink into OBJDIR from the host system.

I think it's also likely your problem was just fixed by the second
installworld. The first one will manage to get libc installed, but not
before you get errors from all the other stuff.

Thanks,

Kyle Evans


Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex

2020-07-30 Thread Gordon Bergling
On Thu, Jul 30, 2020 at 01:26:46PM +0200, Stefan Eßer wrote:
> On 30.07.20 at 01:21, Kyle Evans wrote:
> [...]
> >   This change bumps the symbol version of regcomp to FBSD_1.6 and provides 
> > the
> >   old escape semantics for legacy applications, just in case one has an 
> > older
> >   application that would immediately turn into a pumpkin because of an
> >   extraneous escape that's embedded or otherwise critical to its operation.
> 
> I get an error during make buildworld with option WITH_TESTS=yes:
> 
> ===> usr.bin/bmake/tests (install)
> ld-elf.so.1: /usr/src/amd64.amd64/tmp/legacy/usr/sbin/make: Undefined
> symbol "regcomp@FBSD_1.6"
> 
> Regards, STefan

I got the same error this morning and was able to solve it by doing a full
buildworld without NO_CLEAN=yes.

You may want to try this in case you are using NO_CLEAN=yes.

--Gordon


Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex

2020-07-30 Thread Kyle Evans
On Thu, Jul 30, 2020 at 6:26 AM Stefan Eßer  wrote:
>
> On 30.07.20 at 01:21, Kyle Evans wrote:
> [...]
> >   This change bumps the symbol version of regcomp to FBSD_1.6 and provides 
> > the
> >   old escape semantics for legacy applications, just in case one has an 
> > older
> >   application that would immediately turn into a pumpkin because of an
> >   extraneous escape that's embedded or otherwise critical to its operation.
>
> I get an error during make buildworld with option WITH_TESTS=yes:
>
> ===> usr.bin/bmake/tests (install)
> ld-elf.so.1: /usr/src/amd64.amd64/tmp/legacy/usr/sbin/make: Undefined
> symbol "regcomp@FBSD_1.6"
>
> Regards, STefan

Hi,

Can you describe the environment in which you're running installworld,
please? i.e. is it just a raw installworld directly in your shell, or
something more complicated?

I observed this in testing an exceptional scenario; running
installworld in a buildenv. installworld injects .WAIT between lib and
libexec + other subdirs, which is supposed to prevent stuff like this
(new binary got installed linked against new libc before new libc).
Running in a buildenv set SYSROOT and stripped out the .WAITs, leaving
me with an annoyance where I had to installworld twice.

Thanks,

Kyle Evans


Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex

2020-07-30 Thread Stefan Eßer
On 30.07.20 at 01:21, Kyle Evans wrote:
[...]
>   This change bumps the symbol version of regcomp to FBSD_1.6 and provides the
>   old escape semantics for legacy applications, just in case one has an older
>   application that would immediately turn into a pumpkin because of an
>   extraneous escape that's embedded or otherwise critical to its operation.

I get an error during make buildworld with option WITH_TESTS=yes:

===> usr.bin/bmake/tests (install)
ld-elf.so.1: /usr/src/amd64.amd64/tmp/legacy/usr/sbin/make: Undefined
symbol "regcomp@FBSD_1.6"

Regards, STefan


Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex

2020-07-29 Thread Kyle Evans
Sorry, on mobile, so doubling down on bad formatting by top-posting...

The sed/diff tests are easy to fix, will do those in about 8/9 hours.

The Google test failure is interesting- this expression has clearly been
wrong and getting the wrong results, so we've caught a legitimate issue
here. I think the best path forward for that one is to commit my libregex
extensions and link that baby up so that \w works.

Thanks,

Kyle Evans

On Wed, Jul 29, 2020, 22:53 Li-Wen Hsu  wrote:

> On Thu, Jul 30, 2020 at 7:22 AM Kyle Evans  wrote:
> >
> > Author: kevans
> > Date: Wed Jul 29 23:21:56 2020
> > New Revision: 363679
> > URL: https://svnweb.freebsd.org/changeset/base/363679
> >
> > Log:
> >   regex(3): Interpret many escaped ordinary characters as EESCAPE
> >
> >   In IEEE 1003.1-2008 [1] and earlier revisions, BRE/ERE grammar allows
> for
> >   any character to be escaped, but "ORD_CHAR preceded by an unescaped
> >   <backslash> character [gives undefined results]".
> >
> >   Historically, we've interpreted an escaped ordinary character as the
> >   ordinary character itself. This becomes problematic when some
> extensions
> >   give special meanings to an otherwise ordinary character
> >   (e.g. GNU's \b, \s, \w), meaning we may have two different valid
> >   interpretations of the same sequence.
> >
> >   To make this easier to deal with and given that the standard calls this
> >   undefined, we should throw an error (EESCAPE) if we run into this
> scenario
> >   to ease transition into a state where some escaped ordinaries are
> blessed
> >   with a special meaning -- it will either error out or have extended
> >   behavior, rather than have two entirely different versions of undefined
> >   behavior that leave the consumer of regex(3) guessing as to what
> behavior
> >   will be used or leaving them with false impressions.
> >
> >   This change bumps the symbol version of regcomp to FBSD_1.6 and
> provides the
> >   old escape semantics for legacy applications, just in case one has an
> older
> >   application that would immediately turn into a pumpkin because of an
> >   extraneous escape that's embedded or otherwise critical to its operation.
> >
> >   This is the final piece needed before enhancing libregex with GNU
> extensions
> >   and flipping the switch on bsdgrep.
> >
> >   [1] http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/
> >
> >   PR:   229925 (exp-run, courtesy of antoine)
> >   Differential Revision:https://reviews.freebsd.org/D10510
> >
> > Modified:
> >   head/contrib/netbsd-tests/lib/libc/regex/data/meta.in
> >   head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in
> >   head/lib/libc/regex/Symbol.map
> >   head/lib/libc/regex/regcomp.c
>
> I think there are 3 test cases need to be modified after this change:
>
>
> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/lib.googletest.gtest_main/googletest-port-test/main/
>
> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.diff/diff_test/side_by_side/
>
> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.sed/sed2_test/hex_subst/
>
> Please help to check them, thanks!
>
> Li-Wen
>


Re: svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex

2020-07-29 Thread Li-Wen Hsu
On Thu, Jul 30, 2020 at 7:22 AM Kyle Evans  wrote:
>
> Author: kevans
> Date: Wed Jul 29 23:21:56 2020
> New Revision: 363679
> URL: https://svnweb.freebsd.org/changeset/base/363679
>
> Log:
>   regex(3): Interpret many escaped ordinary characters as EESCAPE
>
>   In IEEE 1003.1-2008 [1] and earlier revisions, BRE/ERE grammar allows for
>   any character to be escaped, but "ORD_CHAR preceded by an unescaped
>   <backslash> character [gives undefined results]".
>
>   Historically, we've interpreted an escaped ordinary character as the
>   ordinary character itself. This becomes problematic when some extensions
>   give special meanings to an otherwise ordinary character
>   (e.g. GNU's \b, \s, \w), meaning we may have two different valid
>   interpretations of the same sequence.
>
>   To make this easier to deal with and given that the standard calls this
>   undefined, we should throw an error (EESCAPE) if we run into this scenario
>   to ease transition into a state where some escaped ordinaries are blessed
>   with a special meaning -- it will either error out or have extended
>   behavior, rather than have two entirely different versions of undefined
>   behavior that leave the consumer of regex(3) guessing as to what behavior
>   will be used or leaving them with false impressions.
>
>   This change bumps the symbol version of regcomp to FBSD_1.6 and provides the
>   old escape semantics for legacy applications, just in case one has an older
>   application that would immediately turn into a pumpkin because of an
>   extraneous escape that's embedded or otherwise critical to its operation.
>
>   This is the final piece needed before enhancing libregex with GNU extensions
>   and flipping the switch on bsdgrep.
>
>   [1] http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/
>
>   PR:   229925 (exp-run, courtesy of antoine)
>   Differential Revision:https://reviews.freebsd.org/D10510
>
> Modified:
>   head/contrib/netbsd-tests/lib/libc/regex/data/meta.in
>   head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in
>   head/lib/libc/regex/Symbol.map
>   head/lib/libc/regex/regcomp.c

I think there are 3 test cases that need to be modified after this change:

https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/lib.googletest.gtest_main/googletest-port-test/main/
https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.diff/diff_test/side_by_side/
https://ci.freebsd.org/job/FreeBSD-head-amd64-test/16011/testReport/junit/usr.bin.sed/sed2_test/hex_subst/

Please help to check them, thanks!

Li-Wen


svn commit: r363679 - in head: contrib/netbsd-tests/lib/libc/regex/data lib/libc/regex

2020-07-29 Thread Kyle Evans
Author: kevans
Date: Wed Jul 29 23:21:56 2020
New Revision: 363679
URL: https://svnweb.freebsd.org/changeset/base/363679

Log:
  regex(3): Interpret many escaped ordinary characters as EESCAPE
  
  In IEEE 1003.1-2008 [1] and earlier revisions, BRE/ERE grammar allows for
  any character to be escaped, but "ORD_CHAR preceded by an unescaped
  <backslash> character [gives undefined results]".
  
  Historically, we've interpreted an escaped ordinary character as the
  ordinary character itself. This becomes problematic when some extensions
  give special meanings to an otherwise ordinary character
  (e.g. GNU's \b, \s, \w), meaning we may have two different valid
  interpretations of the same sequence.
  
  To make this easier to deal with and given that the standard calls this
  undefined, we should throw an error (EESCAPE) if we run into this scenario
  to ease transition into a state where some escaped ordinaries are blessed
  with a special meaning -- it will either error out or have extended
  behavior, rather than have two entirely different versions of undefined
  behavior that leave the consumer of regex(3) guessing as to what behavior
  will be used or leaving them with false impressions.
  
  This change bumps the symbol version of regcomp to FBSD_1.6 and provides the
  old escape semantics for legacy applications, just in case one has an older
  application that would immediately turn into a pumpkin because of an
  extraneous escape that's embedded or otherwise critical to its operation.
  
  This is the final piece needed before enhancing libregex with GNU extensions
  and flipping the switch on bsdgrep.
  
  [1] http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/
  
  PR:   229925 (exp-run, courtesy of antoine)
  Differential Revision: https://reviews.freebsd.org/D10510

Modified:
  head/contrib/netbsd-tests/lib/libc/regex/data/meta.in
  head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in
  head/lib/libc/regex/Symbol.map
  head/lib/libc/regex/regcomp.c
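
To illustrate the behavior described in the log above, here is a minimal
sketch that is not part of the commit. It assumes only a POSIX <regex.h>;
compile_status() is a name made up for this example. It returns whatever
regcomp() reports for a pattern, so the handling of an escaped ordinary
character such as \b can be checked on a given libc.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Compile `pattern` with the given cflags and return regcomp()'s status.
 * On failure, the regerror() text is copied into msg when msg is non-NULL;
 * on success msg is set to the empty string.
 * compile_status() is a hypothetical helper, not a libc function.
 */
int
compile_status(const char *pattern, int cflags, char *msg, size_t msglen)
{
	regex_t re;
	int rc;

	rc = regcomp(&re, pattern, cflags);
	if (rc != 0) {
		if (msg != NULL && msglen > 0)
			regerror(rc, &re, msg, msglen);
		return (rc);
	}
	if (msg != NULL && msglen > 0)
		msg[0] = '\0';
	regfree(&re);
	return (0);
}
```

With a libc that includes this change, compile_status("a\\bc", 0, ...) would
be expected to return REG_EESCAPE. An older FreeBSD libc should return 0 and
treat the pattern as abc, and implementations with GNU extensions may return
0 with \b acting as a word boundary. A trailing backslash ("a\\" in C source)
is REG_EESCAPE either way, matching the existing "a\" entry in meta.in.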

Modified: head/contrib/netbsd-tests/lib/libc/regex/data/meta.in
==
--- head/contrib/netbsd-tests/lib/libc/regex/data/meta.in   Wed Jul 29 23:17:16 2020    (r363678)
+++ head/contrib/netbsd-tests/lib/libc/regex/data/meta.in   Wed Jul 29 23:21:56 2020    (r363679)
@@ -4,7 +4,9 @@ a[bc]d  &   abd abd
 a\*c   &   a*c a*c
 a\\b   &   a\b a\b
 a\\\*b &   a\*b a\*b
-a\bc   &   abc abc
+# Begin FreeBSD
+a\bc EESCAPE
+# End FreeBSD
 a\   EESCAPE
 a\\bc  &   a\bc a\bc
 \{ bC  BADRPT

Modified: head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in
==
--- head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in Wed Jul 29 23:17:16 2020    (r363678)
+++ head/contrib/netbsd-tests/lib/libc/regex/data/subexp.in Wed Jul 29 23:21:56 2020    (r363679)
@@ -12,7 +12,7 @@ a(b+)c-   abbbc   abbbc   bbb
 a(b*)c -   ac  ac  @c
 (a|ab)(bc([de]+)f|cde) -   abcdef  abcdef  a,bcdef,de
 # Begin FreeBSD
-a\(b\|c\)d b   ab|cd   ab|cd   b|c
+a\(b|c\)d  b   ab|cd   ab|cd   b|c
 # End FreeBSD
 # the regression tester only asks for 9 subexpressions
 a(b)(c)(d)(e)(f)(g)(h)(i)(j)k  -   abcdefghijk abcdefghijk b,c,d,e,f,g,h,i,j
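
The subexp.in change above drops the backslash before | in a BRE test
(flag "b"), presumably because \| would now be an escaped ordinary
character there, while a bare | is simply an ordinary character in a BRE.
A minimal sketch of what that test line checks, assuming a POSIX <regex.h>;
first_group() is a name made up for this example:

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/*
 * Match `text` against the BRE `pattern` and copy subexpression 1 into out.
 * Returns 0 when the pattern matches and group 1 fits in out, -1 otherwise.
 * first_group() is a hypothetical helper, not a libc function.
 */
int
first_group(const char *pattern, const char *text, char *out, size_t outlen)
{
	regex_t re;
	regmatch_t pm[2];
	size_t len;
	int rc = -1;

	if (regcomp(&re, pattern, 0) != 0)	/* cflags 0 selects a BRE */
		return (-1);
	if (regexec(&re, text, 2, pm, 0) == 0 && pm[1].rm_so != -1) {
		len = (size_t)(pm[1].rm_eo - pm[1].rm_so);
		if (len < outlen) {
			memcpy(out, text + pm[1].rm_so, len);
			out[len] = '\0';
			rc = 0;
		}
	}
	regfree(&re);
	return (rc);
}
```

Calling first_group("a\\(b|c\\)d", "ab|cd", buf, sizeof(buf)) should leave
"b|c" in buf, which corresponds to the expected columns of the updated test
line: the | is matched literally rather than acting as alternation.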

Modified: head/lib/libc/regex/Symbol.map
==
--- head/lib/libc/regex/Symbol.map  Wed Jul 29 23:17:16 2020    (r363678)
+++ head/lib/libc/regex/Symbol.map  Wed Jul 29 23:21:56 2020    (r363679)
@@ -3,8 +3,11 @@
  */
 
 FBSD_1.0 {
-   regcomp;
    regerror;
    regexec;
    regfree;
+};
+
+FBSD_1.6 {
+   regcomp;
 };

Modified: head/lib/libc/regex/regcomp.c
==
--- head/lib/libc/regex/regcomp.c   Wed Jul 29 23:17:16 2020    (r363678)
+++ head/lib/libc/regex/regcomp.c   Wed Jul 29 23:21:56 2020    (r363679)
@@ -102,11 +102,14 @@ struct parse {
    sopno pend[NPAREN]; /* -> ) ([0] unused) */
    bool allowbranch;   /* can this expression branch? */
    bool bre;   /* convenience; is this a BRE? */
+   int pflags; /* other parsing flags -- legacy escapes? */
    bool (*parse_expr)(struct parse *, struct branchc *);
    void (*pre_parse)(struct parse *, struct branchc *);
    void (*post_parse)(struct parse *, struct branchc *);
 };
 
+#define PFLAG_LEGACY_ESC   0x0001
+
 /* = begin header generated by ./mkh = */
 #ifdef __cplusplus
 extern "C" {
@@ -132,6 +135,7 @@ static void p_b_cclass(struct parse *p, cset *cs);
 static void p_b_eclass(struct parse *p, cset *cs);