On 5/14/24 12:12, enh wrote:
> On Tue, May 14, 2024 at 1:04 PM Rob Landley <r...@landley.net> wrote:
>>
>> On 5/14/24 07:10, enh wrote:
>> > macOS tests seem to be broken since this commit?
>> >
>> > FAIL: find strlower edge case
>> > echo -ne '' | touch aaaaaⱥⱥⱥⱥⱥⱥⱥⱥⱥ; find . -iname aaaaaȺȺȺȺȺȺȺȺȺ
>> > --- expected 2024-05-10 17:32:56.000000000 +0000
>> > +++ actual 2024-05-10 17:32:56.000000000 +0000
>> > @@ -1 +0,0 @@
>> > -./aaaaaⱥⱥⱥⱥⱥⱥⱥⱥⱥ
>>
>> Sigh. Apple's handling of utf8/unicode continues to be... "a challenge".
>>
>> When I run "make test_find" standalone, it gives me:
>>
>> scripts/runtest.sh: line 219: syntax error near unexpected token `;'
>> scripts/runtest.sh: line 219: `      R) LEN=0; B=1; ;&'
>>
>> Because bash 3.2 from 2007 doesn't understand ;&
> 
> yeah, nor does mksh. it hasn't caused me any problems though; i've
> been ignoring it for years now.
> 
>> And THEN it goes:
>>
>> touch: out of range or illegal time specification: YYYY-MM-DDThh:mm:SS[.frac][tz]
>> touch: out of range or illegal time specification: YYYY-MM-DDThh:mm:SS[.frac][tz]
>> FAIL: find newerat
>> echo -ne '' | find dir -type f -newerat @12345
>> --- expected    2024-05-14 11:16:40.000000000 -0500
>> +++ actual      2024-05-14 11:16:40.000000000 -0500
>> @@ -1 +0,0 @@
>> -dir/two
>>
>> Which is a different error that DOESN'T happen with the global tests, because
>> those are using toybox touch rather than homebrew's $TOUCH. But it works on
>> debian. Let's see:
>>
>> $ touch --version
>> touch: illegal option -- -
>> usage: touch [-A [-][[hh]mm]SS] [-achm] [-r file] [-t [[CC]YY]MMDDhhmm[.SS]]
>>        [-d YYYY-MM-DDThh:mm:SS[.frac][tz]] file ...
>>
>> Thank you, gnu project. I'm gonna assume this is _also_ from 2007. (I made
>> scripts/prereq/build.sh for a REASON...)
> 
> no, i think this is a BSD touch.
> 
> yeah, that looks very like the FreeBSD touch's usage:
> 
> static void
> usage(const char *myname)
> {
>         fprintf(stderr, "usage: %s [-A [-][[hh]mm]SS] [-achm] [-r file] "
>                 "[-t [[CC]YY]MMDDhhmm[.SS]]\n"
>                 "       [-d YYYY-MM-DDThh:mm:SS[.frac][tz]] "
>                 "file ...\n", myname);
>         exit(1);
> }
> 
> 
>> Then when I run "make clean macos_defconfig tests" I get:
>>
>> Undefined symbols for architecture arm64:
>>   "_iconv", referenced from:
>>       _do_iconv in iconv.o
>>      (maybe you meant: _iconv_main)
>>   "_iconv_open", referenced from:
>>       _iconv_main in iconv.o
>> ld: symbol(s) not found for architecture arm64
>>
>> Because the Makefile has:
>>
>> tests: ASAN=1
>> tests: toybox
>>         scripts/test.sh
>>
>> And ASAN apparently breaks on homebrew's toolchain but not debian's toolchain.
>> Why does it break there but not on Linux...
>>
>> probe cc -Wall -Wundef -Werror=implicit-function-declaration
>> -Wno-char-subscripts -Wno-pointer-sign -funsigned-char
>> -Wno-deprecated-declarations -Wno-string-plus-int -Wno-invalid-source-encoding
>> -fsanitize=address -O1 -g -fno-omit-frame-pointer -fno-optimize-sibling-calls
>> -xc -o /dev/null -
>> error: cannot parse the debug map for '/dev/null': The file was not recognized as a valid object file
>> clang: error: dsymutil command failed with exit code 1 (use -v to see invocation)
>>
>> Because it tries to read back the -o output we discarded, and fails when it
>> can't do so, thus all library probes fail and it tries to build with no
>> libraries. But only when ASAN is enabled, because ASAN uses -o as INPUT. Bravo.
>>
>> None of this is the actual unicode failure, this is just ambient macos...

FAIL: find strlower edge case
echo -ne '' | touch aaaaaⱥⱥⱥⱥⱥⱥⱥⱥⱥ; find . -iname aaaaaȺȺȺȺȺȺȺȺȺ
--- expected    2024-05-14 13:32:19.000000000 -0500
+++ actual      2024-05-14 13:32:19.000000000 -0500
@@ -1 +0,0 @@
-./aaaaaⱥⱥⱥⱥⱥⱥⱥⱥⱥ
make: *** [tests] Error 1
cfarm104 (homebrew):toybox landley$ ls generated/testdir/testdir/
aaaaa?????????
$ LC_ALL=en_US.UTF-8 ls generated/testdir/testdir
aaaaa?????????
$ generated/testdir/ls generated/testdir/testdir
aaaaa\342\261\245\342\261\245\342\261\245\342\261\245\342\261\245\342\261\245\342\261\245\342\261\245\342\261\245
$ echo -./aaaaaⱥⱥⱥⱥⱥⱥⱥⱥⱥ
-./aaaaaⱥⱥⱥⱥⱥⱥⱥⱥⱥ
$ generated/testdir/ls -N generated/testdir/testdir
aaaaaⱥⱥⱥⱥⱥⱥⱥⱥⱥ
cfarm104 (homebrew):toybox landley$ generated/testdir/ls -N generated/testdir/testdir
aaaaaⱥⱥⱥⱥⱥⱥⱥⱥⱥ
cfarm104 (homebrew):toybox landley$ ls -N generated/testdir/testdir
ls: invalid option -- N
usage: ls [-@ABCFGHILOPRSTUWabcdefghiklmnopqrstuvwxy1%,] [--color=when] [-D format] [file ...]

Why is toybox ls escaping by default here but not on Linux? Hmmm, it's gotta be
this call in crunch_qb():

    // scrute the inscrutable, eff the ineffable, print the unprintable
    else if ((len = wcrtomb(buf, wc, 0) ) == -1) len = 1;

Once again, I wish for stable/portable unicode functions in lib/unicode.c. I
know why I haven't GOT them (mostly), but this is just ridiculous. (They don't
have to be GREAT, but NOT THAT...)
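
For illustration, here is a minimal sketch of the kind of locale-independent helper a lib/unicode.c could provide in place of the wcrtomb() call above. The name sketch_wctoutf8() and its return convention are assumptions made up for this example, not existing toybox code; it just encodes one code point as UTF-8 without consulting the C library's locale.

```c
#include <stddef.h>

// Hypothetical helper (not from toybox): encode one code point as UTF-8
// without asking libc what the current locale thinks is printable.
// Returns the number of bytes written to out (1 to 4), or 0 for UTF-16
// surrogates and for values past U+10FFFF.
size_t sketch_wctoutf8(char *out, unsigned wc)
{
  if (wc < 0x80) {
    out[0] = (char)wc;
    return 1;
  }
  if (wc < 0x800) {
    out[0] = (char)(0xc0 | (wc >> 6));
    out[1] = (char)(0x80 | (wc & 0x3f));
    return 2;
  }
  if (wc < 0x10000) {
    // Surrogates are not valid code points in UTF-8.
    if (wc >= 0xd800 && wc <= 0xdfff) return 0;
    out[0] = (char)(0xe0 | (wc >> 12));
    out[1] = (char)(0x80 | ((wc >> 6) & 0x3f));
    out[2] = (char)(0x80 | (wc & 0x3f));
    return 3;
  }
  if (wc < 0x110000) {
    out[0] = (char)(0xf0 | (wc >> 18));
    out[1] = (char)(0x80 | ((wc >> 12) & 0x3f));
    out[2] = (char)(0x80 | ((wc >> 6) & 0x3f));
    out[3] = (char)(0x80 | (wc & 0x3f));
    return 4;
  }
  return 0;
}
```

For U+2C65 (ⱥ) this produces the bytes e2 b1 a5, which is the \342\261\245 sequence in the escaped ls output above. Whether a given code point should be printed raw or escaped is a separate question this sketch does not try to answer.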

(There's only 100k code points and MOSTLY I'm doing tests that return ONE BIT
answers. I'm aware it's a trap, but DUDE...)

Anyway, STILL not the actual issue at hand, the issue is that:

cfarm104 (homebrew):toybox landley$ generated/testdir/find generated/testdir/testdir -iname aaaaaⱥⱥⱥⱥⱥⱥⱥⱥⱥ
generated/testdir/testdir/aaaaaⱥⱥⱥⱥⱥⱥⱥⱥⱥ
cfarm104 (homebrew):toybox landley$ generated/testdir/find generated/testdir/testdir -iname aaaaaȺȺȺȺȺȺȺȺȺ
cfarm104 (homebrew):toybox landley$

The upper case string is not converting into the lower case string. Ok, let's
stick a dprintf(2, "%d->%d\n", c, towlower(c)); into strlower() and it says
"570->58", which... is a colon? Hmmm, prepending LC_ALL=en_US.UTF-8 did not
change that.

It looks like macos towlower() refuses to return expanding unicode characters.
Possibly to avoid exactly the kind of bug this fixed, in exchange for corrupting
the data.
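
To make the mismatch concrete: 570 is U+023A (Ⱥ), and the lower case form the test expects is U+2C65 (ⱥ, 11365), which takes three UTF-8 bytes where the upper case form takes two. A rough sketch of a workaround, assuming a small override table consulted before the platform's towlower(), could look like the following. The names sketch_towlower() and sketch_utf8len() and the table itself are hypothetical, and the table lists only the one pair from this thread.

```c
#include <stddef.h>
#include <wctype.h>

// Hypothetical override table for code points where the platform's
// towlower() gives an unusable answer. Only the pair seen in this thread
// is listed; this is an illustration, not a complete list.
static const struct { unsigned upper, lower; } sketch_fixups[] = {
  {0x023a, 0x2c65},  // U+023A (A with stroke) -> U+2C65 (a with stroke)
};

// Check the override table first, then fall back to the C library.
unsigned sketch_towlower(unsigned c)
{
  size_t i;

  for (i = 0; i < sizeof(sketch_fixups)/sizeof(*sketch_fixups); i++)
    if (sketch_fixups[i].upper == c) return sketch_fixups[i].lower;

  return (unsigned)towlower((wint_t)c);
}

// Bytes needed to encode c as UTF-8 (assumes c is at most 0x10ffff).
// Shows why the lower cased string can be longer than the input.
int sketch_utf8len(unsigned c)
{
  return 1 + (c >= 0x80) + (c >= 0x800) + (c >= 0x10000);
}
```

A table like this only papers over the cases somebody has already tripped on, so it is closer to stubbing out the symptom than to a real lib/unicode.c.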

I don't know how to fix this other than stubbing out the test on macos, or
adding lib/unicode.c. (I _really_ want to find an 80/20 there. I'm aware I have
failed at least three previous attempts, and am 2/3 of the way to clearing off
my laptop so I can install the new OS version and put the big ram sticks back so
NOW IS NOT THE TIME, but still...)

Rob
_______________________________________________
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net
