On 5/8/24 06:48, Rob Landley wrote: > On 5/6/24 17:12, Ray Gardner wrote: >> While working on an awk implementation for toybox, I found a bug in >> strlower(), which is used only in find.c. I've attached some tests to >> put in find.test to reveal it. I can't put them here directly because >> I don't think the UTF-8 names will come through. (I modelled my awk >> tolower()/toupper() code on your strlower().)
> Your test doesn't create the files you're finding, so find is supposed to > fail? The tests were not intended to tell anything about find per se, but only to elicit a memory violation to show the bug in strlower(). I didn't see any easier way to do it than via the only command that uses strlower(). > Your first test doesn't barf under ASAN, and then the second one's going to > fail > because echo -n | wc says it's 258 bytes and the VFS file length limit is 255 > bytes, so there CAN'T be a file named that on Linux. (Path length != path > component length, there's no slashes in there.) I was not aware of the filename length limit in Linux, thanks. The first test was just meant as a sort of "null test" to show that the command works OK with the expanding characters if there are just a few of them. >> The problem is in the test if the output string needs to be enlarged >> to take an expanded lowercase: >> // Case conversion can expand utf8 representation, but with extra mlen >> // space above we should basically never need to realloc >> if (mlen+4 > (len = new-try)) continue; >> >> The mlen+4 needs to be mlen-4 to leave at least 4 bytes for the next >> character. > Hmmm, possibly. I still don't understand what your test case is testing. (Just > trying to trigger an ASAN violation with an otherwise nonsense test?) Yes, the test is for strlower(), not for find. And I got errors from the allocator even without ASAN. >> As the comment indicates, it should "never" need to realloc; [snip] > The second is "basically never" because it requires an insane input string, > but > that's user controlled and users do crazy things, sometimes even > intentionally. That was me... >> it takes a very long name of uppercase characters that do expand when >> made lowercase. But the code is there to handle that very case. [Rob analyzes why that was wrong and my test was suboptimal ...] > Sigh, lemme come up with a test that demonstrates the fix working... the > minimal > one seems to be ./find . -iname aaaaaȺȺȺȺȺȺȺȺȺ I should have done better on the test cases; I didn't think it would overflow on one that short. But yours proves that maybe it could happen on a non-insane input string? [info about error handling etc.] Thanks for the info about error handling etc. I'm still a newb at Linux, though not exactly new to programming... BTW I was a bit surprised that mentioning my awk for toybox got no reaction. _______________________________________________ Toybox mailing list [email protected] http://lists.landley.net/listinfo.cgi/toybox-landley.net
