On 3/20/19 10:08 PM, Daniel Mentz wrote: > On Wed, Mar 20, 2019 at 7:24 PM Rob Landley <[email protected]> wrote: >> On 3/19/19 6:38 PM, Daniel Mentz via Toybox wrote: >>> -// TODO: does this need NUL terminator? >>> if (strlen(lnk) > sizeof(hdr.link)) >> >> The todo item you erased there was a note to self about checking what happens >> when a value fits exactly in the field and thus there is no NUL terminator, >> and >> whether the read side was handling that correctly. (Similarly, in theory a > > If I'm not mistaken, the code on the read side that handles the > tar.link field is the following: > > if (!TT.hdr.link_target && *tar.link) > TT.hdr.link_target = xstrndup(tar.link, sizeof(tar.link));
That would be the read side I said I hadn't yet checked, yes. (And I think I rewrote that code between when I wrote the TODO and when you just checked it?) The note also tells me to add tests for the corner cases of the file sizes (which I tested from the command line for "name" a couple commits back but haven't done for "link" yet, and I need to add them to the tar.tests file which I'm currently completely rewriting because it needs "done as root" vs "not done as root" sections _and_ it needs at least some tests that work in the absence of gzip on the test system; right now compression failure makes almost all tests fail. I didn't write that test file and my complaints about it from last week (http://lists.landley.net/pipermail/toybox-landley.net/2019-March/010261.html) remain true. >>> write_longname(lnk, 'K'); //write longname LINK >>> -// TODO: this will error_exit() if too long, not truncate. >>> - xstrncpy(hdr.link, lnk, sizeof(hdr.link)); >>> + strncpy(hdr.link, lnk, sizeof(hdr.link)); >> >> That todo was because it should be strlcpy(), but unfortunately Ulrich >> Drepper >> was an asshole and the committee that replaced him is very, very slow. As in >> this was 2014: https://lwn.net/Articles/612244/ > > I believe that strncpy is the correct function, since the string does > not need to be NULL terminated if it's 100 bytes long (i.e. And doesn't need to be memset with zeroes if it's 3 bytes long, since it was already memset with zeroes above. It's not a _lot_ of unnecessary work but it's annoying. That's why xstrncpy() exists, and there's only 3 instances of the function outside of pending: $ grep [^x]strncpy toys/*/*.c toys/net/tunctl.c: strncpy(ifr->ifr_name, *toys.optargs, sizeof(ifr->ifr_name)); toys/other/mkdosfs.c: strncpy(fat+3, oem, 8); // OEM toys/other/mkswap.c: if (TT.L) strncpy(label, TT.L, 15); > sizeof(hdr.link)). If you were to use strlcpy(), you would actually > lose one useful byte i.e. you wouldn't be able to fit a string that is > exactly 100 bytes long (you'd truncate it to 99 bytes and add a NULL > byte that's not required). I know, that's why I didn't. (That plus the glibc not having strlcpy thing.) This is why the TODO was still there. I had not gone back and attempted to resolve the issue yet. I'm unhappy with the gratuitous memset of the rest of the field that strncpy does is the wrong behavior, but it's the function that's currently available and it was blocking you, so I changed it for now. Doesn't mean I don't want to take another look at it. > I briefly looked at a tar file generated by GNU tar, and hdr.link > wasn't NULL terminated there either (I had GNU tar store a symbolic > link where the target was longer than 100 bytes). As I did with the file side multiple days ago, yes. And fixed up the code to handle it properly (The "Similar to commit 28711d30" you mentioned at the start of yours). I had not confirmed that all the cases were handled (the "overflow issue" I moved on to was files to big to fit in the size field, and thus the big endian binary encoding with the high bit set), but I need to add test suite entries for all of the corner cases I make the code handle and test from the command line. That's why the TODO was still there. The pending/ directory means it's code I haven't properly reviewed yet. Android using stuff out of the pending directory is awkward. I'm trying to get it _out_ of the pending directory, but that tends to involve really big changes. The issue I'm poking at _now_ is that testing tar against _itself_ proves very little to me, and I can't depend on multiple implementations being available (and if so, which is right?), so I'd like at least simple tarballs to be binary identical to what ubuntu produces, and I started with "touch -t 198001010101 hello; tar c hello --owner root --group root | sha1sum" to get a known sha1sum, but they're _not_ identical because gnu tar is adding WAAAAAAY too much NUL padding to the end, for reasons I do not understand, which is making the sha1sum not match even when I add the stupid space back after the checksum field. So I ran it through "head -c $((3*512))", but I tried to understand why gnu was doing that first. (Not a clue. Working my way through https://www.gnu.org/software/tar/manual/html_node/Standard.html again in case I missed something, but given it says that numeric fields are 0 padded N-1 length with a NUL terminator and does NOT mention the weird thing chksum does means this is not a very good document. I.E. adding the darn space back after the checksum violates GNU's own page on what the tarball is supposed to look like.) Anyway, back to the topic: this file is covered with TODO entries the same way an active construction site would be covered with warning tape. I'm doing https://landley.net/toybox/cleanup.html and have made a dozen commits to this one file since March 3. It's lingered in pending for years partly because I've been dreading finding out what Al Viro was talking about in http://lkml.iu.edu/hypermail/linux/kernel/0112.2/1540.html and partly because I'm never sure what counts as compatibility testing. (There were many historical format variants which I think no longer matter? For once the 7 year rule (http://landley.net/toybox/faq.html#support_horizon) may not be long _enough_ here, but it's what I've got.) In that context, deleting 2 of 19 TODO notes during this process may seem helpful, but isn't. Thanks for the bug report, I'll try not to break you again in what I commit, but the next thing I need to do is rewrite the tar tests so they actually test the sort of things you're hitting, so I _can_ reasonably regression test before checking in. Rob P.S. Plenty of todos that aren't in the file, those were just notes on lines of code as I was going past. For example, I'm adding an xabspath() constraint where each name it's about to create has to fit under the top level directory it's exracting under, that's why I'm storing the abspath of cwd in main. This _doesn't_ apply to symlinks but _does_ apply to hardlinks. And then there's the whole sparse file can of worms, and xattr/selinux support. I guess the lesson here is it was a mistake to make any of my TODO entries visible to other people. There was zero interest in http://lists.landley.net/pipermail/toybox-landley.net/2019-March/010268.html and making it public results in people trying to delete entries out of it and then when I object that I'm not done with them yet I get explained at, so I should just just remove them all in the next checkin, keep them in a file here like I usually do, and not bother anybody else with them. P.P.S. This sort of trivia is why I have a blog. I should really catch it up to current... _______________________________________________ Toybox mailing list [email protected] http://lists.landley.net/listinfo.cgi/toybox-landley.net
