Re: [PATCH 2/7] get-tar-commit-id: check write_in_full() return against 0
On Wed, Sep 13, 2017 at 02:09:27PM -0700, Jonathan Nieder wrote: > > We ask to write 41 bytes and make sure that the return value > > is at least 41. This is the same "dangerous" pattern that > > was fixed in the prior commit (wherein a negative return > > value is promoted to unsigned), though it is not dangerous > > here because our "41" is a constant, not an unsigned > > variable. > > > > But we should convert it anyway to avoid modeling a > > dangerous construct. > > > > Signed-off-by: Jeff King> > --- > > builtin/get-tar-commit-id.c | 3 +-- > > 1 file changed, 1 insertion(+), 2 deletions(-) > > I kind of disagree with calling this dangerous (and I think that is > what you alluded to above by putting it in quotes), but I like the > postimage more than the preimage. Right, this instance is fine, but the pattern of using "<" is not. If you swapped out "41" for: size_t len = 41; then it would be a bug. Which I think would surprise most people. > The variable 'n' could be eliminated to simplify this further. I > realize that would go against the spirit of this patch, but (1) it's > on-topic for the patch, since it is another ssize_t vs constant > comparison and (2) as mentioned elsewhere in this thread, it's a very > common idiom with read_in_full. If we want to eliminate it then we > could introduce a separate helper to distinguish between > read_this_much_i_mean_it and read_this_much_or_to_eof. Yes, I noticed that, too, after you brought up read_in_full() as a potential source of problems. But I would rather deal with read_in_full() separately on top. Can you do it as a separate patch on top (possibly with other read_in_full() cleanups, though I think this is the only "<" case that exists). -Peff
Re: [PATCH 2/7] get-tar-commit-id: check write_in_full() return against 0
Hi, Jeff King wrote: > We ask to write 41 bytes and make sure that the return value > is at least 41. This is the same "dangerous" pattern that > was fixed in the prior commit (wherein a negative return > value is promoted to unsigned), though it is not dangerous > here because our "41" is a constant, not an unsigned > variable. > > But we should convert it anyway to avoid modeling a > dangerous construct. > > Signed-off-by: Jeff King> --- > builtin/get-tar-commit-id.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) I kind of disagree with calling this dangerous (and I think that is what you alluded to above by putting it in quotes), but I like the postimage more than the preimage. The variable 'n' could be eliminated to simplify this further. I realize that would go against the spirit of this patch, but (1) it's on-topic for the patch, since it is another ssize_t vs constant comparison and (2) as mentioned elsewhere in this thread, it's a very common idiom with read_in_full. If we want to eliminate it then we could introduce a separate helper to distinguish between read_this_much_i_mean_it and read_this_much_or_to_eof. Anyway, I think the below is an improvement. With or without this tweak, Reviewed-by: Jonathan Nieder Thanks. diff --git i/builtin/get-tar-commit-id.c w/builtin/get-tar-commit-id.c index 6d9a79f9b3..03ef7c5ba4 100644 --- i/builtin/get-tar-commit-id.c +++ w/builtin/get-tar-commit-id.c @@ -20,13 +20,11 @@ int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix) struct ustar_header *header = (struct ustar_header *)buffer; char *content = buffer + RECORDSIZE; const char *comment; - ssize_t n; if (argc != 1) usage(builtin_get_tar_commit_id_usage); - n = read_in_full(0, buffer, HEADERSIZE); - if (n < HEADERSIZE) + if (read_in_full(0, buffer, HEADERSIZE) < HEADERSIZE) die("git get-tar-commit-id: read error"); if (header->typeflag[0] != 'g') return 1;
Re: [PATCH 2/7] get-tar-commit-id: check write_in_full() return against 0
On Wed, Sep 13, 2017 at 02:02:32PM -0400, Jeff King wrote: > > Does read_in_full need a similar treatment? > > It might actually return fewer than the requested number of bytes, so it > can't just use "< 0" in the same way (nor be adapted to return 0 on > success). But I think it's still a bug to do: > > char buf[20]; > size_t len = sizeof(buf); > if (read_in_full(fd, buf, len) < len) > die(...); > > since that will promote the -1 to a size_t. So it's probably worth > auditing. I looked it over and found one potentially buggy case. In read-pack_header(), we do: if (read_in_full(fd, header, sizeof(*header)) < sizeof(*header)) /* "eof before pack header was fully read" */ return PH_ERROR_EOF; which will treat a read error as a silent success. I don't think it's harmful in practice because the next line checks that the header matches the PACK signature (which yes, means we're reading uninitialized bytes, but the chances are 1 in 2^32 that they match the signature. Unless the buffer had an old PACK signature in it, of course ;) ). There's one other harmless "< len" check. There are a bunch of "!=" checks which are OK as-is. Converting them to something equally short is hard, though we could introduce a helper similar to your write_fully(), like: int read_exactly(int fd, char *buf, size_t len) { ssize_t ret = read_in_full(fd, buf, len); if (ret < 0) return -1; /* real error */ else if (ret < len) return -1; /* early eof */ return 0; } But the trouble is that a _good_ caller actually handles those cases separately to provide better error messages (by doing that same if-cascade themselves). If those if's were turned into error() or die() calls, then we'd actually be improving the callsites. But of course not all of them actually print an error or die. And when they do, they usually say something specific about _what_ they were trying to read. So it may not be worth trying to do a mass-cleanup in the same way here. -Peff
Re: [PATCH 2/7] get-tar-commit-id: check write_in_full() return against 0
On Wed, Sep 13, 2017 at 10:53:57AM -0700, Jonathan Nieder wrote: > Jeff King wrote: > > > We ask to write 41 bytes and make sure that the return value > > is at least 41. This is the same "dangerous" pattern that > > was fixed in the prior commit (wherein a negative return > > value is promoted to unsigned), though it is not dangerous > > here because our "41" is a constant, not an unsigned > > variable. > > > > But we should convert it anyway to avoid modeling a > > dangerous construct. > > If the above logic is correct, then I suspect this series does not go > far enough. write_in_full() would be one of those APIs that is easy > to misuse and difficult to use correctly, and if so we should fix that > at the source instead of trying to teach callers not to hold it wrong. Yes, this series is just removing bad examples. It doesn't do anything to make write_in_full() less potentially dangerous. On the other hand, it's no more or less dangerous than write(), which has the same return-value semantics. > E.g. what would you think of > > 1. Introduce a write_fully (sorry, I am bad at names) function > that returns 0 on success and a coccinelle semantic patch in > contrib/coccinelle to migrate callers in "make coccicheck": Yes, I considered that, though some callers really do care about assigning the number of bytes written. The fact that write() has the same problem, plus the fact that there were only 2 buggy instances across the code base made me think there's not a huge gain to that extra step. > @@ > expression E; > expression F; > expression G; > @@ > -write_in_full(E, F, G) < G > +write_fully(E, F, G) > > 2. Run "make coccicheck" and apply the result. > 3. Remove the write_in_full function. There's a step between those where you have to update all of the write_in_full() callers that store the result. Some of them would be trivial conversions, but some of them actually care about the length E.g., the one in imap-send.c, which is the only one I didn't convert away from "!= len" because it's half of an #ifdef with SSL_write() (which uses an "int" as the return value!). > Does read_in_full need a similar treatment? It might actually return fewer than the requested number of bytes, so it can't just use "< 0" in the same way (nor be adapted to return 0 on success). But I think it's still a bug to do: char buf[20]; size_t len = sizeof(buf); if (read_in_full(fd, buf, len) < len) die(...); since that will promote the -1 to a size_t. So it's probably worth auditing. -Peff
Re: [PATCH 2/7] get-tar-commit-id: check write_in_full() return against 0
Jeff King wrote: > We ask to write 41 bytes and make sure that the return value > is at least 41. This is the same "dangerous" pattern that > was fixed in the prior commit (wherein a negative return > value is promoted to unsigned), though it is not dangerous > here because our "41" is a constant, not an unsigned > variable. > > But we should convert it anyway to avoid modeling a > dangerous construct. If the above logic is correct, then I suspect this series does not go far enough. write_in_full() would be one of those APIs that is easy to misuse and difficult to use correctly, and if so we should fix that at the source instead of trying to teach callers not to hold it wrong. E.g. what would you think of 1. Introduce a write_fully (sorry, I am bad at names) function that returns 0 on success and a coccinelle semantic patch in contrib/coccinelle to migrate callers in "make coccicheck": @@ expression E; expression F; expression G; @@ -write_in_full(E, F, G) < G +write_fully(E, F, G) 2. Run "make coccicheck" and apply the result. 3. Remove the write_in_full function. Does read_in_full need a similar treatment? Thanks, Jonathan