Re: [PATCH 4/5] log: handle integer overflow in timestamps
Jeff King writes: > On Mon, Feb 24, 2014 at 12:21:33PM -0800, Junio C Hamano wrote: > >> >> > + if (date_overflows(date)) >> >> > + date = 0; >> >> > + else { >> >> > + if (ident->tz_begin && ident->tz_end) >> >> > + tz = strtol(ident->tz_begin, NULL, 10); >> >> > + if (tz == LONG_MAX || tz == LONG_MIN) >> >> > + tz = 0; >> >> > + } >> >> >> >> ... don't we want to fix an input having a bogus timestamp and also >> >> a bogus tz recorded in it? >> > >> > If there is a bogus timestamp, then we do not want to look at tz at all. >> > We leave it at "0", so that we get a true sentinel: >> >> Ah, OK, I missed the initialization to 0 at the beginning. >> >> It might have been more clear if "int tz" declaration were left >> uninitialized, and the variable were explicitly cleared to 0 in the >> "date-overflows" error codepath, but it is not a big deal. > > It might be, but I think it would end up cumbersome. > ... > So I'd be in favor of keeping it as-is, but feel free to mark it up if > you feel strongly. I'd be in favor of keeping it as-is. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/5] log: handle integer overflow in timestamps
On Mon, Feb 24, 2014 at 12:21:33PM -0800, Junio C Hamano wrote: > >> > +if (date_overflows(date)) > >> > +date = 0; > >> > +else { > >> > +if (ident->tz_begin && ident->tz_end) > >> > +tz = strtol(ident->tz_begin, NULL, 10); > >> > +if (tz == LONG_MAX || tz == LONG_MIN) > >> > +tz = 0; > >> > +} > >> > >> ... don't we want to fix an input having a bogus timestamp and also > >> a bogus tz recorded in it? > > > > If there is a bogus timestamp, then we do not want to look at tz at all. > > We leave it at "0", so that we get a true sentinel: > > Ah, OK, I missed the initialization to 0 at the beginning. > > It might have been more clear if "int tz" declaration were left > uninitialized, and the variable were explicitly cleared to 0 in the > "date-overflows" error codepath, but it is not a big deal. It might be, but I think it would end up cumbersome. The initialization was already there from the previous version, which was hitting the else for "ident->tz_begin". Without fallback initializations, you end up with: if (ident->date_begin && ident->date_end) { date = strtoul(ident->date_begin, NULL, 10); if (date_overflows(date)) { date = 0; tz = 0; } else { if (ident->tz_begin && ident->tz_end) { tz = strtol(ident->tz_begin, NULL, 10); if (tz == LONG_MAX || tz == LONG_MIN) tz = 0; } else tz = 0; } } else { date = 0; tz = 0; } which I think is much more confusing (and hard to verify that the variables are always set). Checking !date as an error condition would make it a little more readable: if (ident->date_begin && ident->date_end) { date = strtoul(ident->date_begin, NULL, 10); if (date_overflows(date)) date = 0; } else date = 0; if (date) { if (ident->tz_begin && ident->tz_end) { tz = strtol(ident->tz_begin, NULL, 10); if (tz == LONG_MAX || tz == LONG_MIN) tz = 0; } else tz = 0; } else tz = 0; but then we treat date==0 as a sentinel, and can never correctly parse dates on Jan 1, 1970. So I'd be in favor of keeping it as-is, but feel free to mark it up if you feel strongly. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/5] log: handle integer overflow in timestamps
Jeff King writes: > On Mon, Feb 24, 2014 at 11:50:00AM -0800, Junio C Hamano wrote: > >> > We also recognize overflow in the timezone field, which >> > could produce nonsensical results. In this case we show the >> > parsed date, but in UTC. >> >> Both are good measures to fallback to sanity, but why is that >> if/else? In other words... >> >> > + if (date_overflows(date)) >> > + date = 0; >> > + else { >> > + if (ident->tz_begin && ident->tz_end) >> > + tz = strtol(ident->tz_begin, NULL, 10); >> > + if (tz == LONG_MAX || tz == LONG_MIN) >> > + tz = 0; >> > + } >> >> ... don't we want to fix an input having a bogus timestamp and also >> a bogus tz recorded in it? > > If there is a bogus timestamp, then we do not want to look at tz at all. > We leave it at "0", so that we get a true sentinel: Ah, OK, I missed the initialization to 0 at the beginning. It might have been more clear if "int tz" declaration were left uninitialized, and the variable were explicitly cleared to 0 in the "date-overflows" error codepath, but it is not a big deal. Thanks for clarification. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/5] log: handle integer overflow in timestamps
On Mon, Feb 24, 2014 at 11:50:00AM -0800, Junio C Hamano wrote: > > We also recognize overflow in the timezone field, which > > could produce nonsensical results. In this case we show the > > parsed date, but in UTC. > > Both are good measures to fallback to sanity, but why is that > if/else? In other words... > > > + if (date_overflows(date)) > > + date = 0; > > + else { > > + if (ident->tz_begin && ident->tz_end) > > + tz = strtol(ident->tz_begin, NULL, 10); > > + if (tz == LONG_MAX || tz == LONG_MIN) > > + tz = 0; > > + } > > ... don't we want to fix an input having a bogus timestamp and also > a bogus tz recorded in it? If there is a bogus timestamp, then we do not want to look at tz at all. We leave it at "0", so that we get a true sentinel: Thu Jan 1 00:00:00 1970 + and not: Wed Dec 31 19:00:00 1969 -0500 If the timestamp is valid, then we bother to parse the zone, and handle overflow there. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/5] log: handle integer overflow in timestamps
Jeff King writes: > If an ident line has a ridiculous date value like (2^64)+1, > we currently just pass ULONG_MAX along to the date code, > which can produce nonsensical dates. > > On systems with a signed long time_t (e.g., 64-bit glibc > systems), this actually doesn't end up too bad. The > ULONG_MAX is converted to -1, we apply the timezone field to > that, and the result ends up somewhere between Dec 31, 1969 > and Jan 1, 1970. > ... > We also recognize overflow in the timezone field, which > could produce nonsensical results. In this case we show the > parsed date, but in UTC. Both are good measures to fallback to sanity, but why is that if/else? In other words... > + if (date_overflows(date)) > + date = 0; > + else { > + if (ident->tz_begin && ident->tz_end) > + tz = strtol(ident->tz_begin, NULL, 10); > + if (tz == LONG_MAX || tz == LONG_MIN) > + tz = 0; > + } ... don't we want to fix an input having a bogus timestamp and also a bogus tz recorded in it? > return show_date(date, tz, mode); > } > > diff --git a/t/t4212-log-corrupt.sh b/t/t4212-log-corrupt.sh > index 83de981..ba25a2e 100755 > --- a/t/t4212-log-corrupt.sh > +++ b/t/t4212-log-corrupt.sh > @@ -65,4 +65,20 @@ test_expect_success 'unparsable dates produce sentinel > value (%ad)' ' > test_cmp expect actual > ' > > +# date is 2^64 + 1 > +test_expect_success 'date parser recognizes integer overflow' ' > + commit=$(munge_author_date HEAD 18446744073709551617) && > + echo "Thu Jan 1 00:00:00 1970 +" >expect && > + git log -1 --format=%ad $commit >actual && > + test_cmp expect actual > +' > + > +# date is 2^64 - 2 > +test_expect_success 'date parser recognizes time_t overflow' ' > + commit=$(munge_author_date HEAD 18446744073709551614) && > + echo "Thu Jan 1 00:00:00 1970 +" >expect && > + git log -1 --format=%ad $commit >actual && > + test_cmp expect actual > +' > + > test_done -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/5] log: handle integer overflow in timestamps
If an ident line has a ridiculous date value like (2^64)+1, we currently just pass ULONG_MAX along to the date code, which can produce nonsensical dates. On systems with a signed long time_t (e.g., 64-bit glibc systems), this actually doesn't end up too bad. The ULONG_MAX is converted to -1, we apply the timezone field to that, and the result ends up somewhere between Dec 31, 1969 and Jan 1, 1970. However, there is still a few good reasons to detect the overflow explicitly: 1. On systems where "unsigned long" is smaller than time_t, we get a nonsensical date in the future. 2. Even where it would produce "Dec 31, 1969", it's easier to recognize "midnight Jan 1" as a consistent sentinel value for "we could not parse this". 3. Values which do not overflow strtoul but do overflow a signed time_t produce nonsensical values in the past. For example, on a 64-bit system with a signed long time_t, a timestamp of 184467440730 produces a date in 1947. We also recognize overflow in the timezone field, which could produce nonsensical results. In this case we show the parsed date, but in UTC. Signed-off-by: Jeff King --- A note on these tests. They are designed for 64-bit systems, but should run fine on 32-bit systems (they are both just overflow, and the second test is not doing anything novel that the first is not). However, the second test relies on finding a value that fits into an "unsigned long" but does not fit into a time_t. For systems with a 64-bit signed time_t, that's fine. But if a system has an unsigned 64-bit time_t, the test will fail (it will actually produce some value 300 billion years from now). I'm inclined to include it as-is, as I do not know of any such system (and it would be kind of lame, since it could not represent dates before 1970). If somebody comes up with such a system, we can allow either output (we could do it preemptively, but somebody would have to calculate the exact date/time billions of years in the future; we would be just as likely to disagree with the system's gmtime about something silly like leapseconds). pretty.c | 10 -- t/t4212-log-corrupt.sh | 16 2 files changed, 24 insertions(+), 2 deletions(-) diff --git a/pretty.c b/pretty.c index 87db08b..3b811ed 100644 --- a/pretty.c +++ b/pretty.c @@ -401,8 +401,14 @@ static const char *show_ident_date(const struct ident_split *ident, if (ident->date_begin && ident->date_end) date = strtoul(ident->date_begin, NULL, 10); - if (ident->tz_begin && ident->tz_end) - tz = strtol(ident->tz_begin, NULL, 10); + if (date_overflows(date)) + date = 0; + else { + if (ident->tz_begin && ident->tz_end) + tz = strtol(ident->tz_begin, NULL, 10); + if (tz == LONG_MAX || tz == LONG_MIN) + tz = 0; + } return show_date(date, tz, mode); } diff --git a/t/t4212-log-corrupt.sh b/t/t4212-log-corrupt.sh index 83de981..ba25a2e 100755 --- a/t/t4212-log-corrupt.sh +++ b/t/t4212-log-corrupt.sh @@ -65,4 +65,20 @@ test_expect_success 'unparsable dates produce sentinel value (%ad)' ' test_cmp expect actual ' +# date is 2^64 + 1 +test_expect_success 'date parser recognizes integer overflow' ' + commit=$(munge_author_date HEAD 18446744073709551617) && + echo "Thu Jan 1 00:00:00 1970 +" >expect && + git log -1 --format=%ad $commit >actual && + test_cmp expect actual +' + +# date is 2^64 - 2 +test_expect_success 'date parser recognizes time_t overflow' ' + commit=$(munge_author_date HEAD 18446744073709551614) && + echo "Thu Jan 1 00:00:00 1970 +" >expect && + git log -1 --format=%ad $commit >actual && + test_cmp expect actual +' + test_done -- 1.8.5.2.500.g8060133 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html