Re: Crashes while trying to show tag objects with bad timestamps
On Fri, Feb 22, 2013 at 03:20:10PM -0800, Junio C Hamano wrote: > As pp_user_info() is called from very few places, I do not think it > is unreasonable to add an output parameter (i.e. "unsigned *") to > let the caller know that we made a best guess given malformed input > and handle the error in the caller. The make_cover_letter() caller > may look like: > > pp_user_info(&pp, NULL, &sb, committer, encoding, &errors); > if (errors & PP_CORRUPT_DATE) > warning("unparsable datestamp in '%s'", committer); > > although it is unlikely to see this error in practice, given that > committer is coming from git_committer_info(0) and would have the > current timestamp. Sadly that is not quite enough for the object-parsing cases (which are the ones we _really_ want to add context to, because they are buried inside other pp_* calls. Probably adding an object context field (or an error return) to the pretty-print context would make sense. But I don't relish the thought of annotating each pretty-print caller. I think we're OK to be silent and just react in an appropriate way; having looked over the other callers of split_ident_line, we already do so in some places. See my patch 1 below for details. Once fsck is taught to note this, then the warning is a lot less important (my patch 3 below). > The whole "cat-file -p" is a historical wart, aka poor-man's > "show". I do not even consider it a part of the plumbing. It is a > fair game for Porcelainisque improvement ;-) Good, that's how I feel, too. See my patch 4. :) Here are the patches I'd like to do: [1/4]: handle malformed dates in ident lines [2/4]: skip_prefix: return a non-const pointer [3/4]: fsck: check "tagger" lines [4/4]: cat-file: print tags raw for "cat-file -p" The first one is solid, and should probably go to maint and/or the -rc track, as it fixes a segfault on bogus input. It's hopefully a no-brainer, as the existing behavior is obviously unacceptable. We may change our mind later about exactly what to print for such bogus input, but whatever we print in such a case is just trying to be nice to the user, and anybody who depends on our particular handling of malformed objects is crazy. The rest can wait, as they are about improving output when fed bogus input, or tightening fsck. Moreover, they have some problems which make them not suitable for applying yet. I'll give details in each patch. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Crashes while trying to show tag objects with bad timestamps
On Sat, Feb 23, 2013 at 01:14:40AM +0200, Mantas Mikulėnas wrote: > > Then I think it would make sense to allow the very specific no-date tag, > > but not allow arbitrary crud. I wonder if there's an example in the > > kernel or in git.git. > > I couldn't find any such examples. However, I did find several tags > with no "tagger" line at all: git.git has "v0.99" and linux.git has > many such tags starting with "v2.6.11" ending with "v2.6.13-rc3". Yes, I think Junio was mis-remembering the exact condition. It looks like we added tagger lines in c818566 ([PATCH] Update tags to record who made them, 2005-07-14), which pulls the identity straight from "git var GIT_COMMITTER_IDENT". I double-checked to be sure that we included the date stamp at that time, and we did. When parsing such a tag, we put a "0" in the date field of the "struct tag", and I suspect that is what caused the memory confusion. So I think we are fine to fsck tagger lines as we do ordinary author/committer ident lines; the only exception is that we should not complain if they do not exist. > It seems that `git cat-file -p` doesn't like such tags too – if there > is no "tagger", it doesn't display *any* header lines. More bugs? Yeah, I think we should just rid of that parser entirely. It is very inconsistent with the pretty-printer used by "git show", as well as the one used by "git for-each-ref", not to mention parse_tag (ugh, how many tag parsers do we have?). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Crashes while trying to show tag objects with bad timestamps
Jeff King writes: > On Fri, Feb 22, 2013 at 02:53:48PM -0800, Junio C Hamano wrote: > >> > I guess we should probably issue a warning, too. Also disappointingly, >> > git-fsck does not seem to detect this breakage at all. >> >> Yes for the warning, > > Unfortunately, a good warning is harder than I had hoped. At the point > where we notice the problem, pp_user_info, we have very little context. > We can say only something like: > > warning: malformed date in ident 'Jeff King BOGUS' > > but we cannot say in which object, or even that it was a "tagger" line > (and in some cases we do not even have an object, as in > make_cover_letter). As pp_user_info() is called from very few places, I do not think it is unreasonable to add an output parameter (i.e. "unsigned *") to let the caller know that we made a best guess given malformed input and handle the error in the caller. The make_cover_letter() caller may look like: pp_user_info(&pp, NULL, &sb, committer, encoding, &errors); if (errors & PP_CORRUPT_DATE) warning("unparsable datestamp in '%s'", committer); although it is unlikely to see this error in practice, given that committer is coming from git_committer_info(0) and would have the current timestamp. > I also took a look at parsing routine of "cat-file -p". It's totally > hand-rolled, separate from what "git show" does, and is not build on the > pretty-print code at all. I wonder, though, if it actually makes sense > to munge the date there. The commit-object pretty-printer for cat-file > just shows the object intact. It seems weirdly inconsistent that we > would munge tags just to rewrite the date. If you want a real > pretty-printer, you should be using porcelain like "show". The whole "cat-file -p" is a historical wart, aka poor-man's "show". I do not even consider it a part of the plumbing. It is a fair game for Porcelainisque improvement ;-) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Crashes while trying to show tag objects with bad timestamps
On Sat, Feb 23, 2013 at 1:04 AM, Jeff King wrote: > On Fri, Feb 22, 2013 at 02:53:48PM -0800, Junio C Hamano wrote: >> and no for disappointing. IIRC, in the very early implementations >> allowed tag object without dates. >> >> I _think_ we can start tightening fsck, though. > > Then I think it would make sense to allow the very specific no-date tag, > but not allow arbitrary crud. I wonder if there's an example in the > kernel or in git.git. I couldn't find any such examples. However, I did find several tags with no "tagger" line at all: git.git has "v0.99" and linux.git has many such tags starting with "v2.6.11" ending with "v2.6.13-rc3". It seems that `git cat-file -p` doesn't like such tags too – if there is no "tagger", it doesn't display *any* header lines. More bugs? -- Mantas Mikulėnas -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Crashes while trying to show tag objects with bad timestamps
On Fri, Feb 22, 2013 at 02:53:48PM -0800, Junio C Hamano wrote: > > I guess we should probably issue a warning, too. Also disappointingly, > > git-fsck does not seem to detect this breakage at all. > > Yes for the warning, Unfortunately, a good warning is harder than I had hoped. At the point where we notice the problem, pp_user_info, we have very little context. We can say only something like: warning: malformed date in ident 'Jeff King BOGUS' but we cannot say in which object, or even that it was a "tagger" line (and in some cases we do not even have an object, as in make_cover_letter). > and no for disappointing. IIRC, in the very early implementations > allowed tag object without dates. > > I _think_ we can start tightening fsck, though. Then I think it would make sense to allow the very specific no-date tag, but not allow arbitrary crud. I wonder if there's an example in the kernel or in git.git. I also took a look at parsing routine of "cat-file -p". It's totally hand-rolled, separate from what "git show" does, and is not build on the pretty-print code at all. I wonder, though, if it actually makes sense to munge the date there. The commit-object pretty-printer for cat-file just shows the object intact. It seems weirdly inconsistent that we would munge tags just to rewrite the date. If you want a real pretty-printer, you should be using porcelain like "show". It would be a regression, of course, for people relying on "cat-file -p" to have consistent output. But I am very tempted to call it a bug, and tempted to call "cat-file -p" inside a script a bad thing (you cannot, after all, tell what object type you have; you should figure out the type you expect and then use "cat-file " to make sure you get the right one). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Crashes while trying to show tag objects with bad timestamps
Jeff King writes: > I guess we should probably issue a warning, too. Also disappointingly, > git-fsck does not seem to detect this breakage at all. Yes for the warning, and no for disappointing. IIRC, in the very early implementations allowed tag object without dates. I _think_ we can start tightening fsck, though. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Crashes while trying to show tag objects with bad timestamps
On Sat, Feb 23, 2013 at 12:30:28AM +0200, Mantas Mikulėnas wrote: > When messing around with various repositories, I noticed that git 1.8 > (currently using 1.8.2.rc0.22.gb3600c3) has problems parsing tag objects > that have invalid timestamps. > > Times in tag objects appear to be kept as Unix timestamps, but I didn't > realize this at first, and ran something roughly equivalent to: > git cat-file -p $tagname | git hash-object -w -t tag --stdin > creating a tag object the "tagger" line containing formatted time > instead of a Unix timestamp. Thanks, that makes it easy to replicate. It looks like it is not just tags, but rather the pp_user_info function does not realize that split_ident may return NULL for the date field if it is unparseable. Something like this stops the crash and just gives a bogus date: diff --git a/pretty.c b/pretty.c index eae57ad..9688857 100644 --- a/pretty.c +++ b/pretty.c @@ -428,8 +428,16 @@ void pp_user_info(const struct pretty_print_context *pp, strbuf_add(&name, namebuf, namelen); namelen = name.len + mail.len + 3; /* ' ' + '<' + '>' */ - time = strtoul(ident.date_begin, &date, 10); - tz = strtol(date, NULL, 10); + + if (ident.date_begin) { + time = strtoul(ident.date_begin, &date, 10); + tz = strtol(date, NULL, 10); + } + else { + /* ident line had malformed date */ + time = 0; + tz = 0; + } if (pp->fmt == CMIT_FMT_EMAIL) { strbuf_addstr(sb, "From: "); I guess we should probably issue a warning, too. Also disappointingly, git-fsck does not seem to detect this breakage at all. > Git doesn't handle the resulting tag objects nicely at all. For example, > running `git cat-file -p` on the new object outputs a really odd > timestamp "Thu Jun Thu Jan 1 00:16:09 1970 +0016" (I'm guessing it > parses the year as Unix time), and `git show` outright crashes > (backtrace included below.) If "cat-file -p" is not using the usual pretty-print routines, it probably should. I'll take a look. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html