Re: [BUG REPORT] Git log pretty date

2014-06-03 Thread Dennis Kaarsemaker
On Mon, Jun 02, 2014 at 11:46:19PM -0400, Jeff King wrote:
 On Fri, May 30, 2014 at 09:08:57AM +0100, Rodrigo Fernandes wrote:
 
  Do you have any idea how does github understand that is a bug and
  fixes it automatically?
  (I'm saying this because on Github the date is correct).
 
 I looked into this. The dates you see on GitHub's web UI are actually
 parsed by Rugged/libgit2. The libgit2 parser is slightly more forgiving
 in this instance; if it sees a broken timezone, it will leave the
 timestamp intact, and only omit the timezone. Whereas git says no, it's
 broken, and the timestamp cannot be trusted.
 
 I think both are equally valid strategies, and I do not even think it is
 a problem that they diverge between the two implementations. I'd be OK
 with a patch to make git handle errors in each independently, assuming
 it is not too invasive.

I think what libgit2 does is more wrong than what git does. It displays
the timestamp subtly wrong (off by 7 hours) instead of making it
completely clear that the timestamp is bogus.

-- 
Dennis Kaarsemaker den...@kaarsemaker.net
http://twitter.com/seveas
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG REPORT] Git log pretty date

2014-06-03 Thread Jeff King
On Tue, Jun 03, 2014 at 08:23:02AM +0200, Dennis Kaarsemaker wrote:

 On Mon, Jun 02, 2014 at 11:46:19PM -0400, Jeff King wrote:
  On Fri, May 30, 2014 at 09:08:57AM +0100, Rodrigo Fernandes wrote:
  
   Do you have any idea how does github understand that is a bug and
   fixes it automatically?
   (I'm saying this because on Github the date is correct).
  
  I looked into this. The dates you see on GitHub's web UI are actually
  parsed by Rugged/libgit2. The libgit2 parser is slightly more forgiving
  in this instance; if it sees a broken timezone, it will leave the
  timestamp intact, and only omit the timezone. Whereas git says no, it's
  broken, and the timestamp cannot be trusted.
  
  I think both are equally valid strategies, and I do not even think it is
  a problem that they diverge between the two implementations. I'd be OK
  with a patch to make git handle errors in each independently, assuming
  it is not too invasive.
 
 I think what libgit2 does is more wrong than what git does. It displays
 the timestamp subtly wrong (off by 7 hours) instead of making it
 completely clear that the timestamp is bogus.

I'm not sure what you mean. The timestamp is in UTC seconds-since-epoch,
and does not depend on the timezone. The timezone only indicates the
author's local time when the commit was made.

Whether the latter is relevant depends on the date format you are
showing (i.e., if you are showing it in the author's timezone, it
matters; for --date=local or --date=relative, it would not).

So I do not think libgit2 is at fault for parsing them separately; it
does not know how the result will be presented. What GitHub shows is
wrong, as it tries to put the timestamp into the author's timezone
(which it doesn't know). But in practice it turns out be more useful,
because we show relative dates for recent things (so 8 hours ago is
actually accurate and does not care about the timestamp). For distant
things, time zone effects become less important (we show Oct 29, 2010;
it may have actually been Oct 30 or Oct 28 in the author's zone, but
it's really not that important anymore).

Those comments are just on the two strategies. As far as
implementations, it looks like libgit2 somehow parses this commit as
+0700, which is odd. I'd expect it to fall back to +. I didn't dig
further.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG REPORT] Git log pretty date

2014-06-02 Thread Jeff King
On Fri, May 30, 2014 at 09:08:57AM +0100, Rodrigo Fernandes wrote:

 Do you have any idea how does github understand that is a bug and
 fixes it automatically?
 (I'm saying this because on Github the date is correct).

I looked into this. The dates you see on GitHub's web UI are actually
parsed by Rugged/libgit2. The libgit2 parser is slightly more forgiving
in this instance; if it sees a broken timezone, it will leave the
timestamp intact, and only omit the timezone. Whereas git says no, it's
broken, and the timestamp cannot be trusted.

I think both are equally valid strategies, and I do not even think it is
a problem that they diverge between the two implementations. I'd be OK
with a patch to make git handle errors in each independently, assuming
it is not too invasive.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG REPORT] Git log pretty date

2014-05-30 Thread Rodrigo Fernandes
Jeff,
Do you have any idea how does github understand that is a bug and
fixes it automatically?
(I'm saying this because on Github the date is correct).
Cumprimentos,
Rodrigo Fernandes


On Thu, May 29, 2014 at 8:57 PM, Rodrigo Fernandes rtfrodr...@gmail.com wrote:
 Jeff,
 I have no idea what was the tool. The repo is not mine. I found the
 problem when I was doing some tests and the commit parsing was failing
 on that repo.

 Cumprimentos,
 Rodrigo Fernandes


 On Thu, May 29, 2014 at 8:49 PM, Jeff King p...@peff.net wrote:
 On Thu, May 29, 2014 at 11:57:15AM -0700, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:

  ...
  to at least make --format date output consistent with the rest of git
  (and to make %at consistent with %ad and --date=raw). That still
  doesn't address Rodrigo's concern, though (we would print 0 +).
 [...]

 I actually am not very much interested in deciding what to show for
 a broken timestamp.  An empty string is just as good as any random
 cruft.

 I was thinking specifically of the first part I quoted above. We are not
 consistent between various methods of parsing/printing the date. That
 may fall into the if were doing it from scratch... category, though;
 it's possible that people currently using --format=%ad prefer and
 expect the empty string to denote a bogus value. I'm OK with leaving it.

 I agree with you that it would be nice to have one escape
 hatch to let the users see what garbage is recorded, if only for
 diagnostic purposes, and DATE_RAW may be one good way to do so (but
 I'd rather recommend cat-file commit for real diagnostics).

 Yeah, in case I wasn't clear, I don't actually like DATE_RAW as a way to
 do that. I'd prefer --pretty=raw or cat-file commit, which already
 work.

 I would be more interested to see whatever broken tool that created
 such a commit gets fixed.  Do we know where it came from?

 I don't think it has been said yet in the thread. Rodrigo?

 -Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG REPORT] Git log pretty date

2014-05-29 Thread Duy Nguyen
On Thu, May 29, 2014 at 5:29 PM, Rodrigo Fernandes rtfrodr...@gmail.com wrote:
 I get an empty response on the date field, but since pretty has `%ad`
 it should follow the --date and return the date even if wrong.

 ...

 I tried to check the source code but have no idea where to start,
 maybe if you point me on some direction I can take a look for my self.

pretty.c, format_commit_one() - format_person_part() -
show_ident_date() - show_date(). The last one is in date.c
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG REPORT] Git log pretty date

2014-05-29 Thread Dennis Kaarsemaker
On do, 2014-05-29 at 11:29 +0100, Rodrigo Fernandes wrote:
 
 The problem happens when I try to get a pretty log for a commit with a
 wrong date.

The commit is:

===
$ git cat-file commit e9dddaf24c9de45d9b4efdf38eff7c30eb200f48
tree d63aeb159635cb231e191505a95a129a3b4a7b38
parent 9276202f1c0dcc360433df222c90f7874558f072
author SamWM s...@webmonkeysolutions.com 1288370243 --700
committer SamWM s...@webmonkeysolutions.com 1288370243 --700

Update version number, make text formatting and indentation consistent
with the rest of the code


Those dates are indeed wrong, I'm not surprised git refuses to parse
them.

-- 
Dennis Kaarsemaker
www.kaarsemaker.net

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG REPORT] Git log pretty date

2014-05-29 Thread Rodrigo Fernandes
Dennis I think that could be an improvement.

Duy, can you point me where is the date print from normal `git log` or
`git show` so I can compare?


On Thu, May 29, 2014 at 12:07 PM, Dennis Kaarsemaker
den...@kaarsemaker.net wrote:
 On do, 2014-05-29 at 11:29 +0100, Rodrigo Fernandes wrote:

 The problem happens when I try to get a pretty log for a commit with a
 wrong date.

 The commit is:

 ===
 $ git cat-file commit e9dddaf24c9de45d9b4efdf38eff7c30eb200f48
 tree d63aeb159635cb231e191505a95a129a3b4a7b38
 parent 9276202f1c0dcc360433df222c90f7874558f072
 author SamWM s...@webmonkeysolutions.com 1288370243 --700
 committer SamWM s...@webmonkeysolutions.com 1288370243 --700

 Update version number, make text formatting and indentation consistent
 with the rest of the code
 

 Those dates are indeed wrong, I'm not surprised git refuses to parse
 them.

 --
 Dennis Kaarsemaker
 www.kaarsemaker.net

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG REPORT] Git log pretty date

2014-05-29 Thread Duy Nguyen
On Thu, May 29, 2014 at 7:24 PM, Rodrigo Fernandes rtfrodr...@gmail.com wrote:
 Duy, can you point me where is the date print from normal `git log` or
 `git show` so I can compare?

It's the same function, show_date() in date.c.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG REPORT] Git log pretty date

2014-05-29 Thread Jeff King
On Thu, May 29, 2014 at 05:50:56PM +0700, Duy Nguyen wrote:

 On Thu, May 29, 2014 at 5:29 PM, Rodrigo Fernandes rtfrodr...@gmail.com 
 wrote:
  I get an empty response on the date field, but since pretty has `%ad`
  it should follow the --date and return the date even if wrong.
 
  ...
 
  I tried to check the source code but have no idea where to start,
  maybe if you point me on some direction I can take a look for my self.
 
 pretty.c, format_commit_one() - format_person_part() -
 show_ident_date() - show_date(). The last one is in date.c

show_ident_date() recently learned to always print Jan 1 1970 as a
sentinel for dates that cannot be parsed (or for --date=raw, 0 +).
But I do not think we hit that code path at all here.

It looks like split_ident_line notices the bogus date and returns NULL
for s.date_begin. Then format_person_part skips the placeholder, since
it knows we have nothing worth showing.

I think we probably want to do something like:

diff --git a/pretty.c b/pretty.c
index 3c43db5..7214a2c 100644
--- a/pretty.c
+++ b/pretty.c
@@ -732,14 +732,6 @@ static size_t format_person_part(struct strbuf *sb, char 
part,
return placeholder_len;
}
 
-   if (!s.date_begin)
-   goto skip;
-
-   if (part == 't') {  /* date, UNIX timestamp */
-   strbuf_add(sb, s.date_begin, s.date_end - s.date_begin);
-   return placeholder_len;
-   }
-
switch (part) {
case 'd':   /* date */
strbuf_addstr(sb, show_ident_date(s, dmode));
@@ -753,6 +745,9 @@ static size_t format_person_part(struct strbuf *sb, char 
part,
case 'i':   /* date, ISO 8601 */
strbuf_addstr(sb, show_ident_date(s, DATE_ISO8601));
return placeholder_len;
+   case 't':   /* date, UNIX timestamp */
+   strbuf_addstr(sb, show_ident_date(s, DATE_RAW));
+   return placeholder_len;
}
 
 skip:

to at least make --format date output consistent with the rest of git
(and to make %at consistent with %ad and --date=raw). That still
doesn't address Rodrigo's concern, though (we would print 0 +).

For that, we would want on top:

  1. Teach split_ident_line to return _something_ for his malformed
 case.

  2. Teach show_ident_line to treat DATE_RAW to truly output raw
 content, even if it is malformed.

I am not sure whether those are a good idea or not. The current code
provides some level of sanitization, so that a parser of log output will
not get malformed cruft. And DATE_RAW may mean show me what is in the
raw commit, even if it is broken. Or it may just mean show me the date
in unix timestamp format, because that is easy to parse.

I'd argue that if somebody really wants the former, they should be using
--pretty=raw, which will show the raw commit headers, without any
parsing or fixup at all.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG REPORT] Git log pretty date

2014-05-29 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 ...
 to at least make --format date output consistent with the rest of git
 (and to make %at consistent with %ad and --date=raw). That still
 doesn't address Rodrigo's concern, though (we would print 0 +).

 For that, we would want on top:

   1. Teach split_ident_line to return _something_ for his malformed
  case.

   2. Teach show_ident_line to treat DATE_RAW to truly output raw
  content, even if it is malformed.

 I am not sure whether those are a good idea or not. The current code
 provides some level of sanitization, so that a parser of log output will
 not get malformed cruft. And DATE_RAW may mean show me what is in the
 raw commit, even if it is broken. Or it may just mean show me the date
 in unix timestamp format, because that is easy to parse.

 I'd argue that if somebody really wants the former, they should be using
 --pretty=raw, which will show the raw commit headers, without any
 parsing or fixup at all.

I actually am not very much interested in deciding what to show for
a broken timestamp.  An empty string is just as good as any random
cruft.  I agree with you that it would be nice to have one escape
hatch to let the users see what garbage is recorded, if only for
diagnostic purposes, and DATE_RAW may be one good way to do so (but
I'd rather recommend cat-file commit for real diagnostics).

I would be more interested to see whatever broken tool that created
such a commit gets fixed.  Do we know where it came from?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG REPORT] Git log pretty date

2014-05-29 Thread Jeff King
On Thu, May 29, 2014 at 11:57:15AM -0700, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  ...
  to at least make --format date output consistent with the rest of git
  (and to make %at consistent with %ad and --date=raw). That still
  doesn't address Rodrigo's concern, though (we would print 0 +).
 [...]
 
 I actually am not very much interested in deciding what to show for
 a broken timestamp.  An empty string is just as good as any random
 cruft.

I was thinking specifically of the first part I quoted above. We are not
consistent between various methods of parsing/printing the date. That
may fall into the if were doing it from scratch... category, though;
it's possible that people currently using --format=%ad prefer and
expect the empty string to denote a bogus value. I'm OK with leaving it.

 I agree with you that it would be nice to have one escape
 hatch to let the users see what garbage is recorded, if only for
 diagnostic purposes, and DATE_RAW may be one good way to do so (but
 I'd rather recommend cat-file commit for real diagnostics).

Yeah, in case I wasn't clear, I don't actually like DATE_RAW as a way to
do that. I'd prefer --pretty=raw or cat-file commit, which already
work.

 I would be more interested to see whatever broken tool that created
 such a commit gets fixed.  Do we know where it came from?

I don't think it has been said yet in the thread. Rodrigo?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG REPORT] Git log pretty date

2014-05-29 Thread Rodrigo Fernandes
Jeff,
I have no idea what was the tool. The repo is not mine. I found the
problem when I was doing some tests and the commit parsing was failing
on that repo.

Cumprimentos,
Rodrigo Fernandes


On Thu, May 29, 2014 at 8:49 PM, Jeff King p...@peff.net wrote:
 On Thu, May 29, 2014 at 11:57:15AM -0700, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:

  ...
  to at least make --format date output consistent with the rest of git
  (and to make %at consistent with %ad and --date=raw). That still
  doesn't address Rodrigo's concern, though (we would print 0 +).
 [...]

 I actually am not very much interested in deciding what to show for
 a broken timestamp.  An empty string is just as good as any random
 cruft.

 I was thinking specifically of the first part I quoted above. We are not
 consistent between various methods of parsing/printing the date. That
 may fall into the if were doing it from scratch... category, though;
 it's possible that people currently using --format=%ad prefer and
 expect the empty string to denote a bogus value. I'm OK with leaving it.

 I agree with you that it would be nice to have one escape
 hatch to let the users see what garbage is recorded, if only for
 diagnostic purposes, and DATE_RAW may be one good way to do so (but
 I'd rather recommend cat-file commit for real diagnostics).

 Yeah, in case I wasn't clear, I don't actually like DATE_RAW as a way to
 do that. I'd prefer --pretty=raw or cat-file commit, which already
 work.

 I would be more interested to see whatever broken tool that created
 such a commit gets fixed.  Do we know where it came from?

 I don't think it has been said yet in the thread. Rodrigo?

 -Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html