Re: approxidate parsing for bad time units

2012-09-10 Thread Jeffrey Middleton
As you mentioned, parsing "n ... [month]", and even "...n..." (e.g.
"the 3rd"), as the nth day of a month is great, but in this case I
think "n ... ago" is a pretty strong sign that that's not the intended
behavior.

My first thought was just to make it an error if the string ends in
"ago" but the date is parsed as a day of the month. You don't actually
have to come up with any typos to blacklist, just keep the "ago" from
being silently ignored. I suspect "n units ago" is by far the most
common use of the approxidate parsing in the wild, since it's
documented and has been popularized online. So throwing an error just
in that case would save essentially everyone. I hadn't even realized
it worked without "ago" until I looked at the code.
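
To make the idea concrete, here is a minimal sketch in Python rather than git's C. It is not git's code: the function name check_ago and the KNOWN_UNITS table are hypothetical, and "parsed as a day of the month" is approximated as "no recognized unit appears before the trailing ago".

```python
import re

# Hypothetical unit list for illustration only; git's real table lives in
# date.c and matches more loosely than this exact comparison does.
KNOWN_UNITS = {"second", "minute", "hour", "day", "week", "month", "year"}


def check_ago(text):
    """Raise ValueError if text ends in 'ago' but names no known unit."""
    # Split on anything that is not a letter or digit, so "5.days.ago"
    # and "5 days ago" tokenize the same way.
    tokens = [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]
    if not tokens or tokens[-1] != "ago":
        return  # no trailing "ago": leave the fuzzy behavior alone
    for tok in tokens[:-1]:
        singular = tok[:-1] if tok.endswith("s") else tok
        if singular in KNOWN_UNITS:
            return  # a real unit precedes "ago"
    raise ValueError(f"{text!r}: 'ago' without a recognized time unit")
```

Under these assumptions, check_ago("4.days.ago") and check_ago("6 June") pass silently, while check_ago("5.dyas.ago") raises instead of quietly picking a day of the month.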

If that doesn't sound like a good plan, then yes, I agree, it'd be
tricky to catch it in the general case without breaking things.
(Levenshtein distance to the target strings instead of exact matching,
I guess, so that it could say "did you mean..." like for misspelled
commands.)
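
A rough sketch of that fallback, again in Python and not taken from git: the UNITS list, the suggest_unit name, and the cutoff of 2 edits are all assumptions picked for illustration.

```python
def levenshtein(a, b):
    """Edit distance between a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        cur = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free if equal
            ))
        prev = cur
    return prev[-1]


# Hypothetical target strings; not git's actual unit table.
UNITS = ["seconds", "minutes", "hours", "days", "weeks", "months", "years"]


def suggest_unit(word, max_distance=2):
    """Return the closest unit name, or None if nothing is close enough."""
    best = min(UNITS, key=lambda unit: levenshtein(word, unit))
    return best if levenshtein(word, best) <= max_distance else None
```

With this cutoff, "dyas" and "yaers" are each two edits from "days" and "years", "moths" is one edit from "months", and "frobbles" gets no suggestion. The cutoff is a judgment call: set it too high and short cruft words start to look like units.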

On Fri, Sep 7, 2012 at 6:54 AM, Jeff King p...@peff.net wrote:

 On Thu, Sep 06, 2012 at 02:01:30PM -0700, Jeffrey Middleton wrote:

  I'm generally very happy with the fuzzy parsing. It's a great feature
  that is designed to and in general does save users a lot of time and
  thought. In this case I don't think it does. The problems are:
  (1) It's not ignoring things it can't understand, it's silently
  interpreting them in a useless way.

 Right, but we would then need to come up with a list of things it _does_
 understand. So right now I can say "6 June" or "6th of June" or even "6
 de June", and it works because we just ignore the cruft in the middle.

 So I think you'd need to either whitelist what everybody is typing, or
 blacklist some common typos (or convince people to be stricter in what
 they type).

  So I do think it's worth improving. (Yes, I know, send patches; I'll
  think about it.)

 You read my mind. :)

 -Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: approxidate parsing for bad time units

2012-09-07 Thread Jeff King
On Thu, Sep 06, 2012 at 02:01:30PM -0700, Jeffrey Middleton wrote:

 I'm generally very happy with the fuzzy parsing. It's a great feature
 that is designed to and in general does save users a lot of time and
 thought. In this case I don't think it does. The problems are:
 (1) It's not ignoring things it can't understand, it's silently
 interpreting them in a useless way.

Right, but we would then need to come up with a list of things it _does_
understand. So right now I can say "6 June" or "6th of June" or even "6
de June", and it works because we just ignore the cruft in the middle.
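
As a toy illustration of why that works (Python, not git's parser; the day_and_month name and the three-letter minimum for month prefixes are assumptions made for this sketch), a parser that keeps only what it recognizes handles all three spellings the same way:

```python
import re

MONTHS = ["january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"]


def day_and_month(text):
    """Pick out a day number and a month name, ignoring everything else."""
    day = month = None
    for tok in re.findall(r"[A-Za-z]+|\d+", text):
        if tok.isdigit():
            day = int(tok)
            continue
        # Assumed rule: a word counts as a month if it is a prefix of a
        # month name and at least three letters long.
        for idx, name in enumerate(MONTHS, start=1):
            if len(tok) >= 3 and name.startswith(tok.lower()):
                month = idx
        # Any other word ("th", "of", "de", ...) is silently dropped.
    return day, month
```

Here "6 June", "6th of June" and "6 de June" all give (6, 6). The same leniency turns "6 frobbles ago" into (6, None): a bare day number, with the unknown words discarded.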

So I think you'd need to either whitelist what everybody is typing, or
blacklist some common typos (or convince people to be stricter in what
they type).

 So I do think it's worth improving. (Yes, I know, send patches; I'll
 think about it.)

You read my mind. :)

-Peff


Re: approxidate parsing for bad time units

2012-09-06 Thread Junio C Hamano
Jeffrey Middleton jefr...@gmail.com writes:

 In telling someone what date formats git accepts, and how to verify it
 understands, I noticed this weirdness:

 $ export TEST_DATE_NOW=`date -u +%s --date='September 10'`;
 ./test-date approxidate now; for i in `seq 1 10`; do ./test-date
 approxidate "$i frobbles ago"; done
 now - 2012-09-10 00:00:00 +
 1 frobbles ago - 2012-09-02 00:00:00 +
 ...
 10 frobbles ago - 2012-09-11 00:00:00 +

 Which gets more concerning once you realize the same thing happens no
 matter what fake unit of time you use... including things like "yaers"
 and "moths". Perhaps approxidate could be a little stricter?

Could be stricter, perhaps.

Do we care deeply?  I doubt it, and for a good reason.  The fuzzy
parsing is primarily [*1*] for humans getting interactive results
who are expected to be able to notice when the fuzziness went far
off.

As long as we have ways for scripts and humans to feed it input in
a stricter and unambiguous way [*2*], it does not hurt anybody if
the fuzzy parser ignores cruft that it does not understand.


[Footnotes]

*1* ... and of course some coding fun and easter egg values. Think
of it as our own Eliza or Zork parser ;-).

*2* And of course we do.


Re: approxidate parsing for bad time units

2012-09-06 Thread Jeffrey Middleton
I'm generally very happy with the fuzzy parsing. It's a great feature
that is designed to and in general does save users a lot of time and
thought. In this case I don't think it does. The problems are:
(1) It's not ignoring things it can't understand, it's silently
interpreting them in a useless way. I'm pretty sure that "n units ago"
is equivalent to the same time of day on the last day of the previous
month, plus n days.
(2) Though in some cases it's really obvious, in others it's quite
possible not to notice, e.g. if `git rev-list --since=5.dyas.ago` is
silently the same as `git rev-list --since=4.days.ago`.

So I do think it's worth improving. (Yes, I know, send patches; I'll
think about it.)


On Thu, Sep 6, 2012 at 1:36 PM, Junio C Hamano gits...@pobox.com wrote:
 Jeffrey Middleton jefr...@gmail.com writes:

 In telling someone what date formats git accepts, and how to verify it
 understands, I noticed this weirdness:

 $ export TEST_DATE_NOW=`date -u +%s --date='September 10'`;
 ./test-date approxidate now; for i in `seq 1 10`; do ./test-date
 approxidate "$i frobbles ago"; done
 now - 2012-09-10 00:00:00 +
 1 frobbles ago - 2012-09-02 00:00:00 +
 ...
 10 frobbles ago - 2012-09-11 00:00:00 +

 Which gets more concerning once you realize the same thing happens no
 matter what fake unit of time you use... including things like "yaers"
 and "moths". Perhaps approxidate could be a little stricter?

 Could be stricter, perhaps.

 Do we care deeply?  I doubt it, and for a good reason.  The fuzzy
 parsing is primarily [*1*] for humans getting interactive results
 who are expected to be able to notice when the fuzziness went far
 off.

 As long as we have ways for scripts and humans to feed it input in
 a stricter and unambiguous way [*2*], it does not hurt anybody if
 the fuzzy parser ignores cruft that it does not understand.


 [Footnotes]

 *1* ... and of course some coding fun and easter egg values. Think
 of it as our own Eliza or Zork parser ;-).

 *2* And of course we do.