Re: approxidate parsing for bad time units
As you mentioned, parsing "n ... [month]", and even "...n..." (e.g. "the 3rd") as the nth day of a month is great, but in this case, I think "n ... ago" is a pretty strong sign that that's not the intended behavior.

My first thought was just to make it an error if the string ends in "ago" but the date is parsed as a day of the month. You don't actually have to come up with any typos to blacklist, just keep the "ago" from being silently ignored. I suspect "n units ago" is by far the most common use of the approxidate parsing in the wild, since it's documented and has been popularized online. So throwing an error just in that case would save essentially everyone. I hadn't even realized it worked without "ago" until I looked at the code.

If that doesn't sound like a good plan, then yes, I agree, it'd be tricky to catch it in the general case without breaking things. (Levenshtein distance to the target strings instead of exact matching, I guess, so that it could say "did you mean..." like for misspelled commands.)

On Fri, Sep 7, 2012 at 6:54 AM, Jeff King p...@peff.net wrote:
> On Thu, Sep 06, 2012 at 02:01:30PM -0700, Jeffrey Middleton wrote:
>
>> I'm generally very happy with the fuzzy parsing. It's a great feature that is designed to and in general does save users a lot of time and thought. In this case I don't think it does. The problems are:
>> (1) It's not ignoring things it can't understand, it's silently interpreting them in a useless way.
>
> Right, but we would then need to come up with a list of things it _does_ understand. So right now I can say "6 June" or "6th of June" or even "6 de June", and it works because we just ignore the cruft in the middle.
>
> So I think you'd need to either whitelist what everybody is typing, or blacklist some common typos (or convince people to be stricter in what they type).
>
>> So I do think it's worth improving. (Yes, I know, send patches; I'll think about it.)
>
> You read my mind. :)
>
> -Peff

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
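For illustration, the stricter check floated above (reject an "n <unit> ago" string whose unit is not recognized, and suggest the nearest known unit by Levenshtein distance) might look roughly like the following. This is a minimal Python sketch, not git's C code: KNOWN_UNITS, levenshtein(), check_ago() and the max_distance threshold of 2 are all hypothetical names and choices made up for the example, and it only inspects the word right before "ago".

```python
# Hypothetical sketch; none of these names come from git's date.c.
KNOWN_UNITS = ("second", "minute", "hour", "day", "week", "month", "year")


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete ca
                           cur[j - 1] + 1,       # insert cb
                           prev[j - 1] + cost))  # substitute ca -> cb
        prev = cur
    return prev[-1]


def check_ago(text: str, max_distance: int = 2) -> str:
    """Return the recognized unit of an 'n <unit> ago' string, else raise ValueError."""
    # Treat dots like spaces so that "4.days.ago" and "4 days ago" are equivalent.
    words = text.replace(".", " ").lower().split()
    if len(words) < 3 or words[-1] != "ago" or not words[0].isdigit():
        raise ValueError(f"not of the form 'n <unit> ago': {text!r}")
    unit = words[-2]
    singular = unit[:-1] if unit.endswith("s") else unit
    if singular in KNOWN_UNITS:
        return singular
    # The unit is unknown: refuse instead of silently ignoring the "ago",
    # and offer the closest known unit if it is near enough to be a typo.
    best = min(KNOWN_UNITS, key=lambda u: levenshtein(singular, u))
    if levenshtein(singular, best) <= max_distance:
        raise ValueError(f"unknown unit {unit!r}; did you mean {best!r}?")
    raise ValueError(f"unknown unit {unit!r}")
```

With this sketch, check_ago("4.days.ago") returns "day", check_ago("5.dyas.ago") raises an error suggesting 'day', and check_ago("1 frobbles ago") raises without a suggestion because no known unit is within the threshold.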
Re: approxidate parsing for bad time units
On Thu, Sep 06, 2012 at 02:01:30PM -0700, Jeffrey Middleton wrote:

> I'm generally very happy with the fuzzy parsing. It's a great feature that is designed to and in general does save users a lot of time and thought. In this case I don't think it does. The problems are:
> (1) It's not ignoring things it can't understand, it's silently interpreting them in a useless way.

Right, but we would then need to come up with a list of things it _does_ understand. So right now I can say "6 June" or "6th of June" or even "6 de June", and it works because we just ignore the cruft in the middle.

So I think you'd need to either whitelist what everybody is typing, or blacklist some common typos (or convince people to be stricter in what they type).

> So I do think it's worth improving. (Yes, I know, send patches; I'll think about it.)

You read my mind. :)

-Peff
Re: approxidate parsing for bad time units
Jeffrey Middleton jefr...@gmail.com writes:

> In telling someone what date formats git accepts, and how to verify it understands, I noticed this weirdness:
>
>   $ export TEST_DATE_NOW=`date -u +%s --date='September 10'`; ./test-date approxidate now; for i in `seq 1 10`; do ./test-date approxidate "$i frobbles ago"; done
>   now -> 2012-09-10 00:00:00 +0000
>   1 frobbles ago -> 2012-09-02 00:00:00 +0000
>   ...
>   10 frobbles ago -> 2012-09-11 00:00:00 +0000
>
> Which gets more concerning once you realize the same thing happens no matter what fake unit of time you use... including things like "yaers" and "moths". Perhaps approxidate could be a little stricter?

Could be stricter, perhaps. Do we care deeply? I doubt it, and for a good reason.

The fuzzy parsing is primarily [*1*] for humans getting interactive results who are expected to be able to notice when the fuzziness went far off. As long as we have ways for scripts and humans to feed its input in a more strict and unambiguous way [*2*], it does not hurt anybody if the fuzzy parser ignored crufts that it does not understand.

[Footnotes]

*1* ... and of course some coding fun and easter egg values. Think of it as our own Eliza or Zork parser ;-).

*2* And of course we do.
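The message does not say which strict input forms it has in mind. One way to sidestep the fuzzy path, sketched here under that caveat, is to compute the cutoff yourself and hand git a fully spelled-out timestamp. days_ago_iso() is a hypothetical helper written for this example, not part of git.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def days_ago_iso(n: int, now: Optional[datetime] = None) -> str:
    """Format the instant n days before `now` as 'YYYY-MM-DD HH:MM:SS +0000'.

    `now` must be timezone-aware; it defaults to the current time in UTC.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # Normalize to UTC so the hard-coded +0000 offset below is always correct.
    cutoff = now.astimezone(timezone.utc) - timedelta(days=n)
    return cutoff.strftime("%Y-%m-%d %H:%M:%S +0000")
```

The resulting string can then be passed as an explicit date, for example `git rev-list --since="2012-09-06 00:00:00 +0000" HEAD`, so a misspelled unit never enters the picture.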
Re: approxidate parsing for bad time units
I'm generally very happy with the fuzzy parsing. It's a great feature that is designed to and in general does save users a lot of time and thought. In this case I don't think it does. The problems are:

(1) It's not ignoring things it can't understand, it's silently interpreting them in a useless way. I'm pretty sure that "n units ago" is equivalent to the same time of day on the last day of the previous month, plus n days.

(2) Though in some cases it's really obvious, in others it's quite possible not to notice, e.g. if `git rev-list --since=5.dyas.ago` is silently the same as `git rev-list --since=4.days.ago`.

So I do think it's worth improving. (Yes, I know, send patches; I'll think about it.)

On Thu, Sep 6, 2012 at 1:36 PM, Junio C Hamano gits...@pobox.com wrote:
> Jeffrey Middleton jefr...@gmail.com writes:
>
>> In telling someone what date formats git accepts, and how to verify it understands, I noticed this weirdness:
>>
>>   $ export TEST_DATE_NOW=`date -u +%s --date='September 10'`; ./test-date approxidate now; for i in `seq 1 10`; do ./test-date approxidate "$i frobbles ago"; done
>>   now -> 2012-09-10 00:00:00 +0000
>>   1 frobbles ago -> 2012-09-02 00:00:00 +0000
>>   ...
>>   10 frobbles ago -> 2012-09-11 00:00:00 +0000
>>
>> Which gets more concerning once you realize the same thing happens no matter what fake unit of time you use... including things like "yaers" and "moths". Perhaps approxidate could be a little stricter?
>
> Could be stricter, perhaps. Do we care deeply? I doubt it, and for a good reason.
>
> The fuzzy parsing is primarily [*1*] for humans getting interactive results who are expected to be able to notice when the fuzziness went far off. As long as we have ways for scripts and humans to feed its input in a more strict and unambiguous way [*2*], it does not hurt anybody if the fuzzy parser ignored crufts that it does not understand.
>
> [Footnotes]
>
> *1* ... and of course some coding fun and easter egg values. Think of it as our own Eliza or Zork parser ;-).
>
> *2* And of course we do.