Re: policy around 'wontfix' bug tag

Michael Stone Sun, 04 Feb 2018 07:02:31 -0800

On Sun, Feb 04, 2018 at 02:27:00PM +0100, Nicolas George wrote:

Michael Stone (2018-02-04):

But a better parser would allow the same functionality, without being
confusing, inconsistent, and hard to maintain. So yes, I'll stand by
"complete misfeature".


Can you describe what you mean by "better parser" in more details?

Beware that the "same functionality" includes "same convenience".
Convenience is hard to achieve.

Well, it's not particularly convenient for people to have to constantly wonder why the parser isn't doing what they think it should do. I've been getting the questions and bug reports for 20 years, so trust me when I say that people have trouble predicting the output of a given input.

As for "better parser", that means something that requires the input to be fully specified, and does not try to guess based on natural language parsing. For example, what does "last month" mean? What does it mean when you're on the 31st and the previous month didn't have a 31st? What date is 1/2? What time zone is "EST"? Making guesses seems "convenient" but when you hit corner cases and things break horribly, that's not convenient after all. Most date parsers address this by requiring a format specifier along with the input, so you can say something like "parse '1/2' assuming the input is numericday/numericmonth". Is it less "convenient" to have to specify the format? Maybe, but it's also a heck of a lot more reliable.

Someone else pointed out postgresql's date parser, which lets you do things like specify a date and then add something like "interval '1 day'". Specifying the fact that a particular string is an interval makes the parsing much more regular than trying to pull the interval out of natural language.

At one point date would appear to properly parse ISO8601 input (YYYY-mm-ddTHH:MM:SS) but it interpreted the "T" as a timezone specifier instead of the ISO8601 delimiter. (Compare output with YYYY-mm-ddUHH:MM:SS or YYYY-mm-ddSHH:MM:SS.) Why would it ever have been "convenient" to put an alphabetic timezone specifier after the date and before the time? Who knows, but the natural language parser was doing its best to guess a meaning for the input. That particular issue was fixed, but how can you tell whether you're using a version that works the old way or the new way? (Answer: you can't easily do so. If you had to specify a format, it would be easier to hard fail when trying to use a format that wasn't understood, rather than soft fail and produce random output.)

Is it "convenient" that there's a natural language parser that only understands English? Maybe, if you speak English?
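
To make the format-specifier point concrete, here is a minimal sketch using Python's standard datetime module rather than date itself. The helper names parse_explicit and fails are made up for illustration. It only shows the general idea: the caller states the format, the same string gives different dates under different formats, an interval is given as an interval, and input that doesn't match is rejected instead of guessed at.

```python
from datetime import datetime, timedelta


def parse_explicit(text, fmt):
    """Parse text using exactly the stated format; raise ValueError otherwise.

    Hypothetical helper: a thin wrapper around datetime.strptime.
    """
    return datetime.strptime(text, fmt)


def fails(text, fmt):
    """Return True if text is rejected under fmt (hypothetical helper)."""
    try:
        parse_explicit(text, fmt)
    except ValueError:
        return True
    return False


# The same string means different dates depending on the stated format.
day_first = parse_explicit("1/2/2018", "%d/%m/%Y")    # day 1, month 2
month_first = parse_explicit("1/2/2018", "%m/%d/%Y")  # month 1, day 2

# An interval is supplied as an interval, not pulled out of free text.
next_day = day_first + timedelta(days=1)

# Input that does not match the format is rejected, not guessed at.
natural_language_rejected = fails("last month", "%d/%m/%Y")
wrong_delimiter_rejected = fails("2018-02-04T07:02:31", "%Y-%m-%d %H:%M:%S")
```

Having to pass the format every time is more typing, but a mismatch shows up as an error at the call site instead of as a plausible-looking wrong date.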


Mike Stone
