On 17/01/2010 12:17, David Hotham <[email protected]> wrote:
> This seems to have been more controversial than I expected. I'm
> going to write this post in which I will:
>
> - try to make the case for supporting 2-digit years in some
>   sensible way
> - answer some of the counter-arguments and questions that people
>   have raised
>
> ... at which point I intend to withdraw, and the community will no
> doubt reach some sensible conclusion.

David,

Thanks for taking the time to articulate a subject that at first may
seem simple but hides some complexity, because it requires bringing the
natural language of users closer to the deterministic rigor of
computers.

> The case for accepting 2-digit years is, I think, very simple: it's
> useful. Like it or not, data sometimes comes that way and this has
> to be dealt with.

I agree with the bulk of the statement about usefulness; the only
problem I see is in the details. "Data sometimes comes that way" has to
be qualified: are we talking about batch processing, where all the dates
are strings and no human intervenes during the import of the data, or
about an operator entering the data through an interface? In the first
scenario, IMO the right thing is to have a specific routine for
converting the string to a Date object. For the latter, since we're
talking about Smalltalk, we could always have a popup window asking for
the correct interpretation, or have appropriate flags that determine how
to interpret these strings as dates (IIRC Excel has/had such a setting).
Since these details will also be very application-specific, I think they
belong in application code and not in the Core of Pharo.

> Granted, this means that we have to introduce some heuristic to
> guess which century is intended. However, I don't see that this is
> so bad.

Yes. It would not be so bad as long as we refrain from trying to
encompass (i.e. put in Core) every approach that other platforms or
applications have arrived at.

> Amongst the possible approaches are:
> - Following the Posix standard for strptime, the date is assumed to
> be between 1969 and 2068 (see
> http://www.opengroup.org/onlinepubs/009695399/functions/strptime.html).

But the danger lies in this:

<quote>
Note: It is expected that in a future version of IEEE Std 1003.1-2001
the default century inferred from a 2-digit year will change.
(This would apply to all commands accepting a 2-digit year as input.)
</quote>

> Python goes this way, for example.
> - the date is assumed to be between 80 years in the past and 20
> years in the future (eg Java's SimpleDateFormat, see
> http://java.sun.com/javase/7/docs/api/java/text/SimpleDateFormat.html)
> - allow the 100 year-period in which 2-digit years will be placed
> to be specified (eg the Java SimpleDateFormat also allows this)

This looks like an interesting advantage at first, since it uses a
moving epoch to compute the century, but it has IMNHO a terrible
disadvantage: it will break unit tests as soon as your code crosses the
boundary of a ten-year period (think of code written last year and
tested in 2011).

> Now I will address some concerns that people have raised.
> The first is essentially aesthetic: this kind of cleverness should
> not be there, and everyone should always use four-digit years. I've
> some sympathy with this, and it would be wonderful if everyone was
> as principled as we are. But, alas, it is not so! My judgment is
> that it's worth a little ugliness to be able to deal with the common
> case of two-digit years.

As a complement to "accepting 2-digit years...", I would say (quoting
Stephen Leake liberally) that we should have a programming environment
that helps us write better programs. So with a little extra code we can
introduce the needed discipline.

> A second objection was that the code was right all along:
> '6-Jan-10' should correctly be parsed as 6th January 1910. This
> seems to me peculiar. I understand a position that says that
> two-digit years should not be accepted at all, but arguing that they
> should be accepted and should be interpreted to be 100 or more years
> ago... well, this seems unlikely to be the most useful approach.

This is sensible because 1900 is a common epoch date for computers, and
over the last century we have accumulated lots of computer records in
which such a two-digit date meant 1910.
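
To make the two kinds of heuristic discussed above concrete, here is a minimal sketch in Python (one of the platforms cited in the thread). The function names `expand_fixed_pivot` and `expand_sliding_window` are hypothetical, invented for this illustration. The sliding window works at whole-year granularity, which is a simplification of what Java's SimpleDateFormat does, and it takes "today" as an explicit argument (an assumption of this sketch) so that the result does not silently depend on the system clock.

```python
from datetime import date, datetime


def expand_fixed_pivot(yy: int) -> int:
    """Fixed pivot, as in the POSIX strptime rule for %y:
    69..99 map to 1969..1999 and 00..68 map to 2000..2068."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    return 1900 + yy if yy >= 69 else 2000 + yy


def expand_sliding_window(yy: int, today: date, years_back: int = 80) -> int:
    """Sliding window (hypothetical helper): place yy in the 100-year
    span that starts `years_back` years before `today`."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    window_start = today.year - years_back
    year = window_start - window_start % 100 + yy
    if year < window_start:
        year += 100
    return year


# Python's own strptime applies the fixed pivot to %y.
print(datetime.strptime("06-01-10", "%d-%m-%y").year)   # 2010
print(expand_fixed_pivot(10))                            # 2010

# The same input flips century when the reference date moves one year.
print(expand_sliding_window(30, date(2010, 1, 17)))      # 1930
print(expand_sliding_window(30, date(2011, 1, 17)))      # 2030
```

The last two calls show the unit-test hazard: a test written in 2010 that expects '30' to mean 1930 would fail in 2011 if the window were anchored to the system clock. Passing the reference date in, or fixing the window explicitly, keeps such a test deterministic, and that decision arguably sits more naturally in application code.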

> A third objection was that this is the thin end of the wedge: if we
> start accepting two-digit years, who knows where the madness will
> end? We will have to deal with all kinds of crazy situation!

Yes, if we come to think about it, it is really serious! Before long one
of us will need to accept the date February 29, 1900 "because of Lotus
1-2-3 compatibility", or perhaps a special case to allow January 0, 1900
"due to Microsoft Excel compatibility"!!

> I think that this objection is just wrong. No one is arguing for
> three-digit dates, or hexadecimal dates, or any other crazy stuff.

In general people do not argue; they introduce these quirks into the
systems together with some nice unit tests, some interesting
explanation, etc. The point is to have these things in the application
code, not in the Core of Pharo.

> Two-digits are commonplace. Other crazy things are not commonplace.
> Let's draw the line where it makes sense to draw the line: to my
> mind handling two-digit years clearly falls on the 'useful' side and
> not the 'crazy' side.
> Finally, I must note the irony of one poster declaring that we
> should only accept well-formed strings, and pointing us at RFC822
> for reference. Years in RFC822 are defined to have two digits. The
> RFC does not say what century they should fall in. (This has been
> obsoleted by RFC2822, which uses four-digit years).

Yes, I was thinking of 2822, but my memory is still too strongly
imprinted with 822 (which the POSIX standard points to when referring to
time zones)...!

--
Cesar Rabak

_______________________________________________
Pharo-project mailing list
[email protected]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
