On Sunday, July 13, 2003, at 08:11 PM, Iain Truskett wrote:
Remember: part of the point of having the various format
modules is that you can pick'n'mix. You could conceivably
wrap a number of them in Builder to make your own parser
that recognises the sorts of dates you come across. I mostly
come across HTTP, W3CDTF and RFC2822 dates so I'm all set =)

Sure, but my main point is that I don't want to have to manually load and use parsers. I want DateTime (the class and/or individual objects) to have a catch-all parse() method that uses the parser class of my choosing. And I want the default to be something that can handle most "normal" date formats, ignoring as much complexity as it takes to get into the "core" (even though DateTime::Format::Simple would still be its own module, and might not even be loaded until the first call to parse()).


However, I'm interested in seeing your regexen to see if you
parse anything that might be of use to me. Would you like to
share?

They're super boring. Heck, these two cover almost everything I'm interested in:


# yyyy mm dd [hh:mm[:ss[.nnn]]] [am/pm]
($year, $month, $mday, $hours, $mins, $secs, $fsecs, $ampm) =
/^(\d{4})\s*-?\s*(\d{2})\s*-?\s*(\d{2})\s*(?:-?\s*(\d{1,2}):?(\d{2})(?::?(\d{2}))?)?(?:\.(\d+))?(?:\s*([aApP]\.?[mM]\.?))?$/


# mm/dd/yyyy, mm-dd-yyyy, [hh:mm[:ss[.nnn]]] [am/pm]
($month, $mday, $year, $hours, $mins, $secs, $fsecs, $ampm) =
m#^(\d{1,2})[-/](\d{1,2})[-/ ](\d{4})(?:\s+(\d{1,2}):(\d{2})(?::(\d{2}))?)?(?:\.(\d+))?(?:\s*([aApP]\.?[mM]\.?))?$#


(Post-processing to handle am/pm and fsecs correctly omitted)
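
The omitted post-processing might look roughly like the following. This is only a sketch: normalize_time() is a hypothetical helper name, and it assumes that an am/pm marker is only valid with an hour from 1 to 12 and that fractional seconds should end up as nanoseconds.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any DateTime module): turns the time
# captures from the regexes above into 24-hour values plus nanoseconds.
# Returns an empty list if the am/pm marker does not fit the hour.
sub normalize_time
{
  my($hours, $mins, $secs, $fsecs, $ampm) = @_;

  # Optional captures that did not match default to zero
  $hours = 0  unless(defined $hours);
  $mins  = 0  unless(defined $mins);
  $secs  = 0  unless(defined $secs);

  if(defined $ampm)
  {
    # Assumption: am/pm only makes sense with a 12-hour clock value
    return  unless($hours >= 1 && $hours <= 12);

    my $is_pm = ($ampm =~ /^p/i);
    $hours = 0    if(!$is_pm && $hours == 12);  # 12 a.m. is midnight
    $hours += 12  if($is_pm && $hours != 12);   # 12 p.m. stays noon
  }

  # Fractional digits to nanoseconds: pad or truncate to nine digits
  my $nanos = 0;
  $nanos = substr($fsecs . '0' x 9, 0, 9) + 0  if(defined $fsecs);

  return ($hours + 0, $mins + 0, $secs + 0, $nanos);
}
```

Called with the captures from '2002-03-05 1:02 a.m.', this returns hour 1, minute 2, second 0 and no fractional part.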

To this I just add "now", "today" (meaning 00:00:00), +/-infinity, and a no-op for things that are already DateTime objects. This, believe it or not, covers almost everything I expect a user to enter in a form (ignoring excess whitespace stripping and such, which happens earlier), as well as everything I expect to read from a file or whatever.
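
The special values could be handled with a small lookup table tried before the regexes. A rough sketch, assuming the keywords are matched case-insensitively and spelled exactly "now", "today", "+infinity" and "-infinity" (parse_special() and %SPECIAL are made-up names; DateTime is only loaded once a keyword actually needs it):

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical table mapping special keywords to DateTime constructors
my %SPECIAL =
(
  'now'       => sub { DateTime->now },
  'today'     => sub { DateTime->today },   # current date at 00:00:00
  '+infinity' => sub { DateTime::Infinite::Future->new },
  '-infinity' => sub { DateTime::Infinite::Past->new },
);

# Hypothetical helper: returns a DateTime for a special value, the
# argument itself if it is already a DateTime object, or undef so the
# caller can fall through to the regex-based parsing.
sub parse_special
{
  my($value) = @_;

  return undef  unless(defined $value);

  # No-op for things that are already DateTime objects
  return $value  if(blessed($value) && $value->isa('DateTime'));

  my $maker = $SPECIAL{lc($value)} or return undef;

  # Load lazily, on the first keyword that needs a real object
  require DateTime;
  require DateTime::Infinite;

  return $maker->();
}
```

Anything that is not a keyword or an object comes back as undef, so an ordinary string like '5/29/1945' still goes on to the regexes.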

Really, I'm not asking for the moon. The key features are the "built-in/used by default" nature and the "I can handle whatever" parse() method (instead of parse_date(), parse_datetime(), parse_year_and_day_but_not_month(), etc.). Since this is all so simple, I think it should be "built in" via the creation and default use of a DateTime::Format::Simple module and a generalized parse() class/object method for DateTime.

For kicks, I just threw this into DateTime.pm:

use vars qw($VERSION $PARSER);
...
use DateTime::Format::Simple(); # pre-loading in this case
use constant DEFAULT_PARSER_CLASS => 'DateTime::Format::Simple';

...

sub parser
{
  my $self = shift;

  if(ref $self)
  {
    return $self->{'parser'} = shift  if(@_);
    return $self->{'parser'} ||= $PARSER ||= DEFAULT_PARSER_CLASS;
  }

  return $PARSER = shift  if(@_);
  return $PARSER ||= DEFAULT_PARSER_CLASS;
}

sub parse
{
  my $dt;
  eval { $dt = $_[0]->parser->parse_datetime($_[1]) };
  return undef  if($@);
  return $_[0] = $dt  if(ref $_[0]);
  return $dt;
}

and then spent 5 minutes writing a bare-bones implementation of DateTime::Format::Simple using the regexes above. And behold, simple things are easy:

$dt = DateTime->parse('5/29/1945');

$dt->parse('2002-03-05 1:02 a.m.');

and hard things are possible:

DateTime->parser('DateTime::Format::ReallyComplex');

$dt = DateTime->parse('the 20th of september, nineteen seventy-five, midnight');
$dt->parse('the day after tomorrow');
$dt->parse('my birthday'); # ;)
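
The fallback order in parser() (per-object setting, then class-wide setting, then the built-in default) can be tried out in isolation. In the sketch below the My::* packages are hypothetical stand-ins, not real modules; only the parser() body is copied from the code above.

```perl
use strict;
use warnings;

# Hypothetical stand-in parsers; each just tags the string it is given
package My::Parser::Simple;
sub parse_datetime { return "simple:$_[1]" }

package My::Parser::Complex;
sub parse_datetime { return "complex:$_[1]" }

# Hypothetical stand-in for DateTime, carrying only the parser() accessor
package My::Date;

our $PARSER;
use constant DEFAULT_PARSER_CLASS => 'My::Parser::Simple';

sub new { return bless {}, shift }

sub parser
{
  my $self = shift;

  if(ref $self)
  {
    return $self->{'parser'} = shift  if(@_);
    return $self->{'parser'} ||= $PARSER ||= DEFAULT_PARSER_CLASS;
  }

  return $PARSER = shift  if(@_);
  return $PARSER ||= DEFAULT_PARSER_CLASS;
}

package main;

my $early = My::Date->new;
$early->parser;   # first call stores the current default in the object

My::Date->parser('My::Parser::Complex');   # change the class-wide default
my $late = My::Date->new;

print $early->parser, "\n";   # My::Parser::Simple
print $late->parser, "\n";    # My::Parser::Complex
print $late->parser->parse_datetime('tomorrow'), "\n";   # complex:tomorrow
```

One consequence of the ||= in the object branch: an object that has already asked for its parser keeps that answer, even if the class-wide default changes afterwards.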


-John



