Very rough DateTime::Format::Simple

Ben Bennett Fri, 18 Jul 2003 20:16:59 -0700

Sorry this is such a dense post, but this module spawned a lot of
discussion, and deciding what counts as a simple format turned out to
be not so simple.  Anyway, if you want to play with this, make sure you
look at the note about regenerating the DT::Locale data.


If people are okay with the general direction of the code then I will
commit it to CVS (I would like to get some consensus on the name, but
will start a separate thread for that).


Available from:
 http://www.limey.net/~fiji/perl/DateTime-Format-Simple-0.01.tar.gz
 http://www.limey.net/~fiji/perl/generate_from_icu (see Notes)

Notes:
 - I use the locale to determine the meaning of \d\d-\d\d-\d\d.  It
   can be ymd, dmy, or mdy.  In order to make this work the DT::Locale
   tools/generate_icu needed a small change.  Get the script above and
   re-run the generator.  If this change meets with approval I will
   commit it.
 - If the length of the year is <= 2 then I will use the base_year
   argument (defaults to the current year) to work out the appropriate
   century:
     my $base_century = int( $base_year / 100 ) * 100;
     $year += $base_century;
     $year -= 100 if $year - $base_year > 50;
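
To make the century rule concrete, here is a minimal sketch that wraps
the three lines above in a sub.  The name expand_short_year is
hypothetical and is not part of the posted module; the default for
base_year follows the note above.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the posted module): expand a 1 or 2 digit
# year using the base_year rule quoted in the note above.
sub expand_short_year {
    my ( $year, $base_year ) = @_;

    # base_year defaults to the current year, as described in the note.
    $base_year = (localtime)[5] + 1900 unless defined $base_year;

    my $base_century = int( $base_year / 100 ) * 100;
    $year += $base_century;

    # More than 50 years ahead of base_year: assume the previous century.
    $year -= 100 if $year - $base_year > 50;

    return $year;
}

# With base_year 2003: 03 -> 2003, 53 -> 2053, 54 -> 1954, 99 -> 1999
printf "%02d -> %d\n", $_, expand_short_year( $_, 2003 ) for 3, 53, 54, 99;
```

Note that the rule as written only ever shifts a year backwards: with
a base_year of 1999, "02" comes out as 1902 rather than 2002.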

Major omissions:
 - No POD
 - AM/PM & BC/AD are not localized
 - BC/AD is not supported at all yet (neither are negative years)
 - Only tests English and French(?) parsing
 - Needs many more tests and in more languages (I actually need fluent
   speakers to tell me if the formats it parses are reasonable)

Interface:
 - Only DT::F::Simple->parse_datetime( ... ) exists at the moment; I
   will add a new() and the ability to call it through the returned
   object
 - The ... can either be a single argument giving the string to parse
   OR
   name => value pairs:
     string: The string to parse
     locale: The locale to parse in (defaults to root, which is en_US)
     time_zone: The default TZ of the returned DT object (if no TZ is
          specified in the string), defaults to 'floating'
     base_year: The year to use if the string does not specify one,
          defaults to the current year, also gives the point to use
          when inferring a complete year from a 2 digit one...
     debug: If true then it will print lots of info, defaults to false
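
The two calling styles and their defaults could be handled along these
lines.  This is only a sketch: normalize_args is a hypothetical name,
it is written as a plain function rather than a class method, and the
real module may well do this differently.

```perl
use strict;
use warnings;

# Hypothetical argument handling (not the module's actual code).
# Accepts either a single string or name => value pairs and fills in
# the defaults listed above.
sub normalize_args {
    my @args = @_;

    # A single argument is taken to be the string to parse.
    my %given = @args == 1 ? ( string => $args[0] ) : @args;

    my %defaults = (
        locale    => 'en_US',                  # root locale per the list above
        time_zone => 'floating',
        base_year => (localtime)[5] + 1900,    # current year
        debug     => 0,
    );

    # Explicit values override the defaults.
    return ( %defaults, %given );
}

# Single-argument form
my %single = normalize_args('18 Jul 2003 20:16:59 -0700');

# name => value form
my %named = normalize_args(
    string    => '18/07/03',
    locale    => 'fr_FR',
    base_year => 2003,
);

print "$_ => $named{$_}\n" for sort keys %named;
```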
    
Formats it should be able to parse:
 - ISO8601 date (only in the format YYYY-MM-DDTHH:MM:SS.FFFF).  The T
   separator is optional and may be replaced by 0+ spaces.  The time
   may be omitted from the rightmost part to the left (all of the
   date must be given).  The separators inside the date and time
   (i.e. the -s and :s) are also optional.
 - HH:MM:SS.FFFFFFF AM/PM is parsed; again the highest precision parts
   may be omitted.  AM/PM is optional; if it is not present, 24 hour
   time is assumed.  You may specify HH AM/PM, but in this case the
   AM/PM is required.
 - Dates of the form Y+/M+/D+ with -, . or / as the separator.  Also
   accepted are M/D/Y or D/M/Y depending on the locale. (Y/M/D is
   assumed if the first number is longer than 2 digits or the locale
   explicitly calls for YMD)
 - DD-MonthName-Y+, where Month name is the locale appropriate string
 - DD MonthName or MonthName DD with a year somewhere else in the
   string
 - If there is a locale appropriate day name somewhere in the string
   it is used to validate the parsed date.
 - Time zones are supported as offsets from GMT or UTC, e.g. GMT+5:00
   or +5:00.  Only offsets from GMT or UTC are supported.  The : is
   optional and you may omit the minutes.  You may also provide
   seconds, but they must always be 00.
 - Named time zones are fine, it will check against the current set of
   DT::TZ names (at runtime, so aliases are honored)
 - You may use both offsets and parenthesized named TZs, so 
   '-0600 (CST)' will work (assuming an alias for CST).
 - It will ignore accents in languages when parsing the strings.
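
As an illustration of the offset rules above, here is a rough sketch
of a matcher for the offset part alone.  The sub name parse_offset and
the choice to return seconds east of UTC are my assumptions for the
example, not the module's code, and it does not handle a trailing
parenthesized name such as '(CST)'.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an offset string into seconds east of UTC,
# or return undef if the string does not follow the rules listed above.
sub parse_offset {
    my ($text) = @_;

    return undef unless $text =~ m{
        ^ (?: GMT | UTC )? \s*       # optional GMT or UTC prefix
        ( [+-] )                     # sign
        ( \d{1,2} )                  # hours
        (?: :? ( \d{2} )             # optional minutes, colon optional
            (?: :? ( \d{2} ) )?      # optional seconds
        )?
        $
    }xi;

    my ( $sign, $hours ) = ( $1, $2 );
    my $minutes = defined $3 ? $3 : 0;
    my $seconds = defined $4 ? $4 : 0;

    return undef if $seconds != 0;   # seconds, if given, must be 00
    return undef if $minutes > 59;

    my $total = $hours * 3600 + $minutes * 60;
    return $sign eq '-' ? -$total : $total;
}

for my $tz ( 'GMT+5:00', '+5:00', '-0600', 'UTC+5', '+05:30:15' ) {
    my $secs = parse_offset($tz);
    printf "%-10s %s\n", $tz, defined $secs ? $secs : 'rejected';
}
```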

Please help:
 - I am desperate for speakers of other languages to provide me with
   good test strings!
 - If you want to mail me with more English test strings that would be
   great
 - Suggestions of additional formats to parse would be greatly
   appreciated


     Thanks for bearing with me!

                -ben
