Dave Rolsky wrote:
>First, I don't want to make these new modules requirements. What I'd 
>prefer to do is just load them if they exist and use them. No one (that I 
>recall) has asked for Posix or binary file support, so making them 
>dependencies seems like overkill.

OK.  But it seems a pity that a default install won't be able to parse
all $TZ settings that are valid for libc.  Actually there have been
two DT::TZ users who wanted their System V timezone strings handled
(see bottom of this message).

>Given that, I don't want to change the handling of these names. This would 
>be a backwards-incompatible change as it is, since in the Olson database a 
>name like EST5EDT uses the US rules, meaning it includes many historical 
>changes.

Fair enough.  Using US rules is a valid interpretation of "EST5EDT"
as a System V string anyway.

>This should probably use File::Spec->file_name_is_absolute() to figure 
>this out. None of the currently valid names are absolute file names on any 
>system (I think).

That would be a bad idea: it would mean that filename syntaxes we don't
know about constrain which system-independent syntaxes we can use in the
future.  In principle that's unworkable.  Even among the syntaxes we do
know, on the Mac any string that doesn't start with ":" is a valid absolute
filename.
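
To illustrate the platform dependence, here is a rough sketch (not part of
the patch) that asks two of the platform-specific File::Spec classes about
the same strings.  The helper name absolute_on() is made up for this
example; File::Spec::Unix and File::Spec::Win32 are the standard classes
and can be loaded on any platform.

```perl
use strict;
use warnings;
use File::Spec::Unix;
use File::Spec::Win32;

# Hypothetical helper: report whether $name counts as an absolute
# filename under Unix rules and under Win32 rules.
sub absolute_on {
    my ($name) = @_;
    return {
        unix  => File::Spec::Unix->file_name_is_absolute($name)  ? 1 : 0,
        win32 => File::Spec::Win32->file_name_is_absolute($name) ? 1 : 0,
    };
}

for my $name ( '/etc/localtime', 'C:/etc/localtime', 'America/Chicago' ) {
    my $r = absolute_on($name);
    printf "%-18s unix=%d win32=%d\n", $name, $r->{unix}, $r->{win32};
}
```

The same string can be absolute under one set of rules and not under the
other, so the meaning of a timezone name would vary with the platform.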

I found a better way: URI::file can translate between filename syntaxes.
We can specify that the $TZ setting is a Unix-style filename, and it gets
translated as appropriate.  On Windows you can do "/c:/etc/localtime"
and the file C:\etc\localtime is used, and "/etc/localtime" translates
to the not-quite-absolute \etc\localtime.

>It seems like a leading colon for other things (like "EST5EDT") would be 
>an error. Of course, removing it isn't a big deal.

I addressed this issue in my survey of $TZ parsing.  glibc and HP-UX
both accept System V strings both with and without leading ":".  glibc,
Olson, and Solaris all accept filenames (including Olson names) both
with and without ":".  Basically, the consensus $TZ syntax is the same
both with and without ":".  Consistently ignoring the leading ":" is
where the trend is heading.
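
In code, ignoring the leading ":" amounts to a single substitution.  A
minimal sketch, using the same substitution as the patch; the wrapper name
strip_tz_colon() is only for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: drop one leading ":" so that ":EST5EDT" and
# "EST5EDT" are handled identically.  Nothing else in the string changes.
sub strip_tz_colon {
    my ($name) = @_;
    $name =~ s/\A://;
    return $name;
}

print strip_tz_colon($_), "\n"
    for ':EST5EDT', 'EST5EDT', ':/etc/localtime';
```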

>1. Make SystemV & Tzfile entirely optional.
>2. Use the Olson versions of existing SystemV names (follows from 1)
>3. Use File::Spec->file_name_is_absolute to determine if a name could be a 
>path to a tz file.

Patch attached does 1 and 2, and does filename translation instead of 3.
I extended 2 to the non-SysV single-part Olson names that my previous
patch implemented as links to SysV strings.

There were also two SysV-style strings that were never in Olson, but which
DT::TZ had as links to Olson zones.  These are "AKST9AKDT" and "JST-9".
They were added to satisfy particular DT::TZ users who were using these
$TZ settings.  As they're not in Olson, I reckon these should be treated
like any ordinary SysV string, so this patch has no grandfathering for them.
Those users will have to install DT::TZ::SystemV.  Grandfathering for
them is easily added if you really want it.

In other respects it's equivalent to the previous patch.  As before
you'll need to rerun tools/parse_olson, to pick up the changes to
TimeZoneCatalog.pm.
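
For reference, the order in which the patched new() tries the name syntaxes
can be sketched as a standalone classifier.  This is a simplified
illustration only: the function name classify_tz_name() and the returned
labels are invented for the example, the alias/link lookup is left out, and
no timezone modules are loaded.  The regexes are the ones in the patch.

```perl
use strict;
use warnings;

my %grandfather_olson = map { $_ => 1 }
    qw( CST6CDT EST EST5EDT HST MST MST7MDT PST8PDT );

# Hypothetical classifier: returns a label naming the branch of new()
# that would handle $name.  Alias/link resolution is omitted here.
sub classify_tz_name {
    my ($name) = @_;

    # "local" is checked before the leading ":" is removed.
    return 'local' if $name eq 'local';

    $name =~ s/\A://;

    return 'utc' if $name eq 'UTC' || $name eq 'Z';

    # Multipart Olson names, plus the grandfathered single-part names.
    return 'olson'
        if $name =~ m#\A[-\w]+(?:/[-\w]+)+\z# || $grandfather_olson{$name};

    # System V strings, including the POSIX <...> quoted abbreviation form.
    return 'systemv'
        if $name =~ /\A(?:[A-Za-z]{3,}|\<[-+0-9A-Za-z]{3,}\>)[-+]?\d/;

    # Unix-style absolute filename of a tzfile.
    return 'tzfile' if $name =~ m#\A/#;

    # Fixed offsets such as "+0100" or "-05:00".
    return 'offset'
        if $name =~ /\A[-+]?(?:\d\d?:\d\d(?::\d\d)?|\d{4}(?:\d\d)?)\z/;

    return 'floating' if $name eq 'floating';

    return 'unknown';
}

printf "%-18s %s\n", $_, classify_tz_name($_)
    for qw( local :local America/Chicago EST5EDT JST-9 /etc/localtime +0100 );
```

Because the grandfathered names are tested before the System V pattern,
"EST5EDT" stays an Olson name while "JST-9" falls through to the System V
branch.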

-zefram
diff -ur dttz-0.6101/lib/DateTime/TimeZone/Local.pm dttz-mod1/lib/DateTime/TimeZone/Local.pm
--- dttz-0.6101/lib/DateTime/TimeZone/Local.pm  2007-02-18 15:54:12.000000000 +0000
+++ dttz-mod1/lib/DateTime/TimeZone/Local.pm    2007-02-19 19:30:30.018396544 +0000
@@ -92,7 +92,7 @@
     return 0 unless defined $_[0];
     return 0 if $_[0] eq 'local';
 
-    return $_[0] =~ m{^[\w/\-\+]+$};
+    return 1;
 }
 
 
diff -ur dttz-0.6101/lib/DateTime/TimeZone/OlsonDB.pm dttz-mod1/lib/DateTime/TimeZone/OlsonDB.pm
--- dttz-0.6101/lib/DateTime/TimeZone/OlsonDB.pm        2007-02-18 15:54:11.000000000 +0000
+++ dttz-mod1/lib/DateTime/TimeZone/OlsonDB.pm  2007-02-19 00:02:04.000000000 +0000
@@ -111,8 +111,6 @@
         $name = shift @items;
     }
 
-    return if $name =~ /[WCME]ET/ && ! $self->{backwards_compat};
-
     @obs{ qw( gmtoff rules format until ) } = @items;
 
     if ( $obs{rules} =~ /\d\d?:\d\d/ )
diff -ur dttz-0.6101/lib/DateTime/TimeZone.pm dttz-mod1/lib/DateTime/TimeZone.pm
--- dttz-0.6101/lib/DateTime/TimeZone.pm        2007-02-18 15:54:12.000000000 +0000
+++ dttz-mod1/lib/DateTime/TimeZone.pm  2007-02-19 20:19:13.383139216 +0000
@@ -6,10 +6,6 @@
 $VERSION = '0.6101';
 
 use DateTime::TimeZoneCatalog;
-use DateTime::TimeZone::Floating;
-use DateTime::TimeZone::Local;
-use DateTime::TimeZone::OffsetOnly;
-use DateTime::TimeZone::UTC;
 use Params::Validate qw( validate validate_pos SCALAR ARRAYREF BOOLEAN );
 
 use constant INFINITY     =>       100 ** 1000 ;
@@ -24,7 +20,8 @@
 use constant IS_DST      => 5;
 use constant SHORT_NAME  => 6;
 
-my %SpecialName = map { $_ => 1 } qw( EST MST HST EST5EDT CST6CDT MST7MDT PST8PDT );
+my %grandfather_olson = map { $_ => 1 }
+       qw( CST6CDT EST EST5EDT HST MST MST7MDT PST8PDT );
 
 sub new
 {
@@ -33,66 +30,85 @@
                       { name => { type => SCALAR } },
                     );
 
-    if ( exists $DateTime::TimeZone::LINKS{ $p{name} } )
+    if ( $p{name} eq 'local' )
     {
-        $p{name} = $DateTime::TimeZone::LINKS{ $p{name} };
+        require DateTime::TimeZone::Local;
+        return DateTime::TimeZone::Local->TimeZone;
     }
-    elsif ( exists $DateTime::TimeZone::LINKS{ uc $p{name} } )
+
+    my $name = $p{name};
+    $name =~ s/\A://;
+
+    if ( exists $DateTime::TimeZone::LINKS{ $name } )
     {
-        $p{name} = $DateTime::TimeZone::LINKS{ uc $p{name} };
+        $name = $DateTime::TimeZone::LINKS{ $name };
     }
-
-    unless ( $p{name} =~ m,/,
-             || $SpecialName{ $p{name} }
-           )
+    elsif ( exists $DateTime::TimeZone::LINKS{ uc $name } )
     {
-        if ( $p{name} eq 'floating' )
-        {
-            return DateTime::TimeZone::Floating->new;
-        }
-
-        if ( $p{name} eq 'local' )
-        {
-            return DateTime::TimeZone::Local->TimeZone();
-        }
-
-        if ( $p{name} eq 'UTC' || $p{name} eq 'Z' )
-        {
-            return DateTime::TimeZone::UTC->new;
-        }
-
-        return DateTime::TimeZone::OffsetOnly->new( offset => $p{name} );
+        $name = $DateTime::TimeZone::LINKS{ uc $name };
     }
 
-    my $subclass = $p{name};
-    $subclass =~ s/-/_/g;
-    $subclass =~ s{/}{::}g;
-    my $real_class = "DateTime::TimeZone::$subclass";
-
-    die "The timezone '$p{name}' in an invalid name.\n"
-        unless $real_class =~ /^\w+(::\w+)*$/;
-
-    unless ( $real_class->can('instance') )
+    if ( $name eq 'UTC' || $name eq 'Z' )
     {
-        eval "require $real_class";
+        require DateTime::TimeZone::UTC;
+        return DateTime::TimeZone::UTC->new;
+    }
+    elsif ( $name =~ m#\A[-\w]+(?:/[-\w]+)+\z# || $grandfather_olson{$name} )
+    {
+        my $subclass = $name;
+        $subclass =~ s/-/_/g;
+        $subclass =~ s{/}{::}g;
+        my $real_class = "DateTime::TimeZone::$subclass";
 
-        if ($@)
+        unless ( $real_class->can('instance') )
         {
-            my $regex = join '.', split /::/, $real_class;
-            $regex .= '\\.pm';
+            eval "require $real_class";
 
-            if ( $@ =~ /^Can't locate $regex/i )
+            if ($@)
             {
-                die "The timezone '$p{name}' could not be loaded, or is an invalid name.\n";
-            }
-            else
-            {
-                die $@;
+                my $regex = join '.', split /::/, $real_class;
+                $regex .= '\\.pm';
+
+                if ( $@ =~ /^Can't locate $regex/i )
+                {
+                    die "The timezone '$name' could not be loaded, or is an invalid name.\n";
+                }
+                else
+                {
+                    die $@;
+                }
             }
         }
+        return $real_class->instance( name => $name, is_olson => 1 );
+    }
+    elsif ( $name =~ /\A(?:[A-Za-z]{3,}|\<[-+0-9A-Za-z]{3,}\>)[-+]?\d/ )
+    {
+        require DateTime::TimeZone::SystemV;
+        return DateTime::TimeZone::SystemV->new( $name );
+    }
+    elsif ( $name =~ m#\A/# )
+    {
+        require URI::file;
+        my $filename = URI::file->new($name, "Unix")->file;
+        die "The timezone name '$name' couldn't be translated to a filename.\n"
+                unless defined $filename;
+        require DateTime::TimeZone::Tzfile;
+        return DateTime::TimeZone::Tzfile->new( $filename );
+    }
+    elsif ( $name =~ /\A[-+]?(?:\d\d?:\d\d(?::\d\d)?|\d{4}(?:\d\d)?)\z/ )
+    {
+        require DateTime::TimeZone::OffsetOnly;
+        return DateTime::TimeZone::OffsetOnly->new( offset => $name );
+    }
+    elsif ( $name eq 'floating' )
+    {
+        require DateTime::TimeZone::Floating;
+        return DateTime::TimeZone::Floating->new;
+    }
+    else
+    {
+        die "The timezone name '$name' was not understood.\n";
     }
-
-    return $real_class->instance( name => $p{name}, is_olson => 1 );
 }
 
 sub _init
@@ -521,29 +537,83 @@
 
 =head2 DateTime::TimeZone->new( name => $tz_name )
 
-Given a valid time zone name, this method returns a new time zone
-blessed into the appropriate subclass.  Subclasses are named for the
-given time zone, so that the time zone "America/Chicago" is the
+Parses the given timezone name, and returns a timezone object representing
+the appropriate zone.  The following types of name are understood:
+
+=over
+
+=item *
+
+If the name parameter is "local", then the module attempts to determine
+the local time zone for the system.  This is described in more detail
+below.
+
+=item *
+
+If the name begins with a ":" character, that character is dropped and the
+rest of the name is parsed without it.  This is for compatibility with
+parsing of the B<TZ> environment variable on many systems.  The special
+meaning of "local" does not apply to ":local".
+
+=item *
+
+If the name is an alias established by L<DateTime::TimeZone::Alias>,
+or a "link" in the Olson database, or one of a small number of built-in
+aliases, then the target of the alias is parsed instead.  An alias name
+that is all uppercase will be recognised in any case, but any other alias
+name must be given exactly.  The target of an alias is not itself subject
+to ":" removal or alias interpretation, and "local" is not a valid target.
+
+=item *
+
+If the name is "UTC" or "Z", then a C<DateTime::TimeZone::UTC>
+object is returned.
+
+=item *
+
+If the name looks like a multipart name in the Olson database, or is
+one of a small number of grandfathered single-part names, an appropriate
+subclass is used.  For example, the time zone "America/Chicago" is the
 DateTime::TimeZone::America::Chicago class.
 
-If the name given is a "link" name in the Olson database, the object
-created may have a different name.  For example, there is a link from
-the old "EST5EDT" name to "America/New_York".
+=item *
+
+If the name looks like a System V style timezone string,
+including one of the POSIX extended form, it is parsed as such
+by C<DateTime::TimeZone::SystemV>.  Exception: the System V style
+timezone strings "CST6CDT", "EST5EDT", "MST7MDT", and "PST8PDT" (none
+of which specify their DST rules) are treated as Olson database names,
+for historical reasons.  Note: the System V timezone facility is only
+available if the module C<DateTime::TimeZone::SystemV> is installed;
+a default installation of C<DateTime::TimeZone> does not include that
+module.
+
+=item *
+
+If the name begins with a "/" character, it is interpreted as a
+Unix-style absolute filename.  It is translated into the local filename
+syntax, and used as the name of a L<tzfile(5)> file.  The file is
+parsed by C<DateTime::TimeZone::Tzfile>.  Note: this facility is only
+available if the modules C<URI::file> (for the filename translation) and
+C<DateTime::TimeZone::Tzfile> (for the parsing) are installed; a default
+installation of C<DateTime::TimeZone> does not include those modules.
 
-There are also several special values that can be given as names.
+=item *
 
-If the "name" parameter is "floating", then a
+If the name looks like a fixed offset in hours and minutes
+(and optional seconds), it is converted to a number, and a
+C<DateTime::TimeZone::OffsetOnly> object is returned.
+
+=item *
+
+If the name is "floating", then a
 C<DateTime::TimeZone::Floating> object is returned.  A floating time
 zone does have I<any> offset, and is always the same time.  This is
 useful for calendaring applications, which may need to specify that a
 given event happens at the same I<local> time, regardless of where it
 occurs.  See RFC 2445 for more details.
 
-If the "name" parameter is "UTC", then a C<DateTime::TimeZone::UTC>
-object is returned.
-
-If the "name" is an offset string, it is converted to a number, and a
-C<DateTime::TimeZone::OffsetOnly> object is returned.
+=back
 
 =head3 The "local" time zone
 
diff -ur dttz-0.6101/tools/parse_olson dttz-mod1/tools/parse_olson
--- dttz-0.6101/tools/parse_olson       2007-02-18 15:54:12.000000000 +0000
+++ dttz-mod1/tools/parse_olson 2007-02-19 21:20:36.000420408 +0000
@@ -20,6 +20,9 @@
 
 my $INFINITY  = 100 ** 100 ** 100;
 
+my %grandfather_olson_zones = map { $_ => 1 }
+       qw( CST6CDT EST EST5EDT HST MST MST7MDT PST8PDT );
+
 my %opts;
 GetOptions( 'dir:s'     => \$opts{dir},
             'clean'     => \$opts{clean},
@@ -151,6 +154,7 @@
 
     foreach my $zone_name ( sort $odb->zone_names )
     {
+        next unless $zone_name =~ m{/} || $grandfather_olson_zones{$zone_name};
         if ( $opts{name} )
         {
             next unless $zone_name eq $opts{name};
@@ -377,23 +381,6 @@
 
 sub clean_links
 {
-    # override some links and add others
-    %links =
-        ( %links,
-          'Etc/GMT'       => 'UTC',
-          'Etc/GMT+0'     => 'UTC',
-          'Etc/Universal' => 'UTC',
-          'Etc/UCT'       => 'UTC',
-          'Etc/UTC'       => 'UTC',
-          'Etc/Zulu'      => 'UTC',
-          'GMT0'          => 'UTC',
-          'GMT'           => 'UTC',
-          'AKST9AKDT'     => 'America/Anchorage',
-          'JST-9'         => 'Asia/Tokyo',
-        );
-
-    delete $links{UTC};
-
     # Some links resolve to other links - chase them down until they point
     # to a real zone.
     while ( my @k = grep { $links{ $links{$_} } } keys %links )
@@ -403,6 +390,24 @@
             $links{$k} = $links{ $links{$k} };
         }
     }
+
+    # Anything linking to Etc/UTC, Etc/UCT, or Etc/GMT we want to link to
+    # our UTC.
+    foreach ( keys %links )
+    {
+        $links{$_} = 'UTC' if $links{$_} =~ m#\AEtc/(?:U(?:TC|CT)|GMT)\z#;
+    }
+
+    # The UTC-equivalent zones themselves need to be links to our UTC.
+    # We also support some other Etc/ links, even though we don't provide
+    # the full set of Etc/ names.
+    foreach (qw(Etc/UTC Etc/UCT Etc/GMT Etc/GMT+0 Etc/Universal Etc/Zulu GMT))
+    {
+        $links{$_} = 'UTC'
+    }
+
+    # UTC itself must not be a link: for us it's the canonical name of a zone.
+    delete $links{UTC};
 }
 
 sub make_catalog_pm
