Hi
I help develop an email website (http://www.fastmail.fm) and recently we
wanted to move over to providing proper timezone support for users (e.g. give
us a location, and we'll keep the time up to date, rather than having to
change for daylight savings every 6 months). As part of this, we were going
to use the DateTime::TimeZone modules, but that has turned out to be a bit
of a problem.
Basically we're using a mod_perl environment, and we have lots of users all
around the world. Because we end up wanting to use lots of different
timezones, and often a different one for every web request, it's generally a
good idea to pre-load the modules in the parent process so that all the data
is shared by each child process. Additionally, it's often even worthwhile
pre-instantiating each timezone in the parent, and storing it in a hash for
later retrieval, rather than constructing a TimeZone object during each
request.
So my initial code, run in the startup.pl phase of our mod_perl Apache
server, was:
use DateTime::TimeZone;
$::TZ{$_} = DateTime::TimeZone->new(name => $_)
for (DateTime::TimeZone::all_names());
So this will force loading of all timezones at startup, which will then be
shared amongst all children.
Then in the code you can get a timezone object with just:
my $TZ = $::TZ{$ZoneName};
The problem was I just didn't realise HOW much timezone data needs to be
loaded...
./perlbloat.pl 'use DateTime::TimeZone; $TZ{$_} =
DateTime::TimeZone->new(name => $_) for (DateTime::TimeZone::all_names)'
use DateTime::TimeZone; $TZ{$_} = DateTime::TimeZone->new(name => $_) for
(DateTime::TimeZone::all_names) added 12.7M
So just loading all those timezone classes takes 12.7M of RAM. That
increases our process size by almost 50% over its current size. Now this
all gets shared in the children, but it's still an issue on some of our
development machines which have less RAM (they're Linux in VMware), and on
my 1GHz PIII laptop it takes almost 4 seconds just to load this:
[robm test]$ time perl -e 'use DateTime::TimeZone; $TZ{$_} =
DateTime::TimeZone->new(name => $_) for (DateTime::TimeZone::all_names)'
real 0m3.821s
user 0m2.970s
sys 0m0.500s
Having a look at the code, I noticed that each timezone has its own class,
and also a lot of data in Perl structures. I'm not really sure why the
timezone classes were developed this way. It seems fine for a simple case
where you only need a couple of timezones, but in a case where you could
be using ANY timezone in the same script, it seems a HUGE overhead
in memory and time to have to load all those structures into memory.
In the end, we went with the POSIX timezone-related calls.
Although they're pretty hacky, they give us what we want (a seconds offset
from GMT for a given timezone name at a particular time) in a simple, quick
interface without needing 13M of overhead!
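For anyone curious, the POSIX approach can be sketched roughly as follows.
This is a minimal illustration rather than our production code: the helper
name tz_offset_seconds is made up for this example, and it assumes the
system C library can resolve the zone names you pass in.

```perl
use strict;
use warnings;
use POSIX qw(tzset);
use Time::Local qw(timegm);

# Hypothetical helper (not part of any module): returns the offset from GMT,
# in seconds, for zone $zone at epoch time $when. Negative means west of GMT.
sub tz_offset_seconds {
    my ($zone, $when) = @_;
    my $offset;
    {
        # Temporarily switch the process timezone; 'local' restores the
        # previous value of $ENV{TZ} when this block exits.
        local $ENV{TZ} = $zone;
        tzset();

        # Break the epoch time down into wall-clock fields for that zone.
        my @lt = (localtime($when))[0 .. 5];
        $lt[5] += 1900;    # pass a four-digit year to timegm

        # Reinterpret the wall-clock fields as if they were GMT; the
        # difference from the real epoch time is the zone's offset.
        $offset = timegm(@lt) - $when;
    }
    tzset();    # make the C library re-read the restored TZ setting
    return $offset;
}

# Example call: the offset for UTC itself is always 0.
print tz_offset_seconds('UTC', time()), "\n";
```

Note that TZ is a process-wide setting, so a sketch like this only makes
sense where each process handles one request at a time (as with prefork
mod_perl); that is part of why I call this approach hacky.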
I find this a bit of a pity, because we'd really hoped to move more and more
towards using the DateTime modules for all time-related work, since it's
a really nice looking library. However, for now we've found the TimeZone
classes impractical in a persistent Perl environment (e.g. mod_perl,
Net::Server daemons, etc).
We're fine using our POSIX solution now, but I thought you folks might be
interested in this feedback - we chatted about it with Rick Measham at Perl
Mongers yesterday and he asked if we could provide this summary. It's
probably a good idea to keep web app developers in mind as you develop the
DateTime namespace, since it's a place where a lot of date/time calculations
in Perl are required.
Regards,
Rob