Re: DateTime performance

2012-05-06 Thread Philipp K. Janert
On Saturday 05 May 2012 17:14:52 N Heinrichs wrote:
 ++ to using precomposed TZ object (as you observed, supplying only the
 name as a string still results in lengthy DT:TZ object creation
 overhead.)
 
 If you use them, I would also precompose any necessary Formatter or
 Locale objects.
 

Thanks for the hint, but it does not apply to my
situation: most of the time, I am creating DateTime
objects from Unix epoch seconds; very occasionally
from individual pieces (year, month, etc). So, stuff
is already broken down quite nicely.

I'll check out the NO_VALIDATION option.

 
 Depending on how you're parsing the date strings and composing the DT
 objects, `local $Params::Validate::NO_VALIDATION = 1;` can speed
 things up for you.
 
 If you're using a DateTime::Formatter class to parse strings into DT
 objects for you, you might investigate using your own regex to chop up
 the string and call DT->new directly.
 
 This code (including comments) is from early 2011 and I unfortunately
 do not have benchmark data handy:
 
 # NOTE: This method is faster than using DateTime::Formatter::MySQL
 # NOTE2: It's also faster than `split m#[ /:T-]#`
 $timestamp =~ m!^
 (?:\s+)?(\d{4,4})[/-](\d{1,2})[/-](\d{1,2}) # Required date portion
 (?:[T\s](\d{1,2}):(\d{1,2}):(\d{1,2}))? # Optional time portion
 (?:\s?([\w/\+:]+))? # Optional timezone
 $!x;
 my ($y, $m, $d, $hr, $min, $sec, $tz) = ($1, $2, $3, $4, $5, $6, $7);
 
 On 4 May 2012 13:20, Philipp K. Janert jan...@ieee.org wrote:
  On Thursday 03 May 2012 02:14:45 you wrote:
   From: Philipp K. Janert [mailto:jan...@ieee.org]
   Sent: Wednesday, 2 May 2012 8:29 AM
   
   Question:
   
   When using DateTime for a large number of
   instances, it becomes a serious performance
   drag.
  
  ...
  
   Is this expected behavior? And are there access
   patterns that I can use to mitigate this effect?
   (I tried to supply a time_zone explicitly, but that
   does not seem to improve things significantly.)
  
  Hi Phillip,
  
  My #1 tip is to pre-prepare/cache the DateTime::TimeZone object and pass
  it in to each creation of a DateTime object (via whatever mechanism
  you're using to do that). I have seen a case where we were using
  time_zone => 'local' in a reasonably tight datetime object creation
  loop and saw significant speed increases just by cutting out that chunk
  of processing.
  
  In hindsight that was a silly thing to do but it became an easy win :-)
  
  I apologise if this is what you meant by supplying a time_zone
  explicitly in your comment above.
  
  I have tried to specify the timezone explicitly as a string:
   $dt = DateTime->new( ..., time_zone => 'America/Chicago' )
  which does not seem to help, but I have not tried to do:
   $tz = DateTime::TimeZone->new( name => 'America/Chicago' )
   $dt = DateTime->new( ..., time_zone => $tz )
  
  I'll try that the next time I have to process one of my data
  sets again. ;-)
  
  Thanks for the hint.
  
  I can't recommend using a tool like NYTProf highly enough on a run of
  your tool to spot the low hanging fruit. See
  https://metacpan.org/module/Devel::NYTProf
  
  Cheers,
  
  Andrew


Re: DateTime performance

2012-05-05 Thread N Heinrichs
++ to using precomposed TZ object (as you observed, supplying only the
name as a string still results in lengthy DT:TZ object creation
overhead.)

If you use them, I would also precompose any necessary Formatter or
Locale objects.


Depending on how you're parsing the date strings and composing the DT
objects, `local $Params::Validate::NO_VALIDATION = 1;` can speed
things up for you.

If you're using a DateTime::Formatter class to parse strings into DT
objects for you, you might investigate using your own regex to chop up
the string and call DT->new directly.

This code (including comments) is from early 2011 and I unfortunately
do not have benchmark data handy:

# NOTE: This method is faster than using DateTime::Formatter::MySQL
# NOTE2: It's also faster than `split m#[ /:T-]#`
$timestamp =~ m!^
(?:\s+)?(\d{4,4})[/-](\d{1,2})[/-](\d{1,2}) # Required date portion
(?:[T\s](\d{1,2}):(\d{1,2}):(\d{1,2}))? # Optional time portion
(?:\s?([\w/\+:]+))? # Optional timezone
$!x;
my ($y, $m, $d, $hr, $min, $sec, $tz) = ($1, $2, $3, $4, $5, $6, $7);
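
A self-contained sketch of the same idea, for anyone who wants to try it: the regex above is wrapped in a helper that returns named fields ready to hand to a constructor. The helper name parse_timestamp, the zero defaults for a missing time, and the 'floating' default for a missing zone are assumptions made for illustration, not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: split a timestamp string into named fields,
# or return nothing when the string does not look like a timestamp.
sub parse_timestamp {
    my ($timestamp) = @_;

    return unless $timestamp =~ m!^
        \s*(\d{4})[/-](\d{1,2})[/-](\d{1,2})        # required date portion
        (?:[T\s](\d{1,2}):(\d{1,2}):(\d{1,2}))?     # optional time portion
        (?:\s?([\w/+:]+))?                          # optional time zone
    $!x;

    return (
        year      => $1,
        month     => $2,
        day       => $3,
        hour      => $4 // 0,           # assumed default: midnight
        minute    => $5 // 0,
        second    => $6 // 0,
        time_zone => $7 // 'floating',  # assumed default zone
    );
}

my %f = parse_timestamp('2012-05-06 17:14:52 America/Chicago');
printf "%04d-%02d-%02d %02d:%02d:%02d %s\n",
    @f{qw(year month day hour minute second time_zone)};
```

The returned list has the same key names that DateTime->new accepts, so it could be passed straight through once a time zone object has been substituted for the zone name.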

On 4 May 2012 13:20, Philipp K. Janert jan...@ieee.org wrote:
 On Thursday 03 May 2012 02:14:45 you wrote:
  From: Philipp K. Janert [mailto:jan...@ieee.org]
  Sent: Wednesday, 2 May 2012 8:29 AM
 
  Question:
 
  When using DateTime for a large number of
  instances, it becomes a serious performance
  drag.

 ...

  Is this expected behavior? And are there access
  patterns that I can use to mitigate this effect?
  (I tried to supply a time_zone explicitly, but that
  does not seem to improve things significantly.)

 Hi Phillip,

 My #1 tip is to pre-prepare/cache the DateTime::TimeZone object and pass it
 in to each creation of a DateTime object (via whatever mechanism you're
 using to do that). I have seen a case where we were using time_zone =>
 'local' in a reasonably tight datetime object creation loop and saw
 significant speed increases just by cutting out that chunk of processing.

 In hindsight that was a silly thing to do but it became an easy win :-)

 I apologise if this is what you meant by supplying a time_zone explicitly
 in your comment above.

 I have tried to specify the timezone explicitly as a string:
  $dt = DateTime->new( ..., time_zone => 'America/Chicago' )
 which does not seem to help, but I have not tried to do:
  $tz = DateTime::TimeZone->new( name => 'America/Chicago' )
  $dt = DateTime->new( ..., time_zone => $tz )

 I'll try that the next time I have to process one of my data
 sets again. ;-)

 Thanks for the hint.


 I can't recommend using a tool like NYTProf highly enough on a run of your
 tool to spot the low hanging fruit. See
 https://metacpan.org/module/Devel::NYTProf

 Cheers,

 Andrew


Re: DateTime performance

2012-05-04 Thread Philipp K. Janert
On Thursday 03 May 2012 02:10:04 you wrote:
 On 2012.5.1 3:29 PM, Philipp K. Janert wrote:
  However, when working through a files with a few
  tens of millions of records, DateTime turns into a
  REAL drag on performance.
  
  Is this expected behavior? And are there access
  patterns that I can use to mitigate this effect?
  (I tried to supply a time_zone explicitly, but that
  does not seem to improve things significantly.)
 
 Unfortunately due to the way DateTime is architected it does a lot of
 precalculation upon object instantiation which is usually not used.  So
 yes, it is expected in that sense.

Ok.

 
 If all you need is date objects with a sensible interface, try
 DateTimeX::Lite.  It claims to replicate a good chunk of the DateTime
 interface in a fraction of the memory.

I'll check it out, thanks.

 
 Given how much time it takes to make a DateTime object, and your scale of
 tens of millions of records, you could cache DateTime objects for each
 timestamp and use clone() to get a new instance.

I considered that, but in reality, most of my timestamps
are actually different. (There are about 30M seconds in
a year, so I won't have much duplication, looking at 10-50M
records spread over a year...)

 
 sub get_datetime {
     my $timestamp = shift;
 
     state $cache = {};
 
     if( defined $cache->{$timestamp} ) {
         return $cache->{$timestamp}->clone;
     }
     else {
         $cache->{$timestamp} = make_datetime_from_timestamp($timestamp);
         return $cache->{$timestamp};
     }
 }


Re: DateTime performance

2012-05-04 Thread Philipp K. Janert
On Thursday 03 May 2012 02:14:45 you wrote:
  From: Philipp K. Janert [mailto:jan...@ieee.org]
  Sent: Wednesday, 2 May 2012 8:29 AM
  
  Question:
  
  When using DateTime for a large number of
  instances, it becomes a serious performance
  drag.
 
 ...
 
  Is this expected behavior? And are there access
  patterns that I can use to mitigate this effect?
  (I tried to supply a time_zone explicitly, but that
  does not seem to improve things significantly.)
 
 Hi Phillip,
 
 My #1 tip is to pre-prepare/cache the DateTime::TimeZone object and pass it
 in to each creation of a DateTime object (via whatever mechanism you're
 using to do that). I have seen a case where we were using time_zone =>
 'local' in a reasonably tight datetime object creation loop and saw
 significant speed increases just by cutting out that chunk of processing.
 
 In hindsight that was a silly thing to do but it became an easy win :-)
 
 I apologise if this is what you meant by supplying a time_zone explicitly
 in your comment above.

I have tried to specify the timezone explicitly as a string:
  $dt = DateTime->new( ..., time_zone => 'America/Chicago' )
which does not seem to help, but I have not tried to do:
  $tz = DateTime::TimeZone->new( name => 'America/Chicago' )
  $dt = DateTime->new( ..., time_zone => $tz )

I'll try that the next time I have to process one of my data
sets again. ;-)

Thanks for the hint.

 
 I can't recommend using a tool like NYTProf highly enough on a run of your
 tool to spot the low hanging fruit. See
 https://metacpan.org/module/Devel::NYTProf
 
 Cheers,
 
 Andrew


DateTime performance

2012-05-03 Thread Philipp K. Janert

Question:

When using DateTime for a large number of
instances, it becomes a serious performance
drag. 

A typical application for me involves things like
log files: I use DateTime to translate the timestamps 
in these files into a canonical format, and then get 
information such as day-of-week or time-of-day 
from DateTime. 

However, when working through files with a few 
tens of millions of records, DateTime turns into a 
REAL drag on performance.

Is this expected behavior? And are there access
patterns that I can use to mitigate this effect? 
(I tried to supply a time_zone explicitly, but that
does not seem to improve things significantly.)

Best,

Ph.



Re: DateTime performance

2012-05-03 Thread Michael G Schwern
On 2012.5.1 3:29 PM, Philipp K. Janert wrote:
 However, when working through a files with a few 
 tens of millions of records, DateTime turns into a 
 REAL drag on performance.
 
 Is this expected behavior? And are there access
 patterns that I can use to mitigate this effect? 
 (I tried to supply a time_zone explicitly, but that
 does not seem to improve things significantly.)

Unfortunately due to the way DateTime is architected it does a lot of
precalculation upon object instantiation which is usually not used.  So yes,
it is expected in that sense.

If all you need is date objects with a sensible interface, try
DateTimeX::Lite.  It claims to replicate a good chunk of the DateTime
interface in a fraction of the memory.

Given how much time it takes to make a DateTime object, and your scale of tens
of millions of records, you could cache DateTime objects for each timestamp
and use clone() to get a new instance.

sub get_datetime {
    my $timestamp = shift;

    state $cache = {};

    if( defined $cache->{$timestamp} ) {
        return $cache->{$timestamp}->clone;
    }
    else {
        $cache->{$timestamp} = make_datetime_from_timestamp($timestamp);
        return $cache->{$timestamp};
    }
}
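
For experimenting with the cache-and-clone pattern without DateTime installed, here is a minimal runnable sketch. My::Stamp is a hypothetical stand-in for an expensive-to-build date object, and get_stamp is an assumed name. This variant always hands out a clone, so a caller that modifies its copy cannot alter the cached original. Note that `state` needs `use feature 'state'` (Perl 5.10 or later).

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical stand-in for an object that is costly to construct.
package My::Stamp {
    our $BUILT = 0;    # counts how many objects were really constructed

    sub new {
        my ($class, %args) = @_;
        $BUILT++;
        return bless {%args}, $class;
    }

    sub clone {
        my ($self) = @_;
        return bless {%$self}, ref $self;    # shallow copy
    }
}

# Build each distinct timestamp once; every caller receives its own copy.
sub get_stamp {
    my ($epoch) = @_;

    state %cache;
    $cache{$epoch} //= My::Stamp->new( epoch => $epoch );
    return $cache{$epoch}->clone;
}

get_stamp($_) for ( 100, 100, 200, 100 );
print "constructed $My::Stamp::BUILT objects for 4 requests\n";
```

The gain depends entirely on how often timestamps repeat, and the cache grows by one entry per distinct timestamp, so it suits data with many duplicates.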


-- 
100. Claymore mines are not filled with yummy candy, and it is wrong
 to tell new soldiers that they are.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


RE: DateTime performance

2012-05-03 Thread Andrew O'Brien
 From: Philipp K. Janert [mailto:jan...@ieee.org]
 Sent: Wednesday, 2 May 2012 8:29 AM
 
 Question:
 
 When using DateTime for a large number of
 instances, it becomes a serious performance
 drag.
...
 Is this expected behavior? And are there access
 patterns that I can use to mitigate this effect?
 (I tried to supply a time_zone explicitly, but that
 does not seem to improve things significantly.)

Hi Phillip,

My #1 tip is to pre-prepare/cache the DateTime::TimeZone object and pass it in 
to each creation of a DateTime object (via whatever mechanism you're using to 
do that). I have seen a case where we were using time_zone => 'local' in a 
reasonably tight datetime object creation loop and saw significant speed 
increases just by cutting out that chunk of processing.

In hindsight that was a silly thing to do but it became an easy win :-)

I apologise if this is what you meant by supplying a time_zone explicitly in 
your comment above.

I can't recommend using a tool like NYTProf highly enough on a run of your tool 
to spot the low hanging fruit. See https://metacpan.org/module/Devel::NYTProf

Cheers,

Andrew
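
A minimal sketch of the tip above, assuming DateTime and DateTime::TimeZone are installed: the zone object is built once and reused for every object created from epoch seconds. The sample epoch values are made up for illustration.

```perl
use strict;
use warnings;
use DateTime;
use DateTime::TimeZone;

# Build the time zone object once, outside the hot loop.
my $tz = DateTime::TimeZone->new( name => 'America/Chicago' );

# Hypothetical sample of epoch timestamps, e.g. pulled from a log file.
my @epochs = ( 0, 1_336_000_000, 1_336_086_400 );

for my $epoch (@epochs) {
    # Pass the prebuilt object instead of the zone name as a string.
    my $dt = DateTime->from_epoch( epoch => $epoch, time_zone => $tz );
    printf "%s %s (day of week %d)\n", $dt->ymd, $dt->hms, $dt->day_of_week;
}
```

How much this saves is worth measuring on your own data and DateTime version, for example with the profiler recommended above.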


Re: DateTime performance

2012-05-03 Thread Rick Measham
In the spirit of TIMTOWTDI, there's my DateTime::LazyInit module which I wrote 
for this sort of case. It only inflates to a full DateTime object when you call 
methods that aren't simple. 

http://search.cpan.org/~rickm/DateTime-LazyInit-1.0200/lib/DateTime/LazyInit.pm

Caveat: I haven't tested it against any recent DateTime releases. 

Cheers!
Rick Measham


On 02/05/2012, at 8:29, Philipp K. Janert jan...@ieee.org wrote:

 
 Question:
 
 When using DateTime for a large number of
 instances, it becomes a serious performance
 drag. 
 
 A typical application for me involves things like
 log files: I use DateTime to translate the timestamps 
 in these files into a canonical format, and then get 
 information such as day-of-week or time-of-day 
 from DateTime. 
 
 However, when working through a files with a few 
 tens of millions of records, DateTime turns into a 
 REAL drag on performance.
 
 Is this expected behavior? And are there access
 patterns that I can use to mitigate this effect? 
 (I tried to supply a time_zone explicitly, but that
 does not seem to improve things significantly.)
 
 Best,
 
Ph.
 



Re: DateTime performance

2012-05-03 Thread Ashley Pond V
I love and use DateTime, but for 10s of millions of records at once I
would be choosing Date::Calc instead and dealing with any necessary
futzy bits manually.
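
As an illustration of doing it by hand, and not of Date::Calc itself, the sketch below gets day-of-week and time-of-day from epoch seconds using only Perl's built-in gmtime. It assumes UTC is acceptable; time zones and DST are exactly the futzy bits it leaves out. The function name day_and_time_utc is made up.

```perl
use strict;
use warnings;

my @DAY_NAMES = qw(Sunday Monday Tuesday Wednesday Thursday Friday Saturday);

# Return (day name, HH:MM:SS) in UTC for a Unix epoch value.
sub day_and_time_utc {
    my ($epoch) = @_;

    # gmtime in list context: sec, min, hour, mday, mon, year, wday, ...
    my ( $sec, $min, $hour, undef, undef, undef, $wday ) = gmtime($epoch);

    return ( $DAY_NAMES[$wday], sprintf( '%02d:%02d:%02d', $hour, $min, $sec ) );
}

my ( $day, $time ) = day_and_time_utc(0);
print "$day $time\n";    # epoch 0 is Thursday 00:00:00 UTC
```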

On Thu, May 3, 2012 at 2:53 AM, Rick Measham r...@measham.id.au wrote:
 In the spirit of TIMTOWTDI, there's my DateTime::LazyInit module which I 
 wrote for this sort of case. It only inflates to a full DateTime object when 
 you call methods that aren't simple.

 http://search.cpan.org/~rickm/DateTime-LazyInit-1.0200/lib/DateTime/LazyInit.pm

 Caveat: I haven't tested it against any recent DateTime releases.

 Cheers!
 Rick Measham
 

 On 02/05/2012, at 8:29, Philipp K. Janert jan...@ieee.org wrote:


 Question:

 When using DateTime for a large number of
 instances, it becomes a serious performance
 drag.

 A typical application for me involves things like
 log files: I use DateTime to translate the timestamps
 in these files into a canonical format, and then get
 information such as day-of-week or time-of-day
 from DateTime.

 However, when working through a files with a few
 tens of millions of records, DateTime turns into a
 REAL drag on performance.

 Is this expected behavior? And are there access
 patterns that I can use to mitigate this effect?
 (I tried to supply a time_zone explicitly, but that
 does not seem to improve things significantly.)

 Best,

        Ph.




Re: DateTime performance

2009-02-14 Thread Zefram
arie.ha...@gmail.com wrote:
Our project requires getting time-zone offset for the given time-zone
id at the current time.

You can speed things up a bit by using the timezone modules in isolation.
You can construct a fake DateTime class, which only provides the methods
->utc_rd_as_seconds and ->utc_year.  Use that class to construct an
object representing the current time.  Then call ->offset_for_datetime
on a timezone object, passing in your fake DateTime.

-zefram


DateTime performance

2009-01-23 Thread arie . haan1
Hey!

Why is the DateTime module so slow to load?

This simple script, which just imports DateTime, takes approximately
1 second to run:
use DateTime;

Can I make it faster?

Our project requires getting the time-zone offset for a given time-zone
id at the current time.
This is the primary reason I use the DateTime family of modules.

Is there any alternative to DateTime for this task?

Thanks!
Arie



Re: DateTime performance

2009-01-23 Thread Dave Rolsky

On Fri, 23 Jan 2009, arie.ha...@gmail.com wrote:


Why DateTime module is loaded so slow?

This simple script that just imports DateTime is executed for 1 second
approximately:
use DateTime;

Can I make it faster?


Yes, you need a faster computer!

 auta...@houseabsolute:~/projects/R2$ time perl -MDateTime -e1

 real   0m0.109s
 user   0m0.096s
 sys    0m0.016s

That's my desktop, which is a Core2 Duo of some sort.

Note that once you do this once it gets much quicker because the OS keeps 
the data in memory until it gets paged out by something else. If you keep 
using it, it won't get paged out. The results above are _not_ from the 
first load.



-dave

/*
http://VegGuide.org   http://blog.urth.org
Your guide to all that's veg  House Absolute(ly Pointless)
*/


Re: DateTime performance

2006-01-18 Thread Yitzchak Scott-Thoennes
On Mon, Jan 16, 2006 at 06:21:54PM -0800, [EMAIL PROTECTED] wrote:
 One might hope that a script like this:
 
 test3
 #!/usr/bin/perl
 BEGIN {
 no lib qw|/usr/lib/perl5/site_perl/5.8.6/i386-linux-thread-multi /usr/ 
 lib/perl5/site_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/ 
 site_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/site_perl/ 
 5.8.3/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.6 /usr/lib/ 
 perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl/5.8.4 /usr/lib/perl5/ 
 site_perl/5.8.3 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/ 
 5.8.6/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5/i386- 
 linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread- 
 multi /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi /usr/ 
 lib/perl5/vendor_perl/5.8.6 /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/ 
 perl5/vendor_perl/5.8.4 /usr/lib/perl5/vendor_perl/5.8.3 /usr/lib/ 
 perl5/vendor_perl /usr/lib/perl5/5.8.6/i386-linux-thread-multi /usr/ 
 lib/perl5/5.8.6 .|;
 use lib qw|/usr/lib/perl5/site_perl/5.8.6/i386-linux-thread-multi / 
 usr/lib/perl5/site_perl/5.8.6 /usr/lib/perl5/site_perl /usr/lib/perl5/ 
 vendor_perl/5.8.6/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/ 
 5.8.6 /usr/lib/perl5/vendor_perl /usr/lib/perl5/5.8.6/i386-linux- 
 thread-multi /usr/lib/perl5/5.8.6 .|;
 }
 use DateTime;
 
 Might improve the situation. However even this has no significant  
 improvement and from additional traces it doesn't actually stop perl  
 from using the built in paths.

Then no lib isn't doing what you want.  Try just:

BEGIN { @INC = grep !/5\.8\.[0-5]/, @INC }
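
To see what that grep keeps, the same filter can be run on a plain list first. The sample paths below are a made-up subset modelled on the @INC quoted above.

```perl
use strict;
use warnings;

# Hypothetical sample of search paths, modelled on the FC4 @INC above.
my @inc = qw(
    /usr/lib/perl5/site_perl/5.8.6
    /usr/lib/perl5/site_perl/5.8.5
    /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/5.8.6
    .
);

# Drop every directory that belongs to perl 5.8.0 through 5.8.5.
my @kept = grep { !/5\.8\.[0-5]/ } @inc;

print "$_\n" for @kept;
```

Only the 5.8.6 directories and the current directory survive, which is the list perl would then search when the filter is applied to @INC inside a BEGIN block.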


Re: DateTime performance

2006-01-18 Thread Yitzchak Scott-Thoennes
On Wed, Jan 18, 2006 at 08:38:13AM -0800, [EMAIL PROTECTED] wrote:
 Then no lib isn't doing what you want.
 
 Agree. But, that is the point. Outside of recompiling perl with new  
 paths or significantly altering DateTime to use far fewer  
 dependencies nothing can really be done.
 
 test4
 #!/usr/bin/perl
 BEGIN { @INC = grep !/5\.8\.[0-5]/, @INC }
 use DateTime;

Do your traces show it still searching all the removed paths?
There's no way the above should be doing that, unless you're
loading DateTime earlier, via sitecustomize.pl or $PERL5OPT?


Re: DateTime performance

2006-01-18 Thread matthew

Then no lib isn't doing what you want.


Agree. But, that is the point. Outside of recompiling perl with new  
paths or significantly altering DateTime to use far fewer  
dependencies nothing can really be done.


test4
#!/usr/bin/perl
BEGIN { @INC = grep !/5\.8\.[0-5]/, @INC }
use DateTime;

[EMAIL PROTECTED] tmp]$ time perl test4

real    0m5.780s
user    0m5.524s
sys     0m0.188s

Matthew



Re: DateTime performance

2006-01-18 Thread matthew

Do your traces show it still searching all the removed paths?


yes


There's no way the above should be doing that, unless you're
loading DateTime earlier, via sitecustomize.pl or $PERL5OPT?


Neither of the items you have identified are used in any way during  
these tests. I would expect if either of those had been the issue  
then even test1 would be slow.


Regards,
Matthew


DateTime performance

2006-01-17 Thread matthew


I don't consider this to be completely a DateTime issue however I  
thought I would share my findings to this list for consideration.


I'm using the latest release of DateTime with perl 5.8 (Standard RPM  
distro) for FC4 on a very old 166MHz Pentium system. So I don't  
expect this system to be fast.


Using this system if I take the following two scripts:

test1
#!/usr/bin/perl

test2
#!/usr/bin/perl
use DateTime;

The performance of them is like this:

[EMAIL PROTECTED] tmp]$ time perl test1

real    0m0.060s
user    0m0.016s
sys     0m0.044s
[EMAIL PROTECTED] tmp]$ time perl test2

real    0m5.805s
user    0m5.456s
sys     0m0.284s

That to me is a huge performance hit to just load a module. This is  
the distribution of where all the time for test2 is getting spent:


[EMAIL PROTECTED] tmp]$ strace -c perl test2
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 41.51    0.098303         209       471       420 open
 32.34    0.076580         181       424       420 stat64
 13.73    0.032522         290       112           read
  3.74    0.008861         269        33           old_mmap
  2.20    0.005213          57        91         3 _llseek
  1.56    0.003700          73        51           close
  1.08    0.002549        2549         1           execve
  1.02    0.002408          62        39        36 ioctl
  0.95    0.002241          93        24           brk
  0.46    0.001085         121         9           mprotect
  0.45    0.001054          66        16           fstat64
  0.21    0.000508         508         1           readlink
  0.19    0.000446         149         3           mmap2
  0.13    0.000301         151         2           munmap
  0.09    0.000213         213         1           _sysctl
  0.08    0.000181          45         4           rt_sigaction
  0.05    0.000129         129         1         1 access
  0.04    0.000089          45         2           time
  0.02    0.000059          59         1           futex
  0.02    0.000050          50         1           fcntl64
  0.02    0.000048          48         1           getrlimit
  0.02    0.000044          44         1           set_thread_area
  0.02    0.000042          42         1           rt_sigprocmask
  0.02    0.000038          38         1           getuid32
  0.02    0.000037          37         1           set_tid_address
  0.02    0.000036          36         1           geteuid32
  0.01    0.000035          35         1           getgid32
  0.01    0.000034          34         1           getegid32
------ ----------- ----------- --------- --------- ----------------
100.00    0.236806                  1295       880 total

From this, the biggest time consumer is opening and stat-ing files,
followed by reading them. This also generates a lot of errors because
of how FC4 has decided to support old paths in perl's default @INC.
They include a lot of old directories that are empty, but this forces
perl to search them all anyway. So a path of 20+ directories, most of
them empty, plus DateTime loading 100+ different files just to get
started, turns into a fair amount of searching and loading.


One might hope that a script like this:

test3
#!/usr/bin/perl
BEGIN {
no lib qw|/usr/lib/perl5/site_perl/5.8.6/i386-linux-thread-multi /usr/ 
lib/perl5/site_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/ 
site_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/site_perl/ 
5.8.3/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.6 /usr/lib/ 
perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl/5.8.4 /usr/lib/perl5/ 
site_perl/5.8.3 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/ 
5.8.6/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5/i386- 
linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread- 
multi /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi /usr/ 
lib/perl5/vendor_perl/5.8.6 /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/ 
perl5/vendor_perl/5.8.4 /usr/lib/perl5/vendor_perl/5.8.3 /usr/lib/ 
perl5/vendor_perl /usr/lib/perl5/5.8.6/i386-linux-thread-multi /usr/ 
lib/perl5/5.8.6 .|;
use lib qw|/usr/lib/perl5/site_perl/5.8.6/i386-linux-thread-multi / 
usr/lib/perl5/site_perl/5.8.6 /usr/lib/perl5/site_perl /usr/lib/perl5/ 
vendor_perl/5.8.6/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/ 
5.8.6 /usr/lib/perl5/vendor_perl /usr/lib/perl5/5.8.6/i386-linux- 
thread-multi /usr/lib/perl5/5.8.6 .|;

}
use DateTime;

Might improve the situation. However even this has no significant  
improvement and from additional traces it doesn't actually stop perl  
from using the built in paths.


[EMAIL PROTECTED] tmp]$ time perl test3

real    0m5.721s
user    0m5.424s
sys     0m0.216s

Not that I expect anyone to fix anything about this but I just  
thought I would pass it  along. On most fast computers today this  
busy work probably isn't noticed as a delay but on this box 5 sec  
just to get started is a very 

Re: DateTime performance

2006-01-17 Thread Jerrad Pierce
Unfortunately, it's a known problem that CentOS suffers from too (@[EMAIL 
PROTECTED]).
This also makes reading error output incredibly difficult since a full screen
is given to list @INC. Instead of a few folks who are upgrading systems
having to set PERL5LIB everyone else has to recompile perl or put up with
the shit they fed us, funk you very much Red Hat. (I doubt they cripple gcc
in a similar manner, but you never know). See also: 

http://www.perl.com/pub/a/2005/12/21/a_timely_start.html


Re: DateTime performance

2006-01-17 Thread Roderick A. Anderson

Jerrad Pierce wrote:

Unfortunately, it's a known problem that CentOS suffers from too (@[EMAIL 
PROTECTED]).
This also makes reading error output incredibly difficult since a full screen
is given to list @INC. Instead of a few folks who are upgrading systems
having to set PERL5LIB everyone else has to recompile perl or put up with
the shit they fed us, funk you very much Red Hat. (I doubt they cripple gcc
in a similar manner, but you never know).


How about python.  Seems like Redhat has got very snake-ish in the last 
3-4 years.


So if perl stands for Practical Extraction and Report Language ( not 
to mention my favorite Pathological Eclectic Rubbish Lister ) does 
python mean slowly squeeze the life out of you?


See also: 


http://www.perl.com/pub/a/2005/12/21/a_timely_start.html


Sorry to vent here.  Especially bad day.


Rod
--


Re: DateTime Performance

2003-08-09 Thread Eugene van der Pijll
John Siracusa schreef:
 Okay, here's a simple implementation of a memoized 
 DateTime::Locale::load().

A solution that is more or less equivalent, is to change the
DefaultLocale routine. At the moment, $DefaultLocale is saved as a
string; every time DT::new() is called without a locale argument, the
default locale is loaded again.

It should be a bit faster than your version, because DT::Locale::load is
never called in DT::new(). (Except if you specify another locale; in
that case you should pass the locale object, not the locale name, if you
want speed.)

Probably this changes the behaviour if the default locale is aliased.
But IMHO, that's probably for the better.

Probably this should happen with the timezone parameter as well: change

default => 'floating'

to

default => DateTime::TimeZone->new( name => 'floating' )

Eugene
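
A toy model of the change described above, in plain Perl and not the real DateTime internals: the default is resolved to an object once, so a constructor only loads when it is handed a name. The functions load_locale and new_object are hypothetical stand-ins for DateTime::Locale->load and DateTime->new.

```perl
use strict;
use warnings;

my $loads = 0;

# Hypothetical stand-in for an expensive locale lookup by name.
sub load_locale {
    my ($id) = @_;
    $loads++;
    return { id => $id };
}

# The default is stored as an object, resolved a single time.
my $default_locale = load_locale('en_US');

# Hypothetical constructor: only loads when given a name, not an object.
sub new_object {
    my (%p) = @_;
    my $locale = $p{locale} // $default_locale;
    $locale = load_locale($locale) unless ref $locale;
    return { locale => $locale };
}

new_object() for 1 .. 1000;
print "loads: $loads\n";    # loads: 1
```

Passing a locale name still triggers one load per call, which matches the advice above to pass the object rather than the name when speed matters.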


Re: DateTime Performance

2003-08-08 Thread Joshua Hoblitt
  A solution that is more or less equivalent, is to change the
  DefaultLocale routine. [...]
  Probably this changes the behaviour if the default locale is aliased.
  But IMHO, that's probably for the better.

 Yeah, that was my concern: add_aliases() and friends in
 DateTime::Locale would have to reach back into DateTime and blank the
 cached locale, which seemed evil to me.  But I was just thinking of
 preserving the existing behavior.  If this is not a constraint, then
 I'm all for the alternative you suggested.

Ack - let's not go around fiddling with caches in other namespaces.  The caching 
mechanism should be _internal_ to DateTime::Locale.

-J

--


Re: DateTime Performance

2003-08-07 Thread Tim Bunce
On Mon, Aug 04, 2003 at 11:32:15PM -0500, Dave Rolsky wrote:
 
  Maybe that looks more sane to you?
 
 What makes no sense is for BEGIN to show up as a significant chunk of the
 time it would take to do anything, since this stuff should only happen
 once.  I'm somewhat skeptical that Devel::DProf is working, or works
 properly at all in general.

It's working okay but unhelpfully... Devel::DProf records the name of the
sub only when it's first called (naturally, for performance reasons).

The problem with BEGIN blocks is that they're called once *then freed*
and then the same address is then reused for the next sub definition.

Probably easy to fix but I've never had the time.

Tim.


Re: DateTime Performance

2003-08-04 Thread John Siracusa
On 8/4/03 12:26 AM, Dave Rolsky wrote:
 # ... includes args: year, month, day, hour, minute, second
 DateTime->new(...): 16 wallclock secs @ 687.29/s
(14.48 usr +  0.07 sys = 14.55 CPU)
 
 This does a lot of work, including calculating both UTC and local times,
 which involves calculating leap seconds, etc.

Does it need to do that?  I mean, sure, eventually it might have to do that
if I want to do some sort of date manipulation, or even just fetch or print
the date.  But does it have to really do anything at all during object
construction other than stash the args somewhere?

 DateTime->now(): 21 wallclock secs @ 547.95/s
(18.13 usr +  0.12 sys = 18.25 CPU)
 
 Ditto.

I'm assuming now() is slower than new() due to the system call overhead of
getting the current time...?

 Total Elapsed Time = 19.91729 Seconds
User+System Time = 14.60729 Seconds
 Exclusive Times
 %Time ExclSec CumulS #Calls sec/call Csec/c  Name
   27.6   4.035  4.685  20274   0.0002 0.0002  Params::Validate::_validate
   24.0   3.510 17.549  10000   0.0004 0.0018  DateTime::new
   18.9   2.770  3.809  10001   0.0003 0.0004  DateTime::Locale::_load_class_from_id
 
 This seems quite odd.  It really doesn't do much.
 
   8.96   1.309  2.647  10020   0.0001 0.0003  DateTime::TimeZone::BEGIN
 
 And this is completely mystifying.  Can you show us your code?

Sure, here it is:

for(1 .. 10000)
{
  my $d = DateTime->new(year => 200, month => 1, day => 1, hour => 2,
                        minute => 3, second => 4);
}

Those stats were produced on a G3/400 running a development release of OS X
that uses some build of Perl 5.8.1, which could explain some oddness.  Here
is the same code run on a G4/800 using Perl 5.8.0 on the latest released
version of OS X 10.2:

Total Elapsed Time = 8.817281 Seconds
  User+System Time = 5.352659 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 60.4   3.236 10.844  10000   0.0003 0.0011  DateTime::new
 44.7   2.395  3.305  10001   0.0002 0.0003  DateTime::Locale::_load_class_from_id
 43.3   2.318  2.127  20274   0.0001 0.0001  Params::Validate::_validate
 22.5   1.207  1.095  10001   0.0001 0.0001  DateTime::Locale::Base::new
 18.4   0.987  1.223  10020   0.0001 0.0001  DateTime::TimeZone::BEGIN
 17.5   0.939  0.465  50000   0.0000 0.0000  DateTime::__ANON__
 15.2   0.818  0.645  10002   0.0001 0.0001  DateTime::_calc_local_components
 12.8   0.687  1.025  10002   0.0001 0.0001  DateTime::_calc_local_rd
 10.6   0.568  0.525  10002   0.0001 0.0001  DateTime::_calc_utc_rd
 8.20   0.439  0.225  10002   0.0000 0.0000  DateTime::_normalize_seconds
 7.83   0.419  0.275  10000   0.0000 0.0000  DateTime::_last_day_of_month
 7.47   0.400  0.115  30006   0.0000 0.0000  DateTime::TimeZone::Floating::is_floating
 7.27   0.389  3.505  10001   0.0000 0.0004  DateTime::Locale::load
 5.79   0.310  0.214  10006   0.0000 0.0000  DateTime::TimeZone::Floating::BEGIN
 4.86   0.260  0.070  20004   0.0000 0.0000  DateTime::TimeZone::OffsetOnly::is_utc
Maybe that looks more sane to you?
 
 So, what does everyone else think of the object creation performance
 situation?  Is it simply good enough to be 3x faster than
 Date::Manip::ParseDate()?  Are there any obvious areas that I should
 consider before I start mucking around with DateTime::new()?
 
 Considering that up til now my concern has been primarily on getting
 things correct, I wouldn't worry about it.  There are definitely some big
 performance improvements possible.  One possibility is to move the leap
 second bits into the DateTime XS code, which should help a lot.  The
 timezone stuff can also benefit from being rewritten as XS, but that won't
 help the particular cases you benchmarked, since the UTC and floating time
 zones are quite fast already.

What about what I mentioned earlier: lazy (or lazier) evaluation in the
constructor?  Basically, I want construction with known values to be as fast
as possible since there's a chance I may not even look at the date fields of
my objects.  But it's a hassle to have special-case code that either doesn't
fetch or doesn't set the date fields of my objects, just so I can avoid the
relatively expensive calls to DateTime->new().
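
A minimal sketch of what lazier construction could look like from the
caller's side, assuming a hypothetical wrapper class (LazyDate and its
builder argument are illustrative names, not part of DateTime).  The raw
fields are stashed at construction time and the expensive object is built
only on first access.  The stand-in builder below just formats a string so
the example runs without DateTime installed; in real use it would be
something like sub { DateTime->new(@_) }.

```perl
use strict;
use warnings;

package LazyDate;

# Hypothetical wrapper: remember the constructor arguments, defer the
# expensive build until the object is actually needed.
sub new {
    my ($class, %args) = @_;
    my $builder = delete $args{builder}
        or die "builder coderef required";
    return bless { args => {%args}, builder => $builder, object => undef },
        $class;
}

# Build on first access, then reuse the cached result.
sub object {
    my ($self) = @_;
    $self->{object} //= $self->{builder}->(%{ $self->{args} });
    return $self->{object};
}

package main;

my $built = 0;    # counts how often the builder really runs

my $lazy = LazyDate->new(
    year  => 2003,
    month => 8,
    day   => 4,
    # Assumption: stand-in for sub { DateTime->new(@_) }
    builder => sub {
        my %p = @_;
        $built++;
        return sprintf '%04d-%02d-%02d', @p{qw(year month day)};
    },
);

print "builds before first access: $built\n";
print "value: ", $lazy->object, "\n";
print "builds after first access: $built\n";
```

The trade-off is that bad field values are only detected when the object is
first used, not when it is constructed.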

-John



Re: DateTime Performance

2003-08-04 Thread John Siracusa
On 8/4/03 10:10 AM, John Siracusa wrote:
 On 8/4/03 12:26 AM, Dave Rolsky wrote:
 # ... includes args: year, month, day, hour, minute, second
 DateTime->new(...): 16 wallclock secs @ 687.29/s
(14.48 usr +  0.07 sys = 14.55 CPU)
 
 This does a lot of work, including calculating both UTC & local times,
 which involves calculating leap seconds, etc.
 
 Does it need to do that?  I mean, sure, eventually it might have to do that
 if I want to do some sort of date manipulation, or even just fetch or print
 the date.  But does it have to really do anything at all during object
 construction other than stash the args somewhere?

I played around with DateTime::new() and found that the biggest culprit is
this line:

$self->{locale} = DateTime::Locale->load( $p{locale} );

The removal of which more than doubles the performance of calling
DateTime::new(...) with ymdhms args.  The only way to get a comparable
speedup is to remove every line below that one except for these two:

bless $self, $class;
return $self;

And even that only gives a ~90% speedup vs. the 100%+ gained by ditching
DateTime::Locale->load().  (Obviously all of this will hose DateTime's
actual functionality, but bear with me :)

Profiling showed that DateTime::Locale::_load_class_from_id() was being
called N+1 times during N calls to DateTime->new(...), and that it was #3 in
the dprofpp list (2000 iterations shown):

%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 47.8   0.663  2.135   2000   0.0003 0.0011  DateTime::new
 35.2   0.488  0.399   4274   0.0001 0.0001  Params::Validate::_validate
 31.6   0.439  0.517   2001   0.0002 0.0003  DateTime::Locale::_load_class_from_id
 15.8   0.219  0.313   2020   0.0001 0.0002  DateTime::TimeZone::BEGIN

I found that _load_class_from_id() unconditionally executes this code:

eval "require $real_class";

Skipping that line was good for a 30%+ speed boost, but that got me
thinking...aren't the Locale objects loaded/created by _load_class_from_id()
singletons?  Replacing calls to _load_class_from_id() within
DateTime::Locale::load() with some dumb caching like this:

$Cache_By_Id{$id} ||= $class->_load_from_id($id)

Resulted in an easy 50% speed-up for DateTime->new(...), and
_load_class_from_id() dropped completely off the dprofpp output:

Total Elapsed Time = 0.841889 Seconds
  User+System Time = 0.501889 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 116.   0.584  1.290   2000   0.0003 0.0006  DateTime::new
 79.3   0.398  0.287   4274   0.0001 0.0001  Params::Validate::_validate
 41.6   0.209  0.220   2002   0.0001 0.0001  DateTime::_calc_local_rd
 37.6   0.189  0.238   2020   0.0001 0.0001  DateTime::TimeZone::BEGIN
 31.6   0.159  0.150   2002   0.0001 0.0001  DateTime::_calc_utc_rd
 27.8   0.140  0.070   2002   0.0001 0.0000  DateTime::_calc_local_components
 25.9   0.130  0.030  10000   0.0000 0.0000  DateTime::__ANON__
 17.9   0.090  0.070   2001   0.0000 0.0000  DateTime::DefaultLocale
 15.9   0.080  0.040   4004   0.0000 0.0000  DateTime::TimeZone::OffsetOnly::is_utc
 15.9   0.080  0.030   2000   0.0000 0.0000  DateTime::_last_day_of_month
 15.9   0.080  0.040   2002   0.0000 0.0000  DateTime::_normalize_seconds
 13.9   0.070  0.010   6006   0.0000 0.0000  DateTime::TimeZone::Floating::is_floating
 13.9   0.070  0.069   2006   0.0000 0.0000  DateTime::TimeZone::Floating::BEGIN
 11.3   0.057  0.115      1   0.0573 0.1145  DateTime::Locale::register
 7.97   0.040  0.154      6   0.0067 0.0257  DateTime::Locale::BEGIN

(An aside: why is DateTime::DefaultLocale on this list at all?)

To test my theory that this kind of dumb caching is valid, I ran all of
DateTime::Locale's tests, and then ran DateTime's tests while using the
modified DateTime::Locale.  Everything passed.

So, assuming I'm not missing a finer point here, I'm thinking that one easy
speed-up for DateTime object creation would be to make the various
DateTime::Locale::* classes into singletons (using whatever the proper
method is for this in the DT project) and avoid repeated string evals and
repeated calls to _load_class_from_id().

Going further, if calls to DateTime::Locale->load(...) could be memoized
safely, that'd be great too :)
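
For illustration, here is the "dumb caching" idea in isolation, as a sketch
that uses only core Perl.  The loader below (_expensive_load) is a
hypothetical stand-in for the real class-loading work, and the counter
exists only to show that a repeated lookup for the same id does not load
again.  It assumes the loaded objects are safe to share, which is exactly
the singleton question raised above.

```perl
use strict;
use warnings;

my %Cache_By_Id;    # id => already-loaded object
my $loads = 0;      # how many times the expensive path really ran

# Hypothetical stand-in for the expensive work (string eval, require,
# object construction) done on every call in the unpatched code.
sub _expensive_load {
    my ($id) = @_;
    $loads++;
    return { id => $id };
}

# Same shape as the one-line patch: load once per id, then reuse.
sub load_cached {
    my ($id) = @_;
    return $Cache_By_Id{$id} ||= _expensive_load($id);
}

my $first  = load_cached('en_US');
my $second = load_cached('en_US');
my $other  = load_cached('fr_FR');

print "expensive loads: $loads\n";
print "same object reused: ", ($first == $second ? "yes" : "no"), "\n";
```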

-John



Re: DateTime Performance

2003-08-04 Thread John Siracusa
On 8/4/03 1:25 PM, Ben Bennett wrote:
 Why not make your module be lazy about whether or not it creates a
 DateTime?

I thought of that, but I also use the act of creating a DateTime object to
check the validity of date attributes.  Anyway, I think there's room for
DateTime->new() optimization even without adding lazy evaluation (see
earlier posts).

-John



DateTime Performance

2003-08-03 Thread John Siracusa
I was profiling a database-backed mod_perl application recently.  A 
particular request was taking several seconds to complete.  At first I 
thought the database was the bottleneck, but the request included only 
one database query, and that query completed in about 300msec when run 
from a command-line script. Something Perl-ish was the culprit, so I 
set out to find it.

This task was made more difficult by my inability to get Devel::DProf 
working in Mac OS X (see my posts to the mod_perl and [EMAIL PROTECTED] 
lists), so I had to resort to the use of Time::HiRes and a smattering 
of calls to my own simple timer routines.  I eventually narrowed the 
time-suck down to a loop that looked something like this:

# bind columns to %row here

while($sth->fetch)
{
  push(@widgets, Widget->new(%row));
}

Now I suspected some sort of DBI issue, so I replaced the loop body 
with a no-op.  Suddenly, the request completed in one second or less.  
Now I suspected my Widget class, and benchmarked its constructor 
offline.  (The constructor just calls $self->$key($value) for each k/v 
pair in %row.)  This eventually led me to find that setting the date 
fields in the Widget object was the culprit.

I use DateTime objects for my internal date representation, but I have 
a set of wrapper functions that hide this fact.  Now I suspected that 
my date parsing wrapper code was the problem, so I replaced my parse 
function's body with a simple call to DateTime->now.  The request 
became slow again, taking several seconds to complete.

There was no avoiding it: the bottleneck for my web app was not the 
database, not HTML::Mason, not my object classes, not even my date 
parsing code, but DateTime object creation!  (Perl 5.8, latest DateTime 
from CPAN.)

My quick fix was to make sure that %row only contains a single date 
field, rather than the four that each object has when completely 
filled out.  This produced a noticeable (~2x) speed increase for the 
whole request.

Sorry to provide so many gory details, but I wanted to try to establish 
exactly how I'm using DateTime, and how its performance came to my 
attention in the first place.  I benchmarked DateTime's object creation 
speed against a few random classes, just to get a feel for where it 
stands:

CGI->new(''):  5 wallclock secs @ 1869.16/s
  (5.25 usr +  0.10 sys =  5.35 CPU)
Date::Manip::ParseDate('now'): 49 wallclock secs @ 223.81/s
  (44.44 usr  0.24 sys +  0.01 cusr  0.01 csys = 44.70 CPU)
Date::Simple->new('2003-01-01'):  2 wallclock secs @ 4273.50/s
  (2.31 usr +  0.03 sys =  2.34 CPU)
# ... includes args: year, month, day, hour, minute, second
DateTime->new(...): 16 wallclock secs @ 687.29/s
  (14.48 usr +  0.07 sys = 14.55 CPU)
DateTime->now(): 21 wallclock secs @ 547.95/s
  (18.13 usr +  0.12 sys = 18.25 CPU)

DateTime does well against Date::Manip, but not so well against even a
big module like CGI.  But for object creation alone, should it really 
be ~5x as slow as Date::Simple?
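
For anyone who wants to reproduce this kind of comparison, here is a rough
sketch using the core Benchmark module.  It is not the script that produced
the numbers above: the iteration count, the test names and the 'Plain::Date'
class name are all assumptions made for the example, and the DateTime case
is only added if DateTime happens to be installed.

```perl
use strict;
use warnings;
use Benchmark qw(timethese timestr);

# Baseline: blessing a plain hash, roughly the cheapest possible
# "constructor".  'Plain::Date' is just an illustrative class name.
my %tests = (
    'plain hash' => sub {
        my $o = bless { year => 2003, month => 1, day => 1 }, 'Plain::Date';
        return;
    },
);

# Only benchmark DateTime if it can be loaded on this machine.
if (eval { require DateTime; 1 }) {
    $tests{'DateTime->new'} = sub {
        my $d = DateTime->new(
            year => 2003, month  => 1, day    => 1,
            hour => 2,    minute => 3, second => 4,
        );
        return;
    };
}

# 'none' suppresses the per-test report so it can be formatted below.
my $results = timethese(5000, \%tests, 'none');

for my $name (sort keys %$results) {
    printf "%s: %s\n", $name, timestr($results->{$name});
}
```

With so few iterations the absolute numbers are noisy; raise the count for
anything you intend to compare seriously.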

My final step was to profile 10,000 calls to DateTime->new(...) using 
Devel::DProf (which works from the command line in OS X).  dprofpp had 
this to say:

Total Elapsed Time = 19.91729 Seconds
  User+System Time = 14.60729 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 27.6   4.035  4.685  20274   0.0002 0.0002  Params::Validate::_validate
 24.0   3.510 17.549  10000   0.0004 0.0018  DateTime::new
 18.9   2.770  3.809  10001   0.0003 0.0004  DateTime::Locale::_load_class_from_id
 8.96   1.309  2.647  10020   0.0001 0.0003  DateTime::TimeZone::BEGIN
 6.44   0.940  1.030  10001   0.0001 0.0001  DateTime::Locale::Base::new
 6.23   0.910  1.190  10002   0.0001 0.0001  DateTime::_calc_local_components
 4.45   0.650  0.650  50000   0.0000 0.0000  DateTime::__ANON__
 3.90   0.570  1.009  10002   0.0001 0.0001  DateTime::_calc_utc_rd
 2.88   0.420  0.490  10000   0.0000 0.0000  DateTime::_last_day_of_month
 2.67   0.390  0.399  10006   0.0000 0.0000  DateTime::TimeZone::Floating::BEGIN
 2.40   0.350  1.619  10002   0.0000 0.0002  DateTime::_calc_local_rd
 1.92   0.280  0.299  10001   0.0000 0.0000  DateTime::DefaultLocale
 1.64   0.240  0.240  30006   0.0000 0.0000  DateTime::TimeZone::Floating::is_floating
 1.51   0.220  0.220  10000   0.0000 0.0000  DateTime::_rd2ymd
 1.37   0.200  4.009  10001   0.0000 0.0004  DateTime::Locale::load

These numbers confuse me a bit, because I'm only creating about 30 
Widget objects in my mod_perl request, not 10,000.  But I see a very 
significant speed hit, even if I replace my entire Widget->new() call 
with a simple call to DateTime->new().  Maybe it's some sort of 
mod_perl/DateTime interaction?

Anyway, I don't want to get sidetracked into mod_perl stuff.  I'm not 
sure what (else) to make of the results above, other than a possible 
wish that I could turn off Params::Validate's validation in certain 
performance-critical situations.

Re: DateTime Performance

2003-08-03 Thread Dave Rolsky
On Sun, 3 Aug 2003, John Siracusa wrote:

 CGI->new(''):  5 wallclock secs @ 1869.16/s
(5.25 usr +  0.10 sys =  5.35 CPU)

CGI's constructor really doesn't do much at all, especially if there's no
query string or form submission to handle.

 Date::Simple->new('2003-01-01'):  2 wallclock secs @ 4273.50/s
(2.31 usr +  0.03 sys =  2.34 CPU)

This also doesn't really do much of anything.

 # ... includes args: year, month, day, hour, minute, second
 DateTime->new(...): 16 wallclock secs @ 687.29/s
(14.48 usr +  0.07 sys = 14.55 CPU)

This does a lot of work, including calculating both UTC & local times,
which involves calculating leap seconds, etc.

 DateTime->now(): 21 wallclock secs @ 547.95/s
(18.13 usr +  0.12 sys = 18.25 CPU)

Ditto.

 DateTime does well against Date::Manip, but not so well against even a
 big module like CGI.  But for object creation alone, should it really
 be ~5x as slow as Date::Simple?

Yeah, probably.

 Total Elapsed Time = 19.91729 Seconds
User+System Time = 14.60729 Seconds
 Exclusive Times
 %Time ExclSec CumulS #Calls sec/call Csec/c  Name
   27.6   4.035  4.685  20274   0.0002 0.0002  Params::Validate::_validate
   24.0   3.510 17.549  10000   0.0004 0.0018  DateTime::new
   18.9   2.770  3.809  10001   0.0003 0.0004  DateTime::Locale::_load_class_from_id

This seems quite odd.  It really doesn't do much.

   8.96   1.309  2.647  10020   0.0001 0.0003  DateTime::TimeZone::BEGIN

And this is completely mystifying.  Can you show us your code?

 These numbers confuse me a bit, because I'm only creating about 30
 Widget objects in my mod_perl request, not 10,000.  But I see a very
 significant speed hit, even if I replace my entire Widget->new() call
 with a simple call to DateTime->new().  Maybe it's some sort of
 mod_perl/DateTime interaction?

No, DateTime just does a lot of stuff.

 Anyway, I don't want to get sidetracked into mod_perl stuff.  I'm not
 sure what (else) to make of the results above, other than a possible
 wish that I could turn off Params::Validate's validation in certain
 performance-critical situations.

You can turn it off for everything by setting the PERL_NO_VALIDATION
environment variable to true.  There's no way to turn it off and on at
runtime currently, though this could be added.

 So, what does everyone else think of the object creation performance
 situation?  Is it simply good enough to be 3x faster than
 Date::Manip::ParseDate()?  Are there any obvious areas that I should
 consider before I start mucking around with DateTime::new()?

Considering that up til now my concern has been primarily on getting
things correct, I wouldn't worry about it.  There are definitely some big
performance improvements possible.  One possibility is to move the leap
second bits into the DateTime XS code, which should help a lot.  The
timezone stuff can also benefit from being rewritten as XS, but that won't
help the particular cases you benchmarked, since the UTC and floating time
zones are quite fast already.


-dave

/*===
House Absolute Consulting
www.houseabsolute.com
===*/