I was optimizing a log-parsing script today and found DateTime::Format::Strptime to be the bottleneck. Out of curiosity I wrote a simple benchmark. The code is here: http://pastebin.com/PU8nXGPW
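For reference, a comparison like mine can be sketched with the core Benchmark module; this is a minimal stand-in for the pastebin code, not the actual script, and it assumes the listed CPAN modules are installed (the `%N` token for fractional seconds is a Strptime-specific assumption):

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use DateTime::Format::Strptime;
use DateTime::Format::ISO8601;
use Date::Parse qw(str2time);

my $ts  = '2013-02-11 14:15:16.1234';           # sample log timestamp
my $iso = '2013-02-11T14:15:16.1234';           # same moment, ISO 8601 form

my $strp = DateTime::Format::Strptime->new(
    pattern  => '%Y-%m-%d %H:%M:%S.%N',         # %N = fractional seconds
    on_error => 'croak',
);

# Run each parser for ~2 CPU seconds and print a relative-speed table.
cmpthese(-2, {
    strptime => sub { $strp->parse_datetime($ts) },
    iso8601  => sub { DateTime::Format::ISO8601->parse_datetime($iso) },
    str2time => sub { str2time($ts) },
});
```

Note that `str2time` returns a plain epoch value rather than a DateTime object, which is part of why Date::Parse-based approaches come out so much faster.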
The results are interesting, as they show fairly noticeable differences: http://pastebin.com/S4rt6bYd (run on Ubuntu 12.04 with perl 5.14.2 and the Ubuntu-packaged DateTime modules). While it is, well, natural that "try many approaches" parsers like Natural or Flexible are slow (still, DateParse does much better), I find it confusing that Strptime is so slow. The ISO8601 parser also has a fairly strict syntax to handle, so I expected it to perform better... Any insights? And which parsing method would you recommend for optimal performance?

~~~~ Sidenote ~~~~~

DateTime::Format::Builder does not give any easy way to treat the below-second part with float semantics (treat .12 as 120 milliseconds, .1347 as 134700 microseconds, etc.), despite that being the most natural and, arguably, only sensible semantics. As you can see from my code and its results, DateTime::Format::Flexible falls into this trap (it treats those digits as nanoseconds even if there are fewer than 9 of them), while my hand-made builders require postprocessing to clean this field up. Am I missing something? Is there a way to handle this better?
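The postprocessing I mean is essentially this: interpret the captured digits as a decimal fraction and right-pad to 9 digits to get nanoseconds. A small sketch with a hypothetical helper name:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: convert the digits captured after the decimal
# point into nanoseconds with float semantics, so ".12" means 120 ms
# (120_000_000 ns) and ".1347" means 134700 us (134_700_000 ns).
sub frac_to_nanoseconds {
    my ($frac) = @_;                    # e.g. "12" or "1347"
    $frac = substr $frac, 0, 9;         # drop sub-nanosecond digits, if any
    return 0 + ($frac . '0' x (9 - length $frac));   # right-pad to 9 digits
}

print frac_to_nanoseconds('12'),   "\n";   # 120000000
print frac_to_nanoseconds('1347'), "\n";   # 134700000
```

In a Builder-based parser this could run in a `postprocess` callback that rewrites the `nanosecond` field before the DateTime object is constructed.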