https://issues.apache.org/bugzilla/show_bug.cgi?id=48425
Summary: DateUtil.isCellDateFormatted() method is slow
Product: POI
Version: 3.6
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: POI Overall
AssignedTo: [email protected]
ReportedBy: [email protected]
I have done some performance testing for code reading data from large
spreadsheets using POI. In this use case, I found that half of the CPU time
was spent in a single method in POI: DateUtil.isCellDateFormatted(cell). We
call this method every time we extract a value from a cell in order to
correctly create Date objects when cells contain dates.
Most of the time in this method is spent in DateUtil.isADateFormat(). That
method is very slow: it performs seven regular expression substitutions on
the formatString parameter and one additional regex match. None of the
regexes are precompiled, so all of them are recompiled on every call to
this method.
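
To illustrate why this is expensive: the String Javadoc documents
String.replaceAll() and String.matches() as equivalent to compiling a fresh
Pattern on each call. The sketch below only spells out that equivalence; the
class and method names are hypothetical and are not part of POI.

```java
import java.util.regex.Pattern;

// Hypothetical demo class (not part of POI): spells out what the String
// convenience methods do on every single call.
class PerCallCompileDemo {

    // Same result as input.replaceAll(regex, replacement): the regex is
    // compiled into a new Pattern each time this runs.
    static String replaceAllExpanded(String input, String regex, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    // Same result as input.matches(regex): again a fresh compile per call.
    static boolean matchesExpanded(String input, String regex) {
        return Pattern.matches(regex, input);
    }
}
```

With eight such calls per cell, that is eight pattern compilations for every
cell value read.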
I would suggest replacing the first five regexes with calls to a string
substitution method that doesn't require regexes, as they are simple literal
replacements. For the remaining three regexes, I would suggest precompiling
them (for example as static java.util.regex.Pattern instances) instead of
calling String.replaceAll() and String.matches() each time.
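
A minimal sketch of this suggestion follows. It is not POI's actual code: the
class DateFormatCheck, the helper replaceLiteral(), the method
looksLikeDateFormat(), and the specific literals and patterns are all
assumptions chosen for illustration, so their number does not match the
counts given above.

```java
import java.util.regex.Pattern;

// Hypothetical helper illustrating the suggestion; not POI's actual code.
class DateFormatCheck {

    // Illustrative patterns (assumed, not copied from POI), compiled once
    // when the class is loaded and then reused on every call.
    private static final Pattern LOCALE_PREFIX = Pattern.compile("^\\[\\$-.*?\\]");
    private static final Pattern COLOUR_PREFIX = Pattern.compile("^\\[[a-zA-Z]+\\]");
    private static final Pattern DATE_CHARS =
            Pattern.compile("^[yYmMdDhHsS\\-/,. :]+[ampAMP/]*$");

    // Literal (non-regex) replacement of every occurrence of target.
    static String replaceLiteral(String text, String target, String replacement) {
        if (target.isEmpty()) {
            return text;
        }
        int idx = text.indexOf(target);
        if (idx < 0) {
            return text; // common case: nothing to replace, nothing allocated
        }
        StringBuilder sb = new StringBuilder(text.length());
        int start = 0;
        while (idx >= 0) {
            sb.append(text, start, idx).append(replacement);
            start = idx + target.length();
            idx = text.indexOf(target, start);
        }
        return sb.append(text, start, text.length()).toString();
    }

    static boolean looksLikeDateFormat(String formatString) {
        if (formatString == null || formatString.isEmpty()) {
            return false;
        }
        String fs = formatString;
        // Simple replacements: plain string substitution, no regex engine.
        fs = replaceLiteral(fs, "\\-", "-");
        fs = replaceLiteral(fs, "\\,", ",");
        fs = replaceLiteral(fs, "\\ ", " ");
        fs = replaceLiteral(fs, ";@", "");
        // Genuine regexes: reuse the precompiled Pattern instances.
        fs = LOCALE_PREFIX.matcher(fs).replaceAll("");
        fs = COLOUR_PREFIX.matcher(fs).replaceAll("");
        return DATE_CHARS.matcher(fs).matches();
    }
}
```

Pattern instances are immutable and safe to share between threads, so holding
them in static final fields should not introduce synchronization concerns;
only the Matcher objects are created per call.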