Hi Daniel,
On 02/23/2015 10:50 AM, Daniel Fuchs wrote:
Hi Peter,
On 22/02/15 21:21, Peter Levart wrote:
Hi,
I noticed internal JDK class java.util.zip.ZipUtils uses deprecated
java.util.Date API for implementing two methods for converting DOS to
Java time and back. I thought I'd try translating them to use new
java.time API. Translation was straightforward, but I noticed that new
implementations are not on par with speed to old java.util.Date
versions. Here's a JMH benchmark implementing those two conversion
methods with java.util.Date and java.time.ZonedDateTime APIs:
http://cr.openjdk.java.net/~plevart/jdk9-dev/ZoneId.systemDefault/ZipUtilsTest.java
The results show the following:
Benchmark Mode Samples Score Score
error Units
j.t.ZipUtilsTest.dosToJavaTime_Date avgt 5 101.890 17.570
ns/op
j.t.ZipUtilsTest.dosToJavaTime_ZDT avgt 5 137.587 13.533
ns/op
j.t.ZipUtilsTest.javaToDosTime_Date avgt 5 67.114 10.382
ns/op
j.t.ZipUtilsTest.javaToDosTime_ZDT avgt 5 143.739 15.243
ns/op
Quick sampling with jvisualvm shows that a substantial time is spent
repeatedly obtaining ZoneId.systemDefault() instance. I checked the
implementation and came up with the following optimization:
http://cr.openjdk.java.net/~plevart/jdk9-dev/ZoneId.systemDefault/webrev.01/
It will be good to hear what Stephen will have to say about this.
Also I see that Claes is suggesting using LocalDateTime.
I'll compare LocalDateTime based approach too, but the conversion logic
needs to take into account ZoneId.systemDefault(). Most zip utilities
interpret DOS time in zip entries (when it is not specified in extra
field data) in the local time zone and so does ZipEntry. This is
unfortunate, as interpreting it in fixed (GMT) time zone would be more
appropriate and less troublesome. So speeding-up ZoneId.systemDefault()
is needed anyway.
If there's no undesirable side effect then caching zoneId
inside TimeZone looks reasonable to me - however I'm not an expert
of the field, and I got to learn that time/date can be more complex
than what I first thought ;-)
One thing I noticed is that there are two kinds of ZoneRulesProvider(s):
those that are static and allow their rules to be cached and the dynamic
ones, that can provide dynamic view on rules and therefore don't allow
caching of rules. But this aspect is encapsulated in ZoneRegion class
(an implementation of ZoneId). So I think an instance to ZoneRegion (or
any ZoneId) can be cached for indefinite time as long as it's id is the
one that we are looking for.
The use of SharedSecrets seems rather ugly though - and its only
purpose seems to avoid a clone(). I have to wonder whether there
is a significant performance gain in this. I mean - if you simply
cached zoneId in TimeZone and called TimeZone.getDefault() directly,
how worse would that be?
Caching of ZoneId in the defensive clone of defualt TimeZone object
would not have a desired effect as the clone is thrown away in each call
to ZoneId.systemDefault(). We must get hold of the TimeZone instance
that is cached. Another way to do that would be to take the route of
reflection objects (Field, Method, Constructor) and put a pointer to
'root' TimeZone instance in the clones, so it would be accessible
through the clone.
Finally, I also wonder whether a better idea would be to add
a variant of dosToJavaTime/javaToDosTime that would take an
additional zoneId as parameter.
For ZipEntry it would not make much sense as there's only one time value
that has to be converted only once in each ZipEntry.
You probably don't want to support changing the time zone in
the middle of a Zip. Do you?
Ok, in that case the ZoneId would have to be attached to ZipFile and
passed to each ZipEntry. I agree that per ZipFile default zone atomicity
is a desired property. Speeding up ZoneId.systemDefault() retrieval is
also generally desirable, don't you think?
Regards, Peter
best regards,
-- daniel
TimeZone is a mutable object and has to be defensively cloned when
TimeZone.getDefault() is invoked. So there's no point in caching a
ZoneId instance inside TimeZone instance if we cache it on a clone that
is thrown away each time ZoneId.systemDefault() is invoked. I use
SharedSecrets to access the uncloned TimeZone.defaultTimeZone instance
where caching of ZoneId pays of.
I think that it was never meant to change TimeZone's ID
(TimeZone.setID()) after such instance was put into operation (for
example installed as default time zone with TimeZone.setDefault()). Such
use is unintended and buggy. So I also changed the implementation of
TimeZone.setDefault() to save a defensive copy of TimeZone object as
default time zone instead of a reference to it.
With this patch, the DOS/Java time conversion benchmark shows an
improvement:
Benchmark Mode Samples Score Score
error Units
j.t.ZipUtilsTest.dosToJavaTime_Date avgt 5 97.986 18.379
ns/op
j.t.ZipUtilsTest.dosToJavaTime_ZDT avgt 5 85.010 10.863
ns/op
j.t.ZipUtilsTest.javaToDosTime_Date avgt 5 71.073 25.408
ns/op
j.t.ZipUtilsTest.javaToDosTime_ZDT avgt 5 95.298 17.620
ns/op
Is this patch correct or did I miss something from the internals of
java.time API that does not permit such caching?
Regards, Peter