Hi,

On 7/19/2017 5:49 AM, Peter Levart wrote:
Hi Claes,


On 07/17/2017 02:16 PM, Claes Redestad wrote:
Hi Peter!

On 2017-07-15 14:08, Peter Levart wrote:

It seems that interning signature(s) is important for correctness (for example, in ObjectOutputStream.writeTypeString(str) the 'str' is used to look up a handle, so that handles are put into the stream instead of the type signature(s) for multiple references to the same type). Looking up objects in the handles table is based on identity comparison.
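To illustrate the identity-based lookup described above, here is a minimal sketch. It is not the actual ObjectOutputStream code: the class HandleTableSketch and the method lookupOrAssign are hypothetical names, and an IdentityHashMap stands in for the stream's handle table.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for a stream handle table; not the JDK implementation.
final class HandleTableSketch {
    // IdentityHashMap compares keys with ==, not equals().
    private final Map<Object, Integer> handles = new IdentityHashMap<>();

    // Returns the existing handle for obj, or -1 after assigning a new handle.
    int lookupOrAssign(Object obj) {
        Integer handle = handles.get(obj);
        if (handle != null) {
            return handle;
        }
        handles.put(obj, handles.size());
        return -1;
    }

    public static void main(String[] args) {
        // Two equal but distinct String objects.
        String a = new String("Ljava/lang/String;");
        String b = new String("Ljava/lang/String;");

        HandleTableSketch plain = new HandleTableSketch();
        System.out.println(plain.lookupOrAssign(a));
        // b is a different object, so no handle is found for it.
        System.out.println(plain.lookupOrAssign(b));

        HandleTableSketch interned = new HandleTableSketch();
        System.out.println(interned.lookupOrAssign(a.intern()));
        // intern() returns the same canonical object, so the handle is reused.
        System.out.println(interned.lookupOrAssign(b.intern()));
    }
}
```

With non-interned strings, each equal signature gets its own entry in the table; with interned strings, the second lookup finds the handle assigned by the first.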

Yes, interned signatures are important for correctness (and performance?)
of the current serialization implementation.


But there might be a way to obtain a singleton signature String per type and still profit. By adding a field to java.lang.Class and caching the JVM signature there. This would also be a useful public method, don't you think?

I have a nagging feeling that we should be careful about leaking
implementation details about the underlying VM through public APIs,
since making changes to various specifications is hard enough as it is.

You're right. There's already more than enough "implementation details" that pertain to JVM exposed through reflection API which was supposed to represent Java - the language - view of the world. JVM type signatures just happen to be used in serialization too, which is another implementation detail which might change in the future (with value types etc), so it's better to keep it private.
right



Out of 191 ObjectStreamField constructions I found in JDK sources, there are only 39 distinct field types involved, so the number of intern() calls is reduced by a factor of ~5. There's no need to cache the signature in ObjectStreamField(s) this way any more, but there must still be a single final field for ObjectStreamField(s) constructed with explicit signature(s).

Here's what this looks like in code:

http://cr.openjdk.java.net/~plevart/misc/Class.getJvmTypeSignature/webrev.01/

Could this be done as a ClassValue instead of another field on Class? My
guess is only a small number of classes in any given app will be directly
involved in serialization, so growing Class seems to be a pessimization.
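A rough sketch of what the ClassValue approach could look like is shown here. It is not taken from the webrev: SignatureCache, signatureOf and computeSignature are hypothetical names, and the signature computation is a simplified assumption of the usual JVM descriptor rules.

```java
// Hypothetical per-class signature cache; not the code from the webrev.
final class SignatureCache {
    // The value is computed lazily, on first use for a given Class,
    // so classes never involved in serialization pay nothing.
    private static final ClassValue<String> SIGNATURES = new ClassValue<String>() {
        @Override
        protected String computeValue(Class<?> type) {
            // intern() runs once per class rather than once per field.
            return computeSignature(type).intern();
        }
    };

    static String signatureOf(Class<?> type) {
        return SIGNATURES.get(type);
    }

    // Simplified JVM type descriptor computation (an assumption for this sketch).
    private static String computeSignature(Class<?> type) {
        if (type.isArray()) {
            return "[" + computeSignature(type.getComponentType());
        }
        if (type.isPrimitive()) {
            if (type == int.class) return "I";
            if (type == long.class) return "J";
            if (type == boolean.class) return "Z";
            if (type == byte.class) return "B";
            if (type == char.class) return "C";
            if (type == short.class) return "S";
            if (type == float.class) return "F";
            if (type == double.class) return "D";
            return "V"; // void is the only remaining primitive type
        }
        return "L" + type.getName().replace('.', '/') + ";";
    }

    public static void main(String[] args) {
        System.out.println(signatureOf(String.class));
        System.out.println(signatureOf(int[][].class));
        // Repeated lookups return the same cached String instance.
        System.out.println(signatureOf(String.class) == signatureOf(String.class));
    }
}
```

This keeps java.lang.Class unchanged and needs no new public API, at the cost of a ClassValue lookup on each call.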

It could be, yes. We are trying to solve two issues here. One is the original 8184603, which is concerned with start-up overhead, and your proposal is the right solution for it as it only delays the work to when/if it is needed. The other issue is the overhead of repeated signature interning. These calls are not frequent enough to matter for cases that just create a bunch of ObjectStreamField instances assigned to static final fields, but I suspect they are more frequent when signatures are being deserialized from a stream. At that time, we don't yet have a Class object to go with the signature and to use as a caching anchor, but we still want to keep the invariant of OSF signature(s) being interned Strings. Whether they really need to be interned right away in that case is a question that needs more study of the deserialization code.
The package-private ObjectStreamField constructor(name, signature, unshare) is used only to create temporary OSF objects during deserialization. Those OSF instances are compared with the OSF instances created from the local class to determine common fields.
The signature.intern() in that constructor is not significant.

The signature.intern() in the public constructor is not important for correctness;
comparisons between signatures use equals.

It may have a slight performance or size impact on the object streams because otherwise
equivalent signatures will be serialized as separate strings.



What do you think?

I wonder what workloads actually see a bottleneck in these String.intern
calls, and *which* String.intern calls we are bottlenecking on in these
workloads. There's still a couple of constructors here that won't see a
speedup.

Right. I suspect the intern() call bottleneck is most problematic when deserializing. All other cases could be optimized by caching the signature on the appropriate Class object(s) via ClassValue for example.
I'd remove the intern in the package-private constructor.


I think we need more data to ensure this is actually worthwhile to pursue,
or whether there are other optimizations on a higher level that could
be done.

Ok, we agree that no new public API for JVM signatures is desired and that the intern() call bottleneck when deserializing should be researched more deeply. I agree that your solution is currently the best for the original issue.
Ditto.

Roger


