Re: RFR: 8247532: Records deserialization is slow

Peter Levart Sun, 21 Jun 2020 05:52:48 -0700

Hi Chris,


Good work on SerialByteStreamBuilder. A very useful tool.

What I noticed in the tests is that you usually follow the followingpattern:


- create a record instance r

- serialize it to bytes and deserialize it back to instance deser1

- create a stream with SerialByteStreamBuilder and deserialize it toinstance deser2


- compare deser1 with deser2

Now if there was a bug in deserialization logic it would be possible forboth deser1 and deser2 to be equal but still wrong. Adding anotherassert to compare r with deser1 in each test could catch such bug.



In the following test:


 245     public void testArrays() throws Exception {
 246         out.println("\n---");
 247         {

248 record IntArray(int[] ints, long[] longs) implementsSerializable { } 249 IntArray r = new IntArray(new int[] { 5, 4, 3,2, 1 },new long[] { 9L });

 250             IntArray deser1 = deserialize(serialize(r));
 251
 252             byte[] builderBytes = SerialByteStreamBuilder
 253                     .newBuilder(IntArray.class.getName())

254 .addField("ints", String[].class, new int[] {5, 4, 3,2, 1 }) 255 .addField("longs", String[].class, new long[]{ 9L })

 256                     .build();
 257
 258             IntArray deser2 = deserialize(builderBytes);
 259             assertEquals(deser2.ints(), deser1.ints());
 260             assertEquals(deser2.longs(), deser1.longs());
 261         }

...in lines 254, 255, you specify wrong types of fields (making themethod generic with a type parameter could prevent such mistakes). Iwonder why the test passes. In both variants (with or without the patch)we have this check in java.io.ObjectStreamClass.RecordSupport:



            for (int i = 0; i < fields.length; i++) {
                ObjectStreamField f = fields[i];
                String fName = f.getName();
                if (!fName.equals(pName))
                    continue;

                Class<?> fType = f.getField().getType();

                if (!pType.isAssignableFrom(fType))

                    throw new InternalError(fName + " unassignable, pType:" + pType + 
", fType:" + fType);

... which is actually checking types of reflectedjava.lang.reflact.Field(s) with types of reflectedjava.lang.reflect.RecordComponent(s) that match in name which alwaysmatch when record type is correctly constructed. And later when valuesare deserialized and assigned to constructor parameters, we have theright types of values in this "unusual" stream. In other words, changingthe field type from one reference type to another is always a compatiblechange as far as stream format is concerned, deserialization thensucceeds if the value in the stream is of correct type for the localtype of field. Theoretically the change can be such that the referencetypes are unrelated, because such types have one common value: null. Inthis example the stream was unusual. The types were unrelated, but thestream had the correct type of value != null.

Anyway, maybe the test should also contain a case where the type offield does change compatibly (for example, Integer field serialized,deserialized into Number field), and one or more where it changesincompatibly. For the incompatible changes, I think some exceptionswould be emitted by MethodHandle adapters that try to adapt incompatiblereference types, other incompatible primitive field type changes wouldbe caught by java.io.ObjectStreamClass.matchFields() method...

I think that the following method in SerialByteStreamBuilder is alsoincorrect:

363 private void writeObjectFieldDesc(DataOutputStream out)throws IOException { 364 for (Map.Entry<NameAndType,Object> entry :objectFields.entrySet()) {

 365                 Class<?> cl = entry.getKey().type();
 366                 assert !cl.isPrimitive();
 367                 // obj_typecode
 368                 if (cl.isArray()) {
 369                     out.writeByte('[');
 370                 } else {
 371                     out.writeByte('L');
 372                 }
 373                 writeUTF(out, entry.getKey().name());
 374                 out.writeByte(TC_STRING);

375 writeUTF(out, "L" + cl.getName().replace('.', '/')+ ";");

 376
 377             }
 378         }


Line 375 should probably be:


            writeUTF(out,

(cl.isArray() ? cl.getName() : "L" + cl.getName()+ ";")

                         .replace('.', '/'));

So I took the liberty and incorporated your work into my patch (bothwill be authors of the patch) and addressed above concerns. The modifiedtest does not pass on JDK14. In testIncompatibleRefFieldTypeChange JDK14 throws ClassCastException (because it is thrown from the binding ofarguments to method handle) while with the patched JDK16 it throwsInvalidObjectException, because ClassCastException is thrown frominvokeExact, which is then wrapped with InvalidObjectException and Ithink is more correct:



http://cr.openjdk.java.net/~plevart/jdk-dev/RecordsDeserialization/webrev.04/


So this patch actually fixes a little bug too.


Regards, Peter


6/17/20 4:13 PM, Chris Hegarty wrote:

Peter,

On 17 Jun 2020, at 09:23, Chris Hegarty <chris.hega...@oracle.com> wrote:

On 17 Jun 2020, at 06:44, Peter Levart <peter.lev...@gmail.com> wrote:

..

If there is a way for a test to compile several versions of the same (record) 
class and then produce byte streams with it, then we could simulate various 
class-evolution cases (field missing, surplus field, change of field type, 
etc...) without crafting the bytestream by hand. WDYT?

I have a better idea. The test could contain several different classes with different 
names that would be used to generated variations of serialized stream. Such streams could 
then be "patched" so they would contain the common target class name and then 
used to (attempt to) deserialize the target class. I'll try to devise such test…

I think that could work. I’ll mock something up too, just for comparison 
purposes.


Here is an initial version at a test that can be expanded to cover more of the 
stream-field shapes of serializable records.

I quickly backed away from generating the byte streams using OOS alone, since 
it enforces strict ordering of the fields in the stream, beyond that of 
primitives followed by non-primitives. And I want to be able to do things like, 
for example, record Point(int x, int y) - stream of x:1 y:2, or y:2 x:1 - which 
will verify the nominality aspect of the caching implementation. So I opted for 
a basic byte-stream-builder approach.

https://cr.openjdk.java.net/~chegar/serialFieldTest_webrev/test/jdk/java/io/Serializable/records/DifferentStreamFieldsTest.java.html

I also would like a builder of serial byte streams anyway, as it will be useful 
for things beyond this change. Maybe it could even be expanded upon and used as 
a test library, at some future point. The above implementation can probably be 
easily broken if pushed hard with larger graphs of objects, but it should be 
good enough for what we need here.

-Chris.

Re: RFR: 8247532: Records deserialization is slow

Reply via email to