This is an automated email from the ASF dual-hosted git repository.
wangweipeng pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-fury.git
The following commit(s) were added to refs/heads/main by this push:
new ffc500a2 feat(java): enable deserializeUnexistedClass by default (#1575)
ffc500a2 is described below
commit ffc500a2fe8ea5af9f0b5aa7c49e7a3709c8e5f8
Author: Shawn Yang <[email protected]>
AuthorDate: Fri Apr 26 07:57:21 2024 +0800
feat(java): enable deserializeUnexistedClass by default (#1575)
---
docs/guide/java_serialization_guide.md | 211 ++++++++++++---------
.../java/org/apache/fury/config/FuryBuilder.java | 9 +-
2 files changed, 133 insertions(+), 87 deletions(-)
diff --git a/docs/guide/java_serialization_guide.md b/docs/guide/java_serialization_guide.md
index 8f076765..a86c39e4 100644
--- a/docs/guide/java_serialization_guide.md
+++ b/docs/guide/java_serialization_guide.md
@@ -96,24 +96,24 @@ public class Example {
## FuryBuilder options
-| Option Name | Description
[...]
-|-------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[...]
-| `timeRefIgnored` | Whether to ignore reference tracking
of all time types registered in `TimeSerializers` and subclasses of those types
when ref tracking is enabled. If ignored, ref tracking of every time type can
be enabled by invoking `Fury#registerSerializer(Class, Serializer)`. For
example, `fury.registerSerializer(Date.class, new DateSerializer(fury, true))`.
Note that enabling ref tracking should happen before serializer codegen of any
types which contain time [...]
-| `compressInt` | Enables or disables int compression
for smaller size.
[...]
-| `compressLong` | Enables or disables long compression
for smaller size.
[...]
-| `compressString` | Enables or disables string compression
for smaller size.
[...]
-| `classLoader` | The classloader should not be updated;
Fury caches class metadata. Use `LoaderBinding` or `ThreadSafeFury` for
classloader updates.
[...]
-| `compatibleMode` | Type forward/backward compatibility
config. Also Related to `checkClassVersion` config. `SCHEMA_CONSISTENT`: Class
schema must be consistent between serialization peer and deserialization peer.
`COMPATIBLE`: Class schema can be different between serialization peer and
deserialization peer. They can add/delete fields independently.
[...]
-| `checkClassVersion` | Determines whether to check the
consistency of the class schema. If enabled, Fury checks, writes, and checks
consistency using the `classVersionHash`. It will be automatically disabled
when `CompatibleMode#COMPATIBLE` is enabled. Disabling is not recommended
unless you can ensure the class won't evolve.
[...]
-| `checkJdkClassSerializable` | Enables or disables checking of
`Serializable` interface for classes under `java.*`. If a class under `java.*`
is not `Serializable`, Fury will throw an `UnsupportedOperationException`.
[...]
-| `registerGuavaTypes` | Whether to pre-register Guava types
such as `RegularImmutableMap`/`RegularImmutableList`. These types are not
public API, but seem pretty stable.
[...]
-| `requireClassRegistration` | Disabling may allow unknown classes to
be deserialized, potentially causing security risks.
[...]
-| `suppressClassRegistrationWarnings` | Whether to suppress class registration
warnings. The warnings can be used for security audit, but may be annoying,
this suppression will be enabled by default.
| `true`
[...]
-| `shareMetaContext` | Enables or disables meta share mode.
[...]
-| `deserializeUnexistedClass` | Enables or disables
deserialization/skipping of data for non-existent classes.
[...]
-| `codeGenEnabled` | Disabling may result in faster initial
serialization but slower subsequent serializations.
[...]
-| `asyncCompilationEnabled` | If enabled, serialization uses
interpreter mode first and switches to JIT serialization after async serializer
JIT for a class is finished.
[...]
-| `scalaOptimizationEnabled` | Enables or disables Scala-specific
serialization optimization.
[...]
+| Option Name | Description
[...]
+|-------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[...]
+| `timeRefIgnored` | Whether to ignore reference tracking
of all time types registered in `TimeSerializers` and subclasses of those types
when ref tracking is enabled. If ignored, ref tracking of every time type can
be enabled by invoking `Fury#registerSerializer(Class, Serializer)`. For
example, `fury.registerSerializer(Date.class, new DateSerializer(fury, true))`.
Note that enabling ref tracking should happen before serializer codegen of any
types which contain time [...]
+| `compressInt` | Enables or disables int compression
for smaller size.
[...]
+| `compressLong` | Enables or disables long compression
for smaller size.
[...]
+| `compressString` | Enables or disables string compression
for smaller size.
[...]
+| `classLoader` | The classloader should not be updated;
Fury caches class metadata. Use `LoaderBinding` or `ThreadSafeFury` for
classloader updates.
[...]
+| `compatibleMode` | Type forward/backward compatibility
config. Also Related to `checkClassVersion` config. `SCHEMA_CONSISTENT`: Class
schema must be consistent between serialization peer and deserialization peer.
`COMPATIBLE`: Class schema can be different between serialization peer and
deserialization peer. They can add/delete fields independently.
[...]
+| `checkClassVersion` | Determines whether to check the
consistency of the class schema. If enabled, Fury checks, writes, and checks
consistency using the `classVersionHash`. It will be automatically disabled
when `CompatibleMode#COMPATIBLE` is enabled. Disabling is not recommended
unless you can ensure the class won't evolve.
[...]
+| `checkJdkClassSerializable` | Enables or disables checking of
`Serializable` interface for classes under `java.*`. If a class under `java.*`
is not `Serializable`, Fury will throw an `UnsupportedOperationException`.
[...]
+| `registerGuavaTypes` | Whether to pre-register Guava types
such as `RegularImmutableMap`/`RegularImmutableList`. These types are not
public API, but seem pretty stable.
[...]
+| `requireClassRegistration` | Disabling may allow unknown classes to
be deserialized, potentially causing security risks.
[...]
+| `suppressClassRegistrationWarnings` | Whether to suppress class registration
warnings. The warnings can be used for security audit, but may be annoying,
this suppression will be enabled by default.
[...]
+| `shareMetaContext` | Enables or disables meta share mode.
[...]
+| `deserializeUnexistedClass` | Enables or disables
deserialization/skipping of data for non-existent classes.
[...]
+| `codeGenEnabled` | Disabling may result in faster initial
serialization but slower subsequent serializations.
[...]
+| `asyncCompilationEnabled` | If enabled, serialization uses
interpreter mode first and switches to JIT serialization after async serializer
JIT for a class is finished.
[...]
+| `scalaOptimizationEnabled` | Enables or disables Scala-specific
serialization optimization.
[...]
## Advanced Usage
@@ -122,7 +122,7 @@ public class Example {
Single thread fury:
```java
-Fury fury = Fury.builder()
+Fury fury=Fury.builder()
.withLanguage(Language.JAVA)
// enable reference tracking for shared/circular reference.
// Disable it will have better performance if no duplicate reference.
@@ -134,14 +134,14 @@ Fury fury = Fury.builder()
// enable async multi-threaded compilation.
.withAsyncCompilation(true)
.build();
- byte[] bytes = fury.serialize(object);
+ byte[]bytes=fury.serialize(object);
System.out.println(fury.deserialize(bytes));
```
Thread-safe fury:
```java
-ThreadSafeFury fury = Fury.builder()
+ThreadSafeFury fury=Fury.builder()
.withLanguage(Language.JAVA)
// enable reference tracking for shared/circular reference.
// Disable it will have better performance if no duplicate reference.
@@ -157,39 +157,52 @@ ThreadSafeFury fury = Fury.builder()
// enable async multi-threaded compilation.
.withAsyncCompilation(true)
.buildThreadSafeFury();
- byte[] bytes = fury.serialize(object);
+ byte[]bytes=fury.serialize(object);
System.out.println(fury.deserialize(bytes));
```
### Smaller size
+
`FuryBuilder#withIntCompressed`/`FuryBuilder#withLongCompressed` can be used to compress int/long for smaller size.
Normally compress int is enough.
Both compression are enabled by default, if the serialized is not important, for example, you use flatbuffers for
-serialization before, which doesn't compress anything, then you should disable compression. If your data are all numbers,
+serialization before, which doesn't compress anything, then you should disable compression. If your data are all
+numbers,
the compression may bring 80% performance regression.
-For int compression, fury use 1~5 bytes for encoding. First bit in every byte indicate whether has next byte. if first bit is set, then next byte will be read util first bit of next byte is unset.
+For int compression, fury use 1~5 bytes for encoding. First bit in every byte indicate whether has next byte. if first
+bit is set, then next byte will be read util first bit of next byte is unset.
For long compression, fury support two encoding:
+
- Fury SLI(Small long as int) Encoding (**used by default**):
- If long is in [-1073741824, 1073741823], encode as 4 bytes int: `| little-endian: ((int) value) << 1 |`
- Otherwise write as 9 bytes: `| 0b1 | little-endian 8bytes long |`
- Fury PVL(Progressive Variable-length Long) Encoding:
- - First bit in every byte indicate whether has next byte. if first bit is set, then next byte will be read util first bit of next byte is unset.
- - Negative number will be converted to positive number by ` (v << 1) ^ (v >> 63)` to reduce cost of small negative numbers.
-
-If a number are `long` type, it can't be represented by smaller bytes mostly, the compression won't get good enough result,
-not worthy compared to performance cost. Maybe you should try to disable long compression if you find it didn't bring much
+ - First bit in every byte indicate whether has next byte. if first bit is set, then next byte will be read util
+ first bit of next byte is unset.
+ - Negative number will be converted to positive number by ` (v << 1) ^ (v >> 63)` to reduce cost of small negative
+ numbers.
+
+If a number are `long` type, it can't be represented by smaller bytes mostly, the compression won't get good enough
+result,
+not worthy compared to performance cost. Maybe you should try to disable long compression if you find it didn't bring
+much
space savings.
### Implement a customized serializer
-In some cases, you may want to implement a serializer for your type, especially some class customize serialization by JDK
-writeObject/writeReplace/readObject/readResolve, which is very inefficient. For example, you don't want following `Foo#writeObject`
+
+In some cases, you may want to implement a serializer for your type, especially some class customize serialization by
+JDK
+writeObject/writeReplace/readObject/readResolve, which is very inefficient. For example, you don't want
+following `Foo#writeObject`
got invoked, you can take following `FooSerializer` as an example:
+
```java
class Foo {
public long f1;
+
private void writeObject(ObjectOutputStream s) throws IOException {
System.out.println(f1);
s.defaultWriteObject();
@@ -216,9 +229,10 @@ class FooSerializer extends Serializer<Foo> {
```
Register serializer:
+
```java
-Fury fury = getFury();
-fury.registerSerializer(Foo.class, new FooSerializer(fury));
+Fury fury=getFury();
+ fury.registerSerializer(Foo.class,new FooSerializer(fury));
```
### Security & Class Registration
@@ -239,31 +253,36 @@ Note that class registration order is important, serialization and deserializati
should have same registration order.
```java
-Fury fury = xxx;
-fury.register(SomeClass.class);
-fury.register(SomeClass1.class, 200);
+Fury fury=xxx;
+ fury.register(SomeClass.class);
+ fury.register(SomeClass1.class,200);
```
If you invoke `FuryBuilder#requireClassRegistration(false)` to disable class registration check,
-you can set `org.apache.fury.resolver.ClassChecker` by `ClassResolver#setClassChecker` to control which classes are allowed
+you can set `org.apache.fury.resolver.ClassChecker` by `ClassResolver#setClassChecker` to control which classes are
+allowed
for serialization. For example, you can allow classes started with `org.example.*` by:
+
```java
-Fury fury = xxx;
-fury.getClassResolver().setClassChecker((classResolver, className) -> className.startsWith("org.example."));
+Fury fury=xxx;
+ fury.getClassResolver().setClassChecker((classResolver,className)->className.startsWith("org.example."));
```
+
```java
-AllowListChecker checker = new AllowListChecker(AllowListChecker.CheckLevel.STRICT);
-ThreadSafeFury fury = new ThreadLocalFury(classLoader -> {
- Fury f = Fury.builder().requireClassRegistration(true).withClassLoader(classLoader).build();
+AllowListChecker checker=new AllowListChecker(AllowListChecker.CheckLevel.STRICT);
+ ThreadSafeFury fury=new ThreadLocalFury(classLoader->{
+ Fury f=Fury.builder().requireClassRegistration(true).withClassLoader(classLoader).build();
f.getClassResolver().setClassChecker(checker);
checker.addListener(f.getClassResolver());
return f;
-});
-checker.allowClass("org.example.*");
+ });
+ checker.allowClass("org.example.*");
```
-Fury also provided a `org.apache.fury.resolver.AllowListChecker` which is allowed/disallowed list based checker to simplify
-the customization of class check mechanism. You can use this checker or implement more sophisticated checker by yourself.
+Fury also provided a `org.apache.fury.resolver.AllowListChecker` which is allowed/disallowed list based checker to
+simplify
+the customization of class check mechanism. You can use this checker or implement more sophisticated checker by
+yourself.
### Serializer Registration
@@ -315,30 +334,30 @@ forward/backward compatibility automatically.
// // share meta across serialization.
// .withMetaContextShare(true)
// Not thread-safe fury.
-MetaContext context = xxx;
-fury.getSerializationContext().setMetaContext(context);
-byte[] bytes = fury.serialize(o);
+MetaContext context=xxx;
+ fury.getSerializationContext().setMetaContext(context);
+ byte[]bytes=fury.serialize(o);
// Not thread-safe fury.
-MetaContext context = xxx;
-fury.getSerializationContext().setMetaContext(context);
-fury.deserialize(bytes)
+ MetaContext context=xxx;
+ fury.getSerializationContext().setMetaContext(context);
+ fury.deserialize(bytes)
// Thread-safe fury
-fury.setClassLoader(beanA.getClass().getClassLoader());
-byte[] serialized = fury.execute(
- f -> {
- f.getSerializationContext().setMetaContext(context);
- return f.serialize(beanA);
+ fury.setClassLoader(beanA.getClass().getClassLoader());
+ byte[]serialized=fury.execute(
+ f->{
+ f.getSerializationContext().setMetaContext(context);
+ return f.serialize(beanA);
}
-);
+ );
// thread-safe fury
-fury.setClassLoader(beanA.getClass().getClassLoader());
-Object newObj = fury.execute(
- f -> {
- f.getSerializationContext().setMetaContext(context);
- return f.deserialize(serialized);
+ fury.setClassLoader(beanA.getClass().getClassLoader());
+ Object newObj=fury.execute(
+ f->{
+ f.getSerializationContext().setMetaContext(context);
+ return f.deserialize(serialized);
}
-);
+ );
```
### Deserialize non-existent classes
@@ -358,39 +377,48 @@ returned.
### JDK migration
If you use jdk serialization before, and you can't upgrade your client and server at the same time, which is common for
-online application. Fury provided an util method `org.apache.fury.serializer.JavaSerializer.serializedByJDK` to check whether
+online application. Fury provided an util method `org.apache.fury.serializer.JavaSerializer.serializedByJDK` to check
+whether
the binary are generated by jdk serialization, you use following pattern to make exiting serialization protocol-aware,
then upgrade serialization to fury in an async rolling-up way:
```java
-if (JavaSerializer.serializedByJDK(bytes)) {
- ObjectInputStream objectInputStream = xxx;
+if(JavaSerializer.serializedByJDK(bytes)){
+ ObjectInputStream objectInputStream=xxx;
return objectInputStream.readObject();
-} else {
+ }else{
return fury.deserialize(bytes);
-}
+ }
```
### Upgrade fury
-Currently binary compatibility is ensured for minor versions only. For example, if you are using fury`v0.2.0`, binary compatibility will
-be provided if you upgrade to fury `v0.2.1`. But if upgrade to fury `v0.4.1`, no binary compatibility are ensured.
-Most of the time there is no need to upgrade fury to newer major version, the current version is fast and compact enough,
+
+Currently binary compatibility is ensured for minor versions only. For example, if you are using fury`v0.2.0`, binary
+compatibility will
+be provided if you upgrade to fury `v0.2.1`. But if upgrade to fury `v0.4.1`, no binary compatibility are ensured.
+Most of the time there is no need to upgrade fury to newer major version, the current version is fast and compact
+enough,
and we provide some minor fix for recent older versions.
-But if you do want to upgrade fury for better performance and smaller size, you need to write fury version as header to serialized data
+But if you do want to upgrade fury for better performance and smaller size, you need to write fury version as header to
+serialized data
using code like following to keep binary compatibility:
+
```java
-MemoryBuffer buffer = xxx;
-buffer.writeVarInt32(2);
-fury.serialize(buffer, obj);
+MemoryBuffer buffer=xxx;
+ buffer.writeVarInt32(2);
+ fury.serialize(buffer,obj);
```
+
Then for deserialization, you need:
+
```java
-MemoryBuffer buffer = xxx;
-int furyVersion = buffer.readVarInt32()
-Fury fury = getFury(furyVersion);
-fury.deserialize(buffer);
+MemoryBuffer buffer=xxx;
+ int furyVersion=buffer.readVarInt32()
+ Fury fury=getFury(furyVersion);
+ fury.deserialize(buffer);
```
+
`getFury` is a method to load corresponding fury, you can shade and relocate different version of fury to different
package, and load fury by version.
@@ -398,19 +426,30 @@ If you upgrade fury by minor version, or you won't have data serialized by older
no need to `versioning` the data.
## Trouble shooting
+
### Class inconsistency and class version check
-If you create fury without setting `CompatibleMode` to `org.apache.fury.config.CompatibleMode.COMPATIBLE`, and you got a strange
+
+If you create fury without setting `CompatibleMode` to `org.apache.fury.config.CompatibleMode.COMPATIBLE`, and you got a
+strange
serialization error, it may be caused by class inconsistency between serialization peer and deserialization peer.
-In such cases, you can invoke `FuryBuilder#withClassVersionCheck` to create fury to validate it, if deserialization throws `org.apache.fury.exception.ClassNotCompatibleException`, it shows class are inconsistent, and you should create fury with
+In such cases, you can invoke `FuryBuilder#withClassVersionCheck` to create fury to validate it, if deserialization
+throws `org.apache.fury.exception.ClassNotCompatibleException`, it shows class are inconsistent, and you should create
+fury with
`FuryBuilder#withCompaibleMode(CompatibleMode.COMPATIBLE)`.
-`CompatibleMode.COMPATIBLE` has more performance and space cost, do not set it by default if your classes are always consistent between serialization and deserialization.
+`CompatibleMode.COMPATIBLE` has more performance and space cost, do not set it by default if your classes are always
+consistent between serialization and deserialization.
### Use wrong API for deserialization
-If you serialize an object by invoking `Fury#serialize`, you should invoke `Fury#deserialize` for deserialization instead of
+
+If you serialize an object by invoking `Fury#serialize`, you should invoke `Fury#deserialize` for deserialization
+instead of
`Fury#deserializeJavaObject`.
-If you serialize an object by invoking `Fury#serializeJavaObject`, you should invoke `Fury#deserializeJavaObject` for deserialization instead of `Fury#deserializeJavaObjectAndClass`/`Fury#deserialize`.
+If you serialize an object by invoking `Fury#serializeJavaObject`, you should invoke `Fury#deserializeJavaObject` for
+deserialization instead of `Fury#deserializeJavaObjectAndClass`/`Fury#deserialize`.
-If you serialize an object by invoking `Fury#serializeJavaObjectAndClass`, you should invoke `Fury#deserializeJavaObjectAndClass` for deserialization instead of `Fury#deserializeJavaObject`/`Fury#deserialize`.
+If you serialize an object by invoking `Fury#serializeJavaObjectAndClass`, you should
+invoke `Fury#deserializeJavaObjectAndClass` for deserialization instead
+of `Fury#deserializeJavaObject`/`Fury#deserialize`.
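
Note on the "Smaller size" hunk in the documentation diff above: it describes variable-length long encoding only in words (map signed values to unsigned with `(v << 1) ^ (v >> 63)`, then emit bytes whose first bit says whether another byte follows). A rough sketch of that general technique is given here, assuming the continuation flag is the high bit of each byte and using plain 7-bit groups. `VarLongSketch` and its methods are hypothetical names for illustration; this is not Fury's implementation, and its exact byte layout may differ.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only; not part of Fury.
final class VarLongSketch {
  // ZigZag: small negative numbers map to small unsigned numbers.
  static long zigZagEncode(long v) {
    return (v << 1) ^ (v >> 63);
  }

  static long zigZagDecode(long z) {
    return (z >>> 1) ^ -(z & 1);
  }

  // Writes 7 payload bits per byte; the high bit is set while more bytes follow.
  static byte[] writeVarLong(long value) {
    long z = zigZagEncode(value);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    while ((z & ~0x7FL) != 0) {
      out.write((int) ((z & 0x7F) | 0x80));
      z >>>= 7;
    }
    out.write((int) z);
    return out.toByteArray();
  }

  // Reads bytes until one with the high bit unset, then undoes the ZigZag mapping.
  static long readVarLong(byte[] bytes) {
    long z = 0;
    int shift = 0;
    for (byte b : bytes) {
      z |= (long) (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        break;
      }
      shift += 7;
    }
    return zigZagDecode(z);
  }

  public static void main(String[] args) {
    System.out.println(writeVarLong(-1L).length); // 1
    System.out.println(writeVarLong(64L).length); // 2
    System.out.println(readVarLong(writeVarLong(Long.MIN_VALUE)) == Long.MIN_VALUE); // true
  }
}
```

The sketch shows why values near zero, positive or negative, stay short, while values that use most of the 64 bits grow past 8 bytes, which matches the guide's advice to disable long compression when it brings little space saving.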
diff --git a/java/fury-core/src/main/java/org/apache/fury/config/FuryBuilder.java b/java/fury-core/src/main/java/org/apache/fury/config/FuryBuilder.java
index 05e30d7b..264c23c8 100644
--- a/java/fury-core/src/main/java/org/apache/fury/config/FuryBuilder.java
+++ b/java/fury-core/src/main/java/org/apache/fury/config/FuryBuilder.java
@@ -70,7 +70,7 @@ public final class FuryBuilder {
boolean requireClassRegistration = true;
boolean shareMetaContext = false;
boolean codeGenEnabled = true;
- boolean deserializeUnexistedClass = false;
+ Boolean deserializeUnexistedClass;
boolean asyncCompilationEnabled = false;
boolean registerGuavaTypes = true;
boolean scalaOptimizationEnabled = false;
@@ -293,6 +293,13 @@ public final class FuryBuilder {
}
if (compatibleMode == CompatibleMode.COMPATIBLE) {
checkClassVersion = false;
+ if (deserializeUnexistedClass == null) {
+ deserializeUnexistedClass = true;
+ }
+ } else {
+ if (deserializeUnexistedClass == null) {
+ deserializeUnexistedClass = false;
+ }
}
if (!requireClassRegistration) {
LOG.warn(
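
Note on the FuryBuilder hunk above: it turns `deserializeUnexistedClass` into a nullable `Boolean`, so an unset value can be told apart from an explicit choice, and fills in the default from the compatible mode. A minimal sketch of that resolution rule follows, written as a stand-alone helper. `UnexistedClassDefault` and `resolve` are hypothetical names, and the nested enum is a local stand-in for `org.apache.fury.config.CompatibleMode`, not the real class.

```java
// Hypothetical stand-alone sketch of the default-resolution rule; not the real FuryBuilder.
final class UnexistedClassDefault {
  // Local stand-in for org.apache.fury.config.CompatibleMode.
  enum CompatibleMode { SCHEMA_CONSISTENT, COMPATIBLE }

  // An explicit setting always wins; an unset (null) value follows the compatible mode.
  static boolean resolve(Boolean explicit, CompatibleMode mode) {
    if (explicit != null) {
      return explicit;
    }
    return mode == CompatibleMode.COMPATIBLE;
  }

  public static void main(String[] args) {
    System.out.println(resolve(null, CompatibleMode.COMPATIBLE));        // true
    System.out.println(resolve(null, CompatibleMode.SCHEMA_CONSISTENT)); // false
    System.out.println(resolve(false, CompatibleMode.COMPATIBLE));       // false
  }
}
```

Under this reading, the option is only "enabled by default" when the compatible mode is `COMPATIBLE`; in schema-consistent mode an unset value still resolves to `false`, and an explicit setting is respected in either mode.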