[GitHub] [flink] stevenzwu commented on a change in pull request #17111: [FLINK-24064][connector/common] HybridSource restore from savepoint

GitBox Thu, 02 Sep 2021 12:31:50 -0700


stevenzwu commented on a change in pull request #17111:
URL: https://github.com/apache/flink/pull/17111#discussion_r701365690




##########
File path: flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/hybrid/HybridSourceEnumeratorStateSerializer.java
##########
@@ -54,12 +44,9 @@ public int getVersion() {
         try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
                 DataOutputStream out = new DataOutputStream(baos)) {
             out.writeInt(enumState.getCurrentSourceIndex());
-            SimpleVersionedSerializer<Object> serializer =
-                    serializerOf(enumState.getCurrentSourceIndex());
-            out.writeInt(serializer.getVersion());
-            byte[] enumStateBytes = serializer.serialize(enumState.getWrappedState());
-            out.writeInt(enumStateBytes.length);
-            out.write(enumStateBytes);
+            out.writeInt(enumState.wrappedStateSerializerVersion());
+            out.writeInt(enumState.getWrappedState().length);

Review comment:
       Each Iceberg split contains data files, delete files (for upserts), and a 
schema string. Each data file also carries stats for every column. If the 
table is wide (many columns), each split may exceed 10 KB.
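
For illustration only, a minimal sketch of the length-prefixed layout shown in the hunk above, assuming the wrapped state is already a serialized byte array. The class `LengthPrefixedFraming` and its methods are hypothetical names, not part of Flink. It only shows that a 4-byte int length prefix round-trips a blob larger than 10 KB.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper (not Flink API): frames an already-serialized state blob. */
final class LengthPrefixedFraming {

    /** Writes source index, serializer version, blob length, then the blob itself. */
    static byte[] write(int sourceIndex, int serializerVersion, byte[] wrappedState) {
        try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
                DataOutputStream out = new DataOutputStream(baos)) {
            out.writeInt(sourceIndex);
            out.writeInt(serializerVersion);
            out.writeInt(wrappedState.length); // 4-byte length prefix
            out.write(wrappedState);
            out.flush();
            return baos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Skips the two int headers, reads the length prefix, and returns the blob. */
    static byte[] readWrappedState(byte[] framed) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(framed))) {
            in.readInt(); // source index (ignored in this sketch)
            in.readInt(); // serializer version (ignored in this sketch)
            int length = in.readInt();
            byte[] state = new byte[length];
            in.readFully(state); // fails with EOFException if the blob is truncated
            return state;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a 12 KB blob, `write(0, 1, blob)` produces the blob plus 12 bytes of header (three ints), and `readWrappedState` returns the original bytes.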




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
