alexeykudinkin commented on code in PR #6977:
URL: https://github.com/apache/hudi/pull/6977#discussion_r998458384
##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/commmon/model/HoodieSparkRecord.java:
##########
@@ -70,51 +74,67 @@
* need to be updated (ie serving as an overlay layer on top of [[UnsafeRow]])</li>
* </ul>
*
-
*/
-public class HoodieSparkRecord extends HoodieRecord<InternalRow> {
+public class HoodieSparkRecord extends HoodieRecord<InternalRow> implements KryoSerializable {
/**
* Record copy operation to avoid double copying. InternalRow do not need to copy twice.
*/
private boolean copy;
/**
- * We should use this construction method when we read internalRow from file.
- * The record constructed by this method must be used in iter.
+ * NOTE: {@code HoodieSparkRecord} is holding the schema only in cases when it would have
+ * to execute {@link UnsafeProjection} so that the {@link InternalRow} it's holding to
+ * could be projected into {@link UnsafeRow} and be efficiently serialized subsequently
+ * (by Kryo)
*/
- public HoodieSparkRecord(InternalRow data) {
+ private final transient StructType schema;
Review Comment:
We no longer do the projection in the ctor; instead, we do it only when
serializing `HoodieSparkRecord`.
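
To illustrate the pattern being described (project lazily at serialization time rather than eagerly in the constructor), here is a minimal, self-contained sketch. It does not use Hudi, Spark, or Kryo: `SketchRecord`, `fieldTypes`, `project`, and `roundTrip` are hypothetical stand-ins invented for this example, with an `Object[]` playing the role of a generic `InternalRow` and a `byte[]` playing the role of the `UnsafeRow` bytes. The `write`/`read` pair only loosely mirrors the shape of a `KryoSerializable` implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class LazyProjectionSketch {

    /** Hypothetical stand-in for HoodieSparkRecord; none of this is Hudi or Spark API. */
    public static final class SketchRecord {
        // Stand-in for the schema: needed only to run the projection, never written out.
        private final transient String[] fieldTypes;
        private final Object[] row;   // stand-in for a generic InternalRow
        private byte[] projected;     // stand-in for the compact UnsafeRow bytes

        public SketchRecord(Object[] row, String[] fieldTypes) {
            // No projection here: construction stays cheap.
            this.row = row;
            this.fieldTypes = fieldTypes;
        }

        private SketchRecord(byte[] projected) {
            this.row = null;
            this.fieldTypes = null;   // the schema is not carried across serialization
            this.projected = projected;
        }

        public boolean isProjected() { return projected != null; }

        public byte[] projectedBytes() { return projected; }

        // Project on first write only, then emit the compact bytes.
        public void write(DataOutputStream out) throws IOException {
            if (projected == null) {
                projected = project(row, fieldTypes);
            }
            out.writeInt(projected.length);
            out.write(projected);
        }

        public static SketchRecord read(DataInputStream in) throws IOException {
            byte[] bytes = new byte[in.readInt()];
            in.readFully(bytes);
            return new SketchRecord(bytes);
        }

        // Encodes each field according to its declared type ("int" or anything else as a string).
        private static byte[] project(Object[] row, String[] fieldTypes) throws IOException {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            for (int i = 0; i < row.length; i++) {
                if ("int".equals(fieldTypes[i])) {
                    out.writeInt((Integer) row[i]);
                } else {
                    out.writeUTF(String.valueOf(row[i]));
                }
            }
            out.flush();
            return buf.toByteArray();
        }
    }

    /** Serializes and deserializes a record in memory. */
    public static SketchRecord roundTrip(SketchRecord record) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            record.write(new DataOutputStream(buf));
            return SketchRecord.read(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SketchRecord record = new SketchRecord(new Object[] {42, "key"}, new String[] {"int", "string"});
        System.out.println("projected after ctor:  " + record.isProjected());
        roundTrip(record);
        System.out.println("projected after write: " + record.isProjected());
    }
}
```

In this sketch the schema is consulted only inside `write`, which is why the field can be `transient`: a record that is never serialized never pays for the projection.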