yihua commented on code in PR #11373:
URL: https://github.com/apache/hudi/pull/11373#discussion_r1685274595
##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/ProtoConversionUtil.java:
##########
@@ -157,7 +156,7 @@ private static class AvroSupport {
private static final String OVERFLOW_BYTES_FIELD_NAME = "proto_bytes";
private static final Schema RECURSION_OVERFLOW_SCHEMA =
Schema.createRecord("recursion_overflow", null, "org.apache.hudi.proto", false,
Arrays.asList(new Schema.Field(OVERFLOW_DESCRIPTOR_FIELD_NAME,
STRING_SCHEMA, null, ""),
- new Schema.Field(OVERFLOW_BYTES_FIELD_NAME,
Schema.create(Schema.Type.BYTES), null, getUTF8Bytes(""))));
+ new Schema.Field(OVERFLOW_BYTES_FIELD_NAME,
Schema.create(Schema.Type.BYTES), null, "".getBytes())));
Review Comment:
This change seems unintended?
##########
hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/helpers/TestProtoConversionUtil.java:
##########
@@ -206,7 +233,7 @@ private Pair<Sample, GenericRecord>
createInputOutputSampleWithRandomValues(Sche
long primitiveFixedSignedLong = RANDOM.nextLong();
boolean primitiveBoolean = RANDOM.nextBoolean();
String primitiveString = randomString(10);
- byte[] primitiveBytes = getUTF8Bytes(randomString(10));
+ byte[] primitiveBytes = randomString(10).getBytes();
Review Comment:
Similar here on unintended changes. In OSS, we have explicitly enforced
UTF-8, although `.getBytes()` usually resolves to `UTF-8` on UNIX-like systems
(it uses the platform default charset, which is not guaranteed to be UTF-8).
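
To illustrate the charset concern in this comment, here is a minimal sketch using only the JDK. The `utf8Bytes` helper is a hypothetical stand-in for a helper like `getUTF8Bytes`, not Hudi's actual implementation; the sample string is also made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8BytesExample {
    // Hypothetical stand-in for a helper such as getUTF8Bytes (assumption, not Hudi's code).
    public static byte[] utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo"; // contains one non-ASCII character

        byte[] explicit = utf8Bytes(s);   // always UTF-8, independent of the platform
        byte[] platform = s.getBytes();   // uses Charset.defaultCharset()
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("explicit UTF-8 length: " + explicit.length);
        System.out.println("ISO-8859-1 length: " + latin1.length);
        // True only when the platform default charset encodes this string like UTF-8.
        System.out.println("platform matches UTF-8: " + Arrays.equals(explicit, platform));
    }
}
```

For pure ASCII input such as the empty string in the first hunk, both calls produce the same bytes under common charsets; the difference only shows up with non-ASCII characters, which is why pinning the charset keeps tests reproducible across environments.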
##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/ProtoConversionUtil.java:
##########
@@ -348,17 +348,17 @@ private Object getDefault(Descriptors.FieldDescriptor f) {
case SFIXED64:
return 0;
case UINT64:
- return "\u0000"; // requires bytes for decimal type
+ return DECIMAL_CONVERSION.toFixed(new BigDecimal(BigInteger.ZERO),
fieldSchema, fieldSchema.getLogicalType()).bytes();
Review Comment:
I assume this does not cause a backwards compatibility issue.
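
As a rough aid for reasoning about this question, the sketch below compares the byte representation of the old default (`"\u0000"`) with a fixed-size, sign-extended encoding of decimal zero, using only the JDK. It does not call Avro's `DecimalConversion`; `toFixedBytes` is a hypothetical helper that mimics the big-endian two's-complement layout Avro specifies for decimals, and the fixed size of 9 is an assumed value chosen for illustration, not taken from the PR.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class FixedDecimalBytesExample {
    // Hypothetical helper: big-endian two's-complement unscaled value,
    // sign-extended to exactly fixedSize bytes.
    public static byte[] toFixedBytes(BigDecimal value, int fixedSize) {
        byte[] unscaled = value.unscaledValue().toByteArray();
        if (unscaled.length > fixedSize) {
            throw new IllegalArgumentException("value does not fit in " + fixedSize + " bytes");
        }
        byte[] out = new byte[fixedSize];
        byte fill = (byte) (value.signum() < 0 ? 0xFF : 0x00);
        int offset = fixedSize - unscaled.length;
        Arrays.fill(out, 0, offset, fill);                       // sign extension
        System.arraycopy(unscaled, 0, out, offset, unscaled.length);
        return out;
    }

    public static void main(String[] args) {
        int assumedFixedSize = 9; // assumption for illustration only

        byte[] oldDefault = "\u0000".getBytes(StandardCharsets.UTF_8);
        byte[] newDefault = toFixedBytes(new BigDecimal(BigInteger.ZERO), assumedFixedSize);

        System.out.println("old default bytes: " + Arrays.toString(oldDefault));
        System.out.println("new default bytes: " + Arrays.toString(newDefault));
    }
}
```

Both forms represent zero, but they differ in length whenever the fixed size is larger than one byte. Whether that matters for existing readers depends on how the schema default is consumed, which this sketch does not settle.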