Abacn commented on code in PR #28624:
URL: https://github.com/apache/beam/pull/28624#discussion_r1337539782
##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableWriteSchemaTransformProvider.java:
##########
@@ -179,12 +179,13 @@ public KV<ByteString, Iterable<Mutation>> apply(Row row) {
.setColumnQualifier(
ByteString.copyFrom(ofNullable(mutation.get("column_qualifier")).get()))
.setFamilyNameBytes(
- ByteString.copyFrom(ofNullable(mutation.get("family_name")).get()));
- if (mutation.containsKey("timestamp_micros")) {
- setMutation =
- setMutation.setTimestampMicros(
- Longs.fromByteArray(ofNullable(mutation.get("timestamp_micros")).get()));
- }
+ ByteString.copyFrom(ofNullable(mutation.get("family_name")).get()))
+ // Use timestamp if provided, else default to -1 (current Bigtable server time)
+ .setTimestampMicros(
Review Comment:
This is due to inconsistent behavior between the Java and Python APIs. For
Python, if the timestamp is not set, it defaults to -1:
https://github.com/googleapis/python-bigtable/blob/e5af3597f45fc4c094c59abca876374f5a866c1b/google/cloud/bigtable/row.py#L164
For Java, if the timestamp is not set, it defaults to 0, which causes the problem.
Arguably, the documentation for the Java client asks the user to set the timestamp and
warns that it will default to 0 if unspecified:
https://github.com/googleapis/java-bigtable/blob/15cd4868ff807513914095a3758134eaa14f0ea3/proto-google-cloud-bigtable-v2/src/main/java/com/google/bigtable/v2/Mutation.java#L902
Consequently, the possible misuse (not setting the timestamp, followed by data loss)
could still happen in Java BigtableIO with a user-constructed Mutation: #27022
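
To make the intended default concrete, here is a minimal, dependency-free sketch of the fallback the diff comment describes: use `timestamp_micros` when the mutation map carries it, otherwise fall back to -1. The class and method names (`TimestampDefault`, `resolveTimestampMicros`) are hypothetical and not part of the PR, and `ByteBuffer` stands in for Guava's `Longs.fromByteArray` (both decode 8 big-endian bytes), so treat it as an illustration rather than the actual change.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the PR: illustrates the timestamp fallback only.
class TimestampDefault {
  // -1 is the value the diff comment describes as "current Bigtable server time".
  static final long SERVER_TIME = -1L;

  // Returns the decoded "timestamp_micros" value if present, else -1.
  // Assumes the value is an 8-byte big-endian long, the same layout that
  // Guava's Longs.fromByteArray reads in the original code.
  static long resolveTimestampMicros(Map<String, byte[]> mutation) {
    byte[] raw = mutation.get("timestamp_micros");
    if (raw == null) {
      return SERVER_TIME;
    }
    // ByteBuffer is big-endian by default; throws if fewer than 8 bytes remain.
    return ByteBuffer.wrap(raw).getLong();
  }

  public static void main(String[] args) {
    Map<String, byte[]> mutation = new HashMap<>();
    // No timestamp supplied: falls back to the server-time sentinel.
    System.out.println(resolveTimestampMicros(mutation));

    // Explicit timestamp supplied: the encoded value is used as-is.
    mutation.put("timestamp_micros", ByteBuffer.allocate(8).putLong(1234L).array());
    System.out.println(resolveTimestampMicros(mutation));
  }
}
```

Resolving the value up front means the builder call is unconditional, so the proto default of 0 is never left in place by accident.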