[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4789: [HUDI-1296] Support Metadata Table in Spark Datasource

GitBox Tue, 22 Feb 2022 12:23:11 -0800


alexeykudinkin commented on a change in pull request #4789:
URL: https://github.com/apache/hudi/pull/4789#discussion_r812326499




##########
File path: hudi-common/src/test/java/org/apache/hudi/common/testutils/RawTripTestPayload.java
##########
@@ -80,6 +80,23 @@ public RawTripTestPayload(String jsonData) throws IOException {
     this.isDeleted = false;
   }
 
+  /**
+   * @deprecated PLEASE READ THIS CAREFULLY
+   *
+   * Converting properly typed schemas into JSON leads to inevitable information loss, since JSON
+   * encodes only representation of the record (with no schema accompanying it), therefore occasionally
+   * losing nuances of the original data-types provided by the schema (for ex, with 1.23 literal it's
+   * impossible to tell whether original type was Double or Decimal).
+   *
+   * Multiplied by the fact that Spark 2 JSON schema inference has substantial gaps in it (see below),
+   * it's **NOT RECOMMENDED** to use this method. Instead please consider using {@link AvroConversionUtils#createDataframe()}
+   * method accepting list of {@link HoodieRecord} (as produced by the {@link HoodieTestDataGenerator}
+   * to create Spark's {@code Dataframe}s directly.
+   *
+   * REFs
+   * https://medium.com/swlh/notes-about-json-schema-handling-in-spark-sql-be1e7f13839d
+   */
+  @Deprecated

Review comment:
       @xushiyan @YannByron I actually think that a fair amount of the
compatibility issues we hit as we transition from Spark 2 to 3 will be because
of this. I'd suggest simply pointing whoever is working on that transition to
this new utility to address test failures.
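
To illustrate the information loss the Javadoc above describes, here is a
minimal sketch. It is not code from this PR: the class `JsonTypeLossSketch`,
the helper `toJson`, and the field name `fare` are hypothetical, and the
helper hand-builds the JSON text instead of using Hudi's or Spark's actual
serialization path. It only shows that a Double and a Decimal with the same
value can produce identical JSON, so a reader of that JSON has no way to
recover the original type.

```java
import java.math.BigDecimal;

// Hypothetical illustration only; not part of Hudi.
final class JsonTypeLossSketch {

    // Renders one numeric field as a JSON object.
    // The declared Java type of `value` is not recorded in the output.
    public static String toJson(String field, Number value) {
        return "{\"" + field + "\": " + value + "}";
    }

    public static void main(String[] args) {
        String fromDouble = toJson("fare", Double.valueOf(1.23));
        String fromDecimal = toJson("fare", new BigDecimal("1.23"));

        System.out.println(fromDouble);
        System.out.println(fromDecimal);
        // Both strings are the same, so the source type cannot be inferred from them.
        System.out.println(fromDouble.equals(fromDecimal));
    }
}
```

Any schema inference that starts from such JSON has to guess a type for
`fare`, which is why building the Dataframe from typed records avoids the
ambiguity.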




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
