[GitHub] [spark] dtenedor commented on a diff in pull request #36583: [SPARK-39211][SQL] Support JSON scans with DEFAULT values

GitBox Fri, 20 May 2022 11:11:20 -0700


dtenedor commented on code in PR #36583:
URL: https://github.com/apache/spark/pull/36583#discussion_r878419700



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala:
##########
@@ -419,7 +419,13 @@ class JacksonParser(
     val row = new GenericInternalRow(schema.length)
     var badRecordException: Option[Throwable] = None
     var skipRow = false
-
+    // Apply default values from the column metadata to the initial row, if any.
+    if (schema.hasExistenceDefaultValues) {
+      for ((value: Any, i: Int) <- schema.existenceDefaultValues.zipWithIndex) {

Review Comment:
   Thanks for pointing this out. I implemented a suggestion from Gengliang to 
keep a boolean array that tracks which columns still need default values 
assigned. This way it's fast, and we only update columns that have explicit 
default values and never got a value assigned during the scan.
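
   For illustration, the tracking approach could look roughly like the sketch 
below. This is a simplified stand-in, not the code in the PR: the object 
`DefaultFillSketch`, the method `fillRow`, and its parameters are hypothetical 
names, and a plain `Array[Any]` takes the place of the internal row.

```scala
// Hypothetical, simplified sketch; not the actual JacksonParser code.
object DefaultFillSketch {
  /**
   * @param numFields number of columns in the schema
   * @param defaults  per-column default; None when the column has no explicit default
   * @param parsed    (column index, value) pairs actually found in the JSON record
   */
  def fillRow(
      numFields: Int,
      defaults: Array[Option[Any]],
      parsed: Seq[(Int, Any)]): Array[Any] = {
    val row = new Array[Any](numFields)          // every slot starts as null
    val assigned = new Array[Boolean](numFields) // every flag starts as false

    // Scan: write each parsed value and mark its column as assigned.
    parsed.foreach { case (i, v) =>
      row(i) = v
      assigned(i) = true
    }

    // After the scan: fill only the unassigned columns that have an explicit default.
    var i = 0
    while (i < numFields) {
      if (!assigned(i)) {
        defaults(i).foreach(d => row(i) = d)
      }
      i += 1
    }
    row
  }
}
```

   In this sketch a column that was explicitly assigned null during the scan 
stays null, because the flag records assignment rather than the final value. 
Columns without an explicit default are never touched after the scan.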



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

