HyukjinKwon commented on a change in pull request #33362:
URL: https://github.com/apache/spark/pull/33362#discussion_r670326046
##########
File path: docs/sql-ref-syntax-qry-select-transform.md
##########
@@ -57,16 +65,38 @@ SELECT TRANSFORM ( expression [ , ... ] )
Specifies a command or a path to script to process data.
-### SerDe behavior
+### ROW FORMAT DELIMITED BEHAVIOR
+
+When spark use `ROW FORMAT DELIMITED` format, Spark will use `\u0001` as default filed delimit,
+use `\n` as default line delimit and use `"\N"` as `NULL` value in order to differentiate `NULL` values
+from empty strings. These delimit can be overridden by `FIELDS TERMINATED BY`, `LINES TERMINATED BY` and
+`NULL TERMINATED AS`. Since we use `to_json` and `from_json` to handle complex data type, so
+`COLLECTION ITEMS TERMINATED BY` and `MAP KEYS TERMINATED BY` won't work in current code.
+Spark will cast all columns to `STRING` and combined by tabs before feeding to the user script.
+For complex type such as `ARRAY\MAP\STRUCT`, spark use `to_json` cast it to input json string
Review comment:
```suggestion
For complex type such as `ARRAY`\`MAP`\`STRUCT`, spark use `to_json` cast it to input json string
```
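
For readers following the quoted paragraph, the defaults it describes can be illustrated with a small sketch. This is not Spark code: `encode_field` and `encode_row` are hypothetical helpers, Python's `json.dumps` stands in for `to_json`, and the `\u0001` field delimiter, `\n` line delimiter, and `\N` null token are taken from the text in the diff above.

```python
import json

# Defaults as described in the quoted doc text (assumed, not read from Spark).
FIELD_DELIM = "\u0001"
LINE_DELIM = "\n"
NULL_TOKEN = "\\N"  # the two characters backslash and N


def encode_field(value):
    """Render one column value as the string a script would receive."""
    if value is None:
        # NULL gets its own token so it differs from an empty string.
        return NULL_TOKEN
    if isinstance(value, (list, dict)):
        # Stand-in for to_json on ARRAY / MAP / STRUCT values.
        return json.dumps(value, separators=(",", ":"))
    return str(value)


def encode_row(row, field_delim=FIELD_DELIM, line_delim=LINE_DELIM):
    """Join the encoded fields and terminate the line."""
    return field_delim.join(encode_field(v) for v in row) + line_delim


if __name__ == "__main__":
    # repr() makes the control characters visible.
    print(repr(encode_row([1, None, "", [1, 2]])))
```

The point of the sketch is that `None` and `""` produce different fields, which is the reason the doc text gives for the `\N` token.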
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]