AngersZhuuuu commented on a change in pull request #33362:
URL: https://github.com/apache/spark/pull/33362#discussion_r672146117
##########
File path: docs/sql-ref-syntax-qry-select-transform.md
##########
@@ -57,16 +65,38 @@ SELECT TRANSFORM ( expression [ , ... ] )
Specifies a command or a path to script to process data.
-### SerDe behavior
+### ROW FORMAT DELIMITED BEHAVIOR
+
+When Spark uses the `ROW FORMAT DELIMITED` format, it uses `\u0001` as the default field delimiter,
+`\n` as the default line delimiter, and `"\N"` as the `NULL` value in order to differentiate `NULL` values
+from empty strings. These defaults can be overridden by `FIELDS TERMINATED BY`, `LINES TERMINATED BY`, and
+`NULL DEFINED AS`. Because `to_json` and `from_json` are used to handle complex data types,
+`COLLECTION ITEMS TERMINATED BY` and `MAP KEYS TERMINATED BY` have no effect in the current code.
+Spark casts all columns to `STRING` and combines them with tabs before feeding them to the user script.
+For complex types such as `ARRAY`/`MAP`/`STRUCT`, Spark uses `to_json` to cast the value to an input JSON string
+and uses `from_json` to convert the result output to `ARRAY`/`MAP`/`STRUCT` data. The standard output of
+the user script is treated as tab-separated `STRING` columns, any cell containing only `"\N"`
+is re-interpreted as a `NULL` value, and the resulting `STRING` column is then cast to the
+data type specified in `col_type`. If the actual number of output columns is less than the number
+of specified output columns, the missing output columns are filled with `NULL`.
+If the actual number of output columns is more than the number of specified output columns,
+only the corresponding columns are selected and the remaining ones are discarded.
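
The output-handling rules in the hunk above (tab-separated cells, `"\N"` read back as `NULL`, short rows padded, extra columns dropped) can be sketched in a few lines of Python. This is only an illustration of the behavior as the text describes it, not Spark's implementation: the name `parse_script_output` and its parameters are hypothetical, and the final cast to `col_type` is left out.

```python
from typing import List, Optional

# Hypothetical constant: the two-character marker (backslash + N) that the
# documentation says is read back as NULL.
NULL_MARKER = r"\N"


def parse_script_output(
    line: str, num_cols: int, field_delim: str = "\t"
) -> List[Optional[str]]:
    """Split one line of script output into exactly `num_cols` cells.

    Illustrative only; it mirrors the documented rules, not Spark's code.
    """
    # Drop the trailing line delimiter, then split into STRING cells.
    cells = line.rstrip("\n").split(field_delim)
    # A cell containing only the marker becomes NULL (None here);
    # an empty string stays an empty string.
    values: List[Optional[str]] = [
        None if cell == NULL_MARKER else cell for cell in cells
    ]
    # Fewer columns than declared: pad the missing ones with NULL.
    if len(values) < num_cols:
        values += [None] * (num_cols - len(values))
    # More columns than declared: keep the first `num_cols`, discard the rest.
    return values[:num_cols]
```

Note that an empty cell and a `"\N"` cell come back differently (`""` versus `None`), which is the distinction the marker exists to preserve.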
Review comment:
How about current
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]