Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/21472#discussion_r192339647
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
@@ -747,8 +748,13 @@ case class StructsToJson(
object JsonExprUtils {
-  def validateSchemaLiteral(exp: Expression): StructType = exp match {
-    case Literal(s, StringType) => CatalystSqlParser.parseTableSchema(s.toString)
+  def validateSchemaLiteral(exp: Expression): DataType = exp match {
+    case Literal(s, StringType) =>
+      try {
+        DataType.fromJson(s.toString)
--- End diff ---
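Since the quoted diff is cut off above, here is roughly what the whole helper would look like after the change (the fallback to the DDL parser and the error case are a sketch, not the verbatim diff):

```scala
// Sketch of the proposed helper: try the JSON schema representation first,
// then fall back to the existing DDL parser (the fallback is assumed here).
import scala.util.control.NonFatal

import org.apache.spark.sql.AnalysisException
import org.apache.spark.sql.catalyst.expressions.{Expression, Literal}
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.types.{DataType, StringType}

def validateSchemaLiteral(exp: Expression): DataType = exp match {
  case Literal(s, StringType) =>
    try {
      DataType.fromJson(s.toString)                                       // JSON schema format
    } catch {
      case NonFatal(_) => CatalystSqlParser.parseTableSchema(s.toString)  // DDL format
    }
  case e => throw new AnalysisException(s"Expected a string literal instead of $e")
}
```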
> How do they get the metadata ...
Metadata is stored together with the data in a distributed FS and loaded using
the standard facilities of the language.
> and how do they insert it into SQL?
SQL statements are formed programmatically as strings, and the loaded schemas
are inserted at particular positions in those strings (you can think of it as
quasiquotes in Scala). The resulting SQL statements are then sent to Spark via JDBC.
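A rough illustration of that flow (the paths, table names, and JDBC URL below are made up for the example):

```scala
import java.sql.DriverManager

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// 1. Load the schema, stored in JSON format next to the data, from the distributed FS.
val fs = FileSystem.get(new Configuration())
val in = fs.open(new Path("/data/events/_schema.json"))   // hypothetical location
val schemaJson = scala.io.Source.fromInputStream(in).mkString
in.close()

// 2. Form the SQL statement as a string, splicing the schema literal into it,
//    much like quasiquotes splice trees in Scala.
val sql = s"SELECT from_json(payload, '$schemaJson') AS parsed FROM raw_events"

// 3. Submit the statement to Spark over JDBC (e.g. through the Thrift server).
val conn = DriverManager.getConnection("jdbc:hive2://spark-thrift-server:10000/default")
val rs = conn.createStatement().executeQuery(sql)
```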
> Is that the only way to do it?
Probably it is possible to convert schemas from the JSON format to the DDL format, but:
- it requires much more effort and time than the 5-line change proposed in this PR
- the DDL format supports only `StructType` as the root type, so it is not possible
to specify a `MapType` as in this test (see the sketch after this list):
https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala#L330-L345
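A small sketch of the difference between the two representations (assumed to run in spark-shell):

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.types.{DataType, IntegerType, MapType, StringType}

val mapSchema = MapType(StringType, IntegerType)

// The JSON representation can round-trip a MapType at the root:
val json = mapSchema.json   // {"type":"map","keyType":"string","valueType":"integer",...}
assert(DataType.fromJson(json) == mapSchema)

// The DDL parser always wraps fields in a StructType, so a bare MapType
// cannot be expressed as the root type:
CatalystSqlParser.parseTableSchema("a MAP<STRING, INT>")
// => StructType(StructField(a, MapType(StringType, IntegerType, true), true))
```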
---