[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add to_json in FunctionRegistr...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16981

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/16981#discussion_r104278366

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
```
@@ -566,3 +586,17 @@ case class StructToJson(
   override def inputTypes: Seq[AbstractDataType] = StructType :: Nil
 }
+
+object StructToJson {
+
+  def convertToMapData(exp: Expression): Map[String, String] = exp match {
+    case m: CreateMap
+        if m.dataType.acceptsType(MapType(StringType, StringType, valueContainsNull = false)) =>
+      val arrayMap = m.eval().asInstanceOf[ArrayBasedMapData]
+      ArrayBasedMapData.toScalaMap(arrayMap).map { case (key, value) =>
+        key.toString -> value.toString
+      }
+    case _ =>
+      throw new AnalysisException("Must use a map() function for options")
```
--- End diff --

How about this? https://github.com/apache/spark/pull/16981/files#diff-6626026091295ad8c0dfb66ecbcd04b1R601
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/16981#discussion_r104277851

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
```
+    case _ =>
+      throw new AnalysisException("Must use a map() function for options")
```
--- End diff --

okay!
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16981#discussion_r104277780

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
```
+    case _ =>
+      throw new AnalysisException("Must use a map() function for options")
```
--- End diff --

This message is misleading in the following case:
```
df2.selectExpr("to_json(a, map('a', 1))")
```
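The reviewer's point can be illustrated without Spark. Below is a simplified, self-contained sketch of the options-validation pattern under review, using hypothetical `Expr`/`MapExpr` stand-ins for Catalyst's `Expression`/`CreateMap` (the real code checks `dataType.acceptsType` instead of inspecting runtime values). With an input like `map('a', 1)` the guard fails because the value is not a string, so execution falls through to the catch-all, and "Must use a map() function for options" is misleading because a `map()` *was* used:

```scala
// Hypothetical stand-ins for Catalyst expression nodes (illustration only).
sealed trait Expr
case class MapExpr(entries: Map[Any, Any]) extends Expr
case class IntExpr(value: Int) extends Expr

def convertToMapData(exp: Expr): Map[String, String] = exp match {
  // Guarded case: only maps whose keys and values are all strings are accepted.
  case MapExpr(entries) if entries.forall { case (k, v) =>
      k.isInstanceOf[String] && v.isInstanceOf[String] } =>
    entries.map { case (k, v) => k.toString -> v.toString }
  case _ =>
    // Mentioning the required key/value types makes the failure mode clear
    // even when the caller did pass a map().
    throw new IllegalArgumentException(
      "Must use a map() function with string keys and values for options")
}
```

With this wording, `convertToMapData(MapExpr(Map("a" -> 1)))` still throws, but the message explains why the supplied map was rejected.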
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16981#discussion_r104277787

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
```
+    case _ =>
+      throw new AnalysisException("Must use a map() function for options")
```
--- End diff --

Please also include the test case for this. Thanks!
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16981#discussion_r104201532

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala ---
```
+  /** Convert a map literal to Map-type data. */
+  def validateMapData(exp: Expression): Map[String, String] = exp match {
+    case m: CreateMap => m.dataType.acceptsType(MapType(StringType, StringType, false))
```
--- End diff --

Nit: `valueContainsNull = false`
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16981#discussion_r104201434

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala ---
```
+  /** Convert a map literal to Map-type data. */
+  def validateMapData(exp: Expression): Map[String, String] = exp match {
+    case m: CreateMap => m.dataType.acceptsType(MapType(StringType, StringType, false))
```
--- End diff --

```
case m: CreateMap if m.dataType.acceptsType(MapType(StringType, StringType, false)) =>
```
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16981#discussion_r104195595

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala ---
```
+  /** Convert a map literal to Map-type data. */
+  def validateMapData(exp: Expression): Map[String, String] = exp match {
+    case m: CreateMap => m.dataType.acceptsType(MapType(StringType, StringType, false))
+      val arrayMap = m.eval().asInstanceOf[ArrayBasedMapData]
+      ArrayBasedMapData.toScalaMap(arrayMap).map { case (key, value) =>
+        key.toString -> value.toString
+      }.toMap
```
--- End diff --

`.toMap` is needed?
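The reviewer's question can be checked in plain Scala, independent of Spark: `.map` on a `scala.collection.Map` with a pair-producing function already yields a `Map`, so the trailing `.toMap` is redundant (assuming, as the code does, that `ArrayBasedMapData.toScalaMap` returns a Scala `Map`):

```scala
// Minimal check that .toMap is redundant after mapping a Map to key/value pairs.
val options: Map[Any, Any] = Map("mode" -> "PERMISSIVE", "allowComments" -> "true")

// The result type is already Map[String, String]; appending .toMap is a no-op.
val converted: Map[String, String] =
  options.map { case (key, value) => key.toString -> value.toString }
```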
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16981#discussion_r104196513

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala ---
```
+  /** Convert a map literal to Map-type data. */
+  def validateMapData(exp: Expression): Map[String, String] = exp match {
```
--- End diff --

Why do we need to keep this here? It is not related to `JacksonUtils.scala`. The function name also needs a change.
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/16981#discussion_r104091757

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala ---
```
+  def strToStructType(schemaAsJson: String): StructType = Try {
```
--- End diff --

yes, I'll remove
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/16981#discussion_r104091471

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala ---
```
+  def strToStructType(schemaAsJson: String): StructType = Try {
+    DataType.fromJson(schemaAsJson).asInstanceOf[StructType]
+  }.getOrElse {
```
--- End diff --

okay, I'll fix
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/16981#discussion_r104091422

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
```
@@ -3007,7 +3008,7 @@ object functions {
    * @since 2.1.0
    */
   def from_json(e: Column, schema: String, options: java.util.Map[String, String]): Column =
-    from_json(e, DataType.fromJson(schema).asInstanceOf[StructType], options)
+    from_json(e, JacksonUtils.strToStructType(schema), options)
```
--- End diff --

okay, I'll do.
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16981#discussion_r104091265

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala ---
```
@@ -174,4 +174,22 @@ class JsonFunctionsSuite extends QueryTest with SharedSQLContext {
       .select(to_json($"struct").as("json"))
     checkAnswer(dfTwo, readBackTwo)
   }
+
+  test("SPARK-19637 Support to_json in SQL") {
+    // to_json
```
--- End diff --

Nit: remove this comment.
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16981#discussion_r104090922

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
```
-    from_json(e, DataType.fromJson(schema).asInstanceOf[StructType], options)
+    from_json(e, JacksonUtils.strToStructType(schema), options)
```
--- End diff --

Revert it back?
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16981#discussion_r104090870

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala ---
```
+  def strToStructType(schemaAsJson: String): StructType = Try {
```
--- End diff --

Maybe we do not need it.
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16981#discussion_r104090791

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala ---
```
+  def strToStructType(schemaAsJson: String): StructType = Try {
+    DataType.fromJson(schemaAsJson).asInstanceOf[StructType]
+  }.getOrElse {
+    throw new AnalysisException(
+      s"""Illegal json string for representing a schema: $schemaAsJson)
```
--- End diff --

Why the triple quotes here? Just use `s"Illegal json string for representing a schema: $schemaAsJson"`.
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16981#discussion_r104090730

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala ---
```
+  def strToStructType(schemaAsJson: String): StructType = Try {
+    DataType.fromJson(schemaAsJson).asInstanceOf[StructType]
+  }.getOrElse {
```
--- End diff --

> prefer explicitly throwing exceptions for abnormal execution and Java style try/catch for exception handling.

Above is the style recommendation. Please use try/catch.
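A sketch of the style the reviewer is asking for, using a stand-in parser since `DataType.fromJson` is not available outside Spark (here `toInt` plays the role of a parse that may throw, and `IllegalArgumentException` stands in for `AnalysisException`). The point is structural: an explicit Java-style try/catch that rethrows deliberately, instead of wrapping the happy path in `scala.util.Try(...).getOrElse { ... }`:

```scala
// Hypothetical stand-in for strToStructType: the parse step may throw,
// and the catch block converts that into a deliberate, descriptive error.
def strToSchema(schemaAsJson: String): Int =
  try {
    // Stand-in for DataType.fromJson(schemaAsJson).asInstanceOf[StructType]
    schemaAsJson.trim.toInt
  } catch {
    case _: NumberFormatException =>
      throw new IllegalArgumentException(
        s"Illegal json string for representing a schema: $schemaAsJson")
  }
```

Compared with `Try { ... }.getOrElse { throw ... }`, this keeps the abnormal path visible and avoids silently swallowing unrelated exceptions the `Try` would also catch.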