[ https://issues.apache.org/jira/browse/SPARK-25226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16593816#comment-16593816 ]
Hyukjin Kwon commented on SPARK-25226:
--------------------------------------

Can you use:

{code}
>>> from pyspark.sql import functions as F
>>> df = df.withColumn("parsed_data", F.from_json(F.col('data'), "array<string>"))
>>> df.show()
+--------------------+---+--------------------+
|                data| id|         parsed_data|
+--------------------+---+--------------------+
|["string1", true,...|  1|    [string1, true,]|
|["string2", false...|  2|   [string2, false,]|
|["string3", true,...|  3|[string3, true, a...|
+--------------------+---+--------------------+
{code}

instead?

> Extend functionality of from_json to support arrays of differently-typed elements
> ---------------------------------------------------------------------------------
>
>                 Key: SPARK-25226
>                 URL: https://issues.apache.org/jira/browse/SPARK-25226
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, Spark Core
>    Affects Versions: 2.3.1
>            Reporter: Yuriy Davygora
>            Priority: Minor
>
> At the moment, the 'from_json' function only supports a STRUCT or an ARRAY of
> STRUCTS as input. Support for ARRAY of primitives is, apparently, coming with
> Spark 2.4, but it will only support arrays whose elements all have the same
> data type. It will not, for example, support JSON arrays like
> {noformat}
> ["string_value", 0, true, null]
> {noformat}
> which is valid JSON with the schema
> {noformat}
> {"containsNull":true,"elementType":["string","integer","boolean"],"type":"array"}
> {noformat}
> We would like to kindly ask you to add support for arrays of differently-typed
> elements in the 'from_json' function. This will necessitate extending the
> functionality of ArrayType or maybe adding a new type (refer to
> [[SPARK-25225]])

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
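
For illustration only: a plain-Python sketch (standard library json, no Spark) of what reading a mixed-type JSON array as "array<string>" roughly amounts to. The helper name parse_as_string_array is hypothetical and not a Spark API. It assumes that non-string scalars are kept as their JSON text and that null stays null, which is consistent with the df.show() output in the comment above but is not taken from Spark's implementation.

```python
import json


def parse_as_string_array(raw):
    """Parse a JSON array and return its elements as strings.

    Hypothetical helper for illustration; it only mimics the idea of
    reading every element as a string, not Spark's actual parser.
    """
    values = json.loads(raw)
    if not isinstance(values, list):
        # Not a JSON array: signal "no result" with None.
        return None
    result = []
    for value in values:
        if value is None:
            result.append(None)               # JSON null stays null
        elif isinstance(value, str):
            result.append(value)              # strings are kept as-is
        else:
            result.append(json.dumps(value))  # 0 -> "0", true -> "true"
    return result


print(parse_as_string_array('["string_value", 0, true, null]'))
```

Note the trade-off of this approach: once every element is a string, the boolean true and the string "true" can no longer be told apart, so the original element types are not recoverable from the parsed column alone.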