MaxGekk commented on issue #27759: [SPARK-31008][SQL] Support json_array_length function
URL: https://github.com/apache/spark/pull/27759#issuecomment-593511926

> At the moment we have to parse all ...

You can avoid deep parsing by specifying string as the element type. For example:

```scala
scala> import org.apache.spark.sql.types._
import org.apache.spark.sql.types._

scala> val df = Seq("""[{"a":1}, {"a": 2}]""").toDF("json")
df: org.apache.spark.sql.DataFrame = [json: string]

scala> df.select(size(from_json($"json", ArrayType(StringType)))).show
+---------------------+
|size(from_json(json))|
+---------------------+
|                    2|
+---------------------+
```

This actually does the same as your expression. It may be less optimal because `from_json()` materializes the array, but how to optimize the combination of `size` + `from_json` over an array of strings is a separate question. I would add an optimization rule instead of extending the public API.
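If you want to reuse this combination without a new public function, a minimal sketch of a helper built purely from existing API might look like this (the name `jsonArrayLength` is hypothetical and not part of Spark):

```scala
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{from_json, size}
import org.apache.spark.sql.types.{ArrayType, StringType}

// Hypothetical helper, not part of Spark's API: counts the top-level
// elements of a JSON array column. Parsing with ArrayType(StringType)
// keeps each element as a raw string instead of parsing it into structs,
// which avoids deep parsing of nested objects. Behavior for NULL or
// malformed input follows size()/from_json() semantics (e.g. the
// spark.sql.legacy.sizeOfNull setting).
def jsonArrayLength(json: Column): Column =
  size(from_json(json, ArrayType(StringType)))
```

Then `df.select(jsonArrayLength($"json"))` produces the same result as the snippet above, and the optimization rule I have in mind would rewrite exactly this `size(from_json(...))` pattern internally.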
