Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22938#discussion_r230585549
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
---
@@ -552,13 +552,19 @@ case class JsonToStructs(
// This converts parsed rows to the desired output by the given schema.
@transient
- lazy val converter = nullableSchema match {
- case _: StructType =>
- (rows: Iterator[InternalRow]) => if (rows.hasNext) rows.next() else null
- case _: ArrayType =>
- (rows: Iterator[InternalRow]) => if (rows.hasNext) rows.next().getArray(0) else null
- case _: MapType =>
- (rows: Iterator[InternalRow]) => if (rows.hasNext) rows.next().getMap(0) else null
+ lazy val converter = (rows: Iterator[InternalRow]) => {
+ if (rows.hasNext) {
+ val result = rows.next()
+ // JSON's parser produces one record only.
+ assert(!rows.hasNext)
+ nullableSchema match {
+ case _: StructType => result
--- End diff --
I don't see any visible overhead from this in the profiler, but I will change
it since it is easy to do.
---