[ https://issues.apache.org/jira/browse/SPARK-32533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17171183#comment-17171183 ]

Apache Spark commented on SPARK-32533:
--------------------------------------

User 'msamirkhan' has created a pull request for this issue:
https://github.com/apache/spark/pull/29354

> Improve Avro read/write performance on nested structs and array of structs
> --------------------------------------------------------------------------
>
>                 Key: SPARK-32533
>                 URL: https://issues.apache.org/jira/browse/SPARK-32533
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Muhammad Samir Khan
>            Priority: Major
>
> Have some improvements for Avro file format to reduce time taken when
> reading/writing nested/array'd structs. Using benchmarks in SPARK-32531 was
> able to improve performance on branch-3.0 as follows (measurements in
> seconds):
> Read:
>   Nested Structs: 75 -> 46
>   Array of Struct: 47 -> 17
> Write:
>   Nested Structs: 147 -> 36
>   Array of Struct: 139 -> 34
> Will be putting up the PR soon with the changes.