[ https://issues.apache.org/jira/browse/SPARK-32531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Muhammad Samir Khan updated SPARK-32531:
----------------------------------------
    Component/s: Tests

> Add benchmarks for nested structs and arrays for different file formats
> -----------------------------------------------------------------------
>
>                 Key: SPARK-32531
>                 URL: https://issues.apache.org/jira/browse/SPARK-32531
>             Project: Spark
>          Issue Type: Test
>          Components: SQL, Tests
>    Affects Versions: 3.0.0
>            Reporter: Muhammad Samir Khan
>            Priority: Major
>
> We found that Spark was slower than Pig on some schemas in our pipelines.
> On investigation, Spark turned out to be slow on nested structs and on
> arrays of structs, and the current benchmarks did not profile these cases.
> I have improvements for the ORC (SPARK-32532) and Avro (SPARK-32533) file
> formats that improve performance in these cases, and I will be putting up
> the PRs soon.