[ https://issues.apache.org/jira/browse/SPARK-7825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-7825:
--------------------------------
    Labels: bulk-closed  (was: )

> Poor performance in Cross Product due to no combine operations for small files.
> -------------------------------------------------------------------------------
>
>                 Key: SPARK-7825
>                 URL: https://issues.apache.org/jira/browse/SPARK-7825
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>            Reporter: Tang Yan
>            Priority: Major
>              Labels: bulk-closed
>
> When computing a cross product, if one table consists of many small files,
> Spark SQL has to run a correspondingly large number of tasks, which leads to
> poor performance. Hive, by contrast, has CombineHiveInputFormat, which
> combines small files to reduce the number of tasks.
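
To illustrate the idea behind combining small files, here is a minimal Python sketch. It is not Spark's or Hive's actual implementation; the function name combine_small_files, the 128 MiB split limit, the file names, and the 4 partitions on the other side of the join are all assumptions made for this example. It greedily packs files into groups up to a size limit, then compares task counts under the assumption that a cross product schedules one task per pair of input partitions.

```python
from typing import List, Tuple


def combine_small_files(
    files: List[Tuple[str, int]], max_split_bytes: int
) -> List[List[str]]:
    """Greedily pack (path, size_in_bytes) pairs into groups.

    Hypothetical helper: each group stays at or below max_split_bytes,
    except that a single file larger than the limit gets its own group.
    """
    if max_split_bytes <= 0:
        raise ValueError("max_split_bytes must be positive")

    groups: List[List[str]] = []
    current: List[str] = []
    current_bytes = 0

    # Largest files first, so small files fill up the remaining groups.
    for path, size in sorted(files, key=lambda f: f[1], reverse=True):
        # Close the current group if adding this file would exceed the limit.
        if current and current_bytes + size > max_split_bytes:
            groups.append(current)
            current, current_bytes = [], 0
        current.append(path)
        current_bytes += size

    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    mib = 1024 * 1024
    # Assumed workload: 1000 files of 1 MiB each on one side of the join.
    small_files = [(f"part-{i:05d}", mib) for i in range(1000)]
    other_side_partitions = 4  # assumed partition count of the other table

    combined = combine_small_files(small_files, 128 * mib)

    # One task per pair of partitions (assumption stated above).
    print(len(small_files) * other_side_partitions)  # 4000
    print(len(combined) * other_side_partitions)     # 32
```

Under these assumptions the 1000 one-file partitions collapse into 8 combined splits, so the number of partition pairs drops from 4000 to 32. A real input format would also take block locality into account, which this sketch ignores.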