Could you check the Spark web UI for the number of tasks issued when the
query is executed? I dug out |mapred.map.tasks| because I saw that 2 tasks
were issued.
On 2/26/15 3:01 AM, Kannan Rajah wrote:
Cheng, We tried this setting and it still did not help. This was on
Spark 1.2.0.
--
Kannan
On Mon, Feb 23, 2015 at 6:38 PM, Cheng Lian lian.cs@gmail.com wrote:
(Move to user list.)
Hi Kannan,
You need to set mapred.map.tasks to 1 in hive-site.xml. The reason is this
line of code
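For concreteness, the change suggested above would look roughly like this in hive-site.xml (a sketch; the value follows the advice above, the surrounding configuration is illustrative):

```xml
<!-- Sketch: force the Hive map task hint down to 1 so it no longer
     overrides spark.default.parallelism, per the advice above. -->
<property>
  <name>mapred.map.tasks</name>
  <value>1</value>
</property>
```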
How many reducers did you set for Hive? With a small data set, Hive will run in
local mode, which always sets the reducer count to 1.
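If you want to rule out the local-mode behavior mentioned above, the usual knobs look like this in hive-site.xml (a sketch; the reducer count shown is illustrative, and both property names are standard Hive/Hadoop settings):

```xml
<!-- Sketch: disable automatic local-mode execution and pin the reducer
     count explicitly; values here are illustrative only. -->
<property>
  <name>hive.exec.mode.local.auto</name>
  <value>false</value>
</property>
<property>
  <name>mapred.reduce.tasks</name>
  <value>4</value>
</property>
```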
From: Kannan Rajah [mailto:kra...@maprtech.com]
Sent: Thursday, February 26, 2015 3:02 AM
To: Cheng Lian
Cc: user@spark.apache.org
Subject: Re: Spark-SQL 1.2.0 sort
(Move to user list.)
Hi Kannan,
You need to set |mapred.map.tasks| to 1 in hive-site.xml. The reason is
this line of code
https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala#L68,
which overrides |spark.default.parallelism|. Also,
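The effect of the referenced line can be sketched in plain Scala. This is an approximation of the TableReader logic linked above, not the exact code; the function name and the default of 2 for |mapred.map.tasks| are assumptions for illustration:

```scala
// Approximation (assumption) of the minimum-splits computation referenced
// above: take the larger of mapred.map.tasks and Spark's default minimum
// number of splits, so the Hive setting acts as a floor on the task count.
def minSplitsPerRDD(mapredMapTasks: Int, defaultMinSplits: Int): Int =
  math.max(mapredMapTasks, defaultMinSplits)

// mapred.map.tasks commonly defaults to 2, which would explain seeing
// 2 tasks even after lowering spark.default.parallelism:
println(minSplitsPerRDD(2, 1))
// Setting mapred.map.tasks to 1 in hive-site.xml removes that floor:
println(minSplitsPerRDD(1, 1))
```

This is why lowering |spark.default.parallelism| alone cannot reduce the task count below the Hive setting: the max() keeps whichever value is larger.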