at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
--
郑旭东
ZHENG, Xu-dong
RDD.repartition().
For coalesce without shuffle, I don't know how to set the right number
of partitions either ...
-Xiangrui
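The repartition-vs-coalesce distinction above can be illustrated with a toy model in plain Scala (no Spark): coalesce(n) merges existing partitions in place, so no data crosses a shuffle boundary, while repartition(n) redistributes every element. All names here are illustrative sketches, not Spark's actual implementation.

```scala
object PartitionDemo {
  type Partition = Vector[Int]

  // coalesce: group adjacent partitions together; elements stay with
  // their original partition's neighbors, modeling "no shuffle".
  def coalesce(parts: Vector[Partition], n: Int): Vector[Partition] = {
    val groupSize = math.ceil(parts.size.toDouble / n).toInt
    parts.grouped(groupSize).map(_.flatten).toVector
  }

  // repartition: scatter every element round-robin across n new
  // partitions, modeling a full shuffle.
  def repartition(parts: Vector[Partition], n: Int): Vector[Partition] = {
    val all = parts.flatten
    Vector.tabulate(n)(i => all.zipWithIndex.collect { case (x, j) if j % n == i => x })
  }

  def main(args: Array[String]): Unit = {
    val parts = Vector(Vector(1, 2), Vector(3, 4), Vector(5, 6), Vector(7, 8))
    println(coalesce(parts, 2))    // Vector(Vector(1, 2, 3, 4), Vector(5, 6, 7, 8))
    println(repartition(parts, 2)) // Vector(Vector(1, 3, 5, 7), Vector(2, 4, 6, 8))
  }
}
```

The toy version makes the trade-off concrete: coalesce preserves data placement (good for locality) but offers little control over balance, while repartition balances evenly at the cost of moving everything.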
On Tue, Aug 12, 2014 at 6:16 AM, ZHENG, Xu-dong dong...@gmail.com wrote:
Hi Xiangrui,
Thanks for your reply!
Yes, our data is very sparse, but RDD.repartition invoke
On Mon, Aug 11, 2014 at 10:39 PM, ZHENG, Xu-dong dong...@gmail.com
wrote:
I think this has the same effect and issue with #1, right?
On Tue, Aug 12, 2014 at 1:08 PM, Jiusheng Chen chenjiush...@gmail.com
wrote:
How about increasing the HDFS block size? E.g. the current value is 128M; we
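The arithmetic behind the block-size suggestion above: each HDFS block typically becomes one input partition, so a larger block size yields fewer (larger) partitions. A minimal sketch, with illustrative file sizes (not from the thread):

```scala
object BlockSizeDemo {
  // ceiling division: number of blocks (hence input partitions) for a file
  def numPartitions(fileBytes: Long, blockBytes: Long): Long =
    (fileBytes + blockBytes - 1) / blockBytes

  def main(args: Array[String]): Unit = {
    val file = 10L * 1024 * 1024 * 1024        // a hypothetical 10 GB file
    println(numPartitions(file, 128L << 20))   // 128 MB blocks -> 80 partitions
    println(numPartitions(file, 512L << 20))   // 512 MB blocks -> 20 partitions
  }
}
```

Fewer partitions mean fewer, larger tasks and less per-task overhead, at the cost of coarser parallelism and load balancing.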
(SparkSubmit.scala:73)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
PM, ZHENG, Xu-dong dong...@gmail.com wrote:
Hi Cheng,
I also hit some issues when I try to start the ThriftServer from a build of
the master branch (I can successfully run it from the branch-1.0-jdbc
branch). Below is my build command:
./make-distribution.sh --skip-java-test -Phadoop-2.4 -Phive
a lot of 'ANY' locality tasks, which means those tasks read data from other
nodes and become slower than tasks that read data from local memory.
I think the best approach would be like #3, but leveraging locality as much
as possible. Is there any way to do that? Any suggestions?
Thanks!
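One existing Spark knob that pushes in the direction asked about above (not mentioned in the thread itself) is `spark.locality.wait`: how long the scheduler waits for a data-local executor slot before falling back to less-local levels such as ANY. A hedged sketch; `your-app.jar` is a placeholder:

```shell
# Raise the locality wait (default 3s) so the scheduler holds out longer
# for node-local slots before scheduling ANY-locality tasks.
spark-submit \
  --conf spark.locality.wait=10s \
  --conf spark.locality.wait.node=10s \
  your-app.jar
```

This trades some scheduling latency for better locality; whether it helps depends on how long non-local reads actually take in the cluster.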
On Tue, Aug 12, 2014 at 11:46 AM, ZHENG, Xu-dong dong