Re: application failed on large dataset

2015-09-18 Thread
Hi, the issue turned out to be a memory issue. Thanks for the guidance. 周千昊 <qhz...@apache.org> wrote on Thu, Sep 17, 2015 at 12:39 PM: > Indeed, the operation in this stage is quite memory consuming. > We are trying to enable the printGCDetail option and see what is going on. > >

Re: application failed on large dataset

2015-09-16 Thread
error, did any executor die due to whatever error? > > Did you check to see if any executor restarted during your job? > > It is hard to help you with just the stack trace. You need to tell us the > whole picture of what happens when your jobs are running. > > Yong > > ---

Re: application failed on large dataset

2015-09-16 Thread
ny executor restart during > the job. > PS: the operator I am using during that stage is > rdd.glom().mapPartitions() > > > java8964 <java8...@hotmail.com> wrote on Tue, Sep 15, 2015 at 11:44 PM: > > When you saw this error, did any executor die due to whatever error? > >

Re: application failed on large dataset

2015-09-15 Thread
> From: qhz...@apache.org > Date: Tue, 15 Sep 2015 15:02:28 + > Subject: Re: application failed on large dataset > To: user@spark.apache.org > > > Has anyone run into the same problem? > 周千昊 <qhz...@apache.org> wrote on Mon, Sep 14, 2015 at 9:07 PM: > > Hi

Re: application failed on large dataset

2015-09-15 Thread
Has anyone run into the same problem? 周千昊 <qhz...@apache.org> wrote on Mon, Sep 14, 2015 at 9:07 PM: > Hi, community > I am facing a strange problem: > all executors stop responding, and then all of them fail with > ExecutorLostFailure. > when I look into yarn

application failed on large dataset

2015-09-14 Thread
Hi, community. I am facing a strange problem: all executors stop responding, and then all of them fail with ExecutorLostFailure. When I look into the YARN logs, they are full of exceptions like this: 15/09/14 04:35:33 ERROR shuffle.RetryingBlockFetcher: Exception while beginning

Re: about mr-style merge sort

2015-09-10 Thread
Hi, all. Can anyone give some tips about this issue? 周千昊 <qhz...@apache.org> wrote on Tue, Sep 8, 2015 at 4:46 PM: > Hi, community > I have an application which I am trying to migrate from MR to Spark. > It will do some calculations from Hive and output to hfile which will > be bulk l

Re: about mr-style merge sort

2015-09-10 Thread
tition of rdd rather than totally sorting the >> rdd. >> In Rdd.mapPartitions you can sort the data in one partition and try... >> On Sep 11, 2015 7:36 AM, "周千昊" <z.qian...@gmail.com> wrote: >> >>> Hi, all >>> Can anyone give some tips ab
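
The suggestion in the reply above (sort within each partition instead of sorting the whole dataset) can be sketched in plain Java without Spark. This is only an illustration under stated assumptions: partitions are modeled as in-memory lists of integers, and the class and method names (PartitionSortSketch, sortWithinPartitions, mergeSorted) are hypothetical and are not part of the Spark API or of the poster's application. The second method shows a k-way merge of the sorted runs, which is roughly what the "mr-style merge sort" in the subject line refers to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: plain lists stand in for RDD partitions.
public class PartitionSortSketch {

    // Sort every partition on its own; no element moves between partitions.
    static List<List<Integer>> sortWithinPartitions(List<List<Integer>> partitions) {
        List<List<Integer>> result = new ArrayList<>();
        for (List<Integer> partition : partitions) {
            List<Integer> copy = new ArrayList<>(partition);
            Collections.sort(copy);
            result.add(copy);
        }
        return result;
    }

    // K-way merge of already sorted runs into one sorted list.
    static List<Integer> mergeSorted(List<List<Integer>> sortedRuns) {
        // Heap entries are {value, runIndex, offsetInRun}, ordered by value.
        PriorityQueue<int[]> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int r = 0; r < sortedRuns.size(); r++) {
            if (!sortedRuns.get(r).isEmpty()) {
                heap.add(new int[] {sortedRuns.get(r).get(0), r, 0});
            }
        }
        List<Integer> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] head = heap.poll();
            merged.add(head[0]);
            // Advance within the run the smallest element came from.
            List<Integer> run = sortedRuns.get(head[1]);
            int next = head[2] + 1;
            if (next < run.size()) {
                heap.add(new int[] {run.get(next), head[1], next});
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
            Arrays.asList(5, 1, 9),
            Arrays.asList(4, 8),
            Arrays.asList(7, 2, 6, 3));
        List<List<Integer>> sorted = sortWithinPartitions(partitions);
        System.out.println(sorted);
        System.out.println(mergeSorted(sorted));
    }
}
```

The per-partition step only needs one partition in memory at a time, while the merge step holds one heap entry per run, which is why this shape tends to be cheaper than a global sort when only partition-local order is required.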

about mr-style merge sort

2015-09-08 Thread
Hi, community. I have an application which I am trying to migrate from MR to Spark. It will do some calculations from Hive and output to hfile which will be bulk loaded into an HBase table. Details as follows:
Rdd input = getSourceInputFromHive()
Rdd> mapSideResult =

serialization issue

2015-08-13 Thread
Hi, I am using Spark 1.4 and have run into an issue. I am trying to use the aggregate function:
JavaRDD<String> rdd = some rdd;
HashMap<Long, TypeA> zeroValue = new HashMap();
// add initial key-value pair for zeroValue
rdd.aggregate(zeroValue, new