Please provide the source code and any exceptions that appear in the executor 
and/or driver logs.
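
If the logs show no exception, timing each pipeline call may help show whether a few documents are simply very slow to process. Below is a minimal sketch, assuming each document is a String. The class name SlowDocumentLogger, its methods, and the threshold are hypothetical and not part of Spark; it only wraps whatever function you already call per row.

```java
import java.util.function.Function;

/** Hypothetical helper (not a Spark API): reports documents whose processing is slow. */
public final class SlowDocumentLogger {
    private SlowDocumentLogger() {}

    /**
     * Applies the pipeline to one document and writes a line to stderr when the
     * call takes at least thresholdMillis. Inside a Spark task, stderr output
     * ends up in the executor's stderr log.
     */
    public static <R> R timed(String doc, Function<String, R> pipeline, long thresholdMillis) {
        long start = System.nanoTime();
        try {
            return pipeline.apply(doc);
        } finally {
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
            if (elapsedMillis >= thresholdMillis) {
                System.err.println(describe(doc, elapsedMillis));
            }
        }
    }

    /** Builds the log line: elapsed time, document length, and a short prefix of the text. */
    public static String describe(String doc, long elapsedMillis) {
        String head = doc.length() <= 40 ? doc : doc.substring(0, 40) + "...";
        return "slow document: " + elapsedMillis + " ms, " + doc.length()
                + " chars, starts with: " + head;
    }
}
```

In the job this could be called from the existing map, for example MyRDD.map(doc -> SlowDocumentLogger.timed(doc, MyPipeline::process, 60000L)), where MyPipeline::process stands in for your own pipeline entry point. Any lines it prints would then appear in the stderr log of the executor running the slow task.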


> On 26. Oct 2017, at 08:42, Donni Khan <prince.don...@googlemail.com> wrote:
> 
> Hi,
> I'm applying preprocessing methods to a large text dataset using Spark with 
> Java. I created my own NLP pipeline as plain Java code and call it in the map 
> function like this:
> 
> MyRDD.map(call nlp pipeline for each row)
> 
> I run my job on a cluster of 14 machines (32 cores and about 140 GB each). The 
> job runs correctly and distributes the documents across executors, but it 
> gets stuck on the last task for several minutes.
> I looked at the job details and found that most of the documents are processed 
> by several executors, but one task gets stuck on a small number of documents. 
> It looks like the task is waiting for something; after 10-20 minutes the task 
> continues processing the remaining documents and finishes.
> 
> I also tried different configurations, but the result is still the same.
> Any help?
> 
> thanks,
> Donni
> 
> 
