Re: can I do spark-submit --jars [s3://bucket/folder/jar_file]? or --jars

2017-07-31 Thread
When using spark-submit, the application jar along with any jars included with the --jars option will be automatically transferred to the cluster. URLs supplied after --jars must be separated by commas. That list is included on the driver and executor classpaths. Directory expansion does not work
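
Since --jars takes a comma-separated list and does not expand directories, one way to handle a folder of jars is to list the files yourself before calling spark-submit. The following is only a minimal sketch: the helper name build_spark_submit and the paths in the usage note are hypothetical, and it only covers local directories.

```python
import glob
import os


def build_spark_submit(app_jar, jar_dirs=(), extra_jars=()):
    """Return a spark-submit argument list with a comma-separated --jars value.

    Hypothetical helper for illustration; it is not part of Spark.
    """
    jars = list(extra_jars)
    for directory in jar_dirs:
        # --jars does not expand directories, so list each jar file explicitly.
        jars.extend(sorted(glob.glob(os.path.join(directory, "*.jar"))))
    cmd = ["spark-submit"]
    if jars:
        # URLs after --jars must be separated by commas.
        cmd += ["--jars", ",".join(jars)]
    cmd.append(app_jar)
    return cmd
```

For example, build_spark_submit("app.jar", jar_dirs=["/opt/libs"]) returns a list that can be passed to subprocess.run; /opt/libs is a placeholder directory.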

Re: DAGScheduler - two runtimes

2017-08-15 Thread
The ResultStage time is the time taken by your job's last stage. "Job 13 finished: reduce at VertexRDDImpl.scala:90, took 0.035546 s" is the time taken by the whole job.
2017-08-14 18:58 GMT+08:00 Kaepke, Marc:
> Hi everyone,
>
> I’m a Spark newbie and have one question:
> What is the

Re: ClassNotFoundException for Workers

2017-07-25 Thread
Ensure that com.amazonaws.services.s3.AmazonS3ClientBuilder is on your classpath, which includes your application jar and any attached executor jars.
2017-07-20 6:12 GMT+08:00 Noppanit Charassinvichai:
> I have this spark job which is using S3 client in mapPartition. And I get
> this
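
A quick way to see whether the class is actually packaged is to look inside the jars you ship. This is a rough sketch under the assumption that the jars are local files; the function name jars_containing is made up for illustration.

```python
import zipfile


def jars_containing(class_name, jar_paths):
    """Return the jars (zip archives) that contain the given fully qualified class."""
    # A class a.b.C is stored inside a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    found = []
    for path in jar_paths:
        with zipfile.ZipFile(path) as jar:
            if entry in jar.namelist():
                found.append(path)
    return found
```

Calling jars_containing("com.amazonaws.services.s3.AmazonS3ClientBuilder", [...]) with your application jar and the jars passed to the executors returns an empty list if none of them carries the class.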

Re: Spark Job crash due to File Not found when shuffle intermittently

2017-07-24 Thread
You can also check whether the executor node has enough free space left to store the shuffle files.
2017-07-25 13:01 GMT+08:00 周康 <zhoukang199...@gmail.com>:
> First,spark will handle task fail so if job ended normally , this error
> can be ignore.
> Second, when using BypassMergeSo
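
The free-space check suggested in this reply can be scripted on each executor node. This is a minimal sketch: the directory to check (for example whatever spark.local.dir points to) and the required size are assumptions you have to supply yourself.

```python
import shutil


def has_free_space(path, required_bytes):
    """Return True if the filesystem holding `path` has at least `required_bytes` free."""
    usage = shutil.disk_usage(path)  # named tuple with total, used and free, in bytes
    return usage.free >= required_bytes
```

For instance, has_free_space("/tmp", 10 * 1024**3) asks whether roughly 10 GiB is free under /tmp; both values are placeholders.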

Re: Spark Job crash due to File Not found when shuffle intermittently

2017-07-24 Thread
First, Spark will handle task failures, so if the job ended normally, this error can be ignored. Second, when using BypassMergeSortShuffleWriter, it first writes the data file and then writes an index file. You can check for "Failed to delete temporary index file at" or "fail to rename file" in the related executor
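
Searching an executor log for the two messages quoted above could look like the sketch below. The function name and the log path in the usage note are hypothetical; the search strings are copied from the message as written and may not match the exact wording in your Spark version.

```python
# Substrings quoted in the message above; matched literally and case-sensitively.
PATTERNS = (
    "Failed to delete temporary index file at",
    "fail to rename file",
)


def find_shuffle_index_errors(lines):
    """Return (line_number, text) pairs for lines containing any of PATTERNS."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(pattern in line for pattern in PATTERNS):
            hits.append((number, line.rstrip("\n")))
    return hits
```

Usage: open the executor's log file (for example a hypothetical executor.log) and pass the file object to find_shuffle_index_errors.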

Re: Spark Job crash due to File Not found when shuffle intermittently

2017-07-24 Thread
be no permission or no space, or not enough file descriptors)
2017-07-25 13:05 GMT+08:00 周康 <zhoukang199...@gmail.com>:
> You can also check whether space left in the executor node enough to store
> shuffle file or not.
>
> 2017-07-25 13:01 GMT+08:00 周康 <zhoukang199...@gmail.c

Re: How to list only erros for a stage

2017-07-24 Thread
Maybe you can click the Status column header in the Tasks section; the failed tasks will then appear first.
2017-07-25 10:02 GMT+08:00 jeff saremi:
> On the Spark status UI you can click Stages on the menu and see Active
> (and completed stages). For the active stage, you can see

Re: running spark application compiled with 1.6 on spark 2.1 cluster

2017-07-27 Thread
From Spark 2.x, the package of the Logging class has changed.
2017-07-27 23:45 GMT+08:00 Marcelo Vanzin:
> On Wed, Jul 26, 2017 at 10:45 PM, satishl wrote:
> > is this a supported scenario - i.e., can I run app compiled with spark
> 1.6
> > on a 2.+ spark

Re: SPARK Storagelevel issues

2017-07-28 Thread
> I have done all of that, but my question is "why should a 62 MB data give
> memory error when we have over 2 GB of memory available".
>
> Therefore all that is mentioned by Zhoukang is not pertinent at all.
>
>
> Regards,
> Gourav Sengupta
>
> On Fri, Jul 2

Re: Job keeps aborting because of org.apache.spark.shuffle.FetchFailedException: Failed to connect to server/ip:39232

2017-07-29 Thread
I think you should check the RPC target; maybe the NodeManager has a memory issue such as GC or something else. Check that first. Also, I wonder why you set --executor-cores 8?
2017-07-29 7:40 GMT+08:00 jeff saremi:
> asking this on a tangent:
>
> Is there anyway for the shuffle

Re: SPARK Storagelevel issues

2017-07-27 Thread
testdf.persist(pyspark.storagelevel.StorageLevel.MEMORY_ONLY_SER): maybe the StorageLevel should be changed. Also check your config "spark.memory.storageFraction", whose default value is 0.5.
2017-07-28 3:04 GMT+08:00 Gourav Sengupta:
> Hi,
>
> I cached in a table in a large EMR
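
To get a feel for what spark.memory.storageFraction means in numbers, here is a back-of-the-envelope sketch. It assumes Spark's unified memory model with 300 MB of reserved memory and spark.memory.fraction at its Spark 2.x default of 0.6; those two values and the 2048 MB heap in the usage note are assumptions, not figures from the thread, and the JVM may report slightly less heap than configured.

```python
RESERVED_MB = 300  # assumed fixed reservation in the unified memory model


def storage_region_mb(heap_mb, memory_fraction=0.6, storage_fraction=0.5):
    """Estimate the part of the executor heap protected from eviction for cached blocks.

    memory_fraction stands for spark.memory.fraction and storage_fraction for
    spark.memory.storageFraction; the defaults used here are assumptions.
    """
    # Memory shared by execution and storage after the reservation is set aside.
    unified_mb = (heap_mb - RESERVED_MB) * memory_fraction
    # Portion of the shared region that cached blocks keep under execution pressure.
    return unified_mb * storage_fraction
```

With an illustrative 2048 MB heap this gives (2048 - 300) * 0.6 * 0.5, a little over 500 MB. Storage can borrow from execution when execution is idle, so treat the result as a rough floor rather than a hard cap.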

Re: A bug in spark or hadoop RPC with kerberos authentication?

2017-08-22 Thread
You can check out the Hadoop**credential class in Spark YARN. During spark-submit, it will use the config on the classpath. I wonder how you reference your own config?

Re: Netty Issues

2017-08-21 Thread
Using the Maven Shade plugin may help.
2017-08-21 18:43 GMT+08:00 Pascal Stammer:
> Hi all,
>
> i got following exception:
>
> 17/08/21 12:33:56 ERROR TransportClient: Failed to send RPC
> 5493448667271613330 to /10.210.85.3:52482: java.lang.AbstractMethodError
>

Re: a set of practice and LAB

2017-08-21 Thread
For Spark, you can dive into the examples source folder.
2017-08-21 4:49 GMT+08:00 Mohsen Pahlevanzadeh:
> Dear All,
>
>
> I need to a set of practice and LAB with sparc and hadoop, You will make
> me happy for your help.
>
> Yours,
> Mohsen
>
>

Re: java heap space

2017-09-03 Thread
Maybe you can repartition?
2017-09-04 9:25 GMT+08:00 KhajaAsmath Mohammed:
> Hi,
>
> I am getting java.lang.OutOfMemoryError: Java heap space error whenever I
> ran the spark sql job.
>
> I came to conclusion issue is because of reading number of files from
> spark.
>