Re: Launch a pyspark Job From UI

2018-06-11 Thread Sathishkumar Manimoorthy
You can use Zeppelin as well

https://zeppelin.apache.org/docs/latest/interpreter/spark.html

Thanks,
Sathish

On Mon, Jun 11, 2018 at 4:25 PM, hemant singh  wrote:

> You can explore Livy https://dzone.com/articles/quick-start-with-apache-livy
>
> On Mon, Jun 11, 2018 at 3:35 PM, srungarapu vamsi <
> srungarapu1...@gmail.com> wrote:
>
>> Hi,
>>
>> I am looking for applications that can trigger Spark jobs from a UI.
>> Are there any such applications available?
>>
>> I have checked spark-jobserver, which lets us expose an API to submit
>> a Spark application.
>>
>> Are there any other alternatives through which I can submit PySpark jobs
>> from a UI?
>>
>> Thanks,
>> Vamsi
>>
>
>
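
Following up on the Livy pointer above, here is a rough sketch of what a UI backend could send to Livy's batch REST API to run a PySpark script. This is only an illustration: the Livy endpoint, the HDFS path of the script, and the job arguments below are assumptions, not details from this thread.

import requests

LIVY_URL = "http://localhost:8998"  # assumed Livy endpoint

payload = {
    "file": "hdfs:///user/vamsi/jobs/my_pyspark_job.py",  # hypothetical script location
    "args": ["2018-06-11"],                               # hypothetical job arguments
    "name": "ui-triggered-job",
}

# Create the batch; Livy responds with the batch id and its initial state.
resp = requests.post(f"{LIVY_URL}/batches", json=payload)
resp.raise_for_status()
batch = resp.json()
print("submitted batch", batch["id"], "state:", batch["state"])

# A UI backend would poll this endpoint to show job progress to the user.
state = requests.get(f"{LIVY_URL}/batches/{batch['id']}/state").json()
print("current state:", state["state"])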


Re: Spark YARN Error - triggering spark-shell

2018-06-08 Thread Sathishkumar Manimoorthy
It seems your Spark-on-YARN application is not able to get its
application master:

org.apache.spark.SparkException: Yarn application has already ended!
It might have been killed or unable to launch application master.


Check the YARN application logs once (yarn logs -applicationId <application_id>)
to see why the application master failed.

Thanks,
Sathish


On Fri, Jun 8, 2018 at 2:22 PM, Jeff Zhang  wrote:

>
> Check the yarn AM log for details.
>
>
>
> Aakash Basu wrote on Fri, Jun 8, 2018 at 4:36 PM:
>
>> Hi,
>>
>> Getting this error when trying to run Spark Shell using YARN -
>>
>> Command: spark-shell --master yarn --deploy-mode client
>>
>> 2018-06-08 13:39:09 WARN  Client:66 - Neither spark.yarn.jars nor 
>> spark.yarn.archive is set, falling back to uploading libraries under 
>> SPARK_HOME.
>> 2018-06-08 13:39:25 ERROR SparkContext:91 - Error initializing SparkContext.
>> org.apache.spark.SparkException: Yarn application has already ended! It 
>> might have been killed or unable to launch application master.
>>
>>
>> The last half of stack-trace -
>>
>> 2018-06-08 13:56:11 WARN  YarnSchedulerBackend$YarnSchedulerEndpoint:66 - 
>> Attempted to request executors before the AM has registered!
>> 2018-06-08 13:56:11 WARN  MetricsSystem:66 - Stopping a MetricsSystem that 
>> is not running
>> org.apache.spark.SparkException: Yarn application has already ended! It 
>> might have been killed or unable to launch application master.
>>   at 
>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:89)
>>   at 
>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:63)
>>   at 
>> org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:164)
>>   at org.apache.spark.SparkContext.<init>(SparkContext.scala:500)
>>   at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2486)
>>   at 
>> org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:930)
>>   at 
>> org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:921)
>>   at scala.Option.getOrElse(Option.scala:121)
>>   at 
>> org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)
>>   at org.apache.spark.repl.Main$.createSparkSession(Main.scala:103)
>>   ... 55 elided
>> <console>:14: error: not found: value spark
>>import spark.implicits._
>>   ^
>> <console>:14: error: not found: value spark
>>import spark.sql
>>
>>
>> Tried putting spark-yarn_2.11-2.3.0.jar in Hadoop YARN, but it is still not
>> working. Anything else to try?
>>
>> Thanks,
>> Aakash.
>>
>
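
As an aside, the spark.yarn.jars / spark.yarn.archive warning quoted above is not the cause of the failure; it only means Spark re-uploads everything under SPARK_HOME/jars on every submit. The real answer still lies in the YARN application master logs. For completeness, here is a hedged sketch of silencing that warning by pointing spark.yarn.archive at a pre-staged archive on HDFS; the archive path is an assumption, and this is normally set once in spark-defaults.conf rather than in code.

# Assumes Spark's jars were archived and uploaded beforehand, for example:
#   jar cv0f spark-libs.jar -C $SPARK_HOME/jars/ .
#   hdfs dfs -put spark-libs.jar /apps/spark/
# The /apps/spark path is a placeholder, not something from this thread.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("yarn-client-smoke-test")
         .master("yarn")
         .config("spark.submit.deployMode", "client")
         .config("spark.yarn.archive", "hdfs:///apps/spark/spark-libs.jar")
         .getOrCreate())

# If this prints an application id, the application master actually started.
print(spark.sparkContext.applicationId)
spark.stop()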


Re: [Spark-Submit] Where to store data files while running job in cluster mode?

2017-09-29 Thread Sathishkumar Manimoorthy
Place it in HDFS and reference the HDFS path in your code; a short sketch
follows after the quoted thread.

Thanks,
Sathish

On Fri, Sep 29, 2017 at 3:31 PM, Gaurav1809  wrote:

> Hi All,
>
> I have a multi-node Spark cluster (1 master, 2 workers). The job reads CSV
> file data and works fine when run in local mode (local[*]). However, when the
> same job is run in cluster mode (spark://HOST:PORT), it is not able to read
> the file. How should I reference the files, or where should I store them?
> Currently the CSV data file is on the master (from where the job is
> submitted).
>
> The following code works fine in local mode but not in cluster mode.
>
> val spark = SparkSession
>   .builder()
>   .appName("SampleFlightsApp")
>   .master("spark://masterIP:7077") // change it to .master("local[*]") for local mode
>   .getOrCreate()
>
> val flightDF = spark.read.option("header", true).csv("/home/username/sampleflightdata")
> flightDF.printSchema()
>
> Error: FileNotFoundException: File file:/home/gaurav/sampleflightdata does
> not exist
>
>
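
Concretely, the suggestion above is to copy the file into HDFS once (for example with hdfs dfs -put /home/username/sampleflightdata /data/) and then read it through an hdfs:// URI, so that every worker resolves the same path; a local /home/... path only exists on the machine it was copied to. The quoted code is Scala; below is the same idea sketched in PySpark, with an assumed namenode address and /data prefix.

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("SampleFlightsApp")
         .master("spark://masterIP:7077")  # same standalone master as in the quoted code
         .getOrCreate())

# Any filesystem visible to all nodes works; HDFS is the usual choice.
flight_df = (spark.read
             .option("header", True)
             .csv("hdfs://namenode:8020/data/sampleflightdata"))  # assumed HDFS location
flight_df.printSchema()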


Re: Debugging Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

2017-09-26 Thread Sathishkumar Manimoorthy
@Ayan

It seems to be running on Spark standalone, not on YARN, I guess; see the
resource-sizing sketch at the end of this thread.

Thanks,
Sathish

On Tue, Sep 26, 2017 at 9:09 PM, ayan guha  wrote:

> I would check the queue you are submitting the job to, assuming it is YARN...
>
> On Tue, Sep 26, 2017 at 11:40 PM, JG Perrin  wrote:
>
>> Hi,
>>
>>
>>
>> I get the infamous:
>>
>> Initial job has not accepted any resources; check your cluster UI to
>> ensure that workers are registered and have sufficient resources
>>
>>
>>
>> I run the app via Eclipse, connecting:
>>
>> SparkSession spark = SparkSession.builder()
>>     .appName("Converter - Benchmark")
>>     .master(ConfigurationManager.getMaster())
>>     .config("spark.cores.max", "4")
>>     .config("spark.executor.memory", "16g")
>>     .getOrCreate();
>>
>>
>>
>>
>>
>> Everything seems ok on the cluster side.
>>
>> I probably missed something super obvious, but can’t find it…
>>
>>
>>
>> Any help/hint is welcome! - TIA
>>
>>
>>
>> jg
>>
>>
>>
>>
>>
>>
>>
>
>
>
> --
> Best Regards,
> Ayan Guha
>
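
For what it is worth, on a standalone cluster this message most often means the application asks for more cores or memory per executor than any single worker advertises on the master UI, so no worker can make an offer; the 16g executor request in the quoted Java code is a likely suspect. Below is a hedged sketch of scaling the request down, written in PySpark rather than Java; the master URL and the 2-core / 4g figures are placeholders, not values from this thread.

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("Converter - Benchmark")
         .master("spark://master-host:7077")     # placeholder master URL
         .config("spark.cores.max", "2")         # stay within one worker's advertised cores
         .config("spark.executor.memory", "4g")  # stay within one worker's advertised memory
         .getOrCreate())

# If executors register now, the original request simply did not fit on any worker.
print(spark.sparkContext.defaultParallelism)
spark.stop()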