Re: Use SparkContext in Web Application

2018-10-03 Thread Girish Vasmatkar
On Mon, Oct 1, 2018 at 12:18 PM Girish Vasmatkar <girish.vasmat...@hotwaxsystems.com> wrote:
> Hi All
>
> We are very early into our Spark days so the following may sound like a novice question :) I will try to keep this as short as possible.
>
> We are trying to use Spark to introduce a
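A common pattern for this use case (a sketch, not taken from the thread): keep one long-lived SparkSession per web-server process rather than creating a SparkContext per HTTP request, since only one SparkContext may exist per JVM. The `builder` factory below is a hypothetical parameter standing in for the real pyspark builder call.

```python
# Sketch of a per-process SparkSession singleton for a web application.
# `builder` is a hypothetical factory, e.g.
#   lambda: SparkSession.builder.appName("web-app").getOrCreate()
# (getOrCreate itself already returns any existing session in the JVM).
import threading

_lock = threading.Lock()
_session = None

def get_spark(builder=None):
    """Return one shared session for the whole process, creating it lazily."""
    global _session
    with _lock:
        if _session is None:
            # In a real deployment, call the pyspark builder here; the
            # object() placeholder just keeps this sketch self-contained.
            _session = builder() if builder else object()
        return _session
```

Every request handler then calls `get_spark()` instead of constructing its own context, and the session is stopped only at application shutdown.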

Re: Use SparkContext in Web Application

2018-10-03 Thread Girish Vasmatkar
All,
Can someone please shed some light on the above query? Any help is greatly appreciated.
Thanks,
Girish Vasmatkar
HotWax Systems
On Thu, Oct 4, 2018 at 10:25 AM Girish Vasmatkar <girish.vasmat...@hotwaxsystems.com> wrote:
> On Mon, Oct 1, 2018 at 12:18 PM Girish Vasmatkar <

Re: How to do a broadcast join using raw Spark SQL 2.3.1 or 2.3.2?

2018-10-03 Thread kathleen li
Not sure what you mean by “raw” Spark SQL, but there is one parameter that affects whether the optimizer chooses a broadcast join automatically: spark.sql.autoBroadcastJoinThreshold. You can read the Spark docs on that parameter, and use explain to check whether your join uses a broadcast.
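If “raw Spark SQL” means SQL text rather than the DataFrame API, the broadcast can also be requested explicitly with a hint (supported since Spark 2.2). A sketch, with hypothetical table names `big` and `small`:

```python
# SQL-text broadcast hint: ask the optimizer to broadcast the small side.
broadcast_query = """
SELECT /*+ BROADCAST(s) */ b.id, b.value, s.label
FROM big b
JOIN small s ON b.id = s.id
"""

# Alternatively, let the optimizer decide on its own: any table whose
# estimated size is below spark.sql.autoBroadcastJoinThreshold
# (default 10 MB) is broadcast automatically. Set it to -1 to disable:
#   spark.conf.set("spark.sql.autoBroadcastJoinThreshold", -1)

# With a running SparkSession `spark`, verify the plan:
#   spark.sql(broadcast_query).explain()
#   # look for BroadcastHashJoin in the physical plan
```

The hint wins even when the table is above the size threshold, which is why it is the usual answer for forcing a broadcast from SQL text alone.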

How to do a broadcast join using raw Spark SQL 2.3.1 or 2.3.2?

2018-10-03 Thread kant kodali
Hi All,
How do I do a broadcast join using raw Spark SQL in 2.3.1 or 2.3.2?
Thanks

Re: Back to SQL

2018-10-03 Thread Reynold Xin
No, we used to have that (for views) but it wasn’t working well enough so we removed it.
On Wed, Oct 3, 2018 at 6:41 PM Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
> Hi everyone,
> Is there any known way to go from a Spark SQL logical plan (optimised?) back to a SQL query?
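Since there is no supported plan-to-SQL conversion in Spark 2.x, one workaround (a sketch, not from the thread) is simply to keep the original SQL text alongside the DataFrame it produced. The helper below is hypothetical; the commented line also shows how the optimized plan itself can at least be printed via an internal, non-stable API:

```python
# Hypothetical bookkeeping helper: run a query and remember its SQL source,
# since Spark cannot regenerate SQL from a logical plan.
queries = {}

def sql_tracked(spark, text):
    """Run `text` via spark.sql and record it, keyed by the result's id().
    Note: id() keys are only valid while the DataFrame is alive."""
    df = spark.sql(text)
    queries[id(df)] = text
    return df

# For inspection (internal API, subject to change), with a SparkSession:
#   df = spark.sql("SELECT id FROM range(10) WHERE id > 5")
#   print(df._jdf.queryExecution().optimizedPlan().toString())
```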

Back to SQL

2018-10-03 Thread Olivier Girardot
Hi everyone,
Is there any known way to go from a Spark SQL logical plan (optimised?) back to a SQL query?
Regards,
Olivier.

Restarting a failed Spark streaming job running on top of a yarn cluster

2018-10-03 Thread jcgarciam
Hi Folks,
We have a few Spark streaming jobs running on a YARN cluster, and from time to time a job needs to be restarted (it was killed for external or other reasons). Once we submit the new job we are faced with the following exception: ERROR spark.SparkContext: Failed to add
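The exception above is truncated, so the root cause may differ, but the usual restart-safe pattern for Spark Streaming is checkpoint recovery (a sketch; the checkpoint path and batch interval below are assumptions):

```python
# Checkpoint-recovery pattern: on resubmission, StreamingContext.getOrCreate
# rebuilds the context from the checkpoint directory instead of starting
# from scratch.
CHECKPOINT_DIR = "hdfs:///checkpoints/my-streaming-job"  # hypothetical path
BATCH_SECONDS = 10                                       # assumed interval

# With pyspark available:
#   from pyspark import SparkConf, SparkContext
#   from pyspark.streaming import StreamingContext
#
#   def create_context():
#       sc = SparkContext(conf=SparkConf().setAppName("my-streaming-job"))
#       ssc = StreamingContext(sc, BATCH_SECONDS)
#       # ... define the DStream pipeline here ...
#       ssc.checkpoint(CHECKPOINT_DIR)
#       return ssc
#
#   ssc = StreamingContext.getOrCreate(CHECKPOINT_DIR, create_context)
#   ssc.start()
#   ssc.awaitTermination()

# On YARN, also consider letting the ApplicationMaster retry after
# transient failures:
yarn_conf = {"spark.yarn.maxAppAttempts": "4"}
```

Note that the pipeline must be defined inside the creation function; defining DStreams after recovery is a common source of restart errors.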

Re: How to read remote HDFS from Spark using username?

2018-10-03 Thread Aakash Basu
If it is so, how do we update/fix the firewall issue?
On Wed, Oct 3, 2018 at 1:14 PM Jörn Franke wrote:
> Looks like a firewall issue
>
> On 03.10.2018 at 09:34, Aakash Basu wrote:
> > The stacktrace is below -
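Before changing firewall rules, it helps to confirm the diagnosis: check from the Spark driver machine whether the NameNode RPC port is reachable at all. A minimal check (host and port below are assumptions; 8020 is the common NameNode RPC default):

```python
# Quick reachability probe from the driver machine to the remote NameNode.
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# port_open("namenode.example.com", 8020)
# False usually means a firewall or routing problem rather than a Spark issue.
```

If the port is closed, the fix is on the network side (open the NameNode and DataNode ports between the clusters), not in Spark configuration.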

Re: How to read remote HDFS from Spark using username?

2018-10-03 Thread Jörn Franke
Looks like a firewall issue
> On 03.10.2018 at 09:34, Aakash Basu wrote:
> > The stacktrace is below -
> > Py4JJavaError Traceback (most recent call last)
> > in ()
> > 1 df =

Re: How to read remote HDFS from Spark using username?

2018-10-03 Thread Aakash Basu
The stacktrace is below -
> Py4JJavaError Traceback (most recent call last)
> in ()
> 1 df = spark.read.load("hdfs://

How to read remote HDFS from Spark using username?

2018-10-03 Thread Aakash Basu
Hi,
I have to read data stored in HDFS on a different machine, and it needs to be accessed through Spark. How do I do that? The full HDFS address along with the port doesn't seem to work. Has anyone done this before?
Thanks,
AB.
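When the remote cluster uses simple (non-Kerberos) authentication, Hadoop picks up the effective user from the HADOOP_USER_NAME environment variable, and the path must carry the full NameNode host and RPC port. A sketch (host, port, path, and user below are all assumptions, not from the thread):

```python
# Read a remote HDFS path as a specific user (simple auth only; with
# Kerberos you would authenticate via kinit/keytab instead).
import os

os.environ["HADOOP_USER_NAME"] = "hdfsuser"  # hypothetical remote user
remote_path = "hdfs://namenode.example.com:8020/data/events.parquet"

# With a SparkSession `spark`, the read would then be:
#   df = spark.read.parquet(remote_path)
# 8020 is the common NameNode RPC default; make sure it is reachable from
# the driver and executors first, since a blocked port fails the same way.
```

Setting the variable must happen before the JVM/SparkContext starts, or it will not take effect.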