On Mon, Oct 1, 2018 at 12:18 PM Girish Vasmatkar <
girish.vasmat...@hotwaxsystems.com> wrote:
> Hi All
>
> We are very early into our Spark days so the following may sound like a
> novice question :) I will try to keep this as short as possible.
>
> We are trying to use Spark to introduce a
Hi All,
Can someone please shed some light on the above query? Any help is greatly
appreciated.
Thanks,
Girish Vasmatkar
HotWax Systems
On Thu, Oct 4, 2018 at 10:25 AM Girish Vasmatkar <
girish.vasmat...@hotwaxsystems.com> wrote:
Not sure what you mean by "raw" Spark SQL, but there is one parameter that
affects whether the optimizer chooses a broadcast join automatically:
spark.sql.autoBroadcastJoinThreshold
You can read the Spark docs about this setting, and use EXPLAIN to check
whether your join uses a broadcast.
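For reference, a minimal sketch of both approaches in plain Spark SQL (the table names `big_fact` and `small_dim` and the column names are placeholders; the BROADCAST hint has been available since Spark 2.2, so it applies to 2.3.x):

```sql
-- Raise or lower the auto-broadcast threshold (in bytes); -1 disables it.
SET spark.sql.autoBroadcastJoinThreshold = 10485760;

-- Or force a broadcast of one side explicitly with a hint.
SELECT /*+ BROADCAST(d) */ f.order_id, d.region
FROM big_fact f
JOIN small_dim d ON f.region_id = d.region_id;

-- Verify the chosen strategy: the physical plan should show BroadcastHashJoin.
EXPLAIN
SELECT /*+ BROADCAST(d) */ f.order_id, d.region
FROM big_fact f
JOIN small_dim d ON f.region_id = d.region_id;
```

The hint overrides the threshold for that one query, which is usually safer than raising the global setting.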
Hi All,
How do I do a broadcast join using raw Spark SQL on 2.3.1 or 2.3.2?
Thanks
No, we used to have that (for views) but it wasn't working well enough, so
we removed it.
On Wed, Oct 3, 2018 at 6:41 PM Olivier Girardot <
o.girar...@lateral-thoughts.com> wrote:
> Hi everyone,
> Is there any known way to go from a Spark SQL Logical Plan (optimised?)
> back to a SQL query?
Hi everyone,
Is there any known way to go from a Spark SQL Logical Plan (optimised?)
back to a SQL query?
Regards,
Olivier.
Hi Folks,
We have a few Spark streaming jobs running on a YARN cluster, and from
time to time a job needs to be restarted (it was killed due to external
reasons or otherwise).
Once we submit the new job, we are faced with the following exception:
ERROR spark.SparkContext: Failed to add
If that is so, how do we fix the firewall issue?
On Wed, Oct 3, 2018 at 1:14 PM Jörn Franke wrote:
> Looks like a firewall issue
>
> On 03.10.2018 at 09:34, Aakash Basu wrote:
>
> The stacktrace is below -
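If a firewall between the driver and the cluster is indeed the cause, one common approach is to pin Spark's otherwise-random ports so they can be opened explicitly on the firewall. A sketch for spark-defaults.conf (the port numbers are assumptions; pick ones that are free on your cluster):

```
# Fixed port for the driver's RPC endpoint (default: random).
spark.driver.port               40000
# Fixed port for the driver's block manager (default: random).
spark.driver.blockManager.port  40001
# Web UI port (default 4040); open it only if you need remote UI access.
spark.ui.port                   4040
```

With these set, the firewall rules only need to allow the pinned ports between the driver host and the cluster nodes, instead of the full ephemeral range.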
Looks like a firewall issue
> On 03.10.2018 at 09:34, Aakash Basu wrote:
>
> The stacktrace is below -
>
>> ---
>> Py4JJavaError Traceback (most recent call last)
>> in ()
>> > 1 df =
The stacktrace is below -
---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
in ()
----> 1 df = spark.read.load("hdfs://
Hi,
I have to read data stored in the HDFS of a different machine, and it
needs to be accessed through Spark.
How do I do that? The full HDFS address along with the port doesn't seem
to work.
Has anyone done this before?
Thanks,
AB.
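For what it's worth, a remote HDFS path can usually be read by pointing Spark at the NameNode's RPC endpoint, i.e. the value of fs.defaultFS on the remote cluster. A sketch in Spark SQL (the host `namenode-host`, port 8020, and the path are placeholders; note it is the NameNode RPC port, commonly 8020 or 9000, not the web UI port 50070, that Spark needs):

```sql
-- Hypothetical remote path; use the remote cluster's fs.defaultFS endpoint.
CREATE TEMPORARY VIEW remote_data
USING parquet
OPTIONS (path 'hdfs://namenode-host:8020/user/data/events');

SELECT COUNT(*) FROM remote_data;
```

If this still fails, the usual suspects are network reachability to the NameNode and DataNode ports, or an HDFS HA setup that requires the remote cluster's hdfs-site.xml on the Spark classpath.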