Re: Spark 2.4.3 with hadoop 3.2 docker image.

2019-07-06 Thread Julien Laurenceau
Hi, did you try using the image built by Mesosphere? I am not sure they already build the 2.4 / 3.2 combo, but they provide a project on GitHub that can be used to generate your custom combo. It is named mesosphere/spark-build. Regards. On Thu, 4 Jul 2019 at 19:13, José Luis Pedrosa wrote: >

Spark and Java10

2019-07-06 Thread Jack Kolokasis
Hello, I am trying to use Apache Spark v2.3.1 with Java 10, but I cannot. The Spark documentation states that Spark works with Java 8+. So, has anyone tried to use Apache Spark with Java 10? Thanks for your help, Iacovos

Re: Attempting to avoid a shuffle on join

2019-07-06 Thread Chris Teoh
DataFrames have a partitionBy function too. You can avoid a shuffle if one of your datasets is small enough to broadcast. On Thu, 4 Jul 2019, 7:34 am, Mkal wrote: > Please keep in mind I'm fairly new to Spark. > I have some Spark code where I load two text files as datasets and after > some >
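
To illustrate the broadcast suggestion in the reply above, here is a minimal plain-Python sketch of the idea behind a broadcast (map-side) hash join. It is not Spark code: the function name `broadcast_hash_join` and the sample data are hypothetical, and the "partitions" are just Python lists. In Spark itself, the corresponding hint is the `broadcast()` function in `pyspark.sql.functions` (or `org.apache.spark.sql.functions`), applied to the smaller side of the join.

```python
from collections import defaultdict


def broadcast_hash_join(large_partitions, small_rows):
    """Inner-join each partition of the large side against a full copy of the small side."""
    # Build the lookup table once from the small dataset. Conceptually, this
    # is the structure that would be copied ("broadcast") to every worker.
    lookup = defaultdict(list)
    for key, value in small_rows:
        lookup[key].append(value)

    joined_partitions = []
    for partition in large_partitions:
        # Each partition is joined locally against the lookup table, so no
        # rows of the large side have to move between partitions (no shuffle).
        joined = [
            (key, left_value, right_value)
            for key, left_value in partition
            for right_value in lookup.get(key, [])
        ]
        joined_partitions.append(joined)
    return joined_partitions


# Hypothetical data: the large side is already split into two partitions.
large_partitions = [
    [(1, "a"), (2, "b")],
    [(2, "c"), (3, "d")],
]
small_rows = [(1, "x"), (2, "y")]

result = broadcast_hash_join(large_partitions, small_rows)
```

The large side keeps its original partitioning throughout, which is why this approach only pays off when the small side fits comfortably in memory on every worker.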