Hi Shyam,
It would help if you mentioned what you are using as the --master URL. Is it
running on YARN, Mesos, or a standalone Spark cluster?
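(As far as I know, typical values are --master yarn, --master
mesos://host:5050, --master spark://host:7077 for standalone, or --master
local[*] for a local run.)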
That said, I ran into a similar issue in my earlier trials with Spark, where I
created connections to a number of external databases such as Cassandra within
the Driver (or main
Dear Community,
From what I understand, Spark uses a variation of Semantic Versioning[1],
but that information is not enough for me to determine whether versions are
compatible with one another.
For example, if my cluster is running Spark 2.3.1, can I develop using API
additions in Spark 2.
Hi All,
countDistinct on a DataFrame returns different results every time it is run.
I would expect that with approxCountDistinct, but why does it happen even with
countDistinct()? Is there a way to get an accurate, deterministic count using
PySpark?
--
Regards,
Rishi Shah
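Not sure of the root cause here, but for an exact count this kind of PySpark
snippet is what I would try (a minimal sketch; the path and column name are
made up):

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()
    df = spark.read.parquet("/path/to/data")  # hypothetical input

    # Exact distinct count: deterministic for a fixed, materialized input
    exact = df.select(F.countDistinct("user_id")).first()[0]

    # Approximate count: faster, relative error controlled by rsd
    approx = df.select(F.approx_count_distinct("user_id", rsd=0.01)).first()[0]

If the input itself is recomputed between runs (e.g. a non-deterministic
source or transformation upstream), even countDistinct can appear to change,
so persisting the DataFrame first is worth trying.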
Hey Oliver,
I am also facing the same issue on my Kubernetes cluster (v1.11.5) on AWS
with Spark 2.3.3. Any luck in figuring out the root cause?
On Fri, May 3, 2019 at 5:37 AM Olivier Girardot <
o.girar...@lateral-thoughts.com> wrote:
> Hi,
> I did not try on another
That did the trick, Abhishek! Thanks for the explanation, that answered a lot
of questions I had.
Dave
Hi,
Any clue why a Spark job goes into the UNDEFINED state?
More details are at the URL below.
https://stackoverflow.com/questions/56545644/why-my-spark-sql-job-stays-in-state-runningfinalstatus-undefined
Appreciate your help.
Regards,
Shyam
Hi folks,
Does anyone know what is happening in this case? I tried with both MySQL and
PostgreSQL, and neither finishes schema creation without errors. It seems
something changed between 2.2 and 2.4 that broke schema generation for the
Hive Metastore.
I'm writing a large dataset in Parquet format to HDFS using Spark, and it runs
rather slowly on EMR versus, say, Databricks. I realize that if I were able to
use Hadoop 3.1, it would be much more performant because it has a
high-performance output committer. Is this the case, and if so - when will the
Please provide an update if anyone knows.
On Monday, June 10, 2019, Amit Sharma wrote:
>
> We have a Spark Kafka streaming job running on a standalone Spark cluster. We
> have the Kafka architecture below:
>
> 1. Two cluster running on two data centers.
> 2. There is an LTM on top of each data center (load bal
Hi Marcelo,
I usually work with https://github.com/jupyter/docker-stacks. There's a
Scala + Jupyter option there too, though Zeppelin might be a better option as
well.
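For example, running "docker run -it --rm -p 8888:8888
jupyter/all-spark-notebook" should give you a notebook with Spark and
Scala/Python kernels, if I remember the image name correctly.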
Hth
On Tue, 11 Jun 2019, 11:52 Marcelo Valle, wrote:
> Hi,
>
> I would like to run spark shell + scala on a docker environmen
Hi Deepak,
Please let us know how you managed it.
Thanks,
NJ
On Mon, Jun 10, 2019 at 4:42 PM Deepak Sharma wrote:
> Thanks All.
> I managed to get this working.
> Marking this thread as closed.
>
> On Mon, Jun 10, 2019 at 4:14 PM Deepak Sharma
> wrote:
>
>> This is the project requirement ,
Got the point. If you would like to get "correct" output, you may need to set
the global watermark to "min", because the watermark is not only used for
evicting rows from state, but also for discarding input rows that arrive later
than the watermark. Here you may want to be aware that there are two stateful
operators, which will
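For reference, a rough PySpark sketch of a single watermarked aggregation
(the streaming DataFrame and column names are invented):

    from pyspark.sql import functions as F

    # Rows arriving more than 10 minutes behind the max event time seen so
    # far are dropped, and window state older than that can be evicted.
    windowed = (events
                .withWatermark("eventTime", "10 minutes")
                .groupBy(F.window("eventTime", "5 minutes"))
                .count())

where events is a streaming DataFrame with an eventTime timestamp column.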
Hi,
I would like to run spark shell + Scala on a Docker environment, just to play
with Docker on a development machine without having to install a JVM plus a
lot of other things.
Is there something like an "official Docker image" I am recommended to use? I
saw some on Docker Hub, but it seems they are all contr
Hi Patrick,
I guess the easiest way is to use log aggregation:
https://spark.apache.org/docs/latest/running-on-yarn.html#debugging-your-application
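In practice that boils down to something like "yarn logs -applicationId
<application id>" once the application has finished.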
BR
Jean-Michel
For grouping by each combination, look into grouping sets:
https://jaceklaskowski.gitbooks.io/mastering-spark-sql/spark-sql-multi-dimensional-aggregation.html
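A small Spark SQL sketch of what that looks like (table and column names are
made up):

    result = spark.sql("""
        SELECT dept, role, COUNT(*) AS cnt
        FROM employees
        GROUP BY dept, role
        GROUPING SETS ((dept), (dept, role))
    """)

This returns one count per dept plus one per (dept, role) pair in a single
pass.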
On Tue, Jun 11, 2019 at 06:09, Rishi Shah <
rishishah.s...@gmail.com>:
> Thank you both for your input!
>
> To calculate moving average o