Hi all,
Non-deterministic FAILED_TO_UNCOMPRESS(5) or 'Stream is corrupted' errors
may occur during shuffle read, as described in this
JIRA (https://issues.apache.org/jira/browse/SPARK-4105).
There has been no new comment on this JIRA for a long time. So, has
anyone seen these errors in
Why is the Jenkins locale:
LANG=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX
Hadoop will throw InvalidPathExcept
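A note on the locale listing above: in `locale` output, values shown in
double quotes are implied rather than set explicitly, so that listing is what
you would expect when LANG is empty and no LC_* variable is exported. The
sketch below mirrors the POSIX lookup order (LC_ALL, then the category
variable, then LANG). The function name `effective_locale` is hypothetical,
and the "POSIX" fallback is an assumption: the standard leaves the default
implementation-defined.

```python
import os


def effective_locale(category, env=None):
    """Resolve one locale category using the POSIX precedence rules.

    Hypothetical helper for illustration; it only inspects environment
    variables and does not call into the C library.
    """
    env = os.environ if env is None else env
    # LC_ALL overrides everything, then the category itself, then LANG.
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name, "")
        if value:  # unset and empty variables are treated the same way
            return value
    # Assumed fallback: the default is implementation-defined.
    return "POSIX"


if __name__ == "__main__":
    for category in ("LC_CTYPE", "LC_COLLATE", "LC_MESSAGES"):
        print(category, "->", effective_locale(category))
```

Running this inside the Jenkins job would show whether any of these variables
reach the build process at all.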
If you use UDFs in Python, you would want to use Pandas UDFs for better
performance.
On Mon, Mar 11, 2019 at 7:50 PM Jonathan Winandy
wrote:
> Thanks, I didn't know!
>
> That being said, any UDF use seems to badly affect code generation (and
> the performance).
>
>
> On Mon, 11 Mar 2019, 15:13 Dy
Thanks, I didn't know!
That being said, any UDF use seems to badly affect code generation (and the
performance).
On Mon, 11 Mar 2019, 15:13 Dylan Guedes wrote:
> Btw, even if you are using Python you can register your UDFs in Scala and
> use them in Python.
>
> On Mon, Mar 11, 2019 at 6:55 AM
The problem is located in
\scheduler\cluster\CoarseGrainedSchedulerBackend.scala, in the StatusUpdate
case of the receive function.
When my incident occurs, the scheduler becomes effectively single-threaded,
processing 80k continuous messages.
Take 3 of those consecutive messages:
04-03-19 22:02:43:037 [
Search JIRA ... https://issues.apache.org/jira/browse/SPARK-24417
On Mon, Mar 11, 2019 at 1:03 PM Sudhir Menon wrote:
>
> Is there a timeline for Spark 3.0?
> Or more specifically, is there a timeline for moving to Java 9 and beyond?
>
> Thanks in advance
> Suds
>
>
>
> On Tue, Nov 6, 2018 at 9:1
Is there a timeline for Spark 3.0?
Or more specifically, is there a timeline for moving to Java 9 and beyond?
Thanks in advance
Suds
On Tue, Nov 6, 2018 at 9:16 AM Felix Cheung
wrote:
> +1 for Spark 3, definitely
> Thanks for the updates
>
>
> --
> *From:* Sean Owe
Btw, even if you are using Python you can register your UDFs in Scala and
use them in Python.
On Mon, Mar 11, 2019 at 6:55 AM Jonathan Winandy
wrote:
> Hello Snehasish
>
> If you are not using UDFs, you will have very similar performance with
> those languages on SQL.
>
> So it comes down to:
> *
Hello Snehasish
If you are not using UDFs, you will have very similar performance with
those languages on SQL.
So it comes down to:
* if you know Python, go for Python.
* if you are used to the JVM, and are ready for a bit of a paradigm shift, go
for Scala.
Our team is using Scala, however we help o
Hi
Is there a way to get performance benchmarks for developing applications
in Java, Scala, or Python?
The use case mostly involves SQL pipelines and data ingested from various
sources, including Kafka.
Which language should be preferred? It would be great if the
preference for language ca
Well, it is difficult to say anything without knowing func. It could be
that 40 cores and 200 GB for an executor is not a setup that suits func and
the overall architecture.
It could also be GC issues etc.
Sometimes it also does not help to throw hardware at the issue. It de
Thanks
There is no issue on the worker/executor side; they have ample memory (> 200 GB).
I gave that information as background on the system, apologies for the confusion.
The problem is isolated to the lifetime of processing a DriverEndpoint
StatusUpdate message. For 40 minutes the system runs fin
Well, it is a little bit difficult to say, because a lot of things are mixed up
here. What function is being calculated? Does it need a lot of memory? Could it
be that you run out of memory, some spilling happens, and you have a lot of
blocking disk IO?
Related to that could be 1 exec