Github user vpchelko commented on the issue:
https://github.com/apache/spark/pull/16374
We use Spark 2.0.0.
Changing spark.memory.fraction (we tried 0.3 and 0.4) and increasing
memoryOverhead to 20% of executor memory does not help.
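For clarity, a minimal sketch of how these two settings can be applied on a SparkConf; the app name and memory values below are illustrative examples, not the exact settings from this cluster:

```scala
import org.apache.spark.SparkConf

// Illustrative values only; not the exact configuration of the cluster above.
val conf = new SparkConf()
  .setAppName("streaming-app")            // hypothetical app name
  .set("spark.executor.memory", "8g")     // example executor size
  // Fraction of the heap used for execution and storage (default 0.6 in Spark 2.0):
  .set("spark.memory.fraction", "0.3")
  // YARN off-heap overhead in MB, here roughly 20% of the 8g executor:
  .set("spark.yarn.executor.memoryOverhead", "1600")
```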
The application is not stable due to memory issues (extremely high GC,
20-30% of total time, and some executors intermittently dying with OOM).
The memory issues lead to application failure:
User class threw exception: org.apache.spark.SparkException: Job aborted
due to stage failure: Task 17 in stage 40.0 failed 10 times, most recent
failure: Lost task 17.9 in stage 40.0 (TID 4004,
ip-96-114-221-174.us-west-2.compute.internal): scala.NotImplementedError: put()
should not be called on an EmptyStateMap
See "magic" stacktrace:
[stacktrace.txt](https://github.com/apache/spark/files/686852/stacktrace.txt)
Note that the proposed workaround has allowed us to run smoothly for more than
2 weeks (with low GC, about 1% of total time).