GitHub user manishgupta88 opened a pull request:
https://github.com/apache/carbondata/pull/996
[WIP] Executor lost failure in case of data load failure due to bad records
Problem: The executor is lost when a data load fails due to bad records.
Analysis: When data loads containing bad records are run repeatedly, the
executor is eventually lost due to an OOM error, and the application is
restarted by YARN some time later. This happens because, when a data load
fails due to bad records, the executor throws an exception and the task keeps
retrying until the maximum number of retry attempts is reached. The cycle
repeats for every such load until YARN restarts the application.
Fix: When a data load failure is known to be caused by bad records, and is
therefore an intentional failure raised by Carbon, the executor should not
retry the data load. It should complete the job gracefully, and the failure
information should be handled by the driver.
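The idea behind the fix can be sketched roughly as below. This is only an
illustration in plain Java, not the code in this pull request: every class and
method name here (BadRecordLoadSketch, BadRecordException, LoadResult,
loadPartition, loadSucceeded) is hypothetical, and "a row that is not an
integer" stands in for whatever Carbon treats as a bad record. The executor
side catches the intentional failure and returns it as a status value, so the
task completes normally and is not retried; the driver side then inspects the
collected statuses and decides whether the load failed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these types are CarbonData or Spark classes.
final class BadRecordLoadSketch {

    /** Intentional failure raised when a bad record is found (assumed name). */
    static final class BadRecordException extends RuntimeException {
        BadRecordException(String message) {
            super(message);
        }
    }

    enum Status { SUCCESS, FAILED_BAD_RECORDS }

    /** Per-partition result sent back to the driver instead of an exception. */
    static final class LoadResult {
        final int partition;
        final Status status;
        final String detail;

        LoadResult(int partition, Status status, String detail) {
            this.partition = partition;
            this.status = status;
            this.detail = detail;
        }
    }

    /** Stand-in for record parsing: a non-integer row counts as a bad record. */
    static int parseRow(String row) {
        try {
            return Integer.parseInt(row.trim());
        } catch (NumberFormatException e) {
            throw new BadRecordException("bad record: " + row);
        }
    }

    /** Executor side: load one partition without letting bad records escape. */
    static LoadResult loadPartition(int partition, List<String> rows) {
        try {
            for (String row : rows) {
                parseRow(row);
            }
            return new LoadResult(partition, Status.SUCCESS, "");
        } catch (BadRecordException e) {
            // Known, intentional failure: report it as data so the task ends
            // normally and the scheduler has no failed task to retry.
            return new LoadResult(partition, Status.FAILED_BAD_RECORDS, e.getMessage());
        }
    }

    /** Driver side: the load succeeds only if every partition succeeded. */
    static boolean loadSucceeded(List<LoadResult> results) {
        for (LoadResult result : results) {
            if (result.status != Status.SUCCESS) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<LoadResult> results = new ArrayList<>();
        results.add(loadPartition(0, Arrays.asList("1", "2")));
        results.add(loadPartition(1, Arrays.asList("3", "oops")));
        for (LoadResult result : results) {
            System.out.println(result.partition + " " + result.status + " " + result.detail);
        }
        System.out.println(loadSucceeded(results) ? "load succeeded" : "load failed");
    }
}
```

Any other exception still propagates out of loadPartition in this sketch, so
genuinely unexpected failures would remain eligible for the usual task retry.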
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/manishgupta88/incubator-carbondata bad_record_failure_suppress
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/996.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #996
----
commit 771ac3d22d0585f4ef26a8e38c35fc7a353a0ccf
Author: manishgupta88 <[email protected]>
Date: 2017-06-06T06:48:35Z
Problem: The executor is lost when a data load fails due to bad records.
Analysis: When data loads containing bad records are run repeatedly, the
executor is eventually lost due to an OOM error, and the application is
restarted by YARN some time later. This happens because, when a data load
fails due to bad records, the executor throws an exception and the task keeps
retrying until the maximum number of retry attempts is reached. The cycle
repeats for every such load until YARN restarts the application.
Fix: When a data load failure is known to be caused by bad records, and is
therefore an intentional failure raised by Carbon, the executor should not
retry the data load. It should complete the job gracefully, and the failure
information should be handled by the driver.
----