[ https://issues.apache.org/jira/browse/TEZ-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117544#comment-16117544 ]
Anant Mittal edited comment on TEZ-3814 at 8/7/17 11:24 PM:
------------------------------------------------------------

[~gopalv] Thanks for the response. This is functionality that is not working as expected; please tell me if anything further is needed for this to qualify as a bug. The JIRA you pointed out was new to me. I gave the suggestion a try, but even that failed to fix the issue and we still see failures with the same error. I used the value 4000 for net.core.somaxconn. Please note that this is not a large load: the table used for the insert has only 62 rows with a total size of 26948 bytes.

was (Author: infinitymittal):
[~gopalv] Thanks for the response. This is a functionality which is not working as expected. Please tell me if I anything further can be done so that it satisfies being a bug. The JIRA pointed out was something I had not yet tried then. I gave it a try but even that failed to fix the issue and we still see failures with the same error. I used the value 4000 for net.core.somaxconn. Please note that this is not a large load. The table used for insert has only 62 rows with a total size of 26948 bytes.

> Inserts into a bucketed table fail randomly with Hive on Tez
> ------------------------------------------------------------
>
>                 Key: TEZ-3814
>                 URL: https://issues.apache.org/jira/browse/TEZ-3814
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Anant Mittal
>              Labels: Bucketing, Hive, Tez
>
> The MAP phase for inserts into a bucketed table randomly fails with the error
> "Vertex <vertex_id> [Map 1] failed as task <task_id> failed after vertex
> succeeded.]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1
> killedVertices:0".
> The task fails because all of its attempts fail with "<attempt_id> being
> failed for too many output errors.
> failureFraction=0.2,
> MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1,
> MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300,
> readErrorTimespan=0"
> This happens more often if the table is ACID enabled and a delete operation
> is performed before the inserts.
> I have tried the following:
> - Changed tez.am.launch.cmd-opts, tez.task.launch.cmd-opts and
>   hive.tez.java.opts to use parallel GC.
> - tez.runtime.shuffle.max.allowed.failed.fetch.fraction=0.95
> - tez.runtime.shuffle.failed.check.since-last.completion=false
> - tez.runtime.shuffle.fetch.buffer.percent=0.1
> - tez.runtime.shuffle.memory.limit.percent=0.25
> - tez.runtime.shuffle.ssl.enable=false
> - Deleted ".../usercache/<user>/filecache" and ".../usercache/<user>/appcache"
> I am using the HDP 2.6 distribution.

-- 
This message was sent by Atlassian JIRA
(v6.4.14#64029)
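The net.core.somaxconn=4000 value mentioned in the comment only helps if the kernel actually picked it up on every node. A minimal sketch to verify that, assuming a Linux host (the /proc path is the standard location for this sysctl; the 4000 threshold mirrors the value tried above):

```python
# Minimal sketch (assumes a Linux host): check whether the kernel's
# listen-backlog limit matches the net.core.somaxconn=4000 value tried above.
from pathlib import Path

def read_somaxconn(proc_path="/proc/sys/net/core/somaxconn"):
    """Return the current somaxconn value, or None if the file is absent."""
    p = Path(proc_path)
    if not p.exists():
        return None
    return int(p.read_text().strip())

value = read_somaxconn()
if value is None:
    print("net.core.somaxconn not readable (non-Linux host?)")
elif value < 4000:
    print(f"somaxconn is {value}; raise it with: sysctl -w net.core.somaxconn=4000")
else:
    print(f"somaxconn is {value}; backlog limit already at or above 4000")
```

Running this on each NodeManager host would confirm the setting took effect cluster-wide, since a stale value on even one node can still produce the fetch failures described above.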
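For reference, the shuffle-related overrides listed above can be collected into a Hadoop-style configuration fragment (as would go into tez-site.xml). This sketch only assembles the XML; the property names and values are taken directly from the comment, and how the fragment is merged into a real cluster config is left aside:

```python
# Sketch: render the shuffle overrides listed above as a Hadoop-style
# <configuration> XML fragment (the format used by tez-site.xml).
import xml.etree.ElementTree as ET

# Property names and values as listed in the comment above.
OVERRIDES = {
    "tez.runtime.shuffle.max.allowed.failed.fetch.fraction": "0.95",
    "tez.runtime.shuffle.failed.check.since-last.completion": "false",
    "tez.runtime.shuffle.fetch.buffer.percent": "0.1",
    "tez.runtime.shuffle.memory.limit.percent": "0.25",
    "tez.runtime.shuffle.ssl.enable": "false",
}

def to_hadoop_xml(props):
    """Build a <configuration> element with one <property> per override."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")

print(to_hadoop_xml(OVERRIDES))
```

Keeping the overrides in one place like this makes it easy to diff what was actually deployed against what was intended when chasing an intermittent failure.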