[jira] [Commented] (DRILL-5902) Queries encounter random failure due to RPC connection timed out
[ https://issues.apache.org/jira/browse/DRILL-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16375189#comment-16375189 ]

ASF GitHub Bot commented on DRILL-5902:
---

Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/1113

> Queries encounter random failure due to RPC connection timed out
>
>                 Key: DRILL-5902
>                 URL: https://issues.apache.org/jira/browse/DRILL-5902
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - RPC
>    Affects Versions: 1.11.0
>            Reporter: Robert Hou
>            Assignee: Vlad Rozov
>            Priority: Critical
>              Labels: ready-to-commit
>             Fix For: 1.13.0
>
>         Attachments: 261230f7-e3b9-0cee-22d8-921cb56e3e12.sys.drill,
> node196.drillbit.log
>
>
> Multiple random failures (25) occurred with the latest
> Functional-Baseline-88.193 run. Here is a sample query:
> {noformat}
> /root/drillAutomation/prasadns14/framework/resources/Functional/window_functions/multiple_partitions/q27.sql
> -- Kitchen sink
> -- Use all supported functions
> select
>     rank() over W,
>     dense_rank()over W,
>     percent_rank() over W,
>     cume_dist() over W,
>     avg(c_integer + c_integer) over W,
>     sum(c_integer/100) over W,
>     count(*)over W,
>     min(c_integer) over W,
>     max(c_integer) over W,
>     row_number()over W
> from
>     j7
> where
>     c_boolean is not null
> window W as (partition by c_bigint, c_date, c_time, c_boolean order by
> c_integer)
> {noformat}
> From the logs:
> {noformat}
> 2017-10-23 04:14:36,536 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler
> - Dropping request for early fragment termination for path
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:1 ->
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,537 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler
> - Dropping request for early fragment termination for path
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:5 ->
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,537 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler
> - Dropping request for early fragment termination for path
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:9 ->
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,537 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler
> - Dropping request for early fragment termination for path
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:13 ->
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,537 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler
> - Dropping request for early fragment termination for path
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:17 ->
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,538 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler
> - Dropping request for early fragment termination for path
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:21 ->
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,538 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler
> - Dropping request for early fragment termination for path
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:25 ->
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> {noformat}
> {noformat}
> 2017-10-23 04:14:53,941 [UserServer-1] INFO
> o.a.drill.exec.rpc.user.UserServer - RPC connection /10.10.88.196:31010 <-->
> /10.10.88.193:38281 (user server) timed out. Timeout was set to 30 seconds.
> Closing connection.
> 2017-10-23 04:14:53,952 [UserServer-1] INFO
> o.a.d.e.w.fragment.FragmentExecutor -
> 261230f8-2698-15b2-952f-d4ade8d6b180:0:0: State change requested RUNNING -->
> FAILED
> 2017-10-23 04:14:53,952 [261230f8-2698-15b2-952f-d4ade8d6b180:frag:0:0] INFO
> o.a.d.e.w.fragment.FragmentExecutor -
> 261230f8-2698-15b2-952f-d4ade8d6b180:0:0: State change requested FAILED -->
> FINISHED
> 2017-10-23 04:14:53,956 [UserServer-1] WARN
> o.apache.drill.exec.rpc.RequestIdMap - Failure while attempting to fail rpc
> response.
> java.lang.IllegalArgumentException: Self-suppression not permitted
>     at java.lang.Throwable.addSuppressed(Throwable.java:1043)
> ~[na:1.7.0_45]
>     at
> org.apache.drill.common.DeferredException.addException(DeferredException.java:88)
> ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>     at
>
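
Note on the stack trace above: in the JDK, `Throwable.addSuppressed` throws `IllegalArgumentException` with the message "Self-suppression not permitted" when an exception is handed to itself as a suppressed exception. The Java sketch below reproduces only that JDK behavior in isolation. `FirstFailureHolder` and its methods are hypothetical names invented for illustration. This is not Drill's `DeferredException`, and the identity check shown is just one possible guard, not necessarily what pull request 1113 changed.

```java
public class SelfSuppressionDemo {

    /**
     * Hypothetical holder (not Drill code): keeps the first failure it sees
     * and attaches every later failure to it as a suppressed exception.
     */
    static final class FirstFailureHolder {
        private Throwable first;

        /** No identity check: reporting the same instance twice makes the JDK throw. */
        void addUnguarded(Throwable next) {
            if (first == null) {
                first = next;
            } else {
                first.addSuppressed(next);
            }
        }

        /** Same logic, but an instance that is already held is ignored. */
        void addGuarded(Throwable next) {
            if (first == null) {
                first = next;
            } else if (first != next) {
                first.addSuppressed(next);
            }
        }

        Throwable get() {
            return first;
        }
    }

    /** Reports one exception instance twice without a guard; returns the JDK's error message. */
    public static String unguardedFailureMessage() {
        FirstFailureHolder holder = new FirstFailureHolder();
        Exception timeout = new Exception("RPC connection timed out");
        holder.addUnguarded(timeout);
        try {
            holder.addUnguarded(timeout); // same instance again
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    /** Reports the same instance twice, then a distinct one, with the guard in place. */
    public static int guardedSuppressedCount() {
        FirstFailureHolder holder = new FirstFailureHolder();
        Exception timeout = new Exception("RPC connection timed out");
        holder.addGuarded(timeout);
        holder.addGuarded(timeout);                       // ignored: same instance
        holder.addGuarded(new Exception("channel closed")); // recorded as suppressed
        return holder.get().getSuppressed().length;
    }

    public static void main(String[] args) {
        System.out.println("unguarded: " + unguardedFailureMessage());
        System.out.println("guarded suppressed count: " + guardedSuppressedCount());
    }
}
```

The unguarded path fails as soon as one exception instance reaches the holder twice, which is at least consistent with the `addSuppressed` frame in the trace. Whether that is how the instance reached `DeferredException.addException` in this run cannot be determined from the truncated log.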
[jira] [Commented] (DRILL-5902) Queries encounter random failure due to RPC connection timed out
[ https://issues.apache.org/jira/browse/DRILL-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368940#comment-16368940 ]

ASF GitHub Bot commented on DRILL-5902:
---

Github user arina-ielchiieva commented on the issue:

    https://github.com/apache/drill/pull/1113

    +1, LGTM.