[jira] [Commented] (DRILL-5902) Queries encounter random failure due to RPC connection timed out

2018-02-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16375189#comment-16375189
 ] 

ASF GitHub Bot commented on DRILL-5902:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1113


> Queries encounter random failure due to RPC connection timed out
> 
>
> Key: DRILL-5902
> URL: https://issues.apache.org/jira/browse/DRILL-5902
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - RPC
> Affects Versions: 1.11.0
> Reporter: Robert Hou
> Assignee: Vlad Rozov
> Priority: Critical
> Labels: ready-to-commit
> Fix For: 1.13.0
>
> Attachments: 261230f7-e3b9-0cee-22d8-921cb56e3e12.sys.drill, node196.drillbit.log
>
>
> Multiple random failures (25) occurred with the latest 
> Functional-Baseline-88.193 run.  Here is a sample query:
> {noformat}
> /root/drillAutomation/prasadns14/framework/resources/Functional/window_functions/multiple_partitions/q27.sql
> -- Kitchen sink
> -- Use all supported functions
> select
>     rank()                      over W,
>     dense_rank()                over W,
>     percent_rank()              over W,
>     cume_dist()                 over W,
>     avg(c_integer + c_integer)  over W,
>     sum(c_integer/100)          over W,
>     count(*)                    over W,
>     min(c_integer)              over W,
>     max(c_integer)              over W,
>     row_number()                over W
> from
>     j7
> where
>     c_boolean is not null
> window W as (partition by c_bigint, c_date, c_time, c_boolean order by c_integer)
> {noformat}
> From the logs:
> {noformat}
> 2017-10-23 04:14:36,536 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:1 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,537 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:5 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,537 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:9 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,537 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:13 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,537 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:17 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,538 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:21 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,538 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:25 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> {noformat}
> {noformat}
> 2017-10-23 04:14:53,941 [UserServer-1] INFO  o.a.drill.exec.rpc.user.UserServer - RPC connection /10.10.88.196:31010 <--> /10.10.88.193:38281 (user server) timed out.  Timeout was set to 30 seconds. Closing connection.
> 2017-10-23 04:14:53,952 [UserServer-1] INFO  o.a.d.e.w.fragment.FragmentExecutor - 261230f8-2698-15b2-952f-d4ade8d6b180:0:0: State change requested RUNNING --> FAILED
> 2017-10-23 04:14:53,952 [261230f8-2698-15b2-952f-d4ade8d6b180:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 261230f8-2698-15b2-952f-d4ade8d6b180:0:0: State change requested FAILED --> FINISHED
> 2017-10-23 04:14:53,956 [UserServer-1] WARN  o.apache.drill.exec.rpc.RequestIdMap - Failure while attempting to fail rpc response.
> java.lang.IllegalArgumentException: Self-suppression not permitted
>     at java.lang.Throwable.addSuppressed(Throwable.java:1043) ~[na:1.7.0_45]
>     at org.apache.drill.common.DeferredException.addException(DeferredException.java:88) ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> 
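
For readers unfamiliar with the "Self-suppression not permitted" message in the trace above: the JDK raises it when a Throwable is passed to its own addSuppressed method. The sketch below reproduces that behaviour in isolation and shows one possible guard. It is only an illustration; SelfSuppressionDemo and FirstFailureHolder are hypothetical names, not Drill classes, and this is not necessarily how DeferredException or pull request 1113 handles the case.

```java
import java.util.Objects;

public class SelfSuppressionDemo {

    /** Hypothetical holder (not a Drill class): keeps the first failure, attaches later ones as suppressed. */
    static final class FirstFailureHolder {
        private Exception first;

        synchronized void add(Exception e) {
            Objects.requireNonNull(e);
            if (first == null) {
                first = e;
            } else if (first != e) {
                // Guard: calling first.addSuppressed(first) would throw IllegalArgumentException.
                first.addSuppressed(e);
            }
            // The same instance reported a second time is ignored.
        }

        synchronized Exception get() {
            return first;
        }
    }

    /** Returns the message of the exception raised by self-suppression, or null if none is raised. */
    static String selfSuppressionMessage() {
        Exception e = new RuntimeException("rpc connection timed out");
        try {
            e.addSuppressed(e); // a Throwable cannot suppress itself
            return null;
        } catch (IllegalArgumentException iae) {
            return iae.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(selfSuppressionMessage());

        FirstFailureHolder holder = new FirstFailureHolder();
        Exception timeout = new RuntimeException("rpc connection timed out");
        holder.add(timeout);
        holder.add(timeout); // same instance again: skipped by the guard
        holder.add(new IllegalStateException("channel closed"));
        System.out.println(holder.get().getSuppressed().length);
    }
}
```

The identity check (first != e) is the design choice being illustrated: when the same exception instance is reported twice on a failure path, it is skipped rather than attached to itself.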

[jira] [Commented] (DRILL-5902) Queries encounter random failure due to RPC connection timed out

2018-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368940#comment-16368940
 ] 

ASF GitHub Bot commented on DRILL-5902:
---

Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1113
  
+1, LGTM.

