[ https://issues.apache.org/jira/browse/DRILL-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15437703#comment-15437703 ]
Krystal commented on DRILL-4766:
--------------------------------
Used the following steps to reproduce the problem without the fix:
1. Added 10000 schemas/tables to Hive.
2. Lowered the RPC bit timeout (drillbit heartbeat) to 10 seconds in
drill-override.conf (rpc.bit.timeout: 10).
3. Ran the following query:
select DISTINCT TABLE_TYPE from INFORMATION_SCHEMA.`TABLES` WHERE TABLE_TYPE
LIKE '%';
4. Cancelled the query.
5. Kicked off several queries from different clients.
The queries from Step 5 that connect to the same foreman as the cancelled query
consistently fail with the following error:
{code}
ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.
Connection: /10.10.100.123:60884 <--> /10.10.100.123:31011 (control client).
Closing connection.
java.io.IOException: syscall:read(...)() failed: Connection reset by peer
2016-08-09 14:50:10,285 [BitServer-3] ERROR
o.a.d.exec.work.foreman.QueryManager - Failure while attempting to CANCEL
fragment query_id {
part1: 2906422873165293041
part2: 4085948525804402223
}
major_fragment_id: 2
minor_fragment_id: 1
on endpoint address: "qa-node111"
user_port: 31010
control_port: 31011
data_port: 31012
with org.apache.drill.exec.rpc.ChannelClosedException: Channel closed
/10.10.100.123:60884 <--> /10.10.100.123:31011..
2016-08-09 14:50:10,286 [BitServer-3] ERROR
o.a.d.exec.work.foreman.QueryManager - Failure while attempting to CANCEL
fragment query_id {
part1: 2906422873165293041
part2: 4085948525804402223
}
{code}
Executed the same steps against a build with the fix and did not encounter any
failures.
> FragmentExecutor should use EventProcessor and avoid blocking rpc threads
> -------------------------------------------------------------------------
>
> Key: DRILL-4766
> URL: https://issues.apache.org/jira/browse/DRILL-4766
> Project: Apache Drill
> Issue Type: Improvement
> Components: Execution - Flow
> Affects Versions: 1.7.0
> Reporter: Deneche A. Hakim
> Assignee: Sudheesh Katkam
> Priority: Minor
> Fix For: 1.8.0
>
>
> Currently, an RPC thread can block when trying to deliver a cancel or early
> termination message to a blocked fragment executor.
> Foreman already uses an EventProcessor to avoid such scenarios.
> FragmentExecutor could be improved to avoid blocking RPC threads as well.
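
For illustration only, the general idea in the description can be sketched as a small mailbox: the RPC thread records the event and returns at once, and the fragment's own thread handles queued events when it reaches a safe point. This is a minimal sketch under assumptions, not Drill's actual EventProcessor or FragmentExecutor code. The class name FragmentMailboxSketch, the Event enum, and the post/drain methods are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical sketch; not Drill's EventProcessor or FragmentExecutor. */
class FragmentMailboxSketch {

    /** Events an RPC thread might deliver to a fragment (illustrative names). */
    enum Event { CANCEL, EARLY_TERMINATION }

    // Non-blocking queue: offer() returns immediately, so the RPC thread
    // never waits on the state of the fragment thread.
    private final Queue<Event> pending = new ConcurrentLinkedQueue<>();

    /** Called from the RPC thread: record the event and return right away. */
    void post(Event event) {
        pending.offer(event);
    }

    /** Called from the fragment's own thread: handle everything queued so far, in order. */
    List<Event> drain() {
        List<Event> handled = new ArrayList<>();
        Event event;
        while ((event = pending.poll()) != null) {
            handled.add(event); // a real executor would act on the event here
        }
        return handled;
    }

    public static void main(String[] args) {
        FragmentMailboxSketch mailbox = new FragmentMailboxSketch();
        mailbox.post(Event.CANCEL);            // returns immediately
        mailbox.post(Event.EARLY_TERMINATION); // returns immediately
        System.out.println(mailbox.drain());
    }
}
```

The design choice in this sketch is that delivery and handling happen on different threads, so a fragment that is blocked delays only its own reaction to the event and not the thread that delivered it.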