[ 
https://issues.apache.org/jira/browse/IMPALA-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107678#comment-17107678
 ] 

Sahil Takiar commented on IMPALA-9253:
--------------------------------------

IMPALA-9199 adds "ECONNRESET, // 104: Connection reset by peer"

> Blacklist additional posix error codes for failed DataStreamService RPCs
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-9253
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9253
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Sahil Takiar
>            Priority: Major
>
> Filing as a follow-up to 
> [IMPALA-9137|http://issues.cloudera.org/browse/IMPALA-9137]. 
> [IMPALA-9137|http://issues.cloudera.org/browse/IMPALA-9137] blacklists a node 
> if an RPC fails with one of the following POSIX error codes:
>  * 107 = ENOTCONN: Transport endpoint is not connected
>  * 108 = ESHUTDOWN: Cannot send after transport endpoint shutdown
>  * 111 = ECONNREFUSED: Connection refused
> These codes were produced by running a query, killing a node running that 
> query, and then seeing what error codes the query failed with.
> There may be other error codes that are worth using for node blacklisting as 
> well. One way to come up with more error codes is to use iptables to 
> introduce network faults between Impala processes and see how RPCs fail.
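
For illustration only, the check described above can be sketched as a lookup against a fixed set of error codes. This is not Impala's actual implementation (which is C++); the names BLACKLIST_POSIX_CODES and should_blacklist are hypothetical, and the numeric values are the Linux errno values quoted in this thread.

```python
# Hypothetical sketch: decide whether a failed RPC's POSIX error code
# should cause the remote node to be blacklisted.
# The numeric values are the Linux errno values quoted in this thread.
BLACKLIST_POSIX_CODES = {
    104: "ECONNRESET",    # Connection reset by peer (added by IMPALA-9199)
    107: "ENOTCONN",      # Transport endpoint is not connected
    108: "ESHUTDOWN",     # Cannot send after transport endpoint shutdown
    111: "ECONNREFUSED",  # Connection refused
}


def should_blacklist(posix_code):
    """Return True if the error code is one of the blacklisting codes."""
    return posix_code in BLACKLIST_POSIX_CODES


# 110 is included only as an example of a code that is not in the set.
for code in (104, 111, 110):
    name = BLACKLIST_POSIX_CODES.get(code, "not listed")
    print(code, name, should_blacklist(code))
```

Any further codes found through fault injection would simply be added to the set, so the check itself would not need to change.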



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

