Hi Baptiste,

There is ongoing work to let impalad shut down gracefully:
https://gerrit.cloudera.org/c/10744/
Thanks to Tim for his efforts on this. I hope it can be merged soon (so that
the patch is easier to merge into the 2.x branch).


We've faced a similar scenario before. One way to mitigate the service
interruption is to set up another Impala cluster as a temporary backup. Then
switch the load balancer over to the backup cluster and perform the lengthy
maintenance on the original cluster. Finally, switch the load balancer back to
the original cluster once everything is done.
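
In case it is useful, here is a rough sketch of how the switch-over could be
scripted against haproxy's runtime API. It is not what we actually run, and
it rests on several assumptions: both clusters' nodes are listed as servers
in one haproxy backend, haproxy.cfg exposes an admin socket ("stats socket
... level admin"), and the backend name, server names, and socket path below
are all hypothetical placeholders. Putting a server in the "drain" state
stops new connections from being routed to it while existing ones are left
to finish.

```python
import socket


def switch_commands(backend, from_servers, to_servers):
    """Build haproxy runtime commands for a switch-over.

    Target servers are marked ready first, then the source servers are
    drained, so there is always somewhere for new connections to go.
    """
    cmds = [f"set server {backend}/{s} state ready" for s in to_servers]
    cmds += [f"set server {backend}/{s} state drain" for s in from_servers]
    return cmds


def send_command(sock_path, command):
    """Send one command to the haproxy admin socket and return its reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        s.sendall((command + "\n").encode())
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:  # haproxy closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode()


def switch(sock_path, backend, from_servers, to_servers):
    """Apply the switch-over commands one by one."""
    for cmd in switch_commands(backend, from_servers, to_servers):
        send_command(sock_path, cmd)
```

For example, switch("/var/run/haproxy.sock", "impala", ["main1", "main2"],
["backup1", "backup2"]) would move new connections to the backup nodes, and
calling it with the two lists swapped would move them back. You would still
need to wait for the running queries on the drained nodes to finish before
stopping them.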
Hope this helps.


Regards,
Quanlong
--
Quanlong Huang
Software Developer, Hulu


At 2018-07-26 20:21:56, "Baptiste Mille-Mathias" 
<[email protected]> wrote:

Hello,


In operations I regularly have to stop a node, or even perform a rolling
restart of a whole cluster, to apply a system patch or a configuration update.
The cluster runs Impala 2.10 behind a load balancer (haproxy).
The problem is that when an Impala server is stopped (whether coordinator or
executor), all queries it is handling are killed and clients receive an error,
which is quite bad. A rolling restart therefore causes as many interruptions
as there are nodes.


I've looked for a way to remove both roles dynamically, so that nodes can be
moved cleanly out of the cluster before the service is actually stopped and no
service interruption is experienced, but I did not find such an API (I only
saw that this is possible in the configuration file).



Is it possible? If not, how do you handle this scenario?



Thanks for your advice.
--

Happy people are not in a hurry
