[ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15436060#comment-15436060
 ] 

Sergey Shelukhin edited comment on HIVE-14589 at 8/25/16 1:07 AM:
------------------------------------------------------------------

Edit: removed the confusion between ZK node vs LLAP node/machine.

This basically creates ZK nodes for "slots" in the cluster. The LLAPs try to 
take the lowest available slot, starting from 0. Unlike the worker-... nodes, 
the slots are reused, which is the intent. For split assignment, the LLAPs are 
always sorted by slot number.
The idea is that as long as an LLAP is running, it will retain the same 
position in the ordering, regardless of other LLAPs restarting, without any 
LLAP having to know about the others, its predecessor's location (if it 
restarted in a different place), or the total size of the cluster. 
The restarting LLAPs may not take the same slots as their predecessors (i.e. if 
two LLAPs restart they can swap slots), but that shouldn't matter because they 
have lost their caches anyway.
I.e. if you have LLAPs in slots 1-2-3-4 and I nuke and restart 1, 2, and 4, 
they will take whatever slots are free, but 3 will stay the 3rd and retain 
cache locality.
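The claim-the-lowest-free-slot behavior can be sketched roughly like this (a toy simplification, not the patch code; the real registry does this by creating ZK znodes, and all names here are made up):

```java
import java.util.NavigableSet;
import java.util.TreeSet;

// Toy model of the slot registry described above. In LLAP the "taken"
// state lives in ZK (a slot znode per live daemon); here it is a set.
public class SlotRegistry {
    private final TreeSet<Integer> takenSlots = new TreeSet<>();

    // A (re)starting LLAP claims the lowest free slot, starting from 0.
    public synchronized int claimLowestFreeSlot() {
        int slot = 0;
        while (takenSlots.contains(slot)) {
            slot++;
        }
        takenSlots.add(slot);
        return slot;
    }

    // When an LLAP dies, its slot node goes away and the slot is reusable.
    public synchronized void releaseSlot(int slot) {
        takenSlots.remove(slot);
    }

    // Ordering used for splits: LLAPs always sorted by slot number.
    public synchronized NavigableSet<Integer> orderedSlots() {
        return new TreeSet<>(takenSlots);
    }
}
```

Note the key property: releasing and re-claiming slots 0 and 3 only affects those two positions; a daemon that stayed up in slot 2 keeps its place in the ordering the whole time.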

This also handles size increases, as new LLAPs will always be added at the end 
of the sequence, which is what consistent hashing needs.
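Why appending at the end plays well with consistent hashing can be shown with a toy ring (again a made-up sketch, not the actual split-assignment code): when one slot is added, each split either keeps its old LLAP or moves to the new one, so existing cache locality is mostly preserved.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Toy consistent-hash ring: slots are placed on the ring by hash, and a
// split maps to the first slot clockwise from the split's own hash.
public class ToyRing {
    private final SortedMap<Integer, Integer> ring = new TreeMap<>();

    private static int hash(String s) {
        return s.hashCode() & Integer.MAX_VALUE; // keep it non-negative
    }

    public void addSlot(int slot) {
        ring.put(hash("slot-" + slot), slot);
    }

    // First slot at or after the split's hash, wrapping around at the end.
    public int slotForSplit(String splitKey) {
        SortedMap<Integer, Integer> tail = ring.tailMap(hash(splitKey));
        SortedMap<Integer, Integer> m = tail.isEmpty() ? ring : tail;
        return m.get(m.firstKey());
    }
}
```

Adding one ring point can only redirect a split to the new slot or leave its mapping alone, which is exactly the "only new LLAPs steal work" property the comment relies on.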

One case it doesn't handle is permanent cluster size reduction. If the LLAPs 
holding slots in the middle are removed, a permanent gap remains; until some 
LLAPs restart and reclaim those slots, splits hashed to them will result in 
cache misses. 



> add consistent node replacement to LLAP for splits
> --------------------------------------------------
>
>                 Key: HIVE-14589
>                 URL: https://issues.apache.org/jira/browse/HIVE-14589
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-14589.01.patch, HIVE-14589.patch
>
>
> See HIVE-14574



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
