Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

2016-09-01 Thread Siddharth Seth

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147610
---


Ship it!




Think there's several issues which still need resolving. Follow up jiras.
+1, assuming it's been tested locally for regresions and the new functionality.

- Siddharth Seth


On Sept. 1, 2016, 6:35 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> ---
> 
> (Updated Sept. 1, 2016, 6:35 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   llap-client/pom.xml 0243340 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java
>  99ead9b 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java
>  e9456f2 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
>  64d2617 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java 
> PRE-CREATION 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java
>  921e050 
>   
> llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java
>  PRE-CREATION 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
>  17ce69b 
>   
> llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
>  efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java
>  54f7363 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

2016-09-01 Thread Siddharth Seth


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java,
> >  line 554
> > 
> >
> > The size of this collection is used to determine the #knownWorkers in 
> > HostAffinitySplitLocationProvider. The size at the moment represents active 
> > nodes.
> > 
> > I don't think the intent is to return a fewer number of nodes than what 
> > existed if a node goes down? 
> > 1) Return n entries, and one entry would indicate it's not active (and 
> > this will need to be acted upon by all clients
> > 2) Add a getNumInstances itnerface, which can differ from actual nodes 
> > available, along with a getNodeAtLocation interface?
> 
> Sergey Shelukhin wrote:
> HIVE-14680
> 
> Siddharth Seth wrote:
> Think this is very incomplete without this change. I don't think it's any 
> worse than it is today though.
> 
> Sergey Shelukhin wrote:
> Why? It handles consistency in most scenarios as described. There's just 
> a small window during the crash where splits will not get good locations.

The 'small window' is assuming the node comes back up in the fisrt place. 
Minutes is enough to cause damage to the cache on multiple nodes. What happens 
on a cluster where a physical node is taken out for maintenance, and there's no 
capacity to launch another node. What happens in case of an hour long 
maintenence window for a node/set of nodes.

Will the consisten caching part take care of this ? (The target bucket size 
will be decreasing).

The bucket size could be based on the number of instances requested by the 
slider app.


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java,
> >  line 561
> > 
> >
> > Here, as well as other places, a node is only available when it's slot 
> > has been registered. Otherwise it should not be visible to clients.
> 
> Sergey Shelukhin wrote:
> I think it only makes sense for ordered, to not mess up the order. 
> Changed that. It's also problematic because we'd have to keep track of 
> coordinated notifications.
> 
> Siddharth Seth wrote:
> Assuming notifications for registered workers behave in the following 
> manner.
> 
> NewNode - only when the slot has been registered.
> Node down - when either the slot or the actual node goes away.
> State changed - I'm not sure anyone is expected to act on this (what are 
> the possible notifications here).
> 
> Sergey Shelukhin wrote:
> Why? It doesn't seem to provide a lot of benefit compared to existing 
> worker node handling, in presence of crashes/etc.

It does provide consistency, and makes it easier for someone who does not know 
this specific detail about the two ZK nodes per instance to reason about 
potential problems. Eseentially easier to undestand and maintain.


- Siddharth


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147362
---


On Sept. 1, 2016, 6:35 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> ---
> 
> (Updated Sept. 1, 2016, 6:35 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   llap-client/pom.xml 0243340 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java
>  99ead9b 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java
>  e9456f2 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
>  64d2617 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java 
> PRE-CREATION 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java
>  921e050 
>   
> llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java
>  PRE-CREATION 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
>  17ce69b 
>   
> llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
>  efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java
>  54f7363 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

2016-09-01 Thread Siddharth Seth


> On Aug. 30, 2016, 10:36 p.m., Siddharth Seth wrote:
> > Unrelated to this patch - Any idea why 'uniq' is static/ (private static 
> > final UUID uniq = UUID.randomUUID();)
> 
> Sergey Shelukhin wrote:
> it's daemon identity, so it's global per daemon

Why is the daemon identity a static? Do we expect a new instance of the class 
to be created within the same daemon.

The reasona I ask is that this gets in the way of running a MiniLlapCluster 
with multiple simulated daemons.

Anyway, created a jira for this.


- Siddharth


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147377
---


On Sept. 1, 2016, 6:35 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> ---
> 
> (Updated Sept. 1, 2016, 6:35 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   llap-client/pom.xml 0243340 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java
>  99ead9b 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java
>  e9456f2 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
>  64d2617 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java 
> PRE-CREATION 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java
>  921e050 
>   
> llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java
>  PRE-CREATION 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
>  17ce69b 
>   
> llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
>  efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java
>  54f7363 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

2016-09-01 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/
---

(Updated Sept. 1, 2016, 6:35 p.m.)


Review request for hive and Prasanth_J.


Repository: hive-git


Description
---

see jira


Diffs (updated)
-

  llap-client/pom.xml 0243340 
  
llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java
 99ead9b 
  
llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java
 e9456f2 
  
llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
 64d2617 
  llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java 
PRE-CREATION 
  
llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 
921e050 
  
llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java
 PRE-CREATION 
  
llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
 17ce69b 
  
llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
 efd774d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java
 54f7363 

Diff: https://reviews.apache.org/r/51312/diff/


Testing
---


Thanks,

Sergey Shelukhin



Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

2016-09-01 Thread Sergey Shelukhin


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java,
> >  line 561
> > 
> >
> > Here, as well as other places, a node is only available when it's slot 
> > has been registered. Otherwise it should not be visible to clients.
> 
> Sergey Shelukhin wrote:
> I think it only makes sense for ordered, to not mess up the order. 
> Changed that. It's also problematic because we'd have to keep track of 
> coordinated notifications.
> 
> Siddharth Seth wrote:
> Assuming notifications for registered workers behave in the following 
> manner.
> 
> NewNode - only when the slot has been registered.
> Node down - when either the slot or the actual node goes away.
> State changed - I'm not sure anyone is expected to act on this (what are 
> the possible notifications here).

Why? It doesn't seem to provide a lot of benefit compared to existing worker 
node handling, in presence of crashes/etc.


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java,
> >  line 554
> > 
> >
> > The size of this collection is used to determine the #knownWorkers in 
> > HostAffinitySplitLocationProvider. The size at the moment represents active 
> > nodes.
> > 
> > I don't think the intent is to return a fewer number of nodes than what 
> > existed if a node goes down? 
> > 1) Return n entries, and one entry would indicate it's not active (and 
> > this will need to be acted upon by all clients
> > 2) Add a getNumInstances itnerface, which can differ from actual nodes 
> > available, along with a getNodeAtLocation interface?
> 
> Sergey Shelukhin wrote:
> HIVE-14680
> 
> Siddharth Seth wrote:
> Think this is very incomplete without this change. I don't think it's any 
> worse than it is today though.

Why? It handles consistency in most scenarios as described. There's just a 
small window during the crash where splits will not get good locations.


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147362
---


On Aug. 31, 2016, 10:25 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> ---
> 
> (Updated Aug. 31, 2016, 10:25 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   llap-client/pom.xml 0243340 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java
>  99ead9b 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java
>  e9456f2 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
>  64d2617 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java 
> PRE-CREATION 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java
>  921e050 
>   
> llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java
>  PRE-CREATION 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
>  17ce69b 
>   
> llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
>  efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java
>  54f7363 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

2016-09-01 Thread Siddharth Seth

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147548
---



Will look at the updated patch tomorrow.

- Siddharth Seth


On Aug. 31, 2016, 10:25 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> ---
> 
> (Updated Aug. 31, 2016, 10:25 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   llap-client/pom.xml 0243340 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java
>  99ead9b 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java
>  e9456f2 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
>  64d2617 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java 
> PRE-CREATION 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java
>  921e050 
>   
> llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java
>  PRE-CREATION 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
>  17ce69b 
>   
> llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
>  efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java
>  54f7363 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

2016-09-01 Thread Siddharth Seth


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java,
> >  line 538
> > 
> >
> > There's a lot of similar code to read records within a loop, and a 
> > method to read records. Move into a single method? Can be done in a 
> > separate jiras as well.
> 
> Sergey Shelukhin wrote:
> That code sets several variables, which makes it impossible (or too 
> verbose) to refactor in Java

Think it can be attempted in a separate jira. Looking at this in IDEA with it's 
duplicate code detection is a lot of squigly lines.


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java,
> >  line 554
> > 
> >
> > The size of this collection is used to determine the #knownWorkers in 
> > HostAffinitySplitLocationProvider. The size at the moment represents active 
> > nodes.
> > 
> > I don't think the intent is to return a fewer number of nodes than what 
> > existed if a node goes down? 
> > 1) Return n entries, and one entry would indicate it's not active (and 
> > this will need to be acted upon by all clients
> > 2) Add a getNumInstances itnerface, which can differ from actual nodes 
> > available, along with a getNodeAtLocation interface?
> 
> Sergey Shelukhin wrote:
> HIVE-14680

Think this is very incomplete without this change. I don't think it's any worse 
than it is today though.


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java,
> >  line 561
> > 
> >
> > Here, as well as other places, a node is only available when it's slot 
> > has been registered. Otherwise it should not be visible to clients.
> 
> Sergey Shelukhin wrote:
> I think it only makes sense for ordered, to not mess up the order. 
> Changed that. It's also problematic because we'd have to keep track of 
> coordinated notifications.

Assuming notifications for registered workers behave in the following manner.

NewNode - only when the slot has been registered.
Node down - when either the slot or the actual node goes away.
State changed - I'm not sure anyone is expected to act on this (what are the 
possible notifications here).


- Siddharth


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147362
---


On Aug. 31, 2016, 10:25 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> ---
> 
> (Updated Aug. 31, 2016, 10:25 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   llap-client/pom.xml 0243340 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java
>  99ead9b 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java
>  e9456f2 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
>  64d2617 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java 
> PRE-CREATION 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java
>  921e050 
>   
> llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java
>  PRE-CREATION 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
>  17ce69b 
>   
> llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
>  efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java
>  54f7363 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

2016-08-31 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147499
---




ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java
 (line 82)


hmm, this is bogus change


- Sergey Shelukhin


On Aug. 31, 2016, 10:25 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> ---
> 
> (Updated Aug. 31, 2016, 10:25 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   llap-client/pom.xml 0243340 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java
>  99ead9b 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java
>  e9456f2 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
>  64d2617 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java 
> PRE-CREATION 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java
>  921e050 
>   
> llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java
>  PRE-CREATION 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
>  17ce69b 
>   
> llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
>  efd774d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java
>  54f7363 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

2016-08-31 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/
---

(Updated Aug. 31, 2016, 10:25 p.m.)


Review request for hive and Prasanth_J.


Repository: hive-git


Description
---

see jira


Diffs (updated)
-

  llap-client/pom.xml 0243340 
  
llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java
 99ead9b 
  
llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java
 e9456f2 
  
llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
 64d2617 
  llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java 
PRE-CREATION 
  
llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java 
921e050 
  
llap-client/src/test/org/apache/hadoop/hive/llap/registry/impl/TestSlotZnode.java
 PRE-CREATION 
  
llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
 17ce69b 
  
llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
 efd774d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java
 54f7363 

Diff: https://reviews.apache.org/r/51312/diff/


Testing
---


Thanks,

Sergey Shelukhin



Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

2016-08-31 Thread Sergey Shelukhin


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java,
> >  line 98
> > 
> >
> > This isn't part of the patch uploaded to jira?

this was in the previous jira patch


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147362
---


On Aug. 23, 2016, 1:41 a.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> ---
> 
> (Updated Aug. 23, 2016, 1:41 a.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java
>  99ead9b 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java
>  e9456f2 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
>  64d2617 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java 
> PRE-CREATION 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java
>  921e050 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
>  17ce69b 
>   
> llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
>  efd774d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java
>  c06499e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java
>  d98a5ff 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

2016-08-31 Thread Sergey Shelukhin


> On Aug. 30, 2016, 10:36 p.m., Siddharth Seth wrote:
> > Unrelated to this patch - Any idea why 'uniq' is static/ (private static 
> > final UUID uniq = UUID.randomUUID();)

it's daemon identity, so it's global per daemon


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147377
---


On Aug. 23, 2016, 1:41 a.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> ---
> 
> (Updated Aug. 23, 2016, 1:41 a.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java
>  99ead9b 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java
>  e9456f2 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
>  64d2617 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java 
> PRE-CREATION 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java
>  921e050 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
>  17ce69b 
>   
> llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
>  efd774d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java
>  c06499e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java
>  d98a5ff 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

2016-08-31 Thread Sergey Shelukhin


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java,
> >  line 104
> > 
> >
> > Move this into a separate path altogether, instead of listing and 
> > filtering at the same path each time?

That would require multiple listeners, acl checks, etc.


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java,
> >  line 538
> > 
> >
> > There's a lot of similar code to read records within a loop, and a 
> > method to read records. Move into a single method? Can be done in a 
> > separate jiras as well.

That code sets several variables, which makes it impossible (or too verbose) to 
refactor in Java


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java,
> >  line 554
> > 
> >
> > The size of this collection is used to determine the #knownWorkers in 
> > HostAffinitySplitLocationProvider. The size at the moment represents active 
> > nodes.
> > 
> > I don't think the intent is to return a fewer number of nodes than what 
> > existed if a node goes down? 
> > 1) Return n entries, and one entry would indicate it's not active (and 
> > this will need to be acted upon by all clients
> > 2) Add a getNumInstances itnerface, which can differ from actual nodes 
> > available, along with a getNodeAtLocation interface?

HIVE-14680


> On Aug. 30, 2016, 10:35 p.m., Siddharth Seth wrote:
> > llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java,
> >  line 561
> > 
> >
> > Here, as well as other places, a node is only available when it's slot 
> > has been registered. Otherwise it should not be visible to clients.

I think it only makes sense for ordered, to not mess up the order. Changed 
that. It's also problematic because we'd have to keep track of coordinated 
notifications.


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147362
---


On Aug. 23, 2016, 1:41 a.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> ---
> 
> (Updated Aug. 23, 2016, 1:41 a.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java
>  99ead9b 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java
>  e9456f2 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
>  64d2617 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java 
> PRE-CREATION 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java
>  921e050 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
>  17ce69b 
>   
> llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
>  efd774d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java
>  c06499e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java
>  d98a5ff 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

2016-08-30 Thread Siddharth Seth

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147377
---



Unrelated to this patch - Any idea why 'uniq' is static/ (private static final 
UUID uniq = UUID.randomUUID();)

- Siddharth Seth


On Aug. 23, 2016, 1:41 a.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> ---
> 
> (Updated Aug. 23, 2016, 1:41 a.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java
>  99ead9b 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java
>  e9456f2 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
>  64d2617 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java 
> PRE-CREATION 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/security/LlapTokenClient.java
>  921e050 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
>  17ce69b 
>   
> llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
>  efd774d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HostAffinitySplitLocationProvider.java
>  c06499e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/Utils.java 8a4fc08 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java
>  d98a5ff 
> 
> Diff: https://reviews.apache.org/r/51312/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 51312: HIVE-14589 add consistent node replacement to LLAP for splits

2016-08-30 Thread Siddharth Seth

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51312/#review147362
---



Possible to add a test (or a few)? (I don' see one in the patch attached to the 
jira).

Question: How long does it take for a persistent node to go away - in case of a 
JVM crash / node crash. Is it possible that a new process starts up within the 
duration it takes for the old node-slot entry to be removed? (It gets added to 
the end as a result)

In terms of a permanent cluster size reduction - I think we should at least 
take some steps to handle this scenario, or an equivalent scenario where it 
takes a long time (minutes) for a node to come back. We have a force locality 
scheduling mode, this will effectively cause new queries to start in a state 
where they cannot complete. Check for this and fail the query?, or fallback to 
a next node for the split. The fallback to the next node is something that is 
going to come in soon anyway, to control locality if nodes are not available 
(current is always random as against 'random' being a configurable option :) )


llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
 (line 104)


Move this into a separate path altogether, instead of listing and filtering 
at the same path each time?



llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
 (line 110)


woekersPath is no long defined, so the comment can be deleted.



llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
 (line 533)


There's a lot of similar code to read records within a loop, and a method 
to read records. Move into a single method? Can be done in a separate jiras as 
well.



llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
 (line 547)


The size of this collection is used to determine the #knownWorkers in 
HostAffinitySplitLocationProvider. The size at the moment represents active 
nodes.

I don't think the intent is to return a fewer number of nodes than what 
existed if a node goes down? 
1) Return n entries, and one entry would indicate it's not active (and this 
will need to be acted upon by all clients
2) Add a getNumInstances itnerface, which can differ from actual nodes 
available, along with a getNodeAtLocation interface?



llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
 (line 554)


Here, as well as other places, a node is only available when it's slot has 
been registered. Otherwise it should not be visible to clients.



llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapZookeeperRegistryImpl.java
 (line 577)


Don't send back a node until it's slot registration has completed. We don't 
know what position it will take. This logic puts such nodes at the end.



llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java 
(line 51)


Are there tests in curator for the said node, which we could borrow parts 
from?



llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java 
(line 128)


Nit: else {
  slots.add( ..)
}

Makes it a little more readable, and maintanable for future changes.



llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/SlotZnode.java 
(line 138)


Potential divide by 0? (new cluster)

What's the purpose of the 2.0f/approxWorkerCount ?



ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestHostAffinitySplitLocationProvider.java
 (line 94)


This isn't part of the patch uploaded to jira?


- Siddharth Seth


On Aug. 23, 2016, 1:41 a.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51312/
> ---
> 
> (Updated Aug. 23, 2016, 1:41 a.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/ServiceInstanceSet.java
>  99ead9b 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java
>  e9456f2 
>