[GitHub] helix pull request #275: PR

2018-10-26 Thread narendly
GitHub user narendly opened a pull request:

https://github.com/apache/helix/pull/275

PR



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/narendly/helix master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/helix/pull/275.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #275


commit e7b960c22896c08337292d20f674f20a7f1391d0
Author: Hunter Lee 
Date:   2018-10-27T01:32:16Z

[HELIX-762] TASK: Change LOG mode from info to debug

In production, it was observed that some users were running thousands of
tasks. Because AssignableInstance logs a line for each task assigned or
released, the volume of log output was excessive.
Changelist:
1. Change the logging mode from info to debug in AssignableInstance and 
AssignableInstanceManager
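
A minimal sketch of the kind of change described above. It uses
java.util.logging so that it is self-contained, with Level.FINE standing in
for "debug"; the logging framework used by Helix itself is not shown in this
message. TaskSlotTracker and its methods are hypothetical names, not Helix
classes.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for a class that tracks per-task assignment; not Helix code.
class TaskSlotTracker {
  private static final Logger LOG = Logger.getLogger(TaskSlotTracker.class.getName());
  private int assigned = 0;

  void assign(String taskId) {
    assigned++;
    // Per-task message logged at FINE (java.util.logging's rough analog of debug)
    // instead of INFO, so it is skipped unless fine-grained logging is enabled.
    LOG.log(Level.FINE, "Assigned task {0}; {1} task(s) now assigned",
        new Object[] {taskId, assigned});
  }

  void release(String taskId) {
    assigned--;
    LOG.log(Level.FINE, "Released task {0}; {1} task(s) now assigned",
        new Object[] {taskId, assigned});
  }

  int assignedCount() {
    return assigned;
  }
}
```

The bookkeeping is unchanged; only the level of the per-task messages moves,
so the lines can still be turned on when investigating a specific issue.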

commit e492d9f663d8edad0f344208cc8affc6828708a3
Author: Hunter Lee 
Date:   2018-10-27T01:49:52Z

[HELIX-763] TASK: Ignore tasks whose workflow and job are inactive

Manual testing revealed tasks in INIT and RUNNING states that were
occupying a thread count even though their parent job or workflow was in an
inactive state (terminal or stopped). This happened when the capacities were
being rebuilt from scratch, and it could have caused a thread leak.
Changelist:
1. Add a check in buildAssignableInstances() so that it ignores workflows 
and jobs whose states are inactive states (that is, their tasks cannot be 
occupying a thread on Participants)
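
The check described above can be sketched roughly as follows. This is an
illustration under stated assumptions, not the actual
buildAssignableInstances() code: the enums, TaskRecord, and
countOccupiedThreads() are hypothetical and much simpler than Helix's own
types.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

class CapacityRebuildSketch {
  // Hypothetical, simplified states; not Helix's real enums.
  enum ParentState { IN_PROGRESS, STOPPED, COMPLETED, FAILED }
  enum TaskState { INIT, RUNNING, COMPLETED }

  // Assumed meaning of "inactive": stopped or terminal.
  static final Set<ParentState> INACTIVE =
      EnumSet.of(ParentState.STOPPED, ParentState.COMPLETED, ParentState.FAILED);

  static final class TaskRecord {
    final ParentState workflowState;
    final ParentState jobState;
    final TaskState taskState;

    TaskRecord(ParentState workflowState, ParentState jobState, TaskState taskState) {
      this.workflowState = workflowState;
      this.jobState = jobState;
      this.taskState = taskState;
    }
  }

  // Counts tasks that should occupy a thread when capacities are rebuilt.
  static int countOccupiedThreads(List<TaskRecord> tasks) {
    int occupied = 0;
    for (TaskRecord t : tasks) {
      // Skip tasks whose workflow or job is inactive: they cannot be holding a thread.
      if (INACTIVE.contains(t.workflowState) || INACTIVE.contains(t.jobState)) {
        continue;
      }
      if (t.taskState == TaskState.INIT || t.taskState == TaskState.RUNNING) {
        occupied++;
      }
    }
    return occupied;
  }
}
```

Without the inactive-parent check, a stale INIT or RUNNING task state under a
stopped job would be counted on every rebuild and would look like a leaked
thread.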

commit d33d9efea25fe9d29e84a4ce7614b544ef2d
Author: Hunter Lee 
Date:   2018-10-27T02:03:47Z

[HELIX-764] TASK: Fix LiveInstanceCurrentState change flag

Previously, existsLiveInstanceOrCurrentStateChange was reset in
ClusterDataCache whenever its getter was called. This was problematic because,
with multiple jobs or workflows, only the first caller of the getter got the
correct flag value; subsequent callers got false because the flag had already
been reset. This change fixes the bug by resetting the flag at the beginning
of the refresh() call in ClusterDataCache, so that all callers during that
pipeline get the same, correct value.
Changelist:
1. Change the getter so that it does not reset the flag; instead, reset the 
flag in the beginning of refresh()
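
The difference between the two behaviors can be illustrated with a small
sketch. FlagCacheSketch is a hypothetical class, and the sourceVersion
parameter is an assumed stand-in for whatever refresh() compares to detect a
LiveInstance or CurrentState change; none of this is the real
ClusterDataCache code.

```java
class FlagCacheSketch {
  private boolean existsChange = false;
  private int lastSeenVersion = 0;

  // New behavior: reset the flag once, at the start of each refresh,
  // then set it again only if this refresh detects a change.
  void refresh(int sourceVersion) {
    existsChange = false;
    if (sourceVersion != lastSeenVersion) {
      existsChange = true;
      lastSeenVersion = sourceVersion;
    }
  }

  // New behavior: a plain getter with no side effects, so every caller
  // within the same pipeline run sees the same value.
  boolean getExistsChange() {
    return existsChange;
  }

  // Old behavior, kept only for comparison: reading the flag clears it,
  // so only the first caller after a change sees true.
  boolean getAndResetExistsChange() {
    boolean value = existsChange;
    existsChange = false;
    return value;
  }
}
```

With the side-effect-free getter, two jobs that both read the flag after the
same refresh get the same answer; with the read-and-reset getter, the second
job sees false.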

commit 930a4b7ae7eb63be0a751a593ba630ae55fb2cfb
Author: Hunter Lee 
Date:   2018-10-27T02:06:42Z

[HELIX-765] TASK: Build quota profile from scratch every rebalance

It has been reported that instances have a full quota despite no tasks 
existing in their CURRENTSTATES. The cause of this is not clear, so making 
ClusterDataCache trigger a refresh of all AssignableInstances will ensure that 
there aren't situations where it looks like there has been a thread leak. 
Optimizations will be implemented if necessary.
Changelist:
1. Make AssignableInstanceManager build all AssignableInstances from 
scratch every rebalance
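
As a rough illustration of why rebuilding helps, the sketch below contrasts
incremental bookkeeping with a full rebuild. QuotaProfileSketch and its
methods are hypothetical, and the input to rebuild() is an assumed
simplification (one instance name per task that currently occupies a thread);
it is not how AssignableInstanceManager is implemented.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class QuotaProfileSketch {
  // Instance name -> number of threads currently counted as used.
  private Map<String, Integer> used = new HashMap<>();

  // Incremental bookkeeping: if a matching release is ever missed,
  // the count stays too high indefinitely.
  void assign(String instance) {
    used.merge(instance, 1, Integer::sum);
  }

  // Rebuild from scratch: discard all previous counts and recount from the
  // tasks that currently occupy a thread, so stale counts cannot survive.
  void rebuild(List<String> instancesOfOccupyingTasks) {
    Map<String, Integer> fresh = new HashMap<>();
    for (String instance : instancesOfOccupyingTasks) {
      fresh.merge(instance, 1, Integer::sum);
    }
    used = fresh;
  }

  int usedThreads(String instance) {
    return used.getOrDefault(instance, 0);
  }
}
```

The trade-off is extra work on every rebalance in exchange for counts that
always match the current state, which is consistent with the note that
optimizations will follow only if necessary.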

commit 5033785c231af363953367f65f77513911b753f5
Author: Hunter Lee 
Date:   2018-10-27T02:08:02Z

[HELIX-766] TASK: Add logging functionality in AssignableInstanceManager

In order to debug task-related inquiries and issues, we realized that it
would be very helpful to have a log recording the current quota capacity of
all AssignableInstances. This is for cases where we see jobs whose tasks are
not getting assigned, so that we can quickly rule out the possibility of bugs
in quota-based scheduling.
Changelist:
1. Add a method that logs the current quota profile in JSON format, with an
optional flag to log only when there are quota types whose capacities are
full
2. Add info logs in AssignableInstanceManager
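
A minimal sketch of such a logging method, under stated assumptions: the
class and method names, the {used, total} representation, and the JSON layout
are all hypothetical, and the JSON is built by hand (without escaping) only
to keep the example free of external libraries.

```java
import java.util.Map;
import java.util.logging.Logger;

class QuotaProfileLogger {
  private static final Logger LOG = Logger.getLogger(QuotaProfileLogger.class.getName());

  // profile: instance name -> quota type -> {used, total}.
  // Keys are assumed to be simple names that need no JSON escaping.
  static String toJson(Map<String, Map<String, int[]>> profile) {
    StringBuilder sb = new StringBuilder("{");
    String instanceSep = "";
    for (Map.Entry<String, Map<String, int[]>> inst : profile.entrySet()) {
      sb.append(instanceSep).append('"').append(inst.getKey()).append("\":{");
      instanceSep = ",";
      String typeSep = "";
      for (Map.Entry<String, int[]> quota : inst.getValue().entrySet()) {
        sb.append(typeSep).append('"').append(quota.getKey()).append("\":{\"used\":")
            .append(quota.getValue()[0]).append(",\"total\":")
            .append(quota.getValue()[1]).append('}');
        typeSep = ",";
      }
      sb.append('}');
    }
    return sb.append('}').toString();
  }

  // True if any quota type on any instance has used all of its capacity.
  static boolean anyQuotaFull(Map<String, Map<String, int[]>> profile) {
    for (Map<String, int[]> quotas : profile.values()) {
      for (int[] usage : quotas.values()) {
        if (usage[0] >= usage[1]) {
          return true;
        }
      }
    }
    return false;
  }

  // Logs the profile; with onlyDisplayIfFull set, logs only when some quota
  // type is full. Returns whether a line was logged.
  static boolean logQuotaProfile(Map<String, Map<String, int[]>> profile,
      boolean onlyDisplayIfFull) {
    if (onlyDisplayIfFull && !anyQuotaFull(profile)) {
      return false;
    }
    LOG.info("Current quota profile: " + toJson(profile));
    return true;
  }
}
```

Passing true for the flag keeps the log quiet in the common case and produces
output only when a full quota type could explain why tasks are not being
assigned.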




---


[jira] [Created] (HELIX-766) [TASK] Add logging functionality in AssignableInstanceManager

2018-10-26 Thread Hunter L (JIRA)
Hunter L created HELIX-766:
--

 Summary: [TASK] Add logging functionality in 
AssignableInstanceManager
 Key: HELIX-766
 URL: https://issues.apache.org/jira/browse/HELIX-766
 Project: Apache Helix
  Issue Type: Improvement
Reporter: Hunter L
Assignee: Hunter L


In order to debug task-related inquiries and issues, we realized that it would
be very helpful to have a log recording the current quota capacity of all
AssignableInstances. This is for cases where we see jobs whose tasks are not
getting assigned, so that we can quickly rule out the possibility of bugs in
quota-based scheduling.
Changelist:
1. Add a method that logs the current quota profile in JSON format, with an
optional flag to log only when there are quota types whose capacities are
full
2. Add info logs in AssignableInstanceManager



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HELIX-765) [TASK] Build quota profile from scratch every rebalance

2018-10-26 Thread Hunter L (JIRA)
Hunter L created HELIX-765:
--

 Summary: [TASK] Build quota profile from scratch every rebalance
 Key: HELIX-765
 URL: https://issues.apache.org/jira/browse/HELIX-765
 Project: Apache Helix
  Issue Type: Improvement
Reporter: Hunter L
Assignee: Hunter L


It has been reported that instances have a full quota despite no tasks existing
in their CURRENTSTATES. The cause of this is not clear, so making
ClusterDataCache trigger a refresh of all AssignableInstances will ensure that
there aren't situations where it looks like there has been a thread leak.
Optimizations will be implemented if necessary.
Changelist:
1. Make AssignableInstanceManager build all AssignableInstances from scratch
every rebalance





[jira] [Created] (HELIX-764) [TASK] Fix LiveInstanceCurrentState change flag

2018-10-26 Thread Hunter L (JIRA)
Hunter L created HELIX-764:
--

 Summary: [TASK] Fix LiveInstanceCurrentState change flag
 Key: HELIX-764
 URL: https://issues.apache.org/jira/browse/HELIX-764
 Project: Apache Helix
  Issue Type: Improvement
Reporter: Hunter L
Assignee: Hunter L


Previously, existsLiveInstanceOrCurrentStateChange was reset in
ClusterDataCache whenever its getter was called. This was problematic because,
with multiple jobs or workflows, only the first caller of the getter got the
correct flag value; subsequent callers got false because the flag had already
been reset. This change fixes the bug by resetting the flag at the beginning
of the refresh() call in ClusterDataCache, so that all callers during that
pipeline get the same, correct value.
Changelist:
1. Change the getter so that it does not reset the flag; instead, reset the 
flag in the beginning of refresh()





[jira] [Created] (HELIX-763) [TASK] Ignore tasks whose workflow and job are inactive

2018-10-26 Thread Hunter L (JIRA)
Hunter L created HELIX-763:
--

 Summary: [TASK] Ignore tasks whose workflow and job are inactive
 Key: HELIX-763
 URL: https://issues.apache.org/jira/browse/HELIX-763
 Project: Apache Helix
  Issue Type: Improvement
Reporter: Hunter L
Assignee: Hunter L


Manual testing revealed tasks in INIT and RUNNING states that were occupying a
thread count even though their parent job or workflow was in an inactive state
(terminal or stopped). This happened when the capacities were being rebuilt
from scratch, and it could have caused a thread leak.
Changelist:
1. Add a check in buildAssignableInstances() so that it ignores workflows and
jobs whose states are inactive states (that is, their tasks cannot be
occupying a thread on Participants)





[jira] [Created] (HELIX-762) [TASK] Change LOG mode from info to debug

2018-10-26 Thread Hunter L (JIRA)
Hunter L created HELIX-762:
--

 Summary: [TASK] Change LOG mode from info to debug
 Key: HELIX-762
 URL: https://issues.apache.org/jira/browse/HELIX-762
 Project: Apache Helix
  Issue Type: Improvement
Reporter: Hunter L
Assignee: Hunter L


In production, it was observed that some users were running thousands of tasks.
Because AssignableInstance logs a line for each task assigned or released, the
volume of log output was excessive.
Changelist:
1. Change the logging mode from info to debug in AssignableInstance and 
AssignableInstanceManager





[jira] [Commented] (HELIX-756) TASK: Change LOG mode from info to debug

2018-10-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665820#comment-16665820
 ] 

ASF GitHub Bot commented on HELIX-756:
--

Github user narendly closed the pull request at:

https://github.com/apache/helix/pull/271


> TASK: Change LOG mode from info to debug
> 
>
> Key: HELIX-756
> URL: https://issues.apache.org/jira/browse/HELIX-756
> Project: Apache Helix
>  Issue Type: Improvement
>Reporter: Hunter L
>Assignee: Hunter L
>Priority: Major
>
> In production, it was observed that some users were running thousands of 
> tasks, and since AssignableInstance leaves a line of log for each task 
> assigned or released, the amount of log that was being generated was too 
> much, and it was too verbose.
> Changelist:
> 1. Change the logging mode from info to debug in AssignableInstance and 
> AssignableInstanceManager





[GitHub] helix pull request #271: [HELIX-756] TASK: Change LOG mode from info to debu...

2018-10-26 Thread narendly
Github user narendly closed the pull request at:

https://github.com/apache/helix/pull/271


---