[jira] [Created] (HELIX-774) Helix process getting increased day by day

2018-10-31 Thread Mohanraj Tirougnaname (JIRA)
Mohanraj Tirougnaname created HELIX-774:
---

 Summary: Helix process getting increased day by day 
 Key: HELIX-774
 URL: https://issues.apache.org/jira/browse/HELIX-774
 Project: Apache Helix
  Issue Type: Bug
  Components: helix-webapp-admin
Affects Versions: 0.6.5
 Environment: Linux
Reporter: Mohanraj Tirougnaname
 Fix For: 0.6.5


Hi Team,

We are using Helix for cluster load balancing. When the jBPM server is started, 
3 Helix processes are added to the process list every day, and they take a lot 
of memory. One of the Helix processes is shown below; please help me fix this.

jboss 31027 1 0 Oct19 ? 00:04:27 
/u01/app/mw/jdk1.8.0_121/bin/java -Xms512m -Xmx512m -classpath 
/u01/app/mw/prod_zk/helix/conf
/u01/app/mw/prod_zk/helix/repo/log4j/log4j/1.2.15/log4j-1.2.15.jar
/u01/app/mw/prod_zk/helix/repo/org/apache/zookeeper/zookeeper/3.3.4/zookeeper-3.3.4.jar
/u01/app/mw/prod_zk/helix/repo/jline/jline/0.9.94/jline-0.9.94.jar
/u01/app/mw/prod_zk/helix/repo/org/codehaus/jackson/jackson-core-asl/1.8.5/jackson-core-asl-1.8.5.jar
/u01/app/mw/prod_zk/helix/repo/org/codehaus/jackson/jackson-mapper-asl/1.8.5/jackson-mapper-asl-1.8.5.jar
/u01/app/mw/prod_zk/helix/repo/commons-io/commons-io/1.4/commons-io-1.4.jar
/u01/app/mw/prod_zk/helix/repo/commons-cli/commons-cli/1.2/commons-cli-1.2.jar
/u01/app/mw/prod_zk/helix/repo/com/github/sgroschupf/zkclient/0.1/zkclient-0.1.jar
/u01/app/mw/prod_zk/helix/repo/org/apache/commons/commons-math/2.1/commons-math-2.1.jar
/u01/app/mw/prod_zk/helix/repo/commons-codec/commons-codec/1.6/commons-codec-1.6.jar
/u01/app/mw/prod_zk/helix/repo/com/google/guava/guava/15.0/guava-15.0.jar
/u01/app/mw/prod_zk/helix/repo/org/yaml/snakeyaml/1.12/snakeyaml-1.12.jar
/u01/app/mw/prod_zk/helix/repo/org/apache/helix/helix-core/0.6.5/helix-core-0.6.5.jar
 -Dapp.name=run-helix-controller -Dapp.pid=31027 
-Dapp.repo=/u01/app/mw/prod_zk/helix/repo -Dbasedir=/u01/app/mw/prod_zk/helix 
org.apache.helix.controller.HelixControllerMain --zkSvr 
204.26.160.42:4181,204.26.160.43:4181 --cluster repoCluster3



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


helix git commit: [HELIX-772] add TaskDriver.addUserContent() api and related tests

2018-10-31 Thread jxue
Repository: helix
Updated Branches:
  refs/heads/master 0beeb8fa2 -> 0c251bbf6


[HELIX-772] add TaskDriver.addUserContent() api and related tests


Project: http://git-wip-us.apache.org/repos/asf/helix/repo
Commit: http://git-wip-us.apache.org/repos/asf/helix/commit/0c251bbf
Tree: http://git-wip-us.apache.org/repos/asf/helix/tree/0c251bbf
Diff: http://git-wip-us.apache.org/repos/asf/helix/diff/0c251bbf

Branch: refs/heads/master
Commit: 0c251bbf640206729755301c3dda734eea78343f
Parents: 0beeb8f
Author: Harry Zhang 
Authored: Tue Oct 30 16:25:12 2018 -0700
Committer: Harry Zhang 
Committed: Wed Oct 31 13:47:52 2018 -0700

--
 .../java/org/apache/helix/task/TaskDriver.java  |  46 -
 .../java/org/apache/helix/task/TaskUtil.java|  52 +++--
 .../task/TestIndependentTaskRebalancer.java |  43 ++--
 .../helix/task/TestGetSetUserContentStore.java  | 206 +++
 .../helix/task/TestGetUserContentStore.java | 144 -
 5 files changed, 309 insertions(+), 182 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/helix/blob/0c251bbf/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
--
diff --git a/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java 
b/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
index ea529e8..e6256ed 100644
--- a/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
+++ b/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
@@ -19,8 +19,26 @@ package org.apache.helix.task;
  * under the License.
  */
 
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
 import org.I0Itec.zkclient.DataUpdater;
-import org.apache.helix.*;
+import org.apache.helix.AccessOption;
+import org.apache.helix.BaseDataAccessor;
+import org.apache.helix.ConfigAccessor;
+import org.apache.helix.HelixAdmin;
+import org.apache.helix.HelixDataAccessor;
+import org.apache.helix.HelixException;
+import org.apache.helix.HelixManager;
+import org.apache.helix.PropertyKey;
+import org.apache.helix.PropertyPathBuilder;
+import org.apache.helix.SystemPropertyKeys;
+import org.apache.helix.ZNRecord;
 import org.apache.helix.controller.rebalancer.util.RebalanceScheduler;
 import org.apache.helix.manager.zk.ZKHelixAdmin;
 import org.apache.helix.manager.zk.ZKHelixDataAccessor;
@@ -35,8 +53,6 @@ import org.apache.helix.util.HelixUtil;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-import java.util.*;
-
 /**
  * CLI for scheduling/canceling workflows
  */
@@ -1003,6 +1019,30 @@ public class TaskDriver {
   }
 
   /**
+   * Set user content defined by the given key and string
+   * @param key content key
+   * @param value content value
+   * @param workflowName name of the workflow - must provide when scope is 
WORKFLOW
+   * @param jobName name of the job - must provide when scope is JOB or TASK
+   * @param taskName name of the task - must provide when scope is TASK
+   * @param scope scope of the content
+   */
+  public void addUserContent(String key, String value, String workflowName, 
String jobName, String taskName,
+  UserContentStore.Scope scope) {
+switch (scope) {
+case WORKFLOW:
+  TaskUtil.addWorkflowJobUserContent(_propertyStore, workflowName, key, 
value);
+  break;
+case JOB:
+  TaskUtil.addWorkflowJobUserContent(_propertyStore, jobName, key, value);
+  break;
+default:
+  TaskUtil.addTaskUserContent(_propertyStore, jobName, taskName, key, 
value);
+  break;
+}
+  }
+
+  /**
* Throw Exception if children nodes will exceed limitation after adding 
newNodesCount children.
* @param newConfigNodeCount
*/

http://git-wip-us.apache.org/repos/asf/helix/blob/0c251bbf/helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
--
diff --git a/helix-core/src/main/java/org/apache/helix/task/TaskUtil.java 
b/helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
index 1ce448c..3461233 100644
--- a/helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
+++ b/helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
@@ -19,7 +19,6 @@ package org.apache.helix.task;
  * under the License.
  */
 
-import com.google.common.collect.Sets;
 import java.io.IOException;
 import java.util.Collections;
 import java.util.HashMap;
@@ -41,12 +40,13 @@ import org.apache.helix.model.HelixConfigScope;
 import org.apache.helix.model.ResourceConfig;
 import org.apache.helix.model.builder.HelixConfigScopeBuilder;
 import org.apache.helix.store.HelixPropertyStore;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
 import org.codehaus.jackson.ma

[jira] [Commented] (HELIX-772) Support TaskDriver.addUserContent() api

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670698#comment-16670698
 ] 

ASF GitHub Bot commented on HELIX-772:
--

Github user asfgit closed the pull request at:

https://github.com/apache/helix/pull/280


> Support TaskDriver.addUserContent() api
> ---
>
> Key: HELIX-772
> URL: https://issues.apache.org/jira/browse/HELIX-772
> Project: Apache Helix
>  Issue Type: Bug
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Need to support adding user content in the task driver
>  
> AC:
>  * implement API
>  * add test
>  





helix git commit: [HELIX-773] add getLastScheduledTaskTimestamp information in workflow rest api

2018-10-31 Thread jxue
Repository: helix
Updated Branches:
  refs/heads/master 0c251bbf6 -> 566d4f166


[HELIX-773] add getLastScheduledTaskTimestamp information in workflow rest api


Project: http://git-wip-us.apache.org/repos/asf/helix/repo
Commit: http://git-wip-us.apache.org/repos/asf/helix/commit/566d4f16
Tree: http://git-wip-us.apache.org/repos/asf/helix/tree/566d4f16
Diff: http://git-wip-us.apache.org/repos/asf/helix/diff/566d4f16

Branch: refs/heads/master
Commit: 566d4f166473b477ea0db1cfba5d04c8f3d6bf30
Parents: 0c251bb
Author: Harry Zhang 
Authored: Tue Oct 30 16:43:25 2018 -0700
Committer: Harry Zhang 
Committed: Wed Oct 31 13:48:46 2018 -0700

--
 .../java/org/apache/helix/task/TaskDriver.java  |  22 +++-
 .../apache/helix/task/TaskExecutionInfo.java|  65 ++
 .../task/TestGetLastScheduledTaskExecInfo.java  | 122 +++
 .../task/TestGetLastScheduledTaskTimestamp.java | 110 -
 .../resources/helix/WorkflowAccessor.java   |   5 +-
 .../helix/rest/server/TestWorkflowAccessor.java |   9 +-
 6 files changed, 213 insertions(+), 120 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/helix/blob/566d4f16/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
--
diff --git a/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java 
b/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
index e6256ed..54e3ab3 100644
--- a/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
+++ b/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
@@ -85,7 +85,6 @@ public class TaskDriver {
   // TODO Implement or configure the limitation in ZK server.
   private final static long DEFAULT_CONFIGS_LIMITATION =
   
HelixUtil.getSystemPropertyAsLong(SystemPropertyKeys.TASK_CONFIG_LIMITATION, 
10L);
-  private final static long TIMESTAMP_NOT_SET = -1L;
   private final static String TASK_START_TIME_KEY = "START_TIME";
   protected long _configsLimitation = DEFAULT_CONFIGS_LIMITATION;
 
@@ -977,14 +976,22 @@ public class TaskDriver {
* -1L if timestamp is not set (either nothing is scheduled or no start time 
recorded).
*/
   public long getLastScheduledTaskTimestamp(String workflowName) {
-long lastScheduledTaskTimestamp = TIMESTAMP_NOT_SET;
+return getLastScheduledTaskExecutionInfo(workflowName).getStartTimeStamp();
+  }
+
+  public TaskExecutionInfo getLastScheduledTaskExecutionInfo(String 
workflowName) {
+long lastScheduledTaskTimestamp = TaskExecutionInfo.TIMESTAMP_NOT_SET;
+String jobName = null;
+Integer taskPartitionIndex = null;
+TaskPartitionState state = null;
+
 
 WorkflowContext workflowContext = getWorkflowContext(workflowName);
 if (workflowContext != null) {
   Map allJobStates = workflowContext.getJobStates();
-  for (String job : allJobStates.keySet()) {
-if (!allJobStates.get(job).equals(TaskState.NOT_STARTED)) {
-  JobContext jobContext = getJobContext(job);
+  for (Map.Entry jobState : allJobStates.entrySet()) {
+if (!jobState.getValue().equals(TaskState.NOT_STARTED)) {
+  JobContext jobContext = getJobContext(jobState.getKey());
   if (jobContext != null) {
 Set allPartitions = jobContext.getPartitionSet();
 for (Integer partition : allPartitions) {
@@ -993,6 +1000,9 @@ public class TaskDriver {
 long startTimeLong = Long.parseLong(startTime);
 if (startTimeLong > lastScheduledTaskTimestamp) {
   lastScheduledTaskTimestamp = startTimeLong;
+  jobName = jobState.getKey();
+  taskPartitionIndex = partition;
+  state = jobContext.getPartitionState(partition);
 }
   }
 }
@@ -1000,7 +1010,7 @@ public class TaskDriver {
 }
   }
 }
-return lastScheduledTaskTimestamp;
+return new TaskExecutionInfo(jobName, taskPartitionIndex, state, 
lastScheduledTaskTimestamp);
   }
 
   /**

http://git-wip-us.apache.org/repos/asf/helix/blob/566d4f16/helix-core/src/main/java/org/apache/helix/task/TaskExecutionInfo.java
--
diff --git 
a/helix-core/src/main/java/org/apache/helix/task/TaskExecutionInfo.java 
b/helix-core/src/main/java/org/apache/helix/task/TaskExecutionInfo.java
new file mode 100644
index 000..03d66b4
--- /dev/null
+++ b/helix-core/src/main/java/org/apache/helix/task/TaskExecutionInfo.java
@@ -0,0 +1,65 @@
+package org.apache.helix.task;
+
+import java.io.IOException;
+import org.codehaus.jackson.annotate.JsonCreator;
+import org.codehaus.jackson.annotate.JsonIgnoreProperties;
+import org.codehaus.jackson.annotate.JsonProperty;
+import org.codehaus.jackson.map.ObjectMapper;
+
+@JsonIgnorePrope

[jira] [Commented] (HELIX-773) Support getLastScheduledTaskTimestamp information in workflow rest api

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670702#comment-16670702
 ] 

ASF GitHub Bot commented on HELIX-773:
--

Github user asfgit closed the pull request at:

https://github.com/apache/helix/pull/281


> Support getLastScheduledTaskTimestamp information in workflow rest api
> --
>
> Key: HELIX-773
> URL: https://issues.apache.org/jira/browse/HELIX-773
> Project: Apache Helix
>  Issue Type: Bug
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Support getLastScheduledTaskTimestamp information in workflow rest api
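
The HELIX-773 commit returns the job and partition along with the latest start 
time, instead of the timestamp alone. A minimal sketch of that scan follows. It 
does not use Helix classes: LastScheduledSketch, Info, and the nested map of 
start times are hypothetical stand-ins for TaskDriver, TaskExecutionInfo, and 
the job contexts, and only the -1L "not set" value is taken from the commit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in; not the Helix TaskDriver. */
final class LastScheduledSketch {
  /** Same sentinel the commit uses when nothing has been scheduled. */
  static final long TIMESTAMP_NOT_SET = -1L;

  /** Hypothetical result type: which task started last, and when. */
  static final class Info {
    final String jobName;
    final Integer partition;
    final long startTime;

    Info(String jobName, Integer partition, long startTime) {
      this.jobName = jobName;
      this.partition = partition;
      this.startTime = startTime;
    }
  }

  /** Input: job name -> (partition index -> start time in ms); values assumed non-null. */
  static Info lastScheduled(Map<String, Map<Integer, Long>> startTimesByJob) {
    String jobName = null;
    Integer partition = null;
    long latest = TIMESTAMP_NOT_SET;
    for (Map.Entry<String, Map<Integer, Long>> job : startTimesByJob.entrySet()) {
      for (Map.Entry<Integer, Long> task : job.getValue().entrySet()) {
        if (task.getValue() > latest) {
          // Remember the owner of the newest start time, not just the time.
          latest = task.getValue();
          jobName = job.getKey();
          partition = task.getKey();
        }
      }
    }
    return new Info(jobName, partition, latest);
  }

  public static void main(String[] args) {
    Map<String, Map<Integer, Long>> startTimes = new LinkedHashMap<>();
    Map<Integer, Long> jobA = new LinkedHashMap<>();
    jobA.put(0, 100L);
    jobA.put(1, 300L);
    startTimes.put("jobA", jobA);
    Info info = lastScheduled(startTimes);
    System.out.println(info.jobName + " " + info.partition + " " + info.startTime);
  }
}
```

With no recorded start times, the sketch returns null for the job and partition 
and TIMESTAMP_NOT_SET for the time, so a caller that only needs the timestamp 
still sees -1L.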





[1/2] helix git commit: Customize the pipeline thread name with cluster name and pipeline type.

2018-10-31 Thread jxue
Repository: helix
Updated Branches:
  refs/heads/master 566d4f166 -> 1103fecb6


Customize the pipeline thread name with cluster name and pipeline type.
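
As an illustration of the naming scheme, here is a minimal sketch of a thread 
factory that embeds a role and a cluster name in each thread name. The class 
NamedThreadFactorySketch and its parameters are hypothetical; only the 
"HelixController-<role>-<cluster>" shape is taken from the diff in this commit.

```java
import java.util.concurrent.ThreadFactory;

/** Hypothetical helper; the commit itself uses an anonymous ThreadFactory. */
final class NamedThreadFactorySketch implements ThreadFactory {
  private final String threadName;

  NamedThreadFactorySketch(String role, String clusterName) {
    this.threadName = threadName(role, clusterName);
  }

  /** Builds a name such as "HelixController-async_tasks-myCluster". */
  static String threadName(String role, String clusterName) {
    return "HelixController-" + role + "-" + clusterName;
  }

  @Override
  public Thread newThread(Runnable r) {
    // Every thread from this factory carries the role and cluster in its name,
    // which makes thread dumps from a multi-cluster JVM easier to read.
    return new Thread(r, threadName);
  }

  public static void main(String[] args) {
    Thread t = new NamedThreadFactorySketch("async_tasks", "myCluster").newThread(() -> { });
    System.out.println(t.getName());
  }
}
```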


Project: http://git-wip-us.apache.org/repos/asf/helix/repo
Commit: http://git-wip-us.apache.org/repos/asf/helix/commit/96593708
Tree: http://git-wip-us.apache.org/repos/asf/helix/tree/96593708
Diff: http://git-wip-us.apache.org/repos/asf/helix/diff/96593708

Branch: refs/heads/master
Commit: 9659370849116b8a3f5d1853c7695274942ce211
Parents: 566d4f1
Author: Lei Xia 
Authored: Tue Oct 2 14:43:25 2018 -0700
Committer: Junkai Xue 
Committed: Wed Oct 31 13:50:33 2018 -0700

--
 .../controller/GenericHelixController.java  | 30 +---
 1 file changed, 19 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/helix/blob/96593708/helix-core/src/main/java/org/apache/helix/controller/GenericHelixController.java
--
diff --git 
a/helix-core/src/main/java/org/apache/helix/controller/GenericHelixController.java
 
b/helix-core/src/main/java/org/apache/helix/controller/GenericHelixController.java
index dd409e5..eb75286 100644
--- 
a/helix-core/src/main/java/org/apache/helix/controller/GenericHelixController.java
+++ 
b/helix-core/src/main/java/org/apache/helix/controller/GenericHelixController.java
@@ -356,7 +356,7 @@ public class GenericHelixController implements 
IdealStateChangeListener,
   }
 
   private GenericHelixController(PipelineRegistry registry, PipelineRegistry 
taskRegistry,
-  String clusterName) {
+  final String clusterName) {
 _paused = false;
 _registry = registry;
 _taskRegistry = taskRegistry;
@@ -366,7 +366,7 @@ public class GenericHelixController implements 
IdealStateChangeListener,
 _asyncTasksThreadPool =
 Executors.newScheduledThreadPool(ASYNC_TASKS_THREADPOOL_SIZE, new 
ThreadFactory() {
   @Override public Thread newThread(Runnable r) {
-return new Thread(r, "GerenricHelixController-async_task_thread");
+return new Thread(r, "HelixController-async_tasks-" + 
_clusterName);
   }
 });
 
@@ -378,8 +378,9 @@ public class GenericHelixController implements 
IdealStateChangeListener,
 _cache = new ClusterDataCache(clusterName);
 _taskCache = new ClusterDataCache(clusterName);
 
-_eventThread = new ClusterEventProcessor(_cache, _eventQueue);
-_taskEventThread = new ClusterEventProcessor(_taskCache, _taskEventQueue);
+_eventThread = new ClusterEventProcessor(_cache, _eventQueue, "default-" + 
clusterName);
+_taskEventThread =
+new ClusterEventProcessor(_taskCache, _taskEventQueue, "task-" + 
clusterName);
 
 _forceRebalanceTimer = new Timer();
 _lastPipelineEndTimestamp = 
TopStateHandoffReportStage.TIMESTAMP_NOT_RECORDED;
@@ -979,33 +980,40 @@ public class GenericHelixController implements 
IdealStateChangeListener,
   private class ClusterEventProcessor extends Thread {
 private final ClusterDataCache _cache;
 private final ClusterEventBlockingQueue _eventBlockingQueue;
+private final String _processorName;
 
 public ClusterEventProcessor(ClusterDataCache cache,
-ClusterEventBlockingQueue eventBlockingQueue) {
-  super("GenericHelixController-event_process");
+ClusterEventBlockingQueue eventBlockingQueue, String processorName) {
+  super("HelixController-pipeline-" + processorName);
   _cache = cache;
   _eventBlockingQueue = eventBlockingQueue;
+  _processorName = processorName;
 }
 
 @Override
 public void run() {
-  logger.info("START ClusterEventProcessor thread  for cluster " + 
_clusterName);
+  logger.info(
+  "START ClusterEventProcessor thread  for cluster " + _clusterName + 
", processor name: "
+  + _processorName);
   while (!isInterrupted()) {
 try {
   handleEvent(_eventBlockingQueue.take(), _cache);
 } catch (InterruptedException e) {
-  logger.warn("ClusterEventProcessor interrupted", e);
+  logger.warn("ClusterEventProcessor interrupted " + _processorName, 
e);
   interrupt();
 } catch (ZkInterruptedException e) {
-  logger.warn("ClusterEventProcessor caught a ZK connection 
interrupt", e);
+  logger
+  .warn("ClusterEventProcessor caught a ZK connection interrupt " 
+ _processorName, e);
   interrupt();
 } catch (ThreadDeath death) {
+  logger.error("ClusterEventProcessor caught a ThreadDeath  " + 
_processorName, death);
   throw death;
 } catch (Throwable t) {
-  logger.error("ClusterEventProcessor failed while running the 
controller pipeline", t);
+  logger.error("ClusterEventProcessor failed while running the 
controller pipeline "
+   

[2/2] helix git commit: Improve message handling properly

2018-10-31 Thread jxue
Improve message handling properly

The client side needs some improvement to make message handling more stable. 
Currently, when Helix processes a message, it marks the message as read before 
the message has actually been scheduled.

If scheduling then fails, the message has already been marked as read, so 
nothing picks it up again and it hangs there forever.


Project: http://git-wip-us.apache.org/repos/asf/helix/repo
Commit: http://git-wip-us.apache.org/repos/asf/helix/commit/1103fecb
Tree: http://git-wip-us.apache.org/repos/asf/helix/tree/1103fecb
Diff: http://git-wip-us.apache.org/repos/asf/helix/diff/1103fecb

Branch: refs/heads/master
Commit: 1103fecb67def5e610a7b22636ba4ac25e23777b
Parents: 9659370
Author: Junkai Xue 
Authored: Mon Sep 17 11:51:31 2018 -0700
Committer: Junkai Xue 
Committed: Wed Oct 31 13:50:37 2018 -0700

--
 .../helix/messaging/handling/HelixTaskExecutor.java  | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/helix/blob/1103fecb/helix-core/src/main/java/org/apache/helix/messaging/handling/HelixTaskExecutor.java
--
diff --git 
a/helix-core/src/main/java/org/apache/helix/messaging/handling/HelixTaskExecutor.java
 
b/helix-core/src/main/java/org/apache/helix/messaging/handling/HelixTaskExecutor.java
index 5e2082c..3ae90d3 100644
--- 
a/helix-core/src/main/java/org/apache/helix/messaging/handling/HelixTaskExecutor.java
+++ 
b/helix-core/src/main/java/org/apache/helix/messaging/handling/HelixTaskExecutor.java
@@ -952,22 +952,25 @@ public class HelixTaskExecutor implements 
MessageListener, TaskExecutor {
 if (readMsgs.size() > 0) {
   updateMessageState(readMsgs, accessor, instanceName);
 
+  // Remove message if schedule tasks are failed.
   for (Map.Entry handlerEntry : 
stateTransitionHandlers.entrySet()) {
 MessageHandler handler = handlerEntry.getValue();
 NotificationContext context = 
stateTransitionContexts.get(handlerEntry.getKey());
 Message msg = handler._message;
-scheduleTask(
-new HelixTask(msg, context, handler, this)
-);
+if (!scheduleTask(new HelixTask(msg, context, handler, this))) {
+  removeMessageFromTaskAndFutureMap(msg);
+  removeMessageFromZK(accessor, msg, instanceName);
+}
   }
 
   for (int i = 0; i < nonStateTransitionHandlers.size(); i++) {
 MessageHandler handler = nonStateTransitionHandlers.get(i);
 NotificationContext context = nonStateTransitionContexts.get(i);
 Message msg = handler._message;
-scheduleTask(
-new HelixTask(msg, context, handler, this)
-);
+if (!scheduleTask(new HelixTask(msg, context, handler, this))) {
+  removeMessageFromTaskAndFutureMap(msg);
+  removeMessageFromZK(accessor, msg, instanceName);
+}
   }
 }
   }
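
The pattern in the diff above can be sketched in isolation: mark messages as 
read, try to schedule each one, and remove any message whose scheduling fails 
so that it does not stay in the read state forever. This is a minimal sketch 
under stated assumptions, not the Helix implementation: ReadThenScheduleSketch, 
its in-memory map (standing in for the message store), and the Predicate used 
as a scheduler are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical stand-in for an executor that marks messages read, then schedules them. */
final class ReadThenScheduleSketch {
  enum State { NEW, READ }

  /** Pending messages keyed by id; plays the role of the message store. */
  private final Map<String, State> store = new LinkedHashMap<>();

  void add(String messageId) {
    store.put(messageId, State.NEW);
  }

  /**
   * Marks every NEW message READ, then tries to schedule each of them.
   * The scheduler returns false when scheduling fails; such messages are removed.
   * Returns the ids of the removed messages.
   */
  List<String> process(Predicate<String> scheduler) {
    List<String> toSchedule = new ArrayList<>();
    for (Map.Entry<String, State> entry : store.entrySet()) {
      if (entry.getValue() == State.NEW) {
        entry.setValue(State.READ);
        toSchedule.add(entry.getKey());
      }
    }
    List<String> removed = new ArrayList<>();
    for (String id : toSchedule) {
      if (!scheduler.test(id)) {
        // Scheduling failed: drop the message instead of leaving it READ but unhandled.
        store.remove(id);
        removed.add(id);
      }
    }
    return removed;
  }

  /** Copy of the current store contents, for inspection. */
  Map<String, State> snapshot() {
    return new LinkedHashMap<>(store);
  }

  public static void main(String[] args) {
    ReadThenScheduleSketch sketch = new ReadThenScheduleSketch();
    sketch.add("m1");
    sketch.add("m2");
    // Pretend that scheduling "m2" fails.
    System.out.println("removed: " + sketch.process(id -> !id.equals("m2")));
    System.out.println("remaining: " + sketch.snapshot());
  }
}
```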



[jira] [Created] (HELIX-775) Task driver should support add/get task framework user content

2018-10-31 Thread Harry Zhang (JIRA)
Harry Zhang created HELIX-775:
-

 Summary: Task driver should support add/get task framework user 
content
 Key: HELIX-775
 URL: https://issues.apache.org/jira/browse/HELIX-775
 Project: Apache Helix
  Issue Type: Task
Reporter: Harry Zhang
Assignee: Harry Zhang


Task driver should support add/get task framework user content at 
workflow/job/task levels

 

AC:
 * finish implementation
 * add tests





[jira] [Commented] (HELIX-775) Task driver should support add/get task framework user content

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670711#comment-16670711
 ] 

ASF GitHub Bot commented on HELIX-775:
--

GitHub user zhan849 opened a pull request:

https://github.com/apache/helix/pull/282

[HELIX-775] add task driver support for helix rest to add/get task fr…

…amework user content


consolidate user content related apis for task driver


To consolidate the task driver's user-content APIs and the corresponding 
REST APIs, I'm deprecating the general getUserContent() API. In its place, we 
now have the following APIs to get, add, and update user content.

```java
public void addOrUpdateWorkflowUserContentMap(String workflowName,
    final Map<String, String> contentToAddOrUpdate);

public void addOrUpdateJobUserContentMap(String workflowName, String jobName,
    final Map<String, String> contentToAddOrUpdate);

public void addOrUpdateTaskUserContentMap(String workflowName, String jobName,
    String taskPartitionId, final Map<String, String> contentToAddOrUpdate);

public Map<String, String> getWorkflowUserContentMap(String workflowName);

public Map<String, String> getJobUserContentMap(String workflowName, String jobName);

public Map<String, String> getTaskUserContentMap(String workflowName, String jobName,
    String taskPartitionId);
```

An API for deleting user content is TBD, but it can follow the same convention.
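
To show how per-scope add/update and get calls fit together, here is a minimal 
in-memory sketch. It is not the Helix implementation: UserContentSketch, the 
"/"-joined scope paths, and the HashMap backing store are assumptions made for 
illustration. Returning null for a scope with no content mirrors the null 
check in the TaskUtil diff.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the property store behind the task driver. */
final class UserContentSketch {
  /** One content map per scope path (workflow, job, or task). */
  private final Map<String, Map<String, String>> contentByPath = new HashMap<>();

  /** Hypothetical path scheme; Helix has its own namespacing helpers. */
  static String jobPath(String workflowName, String jobName) {
    return workflowName + "/" + jobName;
  }

  static String taskPath(String workflowName, String jobName, String taskPartitionId) {
    return jobPath(workflowName, jobName) + "/" + taskPartitionId;
  }

  /** Adds new keys and overwrites existing ones; other keys are left untouched. */
  void addOrUpdate(String path, Map<String, String> contentToAddOrUpdate) {
    contentByPath.computeIfAbsent(path, p -> new HashMap<>()).putAll(contentToAddOrUpdate);
  }

  /** Returns a read-only view of the full content map, or null if the scope has none. */
  Map<String, String> get(String path) {
    Map<String, String> content = contentByPath.get(path);
    return content == null ? null : Collections.unmodifiableMap(content);
  }

  public static void main(String[] args) {
    UserContentSketch store = new UserContentSketch();
    store.addOrUpdate("wf", Collections.singletonMap("owner", "alice"));
    store.addOrUpdate(taskPath("wf", "job", "0"), Collections.singletonMap("attempt", "1"));
    System.out.println(store.get("wf"));
    System.out.println(store.get(taskPath("wf", "job", "0")));
  }
}
```

Keeping one map per scope means a job-level update cannot overwrite a 
workflow-level key of the same name.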

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhan849/helix harry/task-user-content

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/helix/pull/282.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #282


commit 7ec5313bccb679014d6a0605ee5d7184063e555e
Author: Harry Zhang 
Date:   2018-10-31T20:55:44Z

[HELIX-775] add task driver support for helix rest to add/get task 
framework user content




> Task driver should support add/get task framework user content
> --
>
> Key: HELIX-775
> URL: https://issues.apache.org/jira/browse/HELIX-775
> Project: Apache Helix
>  Issue Type: Task
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Task driver should support add/get task framework user content at 
> workflow/job/task levels
>  
> AC:
>  * finish implementation
>  * add tests





helix git commit: [HELIX-775] add task driver support for helix rest to add/get task framework user content

2018-10-31 Thread jxue
Repository: helix
Updated Branches:
  refs/heads/master 1103fecb6 -> 7ec5313bc


[HELIX-775] add task driver support for helix rest to add/get task framework 
user content


Project: http://git-wip-us.apache.org/repos/asf/helix/repo
Commit: http://git-wip-us.apache.org/repos/asf/helix/commit/7ec5313b
Tree: http://git-wip-us.apache.org/repos/asf/helix/tree/7ec5313b
Diff: http://git-wip-us.apache.org/repos/asf/helix/diff/7ec5313b

Branch: refs/heads/master
Commit: 7ec5313bccb679014d6a0605ee5d7184063e555e
Parents: 1103fec
Author: Harry Zhang 
Authored: Wed Oct 31 13:55:44 2018 -0700
Committer: Harry Zhang 
Committed: Wed Oct 31 13:55:44 2018 -0700

--
 .../java/org/apache/helix/task/TaskDriver.java  | 35 ++
 .../java/org/apache/helix/task/TaskUtil.java| 50 +---
 2 files changed, 78 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/helix/blob/7ec5313b/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
--
diff --git a/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java 
b/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
index 54e3ab3..25a4fe4 100644
--- a/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
+++ b/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
@@ -1029,6 +1029,41 @@ public class TaskDriver {
   }
 
   /**
+   * Return the full user content map for workflow
+   * @param workflowName workflow name
+   * @return user content map
+   */
+  public Map getWorkflowUserContentMap(String workflowName) {
+return TaskUtil.getWorkflowJobUserContentMap(_propertyStore, workflowName);
+  }
+
+  /**
+   * Return full user content map for job
+   * @param workflowName workflow name
+   * @param jobName Un-namespaced job name
+   * @return user content map
+   */
+  public Map getJobUserContentMap(String workflowName, String 
jobName) {
+String namespacedJobName = TaskUtil.getNamespacedJobName(workflowName, 
jobName);
+return TaskUtil.getWorkflowJobUserContentMap(_propertyStore, 
namespacedJobName);
+  }
+
+  /**
+   * Return full user content map for task
+   * @param workflowName workflow name
+   * @param jobName Un-namespaced job name
+   * @param taskPartitionId task partition id
+   * @return user content map
+   */
+  public Map getTaskContentMap(String workflowName, String 
jobName, String taskPartitionId) {
+String namespacedJobName = TaskUtil.getNamespacedJobName(workflowName, 
jobName);
+String namespacedTaskName = 
TaskUtil.getNamespacedTaskName(namespacedJobName, taskPartitionId);
+return TaskUtil.getTaskUserContentMap(_propertyStore, namespacedJobName, 
namespacedTaskName);
+  }
+
+
+
+  /**
* Set user content defined by the given key and string
* @param key content key
* @param value content value

http://git-wip-us.apache.org/repos/asf/helix/blob/7ec5313b/helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
--
diff --git a/helix-core/src/main/java/org/apache/helix/task/TaskUtil.java 
b/helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
index 3461233..5581b6f 100644
--- a/helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
+++ b/helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
@@ -26,7 +26,6 @@ import java.util.HashSet;
 import java.util.List;
 import java.util.Map;
 import java.util.Set;
-
 import org.I0Itec.zkclient.DataUpdater;
 import org.apache.helix.AccessOption;
 import org.apache.helix.HelixDataAccessor;
@@ -320,9 +319,22 @@ public class TaskUtil {
*/
   protected static String 
getWorkflowJobUserContent(HelixPropertyStore propertyStore,
   String workflowJobResource, String key) {
-ZNRecord r = 
propertyStore.get(Joiner.on("/").join(TaskConstants.REBALANCER_CONTEXT_ROOT,
-workflowJobResource, USER_CONTENT_NODE), null, 
AccessOption.PERSISTENT);
-return r != null ? r.getSimpleField(key) : null;
+Map userContentMap = 
getWorkflowJobUserContentMap(propertyStore, workflowJobResource);
+return userContentMap != null ? userContentMap.get(key) : null;
+  }
+
+  /**
+   * get workflow/job user content map
+   * @param propertyStore property store
+   * @param workflowJobResource workflow name or namespaced job name
+   * @return user content map
+   */
+  protected static Map getWorkflowJobUserContentMap(
+  HelixPropertyStore propertyStore, String workflowJobResource) {
+ZNRecord record = propertyStore.get(Joiner.on("/")
+.join(TaskConstants.REBALANCER_CONTEXT_ROOT, workflowJobResource, 
USER_CONTENT_NODE), null,
+AccessOption.PERSISTENT);
+return record != null ? record.getSimpleFields() : null;
   }
 
   /**
@@ -365,10 +377,24 @@ public class TaskUtil {
*/
   protected s

[jira] [Commented] (HELIX-775) Task driver should support add/get task framework user content

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670714#comment-16670714
 ] 

ASF GitHub Bot commented on HELIX-775:
--

Github user asfgit closed the pull request at:

https://github.com/apache/helix/pull/282


> Task driver should support add/get task framework user content
> --
>
> Key: HELIX-775
> URL: https://issues.apache.org/jira/browse/HELIX-775
> Project: Apache Helix
>  Issue Type: Task
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Task driver should support add/get task framework user content at 
> workflow/job/task levels
>  
> AC:
>  * finish implementation
>  * add tests





[jira] [Commented] (HELIX-773) Support getLastScheduledTaskTimestamp information in workflow rest api

2018-10-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670718#comment-16670718
 ] 

Hudson commented on HELIX-773:
--

FAILURE: Integrated in Jenkins build helix #1558 (See 
[https://builds.apache.org/job/helix/1558/])
[HELIX-773] add getLastScheduledTaskTimestamp information in workflow (hrzhang: 
rev 566d4f166473b477ea0db1cfba5d04c8f3d6bf30)
* (add) 
helix-core/src/test/java/org/apache/helix/task/TestGetLastScheduledTaskExecInfo.java
* (edit) 
helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/WorkflowAccessor.java
* (delete) 
helix-core/src/test/java/org/apache/helix/task/TestGetLastScheduledTaskTimestamp.java
* (edit) helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
* (add) helix-core/src/main/java/org/apache/helix/task/TaskExecutionInfo.java
* (edit) 
helix-rest/src/test/java/org/apache/helix/rest/server/TestWorkflowAccessor.java


> Support getLastScheduledTaskTimestamp information in workflow rest api
> --
>
> Key: HELIX-773
> URL: https://issues.apache.org/jira/browse/HELIX-773
> Project: Apache Helix
>  Issue Type: Bug
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Support getLastScheduledTaskTimestamp information in workflow rest api





[jira] [Commented] (HELIX-772) Support TaskDriver.addUserContent() api

2018-10-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670717#comment-16670717
 ] 

Hudson commented on HELIX-772:
--

FAILURE: Integrated in Jenkins build helix #1558 (See 
[https://builds.apache.org/job/helix/1558/])
[HELIX-772] add TaskDriver.addUserContent() api and related tests (hrzhang: rev 
0c251bbf640206729755301c3dda734eea78343f)
* (add) 
helix-core/src/test/java/org/apache/helix/task/TestGetSetUserContentStore.java
* (edit) helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
* (edit) 
helix-core/src/test/java/org/apache/helix/integration/task/TestIndependentTaskRebalancer.java
* (delete) 
helix-core/src/test/java/org/apache/helix/task/TestGetUserContentStore.java
* (edit) helix-core/src/main/java/org/apache/helix/task/TaskDriver.java


> Support TaskDriver.addUserContent() api
> ---
>
> Key: HELIX-772
> URL: https://issues.apache.org/jira/browse/HELIX-772
> Project: Apache Helix
>  Issue Type: Bug
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Need to support adding user content in the task driver
>  
> AC:
>  * implement API
>  * add test
>  





[jira] [Commented] (HELIX-775) Task driver should support add/get task framework user content

2018-10-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670726#comment-16670726
 ] 

Hudson commented on HELIX-775:
--

FAILURE: Integrated in Jenkins build helix #1559 (See 
[https://builds.apache.org/job/helix/1559/])
[HELIX-775] add task driver support for helix rest to add/get task (hrzhang: 
rev 7ec5313bccb679014d6a0605ee5d7184063e555e)
* (edit) helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
* (edit) helix-core/src/main/java/org/apache/helix/task/TaskDriver.java


> Task driver should support add/get task framework user content
> --
>
> Key: HELIX-775
> URL: https://issues.apache.org/jira/browse/HELIX-775
> Project: Apache Helix
>  Issue Type: Task
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Task driver should support add/get task framework user content at 
> workflow/job/task levels
>  
> AC:
>  * finish implementation
>  * add tests





[jira] [Commented] (HELIX-775) Task driver should support add/get task framework user content

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670730#comment-16670730
 ] 

ASF GitHub Bot commented on HELIX-775:
--

GitHub user zhan849 opened a pull request:

https://github.com/apache/helix/pull/283

[HELIX-775] consolidate user content related apis for task driver

HELIX-1315: consolidate user content related apis for task driver


To consolidate the task driver's user content APIs and the corresponding 
REST APIs, I'm deprecating the general getUserContent() API. In its place, 
we now have the following APIs to get, add, and update user content.

```java
public void addOrUpdateWorkflowUserContentMap(String workflowName,
    final Map<String, String> contentToAddOrUpdate);

public void addOrUpdateJobUserContentMap(String workflowName, String jobName,
    final Map<String, String> contentToAddOrUpdate);

public void addOrUpdateTaskUserContentMap(String workflowName, String jobName,
    String taskPartitionId, final Map<String, String> contentToAddOrUpdate);

public Map<String, String> getWorkflowUserContentMap(String workflowName);

public Map<String, String> getJobUserContentMap(String workflowName, String jobName);

public Map<String, String> getTaskUserContentMap(String workflowName, String jobName,
    String taskPartitionId);
```

The delete user content API is TBD, but it can follow the same convention.
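
The add-or-update semantics described above (new keys are added, existing keys are overwritten, other keys are left alone) can be illustrated with a minimal in-memory sketch. This is not Helix code: the class UserContentStoreSketch, its methods, and the underscore-joined scope keys are all hypothetical names chosen for illustration, and the real implementation persists to the Helix property store rather than a HashMap.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a user content store; not Helix code. */
final class UserContentStoreSketch {
  // One content map per scope key (workflow, namespaced job, or namespaced task).
  private final Map<String, Map<String, String>> store = new HashMap<>();

  // Assumed naming scheme, for illustration only: names joined with '_'.
  static String jobKey(String workflowName, String jobName) {
    return workflowName + "_" + jobName;
  }

  static String taskKey(String workflowName, String jobName, String taskPartitionId) {
    return jobKey(workflowName, jobName) + "_" + taskPartitionId;
  }

  /** Adds new keys and overwrites existing ones; keys not in the argument are kept. */
  void addOrUpdate(String scopeKey, Map<String, String> contentToAddOrUpdate) {
    store.computeIfAbsent(scopeKey, k -> new HashMap<>()).putAll(contentToAddOrUpdate);
  }

  /** Returns a copy of the content for the scope, or an empty map if none exists. */
  Map<String, String> get(String scopeKey) {
    Map<String, String> copy = new HashMap<>();
    Map<String, String> content = store.get(scopeKey);
    if (content != null) {
      copy.putAll(content);
    }
    return copy;
  }
}
```

Keeping one map per scope mirrors the workflow/job/task split in the method names above, so content written at one level never leaks into another.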

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhan849/helix harry/task-user-content

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/helix/pull/283.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #283


commit b235c4ee5a82c5970d29e839317ea242813a58bc
Author: Harry Zhang 
Date:   2018-10-04T18:25:08Z

[HELIX-775] consolidate user content related apis for task driver




> Task driver should support add/get task framework user content
> --
>
> Key: HELIX-775
> URL: https://issues.apache.org/jira/browse/HELIX-775
> Project: Apache Helix
>  Issue Type: Task
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Task driver should support add/get task framework user content at 
> workflow/job/task levels
>  
> AC:
>  * finish implementation
>  * add tests





helix git commit: [HELIX-775] consolidate user content related apis for task driver

2018-10-31 Thread jxue
Repository: helix
Updated Branches:
  refs/heads/master 7ec5313bc -> b235c4ee5


[HELIX-775] consolidate user content related apis for task driver


Project: http://git-wip-us.apache.org/repos/asf/helix/repo
Commit: http://git-wip-us.apache.org/repos/asf/helix/commit/b235c4ee
Tree: http://git-wip-us.apache.org/repos/asf/helix/tree/b235c4ee
Diff: http://git-wip-us.apache.org/repos/asf/helix/diff/b235c4ee

Branch: refs/heads/master
Commit: b235c4ee5a82c5970d29e839317ea242813a58bc
Parents: 7ec5313
Author: Harry Zhang 
Authored: Thu Oct 4 11:25:08 2018 -0700
Committer: Harry Zhang 
Committed: Wed Oct 31 14:03:37 2018 -0700

--
 .../java/org/apache/helix/task/TaskDriver.java  | 64 +---
 .../java/org/apache/helix/task/TaskUtil.java| 18 +++---
 .../helix/task/TestGetSetUserContentStore.java  | 53 +---
 3 files changed, 83 insertions(+), 52 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/helix/blob/b235c4ee/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
--
diff --git a/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java 
b/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
index 25a4fe4..e675c86 100644
--- a/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
+++ b/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
@@ -1022,7 +1022,12 @@ public class TaskDriver {
* @param taskName name of task. Optional if scope is WORKFLOW or JOB
* @return null if key-value pair not found or this content store does not 
exist. Otherwise,
* return a String
+   *
+   * @deprecated use the following equivalents: {@link 
#getWorkflowUserContentMap(String)},
+   * {@link #getJobUserContentMap(String, String)},
+   * @{{@link #getTaskContentMap(String, String, String)}}
*/
+  @Deprecated
   public String getUserContent(String key, UserContentStore.Scope scope, 
String workflowName,
   String jobName, String taskName) {
 return TaskUtil.getUserContent(_propertyStore, key, scope, workflowName, 
jobName, taskName);
@@ -1055,36 +1060,53 @@ public class TaskDriver {
* @param taskPartitionId task partition id
* @return user content map
*/
-  public Map getTaskContentMap(String workflowName, String 
jobName, String taskPartitionId) {
+  public Map getTaskUserContentMap(String workflowName, String 
jobName,
+  String taskPartitionId) {
 String namespacedJobName = TaskUtil.getNamespacedJobName(workflowName, 
jobName);
 String namespacedTaskName = 
TaskUtil.getNamespacedTaskName(namespacedJobName, taskPartitionId);
 return TaskUtil.getTaskUserContentMap(_propertyStore, namespacedJobName, 
namespacedTaskName);
   }
 
+  /**
+   * Add or update workflow user content with the given map - new keys will be 
added, and old
+   * keys will be updated
+   * @param workflowName workflow name
+   * @param contentToAddOrUpdate map containing items to add or update
+   */
+  public void addOrUpdateWorkflowUserContentMap(String workflowName,
+  final Map contentToAddOrUpdate) {
+TaskUtil
+.addOrUpdateWorkflowJobUserContentMap(_propertyStore, workflowName, 
contentToAddOrUpdate);
+  }
 
+  /**
+   * Add or update job user content with the given map - new keys will be 
added, and old keys will
+   * be updated
+   * @param workflowName workflow name
+   * @param jobName Un-namespaced job name
+   * @param contentToAddOrUpdate map containing items to add or update
+   */
+  public void addOrUpdateJobUserContentMap(String workflowName, String jobName,
+  final Map contentToAddOrUpdate) {
+String namespacedJobName = TaskUtil.getNamespacedJobName(workflowName, 
jobName);
+TaskUtil.addOrUpdateWorkflowJobUserContentMap(_propertyStore, 
namespacedJobName,
+contentToAddOrUpdate);
+  }
 
   /**
-   * Set user content defined by the given key and string
-   * @param key content key
-   * @param value content value
-   * @param workflowName name of the workflow - must provide when scope is 
WORKFLOW
-   * @param jobName name of the job - must provide when scope is JOB or TASK
-   * @param taskName name of the task - must provide when scope is TASK
-   * @param scope scope of the content
+   * Add or update task user content with the given map - new keys will be 
added, and old keys
+   * will be updated
+   * @param workflowName workflow name
+   * @param jobName Un-namespaced job name
+   * @param taskPartitionId task partition id
+   * @param contentToAddOrUpdate map containing items to add or update
*/
-  public void addUserContent(String key, String value, String workflowName, 
String jobName, String taskName,
-  UserContentStore.Scope scope) {
-switch (scope) {
-case WORKFLOW:
-  TaskUtil.addWorkflowJobUserContent(_propertyStore, workflowName, key, 
v

[jira] [Commented] (HELIX-775) Task driver should support add/get task framework user content

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670734#comment-16670734
 ] 

ASF GitHub Bot commented on HELIX-775:
--

Github user asfgit closed the pull request at:

https://github.com/apache/helix/pull/283


> Task driver should support add/get task framework user content
> --
>
> Key: HELIX-775
> URL: https://issues.apache.org/jira/browse/HELIX-775
> Project: Apache Helix
>  Issue Type: Task
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Task driver should support add/get task framework user content at 
> workflow/job/task levels
>  
> AC:
>  * finish implementation
>  * add tests





[jira] [Commented] (HELIX-775) Task driver should support add/get task framework user content

2018-10-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670739#comment-16670739
 ] 

Hudson commented on HELIX-775:
--

FAILURE: Integrated in Jenkins build helix #1560 (See 
[https://builds.apache.org/job/helix/1560/])
[HELIX-775] consolidate user content related apis for task driver (hrzhang: rev 
b235c4ee5a82c5970d29e839317ea242813a58bc)
* (edit) helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
* (edit) helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
* (edit) 
helix-core/src/test/java/org/apache/helix/task/TestGetSetUserContentStore.java


> Task driver should support add/get task framework user content
> --
>
> Key: HELIX-775
> URL: https://issues.apache.org/jira/browse/HELIX-775
> Project: Apache Helix
>  Issue Type: Task
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Task driver should support add/get task framework user content at 
> workflow/job/task levels
>  
> AC:
>  * finish implementation
>  * add tests





[jira] [Created] (HELIX-776) REST2.0: Add delete command to updateInstanceConfig

2018-10-31 Thread Hunter L (JIRA)
Hunter L created HELIX-776:
--

 Summary: REST2.0: Add delete command to updateInstanceConfig
 Key: HELIX-776
 URL: https://issues.apache.org/jira/browse/HELIX-776
 Project: Apache Helix
  Issue Type: Improvement
Reporter: Hunter L
Assignee: Hunter L


For instance configs, REST2.0 did not expose the REST API for deletion of 
fields. This RB adds update and delete commands to updateInstanceConfig and an 
integration test thereof.
Changelist:
1. Add delete command to updateInstanceConfig in InstanceAccessor
2. Add integration tests
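
As a rough sketch of the behavior described above: a command string selects between update and delete, and a missing command falls back to update so existing callers are unaffected. The class and method names below are hypothetical, the config is modeled as a plain map of simple fields, and unknown commands surface as IllegalArgumentException here rather than whatever exception the REST layer actually uses.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of update/delete commands on an instance config; not Helix code. */
final class InstanceConfigCommandSketch {
  enum Command { update, delete }

  /** A null or empty command defaults to update to stay backward-compatible. */
  static Command parse(String commandStr) {
    if (commandStr == null || commandStr.isEmpty()) {
      return Command.update;
    }
    // Enum.valueOf throws IllegalArgumentException for an unknown command.
    return Command.valueOf(commandStr);
  }

  /** Returns a new map: update merges the payload in, delete removes the payload's keys. */
  static Map<String, String> apply(Command command, Map<String, String> current,
      Map<String, String> payload) {
    Map<String, String> result = new HashMap<>(current);
    switch (command) {
      case update:
        result.putAll(payload);
        break;
      case delete:
        result.keySet().removeAll(payload.keySet());
        break;
    }
    return result;
  }
}
```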





[jira] [Created] (HELIX-777) TASK: Handle null currentState for unscheduled tasks

2018-10-31 Thread Hunter L (JIRA)
Hunter L created HELIX-777:
--

 Summary: TASK: Handle null currentState for unscheduled tasks
 Key: HELIX-777
 URL: https://issues.apache.org/jira/browse/HELIX-777
 Project: Apache Helix
  Issue Type: Improvement
Reporter: Hunter L
Assignee: Hunter L


It was observed that when a workflow is submitted and the Controller attempts 
to schedule its tasks, the ZK read fails to retrieve the appropriate job's 
context, leaving the job stuck in an unscheduled state. The job remained 
unscheduled because it had no currentStates and its job context contained no 
assignment or state information. This RB fixes such stuck states by 
detecting null currentStates.
Changelist:
1. Check if currentState is null and if it is, manually assign an INIT state
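
The changelist item above amounts to a null guard. A minimal sketch, assuming a trimmed-down state enum and a hypothetical helper name (neither is taken from the Helix source):

```java
/** Hypothetical illustration of defaulting a missing current state; not Helix code. */
final class NullCurrentStateSketch {
  // Reduced stand-in for the task partition states; the real enum has more values.
  enum TaskPartitionState { INIT, RUNNING, COMPLETED }

  /** Treats a missing current state as INIT so the task can still be scheduled. */
  static TaskPartitionState effectiveState(TaskPartitionState currentState) {
    return currentState == null ? TaskPartitionState.INIT : currentState;
  }
}
```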





[jira] [Created] (HELIX-778) TASK: Fix a race condition in updatePreviousAssignedTasksStatus

2018-10-31 Thread Hunter L (JIRA)
Hunter L created HELIX-778:
--

 Summary: TASK: Fix a race condition in 
updatePreviousAssignedTasksStatus
 Key: HELIX-778
 URL: https://issues.apache.org/jira/browse/HELIX-778
 Project: Apache Helix
  Issue Type: Improvement
Reporter: Hunter L
Assignee: Hunter L


It was observed that TestUnregisteredCommand is very unstable. The cause was 
identified as a race condition: when a task fails, a pending message for that 
task (from INIT to RUNNING) was sometimes not cleaned up in time, so 
AbstractTaskDispatcher's updatePreviousAssignedTasksStatus would try to 
process that message and skip the status update for that task (such as updating 
its status and the NUM_ATTEMPTS field in JobContext).

A short-term fix is to call markPartitionError() before checking for the 
pending message, but in the long run we need to revisit the design of the task 
status update to avoid this type of race condition.

Changelist:
1. Move markPartitionError() up before checking for a pending message on the 
task
2. Fix TestUnregisteredCommand's instability
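
The effect of the reordering can be modeled in a few lines. This is a simplified sketch, not the dispatcher itself: the context class, the two-value state enum, and both update methods are hypothetical, and the pending-message check is reduced to an early return.

```java
/** Hypothetical model of why the error bookkeeping must precede the pending-message check. */
final class MarkErrorOrderingSketch {
  enum State { RUNNING, TASK_ERROR }

  /** Stand-in for the per-task fields kept in the job context. */
  static final class JobContextSketch {
    int numAttempts;
    State recordedState;
  }

  static void markPartitionError(JobContextSketch ctx, State state) {
    ctx.recordedState = state;
    ctx.numAttempts++;
  }

  /** Old ordering: a stale pending message returns early and skips the bookkeeping. */
  static void updateOldOrder(JobContextSketch ctx, State currState, boolean hasPendingMessage) {
    if (hasPendingMessage) {
      return;
    }
    if (currState == State.TASK_ERROR) {
      markPartitionError(ctx, currState);
    }
  }

  /** New ordering: the error is recorded first, so a stale message cannot hide it. */
  static void updateNewOrder(JobContextSketch ctx, State currState, boolean hasPendingMessage) {
    if (currState == State.TASK_ERROR) {
      markPartitionError(ctx, currState);
    }
    if (hasPendingMessage) {
      return;
    }
  }
}
```

With a failed task and a lingering pending message, the old ordering leaves numAttempts at 0 while the new ordering records the failure, which is the difference the test was tripping over.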





[1/3] helix git commit: [HELIX-776] REST2.0: Add delete command to updateInstanceConfig

2018-10-31 Thread jxue
Repository: helix
Updated Branches:
  refs/heads/master b235c4ee5 -> ceba1a55a


[HELIX-776] REST2.0: Add delete command to updateInstanceConfig

For instance configs, REST2.0 did not expose the REST API for deletion of 
fields. This RB adds update and delete commands to updateInstanceConfig and an 
integration test thereof.
Changelist:
1. Add delete command to updateInstanceConfig in InstanceAccessor
2. Add integration tests


Project: http://git-wip-us.apache.org/repos/asf/helix/repo
Commit: http://git-wip-us.apache.org/repos/asf/helix/commit/6090732b
Tree: http://git-wip-us.apache.org/repos/asf/helix/tree/6090732b
Diff: http://git-wip-us.apache.org/repos/asf/helix/diff/6090732b

Branch: refs/heads/master
Commit: 6090732be6b88863017a93106fa692dc7350520b
Parents: b235c4e
Author: Hunter Lee 
Authored: Wed Oct 31 14:20:18 2018 -0700
Committer: Hunter Lee 
Committed: Wed Oct 31 14:20:18 2018 -0700

--
 .../java/org/apache/helix/ConfigAccessor.java   |   6 +
 .../resources/helix/InstanceAccessor.java   |  34 +-
 .../helix/rest/server/TestInstanceAccessor.java | 114 ++-
 3 files changed, 146 insertions(+), 8 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/helix/blob/6090732b/helix-core/src/main/java/org/apache/helix/ConfigAccessor.java
--
diff --git a/helix-core/src/main/java/org/apache/helix/ConfigAccessor.java 
b/helix-core/src/main/java/org/apache/helix/ConfigAccessor.java
index 2755113..53f42fb 100644
--- a/helix-core/src/main/java/org/apache/helix/ConfigAccessor.java
+++ b/helix-core/src/main/java/org/apache/helix/ConfigAccessor.java
@@ -765,6 +765,12 @@ public class ConfigAccessor {
 .forParticipant(instanceName).build();
 String zkPath = scope.getZkPath();
 
+if (!zkClient.exists(zkPath)) {
+  throw new HelixException(
+  "updateInstanceConfig failed. Given InstanceConfig does not already 
exist. instance: "
+  + instanceName);
+}
+
 if (overwrite) {
   ZKUtil.createOrReplace(zkClient, zkPath, instanceConfig.getRecord(), 
true);
 } else {

http://git-wip-us.apache.org/repos/asf/helix/blob/6090732b/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/InstanceAccessor.java
--
diff --git 
a/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/InstanceAccessor.java
 
b/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/InstanceAccessor.java
index 98af0ee..38ee3b5 100644
--- 
a/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/InstanceAccessor.java
+++ 
b/helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/InstanceAccessor.java
@@ -41,10 +41,12 @@ import org.apache.helix.model.ClusterConfig;
 import org.apache.helix.model.CurrentState;
 import org.apache.helix.model.Error;
 import org.apache.helix.model.HealthStat;
+import org.apache.helix.model.HelixConfigScope;
 import org.apache.helix.model.InstanceConfig;
 import org.apache.helix.model.LiveInstance;
 import org.apache.helix.model.Message;
 import org.apache.helix.model.ParticipantHistory;
+import org.apache.helix.model.builder.HelixConfigScopeBuilder;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 import org.codehaus.jackson.JsonNode;
@@ -321,7 +323,19 @@ public class InstanceAccessor extends 
AbstractHelixResource {
   @POST
   @Path("{instanceName}/configs")
   public Response updateInstanceConfig(@PathParam("clusterId") String 
clusterId,
-  @PathParam("instanceName") String instanceName, String content) {
+  @PathParam("instanceName") String instanceName, @QueryParam("command") 
String commandStr,
+  String content) {
+Command command;
+if (commandStr == null || commandStr.isEmpty()) {
+  command = Command.update; // Default behavior to keep it 
backward-compatible
+} else {
+  try {
+command = getCommand(commandStr);
+  } catch (HelixException ex) {
+return badRequest(ex.getMessage());
+  }
+}
+
 ZNRecord record;
 try {
   record = toZNRecord(content);
@@ -332,11 +346,25 @@ public class InstanceAccessor extends 
AbstractHelixResource {
 InstanceConfig instanceConfig = new InstanceConfig(record);
 ConfigAccessor configAccessor = getConfigAccessor();
 try {
-  configAccessor.updateInstanceConfig(clusterId, instanceName, 
instanceConfig);
+  switch (command) {
+  case update:
+configAccessor.updateInstanceConfig(clusterId, instanceName, 
instanceConfig);
+break;
+  case delete: {
+HelixConfigScope instanceScope =
+new 
HelixConfigScopeBuilder(HelixConfigScope.ConfigScopeProperty.PARTICIPANT)
+.forCluster(clusterId).forParticipant(instanceName)

[2/3] helix git commit: [HELIX-777] TASK: Handle null currentState for unscheduled tasks

2018-10-31 Thread jxue
[HELIX-777] TASK: Handle null currentState for unscheduled tasks

It was observed that when a workflow is submitted and the Controller attempts 
to schedule its tasks, the ZK read fails to retrieve the appropriate job's 
context, leaving the job stuck in an unscheduled state. The job remained 
unscheduled because it had no currentStates and its job context contained no 
assignment or state information. This RB fixes such stuck states by 
detecting null currentStates.
Changelist:
1. Check if currentState is null and if it is, manually assign an INIT state


Project: http://git-wip-us.apache.org/repos/asf/helix/repo
Commit: http://git-wip-us.apache.org/repos/asf/helix/commit/5d24ed54
Tree: http://git-wip-us.apache.org/repos/asf/helix/tree/5d24ed54
Diff: http://git-wip-us.apache.org/repos/asf/helix/diff/5d24ed54

Branch: refs/heads/master
Commit: 5d24ed544898ff69f289f54be71a04413735d118
Parents: 6090732
Author: Hunter Lee 
Authored: Wed Oct 31 14:21:49 2018 -0700
Committer: Hunter Lee 
Committed: Wed Oct 31 17:17:16 2018 -0700

--
 .../helix/task/AbstractTaskDispatcher.java  | 213 +--
 1 file changed, 106 insertions(+), 107 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/helix/blob/5d24ed54/helix-core/src/main/java/org/apache/helix/task/AbstractTaskDispatcher.java
--
diff --git 
a/helix-core/src/main/java/org/apache/helix/task/AbstractTaskDispatcher.java 
b/helix-core/src/main/java/org/apache/helix/task/AbstractTaskDispatcher.java
index 617263b..aa72f2d 100644
--- a/helix-core/src/main/java/org/apache/helix/task/AbstractTaskDispatcher.java
+++ b/helix-core/src/main/java/org/apache/helix/task/AbstractTaskDispatcher.java
@@ -23,7 +23,6 @@ import org.apache.helix.model.Partition;
 import org.apache.helix.model.ResourceAssignment;
 import org.apache.helix.monitoring.mbeans.ClusterStatusMonitor;
 import org.apache.helix.task.assigner.AssignableInstance;
-
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -73,7 +72,6 @@ public abstract class AbstractTaskDispatcher {
 // this pending state transition, essentially "waiting" until this 
pending message clears
 Message pendingMessage =
 currStateOutput.getPendingMessage(jobResource, new 
Partition(pName), instance);
-
 if (pendingMessage != null && 
!pendingMessage.getToState().equals(currState.name())) {
   // If there is a pending message whose destination state is 
different from the current
   // state, just make the same assignment as the pending message. This 
is essentially
@@ -125,26 +123,26 @@ public abstract class AbstractTaskDispatcher {
 }
 
 switch (currState) {
-  case RUNNING: {
-TaskPartitionState nextState = TaskPartitionState.RUNNING;
-if (jobState == TaskState.TIMING_OUT) {
-  nextState = TaskPartitionState.TASK_ABORTED;
-} else if (jobTgtState == TargetState.STOP) {
-  nextState = TaskPartitionState.STOPPED;
-} else if (jobState == TaskState.ABORTED || jobState == 
TaskState.FAILED
-|| jobState == TaskState.FAILING || jobState == 
TaskState.TIMED_OUT) {
-  // Drop tasks if parent job is not in progress
-  paMap.put(pId, new PartitionAssignment(instance, 
TaskPartitionState.DROPPED.name()));
-  break;
-}
+case RUNNING: {
+  TaskPartitionState nextState = TaskPartitionState.RUNNING;
+  if (jobState == TaskState.TIMING_OUT) {
+nextState = TaskPartitionState.TASK_ABORTED;
+  } else if (jobTgtState == TargetState.STOP) {
+nextState = TaskPartitionState.STOPPED;
+  } else if (jobState == TaskState.ABORTED || jobState == 
TaskState.FAILED
+  || jobState == TaskState.FAILING || jobState == 
TaskState.TIMED_OUT) {
+// Drop tasks if parent job is not in progress
+paMap.put(pId, new PartitionAssignment(instance, 
TaskPartitionState.DROPPED.name()));
+break;
+  }
 
-paMap.put(pId, new PartitionAssignment(instance, 
nextState.name()));
-assignedPartitions.add(pId);
-if (LOG.isDebugEnabled()) {
-  LOG.debug(String.format("Setting task partition %s state to %s 
on instance %s.", pName,
-  nextState, instance));
-}
+  paMap.put(pId, new PartitionAssignment(instance, nextState.name()));
+  assignedPartitions.add(pId);
+  if (LOG.isDebugEnabled()) {
+LOG.debug(String.format("Setting task partition %s state to %s on 
instance %s.", pName,
+nextState, instance));
   }
+}
   break;
 case STOPPED: {
   // TODO: This case statement migh

[3/3] helix git commit: [HELIX-778] TASK: Fix a race condition in updatePreviousAssignedTasksStatus

2018-10-31 Thread jxue
[HELIX-778] TASK: Fix a race condition in updatePreviousAssignedTasksStatus

It was observed that TestUnregisteredCommand is very unstable. The cause was 
identified as a race condition: when a task fails, a pending message for that 
task (from INIT to RUNNING) was sometimes not cleaned up in time, so 
AbstractTaskDispatcher's updatePreviousAssignedTasksStatus would try to 
process that message and skip the status update for that task (such as updating 
its status and the NUM_ATTEMPTS field in JobContext).

A short-term fix is to call markPartitionError() before checking for the 
pending message, but in the long run we need to revisit the design of the task 
status update to avoid this type of race condition.

Changelist:
1. Move markPartitionError() up before checking for a pending message on the 
task
2. Fix TestUnregisteredCommand's instability


Project: http://git-wip-us.apache.org/repos/asf/helix/repo
Commit: http://git-wip-us.apache.org/repos/asf/helix/commit/ceba1a55
Tree: http://git-wip-us.apache.org/repos/asf/helix/tree/ceba1a55
Diff: http://git-wip-us.apache.org/repos/asf/helix/diff/ceba1a55

Branch: refs/heads/master
Commit: ceba1a55ae351090144c001324f908f2364212a4
Parents: 5d24ed5
Author: Hunter Lee 
Authored: Wed Oct 31 17:20:37 2018 -0700
Committer: Hunter Lee 
Committed: Wed Oct 31 17:20:37 2018 -0700

--
 .../apache/helix/task/AbstractTaskDispatcher.java| 15 ---
 .../integration/task/TestUnregisteredCommand.java|  3 ++-
 2 files changed, 14 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/helix/blob/ceba1a55/helix-core/src/main/java/org/apache/helix/task/AbstractTaskDispatcher.java
--
diff --git 
a/helix-core/src/main/java/org/apache/helix/task/AbstractTaskDispatcher.java 
b/helix-core/src/main/java/org/apache/helix/task/AbstractTaskDispatcher.java
index aa72f2d..cbf9fb8 100644
--- a/helix-core/src/main/java/org/apache/helix/task/AbstractTaskDispatcher.java
+++ b/helix-core/src/main/java/org/apache/helix/task/AbstractTaskDispatcher.java
@@ -67,6 +67,16 @@ public abstract class AbstractTaskDispatcher {
 TaskPartitionState currState = 
updateJobContextAndGetTaskCurrentState(currStateOutput,
 jobResource, pId, pName, instance, jobCtx);
 
+// This avoids a race condition in the case that although currentState 
is in the following
+// error condition, the pending message (INIT->RUNNNING) might still 
be present.
+// This is undesirable because this prevents JobContext from getting 
the proper update of
+// fields including task state and task's NUM_ATTEMPTS
+if (currState == TaskPartitionState.ERROR || currState == 
TaskPartitionState.TASK_ERROR
+|| currState == TaskPartitionState.TIMED_OUT
+|| currState == TaskPartitionState.TASK_ABORTED) {
+  markPartitionError(jobCtx, pId, currState, true);
+}
+
 // Check for pending state transitions on this (partition, instance). 
If there is a pending
 // state transition, we prioritize this pending state transition and 
set the assignment from
 // this pending state transition, essentially "waiting" until this 
pending message clears
@@ -197,7 +207,6 @@ public abstract class AbstractTaskDispatcher {
 "Task partition %s has error state %s with msg %s. Marking as 
such in rebalancer context.",
 pName, currState, jobCtx.getPartitionInfo(pId)));
   }
-  markPartitionError(jobCtx, pId, currState, true);
   // The error policy is to fail the task as soon a single partition 
fails for a specified
   // maximum number of attempts or task is in ABORTED state.
   // But notice that if job is TIMED_OUT, aborted task won't be 
treated as fail and won't
@@ -239,7 +248,6 @@ public abstract class AbstractTaskDispatcher {
 
 // Also release resources for these tasks
 assignableInstance.release(taskConfig, quotaType);
-
   } else if (jobState == TaskState.IN_PROGRESS
   && (jobTgtState != TargetState.STOP && jobTgtState != 
TargetState.DELETE)) {
 // Job is in progress, implying that tasks are being re-tried, so 
set it to RUNNING
@@ -940,7 +948,8 @@ public abstract class AbstractTaskDispatcher {
 
   private long getTimeoutTime(long startTime, long timeoutPeriod) {
 return (timeoutPeriod == TaskConstants.DEFAULT_NEVER_TIMEOUT
-|| timeoutPeriod > Long.MAX_VALUE - startTime) // check long overflow
+|| timeoutPeriod > Long.MAX_VALUE - startTime)
+// check long overflow
 ? TaskConstants.DEFAULT_NEVER_TIMEOUT
 : startTime + timeoutPeriod;
   }
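
For readers unfamiliar with the overflow guard in getTimeoutTime, here is a standalone sketch of the same check. The sentinel value -1 and the class name are assumptions made for illustration; the guard relies on startTime being non-negative so that Long.MAX_VALUE - startTime cannot itself overflow.

```java
/** Hypothetical standalone version of an overflow-safe deadline computation. */
final class TimeoutSketch {
  // Assumed sentinel meaning "never times out"; the real constant lives in TaskConstants.
  static final long NEVER_TIMEOUT = -1L;

  static long getTimeoutTime(long startTime, long timeoutPeriod) {
    // startTime + timeoutPeriod would overflow exactly when
    // timeoutPeriod > Long.MAX_VALUE - startTime (for startTime >= 0).
    boolean wouldOverflow = timeoutPeriod > Long.MAX_VALUE - startTime;
    return (timeoutPeriod == NEVER_TIMEOUT || wouldOverflow)
        ? NEVER_TIMEOUT
        : startTime + timeoutPeriod;
  }
}
```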

http://git-wip-us.apache.org/repos/asf/helix/blob/ceba1a55/helix

helix git commit: Fix log argument order

2018-10-31 Thread jxue
Repository: helix
Updated Branches:
  refs/heads/master ceba1a55a -> 89f351558


Fix log argument order


Project: http://git-wip-us.apache.org/repos/asf/helix/repo
Commit: http://git-wip-us.apache.org/repos/asf/helix/commit/89f35155
Tree: http://git-wip-us.apache.org/repos/asf/helix/tree/89f35155
Diff: http://git-wip-us.apache.org/repos/asf/helix/diff/89f35155

Branch: refs/heads/master
Commit: 89f351558734b4bb95019452e4c58d6befeed0dd
Parents: ceba1a5
Author: Junkai Xue 
Authored: Tue Oct 9 11:41:08 2018 -0700
Committer: Junkai Xue 
Committed: Wed Oct 31 17:28:04 2018 -0700

--
 helix-core/src/main/java/org/apache/helix/task/TaskDriver.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/helix/blob/89f35155/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
--
diff --git a/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java 
b/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
index e675c86..27670e9 100644
--- a/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
+++ b/helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
@@ -420,8 +420,8 @@ public class TaskDriver {
 Set allNodes = jobDag.getAllNodes();
 if (capacity > 0 && allNodes.size() + jobConfigs.size() >= capacity) {
   throw new IllegalStateException(String
-  .format("Queue %s already reaches its max capacity %f, failed to 
add %s", capacity,
-  queue, jobs.toString()));
+  .format("Queue %s already reaches its max capacity %f, failed to 
add %s", queue,
+  capacity, jobs.toString()));
 }
 
 String lastNodeName = null;
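
Why the argument order matters: String.format matches arguments to conversions by position, and %f rejects a String argument at runtime. The sketch below reuses the message template from the diff but is otherwise hypothetical; in particular, treating capacity as a double is an assumption made so that %f is a valid conversion for it.

```java
import java.util.IllegalFormatConversionException;

/** Hypothetical demonstration of positional format arguments; not Helix code. */
final class FormatOrderSketch {
  static final String TEMPLATE =
      "Queue %s already reaches its max capacity %f, failed to add %s";

  /** Arguments in template order: queue name, capacity, job list. */
  static String correct(String queue, double capacity, String jobs) {
    return String.format(TEMPLATE, queue, capacity, jobs);
  }

  /** Passing capacity first hands the queue name to %f, which cannot format a String. */
  static boolean swappedOrderThrows(String queue, double capacity, String jobs) {
    try {
      String.format(TEMPLATE, capacity, queue, jobs);
      return false;
    } catch (IllegalFormatConversionException e) {
      return true;
    }
  }
}
```

The practical consequence is that a mis-ordered argument list replaces the intended error message with a formatting exception.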



[jira] [Commented] (HELIX-776) REST2.0: Add delete command to updateInstanceConfig

2018-10-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670953#comment-16670953
 ] 

Hudson commented on HELIX-776:
--

FAILURE: Integrated in Jenkins build helix #1561 (See 
[https://builds.apache.org/job/helix/1561/])
[HELIX-776] REST2.0: Add delete command to updateInstanceConfig (hulee: rev 
6090732be6b88863017a93106fa692dc7350520b)
* (edit) 
helix-rest/src/test/java/org/apache/helix/rest/server/TestInstanceAccessor.java
* (edit) 
helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/InstanceAccessor.java
* (edit) helix-core/src/main/java/org/apache/helix/ConfigAccessor.java


> REST2.0: Add delete command to updateInstanceConfig
> ---
>
> Key: HELIX-776
> URL: https://issues.apache.org/jira/browse/HELIX-776
> Project: Apache Helix
>  Issue Type: Improvement
>Reporter: Hunter L
>Assignee: Hunter L
>Priority: Major
>
> For instance configs, REST2.0 did not expose the REST API for deletion of 
> fields. This RB adds update and delete commands to updateInstanceConfig and 
> an integration test thereof.
> Changelist:
> 1. Add delete command to updateInstanceConfig in InstanceAccessor
> 2. Add integration tests





[jira] [Commented] (HELIX-778) TASK: Fix a race condition in updatePreviousAssignedTasksStatus

2018-10-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670955#comment-16670955
 ] 

Hudson commented on HELIX-778:
--

FAILURE: Integrated in Jenkins build helix #1561 (See 
[https://builds.apache.org/job/helix/1561/])
[HELIX-778] TASK: Fix a race condition in (hulee: rev 
ceba1a55ae351090144c001324f908f2364212a4)
* (edit) 
helix-core/src/test/java/org/apache/helix/integration/task/TestUnregisteredCommand.java
* (edit) 
helix-core/src/main/java/org/apache/helix/task/AbstractTaskDispatcher.java


> TASK: Fix a race condition in updatePreviousAssignedTasksStatus
> ---
>
> Key: HELIX-778
> URL: https://issues.apache.org/jira/browse/HELIX-778
> Project: Apache Helix
>  Issue Type: Improvement
>Reporter: Hunter L
>Assignee: Hunter L
>Priority: Major
>
> It was observed that TestUnregisteredCommand is very unstable. The cause was 
> identified as a race condition: when a task fails, a pending message for that 
> task (from INIT to RUNNING) was sometimes not cleaned up in time, so 
> AbstractTaskDispatcher's updatePreviousAssignedTasksStatus would try to 
> process that message and skip the status update for that task (such as 
> updating its status and the NUM_ATTEMPTS field in JobContext).
> A short-term fix is to call markPartitionError() before checking for the 
> pending message, but in the long run we need to revisit the design of the 
> task status update to avoid this type of race condition.
> Changelist:
> 1. Move markPartitionError() up before checking for a pending message on the 
> task
> 2. Fix TestUnregisteredCommand's instability





[jira] [Commented] (HELIX-777) TASK: Handle null currentState for unscheduled tasks

2018-10-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670954#comment-16670954
 ] 

Hudson commented on HELIX-777:
--

FAILURE: Integrated in Jenkins build helix #1561 (See 
[https://builds.apache.org/job/helix/1561/])
[HELIX-777] TASK: Handle null currentState for unscheduled tasks (hulee: rev 
5d24ed544898ff69f289f54be71a04413735d118)
* (edit) 
helix-core/src/main/java/org/apache/helix/task/AbstractTaskDispatcher.java


> TASK: Handle null currentState for unscheduled tasks
> 
>
> Key: HELIX-777
> URL: https://issues.apache.org/jira/browse/HELIX-777
> Project: Apache Helix
>  Issue Type: Improvement
>Reporter: Hunter L
>Assignee: Hunter L
>Priority: Major
>
> It was observed that when a workflow is submitted and the Controller attempts 
> to schedule its tasks, the ZK read fails to retrieve the appropriate job's 
> context, leaving the job stuck in an unscheduled state. The job remained 
> unscheduled because it had no currentStates and its job context contained no 
> assignment or state information. This RB fixes such stuck states by 
> detecting null currentStates.
> Changelist:
> 1. Check if currentState is null and if it is, manually assign an INIT state


