[GitHub] helix pull request #295: PR

2018-11-13 Thread narendly
GitHub user narendly opened a pull request:

https://github.com/apache/helix/pull/295

PR



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/narendly/helix master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/helix/pull/295.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #295


commit 22fa03f3d8bb0863913f2a6614574443130e500a
Author: narendly 
Date:   2018-11-14T02:20:38Z

[HELIX-789] REST2.0: Add support for update and delete for ResourceConfig

Previous implementation of updateResourceConfig did not allow deletion of 
fields in ResourceConfig in ZK. This RB refactors the REST endpoint.
Changelist:
1. Add command support for updateResourceConfig
2. Add integration tests

commit abc6969d754e01c76278c266d08cc4e9fb80e910
Author: narendly 
Date:   2018-11-14T02:22:55Z

[HELIX-790] REST2.0: Add support for updating IdealState

There was a user request for a REST endpoint that allows users to 
add/delete/modify fields in IdealState ZNodes.
Changelist:
1. Add updateResourceIdealState in ResourceAccessor
2. Add update APIs in HelixAdmin
3. Add an integration test




---


[jira] [Created] (HELIX-790) REST2.0: Add support for updating IdealState

2018-11-13 Thread Hunter L (JIRA)
Hunter L created HELIX-790:
--

 Summary: REST2.0: Add support for updating IdealState
 Key: HELIX-790
 URL: https://issues.apache.org/jira/browse/HELIX-790
 Project: Apache Helix
  Issue Type: Improvement
Reporter: Hunter L
Assignee: Hunter L


There was a user request for a REST endpoint that allows users to 
add/delete/modify fields in IdealState ZNodes.

Changelist:
1. Add updateResourceIdealState in ResourceAccessor
2. Add update APIs in HelixAdmin
3. Add an integration test



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HELIX-789) REST2.0: Add support for update and delete for ResourceConfig

2018-11-13 Thread Hunter L (JIRA)
Hunter L created HELIX-789:
--

 Summary: REST2.0: Add support for update and delete for 
ResourceConfig
 Key: HELIX-789
 URL: https://issues.apache.org/jira/browse/HELIX-789
 Project: Apache Helix
  Issue Type: Improvement
Reporter: Hunter L
Assignee: Hunter L


Previous implementation of updateResourceConfig did not allow deletion of 
fields in ResourceConfig in ZK. This RB refactors the REST endpoint.
Changelist:
1. Add command support for updateResourceConfig
2. Add integration tests
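
A minimal sketch of the update/delete distinction described above, assuming a config is modeled as a flat map of fields. The names ConfigCommandSketch, Command, and apply are hypothetical and are not the Helix REST or HelixAdmin API; the actual endpoint and payload format are not given in this message.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not Helix code.
class ConfigCommandSketch {
    // Assumed command names, mirroring "update" and "delete" in the issue title.
    enum Command { UPDATE, DELETE }

    /**
     * Returns a new map with the command applied.
     * UPDATE adds or overwrites the fields named in the payload.
     * DELETE removes the fields named in the payload (payload values are ignored).
     */
    static Map<String, String> apply(Command command,
                                     Map<String, String> current,
                                     Map<String, String> payload) {
        Map<String, String> result = new HashMap<>(current);
        switch (command) {
            case UPDATE:
                result.putAll(payload);
                break;
            case DELETE:
                result.keySet().removeAll(payload.keySet());
                break;
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> current = new HashMap<>();
        current.put("FIELD_A", "1");
        current.put("FIELD_B", "2");

        Map<String, String> payload = new HashMap<>();
        payload.put("FIELD_B", "3");

        System.out.println(apply(Command.UPDATE, current, payload));
        System.out.println(apply(Command.DELETE, current, payload));
    }
}
```

With only a merge-style update, a field could be overwritten but never removed, which is the gap the issue describes.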





Re: Sporadic issue in putting a variable in workflow scope

2018-11-13 Thread Xue Junkai
Yes. You are right.
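
The expiry rules confirmed in this thread (quoted below) can be sketched as follows. This is only an illustration under stated assumptions: WorkflowExpirySketch, WorkflowState, and shouldCleanUp are hypothetical names, not the logic in TaskUtil.java, and the sketch covers generic workflows only, not JobQueues.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration only; not Helix code.
class WorkflowExpirySketch {
    // Assumed states for the sketch; not Helix's own state enum.
    enum WorkflowState { NOT_SCHEDULED, IN_PROGRESS, COMPLETED, FAILED }

    // Default expiry of 24 hours, as confirmed in the thread.
    static final Duration DEFAULT_EXPIRY = Duration.ofHours(24);

    /**
     * Returns true if a generic workflow is eligible for cleanup at time now.
     * The expiry clock starts only when the workflow completes, so workflows
     * that are unscheduled, running, or failed are never eligible.
     * finishTime is only read for COMPLETED workflows.
     */
    static boolean shouldCleanUp(WorkflowState state, Instant finishTime,
                                 Duration expiry, Instant now) {
        if (state != WorkflowState.COMPLETED) {
            return false;
        }
        return !now.isBefore(finishTime.plus(expiry));
    }

    public static void main(String[] args) {
        Instant finished = Instant.parse("2018-11-13T00:00:00Z");
        Instant oneHourLater = finished.plus(Duration.ofHours(1));
        Instant threeHoursLater = finished.plus(Duration.ofHours(3));
        Duration twoHours = Duration.ofHours(2);

        System.out.println(shouldCleanUp(WorkflowState.COMPLETED, finished, twoHours, oneHourLater));
        System.out.println(shouldCleanUp(WorkflowState.COMPLETED, finished, twoHours, threeHoursLater));
        System.out.println(shouldCleanUp(WorkflowState.FAILED, finished, twoHours, threeHoursLater));
    }
}
```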

On Tue, Nov 13, 2018 at 7:32 AM DImuthu Upeksha 
wrote:

> Hi Junkai,
>
> Thanks a lot. I'll try with expiry time then. Is this [1] the place where
> Helix has implemented this logic? If so, the default expiry time should be
> 24 hours. Am I right?
>
> [1]
>
> https://github.com/apache/helix/blob/master/helix-core/src/main/java/org/apache/helix/task/TaskUtil.java#L711
>
> Thanks
> Dimuthu
>
> On Mon, Nov 12, 2018 at 10:17 PM Xue Junkai  wrote:
>
> > 1 and 2 are correct. 3 is wrong. The expiry time starts counting only when
> > the workflow is completed. If it is not scheduled (there are not enough
> > resources) or is still running, Helix never deletes it.
> >
> >
> >
> > On Sun, Nov 11, 2018 at 8:01 PM DImuthu Upeksha <
> > dimuthu.upeks...@gmail.com> wrote:
> >
> >> Hi Junkai,
> >>
> >> Thanks for the clarification. That helped a lot.
> >>
> >> In our case, each task in the workflow depends on the previous task, so
> >> there is no parallel execution. And we are not using Job Queues.
> >>
> >> Regarding the expiry time, what rules do you impose on it? For example,
> >> let's say I set an expiry time of 2 hours. I assume the following
> >> situations are covered in Helix:
> >>
> >> 1. Even if the workflow completes before 2 hours, resources related to
> >> that workflow will not be cleared until 2 hours have elapsed, and exactly
> >> after 2 hours all the resources will be cleared by the framework.
> >> 2. If the workflow failed, resources will not be cleared even after 2
> >> hours.
> >> 3. If the workflow wasn't scheduled on a participant within 2 hours, it
> >> will be deleted.
> >>
> >> Is my understanding correct?
> >>
> >> Thanks
> >> Dimuthu
> >>
> >>
> >> On Sat, Nov 10, 2018 at 4:26 PM Xue Junkai wrote:
> >>
> >> > Hi Dimuthu,
> >> >
> >> > A couple of things here:
> >> > 1. Only a JobQueue in Helix is a single-branch DAG, with one job running
> >> > at a time when the parallel job number is set to 1. Otherwise, you may
> >> > see many jobs running at the same time if you set the parallel job number
> >> > to a different value. For a generic workflow, all jobs without
> >> > dependencies can be dispatched together.
> >> > 2. Helix only cleans up completed generic workflows, by deleting all the
> >> > related znodes; it does not do this for a JobQueue. For a JobQueue you
> >> > have to set up a periodic purge time. As Helix defines it, a JobQueue
> >> > never finishes, can only be terminated by a manual kill, and can keep
> >> > accepting dynamic jobs. Thus you have to know whether your workflow is a
> >> > generic workflow or a JobQueue. For a failed generic workflow, even if
> >> > you set an expiry time, Helix will not clean it up, because Helix keeps
> >> > it for further investigation by the user.
> >> > 3. For the Helix controller, if Helix fails to clean up workflows, the
> >> > only thing you will see is workflows that have a context but no resource
> >> > config or idealstate. This happens when a ZK write fails while cleaning
> >> > up the last piece, the context node. With no ideal state left, nothing
> >> > can trigger cleanup again for this workflow.
> >> >
> >> > Please take a look at this task framework tutorial for detailed
> >> > configurations:
> >> > https://helix.apache.org/0.8.2-docs/tutorial_task_framework.html
> >> >
> >> > Best,
> >> >
> >> > Junkai
> >> >
> >> > On Sat, Nov 10, 2018 at 8:29 AM DImuthu Upeksha <
> >> > dimuthu.upeks...@gmail.com> wrote:
> >> >
> >> >> Hi Junkai,
> >> >>
> >> >> Thanks for the clarification. There are a few special properties in our
> >> >> workflows. All the workflows are single-branch DAGs, so there will be
> >> >> only one job running at a time. Looking at the log, I could see that
> >> >> only the task with this error has failed. The cleanup agent deleted this
> >> >> workflow after this task failed, so it is clear that no other task is
> >> >> triggering this issue (I checked the timestamp).
> >> >>
> >> >> However, for now I have disabled the cleanup agent for a while. The
> >> >> reason for adding this agent is that Helix became slow to schedule
> >> >> pending jobs when the load was high, and the participant was waiting
> >> >> without running anything for a few minutes. We discussed this in the
> >> >> thread "Sporadic delays in task execution". Before implementing this
> >> >> agent, I noticed that there were lots of uncleared znodes related to
> >> >> Completed and Failed workflows, and I thought that was the reason the
> >> >> controller / participant slowed down. After implementing this agent,
> >> >> things went smoothly until this point.
> >> >>
> >> >> Now I understand that you have your own workflow cleanup logic
> >> >> implemented in Helix, but we might need to tune it to our case. Can you
> >> >> point me to code / documentation where I can get an idea about that?
> >> >>
> >> >> And this for my understanding, let's say that for 

helix - Build # 1571 - Still Failing

2018-11-13 Thread Apache Jenkins Server
The Apache Jenkins build system has built helix (build #1571)

Status: Still Failing

Check console output at https://builds.apache.org/job/helix/1571/ to view the 
results.