Re: Sporadic issue in putting a variable in workflow scope

2018-11-13 Thread Xue Junkai
Yes. You are right. On Tue, Nov 13, 2018 at 7:32 AM DImuthu Upeksha wrote: > Hi Junkai, > > Thanks a lot. I'll try with expiry time then. Is this[1] the place where > Helix has implemented this logic? If that is so, the default expiry time should be > 24 hours. Am I right? > > [1] > > https://github.co

Re: Sporadic issue in putting a variable in workflow scope

2018-11-13 Thread DImuthu Upeksha
Hi Junkai, Thanks a lot. I'll try with expiry time then. Is this[1] the place where Helix has implemented this logic? If that is so, the default expiry time should be 24 hours. Am I right? [1] https://github.com/apache/helix/blob/master/helix-core/src/main/java/org/apache/helix/task/TaskUtil.java#L711

Re: Sporadic issue in putting a variable in workflow scope

2018-11-12 Thread Xue Junkai
1 and 2 are correct. 3 is wrong. The expiry time starts counting only when the workflow is completed. If it is not scheduled (it doesn't have enough resources) or is still running, Helix never deletes it. On Sun, Nov 11, 2018 at 8:01 PM DImuthu Upeksha wrote: > Hi Junkai, > > Thanks for the clarificati
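The expiry rule described in this reply can be sketched as a small stand-alone model. This is not Helix code: the class WorkflowExpiryModel, its State enum, and the method isEligibleForDeletion are hypothetical names made up for illustration. The only facts taken from the thread are that the expiry clock starts when the workflow completes, that a workflow which is not scheduled or still running is never deleted, and that the default expiry is 24 hours.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical model of the expiry rule described in the thread; not Helix's implementation.
final class WorkflowExpiryModel {
    // Simplified states, assumed for this sketch only.
    enum State { NOT_SCHEDULED, RUNNING, COMPLETED }

    // 24 hours, the default expiry confirmed in the thread.
    static final Duration DEFAULT_EXPIRY = Duration.ofHours(24);

    // A workflow may be deleted only if it has completed AND the expiry
    // period, counted from the completion time, has fully elapsed.
    static boolean isEligibleForDeletion(State state, Instant finishedAt,
                                         Instant now, Duration expiry) {
        if (state != State.COMPLETED || finishedAt == null) {
            return false; // not scheduled or still running: never deleted
        }
        return !now.isBefore(finishedAt.plus(expiry));
    }

    public static void main(String[] args) {
        Instant finished = Instant.parse("2018-11-12T00:00:00Z");

        // Still running after 30 days: never eligible.
        System.out.println(isEligibleForDeletion(
                State.RUNNING, null, finished.plus(Duration.ofDays(30)), DEFAULT_EXPIRY));

        // Completed 1 hour ago: the expiry has not elapsed yet.
        System.out.println(isEligibleForDeletion(
                State.COMPLETED, finished, finished.plus(Duration.ofHours(1)), DEFAULT_EXPIRY));

        // Completed 25 hours ago: eligible for deletion.
        System.out.println(isEligibleForDeletion(
                State.COMPLETED, finished, finished.plus(Duration.ofHours(25)), DEFAULT_EXPIRY));
    }
}
```

The point of the sketch is that the age of a workflow alone never makes it eligible; only time elapsed after completion counts.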

Re: Sporadic issue in putting a variable in workflow scope

2018-11-11 Thread DImuthu Upeksha
Hi Junkai, Thanks for the clarification. That helped a lot. In our case, each task of the workflow depends on the previous task, so there is no parallel execution. And we are not using Job Queues. Regarding the expiry time, what are the rules that you are imposing on that? For exampl

Re: Sporadic issue in putting a variable in workflow scope

2018-11-10 Thread Xue Junkai
Hi Dimuthu, A couple of things here: 1. Only a JobQueue in Helix is a single-branch DAG with one job running at a time, and only when the parallel job number is set to 1. Otherwise, you may see many jobs running at the same time, depending on the parallel job number you set. For a generic workflow, all jobs withou

Re: Sporadic issue in putting a variable in workflow scope

2018-11-10 Thread DImuthu Upeksha
Hi Junkai, Thanks for the clarification. There are a few special properties in our workflows. All the workflows are single-branch DAGs, so there will be only one job running at a time. Looking at the log, I could see that only the task with this error has failed. Cleanup agent deleted this wo

Re: Sporadic issue in putting a variable in workflow scope

2018-11-09 Thread Xue Junkai
It is possible. For example, if other jobs caused the workflow to fail, that will trigger the monitoring to clean up the workflow. Then, if this job is still running, you may see the problem. That's what I was trying to ask about: an extra thread deleting/cleaning workflows. I can understand it clean up the

Re: Sporadic issue in putting a variable in workflow scope

2018-11-09 Thread DImuthu Upeksha
Hi Junkai, There is a cleanup agent [1] that monitors currently available workflows and deletes completed and failed workflows to free up ZooKeeper storage. Do you think this could be causing the issue? [1] https://github.com/apache/airavata/blob/staging/modules/airavata-helix/helix-sp

Re: Sporadic issue in putting a variable in workflow scope

2018-11-09 Thread DImuthu Upeksha
Hi Junkai, There is no manual workflow killing logic implemented, but as you have suggested, I need to verify that. Unfortunately, all the Helix log levels on our servers were set to WARN, because Helix prints a lot of logs at INFO level, so there is not much valuable information in the logs. Can you

Re: Sporadic issue in putting a variable in workflow scope

2018-11-09 Thread Xue Junkai
Hmm, that's very strange. The user content store znode is deleted only when the workflow is gone. The log shows that the znode is gone. Could you please dig through the log to find out whether the workflow has been manually killed? If that's the case, then it is possible you have the problem.

Re: Sporadic issue in putting a variable in workflow scope

2018-11-09 Thread DImuthu Upeksha
Hi Junkai, Thanks for your suggestion. You have captured most of the parts correctly. There are two jobs, job1 and job2, and job2 depends on job1. Until job1 is completed, job2 should not be scheduled. Task 1 in job1 is calling that method, and it is not updating an

Re: Sporadic issue in putting a variable in workflow scope

2018-11-09 Thread Xue Junkai
As I understand it, you may have job1 and job2, where the task running in job1 tries to update content for job2. There could then be a race condition in which job2 is not yet scheduled. If that's the case, I suggest putting the key-value store at the workflow level, since this is cross-job o
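The suggestion above can be illustrated with a toy model of scoped key-value storage. This is a minimal sketch, not the Helix user content store: ScopedStoreModel and all of its methods are hypothetical names. It assumes, as stated elsewhere in this thread, that a job's content node exists only once that job has been scheduled, while workflow-level content is available to every job in the workflow.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of workflow-scoped vs job-scoped content; not Helix code.
final class ScopedStoreModel {
    // Workflow-level content, shared by all jobs in the workflow.
    private final Map<String, String> workflowContent = new HashMap<>();
    // Job-level content; an entry exists only after the job is scheduled.
    private final Map<String, Map<String, String>> jobContent = new HashMap<>();

    // Models the scheduler creating the job's content node.
    void scheduleJob(String job) {
        jobContent.putIfAbsent(job, new HashMap<>());
    }

    // Writing at workflow scope does not depend on any job being scheduled.
    void putWorkflow(String key, String value) {
        workflowContent.put(key, value);
    }

    String getWorkflow(String key) {
        return workflowContent.get(key);
    }

    // Returns false when the target job has no content node yet,
    // which is the race described above.
    boolean putJob(String job, String key, String value) {
        Map<String, String> node = jobContent.get(job);
        if (node == null) {
            return false;
        }
        node.put(key, value);
        return true;
    }

    public static void main(String[] args) {
        ScopedStoreModel store = new ScopedStoreModel();
        store.scheduleJob("job1");

        // A task in job1 writes for job2 before job2 is scheduled: the write is lost.
        System.out.println(store.putJob("job2", "outputPath", "/tmp/out"));

        // The same value written at workflow scope is visible to job2 later.
        store.putWorkflow("outputPath", "/tmp/out");
        store.scheduleJob("job2");
        System.out.println(store.getWorkflow("outputPath"));
    }
}
```

In this model, values shared across jobs belong at workflow scope because that scope does not depend on the scheduling order of individual jobs.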

Re: Sporadic issue in putting a variable in workflow scope

2018-11-09 Thread DImuthu Upeksha
Hi Junkai, This method is being called inside a running task, and it works most of the time. I have seen this on only 2 occasions in the last few months, and both of them happened today and yesterday. Thanks Dimuthu On Fri, Nov 9, 2018 at 2:40 PM Xue Junkai wrote: > User content store node will

Re: Sporadic issue in putting a variable in workflow scope

2018-11-09 Thread Xue Junkai
User content store node will be created once the job has been scheduled. In your case, I think the job is not scheduled. This method is usually used in a running task. Best, Junkai On Fri, Nov 9, 2018 at 8:19 AM DImuthu Upeksha wrote: > Hi Helix Folks, > > I'm having this sporadic issue