NealSun96 opened a new pull request #1508:
URL: https://github.com/apache/helix/pull/1508


   ### Issues
   
   - [x] My PR addresses the following Helix issues and references them in the 
PR description:
   
   Fixes #1506, #1507
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   This PR fixes 2 issues:
   1. On-demand rebalance flooding after the Task Framework (TF) IdealState 
(IS) removal. The root cause is a long-standing problem: a jobConfig can remain 
from a previous iteration of a workflow even though the job no longer exists in 
the job DAG. Because of the runtime DAG refresh logic, such a jobConfig causes 
the runtime DAG to be refreshed every time the pipeline runs. A new runtime DAG 
causes all jobs to be reprocessed, and during processing the cleanup logic runs 
for every job that has already completed. Before the IS removal, the cleanup 
logic attempted to delete the IS again, which had no effect; in the new code, 
an on-demand rebalance is triggered instead. 
   This "dangling job" is never removed, so the job DAG keeps being refreshed, 
which in turn keeps triggering on-demand rebalances. 
   2. Log flooding for jobs with missing target resources after the TF IS 
removal. Similarly, a "dangling job" is left with a missing target resource 
once its target resource is deleted. Before the IS removal this was not a 
problem, because the log was emitted only when the job was first processed; 
now the processing logic runs on every pipeline run (since it is config based 
instead of IS based), so the same log can be emitted repeatedly. 
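   
   The two symptoms above can be illustrated with a minimal sketch. This is 
not the change made in this PR and it does not use Helix APIs: the class name, 
method names, and job names are all hypothetical, and the refresh check is a 
simplified assumption about how a config/DAG mismatch could force a refresh on 
every run.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: every name here is hypothetical, none of this is Helix code.
final class DanglingJobSketch {
  // Jobs for which the "missing target resource" message was already logged.
  private final Set<String> alreadyLogged = new HashSet<>();

  /** Jobs that still have a config but are no longer part of the job DAG. */
  static Set<String> findDanglingJobs(Set<String> jobsWithConfig, Set<String> jobsInDag) {
    Set<String> dangling = new HashSet<>(jobsWithConfig);
    dangling.removeAll(jobsInDag);
    return dangling;
  }

  /**
   * Assumed, simplified refresh check: refresh whenever some config has no DAG node.
   * A dangling job is never removed, so this returns true on every pipeline run.
   */
  static boolean naiveNeedsRefresh(Set<String> jobsWithConfig, Set<String> jobsInDag) {
    return !jobsInDag.containsAll(jobsWithConfig);
  }

  /** Log-once guard: returns true only the first time a given job is seen. */
  boolean shouldLogMissingTarget(String jobName) {
    // Set.add returns false when the element was already present.
    return alreadyLogged.add(jobName);
  }
}
```

   With configs for `job_a`, `job_b`, and `job_old` but only `job_a` and 
`job_b` in the DAG, `findDanglingJobs` returns just `job_old`, and 
`naiveNeedsRefresh` stays true for as long as that config exists. A guard like 
`shouldLogMissingTarget` is one possible way to keep a per-run code path from 
repeating the same log line.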
   
   ### Tests
   
   - [x] The following is the result of the "mvn test" command on the 
appropriate module:
   
   ```
   [ERROR] Tests run: 1236, Failures: 1, Errors: 0, Skipped: 1, Time elapsed: 
4,922.413 s <<< FAILURE! - in TestSuite
   [ERROR] testDeactivateCluster(org.apache.helix.tools.TestHelixAdminCli)  
Time elapsed: 2.383 s  <<< FAILURE!
   org.apache.helix.HelixException: There are still LEADER in the cluster, shut 
them down first.
           at 
org.apache.helix.tools.TestHelixAdminCli.testDeactivateCluster(TestHelixAdminCli.java:604)
   
   [INFO]
   [INFO] Results:
   [INFO]
   [ERROR] Failures:
   [ERROR]   TestHelixAdminCli.testDeactivateCluster:604 » Helix There are 
still LEADER in ...
   [INFO]
   [ERROR] Tests run: 1236, Failures: 1, Errors: 0, Skipped: 1
   [INFO]
   [INFO] 
------------------------------------------------------------------------
   [INFO] BUILD FAILURE
   [INFO] 
------------------------------------------------------------------------
   [INFO] Total time:  01:22 h
   [INFO] Finished at: 2020-11-02T18:52:09-08:00
   [INFO] 
------------------------------------------------------------------------
   ```
   
   Rerun of TestHelixAdminCli:
   
   ```
   [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
23.672 s - in org.apache.helix.tools.TestHelixAdminCli
   [INFO] 
   [INFO] Results:
   [INFO] 
   [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0
   [INFO] 
   [INFO] 
------------------------------------------------------------------------
   [INFO] BUILD SUCCESS
   [INFO] 
------------------------------------------------------------------------
   [INFO] Total time:  28.917 s
   [INFO] Finished at: 2020-11-02T19:44:51-08:00
   [INFO] 
------------------------------------------------------------------------
   ```
   
   ### Documentation (Optional)
   
   - In case of new functionality, my PR adds documentation in the following 
wiki page:
   
   (Link the GitHub wiki you added)
   
   ### Commits
   
   - My commits all reference appropriate Apache Helix GitHub issues in their 
subject lines. In addition, my commits follow the guidelines from "[How to 
write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Code Quality
   
   - My diff has been formatted using helix-style.xml 
   (helix-style-intellij.xml if IntelliJ IDE is used)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


