[GitHub] [helix] alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
URL: https://github.com/apache/helix/pull/741#discussion_r383097458

## File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java

## @@ -1043,23 +1043,40 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
    * @param dataProvider
    * @param manager
    */
 - public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider,
 + public static void workflowGarbageCollection(final WorkflowControllerDataProvider dataProvider,
       final HelixManager manager) {
     // Garbage collections for conditions where workflow context exists but config is missing.
 -   Map contexts = dataProvider.getContexts();
 -   HelixDataAccessor accessor = manager.getHelixDataAccessor();
 -   HelixPropertyStore propertyStore = manager.getHelixPropertyStore();
 +   Set existingContexts;
 +   /*
 +    * Here try-catch is used to avoid concurrent modification exception while doing deep copy.
 +    * Map.keySet() can produce concurrent modification exception.
 +    * Reason: If the map is modified while an iteration over the set is in progress, concurrent
 +    * modification exception will be thrown.
 +    */
 +   try {
 +     existingContexts = new HashSet<>(dataProvider.getContexts().keySet());

Review comment: dataProvider.getContexts() is outside the scope of this class, and perhaps there are performance concerns regarding this copy. Maybe we can consider this in the next design of TF.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services

To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org
For additional commands, e-mail: reviews-h...@helix.apache.org
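The guarded-copy pattern in the diff above can be sketched as a standalone snippet; the plain HashMap here is a stand-in for the controller cache, not the actual WorkflowControllerDataProvider API, and returning null on failure is an illustrative simplification of the real code's log-and-skip behavior:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class GuardedCopy {
    // Snapshot the key set defensively. HashSet's copy constructor iterates
    // the source set, so a concurrent mutation of the backing map can throw
    // ConcurrentModificationException; in that case we give up and let the
    // garbage collection retry on a later pipeline run.
    static Set<String> snapshotKeys(Map<String, ?> contexts) {
        try {
            return new HashSet<>(contexts.keySet());
        } catch (ConcurrentModificationException e) {
            // The real code logs a warning here instead.
            return null;
        }
    }

    public static void main(String[] args) {
        Map<String, String> contexts = new HashMap<>();
        contexts.put("workflow-1", "context");
        System.out.println(snapshotKeys(contexts));
    }
}
```

Note that this copies only the key references (shallow, O(n)), which is presumably why the performance concern above is deferred rather than blocking.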
[GitHub] [helix] alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
URL: https://github.com/apache/helix/pull/741#discussion_r377986318

## File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java

## @@ -1040,26 +1040,39 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
   /**
    * The function that loops through the all existing workflow contexts and removes IdealState and
    * workflow context of the workflow whose workflow config does not exist.
 +  * Try-catch has been used to avoid concurrent modification exception while doing deep copy. Since
 +  * Map.keySet() can produce concurrent modification exception.

Review comment: Done. Please let me know if it is still not clear.
[GitHub] [helix] alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
URL: https://github.com/apache/helix/pull/741#discussion_r377986361

## File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java

## @@ -1040,26 +1040,39 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
    * @param dataProvider
    * @param manager
    */
 - public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider,
 + public static void workflowGarbageCollection(final WorkflowControllerDataProvider dataProvider,
       final HelixManager manager) {
     // Garbage collections for conditions where workflow context exists but config is missing.
 -   Map contexts = dataProvider.getContexts();
 -   HelixDataAccessor accessor = manager.getHelixDataAccessor();
 -   HelixPropertyStore propertyStore = manager.getHelixPropertyStore();
 +   Set existingWorkflowContexts;
 +   try {
 +     existingWorkflowContexts = new HashSet<>(dataProvider.getContexts().keySet());
 +   } catch (Exception e) {
 +     LOG.warn(
 +         "Exception occurred while creating a list of all existing contexts with missing config!",
 +         e);

Review comment: Done.
[GitHub] [helix] alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
URL: https://github.com/apache/helix/pull/741#discussion_r377880822

## File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java

## @@ -1043,23 +1043,33 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
    * @param dataProvider
    * @param manager
    */
 - public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider,
 + public static void workflowGarbageCollection(final WorkflowControllerDataProvider dataProvider,
       final HelixManager manager) {
     // Garbage collections for conditions where workflow context exists but config is missing.
 -   Map contexts = dataProvider.getContexts();
 -   HelixDataAccessor accessor = manager.getHelixDataAccessor();
 -   HelixPropertyStore propertyStore = manager.getHelixPropertyStore();
 +   // toBeDeletedWorkflows is a set that contains the name of the workflows that their contexts
 +   // should be deleted.
     Set toBeDeletedWorkflows = new HashSet<>();
 -   for (Map.Entry entry : contexts.entrySet()) {
 -     if (entry.getValue() != null
 -         && entry.getValue().getId().equals(TaskUtil.WORKFLOW_CONTEXT_KW)) {
 -       if (dataProvider.getWorkflowConfig(entry.getKey()) == null) {
 -         toBeDeletedWorkflows.add(entry.getKey());
 +   try {
 +     Set existingWorkflowContexts = new HashSet<>(dataProvider.getContexts().keySet());
 +     for (String entry : existingWorkflowContexts) {
 +       if (entry != null) {
 +         WorkflowConfig cfg = dataProvider.getWorkflowConfig(entry);
 +         WorkflowContext ctx = dataProvider.getWorkflowContext(entry);
 +         if (ctx != null && ctx.getId().equals(TaskUtil.WORKFLOW_CONTEXT_KW) && cfg == null) {
 +           toBeDeletedWorkflows.add(entry);
 +         }
         }
       }
 +   } catch (Exception e) {
 +     LOG.warn(
 +         "Exception occurred while creating a list of all existing contexts with missing config!",
 +         e);
     }

Review comment: Done. Thanks @narendly for the comments.
[GitHub] [helix] alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
URL: https://github.com/apache/helix/pull/741#discussion_r377436725

## File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java

## @@ -1043,23 +1043,33 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
    * @param dataProvider
    * @param manager
    */
 - public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider,
 + public static void workflowGarbageCollection(final WorkflowControllerDataProvider dataProvider,
       final HelixManager manager) {
     // Garbage collections for conditions where workflow context exists but config is missing.
 -   Map contexts = dataProvider.getContexts();
 -   HelixDataAccessor accessor = manager.getHelixDataAccessor();
 -   HelixPropertyStore propertyStore = manager.getHelixPropertyStore();
 +   // toBeDeletedWorkflows is a set that contains the name of the workflows that their contexts
 +   // should be deleted.
     Set toBeDeletedWorkflows = new HashSet<>();
 -   for (Map.Entry entry : contexts.entrySet()) {
 -     if (entry.getValue() != null
 -         && entry.getValue().getId().equals(TaskUtil.WORKFLOW_CONTEXT_KW)) {
 -       if (dataProvider.getWorkflowConfig(entry.getKey()) == null) {
 -         toBeDeletedWorkflows.add(entry.getKey());
 +   try {
 +     Set existingWorkflowContexts = new HashSet<>(dataProvider.getContexts().keySet());
 +     for (String entry : existingWorkflowContexts) {
 +       if (entry != null) {
 +         WorkflowConfig cfg = dataProvider.getWorkflowConfig(entry);
 +         WorkflowContext ctx = dataProvider.getWorkflowContext(entry);
 +         if (ctx != null && ctx.getId().equals(TaskUtil.WORKFLOW_CONTEXT_KW) && cfg == null) {
 +           toBeDeletedWorkflows.add(entry);
 +         }
         }
       }
 +   } catch (Exception e) {
 +     LOG.warn(
 +         "Exception occurred while creating a list of all existing contexts with missing config!",
 +         e);
     }

Review comment: I don't believe a deep copy by itself helps (it would help in the single-threaded case, where you want to remove elements from a map or list while iterating over it). The ConcurrentModificationException is generated by this call: **map.keySet()**, and we cannot avoid it. How do you propose to get the keys of the map in this scenario? We need the keys no matter what, even to do a deep copy. A deep-copy implementation still loops over the elements and copies them one by one into the new map; a deep copy is merely done by iterating through the elements (keys and values) and cloning those too. Right? So if the original map is changed while the copy operation is happening, we might still get a ConcurrentModificationException.
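The fail-fast behavior described above can be reproduced deterministically even in a single thread. This standalone sketch (a plain HashMap, not the Helix controller cache) structurally modifies the map while iterating its key set:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class FailFastDemo {
    // Returns true if iterating the keySet while structurally modifying the
    // backing map throws ConcurrentModificationException.
    static boolean modificationDuringIterationThrows() {
        Map<String, String> contexts = new HashMap<>();
        contexts.put("workflow-1", "ctx");
        contexts.put("workflow-2", "ctx");
        try {
            for (String key : contexts.keySet()) {
                // Adding a new key is a structural modification; the
                // fail-fast iterator detects it on its next step.
                contexts.put("new-" + key, "ctx");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("threw: " + modificationDuringIterationThrows());
        // prints "threw: true"
    }
}
```

`new HashSet<>(map.keySet())` performs exactly such an iteration internally, which is why even the defensive copy itself can throw when another thread mutates the cache concurrently.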
[GitHub] [helix] alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
URL: https://github.com/apache/helix/pull/741#discussion_r377433026

## File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java

## @@ -1043,23 +1043,33 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
    * @param dataProvider
    * @param manager
    */
 - public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider,
 + public static void workflowGarbageCollection(final WorkflowControllerDataProvider dataProvider,
       final HelixManager manager) {
     // Garbage collections for conditions where workflow context exists but config is missing.
 -   Map contexts = dataProvider.getContexts();
 -   HelixDataAccessor accessor = manager.getHelixDataAccessor();
 -   HelixPropertyStore propertyStore = manager.getHelixPropertyStore();
 +   // toBeDeletedWorkflows is a set that contains the name of the workflows that their contexts
 +   // should be deleted.
     Set toBeDeletedWorkflows = new HashSet<>();
 -   for (Map.Entry entry : contexts.entrySet()) {
 -     if (entry.getValue() != null
 -         && entry.getValue().getId().equals(TaskUtil.WORKFLOW_CONTEXT_KW)) {
 -       if (dataProvider.getWorkflowConfig(entry.getKey()) == null) {
 -         toBeDeletedWorkflows.add(entry.getKey());
 +   try {
 +     Set existingWorkflowContexts = new HashSet<>(dataProvider.getContexts().keySet());
 +     for (String entry : existingWorkflowContexts) {
 +       if (entry != null) {
 +         WorkflowConfig cfg = dataProvider.getWorkflowConfig(entry);
 +         WorkflowContext ctx = dataProvider.getWorkflowContext(entry);
 +         if (ctx != null && ctx.getId().equals(TaskUtil.WORKFLOW_CONTEXT_KW) && cfg == null) {
 +           toBeDeletedWorkflows.add(entry);
 +         }
         }
       }
 +   } catch (Exception e) {
 +     LOG.warn(
 +         "Exception occurred while creating a list of all existing contexts with missing config!",
 +         e);
     }

Review comment: Yes. I tried several scenarios, and for each scenario I used Jiajun's script, which runs the test 50 times. The most effective solution is the one proposed in this PR. Please have a look at this line:

- Set existingWorkflowContexts = new HashSet<>(dataProvider.getContexts().keySet());

In a few cases (about 2 out of 50 runs), the line above is the only line that can generate a ConcurrentModificationException, and I eliminated that with the try-catch. The reason is that while we are collecting all of the existing contexts, the context map can be modified in the cache by other threads, so we get a ConcurrentModificationException. Please note that this part of the code runs asynchronously. @narendly, I don't have a strong preference for this method of using try-catch, and I would be happy if you could propose another way to get all of the contexts without hitting a ConcurrentModificationException.
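One alternative that avoids the exception entirely, if the cache could own a concurrent map, is ConcurrentHashMap: its iterators are weakly consistent and never throw ConcurrentModificationException. This is a hypothetical sketch of that alternative, not the approach the PR takes (it would require changing the cache's map type):

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class WeaklyConsistentIteration {
    public static void main(String[] args) {
        Map<String, String> contexts = new ConcurrentHashMap<>();
        contexts.put("workflow-1", "ctx");
        contexts.put("workflow-2", "ctx");

        // Mutating the map while iterating its keySet is safe here: the
        // weakly consistent iterator reflects some (possibly stale) state
        // of the map instead of failing fast.
        Set<String> snapshot = new HashSet<>();
        for (String key : contexts.keySet()) {
            contexts.putIfAbsent("added-during-iteration", "ctx");
            snapshot.add(key);
        }
        System.out.println("copied: " + snapshot.contains("workflow-1")
            + " " + snapshot.contains("workflow-2"));
    }
}
```

The trade-off is that the snapshot may or may not include entries added during the copy, which is generally acceptable for garbage collection that re-runs periodically.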
[GitHub] [helix] alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection URL: https://github.com/apache/helix/pull/741#discussion_r377433026 ## File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java ## @@ -1043,23 +1043,33 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf * @param dataProvider * @param manager */ - public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider, + public static void workflowGarbageCollection(final WorkflowControllerDataProvider dataProvider, final HelixManager manager) { // Garbage collections for conditions where workflow context exists but config is missing. -Map contexts = dataProvider.getContexts(); -HelixDataAccessor accessor = manager.getHelixDataAccessor(); -HelixPropertyStore propertyStore = manager.getHelixPropertyStore(); +// toBeDeletedWorkflows is a set that contains the name of the workflows that their contexts +// should be deleted. Set toBeDeletedWorkflows = new HashSet<>(); -for (Map.Entry entry : contexts.entrySet()) { - if (entry.getValue() != null - && entry.getValue().getId().equals(TaskUtil.WORKFLOW_CONTEXT_KW)) { -if (dataProvider.getWorkflowConfig(entry.getKey()) == null) { - toBeDeletedWorkflows.add(entry.getKey()); +try { + Set existingWorkflowContexts = new HashSet<>(dataProvider.getContexts().keySet()); + for (String entry : existingWorkflowContexts) { +if (entry != null) { + WorkflowConfig cfg = dataProvider.getWorkflowConfig(entry); + WorkflowContext ctx = dataProvider.getWorkflowContext(entry); + if (ctx != null && ctx.getId().equals(TaskUtil.WORKFLOW_CONTEXT_KW) && cfg == null) { +toBeDeletedWorkflows.add(entry); + } } } +} catch (Exception e) { + LOG.warn( + "Exception occurred while creating a list of all existing contexts with missing config!", + e); } Review comment: Yes. 
I tried several scenarios, and for each one I used Jiajun's script, which runs the test 50 times. The most effective solution is the one proposed in this PR. Please have a look at this line:

- Set existingWorkflowContexts = new HashSet<>(dataProvider.getContexts().keySet());

In a few cases (about 2 out of 50 runs) this is the only line that can throw ConcurrentModificationException, which I eliminated with the try-catch. The reason is that while we are collecting all of the existing contexts, the context map in the cache can be modified by other threads, so we get a ConcurrentModificationException. Please note that this part of the code runs asynchronously. @narendly I don't have a strong preference for this approach, and I would be happy if you can propose another way to get all of the contexts without hitting ConcurrentModificationException. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
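[Editor's note] The failure mode discussed above can be reproduced deterministically in a single thread: HashMap's iterator is fail-fast, and copying the keySet into a new HashSet iterates it internally, which is why the defensive copy itself can throw. A minimal sketch (the class name and workflow keys are illustrative, not from the Helix codebase):

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class KeySetCopyDemo {
  public static void main(String[] args) {
    Map<String, String> contexts = new HashMap<>();
    contexts.put("Workflow_1", "context1");
    contexts.put("Workflow_2", "context2");

    // A structural modification during iteration trips HashMap's fail-fast
    // check. In the controller the modification comes from another thread
    // updating the cache; here we force it in the loop body so the failure
    // is deterministic.
    boolean observedCme = false;
    try {
      for (String key : contexts.keySet()) {
        contexts.put("Workflow_3", "context3");
      }
    } catch (ConcurrentModificationException e) {
      observedCme = true;
    }
    System.out.println("CME observed: " + observedCme);

    // The pattern from the PR: take a snapshot inside try-catch and iterate
    // the snapshot. If the copy itself races with a writer, skip this GC
    // round and let the next pipeline run retry.
    Set<String> snapshot;
    try {
      snapshot = new HashSet<>(contexts.keySet());
    } catch (ConcurrentModificationException e) {
      snapshot = new HashSet<>();
    }
    System.out.println("Snapshot size: " + snapshot.size());
  }
}
```

Running this prints `CME observed: true` and `Snapshot size: 3`: the put completes before the iterator's next() throws, so the later snapshot sees all three entries.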
[GitHub] [helix] alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
URL: https://github.com/apache/helix/pull/741#discussion_r377418237

File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java

@@ -1043,23 +1043,35 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
    * @param dataProvider
    * @param manager
    */
-  public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider,
+  public static void workflowGarbageCollection(final WorkflowControllerDataProvider dataProvider,
       final HelixManager manager) {
     // Garbage collections for conditions where workflow context exists but config is missing.
-    Map contexts = dataProvider.getContexts();
-    HelixDataAccessor accessor = manager.getHelixDataAccessor();
-    HelixPropertyStore propertyStore = manager.getHelixPropertyStore();
+    // toBeDeletedWorkflows is a set that contains the name of the workflows that their contexts
+    // should be deleted.
     Set toBeDeletedWorkflows = new HashSet<>();
-    for (Map.Entry entry : contexts.entrySet()) {
-      if (entry.getValue() != null
-          && entry.getValue().getId().equals(TaskUtil.WORKFLOW_CONTEXT_KW)) {
-        if (dataProvider.getWorkflowConfig(entry.getKey()) == null) {
-          toBeDeletedWorkflows.add(entry.getKey());
+    try {
+      Set existingWorkflowContexts = new HashSet<>(dataProvider.getContexts().keySet());
+      for (String entry : existingWorkflowContexts) {
+        if (entry != null) {
+          WorkflowConfig cfg = dataProvider.getWorkflowConfig(entry);
+          WorkflowContext ctx = dataProvider.getWorkflowContext(entry);
+          if (ctx != null && ctx.getId().equals(TaskUtil.WORKFLOW_CONTEXT_KW)) {
+            if (cfg == null) {
+              toBeDeletedWorkflows.add(entry);
+            }
+          }
         }
       }
+    } catch (Exception e) {
+      LOG.warn(String.format(
+          "Exception occurred while creating a list of all existing contexts with missing config!! Reason: %s"),

Review comment: Done.
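[Editor's note] For context on the LOG.warn review above: the earlier revision wrapped the message in String.format with a %s placeholder but no matching argument, so the format call itself fails before anything is logged and the throwable never reaches the logger. A small sketch of the failure (class name and message text are abbreviated stand-ins):

```java
import java.util.MissingFormatArgumentException;

public class LogFormatDemo {
  public static void main(String[] args) {
    // Buggy shape from the earlier revision: the "%s" placeholder has no
    // matching argument, so String.format throws before LOG.warn would
    // even be invoked.
    boolean formatFailed = false;
    try {
      String.format("Exception occurred while creating contexts list! Reason: %s");
    } catch (MissingFormatArgumentException e) {
      formatFailed = true;
    }
    System.out.println("String.format threw: " + formatFailed);

    // The merged fix (visible in the final patch) drops String.format and
    // passes the throwable as the second argument, i.e.
    // LOG.warn("Exception occurred ...", e), so the logging framework
    // prints both the message and the stack trace.
  }
}
```

This prints `String.format threw: true`, which is exactly why the reviewer asked for the logging call to be changed.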
[GitHub] [helix] alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
alirezazamani commented on a change in pull request #741: Fix ConcurrentModification exception in Workflow Garbage Collection
URL: https://github.com/apache/helix/pull/741#discussion_r377418194

File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java

@@ -1043,23 +1043,35 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
    * @param dataProvider
    * @param manager
    */
-  public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider,
+  public static void workflowGarbageCollection(final WorkflowControllerDataProvider dataProvider,
       final HelixManager manager) {
     // Garbage collections for conditions where workflow context exists but config is missing.
-    Map contexts = dataProvider.getContexts();
-    HelixDataAccessor accessor = manager.getHelixDataAccessor();
-    HelixPropertyStore propertyStore = manager.getHelixPropertyStore();
+    // toBeDeletedWorkflows is a set that contains the name of the workflows that their contexts
+    // should be deleted.
     Set toBeDeletedWorkflows = new HashSet<>();
-    for (Map.Entry entry : contexts.entrySet()) {
-      if (entry.getValue() != null
-          && entry.getValue().getId().equals(TaskUtil.WORKFLOW_CONTEXT_KW)) {
-        if (dataProvider.getWorkflowConfig(entry.getKey()) == null) {
-          toBeDeletedWorkflows.add(entry.getKey());
+    try {
+      Set existingWorkflowContexts = new HashSet<>(dataProvider.getContexts().keySet());
+      for (String entry : existingWorkflowContexts) {
+        if (entry != null) {
+          WorkflowConfig cfg = dataProvider.getWorkflowConfig(entry);
+          WorkflowContext ctx = dataProvider.getWorkflowContext(entry);
+          if (ctx != null && ctx.getId().equals(TaskUtil.WORKFLOW_CONTEXT_KW)) {
+            if (cfg == null) {

Review comment: Done.