[ https://issues.apache.org/jira/browse/TRAFODION-3318?focusedWorklogId=296787&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296787 ]
ASF GitHub Bot logged work on TRAFODION-3318:
---------------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Aug/19 14:36
            Start Date: 17/Aug/19 14:36
    Worklog Time Spent: 10m
      Work Description: Traf-Jenkins commented on issue #1854: [TRAFODION-3318] Changed process management rules for DTM process:
URL: https://github.com/apache/trafodion/pull/1854#issuecomment-522242934

   Test Failed. https://jenkins.esgyn.com/job/Check-PR-master/3258/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

            Worklog Id:     (was: 296787)
            Time Spent: 2h  (was: 1h 50m)
    Remaining Estimate: 118h  (was: 118h 10m)

> Change process management of DTM to improve HA behavior
> -------------------------------------------------------
>
>                 Key: TRAFODION-3318
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-3318
>             Project: Apache Trafodion
>          Issue Type: Improvement
>          Components: dtm, foundation
>    Affects Versions: 2.4
>            Reporter: Gonzalo E Correa
>            Priority: Major
>             Fix For: 2.4
>
>   Original Estimate: 120h
>          Time Spent: 2h
>  Remaining Estimate: 118h
>
> The current process management model for process type DTM enforces a soft
> node down behavior, which kills all processes in a node where a DTM process
> terminates abnormally. The monitor then recreates the DTM process along with
> all persistent processes hosted in that node.
> To reduce the fault zone impact, this change removes the soft node down/up
> functionality so that the DTM process is recreated without killing all other
> processes in the node. One rule will still be enforced: if the persistent DTM
> process cannot be restarted within the configured number of retries in the
> specified time window, the node is brought down.
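
The restart rule in the issue description can be illustrated with a minimal sketch. This is not the Trafodion monitor's code: the class name `RestartPolicy`, the method `on_abnormal_exit`, the returned action strings, and the sliding-window interpretation of "retries in the specified time window" are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class RestartPolicy:
    """Hypothetical model of the rule: allow at most `max_retries`
    restarts of the DTM process within `window_seconds`."""
    max_retries: int
    window_seconds: float
    # Timestamps of restarts that are still inside the window.
    _restarts: deque = field(default_factory=deque)

    def on_abnormal_exit(self, now: float) -> str:
        """Decide what to do when the DTM process dies at time `now`."""
        # Drop restarts that have aged out of the window (assumed sliding).
        while self._restarts and now - self._restarts[0] >= self.window_seconds:
            self._restarts.popleft()
        if len(self._restarts) >= self.max_retries:
            # Retries exhausted inside the window: escalate to a node down.
            return "NODE_DOWN"
        # Otherwise recreate only the DTM process; other processes keep running.
        self._restarts.append(now)
        return "RESTART_DTM_ONLY"


if __name__ == "__main__":
    policy = RestartPolicy(max_retries=2, window_seconds=60.0)
    for t in (0.0, 10.0, 20.0):
        print(t, policy.on_abnormal_exit(t))
```

In this sketch, two failures inside the window each restart only the DTM process, and a third failure inside the same window escalates to a node down. The real monitor may count retries or measure the window differently.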
-- This message was sent by Atlassian JIRA (v7.6.14#76016)