Re: [PR] Improve ResourceCountCheckTask and recalculation by not locking DB [cloudstack]

via GitHub Mon, 09 Oct 2023 02:10:12 -0700


DaanHoogland commented on code in PR #7977:
URL: https://github.com/apache/cloudstack/pull/7977#discussion_r1350032294



##########
server/src/main/java/com/cloud/resourcelimit/ResourceLimitManagerImpl.java:
##########
@@ -1171,6 +1178,27 @@ public ResourceCountCheckTask() {
 
         @Override
         protected void runInContext() {
+            GlobalLock lock = GlobalLock.getInternLock("ResourceCheckTask");

Review Comment:
   :D I think this warrants renaming the PR :p



##########
server/src/main/java/com/cloud/resourcelimit/ResourceLimitManagerImpl.java:
##########
@@ -1171,6 +1178,27 @@ public ResourceCountCheckTask() {
 
         @Override
         protected void runInContext() {
+            GlobalLock lock = GlobalLock.getInternLock("ResourceCheckTask");
+            try {
+                if (lock.lock(30)) {
+                    ManagementServerHostVO msHost = 
managementServerHostDao.findOneByLongestRuntime();
+                    if (msHost == null || (msHost.getMsid() != 
ManagementServerNode.getManagementServerId())) {
+                        s_logger.debug("Skipping the resource counters 
recalculation task on this management server");
+                        lock.unlock();
+                        return;

Review Comment:
   does this mean the `lock.releaseRef()` is not needed?



##########
framework/cluster/src/main/java/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java:
##########
@@ -272,4 +274,14 @@ public ManagementServerHostVO findOneInUpState(Filter 
filter) {
         return null;
     }
 
+    @Override
+    public ManagementServerHostVO findOneByLongestRuntime() {
+        SearchCriteria<ManagementServerHostVO> sc = StateSearch.create();
+        sc.setParameters("state", ManagementServerHost.State.Up);
+        sc.setParameters("runid", 0);
+        Filter filter = new Filter(ManagementServerHostVO.class, "runid", 
true, 0L, 1L);

Review Comment:
   so we order by runid and assume that means the first one is the longest 
running.
   I think this will work, but there is the issue of re-joining the cluster and 
time skews (this is about milliseconds. I don´t think these will be issues for 
this functionality but just giving a heads-up. Maybe someone else knows??



##########
server/src/main/java/com/cloud/resourcelimit/ResourceLimitManagerImpl.java:
##########
@@ -1171,6 +1178,27 @@ public ResourceCountCheckTask() {
 
         @Override
         protected void runInContext() {
+            GlobalLock lock = GlobalLock.getInternLock("ResourceCheckTask");
+            try {
+                if (lock.lock(30)) {
+                    ManagementServerHostVO msHost = 
managementServerHostDao.findOneByLongestRuntime();
+                    if (msHost == null || (msHost.getMsid() != 
ManagementServerNode.getManagementServerId())) {
+                        s_logger.debug("Skipping the resource counters 
recalculation task on this management server");
+                        lock.unlock();

Review Comment:
   both branches of the if have the unlock() call, maybe bring it outside (as 
it does not seem to depend on the if.



##########
server/src/main/java/com/cloud/resourcelimit/ResourceLimitManagerImpl.java:
##########
@@ -1171,6 +1178,27 @@ public ResourceCountCheckTask() {
 
         @Override
         protected void runInContext() {
+            GlobalLock lock = GlobalLock.getInternLock("ResourceCheckTask");
+            try {
+                if (lock.lock(30)) {
+                    ManagementServerHostVO msHost = 
managementServerHostDao.findOneByLongestRuntime();
+                    if (msHost == null || (msHost.getMsid() != 
ManagementServerNode.getManagementServerId())) {
+                        s_logger.debug("Skipping the resource counters 
recalculation task on this management server");

Review Comment:
   maybe make this a trace message?
   ```suggestion
                           s_logger.trace("Skipping the resource counters 
recalculation task on this management server");
   ```



##########
framework/cluster/src/main/java/com/cloud/cluster/dao/ManagementServerHostDaoImpl.java:
##########
@@ -272,4 +274,14 @@ public ManagementServerHostVO findOneInUpState(Filter 
filter) {
         return null;
     }
 
+    @Override
+    public ManagementServerHostVO findOneByLongestRuntime() {
+        SearchCriteria<ManagementServerHostVO> sc = StateSearch.create();
+        sc.setParameters("state", ManagementServerHost.State.Up);
+        sc.setParameters("runid", 0);
+        Filter filter = new Filter(ManagementServerHostVO.class, "runid", 
true, 0L, 1L);

Review Comment:
   so we order by runid and assume that means the first one is the longest 
running.
   I think this will work, but there is the issue of re-joining the cluster and 
time skews (this is about milliseconds. I don´t think these will be issues for 
this functionality but just giving a heads-up. Maybe someone else knows??



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Re: [PR] Improve ResourceCountCheckTask and recalculation by not locking DB [cloudstack]

Reply via email to