ruanwenjun commented on a change in pull request #5572:
URL: https://github.com/apache/dolphinscheduler/pull/5572#discussion_r644648690



##########
File path: 
dolphinscheduler-server/src/main/java/org/apache/dolphinscheduler/server/master/cache/impl/TaskInstanceCacheManagerImpl.java
##########
@@ -57,7 +86,7 @@
     @Override
     public TaskInstance getByTaskInstanceId(Integer taskInstanceId) {
         TaskInstance taskInstance = taskInstanceCache.get(taskInstanceId);

Review comment:
       @blackberrier I think you can change the entire method to :), 
   ```
   return taskInstanceCache.computeIfAbsent(taskInstanceId, k 
->processService.findTaskInstanceById(k));
   ```

##########
File path: 
dolphinscheduler-server/src/main/java/org/apache/dolphinscheduler/server/master/cache/impl/TaskInstanceCacheManagerImpl.java
##########
@@ -57,7 +86,7 @@
     @Override
     public TaskInstance getByTaskInstanceId(Integer taskInstanceId) {
         TaskInstance taskInstance = taskInstanceCache.get(taskInstanceId);

Review comment:
       @blackberrier You can change the entire method to 
   ```
   return taskInstanceCache.computeIfAbsent(taskInstanceId, k -> 
processService.findTaskInstanceById(k));
   ```

##########
File path: 
dolphinscheduler-server/src/main/java/org/apache/dolphinscheduler/server/master/cache/impl/TaskInstanceCacheManagerImpl.java
##########
@@ -57,7 +86,7 @@
     @Override
     public TaskInstance getByTaskInstanceId(Integer taskInstanceId) {
         TaskInstance taskInstance = taskInstanceCache.get(taskInstanceId);

Review comment:
       @blackberrier You can change the entire method to 
   ```
   return taskInstanceCache.computeIfAbsent(taskInstanceId, k -> 
processService.findTaskInstanceById(k));
   ```
   And you can add some ut.

##########
File path: 
dolphinscheduler-server/src/main/java/org/apache/dolphinscheduler/server/master/cache/impl/TaskInstanceCacheManagerImpl.java
##########
@@ -57,7 +86,7 @@
     @Override
     public TaskInstance getByTaskInstanceId(Integer taskInstanceId) {
         TaskInstance taskInstance = taskInstanceCache.get(taskInstanceId);

Review comment:
       @blackberrier I mean you can change this method to 
   ```
   public TaskInstance getByTaskInstanceId(Integer taskInstanceId) {
   return taskInstanceCache.computeIfAbsent(taskInstanceId, k -> 
processService.findTaskInstanceById(k));
   }
   ```
   This is an atomic operation.

##########
File path: 
dolphinscheduler-server/src/main/java/org/apache/dolphinscheduler/server/master/cache/impl/TaskInstanceCacheManagerImpl.java
##########
@@ -57,7 +86,7 @@
     @Override
     public TaskInstance getByTaskInstanceId(Integer taskInstanceId) {
         TaskInstance taskInstance = taskInstanceCache.get(taskInstanceId);

Review comment:
       @blackberrier 
   If there are two threads execute `getByTaskInstanceId` at the same time, we 
just want to query from database once, right? The block will only happen on the 
two thread update on the same key, and the key is not exist, this is expected. 
   
   And you should realize that the below way is not an atomic operation, in 
some situation, it might cause problem.
   ```java
   TaskInstance taskInstance = taskInstanceCache.get(taskInstanceId);
   if (taskInstance == null){
       taskInstance = processService.findTaskInstanceById(taskInstanceId);
       taskInstanceCache.put(taskInstanceId,taskInstance);
   }
   return taskInstance;
   ```

##########
File path: 
dolphinscheduler-server/src/main/java/org/apache/dolphinscheduler/server/master/cache/impl/TaskInstanceCacheManagerImpl.java
##########
@@ -57,7 +86,7 @@
     @Override
     public TaskInstance getByTaskInstanceId(Integer taskInstanceId) {
         TaskInstance taskInstance = taskInstanceCache.get(taskInstanceId);

Review comment:
       @blackberrier 
   
   > the block may happen on two thread on two key, and the two key share the 
same hashcode
   
   Yes, but we can avoid the 1 step.
   ```
   1. taskInstance = processService.findTaskInstanceById(taskInstanceId);
   2. TaskInstance finalTaskInstance = taskInstance;
   3. taskInstanceCache.computeIfAbsent(taskInstanceId, k -> finalTaskInstance);
   ```

##########
File path: 
dolphinscheduler-server/src/main/java/org/apache/dolphinscheduler/server/master/cache/impl/TaskInstanceCacheManagerImpl.java
##########
@@ -57,7 +86,7 @@
     @Override
     public TaskInstance getByTaskInstanceId(Integer taskInstanceId) {
         TaskInstance taskInstance = taskInstanceCache.get(taskInstanceId);

Review comment:
       @blackberrier I mean you can use below code
   ```
   public TaskInstance getByTaskInstanceId(Integer taskInstanceId) {
   return taskInstanceCache.computeIfAbsent(taskInstanceId, k -> 
processService.findTaskInstanceById(k));
   }
   ```
   The cache is used to reduce database pressure, in your plan, if there are 
multiple threads query the same taskInstance at the same time, they will all 
query from database.
   Your consideration is that when conflicts occur, multiple thread query will 
block. I think if conflicts occur, if we don't want to be blocked, we should 
find a better hash method to reduce conflicts. 
   
   In additional, we can use one line of code to achieve, why use more complex 
code?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to