pnowojski commented on pull request #18173:
URL: https://github.com/apache/flink/pull/18173#issuecomment-1003042631


   There are 3 locks being acquired in the context of this ticket:
   1. `LocalBufferPool#availableMemorySegments`
   2. `NetworkBufferPool#factoryLock`
   3. `NetworkBufferPool#availableMemorySegments`
   
   I completely forgot about the 3rd one, but it looks like it has 
short/shallow scope and it's not causing any problems. In my previous message 
by "network buffer pool lock" I meant 2. and by "local buffer pool lock" I 
meant 1. Having said that...
   
   > I find that in local buffer pool there is no method which really needs to 
acquire the network buffer pool lock.
   
   both 2. and 3. are being acquired from the `LocalBufferPool`, albeit 
indirectly. So yes, I had in mind some construct like you mentioned:
   ```
   synchronized (factoryLock) {
   synchronized (availableMemorySegments) {
   // do something
   }
   }
   ```
   or maybe:
   ```
   val flag
   synchronized (availableMemorySegments) {
     flag = networkBufferPool.doSomething();
   }
   if (flag) {
     synchronized (factoryLock) {
       doSomethingElse();
     }
   }
   ```
   In other words, in one way or another, avoid situations where we first 
acquire 1. and later 2. in the same stack trace.
   
   > So always locking the network buffer pool lock when call the network 
buffer pool method may incurs extra overhead. 
   
   Yes, I agree that we should avoid it. We shouldn't be acquiring the 
`factoryLock` if not needed. Definitely we shouldn't be doing it per every 
written/recycled buffer. However once per input gate/input channel setup should 
be fine.
   
   I guess in `LocalBufferPool` we can have four options:
   I. Preemptively acquire `factoryLock` (2.) in methods that are used rarely
   II. Decouple logic in two steps. First do something under the lock 1., 
release lock 1., as second do something that requires lock 2.
   III. First acquire lock 1., but if it turns out that we will need to acquire 
lock 2., return from the `NetworkBufferPool` back to the `LocalBufferPool`, 
release lock 1., acquire lock 2. and then acquire lock 1. again.
   IV. Never acquire lock 2.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to