CalvinKirs opened a new pull request, #39099:
URL: https://github.com/apache/doris/pull/39099
## Proposed changes
#39015
### Description:
This issue proposes the addition of new features to the project, including a
deadlock detection tool and monitored lock implementations. These features will
help in identifying and debugging potential deadlocks and monitoring lock
usage. Features:
#### AbstractMonitoredLock:
A monitored version of Lock that tracks and logs lock acquisition and
release times. Functionality:
Overrides lock(), unlock(), tryLock(), and tryLock(long timeout, TimeUnit
unit) methods. Logs information about lock acquisition time, release time, and
any failure to acquire the lock within the specified timeout. ##### eg
```log
2024-08-07 12:02:59 [ Thread-2:2006 ] - [ WARN ] Thread ID: 12, Thread
Name: Thread-2 - Lock held for 1912 ms, exceeding hold timeout of 1000 ms
Thread stack trace:
at java.lang.Thread.getStackTrace(Thread.java:1564)
at
org.example.lock.AbstractMonitoredLock.afterUnlock(AbstractMonitoredLock.java:49)
at
org.example.lock.MonitoredReentrantLock.unlock(MonitoredReentrantLock.java:32)
at org.example.ExampleService.timeout(ExampleService.java:17)
at org.example.Main.lambda$test2$1(Main.java:39)
at java.lang.Thread.run(Thread.java:750)
```
#### DeadlockCheckerTool:
Uses ScheduledExecutorService for periodic deadlock checks. Logs deadlock
information including thread names, states, lock info, and stack traces.
**ThreadMXBean accesses thread information in the local JVM, which is
already in memory, so accessing it is less expensive than fetching data from
external resources such as disk or network. Thread state cache: The JVM
typically maintains a cache of thread states, reducing the need for real-time
calculations or additional data processing.** ##### eg
```log
Thread Name: Thread-0
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Lock Owner Name: Thread-1
Lock Owner Id: 12
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Stack Trace:
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at
org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
at org.example.Main.lambda$testDeadLock$3(Main.java:79)
at org.example.Main$$Lambda$1/1221555852.run(Unknown Source)
at java.lang.Thread.run(Thread.java:750)
2024-08-07 14:11:28 [ pool-1-thread-1:2001 ] - [ WARN ] Deadlocks detected:
Thread Name: Thread-1
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Lock Owner Name: Thread-0
Lock Owner Id: 11
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Stack Trace:
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at
org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
at org.example.Main.lambda$testDeadLock$4(Main.java:93)
at org.example.Main$$Lambda$2/1556956098.run(Unknown Source)
at java.lang.Thread.run(Thread.java:750)
```
##### benchmark
```
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
@Threads(1)
Benchmark Mode Cnt
Score Error Units
LockBenchmark.testMonitoredLock thrpt 2
15889.407 ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate thrpt 2
678.061 MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm thrpt 2
56.000 B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space thrpt 2
668.249 MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm thrpt 2
55.080 B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space thrpt 2
0.075 MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm thrpt 2
0.006 B/op
LockBenchmark.testMonitoredLock:·gc.count thrpt 2
20.000 counts
LockBenchmark.testMonitoredLock:·gc.time thrpt 2
6.000 ms
LockBenchmark.testNativeLock thrpt 2
103130.635 ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate thrpt 2
≈ 10⁻⁴ MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm thrpt 2
≈ 10⁻⁶ B/op
LockBenchmark.testNativeLock:·gc.count thrpt 2
≈ 0 counts
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
@Threads(100)
Benchmark Mode Cnt
Score Error Units
LockBenchmark.testMonitoredLock thrpt 2
10994.606 ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate thrpt 2
488.508 MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm thrpt 2
56.002 B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space thrpt 2
481.390 MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm thrpt 2
55.163 B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space thrpt 2
0.020 MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm thrpt 2
0.002 B/op
LockBenchmark.testMonitoredLock:·gc.count thrpt 2
18.000 counts
LockBenchmark.testMonitoredLock:·gc.time thrpt 2
9.000 ms
LockBenchmark.testNativeLock thrpt 2
558652.036 ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate thrpt 2
0.016 MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm thrpt 2
≈ 10⁻⁴ B/op
LockBenchmark.testNativeLock:·gc.count thrpt 2
≈ 0 counts
```
## Proposed changes
Issue Number: close #xxx
<!--Describe your changes.-->
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]