rpuch commented on code in PR #7250:
URL: https://github.com/apache/ignite-3/pull/7250#discussion_r2652421434
##########
modules/page-memory/src/main/java/org/apache/ignite/internal/pagememory/persistence/checkpoint/CheckpointReadWriteLock.java:
##########
@@ -35,7 +35,7 @@ public class CheckpointReadWriteLock {
*/
static final String CHECKPOINT_RUNNER_THREAD_PREFIX = "checkpoint-runner";
- private static final long LONG_LOCK_THRESHOLD_MILLIS = 1000;
+ private static final long LONG_LOCK_THRESHOLD_NANOS = 1_000_000;
Review Comment:
It was 1000 millis before, but now it's 1 millisecond. Is this change
intentional?
##########
modules/page-memory/src/test/java/org/apache/ignite/internal/pagememory/persistence/checkpoint/CheckpointTimeoutLockTest.java:
##########
@@ -455,8 +456,27 @@ void testCheckpointReadLockMetrics() throws Exception {
});
await().untilAsserted(() ->
assertThat(metrics.readLockWaitingThreads().value(), is(1L)));
readWriteLock.writeUnlock();
+ assertThat(metrics.readLockWaitingThreads().value(), is(0L));
} finally {
timeoutLock.stop();
}
}
+
+ /**
+ * Verifies that the specified distribution metric has recorded the
expected total number of measurements.
+ *
+ * <p>
+ * Rather than checking individual histogram buckets, this method
aggregates all recorded measurements across every bucket
+ * and confirms that the expected interaction was captured in at least one
of them.
+ * This approach mitigates test flakiness that can arise from timing
variations — such as those caused by slower test agents —
+ * by focusing on the overall count rather than precise bucket placement.
Review Comment:
I would remove this. It's a part of a discussion that seems to be external
to the resulting code
##########
modules/page-memory/src/main/java/org/apache/ignite/internal/pagememory/persistence/checkpoint/CheckpointReadWriteLock.java:
##########
@@ -196,20 +196,20 @@ private void onReadLock(long startNanos, boolean taken) {
metrics.decrementReadLockWaitingThreads();
long currentNanos = System.nanoTime();
- long elapsedNanos = TimeUnit.NANOSECONDS.toMillis(currentNanos -
startNanos);
+ long elapsedNanos = currentNanos - startNanos;
if (taken) {
int newLockCount = checkpointReadLockHoldCount.get() + 1;
checkpointReadLockHoldCount.set(newLockCount);
- // We only record acquisition time on first lock acquisition (not
on reentrant locks).
+ // We only record acquisition time on first lock acquisition (not
on reentry).
if (newLockCount == 1) {
checkpointReadLockAcquiredTime.set(currentNanos);
}
metrics.recordReadLockAcquisitionTime(elapsedNanos);
}
- if (elapsedNanos > LONG_LOCK_THRESHOLD_MILLIS) {
+ if (elapsedNanos > LONG_LOCK_THRESHOLD_NANOS) {
log.warn(LONG_LOCK_THROTTLE_KEY, "Checkpoint read lock took {} ms
to acquire.", elapsedNanos);
Review Comment:
Again, millis vs nanos. Let's audit all occurrences of the affected
variables and constants
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]