lhotari commented on code in PR #23634:
URL: https://github.com/apache/pulsar/pull/23634#discussion_r1860204410
##########
pulsar-broker-common/src/main/java/org/apache/pulsar/common/configuration/VipStatus.java:
##########
@@ -38,25 +44,71 @@ public class VipStatus {
public static final String ATTRIBUTE_STATUS_FILE_PATH = "statusFilePath";
public static final String ATTRIBUTE_IS_READY_PROBE = "isReadyProbe";
+ // log a full thread dump when a deadlock is detected in status check once
every 10 minutes
+ // to prevent excessive logging
+ private static final long LOG_THREADDUMP_INTERVAL_WHEN_DEADLOCK_DETECTED =
600000L;
+ private static volatile long lastCheckStatusTimestamp;
+
+ // Rate limit status checks to every 500ms to prevent DoS
+ private static final long CHECK_STATUS_INTERVAL = 500L;
+ private static volatile boolean lastCheckStatusResult;
+
@Context
protected ServletContext servletContext;
@GET
public String checkStatus() {
- String statusFilePath = (String)
servletContext.getAttribute(ATTRIBUTE_STATUS_FILE_PATH);
- @SuppressWarnings("unchecked")
- Supplier<Boolean> isReadyProbe = (Supplier<Boolean>)
servletContext.getAttribute(ATTRIBUTE_IS_READY_PROBE);
+ // Locking classes to avoid deadlock detection in multi-thread
concurrent requests.
+ synchronized (VipStatus.class) {
+ if (System.currentTimeMillis() - lastCheckStatusTimestamp <
CHECK_STATUS_INTERVAL) {
+ lastCheckStatusTimestamp = System.currentTimeMillis();
Review Comment:
> Your suggestion is not to need synchronization, right? If there is no code
synchronization, will there be a massive number of DoS attacks that execute
deadlock detection logic simultaneously, causing a surge in node load.
I'm not talking about synchronization. If `lastCheckStatusTimestamp` gets
updated here, it breaks the logic.
The previous DoS concern was about calling the deadlock detection since that
has a higher cost than before. Jetty has separate
[DoSFilter](https://javadoc.jetty.org/jetty-9/org/eclipse/jetty/servlets/DoSFilter.html)
for adding basic DoS protection. In general Pulsar isn't protected against
malicious DoS attacks and it's not meant to be exposed on the public internet.
In this case, we just need to avoid adding a lot of extra cost compared to the
previous version.
The original suggestion was to rate limit the deadlock check, however it's
fine to also rate limit the file existence check. By removing this line, I
believe that the logic would be fine. However, adding tests would be helpful to
ensure this. There could be unit tests with mocking to ensure it. That might
require further refactoring so that mocks could be properly injected.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]