codeant-ai-for-open-source[bot] commented on code in PR #40856:
URL: https://github.com/apache/superset/pull/40856#discussion_r3374983337


##########
superset-websocket/src/index.ts:
##########
@@ -159,6 +159,37 @@ export const setLastFirehoseId = (id: string): void => {
   lastFirehoseId = id;
 };
 
+// WebSocket close code used when a connection is refused because a configured
+// connection limit has been reached (1013 = "Try Again Later").
+const CONNECTION_LIMIT_CLOSE_CODE = 1013;
+
+/**
+ * Determines whether accepting a new connection on the given channel would
+ * exceed a configured connection limit. Returns a human-readable reason when a
+ * limit is reached, or `null` when the connection is within limits.
+ *
+ * Both limits are opt-in: a value of `0` (the default) disables the check.
+ */
+export const connectionLimitReason = (channel: string): string | null => {
+  const { maxTotalConnections, maxConnectionsPerChannel } = opts;
+
+  if (
+    maxTotalConnections > 0 &&
+    Object.keys(sockets).length >= maxTotalConnections
+  ) {
+    return `total connection limit (${maxTotalConnections}) reached`;
+  }

Review Comment:
   **Suggestion:** The total-limit check counts every entry in the global 
socket registry, including sockets that are already closed but not yet 
garbage-collected. Because cleanup is periodic, new connections can be 
incorrectly rejected for a while after disconnects. Count only active sockets 
(or remove sockets immediately on close) before enforcing the total cap. [stale 
reference]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ WebSocket connections rejected despite available capacity.
   - ⚠️ Users may miss async event stream updates.
   - ⚠️ Capacity planning via MAX_TOTAL_CONNECTIONS becomes unreliable.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Configure a positive total connection cap by setting 
`MAX_TOTAL_CONNECTIONS=1` so
   `opts.maxTotalConnections` is > 0 in `buildConfig()`
   (`superset-websocket/src/config.ts:52-55,128-130`), then start the websocket 
sidecar so
   `startServer` is true and server startup code runs
   (`superset-websocket/src/index.ts:141-147,507-524`).
   
   2. Establish a first WebSocket connection through the normal HTTP upgrade 
path:
   `httpUpgrade()` validates the JWT (`readChannelId` at `index.ts:318-331`), 
calls
   `wss.handleUpgrade`, which emits the `connection` event handled by 
`wsConnection`
   (`index.ts:356-365,515-516`). In `wsConnection`, 
`connectionLimitReason(channel)` is
   called (`index.ts:359-362`) and, because `sockets` is initially empty, 
returns `null`;
   `trackClient` then adds the socket to the global `sockets` registry 
(`index.ts:196-204`).
   
   3. Close the client's WebSocket (normal browser tab close or deliberate 
disconnect). There
   is no `ws.on('close', ...)` handler anywhere in the codebase 
(`superset-websocket/src`
   searched via Grep), so closed sockets are not removed immediately from 
`sockets`. Instead,
   they are only removed when `checkSockets()` runs periodically 
(`index.ts:458-483`),
   invoked via `setInterval(checkSockets, opts.pingSocketsIntervalMs)` 
(`index.ts:524`).
   Until the next `checkSockets` tick, the closed connection's entry remains in 
`sockets`.
   
   4. Before `checkSockets()` has run and deleted the stale entry, initiate a 
second
   WebSocket connection. `wsConnection` runs again (`index.ts:356-368`) and 
calls
   `connectionLimitReason(channel)` (`index.ts:359-362`), whose 
total-connection branch
   checks `Object.keys(sockets).length >= maxTotalConnections` 
(`index.ts:176-180`). Because
   the stale closed socket is still present in `sockets`, 
`Object.keys(sockets).length` is 1
   and `maxTotalConnections` is 1, so the condition is true. The new connection 
is
   incorrectly rejected: `statsd.increment('ws_connection_rejected')`, a 
warning is logged,
   and `ws.close(CONNECTION_LIMIT_CLOSE_CODE, limitReason)` is called with code 
1013
   (`index.ts:359-366`), even though there are zero active WebSocket 
connections.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=866fd67df7484f4ba2f4e702f62112aa&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=866fd67df7484f4ba2f4e702f62112aa&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset-websocket/src/index.ts
   **Line:** 176:181
   **Comment:**
        *Stale Reference: The total-limit check counts every entry in the 
global socket registry, including sockets that are already closed but not yet 
garbage-collected. Because cleanup is periodic, new connections can be 
incorrectly rejected for a while after disconnects. Count only active sockets 
(or remove sockets immediately on close) before enforcing the total cap.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40856&comment_hash=28a8e61e0c9666cf6a0632fcc36a94ac0ee4fc9957f73ce39c6e53607109d68a&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40856&comment_hash=28a8e61e0c9666cf6a0632fcc36a94ac0ee4fc9957f73ce39c6e53607109d68a&reaction=dislike'>👎</a>



##########
superset-websocket/src/index.ts:
##########
@@ -159,6 +159,37 @@ export const setLastFirehoseId = (id: string): void => {
   lastFirehoseId = id;
 };
 
+// WebSocket close code used when a connection is refused because a configured
+// connection limit has been reached (1013 = "Try Again Later").
+const CONNECTION_LIMIT_CLOSE_CODE = 1013;
+
+/**
+ * Determines whether accepting a new connection on the given channel would
+ * exceed a configured connection limit. Returns a human-readable reason when a
+ * limit is reached, or `null` when the connection is within limits.
+ *
+ * Both limits are opt-in: a value of `0` (the default) disables the check.
+ */
+export const connectionLimitReason = (channel: string): string | null => {
+  const { maxTotalConnections, maxConnectionsPerChannel } = opts;
+
+  if (
+    maxTotalConnections > 0 &&
+    Object.keys(sockets).length >= maxTotalConnections
+  ) {
+    return `total connection limit (${maxTotalConnections}) reached`;
+  }
+
+  if (maxConnectionsPerChannel > 0) {
+    const channelSize = channels[channel]?.sockets.length ?? 0;
+    if (channelSize >= maxConnectionsPerChannel) {
+      return `per-channel connection limit (${maxConnectionsPerChannel}) 
reached`;
+    }

Review Comment:
   **Suggestion:** The per-channel limit uses the raw channel socket-id array 
length, but that array can contain stale IDs after sockets have been removed 
from the global registry and before `cleanChannel` runs. This can reject valid 
new connections on a channel that has no active sockets. Derive the count from 
active socket instances (or clean the channel first) when enforcing this limit. 
[stale reference]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Single channel can be locked out after disconnect.
   - ⚠️ Clients may be unable to reconnect to channels.
   - ⚠️ Per-channel capacity limits misrepresent actual active sockets.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Configure a positive per-channel cap by setting 
`MAX_CONNECTIONS_PER_CHANNEL=1` so
   `opts.maxConnectionsPerChannel` is > 0 in `buildConfig()`
   (`superset-websocket/src/config.ts:53-55,128-129`), then start the websocket 
sidecar so
   the server startup block runs (`index.ts:507-524`), wiring 
`wss.on('connection',
   wsConnection)` (`index.ts:112-116`).
   
   2. Open a first WebSocket connection on some channel (identified via JWT, 
parsed by
   `readChannelId` at `index.ts:318-331`). During `wsConnection` 
(`index.ts:356-373`),
   `connectionLimitReason(channel)` initially returns `null` because 
`channels[channel]` is
   empty, then `trackClient(channel, socketInstance)` registers the socket in 
`sockets` and
   pushes its socket ID into `channels[channel].sockets` (`index.ts:196-209`), 
making
   `channels[channel].sockets.length === 1`.
   
   3. Close that WebSocket connection. The global `sockets` registry is cleaned 
by
   `checkSockets()` (`index.ts:458-483`), which deletes entries for inactive 
sockets and is
   invoked periodically via `setInterval(checkSockets, 
opts.pingSocketsIntervalMs)`
   (`index.ts:524`). However, the per-channel `channels[channel].sockets` array 
is not
   updated here; it is only compacted in `cleanChannel(channel)` 
(`index.ts:490-505`), which
   is run periodically via a separate GC interval (`setInterval(function gc() { 
for (const
   channel in channels) cleanChannel(channel); }, opts.gcChannelsIntervalMs)` at
   `index.ts:525-531`) or when `sendToChannel` encounters missing sockets
   (`index.ts:221-227`). In the window after `checkSockets` has deleted the 
socket from
   `sockets` but before `cleanChannel` has run, `channels[channel].sockets` 
still contains
   the stale socket ID even though there are zero active sockets for that 
channel.
   
   4. Within that window, attempt to open a new WebSocket connection to the 
same channel.
   `wsConnection` executes `connectionLimitReason(channel)` 
(`index.ts:359-362`), which
   computes `channelSize = channels[channel]?.sockets.length ?? 0` and then 
checks `if
   (channelSize >= maxConnectionsPerChannel)` (`index.ts:183-187`). Because 
`channelSize` is
   still 1 from the stale ID and `maxConnectionsPerChannel` is 1, this 
condition is true, so
   `connectionLimitReason` returns a non-null reason. `wsConnection` treats 
this as over the
   per-channel limit, increments `ws_connection_rejected`, logs a warning, and 
closes the new
   WebSocket with code 1013 (`index.ts:359-366`), even though there are no 
active connections
   on that channel.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=0e93de13d4e9439ca5b97c15cccedbdc&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=0e93de13d4e9439ca5b97c15cccedbdc&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset-websocket/src/index.ts
   **Line:** 184:187
   **Comment:**
        *Stale Reference: The per-channel limit uses the raw channel socket-id 
array length, but that array can contain stale IDs after sockets have been 
removed from the global registry and before `cleanChannel` runs. This can 
reject valid new connections on a channel that has no active sockets. Derive 
the count from active socket instances (or clean the channel first) when 
enforcing this limit.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40856&comment_hash=2a047a74995b461fac27f87c04f729836a35725a0f8f8d92f28a789deff2d248&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40856&comment_hash=2a047a74995b461fac27f87c04f729836a35725a0f8f8d92f28a789deff2d248&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to