Nagy Attila Bálint created FLINK-39761:
------------------------------------------

             Summary: Missing 'Connection: close' header on '304 Not Modified' 
responses causes proxy connection pool poisoning
                 Key: FLINK-39761
                 URL: https://issues.apache.org/jira/browse/FLINK-39761
             Project: Flink
          Issue Type: Bug
          Components: Runtime / REST, Runtime / Web Frontend
    Affects Versions: 2.2.1, 1.20.4
            Reporter: Nagy Attila Bálint


*Overview:*
When the Flink Web UI / History Server serves static files (e.g., .css, .js) 
and receives an If-Modified-Since request matching the file's modification 
time, it correctly generates a {{304 Not Modified}} response.
However, the server immediately drops the TCP connection without including a 
{{Connection: close}} HTTP header in the response.
This violates HTTP/1.1 keep-alive 
[expectations|https://www.rfc-editor.org/info/rfc2068/#section-19.7.1] and 
causes *connection pool poisoning* in downstream reverse proxies (such as 
Apache Knox or Nginx).

*Impact:*
Because HTTP/1.1 assumes persistent connections by default, reverse proxies 
receive the 304 response and place the connection back into their reusable 
connection pool.
When the proxy attempts to reuse this connection for the very next request, it 
hits an unexpected end of stream because Flink has already severed the TCP 
connection.

In Apache Knox, this manifests as a {{NoHttpResponseException}} and results in 
intermittent HTTP 500 errors being returned to the end user.
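
To illustrate why a proxy keeps the socket after such a response, here is a minimal sketch of the HTTP/1.1 persistence rule as a client or proxy might apply it. The class {{ConnectionReuse}} and its method are hypothetical names used only for illustration; they are not taken from Flink, Knox or Nginx, and real proxies apply further checks.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: decides whether a connection may go back into a pool. */
public final class ConnectionReuse {

    private ConnectionReuse() {}

    /**
     * Applies the HTTP/1.1 persistence rule to a response.
     *
     * @param httpVersion version from the status line, e.g. "HTTP/1.1"
     * @param headers response headers, with names already lower-cased (assumption)
     */
    public static boolean isReusable(String httpVersion, Map<String, String> headers) {
        String value = headers.getOrDefault("connection", "").toLowerCase(Locale.ROOT);
        boolean close = false;
        boolean keepAlive = false;
        // The Connection header carries a comma-separated list of tokens.
        for (String token : value.split(",")) {
            String t = token.trim();
            if (t.equals("close")) {
                close = true;
            } else if (t.equals("keep-alive")) {
                keepAlive = true;
            }
        }
        if (close) {
            return false; // the peer announced that it will close the connection
        }
        if ("HTTP/1.1".equals(httpVersion)) {
            return true; // persistent by default
        }
        return keepAlive; // HTTP/1.0 is persistent only on explicit opt-in
    }
}
```

For the 304 response described above (HTTP/1.1, no {{Connection}} header), {{isReusable}} returns true, so the connection is pooled even though the server is about to close it.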

*Root Cause:*
The issue originates in 
{{org.apache.flink.runtime.rest.handler.legacy.files.StaticFileServerHandler}}.
Current master branch (2.2.1+): 
[link|https://github.com/apache/flink/blob/45295cf62608ca172b83ac42d9128d027a91d06a/flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/legacy/files/StaticFileServerHandler.java#L312]
1.20.4 branch: 
[link|https://github.com/apache/flink/blob/release-1.20.4/flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/legacy/files/StaticFileServerHandler.java#L312]

In the {{sendNotModified}} method, the code creates a 
{{DefaultFullHttpResponse}} and immediately attaches a 
{{ChannelFutureListener.CLOSE}} listener to the write operation.
However, it fails to set the {{Connection: close}} header on the response 
object before flushing it to the client.


{code:java}
public static void sendNotModified(ChannelHandlerContext ctx) {
    FullHttpResponse response = new DefaultFullHttpResponse(HTTP_1_1, NOT_MODIFIED);
    setDateHeader(response);

    // BUG: Missing explicit Connection: close header here before closing the channel.
    // Proxies assume the connection is kept alive.

    // close the connection as soon as the error message is sent.
    ctx.writeAndFlush(response).addListener(ChannelFutureListener.CLOSE);
}
{code}

*Proposed Solution:*
To comply with HTTP/1.1 specifications and prevent proxy connection pool 
poisoning, the Flink server must explicitly communicate that the connection is 
being closed.

The fix is to set the {{Connection: close}} header on the response before 
flushing it:

{code:java}
public static void sendNotModified(ChannelHandlerContext ctx) {
    FullHttpResponse response = new DefaultFullHttpResponse(HTTP_1_1, NOT_MODIFIED);
    setDateHeader(response);
    
    // Explicitly notify the client that the connection will be dropped
    response.headers().set(HttpHeaderNames.CONNECTION, HttpHeaderValues.CLOSE);

    // close the connection as soon as the error message is sent.
    ctx.writeAndFlush(response).addListener(ChannelFutureListener.CLOSE);
}
{code}
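
The effect of the header can be reproduced without Flink or a proxy. The following is a rough loopback sketch using only JDK sockets, under stated assumptions: the stub server merely imitates the behaviour described in this report (write a 304, then close), the client stands in for a pooling proxy, and the class name, request path and date are all made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical loopback demo; not Flink code. */
public final class StaleConnectionDemo {

    /** What a pooling client observes after the first 304 response. */
    public enum Outcome { NOT_REUSED, REUSE_FAILED, REUSE_SUCCEEDED }

    // Made-up conditional request for a static asset.
    private static final byte[] REQUEST =
            ("GET /app.css HTTP/1.1\r\nHost: localhost\r\n"
                    + "If-Modified-Since: Thu, 01 Jan 2026 00:00:00 GMT\r\n\r\n")
                    .getBytes(StandardCharsets.US_ASCII);

    private StaleConnectionDemo() {}

    public static Outcome run(boolean sendCloseHeader) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            server.setSoTimeout(5000); // do not block forever in accept()
            Thread stub = new Thread(() -> serveOnce(server, sendCloseHeader));
            stub.start();
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                client.setSoTimeout(5000);
                BufferedReader in = reader(client);
                OutputStream out = client.getOutputStream();
                out.write(REQUEST);
                out.flush();

                // Read the response head and look for the close announcement.
                boolean closeAnnounced = false;
                String line;
                while ((line = in.readLine()) != null && !line.isEmpty()) {
                    closeAnnounced |= line.equalsIgnoreCase("Connection: close");
                }
                if (closeAnnounced) {
                    return Outcome.NOT_REUSED; // a pool would discard this connection
                }

                stub.join(); // the stub has closed its side by now
                try {
                    // Reuse the "pooled" connection for the next request.
                    out.write(REQUEST);
                    out.flush();
                    return in.readLine() == null ? Outcome.REUSE_FAILED : Outcome.REUSE_SUCCEEDED;
                } catch (IOException e) {
                    return Outcome.REUSE_FAILED; // e.g. connection reset
                }
            } finally {
                stub.join();
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Answers one request with 304 and then closes, like the handler in this report. */
    private static void serveOnce(ServerSocket server, boolean sendCloseHeader) {
        try (Socket socket = server.accept()) {
            BufferedReader in = reader(socket);
            String line;
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                // skip the request head
            }
            String head = "HTTP/1.1 304 Not Modified\r\n"
                    + (sendCloseHeader ? "Connection: close\r\n" : "")
                    + "\r\n";
            OutputStream out = socket.getOutputStream();
            out.write(head.getBytes(StandardCharsets.US_ASCII));
            out.flush();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    private static BufferedReader reader(Socket socket) throws IOException {
        return new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
    }
}
```

In this sketch {{run(false)}} is expected to end in {{REUSE_FAILED}}, which corresponds to the stale pooled connection described under Impact, while {{run(true)}} ends in {{NOT_REUSED}} because the client sees the announcement and does not try to reuse the socket. A real proxy reports the failure in its own way; the report above mentions {{NoHttpResponseException}} for Knox.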




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
