[
https://issues.apache.org/jira/browse/FLINK-18663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164229#comment-17164229
]
Till Rohrmann commented on FLINK-18663:
---------------------------------------
One suspicion I have is the following: When calling
{{AbstractHandler.respondAsLeader}} we don't check whether the handler has been
terminated or not. Hence the following might happen:
1. We receive a REST request and we call into {{AbstractHandler.channelRead0}}
(inherited from {{LeaderRetrievalHandler}})
2. The {{RestServerEndpoint}} is shut down, which closes all handlers
3. Since no requests are registered in
{{AbstractHandler.inFlightRequestTracker}}, we immediately close the handlers
4. After having obtained the leader gateway, we call into
{{AbstractHandler.respondAsLeader}} which registers the request in the
{{inFlightRequestTracker}} but does not check whether the handler has been shut
down.
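
One way to close the gap described in the steps above would be to make registration and shutdown mutually exclusive, so that a request arriving after shutdown has begun is rejected instead of silently registered. The following is only a minimal sketch of that idea, not Flink's code: {{GuardedRequestTracker}}, {{tryRegister}} and {{deregister}} are hypothetical names, and the real {{InFlightRequestTracker}} may be structured differently.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for a request tracker; this is not Flink's InFlightRequestTracker. */
final class GuardedRequestTracker {
    private final Object lock = new Object();
    private final CompletableFuture<Void> terminationFuture = new CompletableFuture<>();
    private int inFlight = 0;
    private boolean closing = false;

    /** Returns false once shutdown has begun; the caller should then reject the request. */
    boolean tryRegister() {
        synchronized (lock) {
            if (closing) {
                return false;
            }
            inFlight++;
            return true;
        }
    }

    /** Must be called exactly once for every successful tryRegister(). */
    void deregister() {
        boolean done;
        synchronized (lock) {
            inFlight--;
            done = closing && inFlight == 0;
        }
        if (done) {
            terminationFuture.complete(null);
        }
    }

    /** Starts shutdown; the returned future completes once no request is in flight. */
    CompletableFuture<Void> closeAsync() {
        boolean done;
        synchronized (lock) {
            closing = true;
            done = inFlight == 0;
        }
        if (done) {
            terminationFuture.complete(null);
        }
        return terminationFuture;
    }
}
```

Under this assumption, a handler would call {{tryRegister()}} at the start of step 4 and answer with an error (for example "service unavailable") when it returns false, rather than continuing to process the request on a closed handler.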
> Fix Flink On YARN AM not exit
> -----------------------------
>
> Key: FLINK-18663
> URL: https://issues.apache.org/jira/browse/FLINK-18663
> Project: Flink
> Issue Type: Bug
> Components: Runtime / REST
> Affects Versions: 1.10.0, 1.10.1, 1.11.0
> Reporter: tartarus
> Priority: Major
> Labels: pull-request-available
> Attachments: 110.png, 111.png,
> C49A7310-F932-451B-A203-6D17F3140C0D.png, e18e00dd6664485c2ff55284fe969474.png
>
>
> AbstractHandler throws an NPE because FlinkHttpObjectAggregator is null.
> When a REST handler throws an exception, the following code is executed:
> {code:java}
> private CompletableFuture<Void> handleException(Throwable throwable, ChannelHandlerContext ctx, HttpRequest httpRequest) {
>     FlinkHttpObjectAggregator flinkHttpObjectAggregator = ctx.pipeline().get(FlinkHttpObjectAggregator.class);
>     int maxLength = flinkHttpObjectAggregator.maxContentLength() - OTHER_RESP_PAYLOAD_OVERHEAD;
>     if (throwable instanceof RestHandlerException) {
>         RestHandlerException rhe = (RestHandlerException) throwable;
>         String stackTrace = ExceptionUtils.stringifyException(rhe);
>         String truncatedStackTrace = Ascii.truncate(stackTrace, maxLength, "...");
>         if (log.isDebugEnabled()) {
>             log.error("Exception occurred in REST handler.", rhe);
>         } else {
>             log.error("Exception occurred in REST handler: {}", rhe.getMessage());
>         }
>         return HandlerUtils.sendErrorResponse(
>             ctx,
>             httpRequest,
>             new ErrorResponseBody(truncatedStackTrace),
>             rhe.getHttpResponseStatus(),
>             responseHeaders);
>     } else {
>         log.error("Unhandled exception.", throwable);
>         String stackTrace = String.format("<Exception on server side:%n%s%nEnd of exception on server side>",
>             ExceptionUtils.stringifyException(throwable));
>         String truncatedStackTrace = Ascii.truncate(stackTrace, maxLength, "...");
>         return HandlerUtils.sendErrorResponse(
>             ctx,
>             httpRequest,
>             new ErrorResponseBody(Arrays.asList("Internal server error.", truncatedStackTrace)),
>             HttpResponseStatus.INTERNAL_SERVER_ERROR,
>             responseHeaders);
>     }
> }
> {code}
> However, flinkHttpObjectAggregator is null in some cases, so this code
> throws an NPE. This method is called from AbstractHandler#respondAsLeader:
> {code:java}
> requestProcessingFuture
>     .whenComplete((Void ignored, Throwable throwable) -> {
>         if (throwable != null) {
>             handleException(ExceptionUtils.stripCompletionException(throwable), ctx, httpRequest)
>                 .whenComplete((Void ignored2, Throwable throwable2) -> finalizeRequestProcessing(finalUploadedFiles));
>         } else {
>             finalizeRequestProcessing(finalUploadedFiles);
>         }
>     });
> {code}
> Because handleException throws before finalizeRequestProcessing is chained,
> the request is never removed from the InFlightRequestTracker. As a result,
> the CompletableFuture returned by the handler's closeAsync never completes.
> !C49A7310-F932-451B-A203-6D17F3140C0D.png!
> !e18e00dd6664485c2ff55284fe969474.png!
>
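
The failure mode in the description can be reproduced in isolation: when the error handler throws synchronously inside a {{whenComplete}} callback, the cleanup chained onto its result is never scheduled. The sketch below shows this, plus one possible defensive variant that turns a synchronous throw into a failed future so that cleanup always runs. It is an illustration under assumptions, not the actual fix for this issue; the class {{FinalizeAlwaysSketch}} and all of its methods are hypothetical, and {{cleanup}} merely stands in for {{finalizeRequestProcessing}}.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Illustration only; all names are hypothetical and none of this is Flink code. */
final class FinalizeAlwaysSketch {

    /** Same shape as the snippet in the issue: cleanup is chained onto the error handler's future. */
    static void fragile(CompletableFuture<Void> processing,
                        Supplier<CompletableFuture<Void>> errorHandler,
                        Runnable cleanup) {
        processing.whenComplete((ignored, throwable) -> {
            if (throwable != null) {
                // A synchronous throw from errorHandler.get() skips the chaining below.
                errorHandler.get().whenComplete((i2, t2) -> cleanup.run());
            } else {
                cleanup.run();
            }
        });
    }

    /** Defensive variant: a throwing error handler is converted into a failed future. */
    static void robust(CompletableFuture<Void> processing,
                       Supplier<CompletableFuture<Void>> errorHandler,
                       Runnable cleanup) {
        processing.whenComplete((ignored, throwable) -> {
            CompletableFuture<Void> handled;
            if (throwable == null) {
                handled = CompletableFuture.completedFuture(null);
            } else {
                try {
                    handled = errorHandler.get();
                } catch (Throwable t) {
                    handled = new CompletableFuture<>();
                    handled.completeExceptionally(t);
                }
            }
            // Runs whether error handling succeeded, failed, or threw.
            handled.whenComplete((i2, t2) -> cleanup.run());
        });
    }

    /** Runs one variant against an error handler that throws an NPE; reports whether cleanup ran. */
    static boolean cleanupRan(boolean useRobust) {
        CompletableFuture<Void> processing = new CompletableFuture<>();
        processing.completeExceptionally(new RuntimeException("request failed"));
        Supplier<CompletableFuture<Void>> failingHandler = () -> {
            throw new NullPointerException("aggregator missing");
        };
        boolean[] ran = {false};
        Runnable cleanup = () -> ran[0] = true;
        if (useRobust) {
            robust(processing, failingHandler, cleanup);
        } else {
            fragile(processing, failingHandler, cleanup);
        }
        return ran[0];
    }
}
```

In this sketch {{cleanupRan(false)}} reports that cleanup was skipped, while {{cleanupRan(true)}} reports that it ran. A null check on the aggregator inside the error handler would address the NPE itself; the wrapper only ensures that deregistration does not depend on the error handler behaving well.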