[ https://issues.apache.org/jira/browse/NIFI-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16590176#comment-16590176 ]

Otto Fowler commented on NIFI-5522:
-----------------------------------

OK,

Regarding that exception (I am not saying this is your overall problem), here is what I think is happening. If you look at the Jetty code, this exception is thrown when you reset an already-closed connection, which is exactly what would happen when we call flush.

In onTrigger, we have obtained the response from the context, but it is still registered in the context service. We don't remove it from the service until we are done with it later.

That means we can get the Response and be working with it while the service times it out in the background, so the connection is already closed when we go to flush later, producing this exception.
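Here is a toy reproduction of that race in isolation (FakeResponse is a stand-in I made up, not Jetty's actual class): the worker still holds a reference to the response while the background timeout thread closes it, so the later flush fails.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Toy model of the race: the worker "has" the response, but a background
// timeout thread can close it before the worker gets around to flushing.
public class TimeoutRace {
    static class FakeResponse {
        final AtomicBoolean closed = new AtomicBoolean(false);
        void close() { closed.set(true); }                  // what the timeout reaper does
        void flushBuffer() {
            if (closed.get()) {                             // Jetty-like behavior: flushing a
                throw new IllegalStateException("closed");  // closed connection throws
            }
        }
    }

    // Returns true when flush succeeded, false when the reaper got there first.
    static boolean getThenFlush(boolean reaperClosesFirst) {
        FakeResponse response = new FakeResponse();         // worker holds the response...
        if (reaperClosesFirst) {
            response.close();                               // ...but the timeout thread closed it
        }
        try {
            response.flushBuffer();                         // the flush in onTrigger
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(getThenFlush(false)); // true: no timeout, flush works
        System.out.println(getThenFlush(true));  // false: timed out underneath us
    }
}
```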

 
{code:java}
// WE HAVE THE RESPONSE, but so does the contextMap, which has a thread trying to time entries out
final HttpServletResponse response = contextMap.getResponse(contextIdentifier);
if (response == null) {
    session.transfer(flowFile, REL_FAILURE);
    getLogger().error("Failed to respond to HTTP request for {} because FlowFile had an '{}' attribute of {} but could not find an HTTP Response Object for this identifier",
            new Object[]{flowFile, HTTPUtils.HTTP_CONTEXT_ID, contextIdentifier});
    return;
}

final int statusCode = Integer.parseInt(statusCodeValue);
response.setStatus(statusCode);

for (final Map.Entry<PropertyDescriptor, String> entry : context.getProperties().entrySet()) {
    final PropertyDescriptor descriptor = entry.getKey();
    if (descriptor.isDynamic()) {
        final String headerName = descriptor.getName();
        final String headerValue = context.getProperty(descriptor).evaluateAttributeExpressions(flowFile).getValue();

        if (!headerValue.trim().isEmpty()) {
            response.setHeader(headerName, headerValue);
        }
    }
}

try {
    session.exportTo(flowFile, response.getOutputStream());
    // THE response's connection may be closed at this point. I am not sure whether that
    // means the stream is bad and the exportTo has not worked.....
    // Also, it is unclear _when_ it was closed; the context service may have timed it out already.
    response.flushBuffer();
} catch (final ProcessException e) {
    session.transfer(flowFile, REL_FAILURE);
    getLogger().error("Failed to respond to HTTP request for {} due to {}", new Object[]{flowFile, e});
    // THIS REMOVES FROM THE MAP, and from the timeout thread
    contextMap.complete(contextIdentifier);
    return;
} catch (final Exception e) {
    session.transfer(flowFile, REL_FAILURE);
    // HERE WE DO NOT REMOVE FROM THE MAP. Why?
    getLogger().error("Failed to respond to HTTP request for {} due to {}", new Object[]{flowFile, e});
    return;
}

// we only remove the response from the context map below here.....
{code}
 

Given the other threads involved, the Response should be removed from the map before we start working with it, so the timeout thread can no longer close it out from under us.
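A minimal sketch of that "claim before use" idea, using a plain ConcurrentMap as a hypothetical stand-in for the context map (names and types are mine, not NiFi's actual HttpContextMap API):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// ConcurrentMap.remove(id) atomically claims the entry, so the timeout
// reaper (which also goes through remove) can never close a response
// another thread is still writing to.
public class ClaimBeforeUse {
    static final ConcurrentMap<String, StringBuilder> contextMap = new ConcurrentHashMap<>();

    // Worker path: claim first, then use. After remove() returns non-null,
    // no other thread can obtain (and close) this response.
    static String respond(String id) {
        StringBuilder response = contextMap.remove(id); // atomic claim
        if (response == null) {
            return null; // already timed out -> route to failure, but no race
        }
        response.append("200 OK");
        return response.toString();
    }

    // Reaper path: the same remove() is the only way to time an entry out,
    // so exactly one of the two threads ever wins.
    static boolean timeout(String id) {
        return contextMap.remove(id) != null;
    }

    public static void main(String[] args) {
        contextMap.put("a", new StringBuilder());
        System.out.println(respond("a")); // 200 OK
        System.out.println(timeout("a")); // false: worker already claimed it
    }
}
```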

 

[~joewitt] may want to double check my logic here.

 

 

> HandleHttpRequest enters in fault state and does not recover
> ------------------------------------------------------------
>
>                 Key: NIFI-5522
>                 URL: https://issues.apache.org/jira/browse/NIFI-5522
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 1.7.0, 1.7.1
>            Reporter: Diego Queiroz
>            Priority: Critical
>              Labels: security
>         Attachments: HandleHttpRequest_Error_Template.xml, 
> image-2018-08-15-21-10-27-926.png, image-2018-08-15-21-10-33-515.png, 
> image-2018-08-15-21-11-57-818.png, image-2018-08-15-21-15-35-364.png, 
> image-2018-08-15-21-19-34-431.png, image-2018-08-15-21-20-31-819.png, 
> test_http_req_resp.xml
>
>
> HandleHttpRequest randomly enters a fault state and does not recover until 
> I restart the node. I feel the problem is triggered when some exception 
> occurs (e.g. broken request, connection issues, etc.), but I am usually able 
> to reproduce this behavior by stressing the node with tons of simultaneous 
> requests:
> {{# example script to stress server}}
>  {{for i in `seq 1 10000`; do}}
>  {{   wget -T10 -t10 -qO- 'http://127.0.0.1:64080/' >/dev/null &}}
>  {{done}}
> When this happens, HandleHttpRequest starts to return "HTTP ERROR 503 - 
> Service Unavailable" and does not recover from this state:
> !image-2018-08-15-21-10-33-515.png!
> If I try to stop the HandleHttpRequest processor, the running threads do 
> not terminate:
> !image-2018-08-15-21-11-57-818.png!
> If I force them to terminate, the listen port continues to be bound by NiFi:
> !image-2018-08-15-21-15-35-364.png!
> If I try to connect again, I get an HTTP ERROR 500:
> !image-2018-08-15-21-19-34-431.png!
>  
> If I try to start the HandleHttpRequest processor again, it fails to start 
> with the message:
> {code}
> ERROR [Timer-Driven Process Thread-11] o.a.n.p.standard.HandleHttpRequest HandleHttpRequest[id=9bae326b-5ac3-3e9f-2dac-c0399d8f2ddb] Failed to process session due to org.apache.nifi.processor.exception.ProcessException: Failed to initialize the server: org.apache.nifi.processor.exception.ProcessException: Failed to initialize the server
> org.apache.nifi.processor.exception.ProcessException: Failed to initialize the server
>     at org.apache.nifi.processors.standard.HandleHttpRequest.onTrigger(HandleHttpRequest.java:501)
>     at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
>     at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
>     at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind0(Native Method)
>     at sun.nio.ch.Net.bind(Net.java:433)
>     at sun.nio.ch.Net.bind(Net.java:425)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>     at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:298)
>     at org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
>     at org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:236)
>     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
>     at org.eclipse.jetty.server.Server.doStart(Server.java:431)
>     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
>     at org.apache.nifi.processors.standard.HandleHttpRequest.initializeServer(HandleHttpRequest.java:430)
>     at org.apache.nifi.processors.standard.HandleHttpRequest.onTrigger(HandleHttpRequest.java:489)
>     ... 11 common frames omitted
> {code}
> !image-2018-08-15-21-20-31-819.png!
>  
> The only way to work around this when it happens is changing the port it 
> listens to or restarting the NiFi service. I flagged this as a security issue 
> because it allows someone to cause a DoS to the service.
> I found several similar issues, but most of them relate to old 
> versions; I can confirm this affects versions 1.7.0 and 1.7.1.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
