[ 
https://issues.apache.org/jira/browse/NIFI-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15953067#comment-15953067
 ] 

ASF GitHub Bot commented on NIFI-3668:
--------------------------------------

Github user ijokarumawak commented on the issue:

    https://github.com/apache/nifi/pull/1646
  
    @markap14 Would you review this fix, since you originally worked on NIFI-3636?
    I've reproduced the issue with the following test method, which sends many concurrent S2S client requests against a NiFi cluster. The issue was reproducible before this patch, and I confirmed it is addressed by this patch:
    
    ```java
        @Test
        public void test() throws Exception {
            final ExecutorService executors = Executors.newFixedThreadPool(100);
            final AtomicInteger processed = new AtomicInteger();
            final AtomicInteger error = new AtomicInteger();
            for (int i = 0; i < 1000; i++) {
                executors.submit(() -> {
                    try (final SiteToSiteClient client = new SiteToSiteClient.Builder()
                            .transportProtocol(SiteToSiteTransportProtocol.HTTP)
                            .url("http://localhost:9011/nifi/")
                            .portName("input")
                            .build()) {

                        final Transaction transaction = client.createTransaction(TransferDirection.SEND);
                        transaction.send("test".getBytes(), new HashMap<>());
                        transaction.confirm();
                        transaction.complete();
                        processed.incrementAndGet();

                    } catch (Exception e) {
                        logger.error("ERR!", e);
                        error.incrementAndGet();
                    }
                });
            }
            executors.shutdown();
            executors.awaitTermination(1, TimeUnit.DAYS);
            logger.warn("processed={}, err={}", new Object[]{processed.get(), error.get()});
        }
    ```
    
    On Mac OS X Sierra, I had to increase the file descriptor limit to make the above test work, by creating limit.maxfiles.plist and limit.maxproc.plist as described in these web pages:
    - https://www.chrissearle.org/2016/10/01/too-many-open-files-on-osx-macos/
    - https://superuser.com/questions/830149/os-x-yosemite-too-many-files-open
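
    For a quick sanity check of the effective limits before and after installing those plists, something like the following should work (`launchctl limit` is macOS-specific, so it is guarded here):

    ```shell
    # Per-process open-file limit for the current shell session
    ulimit -n
    # launchd's soft/hard maxfiles limits (macOS only; the plists above raise these)
    command -v launchctl >/dev/null && launchctl limit maxfiles || true
    ```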
    
    I hope the fix still preserves the performance gains in clustered environments that were added by NIFI-3636. Thanks!


> ThreadPoolRequestReplicator doesn't purge expired requests
> ----------------------------------------------------------
>
>                 Key: NIFI-3668
>                 URL: https://issues.apache.org/jira/browse/NIFI-3668
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.2.0
>            Reporter: Koji Kawamura
>            Assignee: Mark Payne
>
> NIFI-3636 changed the execution order of putting a new entry into the 
> responseMap and purging expired entries from the same map.
> After NIFI-3636, ThreadPoolRequestReplicator adds a new 
> StandardAsyncClusterResponse entry to responseMap before checking or 
> purging the map. The newly created entry reports true from its isComplete() 
> method, so it won't be purged even after it expires.
> Then, if there are already more than 100 outstanding requests, the new 
> request is rejected with the message "Cannot replicate request {} {} because 
> there are {} outstanding HTTP Requests already. Request Counts Per URI = {}".
> The rejected request is never performed, even though its entry has already 
> been put into the map. As mentioned earlier, the async response's initial 
> state is 'completed', so it is never purged and remains in the map forever.
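
The ordering problem described above can be illustrated with a hypothetical, self-contained sketch (the class and member names below are mine, not NiFi's actual ThreadPoolRequestReplicator internals): when the entry is inserted before the size check, a rejected request leaks a "complete" entry that the purge pass never removes, while purging and checking before inserting keeps the map bounded.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of the ordering bug described above.
// Response, MAX_OUTSTANDING, and responseMap are illustrative names only.
public class ReplicatorOrderingSketch {

    public static final int MAX_OUTSTANDING = 100;

    // Stand-in for StandardAsyncClusterResponse: per the issue description,
    // its initial state already reads as "complete".
    public static class Response {
        public boolean complete = true; // initial state matches isComplete()
        public boolean expired = true;  // pretend every entry has already expired
    }

    // Purge pass that removes expired entries but skips "complete" ones --
    // which is why leaked entries are never reclaimed.
    public static void purge(Map<String, Response> responseMap) {
        responseMap.entrySet().removeIf(e -> e.getValue().expired && !e.getValue().complete);
    }

    // Buggy ordering (post-NIFI-3636): insert first, then enforce the limit.
    // A rejected request leaves its entry behind forever.
    public static void buggyReplicate(Map<String, Response> responseMap, String id) {
        responseMap.put(id, new Response()); // added before the check
        purge(responseMap);
        if (responseMap.size() > MAX_OUTSTANDING) {
            throw new IllegalStateException("Cannot replicate: too many outstanding requests");
        }
    }

    // Fixed ordering: purge and enforce the limit before inserting, so a
    // rejected request leaves no entry in the map.
    public static void fixedReplicate(Map<String, Response> responseMap, String id) {
        purge(responseMap);
        if (responseMap.size() >= MAX_OUTSTANDING) {
            throw new IllegalStateException("Cannot replicate: too many outstanding requests");
        }
        responseMap.put(id, new Response());
    }

    public static void main(String[] args) {
        Map<String, Response> buggy = new ConcurrentHashMap<>();
        Map<String, Response> fixed = new ConcurrentHashMap<>();
        int buggyRejected = 0;
        int fixedRejected = 0;
        for (int i = 0; i < 200; i++) {
            try { buggyReplicate(buggy, "req-" + i); } catch (IllegalStateException e) { buggyRejected++; }
            try { fixedReplicate(fixed, "req-" + i); } catch (IllegalStateException e) { fixedRejected++; }
        }
        // Buggy: 100 rejected requests each leaked an entry, so the map grows to 200.
        // Fixed: the same 100 rejections occur, but the map stays capped at 100.
        System.out.println("buggy size=" + buggy.size() + " rejected=" + buggyRejected);
        System.out.println("fixed size=" + fixed.size() + " rejected=" + fixedRejected);
    }
}
```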



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
