[ https://issues.apache.org/jira/browse/KNOX-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364350#comment-16364350 ]
Kevin Risden commented on KNOX-1091:
------------------------------------

I set *gateway.threadpool.max to 6* (the minimum required, based on the following message from gateway.log)
{noformat}
Failed to start gateway: java.lang.IllegalStateException: Insufficient threads: max=5 < needed(acceptors=1 + selectors=4 + request=1)
{noformat}
I was then able to run 100 concurrent requests with no issues.

*Output from testing 100 concurrent requests with gateway.threadpool.max=6*
{noformat}
[knox-1.0.0]$ >logs/gateway-audit.log
[knox-1.0.0]$ ab -n 1000 -c 100 http://localhost:8443/gateway/health/api/v1/version
This is ApacheBench, Version 2.3 <$Revision: 1430300 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests


Server Software:        Jetty(9.2.15.v20160210)
Server Hostname:        localhost
Server Port:            8443

Document Path:          /gateway/health/api/v1/version
Document Length:        157 bytes

Concurrency Level:      100
Time taken for tests:   0.685 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      298000 bytes
HTML transferred:       157000 bytes
Requests per second:    1459.26 [#/sec] (mean)
Time per request:       68.528 [ms] (mean)
Time per request:       0.685 [ms] (mean, across all concurrent requests)
Transfer rate:          424.67 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.3      0       2
Processing:     1   65  20.5     60     115
Waiting:        1   65  20.5     60     114
Total:          2   65  20.3     60     115

Percentage of the requests served within a certain time (ms)
  50%     60
  66%     65
  75%     68
  80%     69
  90%    113
  95%    114
  98%    114
  99%    114
 100%    115 (longest request)
[knox-1.0.0]$ grep -F access logs/gateway-audit.log | cut -d'|' -f3 | sort | uniq -c | sort -n | wc -l
1000
[knox-1.0.0]$ grep -F access logs/gateway-audit.log | cut -d'|' -f3 | sort | uniq -c | sort -n | tail -n5
      2 fe948731-bd26-4767-947d-116719e0b03e
      2 feb7243d-7958-4e17-89da-aba9a7d65d14
      2 fee954db-0784-44db-8988-d050951cd1b4
      2 ffc5f977-cdad-4417-9601-c73756a07f3f
      2 ffe19e1f-8968-4268-98db-7da2ffe4e977
{noformat}

> Knox Audit Logging - duplicate correlation ids
> ----------------------------------------------
>
>                 Key: KNOX-1091
>                 URL: https://issues.apache.org/jira/browse/KNOX-1091
>             Project: Apache Knox
>          Issue Type: Bug
>          Components: Server
>            Reporter: Kevin Risden
>            Priority: Major
>
> From the Knox User list thread: "Multiple topology audit logging", it came to
> my attention that Knox seems to be logging duplicate correlation ids.
> Separating out the topic specifically here to dig a bit deeper.
> While looking at our Knox audit logs (Knox 0.9 on HDP 2.5) the "correlation
> id" doesn't seem to be unique across requests. Is this to be expected? Here
> is a snippet (anonymized):
> grep 7557c91b-2a48-4e09-aefc-44e9892372da /var/knox/gateway-audit.log
> {code}
> 17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE||||access|uri|/gateway/HADOOPTEST/hbase/hbase/NAMESPACE1:TABLE1/ID1//|unavailable|Request method: GET
> 17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE|USER1|||authentication|uri|/gateway/HADOOPPROD/hbase/NAMESPACE2:TABLE2/multiget?row=ID2%2fd%3araw&|success|
> 17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE|USER1|||authentication|uri|/gateway/HADOOPPROD/hbase/NAMESPACE2:TABLE2/multiget?row=ID2%2fd%3araw&|success|Groups: []
> 17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE|USER1|||dispatch|uri|http://WEBHBASE.example.com:8084/NAMESPACE2:TABLE2/multiget?doAs=USER1&row=ID2%2Fd%3Araw|unavailable|Request method: GET
> 17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE|USER1|||dispatch|uri|http://WEBHBASE.example.com:8084/NAMESPACE2:TABLE2/multiget?doAs=USER1&row=ID2%2Fd%3Araw|success|Response status: 200
> 17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE|USER1|||access|uri|/gateway/HADOOPPROD/hbase/NAMESPACE2:TABLE2/multiget?row=ID2%2fd%3araw&|success|Response status: 200
> 17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE||||authentication|principal|USER2|failure|LDAP authentication failed.
> 17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE||||access|uri|/gateway/HADOOPTEST/hbase/hbase/NAMESPACE1:TABLE2/ID1//|success|Response status: 401
> {code}
> The things to highlight here for the same correlation id:
> * different topologies are being used
> * different uris are being used
> * different users are being used
> Some of the things that we have configured that could impact results:
> * authentication caching
> * multiple Knox servers
> * load balancer in front of Knox

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
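
For readers who want to repeat the check from the comment without the shell pipeline, here is a minimal Python sketch of the same counting step (grep -F access, cut -d'|' -f3, uniq -c). It rests on assumptions taken only from the log samples in this message: the correlation id is the third pipe-delimited field, and a healthy request writes two access entries (one for the request, one for the response), as the "2" counts in the tail output suggest. The function names and the sample ids are hypothetical, not part of Knox.

```python
from collections import Counter


def count_access_entries(lines):
    """Count 'access' audit entries per correlation id.

    Mirrors: grep -F access | cut -d'|' -f3 | sort | uniq -c
    Assumption: the correlation id is the third '|'-delimited field.
    """
    counts = Counter()
    for line in lines:
        # Same substring match as grep -F access.
        if "access" not in line:
            continue
        fields = line.rstrip("\n").split("|")
        if len(fields) < 3:
            # Unlike cut, skip lines that have no third field.
            continue
        counts[fields[2]] += 1
    return counts


def unexpected_counts(counts, expected=2):
    """Return ids whose entry count differs from the assumed per-request count."""
    return {cid: n for cid, n in counts.items() if n != expected}


if __name__ == "__main__":
    # Hypothetical sample lines shaped like the audit entries quoted above.
    sample = [
        "18/02/14 10:00:00 ||id-a|audit|KNOX||||access|uri|/gateway/x|unavailable|Request method: GET",
        "18/02/14 10:00:00 ||id-a|audit|KNOX||||access|uri|/gateway/x|success|Response status: 200",
        "18/02/14 10:00:01 ||id-b|audit|KNOX||||access|uri|/gateway/y|unavailable|Request method: GET",
        "18/02/14 10:00:01 ||id-b|audit|KNOX||||access|uri|/gateway/z|unavailable|Request method: GET",
        "18/02/14 10:00:01 ||id-b|audit|KNOX||||access|uri|/gateway/y|success|Response status: 200",
    ]
    for cid, n in sorted(unexpected_counts(count_access_entries(sample)).items()):
        print(cid, n)
```

To run it against a real file, pass an open file object, for example `count_access_entries(open("logs/gateway-audit.log"))`. An id that appears more than twice would then be a candidate for a correlation id shared between requests.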