[
https://issues.apache.org/jira/browse/KNOX-3058?focusedWorklogId=931031&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-931031
]
ASF GitHub Bot logged work on KNOX-3058:
----------------------------------------
Author: ASF GitHub Bot
Created on: 20/Aug/24 21:45
Start Date: 20/Aug/24 21:45
Worklog Time Spent: 10m
Work Description: pzampino opened a new pull request, #929:
URL: https://github.com/apache/knox/pull/929
## What changes were proposed in this pull request?
Modified error handling when a topology is being redeployed, such that the
response is not HTTP 404 'Not Found', but rather HTTP 503 'Service
Unavailable'. The 503 response is much more likely to be retried by clients
than is a 404 response.
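
As a rough illustration of why the status code matters to clients, here is a minimal sketch of a retry policy that retries only on 503. The class name `RetryOn503`, its parameters, and the fixed backoff are hypothetical and are not taken from Knox or from any particular client library.

```java
import java.util.function.IntSupplier;

// Hypothetical client-side sketch: retry while the server answers 503,
// but give up immediately on any other status (including 404).
class RetryOn503 {
    static int call(IntSupplier request, int maxAttempts, long backoffMillis) {
        int status = request.getAsInt();
        for (int attempt = 1; attempt < maxAttempts && status == 503; attempt++) {
            try {
                Thread.sleep(backoffMillis); // fixed backoff, for simplicity
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt
                break;
            }
            status = request.getAsInt();
        }
        return status; // last status observed
    }
}
```

With a policy like this, a 404 during redeployment fails on the first attempt, whereas a 503 is retried until the topology is available again or the attempts run out.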
Added a Jetty ErrorHandler that checks whether or not the topology being
requested is in a Set marked as inactive. Topology names are added to this
inactive set when the associated topology is deactivated, and removed from this
set when the topology is reactivated. In the case of topology deletion, the
topology is marked as inactive, but then removed from the inactive set because
we know it's being deleted.
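
The bookkeeping described above can be sketched roughly as follows. This is not the code in the pull request and it does not use the Jetty ErrorHandler API; the class `InactiveTopologyTracker` and its method names are hypothetical, and the `/gateway/<topology>/...` path layout is an assumption based on the test URL used below.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the inactive-set logic; names are illustrative only.
class InactiveTopologyTracker {
    // Thread-safe set of topology names that are currently inactive.
    private final Set<String> inactive = ConcurrentHashMap.newKeySet();

    void deactivated(String topology) { inactive.add(topology); }

    void reactivated(String topology) { inactive.remove(topology); }

    // A deleted topology leaves the set, so later requests get a 404.
    void deleted(String topology) { inactive.remove(topology); }

    // Assumes request paths of the form /gateway/<topology>/...
    static String topologyOf(String requestPath) {
        String[] parts = requestPath.split("/");
        return parts.length > 2 ? parts[2] : "";
    }

    // Status an error handler could send when no deployed topology matched.
    int errorStatusFor(String requestPath) {
        return inactive.contains(topologyOf(requestPath)) ? 503 : 404;
    }
}
```

In this sketch, a request for a topology that is mid-redeployment maps to 503, while a request for a deleted or unknown topology still maps to 404.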
## How was this patch tested?
I deployed a test topology with a demo LDAP provider and the Knox Token
service.
I then ran the following script against
'https://localhost:8443/gateway/demo/knoxtoken/api/v2/token' with one of the
demo LDAP username/password combinations, and redirected the output to a file.
This script outputs only the HTTP response status code for each invocation.
```
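#!/bin/bash
#
# Prints only the HTTP response status code for each request to the endpoint.
# Usage: resp-test.sh ENDPOINT [USERNAME] [PASSWORD]
#
ENDPOINT=$1
echo "Endpoint: $ENDPOINT"
# Use names other than USER and PWD, which the shell already sets.
if [ -n "$2" ] ; then
  AUTH_USER=$2
fi
if [ -n "$3" ] ; then
  AUTH_PWD=$3
fi
# Brace expansion is a bash feature; a strict POSIX sh may not expand it.
for i in {1..100000}
do
  curl -o /dev/null -s -w "%{http_code}\n" -ku "${AUTH_USER}:${AUTH_PWD}" "${ENDPOINT}"
done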
```
Example:
`~/bin/resp-test.sh
'https://localhost:8443/gateway/demo/knoxtoken/api/v2/token' sam sam-password >
~/response-code-test.txt &
`
While this script was running, I "touched" the test topology to trigger
redeployment many times over several minutes. Finally, I deleted the test
topology.
Following this, I reviewed the output to verify that there were no 404
responses until the point at which I deleted the topology. I also verified the
periodic 503 responses which are expected, and the normal 200 responses in
between.
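
A review like the one above could be automated along these lines. This is only a sketch: the class `StatusTally` and its methods are hypothetical, and it assumes the output file holds one status code per line plus the leading "Endpoint:" line that the script echoes.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for summarizing the script's output file.
class StatusTally {
    // Counts each three-digit status code; other lines (e.g. "Endpoint: ...") are skipped.
    static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String code = line.trim();
            if (code.matches("\\d{3}")) {
                counts.merge(code, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Returns the 1-based line number of the first occurrence of code, or -1 if absent.
    static int firstLineOf(List<String> lines, String code) {
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).trim().equals(code)) {
                return i + 1;
            }
        }
        return -1;
    }
}
```

The lines could be loaded with `java.nio.file.Files.readAllLines`, and `firstLineOf(lines, "404")` then shows where in the run the first 404 appeared, which can be compared against when the topology was deleted.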
Issue Time Tracking
-------------------
Worklog Id: (was: 931031)
Remaining Estimate: 0h
Time Spent: 10m
> Avoid 404 When Topology Is Being Redeployed
> -------------------------------------------
>
> Key: KNOX-3058
> URL: https://issues.apache.org/jira/browse/KNOX-3058
> Project: Apache Knox
> Issue Type: Improvement
> Components: Server
> Reporter: Philip Zampino
> Assignee: Philip Zampino
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> While a topology is being redeployed, if it is requested, the client receives
> an HTTP 404 response. Most clients will not retry when receiving a 404, so
> the interaction will fail.
> If Knox were to respond with a more retry-friendly response (e.g., HTTP 503),
> then clients could overcome these small windows of unavailability with
> retries.
> The difficult part may be distinguishing topology removal from topology
> inactivity. I think a deleted topology should still result in a 404.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)