OK, so the `BaseJettyTest` is written to redirect POST every 20 requests, and
fail with 500 error every 10 requests that aren't redirected.
```java
} else if (request.getMethod().equals("POST")) {
// don't redirect large objects
if (request.getContentLength() < 10240 &&
redirectEveryTwentyRequests(request, response))
return;
if (failEveryTenRequests(request, response))
return;
```
Below, I'm saying I did stuff not as I hunted the code down, just lazy and
figured it was probably me.
I wrote this to ensure we could follow redirects on POST as this could happen
in amazon with virtual-hosted buckets. I also wanted to ensure users could
silently recover from 500s on POST as AWS query apis had many safe commands
that were tunneled over POST. Note this behavior has since screwed with
rackspace in particular who have occasionally returned 500 *after* processing
the POST!
It is probably more safe to retry a POST before it was fully sent (IOException)
vs retrying based a server response code.
For the record, the only safe write is PUT (maybe PATCH in some cases): it may
be entirely better to remove this contract or ensure it does *not* retry POST!
This would imply adding logic at each api level to decide which commands are
really safe even though they are presented as POST requests. Note that this
issue isn't black/white, so you can also see the [issue on okhttp wrt silent
retries](https://github.com/square/okhttp/issues/258) (and disabling them).
Now back to these tests. 3 tests failed, right? Notice all of them had
`testPost` prefix. The tests are written to ensure that input streams are not
retried (as this ensures input stream is not rebuffered), while buffered
content is retried.
* `testPostAsInputStream` is written to fail on Exception vs retrying. This
test checked to see if it failed at least one of out 5 requests and it didn't.
Now, BaseJettyTest is written to fail one out of every 10 POSTs, so this could
be more just bad luck. At any rate, this test could be rewritten to send a
flag on POST which jetty will always fail on vs relying on random failures or
redirect roulette.
* `testPostBinder` and `testPostContentLanguage` failed with an EOF error on
`POST http://localhost:8124/`. The port 8124 hints that these failures were
after a redirect. I'd focus on redirect.
If I had to rewrite these tests, which were crafted in 2009, today. I'd use
MockWebServer and isolate only the behaviors we want, not relying on state like
how many other tests may have POST'ed something, etc. At the cost of firing up
one (or 2 in the case of redirects) MWS per test, flakiness like this could be
immediately identifiable regardless of how they are run. Well worth a refactor
imho. I'd probably also remove the logic to retry unsafe http methods on 500.
You can look at
[feign](https://github.com/Netflix/feign/blob/master/core/src/test/java/feign/FeignTest.java#L414)
for some opinions of retries I've had recently, ex. retrying on failed ssl
handshake.
Suspicions about these failures: It is possible that okhttp is retrying the
POST that is expected to fail. It is possible that the mechanics of a redirect
are misaligned. See if you can narrow down with MWS tests. Here's an [example
of a 2 server
test](https://github.com/Netflix/feign/blob/master/ribbon/src/test/java/feign/ribbon/LoadBalancingTargetTest.java#L40),
if you want to isolate redirects. Caution.. these are guesses! The real
problems could be elsewhere.
Anyway, I hope these hints help!
---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/232#issuecomment-30581270