On Feb 14, 2011, at 10:21 AM, Alvaro Lopez Ortega wrote:

> Hello Robert,
>
> On 14/02/2011, at 16:36, Robert Olson wrote:
>
>> The problem we're seeing is that for a particular test script on the client
>> side, one of the exchanges is failing.
>>
>> Looking at packet traces, I see the client sending a complete request to
>> cherokee. Cherokee sends the request to the compute server, but it appears
>> to be truncating the request one packet shy of finishing it. the compute
>> server then reports a bad parse in response.
>>
>> The problem initially showed up fairly reliably only when both front ends
>> were running. If I killed wackamole on one of them (pushing both IPs over to
>> a single server) the problem vanished.
>
> Did it perform a clean 'three way' close sequence (FIN, FIN+ACK, ACK)? The
> last package might be lost if a RST were sent while the connection is being
> closed.

Aha, I got one. Wordy trace files attached.
trace.1 was taken from the front-end machine; trace.2 is the same transaction
taken from the compute machine.
Here is the syscall trace from cherokee for that transaction (I've clipped out
the long packet content). Note the EAGAIN.
24358 13:21:08.623134 recvfrom(17, "source=ClientThing&function=fid_locations&args=---%0A-bou"..., 15237, 0, NULL, NULL) = 15237
24358 13:21:08.623548 sendto(20, "\1\5\0\1;\205\0\0source=ClientThing&fun162%0A++"..., 15253, 0, NULL, 0) = 14480
24358 13:21:08.623990 epoll_wait(16, {{EPOLLOUT, {u32=17, u64=17}}}, 407, 1000) = 1
24358 13:21:08.624057 sendto(20, "72%0A++-+fig%7C2098.5.peg.73", 773, 0, NULL, 0) = -1 EAGAIN (Resource temporarily unavailable)
24358 13:21:08.624252 epoll_ctl(16, EPOLL_CTL_MOD, 17, {EPOLLIN, {u32=17, u64=17}}) = 0
24358 13:21:08.624316 epoll_wait(16, <unfinished ...>
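
The byte counts in the trace above can be cross-checked with a few lines of Python. This is only a sketch under one assumption that is not stated anywhere in this thread: that the back end speaks FastCGI. The leading eight bytes of the first sendto ("\1\5\0\1;\205\0\0") are consistent with a FastCGI record header, and the variable names below are just for illustration.

```python
import struct

# First 8 bytes of the sendto() payload as printed by strace:
# "\1\5\0\1;\205\0\0"  (';' is 0x3b, octal \205 is 0x85)
header = b"\x01\x05\x00\x01\x3b\x85\x00\x00"

# ASSUMPTION: FastCGI record header layout (all fields big-endian):
# version, type, requestId (2 bytes), contentLength (2 bytes),
# paddingLength, reserved.
version, rtype, request_id, content_len, padding_len, reserved = struct.unpack(
    ">BBHHBB", header
)

total_to_send = 15253   # length passed to the first sendto()
first_send = 14480      # bytes the kernel accepted on that call
remaining = total_to_send - first_send

print("record type:", rtype)             # 5 would be FCGI_STDIN
print("content length:", content_len)    # compare with the recvfrom() size
print("unsent tail:", remaining)         # compare with the second sendto()
```

If the assumption holds, the content length in the header equals the 15237 bytes read from the client, and the unsent tail equals the 773 bytes that the second sendto tried to push before getting EAGAIN.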
No mention of file descriptors 17 or 20 until the timeout hits:
24348 13:23:09.051488 epoll_wait(7, <unfinished ...>
24358 13:23:09.616377 <... epoll_wait resumed> {}, 407, 1000) = 0
24358 13:23:09.616419 shutdown(17, 1 /* send */) = 0
24358 13:23:09.616529 epoll_wait(16, {{EPOLLIN|EPOLLHUP, {u32=17, u64=17}}}, 407, 1000) = 1
24358 13:23:09.617106 recvfrom(17, "", 4096, 0, NULL, NULL) = 0
24358 13:23:09.617173 epoll_ctl(16, EPOLL_CTL_DEL, 17, {0, {u32=17, u64=17}}) = 0
24358 13:23:09.617253 close(17) = 0
24358 13:23:09.617331 close(20) = 0
Is cherokee not handling the EAGAIN on the nonblocking socket properly? That
system is running 1.0.20.
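
For comparison, this is roughly the behavior I would expect after a short write or an EAGAIN on a nonblocking socket: keep the unsent tail, wait for write readiness on that same socket, and retry. It is a minimal Python sketch, not Cherokee's code; the function name send_all_nonblocking and the timeout value are made up for illustration.

```python
import selectors
import socket


def send_all_nonblocking(sock, data, timeout=5.0):
    """Send all of `data` on a nonblocking socket.

    Handles both short writes (send() accepts fewer bytes than offered)
    and EAGAIN (raised as BlockingIOError in Python) by waiting until the
    same socket is writable again and retrying with the remaining bytes.
    """
    sel = selectors.DefaultSelector()
    sel.register(sock, selectors.EVENT_WRITE)
    view = memoryview(data)
    sent = 0
    try:
        while sent < len(view):
            try:
                # send() may accept only part of the buffer.
                sent += sock.send(view[sent:])
            except BlockingIOError:
                # EAGAIN: the kernel send buffer is full. Wait for write
                # readiness on this socket, then loop and retry.
                if not sel.select(timeout):
                    raise TimeoutError("socket not writable within timeout")
    finally:
        sel.close()
    return sent
```

The point is only that the wait after EAGAIN has to be for writability on the socket that refused the data, and that the remaining bytes have to be retried once it is ready.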
--bob
