As a detailed symptom of the problem, there
seems consistent behaviour with the following Transcript logging
Now
adding some Transcript logging into:
*
ZnResponse>>writeOn:
Transcript crShow: ' ' , statusLine printString.
*
ZnMessage>>writeOn:
Transcript crShow: ' ' , self headers printString.
*
ZnMultiThreadedServer>>writeResponseBad:on:
Transcript crShow: ' ' , response printString.
...that
SUCCESSFUL requests log CLIENT headers before the SERVER headers.
(CLIENT)
ZnHeaders('Accept'->'*/*' 'Connection'->'close'
'Host'->'localhost:1700'
'User-Agent'->'Zinc HTTP Components 1.0' )
013-ZnServer>Connected
(SERVER)
ZnHeaders('Connection'->'close' 'Content-Length'->'27'
'Content-Type'->'text/plain;charset=utf-8'
'Date'->'Sun, 20 Oct 2013 15:42:53 GMT' 'Server'->'Zinc HTTP
Components 1.0' )
FAILED
requests log the SERVER
headers
before the CLIENT
headers...
(SERVER)
ZnHeaders('Connection'->'close' 'Content-Length'->'27'
'Content-Type'->'text/plain;charset=utf-8'
'Date'->'Sun, 20 Oct 2013 15:42:53 GMT' 'Server'->'Zinc HTTP
Components 1.0' )
(CLIENT)
ZnHeaders('Accept'->'*/*' 'Connection'->'close'
'Host'->'localhost:1700'
'User-Agent'->'Zinc HTTP Components 1.0' )
013-ZnServer>Connected
On my system, the problem can be fixed by adding a delay at point "E"
Socket>>waitForAcceptFor:
timeout
self waitForConnectionFor: timeout ifTimedOut: [^ nil].
"(Delay
forMilliseconds: 100) wait. " "<---E"
^
self isConnected
ifTrue:[self
accept]
and made worse by adding a delay at point "F"
waitForConnectionFor:
timeout ifTimedOut: timeoutBlock
| startTime msecsDelta msecsEllapsed status |
"(Delay forMilliseconds: 200) wait." "<---F"
startTime := Time millisecondClockValue.
msecsDelta := (timeout * 1000) truncated.
status := self primSocketConnectionStatus: socketHandle.
[(status = WaitingForConnection) and: [(msecsEllapsed := Time
millisecondsSince: startTime) < msecsDelta]]
whileTrue: [
semaphore waitTimeoutMSecs: msecsDelta - msecsEllapsed.
status := self primSocketConnectionStatus: socketHandle].
status = Connected ifFalse: [^timeoutBlock value].
^ true
cheers, Ben
Sven Van Caekenberghe wrote:
Ben,
Thanks for spending time looking into this.
Now, you produced an enormous amount of information, but I am still confused.
I have to come back to you when I have more time to rerun my older unfinished experiments using Windows 8.1 in VirtualBox, so that I can reformulate how I see things.
Hopefully I will be able to understand what you were up to ;-)
Sven
On 20 Oct 2013, at 18:38, [email protected] wrote:
Sven,
I've spent the day poking around in different parts of ZnServers and Sockets and got about as far as I can. I've reported a lot of stuff on the Case [1], and some expert review would now be useful - but you might want to skip to the end where I suggest a way that the problem might be invoked/simulated at your end on an otherwise working Max or Linux system.
cheers -ben
[email protected] wrote:
While trying to debug this I found the error went away when I put a breakpoint to step through ZnMultiThreadedServer>>executeOneRequestResponseOn:
Presuming a race condition I played with adding delays until I managed to isolate the following case...
ZnMultiThreadedServer>>executeOneRequestResponseOn: stream
| request response |
(Delay forMilliseconds: 30) wait. "<----- <20 fails. >30 succeeds."
(request := self readRequestSafely: stream)
ifNil: [ ^ true ]
ifNotNil: [
response := self handleRequest: request.
self augmentResponse: response forRequest: request.
self writeResponseSafely: response on: stream.
response useConnection: stream.
].
^ request wantsConnectionClose or: [ response wantsConnectionClose ]
I've got more info coming, so I've raised ticket [1] to continue discussion.
cheers -ben
[1] https://pharo.fogbugz.com/default.asp?11960
Sven Van Caekenberghe wrote:
Ben,
The goal of the test is to confirm that the server returns an error (refuses a request) when any line, in this case the first so called request line, it too long. Hence the artificially constructed URL. Of course, if you make it shorter, at one point it will be short enough so that it no longer fails - but that defeats the purpose of the test ;-)
Now, the problem with this test and 3 other ones that are related, checking the reaction to other limits, is that they pass on Mac and Linux, but fail (randomly) on Windows, when ZdcSocketStream is used instead of SocketStream, currently the default. I have already looked into this, but up to now I failed to find the problem. The Socket class uses a number of primitives which have totally different implementations on the 3 platforms.
Yes, I could use some help, but this is not easy.
I also think there already is an issue for this, but I can't find it.
Sven
On 19 Oct 2013, at 18:54, [email protected] wrote:
>> [email protected] wrote:
>>> In the process of tying to review the slices for a few issues, I downloaded build #30500 Pharo3.0-win.zip from [1]. (Normally I use files.pharo.org but I wanted to ensure it was a "GREEN" build.) However before loading the slice to review I double-checked the baseline test results by running all tests from TestRunner. I had presumed all tests would be green but was surprised by some failures in the result. Well it maybe it has something to do with environment, but it would be nice for it to be clean to help isoalte any errors introduced by the slice under review. Before I log anything on Fogbugz in case someone can advise anything obvious, the Test Runner result is shown below [2].
My platform is 64-bit Windows 7 Home Premium Service Pack 1.
I'll try to follow up with more specific info on each line.
cheers -ben
[1] https://ci.inria.fr/pharo/job/Pharo-3.0-Update-Step-4-Publish/566/
[2] FAILURE ZnServerTests >> #testRequestLineTooLong
>> I extracted #testRequestLineTooLong to something I can evaluate from Workspace...
| numX url fudge |
fudge := 0.
numX := ZnConstants maximumLineLength - fudge . (url := ZnUrl fromString: 'http://localhost:1704/')
addPathSegment: #echo;
addPathSegment: (String new: numX withAll: $X).
ZnClient new
beOneShot;
logToTranscript;
url: url;
get;
response
That fails with initial fudge factor of 0 - with 'ZnUnknownHttpVerion.' Here is the Transcript (including logging from ZnServer)...
2013-10-20 00:45:40 489094 D Executing request/response loop
2013-10-20 00:45:40 505527 I Wrote a ZnRequest(GET /echo/XXXXXXXX...XXXXX)
2013-10-20 00:45:40 505527 D Sent headers
Accept: */*
User-Agent: Zinc HTTP Components 1.0
Connection: close
Host: localhost:1704
2013-10-20 00:45:40 489094 D ZnLineTooLong bad request while parsing
2013-10-20 00:45:40 489094 I Wrote a ZnResponse(400 Bad Request text/plain;charset=utf-8 27B)
2013-10-20 00:45:40 489094 D Closing streamZnStatusLine
With a fudge factor of 9 it still fails, but with 10 it succeeds. Interestingly, inspect 'url' shows...
http://localhost:1704/echo/XXXXXXXXXXXXXXXXXXXXXXXXXX...etc
...and 10 characters = 1704/echo/
but then correlation does not equal cause.
How should we proceed?
cheers -ben
>
|