Hi Geert,

Thanks for your reply.

I checked, and the response did not have a Content-Length header, so I looked 
at the HTTP spec to see what Content-Length should be set to for a 304 response.
"The 304 response MUST NOT contain a message-body":
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.5
"The presence of a message-body in a request is signaled by the inclusion of a 
Content-Length or Transfer-Encoding header":
http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.3

which makes me think that a 304 response should not have a Content-Length.
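
To make that reading of the spec concrete, here is a minimal Python sketch of 
the message-length rules in RFC 2616 section 4.4. The function name 
`body_framing` and its return labels are made up for illustration, and the 
multipart/byteranges case is left out, so treat it as a rough model rather 
than a full implementation.

```python
def body_framing(method, status, headers):
    """Return how a client should find the end of an HTTP/1.1 response.

    Sketch of RFC 2616 section 4.4; multipart/byteranges is not handled.
    """
    # Header names are case-insensitive, so normalise them first.
    h = {name.lower(): value.strip() for name, value in headers.items()}

    # Rule 1: these responses end at the blank line after the headers,
    # regardless of any Content-Length or Transfer-Encoding present.
    if method.upper() == "HEAD" or 100 <= status < 200 or status in (204, 304):
        return "none"

    # Rule 2: a non-identity Transfer-Encoding means chunked framing.
    if h.get("transfer-encoding", "identity").lower() != "identity":
        return "chunked"

    # Rule 3: otherwise Content-Length gives the body size.
    if "content-length" in h:
        return "content-length"

    # Rule 5: with nothing else to go on, the body runs until the server closes.
    return "until-close"
```

Under these rules a 304 is framed the same way with or without a 
Content-Length header: the client should stop reading at the blank line.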

Even so, I tried forcing my remote repository to return a Content-Length of 0, 
but the header was not getting back to MarkLogic. I thought this might be 
because it was detecting the lack of a message body and therefore not setting 
the header, so I decided to break the HTTP spec and include a body. Again, the 
Content-Length and body were missing from the response. Looking into the 
application server itself (Tomcat), it looks as if a response with a 304 
status will always have its body and Content-Length header filtered out 
(probably to force it to comply with the HTTP spec):
https://github.com/apache/tomcat60/blob/94b4cf497377e48b116c0bed7ccada0f47cd9c10/java/org/apache/coyote/http11/Http11NioProcessor.java#L1649
indicates that the following is used to remove the content body:
https://github.com/apache/tomcat60/blob/94b4cf497377e48b116c0bed7ccada0f47cd9c10/java/org/apache/coyote/http11/filters/VoidOutputFilter.java
and the following will cause the Content-Length to be removed:
https://github.com/apache/tomcat60/blob/94b4cf497377e48b116c0bed7ccada0f47cd9c10/java/org/apache/coyote/http11/Http11NioProcessor.java#L1442

So it looks like I can't force the remote server to set a zero Content-Length, 
even if that would stop the MarkLogic call from waiting for the full timeout.
What do you think: is this a bug in Tomcat or in MarkLogic?

Thanks again,
Chris

From: Geert Josten [mailto:[email protected]]
Sent: 08 January 2015 10:44
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] XDMP:http-get and 304 responses

Hi Chris,

Does the response contain a Content-Length? If not, maybe MarkLogic waits the 
full timeout before it decides there is none. If it has one (with a value of 
zero), that might be a bug.

Kind regards,
Geert

From: Chris Hudson-Silver <[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Thursday, January 8, 2015 at 11:12 AM
To: "[email protected]" <[email protected]>
Subject: [MarkLogic Dev General] XDMP:http-get and 304 responses

Hi All,

Recently I was working on a project that tracks a repository by calling a REST 
web service that returns metadata and download URLs for items that have 
changed in the remote repository since the last call. It then checks whether 
the item has already been downloaded and, if so, calls the download URL with 
the HTTP cache headers set, as the modification could have been to metadata 
only, not content. E.g.:

let $options :=
  <options xmlns="xdmp:http">
    <headers>
      <If-Modified-Since>Fri, 21 Nov 2014 16:53:12 GMT</If-Modified-Since>
      <If-None-Match>"1416588792000"</If-None-Match>
    </headers>
    <repair xmlns="xdmp:document-get">full</repair>
  </options>
let $response := xdmp:http-get($url, $options)

I noticed that the run time for this was considerably longer when some of the 
items returned a 304 Not Modified response, so I decided to test whether it 
was the remote repository or MarkLogic adding the overhead. I did this by 
creating a script that generated cURL commands so I could make the exact same 
requests from the command line and from MarkLogic.
The calls from the command line and from MarkLogic returned the exact same 
response, including the correct 304 code and an empty response body.
The calls from the command line took about 20 seconds less than the calls from 
MarkLogic, and seeing that the global timeout was set to 20 seconds, I decided 
to try the MarkLogic calls with a 1 second timeout, e.g.:


let $options :=
  <options xmlns="xdmp:http">
    <timeout>1</timeout>
    <headers>
      <If-Modified-Since>Fri, 21 Nov 2014 16:53:12 GMT</If-Modified-Since>
      <If-None-Match>"1416588792000"</If-None-Match>
    </headers>
    <repair xmlns="xdmp:document-get">full</repair>
  </options>

and now they are taking approximately 1 second longer than the calls from the 
command line.
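
For anyone wanting to repeat the command-line side of this comparison, a 
generator for the cURL commands could look roughly like the Python sketch 
below. The function name `curl_command` and the example URL are hypothetical, 
and it is not the script actually used here. It only relies on standard curl 
options (`--header`, `--silent`, `--output`, `--write-out`).

```python
import shlex


def curl_command(url, last_modified, etag):
    """Build a shell-safe curl command for a conditional GET.

    The command discards the body and prints the status code and total time.
    """
    args = [
        "curl", "--silent", "--output", "/dev/null",
        # curl itself expands the \n escape in the --write-out format.
        "--write-out", "%{http_code} %{time_total}\\n",
        "--header", f"If-Modified-Since: {last_modified}",
        "--header", f"If-None-Match: {etag}",
        url,
    ]
    return shlex.join(args)


if __name__ == "__main__":
    # Hypothetical download URL; the header values are the ones from the thread.
    print(curl_command(
        "http://example.com/items/1/download",
        "Fri, 21 Nov 2014 16:53:12 GMT",
        '"1416588792000"',
    ))
```

Using `shlex.join` keeps the quoted ETag value intact when the command is 
pasted into a shell.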

Has anyone else encountered this?
Could it be that MarkLogic is waiting for the response body even though it has 
received a valid response header? The HTTP 1.1 standard states that a response 
does not necessarily need a response body, so if this is the case it may be a 
bug in MarkLogic's HTTP module.
Or am I missing something vital in my request options?
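
The suspected behaviour can be reproduced in isolation with a small Python 
sketch. It uses `socket.socketpair()` as a stand-in for a keep-alive 
connection on which the server has sent a 304 with no Content-Length and left 
the socket open. The names `timed_read` and `is_bodiless` are made up for 
illustration, and this is only a model of the hypothesis, not a description 
of how MarkLogic is actually implemented.

```python
import socket
import time

RESPONSE_304 = (
    b"HTTP/1.1 304 Not Modified\r\n"
    b'ETag: "1416588792000"\r\n'
    b"\r\n"
)


def is_bodiless(status):
    """True for status codes whose responses never carry a body (RFC 2616, 4.4)."""
    return 100 <= status < 200 or status in (204, 304)


def read_head(sock):
    """Read up to and including the blank line that ends the headers."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = sock.recv(4096)
        if not chunk:
            break
        data += chunk
    return data


def timed_read(wait_for_body, timeout):
    """Return the seconds a client spends reading a 304 from an open connection."""
    server, client = socket.socketpair()
    try:
        client.settimeout(timeout)
        server.sendall(RESPONSE_304)  # the server side stays open afterwards
        start = time.monotonic()
        head = read_head(client)
        status = int(head.split(b" ", 2)[1])
        if wait_for_body or not is_bodiless(status):
            # Keep reading for a body that never arrives until the timeout fires.
            try:
                while client.recv(4096):
                    pass
            except socket.timeout:
                pass
        return time.monotonic() - start
    finally:
        server.close()
        client.close()
```

With a 1 second timeout, `timed_read(True, 1.0)` should take about a second, 
while `timed_read(False, 1.0)` should return almost immediately, which is the 
same pattern as the extra delay matching the configured timeout.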

Thanks in advance,

Chris

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general