This is not an ideal solution, but if it does turn out to be a bug that you
have to live with for now, one workaround that avoids the delay, at the
expense of an extra round-trip, is:

- first do a HEAD request using xdmp:http-head()
- decide locally whether the doc needs to be downloaded (assuming the
Last-Modified header is returned)
- then do a regular fetch without the If-Modified-Since header (since you
already worked that out from the HEAD request)
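
A minimal sketch of those three steps, untested against your setup. The
$url and $stored-last-modified values are made up for illustration, and I
am assuming the HEAD response comes back as the usual xdmp:http response
element with a headers child. It matches the header name
case-insensitively rather than relying on how MarkLogic cases it:

```xquery
xquery version "1.0-ml";

declare namespace http = "xdmp:http";

(: Hypothetical inputs, for illustration only. :)
declare variable $url as xs:string := "http://example.com/repository/item/1";
declare variable $stored-last-modified as xs:string := "Fri, 21 Nov 2014 16:53:12 GMT";

(: Step 1: HEAD request; the first item returned describes the response. :)
let $head := xdmp:http-head($url)[1]

(: Match the header name case-insensitively rather than assume its casing. :)
let $last-modified :=
  ($head/http:headers/*[fn:lower-case(fn:local-name(.)) eq "last-modified"])[1]

(: Step 2: decide locally. A plain string comparison treats any difference
   as a change; if the header is missing, download to be safe. :)
let $unchanged :=
  fn:exists($last-modified)
  and fn:normalize-space(fn:string($last-modified)) eq $stored-last-modified

return
  if ($unchanged)
  then ()
  else
    (: Step 3: regular fetch without the conditional headers. :)
    xdmp:http-get($url,
      <options xmlns="xdmp:http">
        <repair xmlns="xdmp:document-get">full</repair>
      </options>)
```

The string comparison is deliberately crude: it only answers "same as last
time or not". A strict "newer than" check would need the HTTP date parsed
into an xs:dateTime first, which I have left out.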

Again, not ideal, but if xdmp:http-get truly waits for the timeout, then
perhaps this is of value in the meantime.





Kind Regards,
David Ennis



On 8 January 2015 at 20:13, Geert Josten <[email protected]> wrote:

>  Hi Chris,
>
>  It is not uncommon to be strict in sending, tolerant in receiving with
> such things. I would recommend sending your case to Support. The delay
> sounds unnecessary and inconvenient.
>
>  Kind regards,
> Geert
>
>   From: Chris Hudson-Silver <[email protected]>
> Reply-To: MarkLogic Developer Discussion <[email protected]>
> Date: Thursday, January 8, 2015 at 6:22 PM
> To: MarkLogic Developer Discussion <[email protected]>
> Subject: Re: [MarkLogic Dev General] XDMP:http-get and 304 responses
>
>   Hi Geert,
>
>
>
> Thanks for your reply.
>
>
>
> I checked and the response did not have a content-length so I checked the
> HTTP spec to see what the content-length should be set to for a 304
> response.
>
> “The 304 response MUST NOT contain a message-body”:
>
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.5
>
> “The presence of a message-body in a request is signaled by the inclusion
> of a Content-Length or Transfer-Encoding header”:
>
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.3
>
>
>
> which makes me think that a 304 response should not have a Content-Length.
>
>
>
> Even so, I tried forcing my remote repository to send a Content-Length of 0,
> but the header was not getting back to MarkLogic. I thought this might be
> because it was detecting the lack of a message body and therefore not setting
> the header, so I decided to break the HTTP spec and include a body, but again
> the Content-Length and body were missing from the response. Looking into the
> application server itself (Tomcat), it looks as if a response with a 304
> status will always have the content body and Content-Length header filtered
> out (probably to force it to comply with the HTTP spec):
>
>
> https://github.com/apache/tomcat60/blob/94b4cf497377e48b116c0bed7ccada0f47cd9c10/java/org/apache/coyote/http11/Http11NioProcessor.java#L1649
>
> indicates that the following is used to remove the content body
>
> https://github.com/apache/tomcat60/blob/94b4cf497377e48b116c0bed7ccada0f47cd9c10/java/org/apache/coyote/http11/filters/VoidOutputFilter.java
>
> and the following will cause the content-length to be removed
>
>
> https://github.com/apache/tomcat60/blob/94b4cf497377e48b116c0bed7ccada0f47cd9c10/java/org/apache/coyote/http11/Http11NioProcessor.java#L1442
>
>
>
> So it looks like I can’t force the remote server to set a zero Content-Length,
> even if that would stop the MarkLogic call from waiting for the full timeout.
>
> What do you think: is this a bug in Tomcat or in MarkLogic?
>
>
>
> Thanks again,
>
> Chris
>
>
>
> *From:* Geert Josten [mailto:[email protected]
> <[email protected]>]
> *Sent:* 08 January 2015 10:44
> *To:* MarkLogic Developer Discussion
> *Subject:* Re: [MarkLogic Dev General] XDMP:http-get and 304 responses
>
>
>
> Hi Chris,
>
>
>
> Does the response contain a Content-Length? If not, maybe MarkLogic waits
> the full timeout before it decides there is none. If it has one (with a
> value of zero), that might be a bug.
>
>
>
> Kind regards,
>
> Geert
>
>
>
> *From: *Chris Hudson-Silver <[email protected]>
> *Reply-To: *MarkLogic Developer Discussion <
> [email protected]>
> *Date: *Thursday, January 8, 2015 at 11:12 AM
> *To: *"[email protected]" <[email protected]>
> *Subject: *[MarkLogic Dev General] XDMP:http-get and 304 responses
>
>
>
> Hi All,
>
>
>
> Recently I was working on a project that tracks a repository by calling a
> REST web service that returns metadata and download URLs for items that
> have changed in the remote repository since the last call. It then checks
> whether the item has already been downloaded and, if so, calls the
> download URL with the HTTP cache headers set, as the modification could have
> been just metadata, not content. E.g.:
>
>
>
> let $options :=
>   <options xmlns="xdmp:http">
>     <headers>
>       <If-Modified-Since>Fri, 21 Nov 2014 16:53:12 GMT</If-Modified-Since>
>       <If-None-Match>"1416588792000"</If-None-Match>
>     </headers>
>     <repair xmlns="xdmp:document-get">full</repair>
>   </options>
> let $response := xdmp:http-get($url, $options)
>
>
>
> I noticed that the run time for this was considerably longer when some of
> the items returned a 304 Not Modified response, so I decided to test
> whether it was the remote repository or MarkLogic adding the overhead. I did
> this by creating a script that generated curl commands so I could make the
> exact same requests from the command line and from MarkLogic.
>
> The calls from the command line and from MarkLogic were returning the exact
> same response, including the correct 304 code and an empty response body.
>
> The calls from the command line were taking about 20 seconds less
> than the calls from MarkLogic, and seeing as the global timeout was set to
> 20 seconds, I decided to try the MarkLogic calls with a 1 second
> timeout, e.g.:
>
>
>
> let $options :=
>   <options xmlns="xdmp:http">
>     <timeout>1</timeout>
>     <headers>
>       <If-Modified-Since>Fri, 21 Nov 2014 16:53:12 GMT</If-Modified-Since>
>       <If-None-Match>"1416588792000"</If-None-Match>
>     </headers>
>     <repair xmlns="xdmp:document-get">full</repair>
>   </options>
>
>
>
> and now they are taking approximately 1 second longer than the calls from
> the command line.
>
>
>
> Has anyone else encountered this?
>
> Could it be that MarkLogic is waiting for the response body even though it
> has received a valid response header? The HTTP 1.1 standard states that a
> response does not necessarily need a response body, so if this is the case
> it may be a bug in MarkLogic's HTTP module.
>
> Or am I missing something vital in my request options?
>
>
>
> Thanks in advance,
>
>
>
> Chris
>
>
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
>
>
