This is not an ideal solution, but if it does turn out to be a bug that you have to live with for now, here is one workaround that avoids the delay at the expense of an extra round-trip:
- first do a HEAD request using xdmp:http-head()
- locally decide whether the document needs to be downloaded (assuming the Last-Modified header is returned)
- then do a regular fetch without the If-Modified-Since header (since you already worked that out from the HEAD request)

Again, not ideal, but if xdmp:http-get truly waits for the timeout, then perhaps this is of value in the meantime.

Kind Regards,
David Ennis
Content Engineer, HintTech <http://www.hinttech.com/>

On 8 January 2015 at 20:13, Geert Josten <[email protected]> wrote:

> Hi Chris,
>
> It is not uncommon to be strict in sending, tolerant in receiving with such things. I would recommend sending your case to Support. The delay sounds unnecessary and inconvenient.
>
> Kind regards,
> Geert
>
> From: Chris Hudson-Silver <[email protected]>
> Reply-To: MarkLogic Developer Discussion <[email protected]>
> Date: Thursday, January 8, 2015 at 6:22 PM
> To: MarkLogic Developer Discussion <[email protected]>
> Subject: Re: [MarkLogic Dev General] XDMP:http-get and 304 responses
>
> Hi Geert,
>
> Thanks for your reply.
>
> I checked and the response did not have a Content-Length, so I checked the HTTP spec to see what the Content-Length should be set to for a 304 response.
>
> “The 304 response MUST NOT contain a message-body”:
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.5
>
> “The presence of a message-body in a request is signaled by the inclusion of a Content-Length or Transfer-Encoding header”:
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.3
>
> which makes me think that a 304 response should not have a Content-Length.
>
> Even so, I tried forcing my remote repository to send a Content-Length of 0, but the header was not getting back to MarkLogic. I thought this might be because it was detecting the lack of a message body and therefore not setting the header, so I decided to break the HTTP spec and include a body, but again the Content-Length and body were missing from the response. Looking into the application server itself (Tomcat), it looks as if a response with a 304 status will always have its body and Content-Length header filtered out (probably to force it to comply with the HTTP spec):
>
> https://github.com/apache/tomcat60/blob/94b4cf497377e48b116c0bed7ccada0f47cd9c10/java/org/apache/coyote/http11/Http11NioProcessor.java#L1649
>
> indicates that the following is used to remove the content body
>
> https://github.com/apache/tomcat60/blob/94b4cf497377e48b116c0bed7ccada0f47cd9c10/java/org/apache/coyote/http11/filters/VoidOutputFilter.java
>
> and the following will cause the Content-Length to be removed
>
> https://github.com/apache/tomcat60/blob/94b4cf497377e48b116c0bed7ccada0f47cd9c10/java/org/apache/coyote/http11/Http11NioProcessor.java#L1442
>
> So it looks like I can’t force the remote server to set a zero Content-Length, even if that would keep the MarkLogic call from waiting for the full timeout.
>
> What do you think: is this a bug in Tomcat or in MarkLogic?
>
> Thanks again,
> Chris
>
> From: Geert Josten [mailto:[email protected]]
> Sent: 08 January 2015 10:44
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] XDMP:http-get and 304 responses
>
> Hi Chris,
>
> Does the response contain a Content-Length? If not, maybe MarkLogic waits the full timeout before it decides there is none. If it has one (with a value of zero), that might be a bug.
>
> Kind regards,
> Geert
>
> From: Chris Hudson-Silver <[email protected]>
> Reply-To: MarkLogic Developer Discussion <[email protected]>
> Date: Thursday, January 8, 2015 at 11:12 AM
> To: "[email protected]" <[email protected]>
> Subject: [MarkLogic Dev General] XDMP:http-get and 304 responses
>
> Hi All,
>
> Recently I was working on a project that tracks a repository by calling a REST web service that returns metadata and download URLs for items that have changed in the remote repository since the last call. It then checks whether the item has already been downloaded and, if so, calls the download URL with the HTTP cache headers set, as the modification could have been just metadata, not content. E.g.:
>
>     let $options :=
>       <options xmlns="xdmp:http">
>         <headers>
>           <If-Modified-Since>Fri, 21 Nov 2014 16:53:12 GMT</If-Modified-Since>
>           <If-None-Match>"1416588792000"</If-None-Match>
>         </headers>
>         <repair xmlns="xdmp:document-get">full</repair>
>       </options>
>     let $response := xdmp:http-get($url, $options)
>
> I noticed that the run time for this was considerably longer if some of the items returned a 304 Not Modified response, so I decided to test whether it was the remote repository or MarkLogic adding the overhead. I did this by creating a script that generated curl commands, so I could make exactly the same requests from the command line and from MarkLogic.
>
> The calls from the command line and from MarkLogic returned exactly the same response, including the correct 304 code and an empty response body.
>
> The calls from the command line were taking about 20 seconds less than the calls from MarkLogic, and since the global timeout was set to 20 seconds I decided to try the MarkLogic calls with a 1-second timeout, e.g.:
>
>     let $options :=
>       <options xmlns="xdmp:http">
>         <timeout>1</timeout>
>         <headers>
>           <If-Modified-Since>Fri, 21 Nov 2014 16:53:12 GMT</If-Modified-Since>
>           <If-None-Match>"1416588792000"</If-None-Match>
>         </headers>
>         <repair xmlns="xdmp:document-get">full</repair>
>       </options>
>
> and now they are taking approximately 1 second longer than the calls from the command line.
>
> Has anyone else encountered this?
>
> Could it be that MarkLogic is waiting for the response body even though it has received a valid response header? The HTTP 1.1 standard states that a response does not necessarily need a response body, so if this is the case it may be a bug in MarkLogic's HTTP module.
>
> Or am I missing something vital in my request options?
>
> Thanks in advance,
>
> Chris
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
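
For reference, the HEAD-then-GET workaround described at the top of this message might look roughly like the sketch below. It is untested and rests on several assumptions: the URL and the stored Last-Modified value are made-up placeholders, the 5-second timeout is arbitrary, the "has it changed" check is a plain string comparison against the value saved at the last download, and it assumes xdmp:http-head does not show the same delay as the conditional xdmp:http-get.

```xquery
xquery version "1.0-ml";

declare namespace http = "xdmp:http";

(: Hypothetical inputs: the download URL and the Last-Modified value
   that was stored when the document was last fetched. :)
declare variable $url as xs:string :=
  "http://example.com/repository/item/1";
declare variable $stored-last-modified as xs:string :=
  "Fri, 21 Nov 2014 16:53:12 GMT";

(: Step 1: HEAD request. The first item returned is the response element. :)
let $head :=
  xdmp:http-head($url,
    <options xmlns="xdmp:http"><timeout>5</timeout></options>)[1]

(: Look the header up case-insensitively rather than assume its casing. :)
let $last-modified :=
  fn:string(
    ($head/http:headers/*[fn:lower-case(fn:local-name(.)) eq "last-modified"])[1]
  )

(: Step 2: decide locally. A missing header is treated as "changed". :)
let $changed :=
  $last-modified eq "" or $last-modified ne $stored-last-modified

(: Step 3: plain GET without If-Modified-Since, only when needed. :)
return
  if ($changed)
  then
    xdmp:http-get($url,
      <options xmlns="xdmp:http">
        <repair xmlns="xdmp:document-get">full</repair>
      </options>)
  else ()
```

An empty result means the stored copy is still current. Comparing the strings avoids parsing HTTP dates, but it would re-download if the server ever reported an older date, which seems acceptable for a stopgap.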
