Re: [CODE4LIB] internet explorer and pdf files
Eric wrote:

> Unfortunately IE's behavior is weird. The first time someone tries to load one of these URLs nothing happens. When someone tries to load another one, it loads just fine. When they re-try the first one, it loads. We are banging our heads against the wall here at Catholic Pamphlet Central. Networking issue? Port issue? IE PDF plug-in? Invalid HTTP headers? On-campus versus off-campus issue?

Thank you for all the replies.

We're not one hundred percent positive, but we think the issue with IE has something to do with headers. As alluded to previously, IE needs/desires file name extensions in order to know what to do with incoming files. We are serving these PDF documents from Fedora, which is sending out a stream, not necessarily a file. Apparently this confuses IE.

Since Fedora is not really designed to be a file server, we will write a piece of intermediary software to act as a go-between. This isn't really a big deal since all of our other implementations of Fedora are expected to work in the same way. Wish us luck.

--
Eric Lease Morgan
University of Notre Dame
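For illustration only, here is a minimal sketch of what such a go-between might look like as a Python WSGI application. It is not the software Notre Dame wrote. The URL layout (/<pid>/<datastream>.pdf), the names make_app and fetch_upstream, and the choice to buffer the whole PDF in memory are all assumptions made for the example; whether this actually cures IE's behavior is untested.

```python
from typing import Callable
from urllib.request import urlopen

# Repository base URL as it appears in the thread; change for other hosts.
UPSTREAM = "http://fedoraprod.library.nd.edu:8080/fedora/get"


def fetch_upstream(url: str) -> bytes:
    """Read the whole upstream stream so that its length is known."""
    with urlopen(url) as response:
        return response.read()


def make_app(fetch: Callable[[str], bytes] = fetch_upstream):
    """Build a WSGI app mapping /<pid>/<datastream>.pdf to the repository.

    The mapping and the generated file name are hypothetical conventions.
    """

    def not_found(start_response):
        body = b"Not found\n"
        start_response("404 Not Found", [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    def app(environ, start_response):
        path = environ.get("PATH_INFO", "").strip("/")
        if not path.endswith(".pdf"):
            return not_found(start_response)
        # Drop the ".pdf" suffix, then split into PID and datastream name.
        pid, _, datastream = path[: -len(".pdf")].partition("/")
        if not pid or not datastream:
            return not_found(start_response)

        body = fetch(f"{UPSTREAM}/{pid}/{datastream}")
        filename = pid.replace(":", "-") + ".pdf"
        start_response("200 OK", [
            ("Content-Type", "application/pdf"),
            # Buffering the body lets us state its exact length.
            ("Content-Length", str(len(body))),
            # "inline" asks the browser to display rather than download.
            ("Content-Disposition", f'inline; filename="{filename}"'),
        ])
        return [body]

    return app
```

Served with any WSGI server (for example wsgiref.simple_server.make_server from the standard library), a request for /CATHOLLIC-PAMPHLET:1000793/PDF1.pdf would be fetched from the repository and returned with a .pdf URL suffix, a Content-Length, and a file name. Buffering whole files is only reasonable for small documents.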
Re: [CODE4LIB] internet explorer and pdf files
On Wed, Aug 31, 2011 at 8:42 AM, Eric Lease Morgan emor...@nd.edu wrote:

> We're not one hundred percent positive, but we think the issue with IE has something to do with headers. As alluded to previously, IE needs/desires file name extensions in order to know what to do with incoming files. We are serving these PDF documents from Fedora, which is sending out a stream, not necessarily a file. Apparently this confuses IE.
>
> Since Fedora is not really designed to be a file server, we will write a piece of intermediary software to act as a go-between. This isn't really a big deal since all of our other implementations of Fedora are expected to work in the same way. Wish us luck.

FWIW, this is true for any and all HTTP servers. Only the client's request specifies a name (as the path component of the request, e.g., /fedora/get/CATHOLLIC-PAMPHLET:1000793/PDF1 in http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:1000793/PDF1). The server's reply does not contain a name at all. It simply specifies the type and, typically, the length of the returned content. The returned content itself is just a blob of bytes.

Your server says this blob of bytes is a PDF object (application/pdf), but it doesn't specify the length. Not specifying the length makes the job of the client slightly more difficult, which is why the HTTP/1.1 specification discourages it; the client now has to read the stream until the server closes the connection.

It is certainly possible that IE's PDF plug-in is not prepared to deal with this situation, and I would certainly fix this first.

- Godmar
Re: [CODE4LIB] internet explorer and pdf files
Works fine on my computer...

Are both Adobe Reader and Windows Updates current? I had this issue on a book-keeper's computer... installed a more recent version of Adobe Reader, and that seemed to fix it.

James Gilbert, BS, MLIS
Systems Librarian
Whitehall Township Public Library
3700 Mechanicsville Road
Whitehall, PA 18052
610-432-4330 ext: 203

-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Eric Lease Morgan
Sent: Monday, August 29, 2011 3:31 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] internet explorer and pdf files

I need some technical support when it comes to Internet Explorer (IE) and PDF files.

Here at Notre Dame we have deposited a number of PDF files in a Fedora repository. Some of these PDF files are available at the following URLs:

  * http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:1000793/PDF1
  * http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:832898/PDF1
  * http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:999332/PDF1
  * http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:832657/PDF1
  * http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:1001919/PDF1
  * http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:832818/PDF1
  * http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:834207/PDF1

Retrieving the URLs with any browser other than IE works just fine. Unfortunately IE's behavior is weird. The first time someone tries to load one of these URLs nothing happens. When someone tries to load another one, it loads just fine. When they re-try the first one, it loads. We are banging our heads against the wall here at Catholic Pamphlet Central. Networking issue? Port issue? IE PDF plug-in? Invalid HTTP headers? On-campus versus off-campus issue?

Could some of y'all try to load some of the URLs with IE and tell me your experience? Other suggestions would be greatly appreciated as well.

--
Eric Lease Morgan
University of Notre Dame
(574) 631-8604
Re: [CODE4LIB] internet explorer and pdf files
On Aug 29, 2011, at 3:38 PM, James Gilbert wrote:

> Works fine on my computer... Are both Adobe Reader and Windows Updates current? I had this issue on a book-keeper's computer... installed more recent version of Adobe Reader, and seemed to fix it.

I don't know if things are up-to-date on the local hardware, but if they weren't that would be a bummer. We can't expect people to update their operating system just to load our PDF files.

I suppose I ought to put some of our PDF files on a different host and see whether or not IE exhibits the same behavior. Hmmm…

--
Eric Morgan
Re: [CODE4LIB] internet explorer and pdf files
Eric, Works fine in IE9 on my machine.
Re: [CODE4LIB] internet explorer and pdf files
Earlier versions of IE were known to sometimes disregard the Content-Type (which you set correctly to application/pdf) and look at the suffix of the URL instead. For instance, they would render HTML if you served a .html as text/plain, etc.

You may try creating URLs that end with .pdf.

Separately, you're not sending a Content-Length header:

    HTTP request sent, awaiting response...
      HTTP/1.1 200 OK
      Server: Apache-Coyote/1.1
      Pragma: No-cache
      Cache-Control: no-cache
      Expires: Wed, 31 Dec 1969 19:00:00 EST
      Content-Type: application/pdf
      Date: Mon, 29 Aug 2011 19:47:27 GMT
      Connection: close
    Length: unspecified [application/pdf]

which disregards RFC 2616, http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.13
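As a rough way to repeat this check without reading header dumps by eye, the sketch below parses a block of response headers with Python's standard email parser and reports how the body length is being signalled. The function name summarize_headers and the keys of the returned dictionary are made up for the example; the sample headers are the ones quoted in this message.

```python
from email import message_from_string


def summarize_headers(raw_headers: str) -> dict:
    """Report how a response describes its body.

    `raw_headers` holds the header lines only (no status line), one per line.
    """
    msg = message_from_string(raw_headers)
    return {
        "content_type": msg.get_content_type(),
        "has_content_length": msg["Content-Length"] is not None,
        "is_chunked": "chunked" in msg.get("Transfer-Encoding", "").lower(),
        "closes_connection": msg.get("Connection", "").lower() == "close",
    }


# Header lines as quoted in the thread (status line omitted).
SAMPLE = "\n".join([
    "Server: Apache-Coyote/1.1",
    "Pragma: No-cache",
    "Cache-Control: no-cache",
    "Expires: Wed, 31 Dec 1969 19:00:00 EST",
    "Content-Type: application/pdf",
    "Date: Mon, 29 Aug 2011 19:47:27 GMT",
    "Connection: close",
])

if __name__ == "__main__":
    print(summarize_headers(SAMPLE))
```

For the sample, the summary shows a PDF content type with neither a Content-Length nor chunked encoding, so the client can only find the end of the body when the server closes the connection.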
Re: [CODE4LIB] internet explorer and pdf files
I don't have IE to test from, but it's been my experience that past versions of IE would use the file's extension no matter what MIME type was sent.

I'd first see if you can trick IE ... it looks like Fedora doesn't like you sending extra stuff in PATH_INFO, so you might have to abuse QUERY_STRING for this:

  http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:1000793/PDF1/?filename.pdf
  http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:1000793/PDF1/?file=filename.pdf

If either of those works fine in IE, but the original URL (without the query string) doesn't, that's the problem.

I don't know what's possible in Fedora, so I don't know if it's possible to do some URL re-writing so it'd always serve something that IE accepts as a PDF.

If you could insert an extra HTTP header, you might be able to trick it with Content-Disposition, but that'll also tell some browsers to download the file rather than display it themselves:

  http://www.ietf.org/rfc/rfc2183.txt

-Joe
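To try the same trick against each of the pamphlet URLs, the two query-string variants can be generated mechanically. This is only a convenience sketch; the function name pdf_hint_variants is invented here, and "filename.pdf" is the placeholder name used in the message above.

```python
from urllib.parse import urlsplit, urlunsplit


def pdf_hint_variants(url: str, filename: str = "filename.pdf") -> list:
    """Return the two test URLs: one ending '/?<name>', one '/?file=<name>'."""
    parts = urlsplit(url)
    # Make sure the path ends with exactly one slash before the query string.
    path = parts.path.rstrip("/") + "/"
    return [
        urlunsplit((parts.scheme, parts.netloc, path, filename, "")),
        urlunsplit((parts.scheme, parts.netloc, path, "file=" + filename, "")),
    ]


if __name__ == "__main__":
    base = ("http://fedoraprod.library.nd.edu:8080/fedora/get/"
            "CATHOLLIC-PAMPHLET:1000793/PDF1")
    for variant in pdf_hint_variants(base):
        print(variant)
```

Each generated URL still has to be opened in IE by hand; the code only saves typing.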
Re: [CODE4LIB] internet explorer and pdf files
Hi Eric,

As I recall, IE fetches PDFs oddly sometimes. It will do a GET, then interrupt it, then GET the favicon.ico, then resume the original GET using a Range header to request the rest of the PDF. (This info is copied from an email from May 2008, so it may be out of date.) Your server might not like that kind of abuse.

Wireshark can be your friend here.

Hope that helps!

best,
Erik

Sent from my free software system http://fsf.org/.
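If a resumed GET with a Range header is indeed what IE sends, the server (or a go-between in front of it) has to answer with a partial response. The following is a simplified sketch of that logic for a single byte range over an in-memory body. The function name apply_range and the tuple it returns are assumptions for the example, and multi-range requests are deliberately ignored.

```python
import re
from typing import Dict, Optional, Tuple

Response = Tuple[int, Dict[str, str], bytes]


def apply_range(body: bytes, range_header: Optional[str]) -> Response:
    """Answer a single 'bytes=first-last' range request against `body`."""
    total = len(body)
    full = (200, {"Content-Length": str(total), "Accept-Ranges": "bytes"}, body)

    match = re.fullmatch(r"bytes=(\d*)-(\d*)", (range_header or "").strip())
    if match is None:
        return full  # no header, or a form this sketch does not handle
    first, last = match.groups()
    if first == "" and last == "":
        return full

    if first == "":
        # Suffix form "bytes=-N": the final N bytes of the body.
        start = max(total - int(last), 0)
        end = total - 1
    else:
        start = int(first)
        if last and int(last) < start:
            return full  # malformed range: ignore the header
        end = min(int(last), total - 1) if last else total - 1

    if start >= total:
        # Nothing to send: the requested range lies beyond the body.
        return 416, {"Content-Range": f"bytes */{total}"}, b""

    payload = body[start:end + 1]
    headers = {
        "Content-Range": f"bytes {start}-{end}/{total}",
        "Content-Length": str(len(payload)),
        "Accept-Ranges": "bytes",
    }
    return 206, headers, payload
```

A resumed request such as "Range: bytes=4-" against a ten-byte body would get status 206 with "Content-Range: bytes 4-9/10". A server that streams content of unknown length cannot easily produce that header, which may be one reason the interrupted-and-resumed fetch misbehaves.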
Re: [CODE4LIB] internet explorer and pdf files
On Aug 29, 2011, at 3:52 PM, Godmar Back wrote:

> Earlier versions of IE were known to sometimes disregard the Content-Type (which you set correctly to application/pdf) and look at the suffix of the URL instead. For instance, they would render HTML if you served a .html as text/plain, etc.
>
> You may try creating URLs that end with .pdf
>
> Separately, you're not sending a Content-Length header:
>
>     HTTP request sent, awaiting response...
>       HTTP/1.1 200 OK
>       Server: Apache-Coyote/1.1
>       Pragma: No-cache
>       Cache-Control: no-cache
>       Expires: Wed, 31 Dec 1969 19:00:00 EST
>       Content-Type: application/pdf
>       Date: Mon, 29 Aug 2011 19:47:27 GMT
>       Connection: close
>     Length: unspecified [application/pdf]
>
> which disregards RFC 2616, http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.13

RFC2616 says 'SHOULD' for that section. HTTP/1.1 clients *must* support chunked encoding:

  http://en.wikipedia.org/wiki/Chunked_transfer_encoding

(which is why any time I write an HTTP client, I always claim to be HTTP/1.0, so I don't have to support it)

If the data's stored on disk compressed, and being decompressed on the fly, it's pretty typical to not send Content-Length. (Although you could argue that they should save it when storing the value, so it's available when serving without needing to decompress first.)

-Joe
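For readers who have not met chunked encoding, the sketch below shows roughly what an HTTP/1.1 client has to do to reassemble a chunked body: read a hexadecimal size line, read that many bytes, and repeat until a zero-size chunk. The function name decode_chunked is invented for the example, and trailer headers after the last chunk are simply ignored.

```python
def decode_chunked(raw: bytes) -> bytes:
    """Reassemble a body sent with 'Transfer-Encoding: chunked'."""
    body = bytearray()
    pos = 0
    while True:
        # Each chunk starts with "<hex size>[;extensions]\r\n".
        line_end = raw.index(b"\r\n", pos)
        size_field = raw[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # the zero-size chunk marks the end of the body
        chunk = raw[pos:pos + size]
        if len(chunk) != size:
            raise ValueError("chunk is shorter than its declared size")
        body += chunk
        pos += size
        if raw[pos:pos + 2] != b"\r\n":
            raise ValueError("missing CRLF after chunk data")
        pos += 2
    return bytes(body)


if __name__ == "__main__":
    wire = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
    print(decode_chunked(wire))
```

Note that the response quoted above is not chunked; it has neither Content-Length nor Transfer-Encoding and relies on "Connection: close" to mark the end of the body.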
Re: [CODE4LIB] internet explorer and pdf files
I've discovered that the latest version of Adobe Reader doesn't always work correctly. After upgrading to the latest version, neither Firefox nor IE would properly display PDFs. I finally wound up uninstalling the new version and then re-installing the previous one. You might try this on one of your problem machines and see if it makes a difference.

Regards,
Doris

Doris Munson
Systems/Reference Librarian
Eastern Washington University
dmun...@ewu.edu
509-359-6395