Re: [CODE4LIB] internet explorer and pdf files

2011-08-31 Thread Eric Lease Morgan
Eric wrote:

 Unfortunately IE's behavior is weird. The first time someone tries to load
 one of these URLs nothing happens. When someone tries to load another one, it
 loads just fine. When they re-try the first one, it loads. We are banging
 our heads against the wall here at Catholic Pamphlet Central. Networking
 issue? Port issue? IE PDF plug-in? Invalid HTTP headers? On-campus versus
 off-campus issue?

Thank you for all the replies.

We're not one hundred percent positive, but we think the issue with IE has 
something to do with headers. As alluded to previously, IE needs/desires file 
name extensions in order to know what to do with incoming files. We are serving 
these PDF documents from Fedora which is sending out a stream, not necessarily 
a file. Apparently this confuses IE. Since Fedora is not really designed to be 
a file server, we will write a piece of intermediary software to act as a go 
between. This isn't really a big deal since all of our other implementations of 
Fedora are expected to work in the same way. Wish us luck.
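
A go-between of the kind described above might be sketched roughly as follows. This is only an illustration, not the software actually written at Notre Dame: the public URL scheme (/pamphlets/<id>.pdf) and both function names are hypothetical, and only the Fedora path layout and the application/pdf type come from this thread. The idea is to give IE a URL that ends in .pdf, and to buffer the Fedora stream so that a length can be sent with it.

```python
import re

# Hypothetical public URL scheme: /pamphlets/<id>.pdf
PUBLIC_PATH = re.compile(r"^/pamphlets/(\d+)\.pdf$")


def fedora_path(public_path):
    """Map a public .pdf URL path to the Fedora datastream path, or None."""
    match = PUBLIC_PATH.match(public_path)
    if match is None:
        return None
    return "/fedora/get/CATHOLLIC-PAMPHLET:%s/PDF1" % match.group(1)


def response_headers(body):
    """Headers for a fully buffered PDF body, including its length."""
    return [
        ("Content-Type", "application/pdf"),
        ("Content-Length", str(len(body))),
    ]
```

Buffering the whole document before replying costs memory, but it is the simplest way to know the length up front.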

-- 
Eric Lease Morgan
University of Notre Dame


Re: [CODE4LIB] internet explorer and pdf files

2011-08-31 Thread Godmar Back
On Wed, Aug 31, 2011 at 8:42 AM, Eric Lease Morgan emor...@nd.edu wrote:

 Eric wrote:

  Unfortunately IE's behavior is weird. The first time someone tries to load
  one of these URLs nothing happens. When someone tries to load another one,
  it loads just fine. When they re-try the first one, it loads. We are banging
  our heads against the wall here at Catholic Pamphlet Central. Networking
  issue? Port issue? IE PDF plug-in? Invalid HTTP headers? On-campus versus
  off-campus issue?

 Thank you for all the replies.

 We're not one hundred percent positive, but we think the issue with IE has
 something to do with headers. As alluded to previously, IE needs/desires
 file name extensions in order to know what to do with incoming files. We are
 serving these PDF documents from Fedora which is sending out a stream, not
 necessarily a file. Apparently this confuses IE. Since Fedora is not really
 designed to be a file server, we will write a piece of intermediary
 software to act as a go between. This isn't really a big deal since all of
 our other implementations of Fedora are expected to work in the same way.
 Wish us luck.


FWIW, this is true for any and all HTTP servers. Only the client's request
specifies a name (as the path component of the request, e.g.,
/fedora/get/CATHOLLIC-PAMPHLET:1000793/PDF1).

The server's reply does not contain a name at all. It simply specifies the
type and, typically, the length of the returned content. The returned
content itself is just a blob of bytes. Your server says this blob of
bytes is a PDF object (application/pdf), but it doesn't specify the length.
Not specifying the length makes the job of the client slightly more
difficult, which is why the HTTP/1.1 specification discourages it: the client
now has to read the stream until the server closes the connection. It is
certainly possible that IE's PDF plug-in is not prepared to deal with this
situation, and I would certainly fix this first.
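
To make the point about length concrete, here is a small sketch of the decision a client faces when it looks at the response headers. The function name is made up for illustration, and the rules are simplified (a real client handles more cases); the header list is the one reported in this thread on 2011-08-29.

```python
def body_framing(headers):
    """Return how a client must find the end of the response body.

    `headers` is a list of (name, value) pairs as received.
    """
    fields = {name.lower(): value.strip().lower() for name, value in headers}
    if "chunked" in fields.get("transfer-encoding", ""):
        return "chunked"
    if "content-length" in fields:
        return "content-length"
    # Neither is present: the body only ends when the server hangs up.
    return "read-until-close"


# Headers returned by the Fedora server, as reported in this thread.
FEDORA_HEADERS = [
    ("Server", "Apache-Coyote/1.1"),
    ("Pragma", "No-cache"),
    ("Cache-Control", "no-cache"),
    ("Content-Type", "application/pdf"),
    ("Connection", "close"),
]

print(body_framing(FEDORA_HEADERS))  # read-until-close
```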

 - Godmar


Re: [CODE4LIB] internet explorer and pdf files

2011-08-29 Thread James Gilbert
Works fine on my computer... Are both Adobe Reader and Windows Updates
current?

I had this issue on a book-keeper's computer... installed more recent
version of Adobe Reader, and seemed to fix it.

James Gilbert, BS, MLIS
Systems Librarian
Whitehall Township Public Library
3700 Mechanicsville Road
Whitehall, PA 18052
 
610-432-4330 ext: 203


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Eric
Lease Morgan
Sent: Monday, August 29, 2011 3:31 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] internet explorer and pdf files

I need some technical support when it comes to Internet Explorer (IE) and
PDF files.

Here at Notre Dame we have deposited a number of PDF files in a Fedora
repository. Some of these PDF files are available at the following URLs:

  * http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:1000793/PDF1
  * http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:832898/PDF1
  * http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:999332/PDF1
  * http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:832657/PDF1
  * http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:1001919/PDF1
  * http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:832818/PDF1
  * http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:834207/PDF1

Retrieving the URLs with any browser other than IE works just fine.

Unfortunately IE's behavior is weird. The first time someone tries to load
one of these URLs nothing happens. When someone tries to load another one, it
loads just fine. When they re-try the first one, it loads. We are banging
our heads against the wall here at Catholic Pamphlet Central. Networking
issue? Port issue? IE PDF plug-in? Invalid HTTP headers? On-campus versus
off-campus issue?

Could some of y'all try to load some of the URLs with IE and tell me your
experience? Other suggestions would be greatly appreciated as well.

-- 
Eric Lease Morgan
University of Notre Dame

(574) 631-8604


Re: [CODE4LIB] internet explorer and pdf files

2011-08-29 Thread Eric Lease Morgan
On Aug 29, 2011, at 3:38 PM, James Gilbert wrote:

 Works fine on my computer... Are both Adobe Reader and Windows Updates
 current?
 
 I had this issue on a book-keeper's computer... installed more recent
 version of Adobe Reader, and seemed to fix it.

I don't know if things are up-to-date on the local hardware, but if they 
weren't that would be a bummer. We can't expect people to update their 
operating system just to load our PDF files.

I suppose I ought to put some of our PDF files on a different host and see 
whether or not IE exhibits the same behavior. Hmmm…
 
-- 
Eric Morgan


Re: [CODE4LIB] internet explorer and pdf files

2011-08-29 Thread Stockwell, Chris
Eric,

Works fine in IE9 on my machine.



Re: [CODE4LIB] internet explorer and pdf files

2011-08-29 Thread Godmar Back
Earlier versions of IE were known to sometimes disregard the Content-Type
(which you set correctly to application/pdf) and look at the suffix of the
URL instead. For instance, they would render HTML if you served a .html as
text/plain, etc.

You may try creating URLs that end with .pdf
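
As a rough illustration of why the suffix matters, this is what a suffix-based check would see for your current URLs. It is not IE's actual algorithm, only a sketch; the function name is invented, and the second URL in the usage note is a hypothetical one ending in .pdf.

```python
import posixpath
from urllib.parse import urlsplit


def path_extension(url):
    """Return the lowercase extension of the URL's path, or '' if none."""
    path = urlsplit(url).path
    return posixpath.splitext(path)[1].lower()
```

For http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:1000793/PDF1 this returns an empty string, so a client that trusts the suffix has nothing to go on. For a hypothetical http://example.org/pamphlets/1000793.pdf it returns ".pdf".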

Separately, you're not sending a Content-Length header:

HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Server: Apache-Coyote/1.1
  Pragma: No-cache
  Cache-Control: no-cache
  Expires: Wed, 31 Dec 1969 19:00:00 EST
  Content-Type: application/pdf
  Date: Mon, 29 Aug 2011 19:47:27 GMT
  Connection: close
Length: unspecified [application/pdf]

which disregards RFC 2616,
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.13




Re: [CODE4LIB] internet explorer and pdf files

2011-08-29 Thread Joe Hourcle
On Aug 29, 2011, at 3:30 PM, Eric Lease Morgan wrote:

 I need some technical support when it comes to Internet Explorer (IE) and PDF 
 files.
 
 Here at Notre Dame we have deposited a number of PDF files in a Fedora 
 repository. Some of these PDF files are available at the following URLs:
 
  * 
 http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:1000793/PDF1
  * 
 http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:832898/PDF1
  * 
 http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:999332/PDF1
  * 
 http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:832657/PDF1
  * 
 http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:1001919/PDF1
  * 
 http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:832818/PDF1
  * 
 http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:834207/PDF1
 
 Retrieving the URLs with any browser other than IE works just fine.
 
 Unfortunately IE's behavior is weird. The first time someone tries to load 
 one of these URLs nothing happens. When someone tries to load another one, it 
 loads just fine. When they re-try the first one, it loads. We are banging our 
 heads against the wall here at Catholic Pamphlet Central. Networking issue? 
 Port issue? IE PDF plug-in? Invalid HTTP headers? On-campus versus off-campus 
 issue?
 
 Could some of y'all try to load some of the URLs with IE and tell me your 
 experience? Other suggestions would be greatly appreciated as well.


I don't have IE to test from, but it's been my experience that in past versions 
of IE, it would use the file's extension no matter what the mime-type sent was.

I'd first see if you can trick IE ... it looks like Fedora doesn't like you 
sending extra stuff in PATH_INFO, so you might have to abuse QUERY_STRING for 
this:


http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:1000793/PDF1/?filename.pdf


http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:1000793/PDF1/?file=filename.pdf

If either of those works fine in IE, but the original URL doesn't, that's the 
problem.

I don't know what's possible in Fedora, so I don't know if it's possible to do 
some URL re-writing so it'd always serve something that IE accepts as a PDF.  
If you could insert an extra HTTP header, you might be able to trick it with 
Content-Disposition, but that'll also tell some browsers to download the file 
rather than display it themselves:

http://www.ietf.org/rfc/rfc2183.txt
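
If you can add the header, building its value might look something like this. It is a sketch under assumptions: the function name is made up, and it only handles plain ASCII filenames. RFC 2183 defines both an "inline" and an "attachment" disposition type; "inline" asks for the content to be displayed, though how a given browser treats it is not something I can promise.

```python
def content_disposition(filename, inline=True):
    """Build a Content-Disposition header value for an ASCII filename."""
    if not filename.isascii() or any(c in filename for c in '"\\\r\n'):
        raise ValueError("this sketch only handles simple ASCII filenames")
    kind = "inline" if inline else "attachment"
    return '%s; filename="%s"' % (kind, filename)


print(content_disposition("1000793.pdf"))  # inline; filename="1000793.pdf"
```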

-Joe




Re: [CODE4LIB] internet explorer and pdf files

2011-08-29 Thread Erik Hetzner
At Mon, 29 Aug 2011 15:30:56 -0400,
Eric Lease Morgan wrote:

 I need some technical support when it comes to Internet Explorer (IE) and PDF 
 files.

 Here at Notre Dame we have deposited a number of PDF files in a Fedora 
 repository. Some of these PDF files are available at the following URLs:

   * 
 http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:1000793/PDF1
   * 
 http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:832898/PDF1
   * 
 http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:999332/PDF1
   * 
 http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:832657/PDF1
   * 
 http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:1001919/PDF1
   * 
 http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:832818/PDF1
   * 
 http://fedoraprod.library.nd.edu:8080/fedora/get/CATHOLLIC-PAMPHLET:834207/PDF1

 Retrieving the URLs with any browser other than IE works just fine.

 Unfortunately IE's behavior is weird. The first time someone tries
 to load one of these URLs nothing happens. When someone tries to load
 another one, it loads just fine. When they re-try the first one, it
 loads. We are banging our heads against the wall here at Catholic
 Pamphlet Central. Networking issue? Port issue? IE PDF plug-in?
 Invalid HTTP headers? On-campus versus off-campus issue?

 Could some of y'all try to load some of the URLs with IE and tell me
 your experience? Other suggestions would be greatly appreciated as
 well.

Hi Eric,

As I recall, IE sometimes fetches PDFs oddly. It will do a GET, then
interrupt it, then GET the favicon.ico, then resume the original GET
using a Range header to request the rest of the PDF. (This info is
copied from an email from May 2008, so it may be out of date.) Your
server might not like that kind of abuse.
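
For reference, a server that does cope with a resumed GET has to do something like the following. This is a minimal sketch, assuming the whole PDF is available as bytes; the function name is invented, and it only understands a single "bytes=N-" or "bytes=N-M" range.

```python
import re

RANGE = re.compile(r"^bytes=(\d+)-(\d*)$")


def serve_range(body, range_header):
    """Return (status, headers, payload) for a single-range GET."""
    match = RANGE.match(range_header or "")
    if match is None:
        # No (or unsupported) Range header: send the whole body.
        return 200, [("Content-Length", str(len(body)))], body
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else len(body) - 1
    last = min(last, len(body) - 1)
    if first > last:
        # The requested range starts past the end of the document.
        return 416, [("Content-Range", "bytes */%d" % len(body))], b""
    payload = body[first:last + 1]
    headers = [
        ("Content-Range", "bytes %d-%d/%d" % (first, last, len(body))),
        ("Content-Length", str(len(payload))),
    ]
    return 206, headers, payload
```

Note that the 206 reply needs the total length, which is one more reason a server that never knows the length may trip up a client that resumes.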

Wireshark can be your friend here.

Hope that helps!

best, Erik
Sent from my free software system http://fsf.org/.




Re: [CODE4LIB] internet explorer and pdf files

2011-08-29 Thread Joe Hourcle
On Aug 29, 2011, at 3:52 PM, Godmar Back wrote:

 Earlier versions of IE were known to sometimes disregard the Content-Type
 (which you set correctly to application/pdf) and look at the suffix of the
 URL instead. For instance, they would render HTML if you served a .html as
 text/plain, etc.
 
 You may try creating URLs that end with .pdf
 
 Separately, you're not sending a Content-Length header:
 
 HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Server: Apache-Coyote/1.1
  Pragma: No-cache
  Cache-Control: no-cache
  Expires: Wed, 31 Dec 1969 19:00:00 EST
  Content-Type: application/pdf
  Date: Mon, 29 Aug 2011 19:47:27 GMT
  Connection: close
 Length: unspecified [application/pdf]
 
 which disregards RFC 2616,
 http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.13


RFC2616 says 'SHOULD' for that section.

HTTP/1.1 clients *must* support chunked encoding:

http://en.wikipedia.org/wiki/Chunked_transfer_encoding

(which is why any time I write an HTTP client, I always claim to be
HTTP/1.0, so I don't have to support it)

If the data's stored on disk compressed, and being decompressed
on the fly, it's pretty typical to not send Content-Length.  (although,
you could argue that they should save it when storing the value,
so it's available when serving without needing to decompress
first).
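
For anyone who hasn't looked at it, chunked encoding is simple enough to sketch. The two functions below are illustrative only (the names are mine), and the decoder ignores trailers and assumes well-formed input. Each chunk is its size in hex, CRLF, the data, CRLF, and a zero-size chunk ends the body.

```python
def encode_chunked(pieces):
    """Encode an iterable of byte strings as a chunked message body."""
    out = bytearray()
    for piece in pieces:
        if piece:  # a zero-length chunk would end the body early
            out += b"%x\r\n" % len(piece) + piece + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk, no trailers
    return bytes(out)


def decode_chunked(data):
    """Decode a well-formed chunked body that has no trailers."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Drop any chunk extension after ';' and parse the hex size.
        size = int(data[pos:line_end].split(b";")[0], 16)
        if size == 0:
            return bytes(body)
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF after the chunk data
```

This lets a server stream data of unknown total size without closing the connection to mark the end.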

-Joe


Re: [CODE4LIB] internet explorer and pdf files

2011-08-29 Thread Munson, Doris
I've discovered that the latest version of Adobe Reader doesn't always work 
correctly. After upgrading to the latest version, neither Firefox nor IE would 
properly display PDFs.  I finally wound up uninstalling the new version and 
then re-installing the previous version.  You might try this on one of your 
problem machines and see if it makes a difference.

Regards,
Doris

Doris Munson
Systems/Reference Librarian
Eastern Washington University
dmun...@ewu.edu
509-359-6395


