Todd,

You may find Test::CGI::Multipart useful for testing code in this situation. I wrote it because I found testing file uploads impossibly difficult. However, I have to admit that it has not yet been exercised against the sort of situation you describe, and that is just the sort of situation that will break it. But assuming we can iron out any bugs of that sort, you should be able to replicate all sorts of situations.

Nicholas


[email protected] wrote:

Today's Topics:

   1. file uploads and encodings (Todd Ross)
   2. Re: file uploads and encodings (john saylor)
   3. Re: file uploads and encodings (Michael Peters)
   4. Re: file uploads and encodings (Joshua Miller)


----------------------------------------------------------------------

Message: 1
Date: Mon, 4 Oct 2010 13:34:25 -0700 (PDT)
From: Todd Ross <[email protected]>
Subject: [cgiapp] file uploads and encodings
To: CGI Application Listserv <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset=us-ascii

Hello,

I think I have an impossible problem. Or at least, it looks dire from where I'm sitting.

I support a website that accepts file uploads. I accept uploads of all types from text/plain (csv) to image/jpeg to application/pdf; it's currently unconstrained. The file upload happens over a very typical setup of:

<form enctype="multipart/form-data" method="post">
    <input type="file" name="my_file">
</form>

using CGI.pm for the form processing on the server.

Most file uploads are routed elsewhere for processing. One of our targets is a COBOL application on z/OS and we need to perform some platform conversion. Namely, we need to convert text/plain files to EBCDIC.

In order to convert _to_ EBCDIC, I need to know what I'm converting _from_. And therein lies my impossible problem: how does one determine the encoding of a file upload? The browser does provide some information in the form of the file name and the MIME type, but neither indicates whether the (text/plain) file was encoded with ISO-8859-1 or UTF-8 or something else entirely.
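
As a rough illustration of what the server actually receives, the sketch below parses a hand-written multipart/form-data body with Python's standard email package. Python is used only for illustration here; the setup described above uses CGI.pm. The boundary, file name and file contents are made up. A typical browser sends a file name and a MIME type for the part, but usually no charset parameter, so the bytes arrive unlabeled.

```python
from email.parser import BytesParser
from email.policy import default

# Hypothetical request body: one uploaded part whose content is Latin-1 bytes.
# The boundary "XyZ" and the file name "report.csv" are invented for this sketch.
raw = (
    b"Content-Type: multipart/form-data; boundary=XyZ\r\n"
    b"\r\n"
    b"--XyZ\r\n"
    b'Content-Disposition: form-data; name="my_file"; filename="report.csv"\r\n'
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"caf\xe9,1\r\n"
    b"--XyZ--\r\n"
)

msg = BytesParser(policy=default).parsebytes(raw)
part = next(msg.iter_parts())

filename = part.get_filename()          # the client-supplied file name
content_type = part.get_content_type()  # the client-supplied MIME type
charset = part.get_content_charset()    # None when no charset parameter was sent

print(filename, content_type, charset)
```

Nothing in those headers distinguishes ISO-8859-1 from UTF-8, which is why the encoding has to be inferred from the bytes themselves or asked for out of band.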

These are uploads from a variety of clients running on a variety of platforms, the details of which are largely unknown to me. Consequently, I'm reluctant to assume any particular character encoding.

I can't imagine a character encoding field (or prompt) being effective. My users are business users, not computer specialists. They might be responsible for uploading the file, but they probably aren't responsible for creating it in the first place.

Thoughts?

Thanks,

Todd



------------------------------

Message: 2
Date: Mon, 04 Oct 2010 17:13:05 -0400
From: john saylor <[email protected]>
Subject: Re: [cgiapp] file uploads and encodings
To: CGI Application <[email protected]>
Message-ID: <1286226785.6953.12.ca...@saylor-linux>
Content-Type: text/plain; charset="UTF-8"

hola

On Mon, 2010-10-04 at 13:34 -0700, Todd Ross wrote:
> In order to convert _to_ EBCDIC, I need to know what I'm converting _from_. And therein lies my impossible problem; how does one determine the encoding of a file upload? The browser does provide some information in the form of the file name and the mime type but neither would indicate whether the (text/plain) file was encoded with ISO-8859-1 or UTF-8 or something else entirely.

i think you have to do this programmatically by examining the characters
in the file. there may be libraries to do this already somewhere, but i
have exerted no effort to find them.
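
One way to examine the bytes programmatically is trial decoding: try strict decoders in a fixed order and accept the first one that succeeds. The following is a minimal Python sketch, not a complete detector. The candidate list, the helper names guess_decode and to_ebcdic, and the choice of cp037 as the EBCDIC target are all assumptions for illustration; the code page a given z/OS application expects may differ. ISO-8859-1 accepts every byte sequence, so it serves only as a last-resort guess and can be wrong for other single-byte encodings such as Windows-1252.

```python
# Candidate encodings, tried in order (an assumption for this sketch).
# "utf-8-sig" decodes UTF-8 with or without a leading byte order mark.
# "iso-8859-1" must come last because it never fails to decode.
CANDIDATES = ("ascii", "utf-8-sig", "iso-8859-1")


def guess_decode(raw: bytes):
    """Return (text, encoding_name) for the first candidate that decodes cleanly."""
    for name in CANDIDATES:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    # Not reachable while iso-8859-1 is in the list; kept as a safeguard.
    raise ValueError("no candidate encoding matched")


def to_ebcdic(raw: bytes, target: str = "cp037"):
    """Decode with a guessed encoding, then re-encode as EBCDIC.

    Returns (ebcdic_bytes, guessed_source_encoding). The default target
    cp037 is a placeholder; substitute the code page the host expects.
    """
    text, source = guess_decode(raw)
    return text.encode(target), source


if __name__ == "__main__":
    converted, source = to_ebcdic(b"caf\xc3\xa9")
    print(source, converted.hex())
```

A file containing characters that the target code page cannot represent raises UnicodeEncodeError, which is probably preferable to silently corrupting the data before it reaches the mainframe.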

as you mention, you can't count on users, and they can [and will] upload
just about anything.

good luck!

