Hi folks,

after posting to Bugzilla I have a few more thoughts about this issue.

We are looking here at the darker parts of the HTTP protocol
specification.  File uploading was introduced experimentally
in 1995 (RFC1867) and formalized with HTML 4.0 in 1998 (RFC2388).

As the feature was urgently needed at the time when the web has
already taken off, the RFC was apparently prepared in such a rush,
that it is not really well thought.

In order to use <input type="file"> the HTML form must use
<form method="POST" enctype="multipart/form-data">.  The browser
then encodes each <input> value as a part of the multipart request
body.

The problem now is that the values of <input type="text"> and
other form input elements are interspersed with the possibly 
every large file content.  

For example, if you have a form

<form>
  <input type="file">
  <input type="radio">
  <input type="submit">
</form>

and you want to know the state of the radio button, you first
need to process the file data.

In a framework which wants to provide random and idem-potent
parameter access as in request.getParameter("foo") one needs to
read and buffer the complete request body.

Providing buffer space for megabytes of upload data is problematic.
The Java Servlet Specifications allows the container (Tomcat) to 
duck the problem and leave it to the servlet (Cocoon) to decode 
the request body.

That is exactly what is done in CocoonServlet.service:

        HttpServletRequest request =
RequestFactory.getRequestFactory(requestFactoryClass).getServletRequest(
req,
 
CocoonServlet.SAVE_UPLOADED_FILES_TO_DISK,
                                         this.uploadDir,
                                         CocoonServlet.ALLOW_OVERWRITE,
                                         CocoonServlet.SILENTLY_RENAME,
                                         this.maxUploadSize);

        this.cocoon = getCocoon(request.getPathInfo(),
request.getParameter(Constants.RELOAD_PARAM));

The RequestFactory is called upon to wrap the HttpServletRequest from
the container into a new HttpServletRequest after decoding the request
content by reading from req.getInputStream().

Currently there are three implementations of RequestFactory to choose 
from in org/apache/cocoon/components/request:
1.) SimpleRequestFactoryImpl.java
2.) MaybeUploadRequestFactoryImpl.java
3.) MultipartRequestFactoryImpl.java

SimpleRequestFactoryImpl is the dummy implementation which just returns
the container request.

MaybeUploadRequestFactoryImpl is legacy from which apparently stems the
non-Javaesque argument list of getServletRequest().  However, it is the
fallback used if web.xml does not contain a request-factory
configuration.

MultipartRequestFactoryImpl is the preferred implementation configured
in the example web.xml file.  It provides the choice between storing the
upload file content in memory (FilePartArray) or in a *permanent* file
(FilePartFile).   The upload directory is a global configuration
parameter
defaulting to WEB-INF/work/upload-dir.

However, CocoonServlet hardwires

    private static final boolean SAVE_UPLOADED_FILES_TO_DISK = true;

to always store the request in a file which will remain after the
end of the request.  This covers the use case in the Cocoon samples 
of providing a public inbox where one can drop files which shall be
handled lateron by another application.

However, for the other use case of processing the uploaded data directly
in the request, the permanent file approach causes serious trouble.
First there is race condition, if by chance or design two request use
the same filename at the same time.

<SECURITY-ALERT>
Second, there is as far as I can see a *BIG* security hole here.  
The filename supplied in the request data is used verbatim in
constructing
the filepath on the server.  By crafting a request with enough ../ in 
the filename an attacker can overwrite any file writable by the
container 
process!!

At the very least anybody can fill up my disk by sending fake file
upload
request.  Note that it not necessary to have a file upload page.  All
that
happens at the very beginning of request handling before any Cocoon
based
access control mechanisms could stop it!!!
</SECURITY-ALERT>

What needs to be done?

1.) Fix the security hole *IMMEDIATELY* and advise the user community.  
The last thing Cocoon needs is bad press about such things.

The quickest fix is (what I hacked into my CocoonServlet.java)

    private static final boolean SAVE_UPLOADED_FILES_TO_DISK = false;

if you want to handle multipart/form-data (but it breaks the upload
sample).

If you only need default form encoding, you can also disable it in
web.xml: 

    <init-param>
      <param-name>request-factory</param-name>
 
<param-value>org.apache.cocoon.components.request.SimpleRequestFactoryIm
pl</param-value>
    </init-param>

2.) Clean the filename in
component/request/multipart/MultipartParser.java
from ".." and other dangerous stuff.  Then make
SAVE_UPLOADED_FILES_TO_DISK
configurable.  (I think MaybeUploadRequestFactoryImpl can be removed
completely.)

3.) Refactor MultipartRequestFactoryImpl and friends.

How should MultipartRequestFactoryImpl be refactored?

The FilePartArray stuff works fine.  The only problem is that for
duration 
of a request it has provide buffer space for the whole request body.
With
a few large uploads running in parallel it can easily exhaust all JVM
memory.

Rather than choosing in the configuration either/or array/file a hybrid
approach should be used.  Small upload request (configurable limit
~100k)
are kept in memory.

Larger uploads are written to a *temporary* file created with a random
name
in tempdir and automatically removed by CocoonServlet at the end of the 
request.

The samples/xsp/upload.xsp is then made working again by reading out the
upload content from the request and storing it as permanent file.

<xsp:logic>
FilePart filePart = (FilePart)request.get("uploaded_file");
InputStream contentStream = filePart.getInputStream();
File uploadFile = new File(filePart.getSafeFilePath(uploadDir));
// copy contentStream to uploadFile
...
</xsp:logic

This could actually be turned into an FileUploadAction with
configuration 
parameters:

<map:act type="upload" src="path/to/inbox">
  <map:parameter name="overwrite" value="deny|allow|rename"/>
  <map:parameter name="maxsize" value="10000000"/>
  ... Thank you for your file
</map:act>
... Sorry, your file could not be stored

SimpleRequestFactoryImpl should become the fallback request factory in
CocoonServlet, unless web.xml (or cocoon.xconf) says otherwise.
The request factory in the example configuration should be
MultipartRequestFactoryImpl with a reasonable limits for array/file
sizes.

The RequestFactory should be extended to allow registering different
request wrappers depending on the method ("POST") and content type
("multipart/form-data").  That can come handy if more special request
decoding is required (WEBDAV?, SOAP?).

CocoonServlet.init():

    MultipartRequestFactoryImpl.setMaxBufferSize(maxBufferSize);
    MultipartRequestFactoryImpl.setMaxUploadSize(maxUploadSize);

    RequestFactory.register("POST", "application/x-www-form-urlencoded",
 
"org.apache.cocoon.component.request.SimpleRequestFactoryImpl");

    RequestFactory.register("POST", "multipart/form-data",
 
"org.apache.cocoon.component.request.MultipartRequestFactoryImpl");

    RequestFactory.register("...", "...",
 
"org.apache.cocoon.component.request.MySpecialRequestFactoryImpl");


CocoonServlet.service():

    RequestFactory factory =
RequestFactory.getRequestFactory(req.getMethod(),
 
req.getContentType());
    HttpServletRequest request = factory.getServletRequest(req);


Further room for enhancements are:

1.) <input type=file> allows also for multiple file selection.  That is
not yet handled at all.

2.) Even if file uploading is disabled, read out the request body to
advance to the
next HTTP/1.1 request in the socket.  (When using an upstream load
balancer, the
following request may even be from a different user.)

3.) Sort out the character encoding between the file content sent by the
browser
and what the servlet expects.

So there is plenty of volunteer work left to be done...

Cheers, Alfred.


>-----Original Message-----
>From: Geoff Howard [mailto:[EMAIL PROTECTED]]
>Sent: Freitag, 11. Oktober 2002 23:36
>To: [EMAIL PROTECTED]
>Subject: [VOLUNTEER] Re: DO NOT REPLY [Bug 13541] New: -
>SAVE_UPLOAD_FILES_TO_DISK should be configurable
>
>
>I'm willing to volunteer to work on issues surrounding
>this "bug" if someone can get me up to speed on where
>things stand in the code.  Specifically, what would
>help for starters is: 
>1) Why is SAVE_UPLOADED_FILES_TO_DISK hardcoded now? 
>I assume that its because of some unresolved problem
>with the condition in MultipartParser.java which
>handles the false condition?  Is something wrong with
>FilePartArray as it stands now?  Is this not the
>correct solution?
>2) What is the story with
>MaybeUploadRequestFactoryImpl?  From the comments in
>the code it seems unfinished.
>3) Same with SimpleRequestFactoryImpl?  It seems like
>specifying one of these in web.xml is overlapped in
>functionality with SAVE_UPLOADED_FILES_TO_DISK.  If
>so, which direction should it go?  To them or away
>from them?
>4) There is a message in the archives demonstrating
>the use of an action to deal with uploads (though I
>think it depends on the default functionality).  Is
>this the right direction to encourage use?  I assume
>that InputModules could handle things the same way if
>they replace Actions in the future.
>
>It seems to me that this is what is needed for
>starters:
>- files should not be uploaded and saved by default
>from any page.  that has security hole written all
>over it.
>- when upload of a file is desired, there should be a
>configurable default directory (as there is now) _and_
>the ability to designate alternative locations either
>in the sitemap, and _maybe_ via runtime/request
>parameters.
>- ideally, some pipelines should be able to allow them
>and others not - this way you can require
>authentication easily.  any other suggestions on how
>to allow uploads only to select users?
>
>The first could be handled simply by changing
>CocoonServlet to use a parameter from web.xml instead
>of it's final boolean.  (I'd also argue that if the
>other changes don't happen, this parameter should be
>set to false in the default config for security
>reasons).
>
>The second would involve modifying the behavior of
>MultipartParser and friends in
>org.apache.cocoon.matching.* 
>
>Am I wrong that the default configuration means that I
>can upload files all day long to live cocoon instances
>everywhere right now (unless they have specified
>SimpleRequestFactoryImpl in web.xml)?!?
>
>Geoff Howard
>
>--- [EMAIL PROTECTED] wrote:
>> DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
>> 
>> RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE
>> AT
>>
><http://nagoya.apache.org/bugzilla/show_bug.cgi?id=13541>.
>> ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED
>> AND 
>> INSERTED IN THE BUG DATABASE.
>> 
>>
>http://nagoya.apache.org/bugzilla/show_bug.cgi?id=13541
>> 
>> SAVE_UPLOAD_FILES_TO_DISK should be configurable
>> 
>>            Summary: SAVE_UPLOAD_FILES_TO_DISK should
>> be configurable
>>            Product: Cocoon 2
>>            Version: Current CVS
>>           Platform: All
>>         OS/Version: All
>>             Status: NEW
>>           Severity: Enhancement
>>           Priority: Other
>>          Component: core
>>         AssignedTo: [EMAIL PROTECTED]
>>         ReportedBy: [EMAIL PROTECTED]
>> 
>> 
>> CocoonServlet passes a final boolean
>> SAVE_UPLOADED_FILES_TO_DISK = true to the 
>> RequestFactory.
>> 
>> This should become a ServetConfig init parameter to
>> allow using the 
>> MultipartRequestFactoryImpl producing FilePartArray
>> objects for upload requests.
>> 
>> Provided one is not afraid of memory exhaustion,
>> FilePartArray avoids the 
>> problems of FilePartFile (race condition, privacy
>> issue, housekeeping).
>> 
>>
>---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> [EMAIL PROTECTED]
>> For additional commands, email:
>> [EMAIL PROTECTED]
>> 
>
>
>__________________________________________________
>Do you Yahoo!?
>Faith Hill - Exclusive Performances, Videos & More
>http://faith.yahoo.com
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: [EMAIL PROTECTED]
>For additional commands, email: [EMAIL PROTECTED]
>
>

This message is for the named person's use only. It may contain
confidential, proprietary or legally privileged information. No
confidentiality or privilege is waived or lost by any mistransmission.
If you receive this message in error, please notify the sender urgently
and then immediately delete the message and any copies of it from your
system. Please also immediately destroy any hardcopies of the message.
You must not, directly or indirectly, use, disclose, distribute, print,
or copy any part of this message if you are not the intended recipient.
The sender's company reserves the right to monitor all e-mail
communications through their networks. Any views expressed in this
message are those of the individual sender, except where the message
states otherwise and the sender is authorised to state them to be the
views of the sender's company. 

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, email: [EMAIL PROTECTED]

Reply via email to