Hi,
I am the current maintainer of Commons FileUpload and would like to
reuse Mime4J as the multipart parser. Thanks to the acceptance of my
pull parser patch (MIME4J-19), this is now possible. In an ongoing
thread, others have expressed interest in following this step. See
http://www.nabble.com/RfC%3A-commons-fileupload-2%2C-based-on-mime4j-tf4220932.html
Over the last few days, I have developed a first implementation of what I'd
like to see as Commons FileUpload 2.0, which you can find at
http://people.apache.org/~jochen/commons-fileupload
It is based on a patched version of Mime4J 0.4-SNAPSHOT, which you
can find at the same location.
All in all, I found a few minor flaws in Mime4J, which I'd like to see
fixed. I'd like to post them here for general discussion. If there is
general agreement, I will split them into patches and submit them to
Jira.
One reason for this procedure is to ask for a kind of "fast track": if
I submit several patches and then wait a long time before they are
accepted, this may take too much time. It would help if a developer
could agree to move this forward together with me, or if I could get
committer privileges.
Here are my points, ordered by priority:
Required:
1.) Parsing a message without headers
Currently, Mime4j can only parse messages with headers. That's not
suitable for parsing an HTTP message, because the typical situation is
a servlet, which doesn't see the headers.
I'd propose a new method
public void parse(InputStream, BodyDescriptor)
This method would be specified as emitting a sequence
T_START_MULTIPART ... T_END_MULTIPART
as opposed to
T_START_MESSAGE T_START_HEADER ... T_END_HEADER
T_START_MULTIPART ... T_END_MULTIPART T_END_MESSAGE
The required patch is relatively minor and should not complicate
the parser.
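To make the difference concrete, here is a rough sketch of the two token sequences. The enum and class are hypothetical and only reuse the token names from this mail; they are not Mime4j code, and the tokens elided by "..." above are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical token names, copied from the sequences described in this mail.
enum Token {
    T_START_MESSAGE, T_START_HEADER, T_END_HEADER,
    T_START_MULTIPART, T_END_MULTIPART, T_END_MESSAGE
}

final class TokenSequenceSketch {
    // Returns the outer tokens a caller would expect to see.
    // headersInStream == true  : today's behaviour (message with headers)
    // headersInStream == false : proposed parse(InputStream, BodyDescriptor)
    static List<Token> expectedTokens(boolean headersInStream) {
        List<Token> tokens = new ArrayList<>();
        if (headersInStream) {
            tokens.add(Token.T_START_MESSAGE);
            tokens.add(Token.T_START_HEADER);
            tokens.add(Token.T_END_HEADER);
        }
        tokens.add(Token.T_START_MULTIPART);
        // The tokens for the individual body parts would appear here.
        tokens.add(Token.T_END_MULTIPART);
        if (headersInStream) {
            tokens.add(Token.T_END_MESSAGE);
        }
        return tokens;
    }
}
```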
Recommended:
2.) Let BodyDescriptor provide full blown access to the headers
Currently, BodyDescriptor offers access to the content-type and
content-transfer-encoding headers only.
As a consequence, the mime4j user is forced to listen for T_FIELD events
and build their own header map. This is duplicated work, in particular
because all mime4j users will likely do the same.
I'd propose to:
- Replace BodyDescriptor with an interface. (This assumes that such a
change is still possible; I am guessing so from the version number 0.4,
but I may be wrong.)
- Make the BodyDescriptor implementation pluggable by adding a method
protected void newBodyDescriptor()
to the Mime4JTokenStream.
- Provide a default implementation that maintains a map of headers and
values.
- Open up the method
private void getHeaderParams(String)
by making it static and moving it to a utility class, or by providing
an accessor that takes a header name as an argument and invokes the
method with that header's value as input.
- Rename getParameters() to getContentTypeParameters(), because the
method name is definitely confusing. I clearly had the impression that
this method would provide the header values.
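As a minimal sketch of what such a default implementation might look like: the interface and class names below are my own invention for illustration, not existing Mime4j API, and the parameter parsing is deliberately simplified (it does not handle a ';' inside a quoted value).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical interface; illustrative names only.
interface HeaderAccess {
    String getHeader(String name);                     // first value, or null
    List<String> getHeaders(String name);              // all values, possibly empty
    Map<String, String> getHeaderParams(String name);  // parameters of the first value
}

final class MapBackedHeaders implements HeaderAccess {
    // Keys are lower-cased because header names are case-insensitive.
    private final Map<String, List<String>> headers = new LinkedHashMap<>();

    // Would be called by the parser for every field it encounters.
    void addField(String name, String value) {
        headers.computeIfAbsent(name.toLowerCase(Locale.ROOT), k -> new ArrayList<>())
               .add(value.trim());
    }

    @Override
    public List<String> getHeaders(String name) {
        List<String> values = headers.get(name.toLowerCase(Locale.ROOT));
        if (values == null) {
            return Collections.<String>emptyList();
        }
        return Collections.unmodifiableList(values);
    }

    @Override
    public String getHeader(String name) {
        List<String> values = getHeaders(name);
        return values.isEmpty() ? null : values.get(0);
    }

    @Override
    public Map<String, String> getHeaderParams(String name) {
        String value = getHeader(name);
        if (value == null) {
            return Collections.<String, String>emptyMap();
        }
        return parseParams(value);
    }

    // Simplified: splits on ';' and '=' and strips surrounding double quotes.
    static Map<String, String> parseParams(String headerValue) {
        Map<String, String> params = new LinkedHashMap<>();
        String[] parts = headerValue.split(";");
        for (int i = 1; i < parts.length; i++) {  // parts[0] is the main value
            int eq = parts[i].indexOf('=');
            if (eq < 0) {
                continue;  // not a key=value pair, ignore it
            }
            String key = parts[i].substring(0, eq).trim().toLowerCase(Locale.ROOT);
            String val = parts[i].substring(eq + 1).trim();
            if (val.length() >= 2 && val.startsWith("\"") && val.endsWith("\"")) {
                val = val.substring(1, val.length() - 1);
            }
            params.put(key, val);
        }
        return params;
    }
}
```

With something like this, a FileUpload-style caller could ask for the "filename" parameter of Content-Disposition without listening to field events itself.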
3.) Drop lazy syntax checking or make it optional
Mime4j has a lot of places where it detects syntax errors in the
multipart stream. Currently, these are reported only by logging a
warning message.
This behaviour is improper. Such situations should cause an exception,
or at least the Mime4j user should be able to request that they do.
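One possible shape for an optional strict mode, as a sketch only: both class names are hypothetical, and the lenient branch collects warnings in a list where Mime4j currently writes to its log.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical exception type for syntax errors in the stream.
class MimeSyntaxException extends IOException {
    MimeSyntaxException(String message) {
        super(message);
    }
}

// Hypothetical helper the parser would call wherever it logs a warning today.
final class SyntaxMonitor {
    private final boolean strict;
    private final List<String> warnings = new ArrayList<>();

    SyntaxMonitor(boolean strict) {
        this.strict = strict;
    }

    // Strict mode: fail immediately. Lenient mode: remember the problem and go on.
    void report(String problem) throws MimeSyntaxException {
        if (strict) {
            throw new MimeSyntaxException(problem);
        }
        warnings.add(problem);
    }

    List<String> getWarnings() {
        return Collections.unmodifiableList(warnings);
    }
}
```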
4.) Provide utility classes for the Mime4j user
I have implemented code in Commons FileUpload which would better sit
in Mime4j, because it is likely to be shared by Mime4j users. In
particular, a utility class for implementing header maps could be
pushed down.
5.) Provide methods
public String getFieldName()
public String getFieldValue()
The user of
public String getField()
is forced to parse the returned value in order to obtain the field name
and the value, although the Mime4jTokenStream has already done exactly
that.
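For illustration, this is roughly the code every caller has to write today. The class is hypothetical, and I am assuming that getField() returns the raw "Name: value" line.

```java
// Hypothetical holder for a field that has been split into name and value.
final class RawField {
    private final String name;
    private final String value;

    private RawField(String name, String value) {
        this.name = name;
        this.value = value;
    }

    // Splits at the first colon only, so colons inside the value are kept.
    static RawField parse(String rawField) {
        int colon = rawField.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("Not a header field: " + rawField);
        }
        return new RawField(rawField.substring(0, colon).trim(),
                            rawField.substring(colon + 1).trim());
    }

    String getFieldName() {
        return name;
    }

    String getFieldValue() {
        return value;
    }
}
```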
6.) Drop
public int read(byte[] b)
from PartialInputStream and PositionInputStream. By default, this
method delegates to public int read(byte[], int, int), and the default
implementation works fine. Overriding it only forces subclasses to
override it too.
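A small sketch of why the override is unnecessary. The class below is hypothetical and extends java.io.FilterInputStream, whose read(byte[]) is documented to simply call read(b, 0, b.length); I have not checked which base class the two Mime4j streams actually use.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stream that counts the bytes passing through it.
// It overrides read() and read(byte[], int, int) only.
final class CountingInputStream extends FilterInputStream {
    private long count;

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            count++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n;
        }
        return n;
    }

    // read(byte[]) is inherited; it delegates to the three-argument
    // method above, so the bytes it reads are counted as well.

    long getCount() {
        return count;
    }
}
```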
7.) Add support for size limits.
In Commons FileUpload, it is possible to limit the overall request size
and/or the size of an atomic entity. This is highly recommended for web
applications as a security measure against DoS attacks.
This can be implemented by the Mime4j user. However, it is also likely
to be reused, so it might better be pushed down.
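A minimal sketch of such a limit, implemented as a wrapper stream. The class name and the choice of IOException are assumptions for illustration; this is not the actual Commons FileUpload code.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper that fails once more than 'limit' bytes have been read.
final class SizeLimitedInputStream extends FilterInputStream {
    private final long limit;
    private long consumed;

    SizeLimitedInputStream(InputStream in, long limit) {
        super(in);
        this.limit = limit;
    }

    // Adds n to the running total and fails if the total exceeds the limit.
    private void account(long n) throws IOException {
        consumed += n;
        if (consumed > limit) {
            throw new IOException("Size limit of " + limit + " bytes exceeded");
        }
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            account(1);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            account(n);
        }
        return n;
    }

    // Note: bytes passed over with skip() are not counted in this sketch.
}
```

Wrapping the whole request stream would limit the overall request size; wrapping the stream of a single body part would limit that entity.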
Thanks for reading so far. :-)
Jochen
--
"Besides, manipulating elections is under penalty of law, resulting in
a preventative effect against manipulating elections."
The German government, justifying the use of electronic voting machines
and obviously believing that we don't need a police force, because all
illegal actions are forbidden.
http://dip.bundestag.de/btd/16/051/1605194.pdf