Hi,

I am the current maintainer of Commons FileUpload and would like to reuse Mime4J as the multipart parser. Thanks to the acceptance of my pull parser patch (MIME4J-19), this is now possible. In an ongoing thread, others have expressed interest in following this step. See
http://www.nabble.com/RfC%3A-commons-fileupload-2%2C-based-on-mime4j-tf4220932.html

Over the last few days, I have developed a first implementation of what I'd like to see as Commons FileUpload 2.0, which you can find at

http://people.apache.org/~jochen/commons-fileupload

It is based on a patched version of Mime4J 0.4-SNAPSHOT, which you can find at the same location.

All in all, I found a few minor flaws in Mime4J, which I'd like to see fixed. I'd like to post them here for general discussion. If there is general agreement, I would split them into patches and submit them to Jira.

One reason for this procedure is to ask for a kind of "fast track": if I submit several patches and each one waits a long time before it is accepted, the whole effort may take too long. It would help if some developer could agree to move this forward together with me, or if I could get committer privileges.

Here are my points, ordered by priority:

Required:

1.) Parsing a message without headers

Currently, Mime4j can only parse messages with headers. That is not suitable for parsing an HTTP request, because the typical caller is a servlet, whose input stream contains only the body, not the headers. I'd propose a new method

    public parse(InputStream, BodyDescriptor)

This method would be specified as emitting the sequence

    T_START_MULTIPART ... T_END_MULTIPART

as opposed to

    T_START_MESSAGE T_START_HEADER ... T_END_HEADER T_START_MULTIPART ... T_END_MULTIPART T_END_MESSAGE

The required patch is relatively minor and should not complicate the parser.

Recommended:

2.) Let BodyDescriptor provide full access to the headers

Currently, BodyDescriptor offers access to the content-type and content-transfer-encoding headers only. As a consequence, the Mime4j user is forced to listen for T_FIELD events and build their own header map. This is duplicated work, because all Mime4j users will likely do the same. I'd propose to:

- Replace BodyDescriptor with an interface.
  (This assumes that such a change is still possible. I am guessing so from the version number 0.4, but I may be wrong.)
- Make the BodyDescriptor implementation pluggable by adding a method

      protected void newBodyDescriptor()

  to the Mime4JTokenStream.
- Provide a default implementation that maintains a map of headers and values.
- Open up the method

      private void getHeaderParams(String)

  either by making it static and moving it to a utility class, or by providing an accessor that takes a header name as an argument and invokes the method with that header's value.
- Rename getParameters() to getContentTypeParameters(), because the current name is confusing: I had the impression that this method would return the header values.

3.) Drop lazy syntax checking or make it optional

Mime4j has a lot of places where it detects syntax errors in the multipart stream. Currently, these are only reported as a logged warning. This behaviour is improper. Such situations should cause an exception, or at least the Mime4j user should be able to request that they do.

4.) Provide utility classes for the Mime4j user

I have implemented code in Commons FileUpload which would better sit in Mime4j, because it is likely to be shared by Mime4j users. In particular, a utility class for implementing header maps could be pushed down.

5.) Provide methods

    public String getFieldName()
    public String getFieldValue()

The user of

    public String getField()

is forced to parse the returned value in order to obtain the field name and the value, although the Mime4jTokenStream has already done exactly that.

6.) Drop

    public int read(byte[] b)

from PartialInputStream and PositionInputStream. By default, this method delegates to

    public int read(byte[], int, int)

and the default implementation works fine. Overriding it only forces subclasses to override it too.

7.) Add support for size limits.
In Commons FileUpload, it is possible to limit the overall request size and/or the size of an atomic entity. This is highly recommended for web applications, as a security measure against DoS attacks. This can be implemented by the Mime4j user. However, it is also likely to be reused, so it might be better pushed down into Mime4j.

Thanks for reading so far. :-)

Jochen

-- 
"Besides, manipulating elections is under penalty of law, resulting in a preventative effect against manipulating elections."

The German government justifying the use of electronic voting machines and obviously believing that we don't need a police, because all illegal actions are forbidden.

http://dip.bundestag.de/btd/16/051/1605194.pdf
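
The size limit described in point 7 could be sketched roughly as follows. This is only an illustration built on plain java.io; the class name SizeLimitedInputStream is hypothetical and is not taken from Mime4j or Commons FileUpload, and a real implementation would probably report the violation with a more specific exception type.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical helper (not an existing Mime4j or Commons FileUpload class):
 * counts the bytes read from the wrapped stream and fails as soon as the
 * count exceeds maxBytes.
 */
class SizeLimitedInputStream extends FilterInputStream {
    private final long maxBytes;
    private long count;

    SizeLimitedInputStream(InputStream in, long maxBytes) {
        super(in);
        this.maxBytes = maxBytes;
    }

    /** Throws once more than maxBytes have been consumed. */
    private void checkLimit() throws IOException {
        if (count > maxBytes) {
            throw new IOException("Size limit of " + maxBytes + " bytes exceeded");
        }
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count++;
            checkLimit();
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n;
            checkLimit();
        }
        return n;
    }

    // read(byte[]) is deliberately not overridden: FilterInputStream already
    // delegates it to read(byte[], int, int), which is the situation point 6
    // relies on. Bytes passed over via skip() are not counted in this sketch.
}
```

Wrapping the whole request stream in such a class would limit the overall request size, while wrapping the stream of a single part would limit one entity. Both uses rely only on the stream being read through the wrapper.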