Hi Robert,

> the main pull parser parses all digit boundaries fine.

Ah, now I see what you mean. I debugged again and there are two lots of parsing going on:

MimeStreamParser.parseHeader()
  ContentHandler.field(fieldData)
  BodyDescriptor.addField()
    getHeaderParams()

The Message implementation of ContentHandler.field() is parsing the message using the JavaCC parser, the same as the SimpleContentHandler implementation.

BodyDescriptor.addField() only parses the Content-Transfer-Encoding and Content-Type headers. NB: it does not check all 'Content-*' headers as the Javadocs state; Content-Disposition, for example, is ignored.

So it seems BodyDescriptor.addField() is the odd man out. It parses the Content-Type header, but with a different parsing mechanism from the standard ContentTypeParser. And as there's no way in the ContentHandler implementation to get this BodyDescriptor-parsed information to the handler, the handler must parse the header again itself.

If BodyDescriptor is managing Content-* headers, it should save them all, e.g. in

private Map<String, Map<String, String>> params;

where the Strings are, from the outside in, headerName, paramName and paramValue.

For example, the Content-Disposition parameters should be available. Currently only Content-Type parameters are.

Then it could offer methods such as

// Returns the parameters for the given header, e.g. C-T or C-D
String getContentParam(String headerName, String paramName);
Map<String, String> getContentParams(String headerName);

and helper methods:
boolean isInline();
boolean isAttachment();
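
A minimal sketch of the proposal above, in Java. Everything in it is an assumption made for illustration: the class name ContentHeaderParams is hypothetical and not part of Mime4J, and the parsing is a naive split on ';' and '=' rather than the JavaCC-based parser, so quoted values containing ';' are not handled. It only shows how one nested map could serve both Content-Type and Content-Disposition.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch only: this class is not part of Mime4J.
final class ContentHeaderParams {
    // headerName -> main value, e.g. "content-disposition" -> "attachment"
    private final Map<String, String> values = new LinkedHashMap<>();
    // headerName -> (paramName -> paramValue)
    private final Map<String, Map<String, String>> params = new LinkedHashMap<>();

    // Records any Content-* field; all other fields are ignored.
    // Naive parsing (assumption): a real version would reuse the proper header parser.
    void addField(String name, String body) {
        String header = key(name);
        if (!header.startsWith("content-")) {
            return;
        }
        String[] parts = body.split(";", -1); // limit -1 guarantees at least one element
        values.put(header, key(parts[0]));
        Map<String, String> headerParams = new LinkedHashMap<>();
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq < 0) {
                continue; // not a name=value pair
            }
            String value = parts[i].substring(eq + 1).trim();
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1); // strip surrounding quotes
            }
            headerParams.put(key(parts[i].substring(0, eq)), value);
        }
        params.put(header, headerParams);
    }

    // Returns one parameter of the given header, or null if it is absent.
    String getContentParam(String headerName, String paramName) {
        return getContentParams(headerName).get(key(paramName));
    }

    // Returns a read-only view of all parameters of the given header (empty if none).
    Map<String, String> getContentParams(String headerName) {
        Map<String, String> found = params.get(key(headerName));
        return found == null
                ? Collections.<String, String>emptyMap()
                : Collections.unmodifiableMap(found);
    }

    boolean isInline() {
        return "inline".equals(values.get("content-disposition"));
    }

    boolean isAttachment() {
        return "attachment".equals(values.get("content-disposition"));
    }

    // Header and parameter names are matched case-insensitively.
    private static String key(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }
}
```

With something like this, a handler could call getContentParam("Content-Disposition", "filename") instead of parsing the header a second time.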

Anyway, I've just dived into Mime4J as I want to use it for a project of mine. If I make any changes, I'll submit them back as patches at a later date.

> having taken a look,
> Message seems to me to do far too much of its own parsing. this parsing
> seems pretty buggy and also unnecessary: BodyDescriptor contains parsed
> meta-data.
>
> IMHO the right way to fix this problem is to rewrite Message so that the
> information is obtained from the meta-data provided by the main parser. but
> Message isn't really something i've played around with before and i'm out of
> time this evening. if you'd like to have a go at revising the implementation,
> it should be reasonably easy. otherwise, unless anyone jumps in with a
> better implementation strategy, i should be able to take a look at this
> tomorrow.

Antony


