On 17.08.2009, at 16:29, Markus Wiederkehr wrote:

If you are interested I could write a regex based version which will not
reintroduce the double space bug.
I'd use the regex to extract charset, encoding and encoded string in one
go. I think it will be at least as fast as the current method.
However, java.util.regex requires Java 1.4; if that's a no-go I won't
bother.

Regex wouldn't be a problem since Mime4j already depends on Java 5.

I'm not sure how a regex solution could compete with a few indexOf and
substring calls in terms of speed though. I mean Pattern.compile()
alone has to build a DFA from the input string.


That's why the Pattern.compile() call is only executed once when the class is loaded:

final static Pattern regex = Pattern.compile("...");

From what I can see, the indexOf calls seem to be quite optimized, so I do not expect a noticeable speed improvement by switching to regular expressions.
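
For illustration only, here is a minimal sketch of the precompiled-pattern idea: one static Pattern, compiled at class load, that pulls charset, encoding and encoded text out of an encoded-word in a single match. The pattern itself, the class name EncodedWordSketch and the method split are assumptions made up for this example; they are not the elided pattern above and not Mime4j code, and the pattern is a simplification of the RFC 2047 encoded-word syntax.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class, not part of Mime4j.
class EncodedWordSketch {
    // Compiled once when the class is loaded and reused for every call.
    // Assumed, simplified shape of an encoded-word: =?charset?encoding?text?=
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?([BbQq])\\?([^?]*)\\?=");

    // Returns {charset, encoding, encodedText}, or null if the input
    // is not exactly one encoded-word.
    static String[] split(String word) {
        Matcher m = ENCODED_WORD.matcher(word);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = split("=?ISO-8859-1?Q?Andr=E9?=");
        // prints "ISO-8859-1 Q Andr=E9"
        System.out.println(parts[0] + " " + parts[1] + " " + parts[2]);
    }
}
```

Pattern instances are immutable and safe to share between threads; only the Matcher is created per call, so the compile cost is paid once.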


I'd like to give it a try by refactoring and fixing the existing code.


Fine with me! 
