Never mind, I solved it "by hand".
I wrote a Python script that takes a list of HTML entities and generates
a huge tree of switch() { case: switch () { case: switch () { case: ...
The generated Java code goes through a char[] in a single pass and when
it recognizes an entity it pushes the associated Unicode char into the
SAX stream, instead of the chars composing the entity.
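
To give an idea of the approach, here is a minimal sketch of such a
generator. It is not the actual script: the names ENTITIES, build_trie,
emit_switch and generate, the four-entity sample table and the signature
of the emitted Java method match() are all made up for illustration. It
builds a character trie from the entity names and prints one nested
Java switch per trie level. The emitted method only returns the code
point; a real version would also have to report how many chars it
consumed and push the result into the SAX stream.

```python
# Hypothetical sketch; all names here are illustrative assumptions,
# not taken from the original script.

ENTITIES = {"amp": 0x26, "lt": 0x3C, "gt": 0x3E, "nbsp": 0xA0}  # tiny sample


def build_trie(entities):
    """Nested dicts keyed by character; ';' ends a name and maps to its code point."""
    root = {}
    for name, codepoint in entities.items():
        node = root
        for ch in name:
            node = node.setdefault(ch, {})
        node[";"] = codepoint
    return root


def emit_switch(node, depth=0, indent="    "):
    """Return Java source lines for one trie level: a switch on buf[pos + depth]."""
    pad = indent * (depth + 1)
    lines = [
        f"{pad}if (pos + {depth} >= end) return -1;",  # input ended mid-entity
        f"{pad}switch (buf[pos + {depth}]) {{",
    ]
    for ch in sorted(node):
        child = node[ch]
        if isinstance(child, int):
            # Leaf: the terminating ';' was matched, return the code point.
            lines.append(f"{pad}case '{ch}': return {child};")
        else:
            lines.append(f"{pad}case '{ch}':")
            lines.extend(emit_switch(child, depth + 1, indent))
            # No deeper match: bail out instead of falling into the next case.
            lines.append(f"{pad}{indent}return -1;")
    lines.append(f"{pad}}}")
    return lines


def generate(entities):
    """Return the Java source of a method that matches an entity name at buf[pos]."""
    body = emit_switch(build_trie(entities))
    header = ["static int match(char[] buf, int pos, int end) {"]
    footer = ["    return -1;", "}"]
    return "\n".join(header + body + footer)


if __name__ == "__main__":
    print(generate(ENTITIES))
```

In this sketch pos is assumed to point just past the '&', and each name
is stored with its trailing ';' so that no entity is a prefix of
another.
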
It's pretty brutal (it produces a 36k class file), but it's the fastest
thing that could possibly do the job, short of writing a C extension!
The pattern transformer took 800 ms on some data, where mine takes 2 ms!
If anybody is interested, I can post or email the code.
Joerg Heinicke wrote:
> That's one of the rare cases where I consider
> <xsl:text disable-output-escaping="yes"> a valid approach
Yes, that was the first thing I tried, but I discarded it as it was
causing more problems than it solved.
Tobia