Hi,

I'm building a new ContentHandler that also needs to do some work on script 
elements, but they are never reported to my startElement method. The parse 
context has the IdentityHtmlMapper set, and script is not discarded by Tika's 
own HtmlHandler: the script element is reported in HtmlHandler but not in 
my custom handler.
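
To make the setup concrete, here is a minimal sketch of the kind of handler 
in question. It is not the actual handler; the class name 
ScriptCountingHandler and its accessor methods are hypothetical, and it uses 
only the JDK's SAX classes. It counts script start tags and collects the 
characters reported inside them, which is enough to check whether script 
events arrive at all.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical diagnostic handler (illustration only): records every
// script element and the character data reported while inside one.
class ScriptCountingHandler extends DefaultHandler {
    private final StringBuilder scriptText = new StringBuilder();
    private int scriptCount = 0;
    private boolean inScript = false;

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        if (isScript(localName, qName)) {
            scriptCount++;
            inScript = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (isScript(localName, qName)) {
            inScript = false;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only keep text that arrives between a script start and end tag.
        if (inScript) {
            scriptText.append(ch, start, length);
        }
    }

    // Fall back to qName when the parser does not supply a local name.
    private static boolean isScript(String localName, String qName) {
        String name = (localName == null || localName.isEmpty()) ? qName : localName;
        return "script".equalsIgnoreCase(name);
    }

    int getScriptCount() {
        return scriptCount;
    }

    String getScriptText() {
        return scriptText.toString();
    }
}
```

With a handler like this passed to the parser together with the context 
described above, getScriptCount() staying at zero for a page that contains 
script tags would confirm that the events are not reaching the custom handler.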

The confusing thing is that I am able to get it in my handler when adding the 
script element to TagSoup inside HtmlParser's constructor:
        HTML_SCHEMA.elementType("script", HTMLSchema.M_EMPTY, 65535, 0);

Without this, script and its characters are only reported inside HtmlHandler, 
never in custom handlers.

I must be doing something wrong here. Any hints?

Thanks,
Markus
