[ https://issues.apache.org/jira/browse/TIKA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benson Margulies updated TIKA-304:
----------------------------------

    Description: 
It would be nice if one could subclass HtmlParser to change what it passes 
along, instead of having to copy it. I'll attach a first effort.

It would also be good if attributes could be preserved (particularly id 
attributes) but let's see how you like my first patch.



  was:
It would be nice if one could subclass HtmlParser to change what it passes 
along, instead of having to copy it. I'll attach a first effort.



> HtmlParser could be easier to subclass
> --------------------------------------
>
>                 Key: TIKA-304
>                 URL: https://issues.apache.org/jira/browse/TIKA-304
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.4, 0.5
>            Reporter: Benson Margulies
>         Attachments: html-parser-subclass.diff
>
>
> It would be nice if one could subclass HtmlParser to change what it passes
> along, instead of having to copy it. I'll attach a first effort.
>
> It would also be good if attributes could be preserved (particularly id
> attributes) but let's see how you like my first patch.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
