[ 
https://issues.apache.org/jira/browse/TIKA-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15336616#comment-15336616
 ] 

Tim Allison edited comment on TIKA-995 at 6/17/16 6:22 PM:
-----------------------------------------------------------

This leads to doubling of the body tag in the output of HTMLParser if users 
don't suppress "body" in their HtmlMapper.  Let's see if we can accommodate this 
without doubling the body tag...

See SOLR-8981


was (Author: [email protected]):
This leads to doubling of the body tag in the output of HTMLParser if users 
don't suppress "body" in their HtmlMapper.  Let's see if we can accommodate this 
without doubling the body tag...

> XHTMLContentHandler doesn't pass attributes of body element
> -----------------------------------------------------------
>
>                 Key: TIKA-995
>                 URL: https://issues.apache.org/jira/browse/TIKA-995
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.2
>            Reporter: Markus Jelsma
>             Fix For: 1.8
>
>         Attachments: TIKA-995-1.3-1.patch, TIKA-995-unit.patch
>
>
> XHTMLContentHandler.startElement() uses lazyHead() for the body element 
> because it's defined in the AUTO Set. As a consequence, attributes of the 
> body element are not passed to downstream content handlers. 
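
To illustrate the mechanism described in the report, here is a minimal sketch using only the JDK's SAX classes. It is not the Tika source: AutoHandler, Recorder, the contents of its AUTO set, and its simplified lazyHead() are hypothetical stand-ins that only mimic the reported behavior, where an element listed in AUTO is emitted by the handler itself with empty attributes, so the caller's attributes never reach the downstream handler.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

final class BodyAttributeSketch {

    /** Downstream handler that records each start tag together with its attributes. */
    static class Recorder extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            StringBuilder sb = new StringBuilder(local);
            for (int i = 0; i < atts.getLength(); i++) {
                sb.append(' ').append(atts.getLocalName(i)).append('=').append(atts.getValue(i));
            }
            events.add(sb.toString());
        }
    }

    /** Hypothetical, simplified stand-in for the handler in the report (not the real class). */
    static class AutoHandler extends DefaultHandler {
        // Assumed contents: elements the handler generates on its own.
        private static final Set<String> AUTO =
                new HashSet<>(Arrays.asList("html", "head", "body"));

        private final ContentHandler downstream;
        private boolean bodyStarted = false;

        AutoHandler(ContentHandler downstream) {
            this.downstream = downstream;
        }

        // Emits the body start tag once, always with empty attributes.
        private void lazyHead() throws SAXException {
            if (!bodyStarted) {
                bodyStarted = true;
                downstream.startElement("", "body", "body", new AttributesImpl());
            }
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            lazyHead();
            if (!AUTO.contains(local)) {
                // Only non-AUTO elements are forwarded with the caller's attributes.
                downstream.startElement(uri, local, qName, atts);
            }
            // For AUTO elements such as body, atts is silently dropped.
        }
    }

    /** Feeds a body tag with a class attribute, then a p tag with an id attribute. */
    static List<String> run() {
        Recorder recorder = new Recorder();
        AutoHandler handler = new AutoHandler(recorder);
        try {
            AttributesImpl bodyAtts = new AttributesImpl();
            bodyAtts.addAttribute("", "class", "class", "CDATA", "article");
            handler.startElement("", "body", "body", bodyAtts);

            AttributesImpl pAtts = new AttributesImpl();
            pAtts.addAttribute("", "id", "id", "CDATA", "intro");
            handler.startElement("", "p", "p", pAtts);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return recorder.events;
    }

    public static void main(String[] args) {
        // The body entry carries no attributes; the p entry keeps its id.
        run().forEach(System.out::println);
    }
}
```

In this sketch the recorded body event has no class attribute while the p event keeps its id, which is the symptom the report describes. Forwarding the incoming body tag as well, without suppressing the automatically generated one, would produce the doubled body tag mentioned in the comment above.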



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
