[
https://issues.apache.org/jira/browse/TIKA-895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ray Gauss II resolved TIKA-895.
-------------------------------
Resolution: Duplicate
Assignee: Ray Gauss II
> Empty title element makes Tika-generated HTML documents not open
> ----------------------------------------------------------------
>
> Key: TIKA-895
> URL: https://issues.apache.org/jira/browse/TIKA-895
> Project: Tika
> Issue Type: Bug
> Components: metadata
> Affects Versions: 1.1
> Environment: Windows 7
> Reporter: Benoit MAGGI
> Assignee: Ray Gauss II
> Priority: Trivial
> Labels: newbie
>
> I tried to convert an empty docx file to an HTML file.
> Example: java -jar tika-app-1.1.jar -x example.docx > t.html
> The resulting HTML file can't be opened with Firefox, Internet Explorer, or Chrome.
> The main point is that <title/> seems to be forbidden by the HTML specification
> (I couldn't find the equivalent statement for HTML5):
> bq. http://www.w3.org/TR/html401/struct/global.html#h-7.4.2
> bq. 7.4.2 The TITLE element
> bq. <!-- The TITLE element is not considered part of the flow of text.
> bq. It should be displayed, for example as the page header or
> bq. window title. Exactly one title is required per document.
> bq. -->
> bq. <!ELEMENT TITLE - - (#PCDATA) -(%head.misc;) -- document title -->
> bq. <!ATTLIST TITLE %i18n>
> bq. *Start tag: required, End tag: required*
> For reference, the same bug was reported for xls files:
> https://issues.apache.org/jira/browse/TIKA-725
> The simple solution would be to provide an empty title by default.
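
Until a fixed release is available, the output file can be post-processed as a workaround. The following is a minimal sketch only, not the change made in Tika itself: the class name EmptyTitleFix and the method expandEmptyTitle are hypothetical, and it assumes the only problem in the generated markup is a self-closed <title/> element.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika: rewrites a self-closed title
// element so that browsers parsing the file as HTML see an explicit end tag.
class EmptyTitleFix {
    // Matches <title/> and <title />, in any letter case.
    private static final Pattern SELF_CLOSED_TITLE =
            Pattern.compile("<title\\s*/>", Pattern.CASE_INSENSITIVE);

    /** Returns the markup with every self-closed title expanded to <title></title>. */
    static String expandEmptyTitle(String xhtml) {
        return SELF_CLOSED_TITLE.matcher(xhtml).replaceAll("<title></title>");
    }

    public static void main(String[] args) {
        String in = "<html><head><title/></head><body><p>Hello</p></body></html>";
        System.out.println(expandEmptyTitle(in));
    }
}
```

In practice the contents of t.html would be read into a string, passed through expandEmptyTitle, and written back. A title that already has a start and end tag is left untouched.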