Luis Filipe Nassif created TIKA-2033:
----------------------------------------

             Summary: Value attributes of input elements not extracted from HTML
                 Key: TIKA-2033
                 URL: https://issues.apache.org/jira/browse/TIKA-2033
             Project: Tika
          Issue Type: Improvement
          Components: parser
    Affects Versions: 1.10
         Environment: Windows 7, java8 x64
            Reporter: Luis Filipe Nassif
            Priority: Minor


The text in the value attribute of input elements is currently not extracted 
from HTML files, even though browsers render it. I tried using 
IdentityHtmlMapper and experimented with HtmlSchema, with no luck. A simple 
test HTML is below:

<HTML><body><input value='text'></input></body></HTML>
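
To illustrate the requested behavior, here is a minimal sketch of a SAX handler that treats the value attribute of input elements as extractable text. It is not Tika's implementation and not a confirmed fix. The names InputValueSketch, InputValueHandler and extract are hypothetical. The sketch uses the JDK's own SAX parser instead of Tika, which only works because the test HTML above happens to be well-formed XML. Whether Tika forwards the input element and its value attribute to a custom handler (for example with IdentityHtmlMapper) is an assumption that is not verified here.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class InputValueSketch {

    /** Hypothetical handler: collects character data plus the value attribute of input elements. */
    static class InputValueHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            // qName is used because the parser created below is not namespace-aware.
            if ("input".equalsIgnoreCase(qName)) {
                String value = atts.getValue("value");
                if (value != null) {
                    // Append the attribute text, followed by a space as a separator.
                    text.append(value).append(' ');
                }
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Ordinary element text is collected unchanged.
            text.append(ch, start, length);
        }

        String getText() {
            return text.toString().trim();
        }
    }

    /** Parses well-formed markup with the JDK SAX parser and returns the collected text. */
    static String extract(String markup) throws Exception {
        InputValueHandler handler = new InputValueHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(markup)), handler);
        return handler.getText();
    }

    public static void main(String[] args) throws Exception {
        // The test HTML from this report.
        System.out.println(extract("<HTML><body><input value='text'></input></body></HTML>"));
    }
}
```

With the test HTML from this report, the sketch collects the string "text", which is the output the report expects from the HTML parser.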



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
