David Pilato created TIKA-1165:
----------------------------------
Summary: Autodetect and parse Asciidoc
Key: TIKA-1165
URL: https://issues.apache.org/jira/browse/TIKA-1165
Project: Tika
Issue Type: Wish
Components: languageidentifier, parser
Affects Versions: 1.4
Reporter: David Pilato
Priority: Trivial
When extracting metadata from an AsciiDoc file, Tika currently reports it as plain text:
{noformat}
Content-Encoding: ISO-8859-1
Content-Length: 66363
Content-Type: text/plain; charset=ISO-8859-1
resourceName: asciidoc.adoc
{noformat}
Steps to reproduce:
{code:title=asciidoc.sh|borderStyle=solid}
curl -O -s https://raw.github.com/asciidoctor/asciidoctor.org/master/docs/asciidoc-syntax-quick-reference.adoc
java -jar tika-app-1.4.jar -m asciidoc-syntax-quick-reference.adoc
{code}
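To make the result easier to check from a script, here is a minimal sketch that pulls the media type out of `-m` style output. The helper name `detected_type` is hypothetical and not part of Tika; it only assumes the `Content-Type: ...` line format shown in the metadata above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tika): reads metadata lines on stdin and
# prints the media type from the first Content-Type line, dropping any
# parameters such as "; charset=...".
detected_type() {
  sed -n 's/^Content-Type: *\([^;]*\).*$/\1/p' | head -n 1
}

# Feed it the metadata quoted in this issue.
printf '%s\n' \
  'Content-Encoding: ISO-8859-1' \
  'Content-Length: 66363' \
  'Content-Type: text/plain; charset=ISO-8859-1' \
  'resourceName: asciidoc.adoc' | detected_type
```

With the jar from the steps above, the same helper could be used as `java -jar tika-app-1.4.jar -m asciidoc-syntax-quick-reference.adoc | detected_type`, which should currently print the plain-text type.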