[ https://issues.apache.org/jira/browse/TIKA-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030410#comment-14030410 ]
Dan Allen commented on TIKA-1165:
---------------------------------
I strongly recommend using AsciidoctorJ, as you have suggested. AsciidoctorJ is
the official and most comprehensive way of processing AsciiDoc on the JVM.
http://asciidoctor.org/docs/install-and-use-asciidoctor-java-integration/
https://github.com/asciidoctor/asciidoctorj
AsciidoctorJ uses JRuby to invoke Asciidoctor. You can either use the bundled
JRuby runtime or you can configure it to use an alternate JRuby runtime.
AsciidoctorJ is used by both GitBlit and Bintray to render AsciiDoc documents.
Feel free to reach out to those teams if you want feedback about how it
performs.
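Rendering aside, the detection half of this issue could start with something as simple as a file-extension check. The following is only a minimal sketch in plain Java, not Tika's detector API: the class name `AsciiDocDetector`, the `detect` method, the `text/x-asciidoc` media type name, and the list of extensions are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Tika or AsciidoctorJ.
class AsciiDocDetector {
    // Assumed media type name for AsciiDoc sources.
    static final String ASCIIDOC = "text/x-asciidoc";
    // Fallback, matching what the issue reports today.
    static final String PLAIN = "text/plain";

    // Assumed set of common AsciiDoc file extensions.
    private static final Set<String> EXTENSIONS =
            new HashSet<>(Arrays.asList(".adoc", ".asciidoc", ".ad"));

    // Returns the assumed AsciiDoc media type when the resource name ends
    // with a known extension, otherwise falls back to text/plain.
    static String detect(String resourceName) {
        if (resourceName == null) {
            return PLAIN;
        }
        String name = resourceName.toLowerCase(Locale.ROOT);
        int dot = name.lastIndexOf('.');
        if (dot < 0) {
            return PLAIN;
        }
        return EXTENSIONS.contains(name.substring(dot)) ? ASCIIDOC : PLAIN;
    }

    public static void main(String[] args) {
        System.out.println(detect("asciidoc-syntax-quick-reference.adoc"));
        System.out.println(detect("notes.txt"));
    }
}
```

A name-based check like this cannot recognize AsciiDoc that arrives without a file name, so it would only cover the case in the steps to reproduce below.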
> Autodetect and parse Asciidoc
> -----------------------------
>
> Key: TIKA-1165
> URL: https://issues.apache.org/jira/browse/TIKA-1165
> Project: Tika
> Issue Type: Wish
> Components: languageidentifier, parser
> Affects Versions: 1.4
> Reporter: David Pilato
> Priority: Trivial
>
> When parsing asciidoc metadata, we currently get the following:
> {noformat}
> Content-Encoding: ISO-8859-1
> Content-Length: 66363
> Content-Type: text/plain; charset=ISO-8859-1
> resourceName: asciidoc.adoc
> {noformat}
> Steps to reproduce:
> {code:title=asciidoc.sh|borderStyle=solid}
> curl https://raw.github.com/asciidoctor/asciidoctor.org/master/docs/asciidoc-syntax-quick-reference.adoc -O -s
> java -jar tika-app-1.4.jar -m asciidoc-syntax-quick-reference.adoc
> {code}