Hi,

On Mon, Oct 25, 2010 at 10:27 AM, Ista Pouss <[email protected]> wrote:
> There is no official spec of the markup language.  There are some
> parsers... I find "Wiki2HtmlJavaProgram"
> (http://community.jboss.org/wiki/Wiki2HtmlJavaProgram) and "jwpl"
> (http://code.google.com/p/jwpl/). Perhaps it's best to start from
> scratch with antlr ?

Note that since the MediaWiki markup is practically plain text with
some structural formatting rules, you can get pretty far with Tika's
normal plain text parser unless you really need the structural
information.

Or if you already have the markup of a wiki page available as a string
or a character stream (for example if you're accessing the underlying
database or JSON exports directly), then there may be no need to
involve Tika in the process.
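
To illustrate that second case, here is a rough sketch of handling the
markup directly as a string, using only the JDK. The class name
WikiMarkupText and the method toPlainText are made up for this example,
and the handful of regular expressions covers only a few common
constructs (links, bold/italic quotes, headings). It is nowhere near a
full MediaWiki parser, so treat it as a starting point only.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika or MediaWiki.
public final class WikiMarkupText {
    // [[target|label]] is reduced to "label"
    private static final Pattern PIPED_LINK =
            Pattern.compile("\\[\\[[^\\]|]*\\|([^\\]]*)\\]\\]");
    // [[target]] is reduced to "target"
    private static final Pattern PLAIN_LINK =
            Pattern.compile("\\[\\[([^\\]|]*)\\]\\]");
    // ''italic'' and '''bold''': runs of two or more apostrophes are dropped
    private static final Pattern QUOTES = Pattern.compile("'{2,}");
    // == Heading == on its own line is reduced to "Heading"
    private static final Pattern HEADING =
            Pattern.compile("(?m)^=+[ \\t]*(.*?)[ \\t]*=+[ \\t]*$");

    private WikiMarkupText() {
    }

    // Strips a small, assumed subset of the markup; everything else is
    // passed through unchanged as plain text.
    public static String toPlainText(String markup) {
        String text = PIPED_LINK.matcher(markup).replaceAll("$1");
        text = PLAIN_LINK.matcher(text).replaceAll("$1");
        text = QUOTES.matcher(text).replaceAll("");
        text = HEADING.matcher(text).replaceAll("$1");
        return text;
    }

    public static void main(String[] args) {
        String markup = "== Intro ==\n"
                + "'''Tika''' is a [[Apache Software Foundation|ASF]] project.";
        System.out.println(toPlainText(markup));
    }
}
```

Templates, tables and nested constructs are left untouched by this
sketch; if you need those, a real parser is the better route.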

BR,

Jukka Zitting
