My problem is that the checks generate an enormous number of errors
wherever wiki markup occurs, especially around name= and similar.

This is not just a problem for me, but for any Wikipedia user checking pages.

Maybe a built-in Parsoid-like routine would help?

What exactly do we check? Would it be enough if all wiki markup were
hidden inside some kind of tag?

Ruud

On 15-09-14 at 13:16, Daniel Naber wrote:
> On 2014-09-15 10:50, R.J. Baars wrote:
>
>> How can I improve LT specifically for Wikipedia?
>> I would like to remove all false positives, caused by the Wiki markup.
> Here's what I think is the proper solution (for the use case of checking
> the recent changes feed):
>
> -send the old version of the page to Parsoid (Parsoid is a system that
> would do the parsing for us, turning Wikipedia markup into something we
> can actually parse - see https://www.mediawiki.org/wiki/Parsoid)
> -send the new version of the page to Parsoid
> -make an XML diff to see where the changes are
> -run the LT on the text of the paragraphs that have been changed
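
The diff step in the list above could be sketched roughly as follows. This is only a minimal illustration, assuming both page versions have already been converted to plain text with paragraphs separated by blank lines (for example from Parsoid's output). The function name changed_paragraphs is hypothetical, a plain sequence diff from Python's standard library stands in for the XML diff, and the Parsoid requests and the LT call itself are not shown.

```python
import difflib


def changed_paragraphs(old_text, new_text):
    """Return the paragraphs of new_text that were added or modified.

    Assumption: paragraphs are separated by blank lines and the wiki
    markup has already been stripped by an earlier step.
    """
    old_paras = [p.strip() for p in old_text.split("\n\n") if p.strip()]
    new_paras = [p.strip() for p in new_text.split("\n\n") if p.strip()]

    # Compare the two versions paragraph by paragraph.
    matcher = difflib.SequenceMatcher(a=old_paras, b=new_paras, autojunk=False)

    changed = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "replace" and "insert" mark paragraphs that are new or edited;
        # "equal" and "delete" leave nothing to check in the new version.
        if tag in ("replace", "insert"):
            changed.extend(new_paras[j1:j2])
    return changed
```

Only the paragraphs returned here would then need to be passed to LT, which keeps unchanged text out of the check.
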
>
> The problem with this is that a) it needs considerable development
> effort and b) it makes the checking slower and less robust due to the
> additional http requests that it requires.
>
> Working on the rules (grammar.xml/disambiguation.xml) to prevent false
> alarms caused by Wikipedia markup is a hack; I wouldn't recommend that.
>
>> I can adjust rules, but how do I get them there to see the results?
> You could use this (available from
> https://languagetool.org/download/snapshots/):
> java -jar languagetool-wikipedia.jar wiki-check http://de.wikipedia.org/wiki/Bielefeld
>
> Regards
>    Daniel
>
>
> ------------------------------------------------------------------------------
> Want excitement?
> Manually upgrade your production database.
> When you want reliability, choose Perforce
> Perforce version control. Predictably reliable.
> http://pubads.g.doubleclick.net/gampad/clk?id=157508191&iu=/4140/ostg.clktrk
> _______________________________________________
> Languagetool-devel mailing list
> Languagetool-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/languagetool-devel

