My problem is that the checks generate an enormous number of errors wherever wiki markup is encountered, especially around name= and the like.
This is not just about me; it affects any Wikipedia user checking pages. Maybe a built-in Parsoid-like routine would help? What exactly do we check? Would it be enough if all wiki markup were hidden inside some kind of tag?

Ruud

On 15-09-14 at 13:16, Daniel Naber wrote:
> On 2014-09-15 10:50, R.J. Baars wrote:
>
>> How can I improve LT specifically for Wikipedia?
>> I would like to remove all false positives, caused by the Wiki markup.
> Here's what I think is the proper solution (for the use case of checking
> the recent changes feed):
>
> -send the old version of the page to Parsoid (Parsoid is a system that
> would do the parsing for us, turning Wikipedia markup into something we
> can actually parse - see https://www.mediawiki.org/wiki/Parsoid)
> -send the new version of the page to Parsoid
> -make an XML diff to see where the changes are
> -run the LT on the text of the paragraphs that have been changed
>
> The problem with this is that a) it needs considerable development
> effort and b) it makes the checking slower and less robust due to the
> additional http requests that it requires.
>
> Working on the rules (grammar.xml/disambiguation.xml) to prevent false
> alarms caused by Wikipedia markup is a hack, I wouldn't recommend that.
>
>> I can adjust rules, but how do I get them there to see the results?
> You could use this (available from
> https://languagetool.org/download/snapshots/):
> java -jar languagetool-wikipedia.jar wiki-check http://de.wikipedia.org/wiki/Bielefeld
>
> Regards
> Daniel
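
The last two steps of the proposal quoted above (find what changed, then check only the changed paragraphs) can be sketched roughly as follows. This is a minimal illustration only, not LanguageTool or Parsoid code. It assumes both page versions have already been converted to markup-free plain text, and it substitutes a simple paragraph-level diff from Python's difflib for the XML diff mentioned in the proposal. The names split_paragraphs, changed_paragraphs and check_changes are hypothetical, and the check argument is a stand-in for whatever actually calls LT.

```python
import difflib
from typing import Callable, List


def split_paragraphs(text: str) -> List[str]:
    """Split plain text into non-empty paragraphs separated by blank lines."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def changed_paragraphs(old_text: str, new_text: str) -> List[str]:
    """Return the paragraphs of new_text that were added or modified.

    Assumption: both inputs are already free of wiki markup.
    """
    old_paras = split_paragraphs(old_text)
    new_paras = split_paragraphs(new_text)
    matcher = difflib.SequenceMatcher(a=old_paras, b=new_paras, autojunk=False)
    changed: List[str] = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "replace" and "insert" are the opcodes that contribute new text.
        if tag in ("replace", "insert"):
            changed.extend(new_paras[j1:j2])
    return changed


def check_changes(old_text: str, new_text: str,
                  check: Callable[[str], List[str]]) -> List[str]:
    """Run a checker (hypothetical stand-in for LT) on changed paragraphs only."""
    findings: List[str] = []
    for para in changed_paragraphs(old_text, new_text):
        findings.extend(check(para))
    return findings
```

Passing the checker in as a plain function keeps the diff step independent of how LT is invoked, so the same code works whether LT runs in-process or behind an HTTP request.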
_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel