Re: [Wikitech-l] Collaborative machine translation for Wikipedia -- proposed strategy

2013-07-29 Thread C. Scott Ananian
I think we're mostly agreed now. And I agree that rule-based systems can provide valuable bootstrapping, if the requisite language experts can be found. I suspect we will find that different language pairs will favor different techniques. --scott -- (http://cscott.net)

Re: [Wikitech-l] Collaborative machine translation for Wikipedia -- proposed strategy

2013-07-29 Thread Lars Aronsson
On 07/26/2013 09:25 PM, David Cuenca wrote: This is the preliminary draft: https://meta.wikimedia.org/wiki/Collaborative_Machine_Translation_for_Wikipedia Apertium, the GNU GPL software project for rule-based translation that you mention, seems quite promising. In the near term, it would make s

Re: [Wikitech-l] Collaborative machine translation for Wikipedia -- proposed strategy

2013-07-28 Thread John Erling Blad
In my opinion the only thing that is going to work in the short term is a guided rule-based system. We need that to be able to reuse values from Wikidata in running text. That is, a template text must be transformed according to gender, plurality, etc., but also the values must be adjusted to genitive
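
To make the idea of rule-guided template text concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function names (`plural`, `genitive`, `render`), the placeholder names, and the deliberately naive English rules are all hypothetical and are not taken from Wikidata, MediaWiki, or Apertium. A real system would need per-language rule sets written by language experts.

```python
# Hypothetical illustration only: every name and rule below is an assumption,
# not an API of Wikidata, MediaWiki, or Apertium.

PRONOUNS = {"female": "she", "male": "he"}


def plural(noun, count):
    """Naive English number rule: add 's' unless the count is exactly 1."""
    return noun if count == 1 else noun + "s"


def genitive(name):
    """Naive English genitive rule: add 's, or only ' after a final s."""
    return name + "'" if name.endswith("s") else name + "'s"


def render(template, values):
    """Fill a template, adjusting the raw values by gender, number and case."""
    forms = {
        "name": values["name"],
        "name_gen": genitive(values["name"]),
        "pronoun": PRONOUNS.get(values["gender"], "they"),
        "count": values["count"],
        "noun": plural(values["noun"], values["count"]),
    }
    return template.format(**forms)


# The template author writes grammatical slots, not raw values.
TEMPLATE = "{name_gen} entry lists {count} {noun} that {pronoun} wrote."

print(render(TEMPLATE, {"name": "Ada", "gender": "female", "count": 3, "noun": "book"}))
print(render(TEMPLATE, {"name": "Lars", "gender": "male", "count": 1, "noun": "book"}))
```

The point of the sketch is that the stored values stay language-neutral, while the per-language rules decide which surface form ends up in the running text.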

Re: [Wikitech-l] Collaborative machine translation for Wikipedia -- proposed strategy

2013-07-27 Thread David Cuenca
On Sat, Jul 27, 2013 at 10:39 AM, C. Scott Ananian wrote: > My main point was just that there is a chicken-and-egg problem here. You > assume that machine translation can't work because we don't have enough > parallel texts. But, to the extent that machine-aided translation of WP is > successful

Re: [Wikitech-l] Collaborative machine translation for Wikipedia -- proposed strategy

2013-07-27 Thread C. Scott Ananian
On Sat, Jul 27, 2013 at 10:18 AM, David Cuenca wrote: > Scott, "edit and maintain" parallelism sounds wonderful on paper, until you > want to implement it and then you realize that you have to freeze changes > both in the source text and in the target language for it to happen, which > is, IMHO a

Re: [Wikitech-l] Collaborative machine translation for Wikipedia -- proposed strategy

2013-07-27 Thread David Cuenca
On Fri, Jul 26, 2013 at 11:30 PM, C. Scott Ananian wrote: > This statement seems rather defeatist to me. Step one of a machine > translation effort should be to provide tools to annotate parallel texts in > the various wikis, and to edit and maintain their parallelism. Scott, "edit and maintain

Re: [Wikitech-l] Collaborative machine translation for Wikipedia -- proposed strategy

2013-07-26 Thread C. Scott Ananian
On Fri, Jul 26, 2013 at 3:25 PM, David Cuenca wrote: > This is the preliminary draft: > > https://meta.wikimedia.org/wiki/Collaborative_Machine_Translation_for_Wikipedia The linked page says: > For this kind of project it is preferred to use a rule-based machine > translation

[Wikitech-l] Collaborative machine translation for Wikipedia -- proposed strategy

2013-07-26 Thread David Cuenca
After Erik's email about supporting open source machine translation [1], I've been researching options and having talks with several machine translation researchers about what would be the best way to integrate MT into Wikipedia. Unfortunately I couldn't find a single solution that, on its own, wou