> 2) Implement spelling and punctuation check automatically within GTTK before
> posting of the articles.
>
> There is spell check in Translator Toolkit, although it's not available for
> all languages. We don't have any punctuation checks today and I doubt that
> we can release this anytime soon. (If it's not available in Google Docs or
> Gmail, then it's unlikely that we'll have it for Translator Toolkit, as
> well, since we use the same infrastructure.)
>
> What's the proposal, though - would you like for us to prevent publishing of
> articles if they have too many spelling errors, or simply warn the user that
> there are X spelling errors? Any input you can provide on preferred
> behavior would be great.

I would say to force a spell check before publication, which does not seem
to happen currently. I think this would be enough - perhaps with a warning
as well. I don't know about preventing publication, although that might
work too.

> 3) Have GTTK automatically remove broken templates and images, or require
> users to translate any templates before a page may be posted.
>
> Templates are a bit tricky. Sometimes, a template in one Wikipedia does not
> exist in another Wikipedia. Other times, a template in one language maps to
> a template in another language but the parameters are different.
>
> Removing broken templates automatically may not work because some templates
> come between words. If we remove them, the sentences or paragraph may
> become invalid. We've also considered creating a custom interface for
> localizing templates, but this requires a lot of work.
>
> In the interim, the approach we've taken is to have translators fix the
> templates in Wikipedia when they post the article from Translator Toolkit.
> When a user clicks on Share > Publish to source page in Translator Toolkit,
> the Wikipedia article is in preview mode --- it's not live. The idea is
> that if there are any errors, the translator can fix them before saving the
> article.

Well, many translators do fix such problems, but I was thinking of some of
the problems I've heard about so far with people who do "drive-by"
translations, dropping an article on a project and then disappearing. If
translators are careful and do all the work themselves, templates are an
annoyance rather than a real problem.

> 4) Include a list of most needed articles for people to create, rather than
> random articles that will be of little use to local readers. Some articles,
> such as those on local topics, have the added benefit of encouraging more
> edits and community participation since they tend to generate more interest
> from speakers of a language in my experience.
>
> The articles we selected actually weren't really random. Here's how we
> selected them:
>
> 1. we looked at the top Google searches in the region (e.g., for Tamil, we
> looked at searches in India and I believe Sri Lanka, as well)
> 2. from the top Google searches in the region, we looked at the top, clicked
> Wikipedia articles --- regardless of the language (so we wound up with
> Wikipedia source articles in English, Hindi, and other languages)
> 3. from the top, clicked Wikipedia articles, we looked for articles that
> were either stubs or unavailable in the local language - these are the
> articles that we sent for translation
>
> This selection isn't perfect. For example, it assumes that the top, clicked
> Wikipedia articles by all users in India/Sri Lanka --- who may be searching
> in English, Hindi, Tamil, or some other language --- are relevant to the
> Tamil community. To improve this, last month, we met with members of the
> Tamil and Telugu Wikipedias to improve this article selection. The main
> changes that we agreed on were:

I'm not sure if this project was separate from the Swahili Wikipedia
Challenge, but I'm assuming it was after seeing articles such as
http://sw.wikipedia.org/wiki/Maduka_ya_United_Cigar_Stores (about a defunct
chain of cigar stores in the US), which I doubt were popular searches in
East Africa.

One more idea: automatically add existing interwiki links to the new
article.

Also, as far as Indic languages go, I would ask if there's any chance you
have any Oriya speakers - with 637 articles, the Oriya Wikipedia is by far
the most anemic of the Indic-language Wikipedias, in spite of a speaker
population of 31 million.

-m.

_______________________________________________
foundation-l mailing list
[email protected]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
