Re: [Wikimedia-l] Priority languages
On Mon, Jun 1, 2015 at 11:53 AM, Milos Rancic mill...@gmail.com wrote:

> * MediaWiki is developing and messages are changing. While it doesn't matter a lot for the main language to have 99% and not 100% of translated most used messages, the new one won't get a project if it's not 100%. (The situation as it is; I don't like it, but I can't change it.)

Won't get a project? Are you saying that new project language editions are only approved if the MediaWiki messages for that language are all translated already? (Maybe I'm misunderstanding.)

___
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe
Re: [Wikimedia-l] Priority languages
On Jun 2, 2015 00:39, Benjamin Lees emufarm...@gmail.com wrote:

> On Mon, Jun 1, 2015 at 11:53 AM, Milos Rancic mill...@gmail.com wrote:
>> * MediaWiki is developing and messages are changing. While it doesn't matter a lot for the main language to have 99% and not 100% of translated most used messages, the new one won't get a project if it's not 100%. (The situation as it is; I don't like it, but I can't change it.)
>
> Won't get a project? Are you saying that new project language editions are only approved if the MediaWiki messages for that language are all translated already? (Maybe I'm misunderstanding.)

Writing from my phone, so I can't give links. Search for "Language proposal policy" on Meta. That's the theory; above I described how it translates into practice.

It makes sense up to a certain point: the MediaWiki interface should be in the native language. The condition for the first project isn't hard; it's about 500 messages. For the second and third projects, I think there is much more sensible work to be done than translating various obscure parts of the MW interface. (A few years ago a few of us, after a lot of arguing, removed translation of the CheckUser interface as a requirement for the third project, likely Wikibooks or Wikisource.)

For the sake of future arguments, it would be useful to have data on how often people access particular messages.
Re: [Wikimedia-l] Priority languages
Milos Rancic wrote:

> On Jun 2, 2015 00:39, Benjamin Lees emufarm...@gmail.com wrote:
>> Won't get a project? Are you saying that new project language editions are only approved if the MediaWiki messages for that language are all translated already? (Maybe I'm misunderstanding.)
>
> [...]
>
> It would be useful for the sake of future arguments to have data how often people access to particular messages.

Directly related: https://phabricator.wikimedia.org/T65416#1042471.

Though upon re-reading it just now, the specific wording used at https://meta.wikimedia.org/wiki/Language_proposal_policy is actually softer than I thought (it is recommended instead of a hard requirement).

MZMcBride
Re: [Wikimedia-l] Priority languages
On Jun 2, 2015 02:08, MZMcBride z...@mzmcbride.com wrote:

> Milos Rancic wrote:
>> On Jun 2, 2015 00:39, Benjamin Lees emufarm...@gmail.com wrote:
>>> Won't get a project? Are you saying that new project language editions are only approved if the MediaWiki messages for that language are all translated already? (Maybe I'm misunderstanding.)
>>
>> [...]
>>
>> It would be useful for the sake of future arguments to have data how often people access to particular messages.
>
> Directly related: https://phabricator.wikimedia.org/T65416#1042471.
>
> Though upon re-reading it just now, the specific wording used at https://meta.wikimedia.org/wiki/Language_proposal_policy is actually softer than I thought (it is recommended instead of a hard requirement).

I will read the Phabricator discussion in the morning.

Regarding the LPP wording: as I mentioned above, it's the theory. Practice is pretty hard and was even harder in the past. I remember Robin and I waging hard battles for every set we wanted to remove from the requirements. I am sure that's documented somewhere, but I forgot where. It should be somewhere on Meta.
Re: [Wikimedia-l] Priority languages
Actually, currently the requirements are:

- most-used messages for the first project in a language
- continuing translation activity on TWN for all others

On 02.06.2015 at 02:18, Milos Rancic mill...@gmail.com wrote:

> Will read Phabricator discussion in the morning...
>
> Regarding LPP wording, as I mentioned above, it's theory. Practice is pretty hard and was even harder in the past. I remember Robin and I were waging hard battles for every set we wanted to remove from requirements. I am sure that's documented somewhere, but I forgot where. It should be somewhere on Meta.
[Wikimedia-l] Priority languages
Below is the list of languages with more than one million L2 speakers, sorted by the number of L2 speakers.

L2 speakers appear on two occasions:

* The first, and the one important to us, is languages used for wider communication. For example, French is an L2 among educated people in West Africa.
* The second is native languages in a weak position (either dying or reviving). For example, English is the L1 of most Native Americans, and Russian is the L1 of most ethnicities of the former Soviet Union, while their own languages are L2 ones. (They are important in other cases, but not for this purpose.)

I omitted English (it makes no sense to include it, as we are communicating in English and English is the default for all localization) and a few spoken languages (our content is [mostly] written). I also removed some languages which belong to the second category (Irish Gaelic and Scots, for example); some of the languages remaining on the list may belong to that category as well (though I am pretty sure they don't).

Some languages on this list have well-developed Wikimedia projects and no particular need for promotion of work on Wikimedia projects: French, Spanish and German are examples. Russian is not on the list, as it's usually an L1, as mentioned above, but it belongs to the category of languages with well-developed Wikimedia projects.

There are also languages spoken in countries with a low level of internet access and issues much more important than writing an encyclopedia; Congo Swahili is one. Those areas are not yet ready even for projects like OLPC, and we don't have a lot to do there.

But there are a number of languages in between, with active chapter(s) or user group(s) in the relevant countries. Those languages should be the priority in promoting collaboration. They are: Arabic (Arabic user groups), Indonesian (WM ID), Hindi (WM IN), Urdu (Pakistani user group), Thai (Thailand UG), Bengali (WM BD), Zulu (WM ZA), Hausa (West African user groups), Xhosa (WM ZA), Afrikaans (WM ZA), Kannada (WM IN), Telugu (WM IN), Tsonga (WM ZA), Malay (WM ID and Malaysian Wikimedians), Marathi (WM IN).

The priorities for those languages should include (but are likely not limited to):

* Translation of MediaWiki messages should be 100%.
* Those languages should be priorities for every document which should be translated. For example, the ongoing Board elections, but also various Meta pages.
* We should have a pool of literate people in those languages for various purposes, not just for translation. For example, if we want to create projects in the languages of Pakistan, we should have a number of literate Urdu speakers willing to help newcomers who speak Urdu as their L2.

I will be back with other language-related data :)

Language          Code  L1 speakers  L2 speakers
Standard Arabic   arb   206,000,000  246,000,000
Mandarin Chinese  cmn   847,808,270  178,000,000
Indonesian        ind    23,200,480  140,000,000
Hindi             hin   260,333,620  120,000,000
Spanish           spa   398,931,840   96,990,000
Urdu              urd    64,035,800   94,000,000
French            fra    75,916,150   87,000,000
Thai              tha    20,396,930   40,000,000
Bengali           ben   189,261,200   19,200,000
Zulu              zul    11,969,100   15,700,000
Hausa             hau    25,109,000   15,000,000
Xhosa             xho     8,177,300   11,000,000
Afrikaans         afr     7,096,810   10,300,000
Bamanankan        bam     4,072,040   10,000,000
Burmese           mya    32,035,300   10,000,000
Congo Swahili     swc         1,000    9,100,000
Northern Sotho    nso     4,631,000    9,100,000
Kannada           kan    37,739,040    9,000,000
German            deu    78,093,980    8,000,000
Tamil             tam    68,776,460    8,000,000
Jula              dyu     2,550,000    7,000,000
Lingala           lin     2,141,300    7,000,000
Koongo            kng     5,016,500    5,000,000
Telugu            tel    74,049,000    5,000,000
Ibibio            ibb     1,500,000    4,500,000
Tok Pisin         tpi       122,000    4,000,000
Krio              kri       495,600    4,000,000
Amharic           amh    21,811,600    4,000,000
Bangala           bxg            ~0    3,500,000
Tsonga            tso     4,009,000    3,400,000
Malay             zlm    15,848,500    3,000,000
Marathi           mar    71,780,660    3,000,000
Sinhala           sin    15,613,980    2,000,000
Efik              efi       405,260    2,000,000
Duala             dua        87,700    2,000,000
Yoruba            yor    19,380,800    2,000,000
Shona             sna    10,741,700    1,800,000
Venda             ven     1,294,000    1,700,000
Sango             sag       404,000    1,600,000
Manado Malay      xmm       850,000    1,500,000
Sylheti           syl    10,300,000    1,500,000
Ambonese Malay    abs       245,020    1,400,000
Ndebele           nbl     1,090,000    1,400,000
Rakhine           rki     1,000,000    1,020,000
Ganda             lug     4,130,000    1,000,000
Akan              aka     8,314,600    1,000,000
Khmer             khm    14,224,500    1,000,000
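The selection behind the list (keep languages with roughly a million or more L2 speakers, then sort by L2 count) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows are a small sample copied from the figures in the message, the names `ROWS`, `by_l2` and `l2_share` are hypothetical, and the cutoff is taken as "at least one million" because the list itself includes entries at exactly 1,000,000.

```python
# A few rows copied from the figures in the message:
# (language, code, L1 speakers, L2 speakers).
ROWS = [
    ("Khmer", "khm", 14_224_500, 1_000_000),
    ("Urdu", "urd", 64_035_800, 94_000_000),
    ("Standard Arabic", "arb", 206_000_000, 246_000_000),
    ("Tsonga", "tso", 4_009_000, 3_400_000),
    ("Spanish", "spa", 398_931_840, 96_990_000),
    ("Indonesian", "ind", 23_200_480, 140_000_000),
]

# Assumed cutoff: the message says "more than one million", but the list
# contains rows at exactly 1,000,000, so ">=" is used here.
L2_THRESHOLD = 1_000_000


def by_l2(rows, threshold=L2_THRESHOLD):
    """Return the rows at or above the threshold, largest L2 count first."""
    kept = [row for row in rows if row[3] >= threshold]
    return sorted(kept, key=lambda row: row[3], reverse=True)


def l2_share(row):
    """Fraction of a language's listed speakers who use it as an L2."""
    _, _, l1, l2 = row
    return l2 / (l1 + l2)


for row in by_l2(ROWS):
    print(f"{row[0]:<16} {row[1]}  L2 share: {l2_share(row):.0%}")
```

The L2 share is one rough way to see which languages serve mainly for wider communication (Indonesian, Urdu) and which are mostly spoken natively (Spanish, Khmer); it is not a measure used in the message itself.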
Re: [Wikimedia-l] Priority languages
On Wed, May 27, 2015 at 2:04 PM, Milos Rancic mill...@gmail.com wrote:

> [...]
>
> But there are a number of languages in between with active chapter(s) or user group(s) inside of relevant countries. Those languages should be the priority in promotion collaboration. They are: Arabic (Arabic user groups), Indonesian (WM ID), Hindi (WM IN), Urdu (Pakistani user group), Thai (Thailand UG), Bengali (WM BD), Zulu (WM ZA), Hausa (West African user groups), Xhosa (WM ZA), Afrikaans (WM ZA), Kannada (WM IN), Telugu (WM IN), Tsonga (WM ZA), Malay (WM ID and Malaysian Wikimedians), Marathi (WM IN).
>
> The priorities for those languages should include (but likely not limited to):
>
> * Translation of MediaWiki messages should be 100%.
> * Those languages should be priorities for every document which should be translated. For example, ongoing Board elections; but also various Meta pages.

For some of these languages, I don't see that this makes sense, in terms of investment versus impact, or in terms of putting the cart before the horse.

Tsonga, for example, seems to have precisely one active editor[1] -- no doubt, our colleague Dumisani Ndubane (CCed as a courtesy). While it is indeed his native language, it does not seem like a good investment of effort to translate hundreds of pages and thousands of strings into Tsonga when he (and, with overwhelming likelihood, any other literate speaker of Tsonga) is also fluent (and educated in) English. Xhosa, Hausa and Zulu are in the same class[2][3][4].

The other languages you mention, on the other hand, already have established, active communities. We can indeed make more of an effort to encourage greater participation by those communities (and thereby by speakers of those languages) in international goings-on via increased participation in volunteer translation, regular or semi-regular reporting (so that it's not all one-way). This is in fact generally done by some of us whenever we are in contact with folks from those communities.

Crucially, I don't see that the problem lends itself to outside engineering. If we want more interchange with, say, the Telugu community, we should talk to it. I know I do.

> * We should have the pool of literate people in those languages for various purposes, not just for translation. For example, if we want to create projects in languages of Pakistan, we should have a number of literate Urdu speakers, willing to help newcomers speaking Urdu as their L2 language.

Again, there already exists a community of dedicated contributors to the Urdu Wikipedia[5] (apparently more from India than from Pakistan, no doubt partially due to script issues[6]). Some of you, particularly in the last year, have been energetically mentoring newcomers and doing outreach activities. Our colleagues Nisar Ahmad Syed and Muzammiluddin Syed (CCed) are two such volunteers.

Now, what, precisely, are you suggesting?

Cheers,

Asaf

[1] http://stats.wikimedia.org/EN/SummaryTS.htm
[2] http://stats.wikimedia.org/EN/SummaryXH.htm
[3] http://stats.wikimedia.org/EN/SummaryHA.htm
[4] http://stats.wikimedia.org/EN/SummaryZU.htm
[5] http://stats.wikimedia.org/EN/SummaryUR.htm
[6] Urdu Wikipedia is configured to use the Naskh script (used in India) rather than the Nasta'liq script (favored in Pakistan)

--
Asaf Bartov
Wikimedia Foundation http://www.wikimediafoundation.org
Re: [Wikimedia-l] Priority languages
For reference, these are the languages which were prioritized for the elections. Additional translation support was utilized to help with the 48-72 hour deadlines:

1. العربية (ar)
2. বাংলা (bn)
3. Deutsch (de)
4. español (es)
5. فارسی (fa)
6. français (fr)
7. हिन्दी (hi)
8. Bahasa Indonesia (id)
9. italiano (it)
10. 日本語 (ja)
11. Nederlands (nl)
12. português do Brasil (pt-br)
13. русский (ru)
14. Kiswahili (sw)
15. Türkçe (tr)
16. Tiếng Việt (vi)
17. 中文 (zh)

-greg (User:Varnent)
Re: [Wikimedia-l] Priority languages
Milos,

The formatting has been stripped from the email because the list software doesn't like HTML, so the table at the bottom is illegible. Is it available elsewhere?

Richard Symonds
Wikimedia UK