Re: [Foundation-l] Push translation

2010-08-17 Thread David Gerard
On 17 August 2010 03:22, Samuel Klein meta...@gmail.com wrote: On Fri, Aug 13, 2010 at 4:28 PM, Michael Galvez michae...@gmail.com wrote: On Fri, Aug 6, 2010 at 5:56 PM, David Gerard dger...@gmail.com wrote: And the data that GTTK gathers from its use in Wikipedia translations? What would

Re: [Foundation-l] Push translation

2010-08-17 Thread stevertigo
Michael Galvez michae...@gmail.com wrote: Translator Toolkit processes generic Media Wiki text, although this is not an officially supported feature and is largely untested.  If you upload a UTF-8 file with extension .mediawiki, Translator Toolkit will try to render the file in the same way

Re: [Foundation-l] Push translation

2010-08-16 Thread Samuel Klein
On Fri, Aug 13, 2010 at 4:28 PM, Michael Galvez michae...@gmail.com wrote: On Fri, Aug 6, 2010 at 5:56 PM, David Gerard dger...@gmail.com wrote: And the data that GTTK gathers from its use in Wikipedia translations? What would need to happen for that to start coming back, in a usable form?

Re: [Foundation-l] Push translation

2010-08-13 Thread Michael Galvez
On Fri, Aug 6, 2010 at 2:13 PM, Michael Snow wikipe...@verizon.net wrote: Michael Galvez wrote: 2. Once the articles exist in multiple languages, the articles take on a life of their own and become out of sync. If Wikipedians want to keep those articles in sync, we would like to help

Re: [Foundation-l] Push translation

2010-08-13 Thread Michael Galvez
Hi Amir, Apologies for the late reply. Replies inline below. Mike On Fri, Aug 6, 2010 at 3:14 PM, Amir E. Aharoni amir.ahar...@mail.huji.ac.il wrote: Dear Michael, I also thank you for joining the discussion. See my question below. 2010/8/6 Michael Galvez michae...@gmail.com: Also, as

Re: [Foundation-l] Push translation

2010-08-13 Thread Michael Galvez
On Fri, Aug 6, 2010 at 5:56 PM, David Gerard dger...@gmail.com wrote: On 6 August 2010 18:47, Michael Galvez michae...@gmail.com wrote: 3. We acquire dictionaries on limited licenses from other parties. In general, while we can surface this content on our own sites (e.g., Google

Re: [Foundation-l] Push translation

2010-08-13 Thread Michael Galvez
On Sat, Aug 7, 2010 at 1:38 AM, Mark Williamson node...@gmail.com wrote: On Thu, Aug 5, 2010 at 2:22 PM, Mark Williamson node...@gmail.com wrote: 2) Implement spelling and punctuation check automatically within GTTK before posting of the articles. There is spell check in

Re: [Foundation-l] Push translation

2010-08-13 Thread Michael Galvez
On Sat, Aug 7, 2010 at 4:52 AM, Federico Leva (Nemo) nemow...@gmail.com wrote: Michael Galvez, 05/08/2010 15:12: Sorry for coming into this discussion a bit late. I'm one of the members of Google's translation team, and I wanted to make myself available for feedback/questions. Thank

Re: [Foundation-l] Push translation

2010-08-13 Thread Michael Galvez
On Thu, Aug 5, 2010 at 9:45 PM, stevertigo stv...@gmail.com wrote: Michael Galvez michae...@gmail.com wrote: Sorry for coming into this discussion a bit late. I'm one of the members of Google's translation team, and I wanted to make myself available for feedback/questions. Thanks for

Re: [Foundation-l] Push translation

2010-08-13 Thread Michael Galvez
On Sat, Aug 7, 2010 at 11:30 PM, Lars Aronsson l...@aronsson.se wrote: On 08/06/2010 07:47 PM, Michael Galvez wrote: 3. We acquire dictionaries on limited licenses from other parties. In general, while we can surface this content on our own sites (e.g., Google Translate, Google

Re: [Foundation-l] Push translation

2010-08-07 Thread Federico Leva (Nemo)
Andreas Kolbe, 07/08/2010 02:23: If Google want to build up their translation memory, I suggest they pay publishers for permission to analyse existing, published translations, and read those into their memory. This will give them a database of translations that the market judged good enough

Re: [Foundation-l] Push translation

2010-08-07 Thread Federico Leva (Nemo)
Michael Galvez, 05/08/2010 15:12: Sorry for coming into this discussion a bit late. I'm one of the members of Google's translation team, and I wanted to make myself available for feedback/questions. Thank you, you've explained some important things. There is spell check in Translator

Re: [Foundation-l] Push translation

2010-08-07 Thread stevertigo
Michael Galvez michae...@gmail.com wrote: Sorry for coming into this discussion a bit late.  I'm one of the members of Google's translation team, and I wanted to make myself available for feedback/questions. Thanks for stopping by. A few questions: 1) Does GTTK have a specific API for

Re: [Foundation-l] Push translation

2010-08-07 Thread Lars Aronsson
On 08/06/2010 07:47 PM, Michael Galvez wrote: 3. We acquire dictionaries on limited licenses from other parties. In general, while we can surface this content on our own sites (e.g., Google Translate, Google Dictionary, Google Translator Toolkit), we don't have permission to donate that data

Re: [Foundation-l] Push translation

2010-08-06 Thread Michael Galvez
Hi Lars, Thanks for the detailed feedback. Some comments inline. Mike On Thu, Aug 5, 2010 at 1:39 PM, Lars Aronsson l...@aronsson.se wrote: On 08/05/2010 03:12 PM, Michael Galvez wrote: Sorry for coming into this discussion a bit late. I'm one of the members of Google's translation

Re: [Foundation-l] Push translation

2010-08-06 Thread Michael Galvez
Hi Mark, Responses inline. Mike On Thu, Aug 5, 2010 at 2:22 PM, Mark Williamson node...@gmail.com wrote: 2) Implement spelling and punctuation check automatically within GTTK before posting of the articles. There is spell check in Translator Toolkit, although it's not available for

Re: [Foundation-l] Push translation

2010-08-06 Thread Michael Snow
Michael Galvez wrote: 2. Once the articles exist in multiple languages, the articles take on a life of their own and become out of sync. If Wikipedians want to keep those articles in sync, we would like to help them by enabling section-level translation. I'm guessing that few communities

Re: [Foundation-l] Push translation

2010-08-06 Thread Amir E. Aharoni
Dear Michael, I also thank you for joining the discussion. See my question below. 2010/8/6 Michael Galvez michae...@gmail.com: Also, as far as Indic languages go, I would ask if there's any chance you have any Oriya speakers - with 637 articles, the Oriya Wikipedia is by far the most anemic of

Re: [Foundation-l] Push translation

2010-08-06 Thread David Gerard
On 6 August 2010 18:47, Michael Galvez michae...@gmail.com wrote: 3. We acquire dictionaries on limited licenses from other parties.  In general, while we can surface this content on our own sites (e.g., Google Translate, Google Dictionary, Google Translator Toolkit), we don't have permission

Re: [Foundation-l] Push translation

2010-08-06 Thread Andreas Kolbe
--- On Sat, 31/7/10, Nikola Smolenski smole...@eunet.rs wrote: Interestingly, I have had a completely opposite experience. When reading a Google translation, it is easy for me to decipher what it means even if it is not grammatically correct. When translating, I often hang on deciding

Re: [Foundation-l] Push translation

2010-08-06 Thread Mark Williamson
On Thu, Aug 5, 2010 at 2:22 PM, Mark Williamson node...@gmail.com wrote: 2) Implement spelling and punctuation check automatically within GTTK before posting of the articles. There is spell check in Translator Toolkit, although it's not available for all languages.  We don't have any

Re: [Foundation-l] Push translation

2010-08-05 Thread Michael Galvez
Sorry for coming into this discussion a bit late. I'm one of the members of Google's translation team, and I wanted to make myself available for feedback/questions. Quoting some suggestions from Mark earlier in the thread: 1) Fix some of the formatting errors with GTTK. Would this really be so

Re: [Foundation-l] Push translation

2010-08-05 Thread Lars Aronsson
On 08/05/2010 03:12 PM, Michael Galvez wrote: Sorry for coming into this discussion a bit late. I'm one of the members of Google's translation team, and I wanted to make myself available for feedback/questions. This is an unusual and most welcome step for Google. When I first learned about

Re: [Foundation-l] Push translation

2010-08-05 Thread Mark Williamson
2) Implement spelling and punctuation check automatically within GTTK before posting of the articles. There is spell check in Translator Toolkit, although it's not available for all languages.  We don't have any punctuation checks today and I doubt that we can release this anytime soon.  (If

Re: [Foundation-l] Push translation

2010-08-04 Thread Federico Leva (Nemo)
Aphaia, 27/07/2010 21:33: I've noticed many of English Wikipedia articles cite only English written articles even if the topics are of non-English world. And normally, specially in the developing world, the most comprehend sources are found in their own languages - how can those articles be

Re: [Foundation-l] Push translation

2010-08-02 Thread stevertigo
Nikola Smolenski smole...@eunet.rs wrote: Interestingly, I have had a completely opposite experience. When reading a Google translation, it is easy for me to decipher what it means even if it is not grammatically correct. When translating, I often hang on deciding what sentence structure

Re: [Foundation-l] Push translation

2010-07-31 Thread Nikola Smolenski
On Friday 30 July 2010 02:31:44 Andreas Kolbe wrote: Having tried it tonight, I don't find the Google translator toolkit all that useful, at least not at this present level of development. To sum up: First you read their translation. Then you scratch your head: What the deuce is that

Re: [Foundation-l] Push translation

2010-07-30 Thread stevertigo
Muhammad Yahia shipmas...@gmail.com wrote: Where is the community? where is the involvement and exchange of ideas and continuous evolvement of articles? where's the wiki in wikipedia? - I see it as POV to assume that wiki x has the 'perfect' article on a certain subject such that

Re: [Foundation-l] Push translation

2010-07-29 Thread Amir E. Aharoni
2010/7/29 Mark Williamson node...@gmail.com I don't think that's completely unwise, though. I'm sure they get tons of crackpot e-mails all the time. I was reading an official blog about Google Translate, and in the post about their Wikipedia contests, someone wrote an angry comment that

Re: [Foundation-l] Push translation

2010-07-29 Thread Muhammad Yahia
My 2c: - I don't know where everyone came up with the notion that the tool produces good results. Most of the articles on both Google's projects on the Arabic Wikipedia are barely intelligible, with broken sentences, weird terminology and generally can be spotted right away (see my

Re: [Foundation-l] Push translation

2010-07-29 Thread Andreas Kolbe
: Cool Hand Luke user.coolhandl...@gmail.com Subject: Re: [Foundation-l] Push translation To: Wikimedia Foundation Mailing List foundation-l@lists.wikimedia.org Date: Wednesday, 28 July, 2010, 0:27 Mass machine translations (pushing them onto other projects that may or may not want them

Re: [Foundation-l] Push translation

2010-07-28 Thread Mark Williamson
Yes, of course if it's not actually reviewed and corrected by a human it's going to be bad. What I said was that if it's used as it was meant to be used, the results should be indistinguishable from a normal human translation, regardless of the language involved because all mistakes would be fixed

Re: [Foundation-l] Push translation

2010-07-28 Thread Amir E. Aharoni
Is anyone from Google reading this thread? Because of this thread i tried to play with the Google Translator Toolkit a little and found some technical problems. When i tried to send bug reports about them through the Contact us form, i received after a few minutes a bounce message from the

Re: [Foundation-l] Push translation

2010-07-28 Thread Mark Williamson
Google is, in my experience, very difficult for regular people to get in touch with. Sometimes, when a product is in beta, they give you a way to contact them. They used to have an e-mail to contact them at if you had information about bilingual corpora (I found one online from the Nunavut

Re: [Foundation-l] Push translation

2010-07-27 Thread Mark Williamson
Shiju Alex, Stevertigo is just one en.wikipedian. As far as using exact copies goes, I don't know about the policy at your home wiki, but in many Wikipedias this sort of back-and-forth translation and trading and sharing of articles has been going on since day one, not just with English but with

Re: [Foundation-l] Push translation

2010-07-27 Thread Ray Saintonge
Mark Williamson wrote: Google Translator Toolkit is particularly problematic because it messes up the existing article formatting (one example, it messes up internal links by putting punctuation marks before double brackets when they should be after) and it includes incompatible formatting

Re: [Foundation-l] Push translation

2010-07-27 Thread Ray Saintonge
stevertigo wrote: Mark Williamson node...@gmail.com wrote: I would like to add to this that I think the worst part of this idea is the assumption that other languages should take articles from en.wp. The idea is that most of en.wp's articles are well-enough written, and written in

Re: [Foundation-l] Push translation

2010-07-27 Thread Shiju Alex
Google Translator Toolkit is particularly problematic because it messes up the existing article formatting (one example, it messes up internal links by putting punctuation marks before double brackets when they should be after) and it includes incompatible formatting such as redlinked

Re: [Foundation-l] Push translation

2010-07-27 Thread Mark Williamson
On Tue, Jul 27, 2010 at 1:36 AM, Shiju Alex shijualexonl...@gmail.com wrote: 1. Ban the project of Google as done by the Bengali wiki community (Bad solution, and I am personally against this solution) 2. Ask Google to engage wiki community (As happened in the case of Tamil) to find

Re: [Foundation-l] Push translation

2010-07-27 Thread Shiju Alex
On Tue, Jul 27, 2010 at 11:42 AM, Mark Williamson node...@gmail.com wrote: 4) Include a list of most needed articles for people to create, rather than random articles that will be of little use to local readers. Some articles, such as those on local topics, have the added benefit of

Re: [Foundation-l] Push translation

2010-07-27 Thread Aphaia
I've noticed many of English Wikipedia articles cite only English written articles even if the topics are of non-English world. And normally, specially in the developing world, the most comprehend sources are found in their own languages - how can those articles be assured in NPOV when they ignore

Re: [Foundation-l] Push translation

2010-07-27 Thread Mark Williamson
Aphaia, Shiju Alex and I are referring to Google Translator Toolkit, not Google Translate. If the person using the Toolkit uses it as it was _meant_ to be used, the results should be as good as a human translation because they've been reviewed and corrected by a human. -m. On Tue, Jul 27, 2010

Re: [Foundation-l] Push translation

2010-07-27 Thread Casey Brown
On Tue, Jul 27, 2010 at 3:44 PM, Mark Williamson node...@gmail.com wrote: Aphaia, Shiju Alex and I are referring to Google Translator Toolkit, not Google Translate. If the person using the Toolkit uses it as it was _meant_ to be used, the results should be as good as a human translation

Re: [Foundation-l] Push translation

2010-07-27 Thread Michael Snow
Aphaia wrote: Ah, I omitted T, and I meant Toolkit. A toolkit with garbage could be called toolkit, but it doesn't change it is useless; it cannot deal with syntax properly, i.e. conjugation etc. at this moment. Intended to be reviewed and corrected by a human doesn't assure it was really

Re: [Foundation-l] Push translation

2010-07-27 Thread Cool Hand Luke
Mass machine translations (pushing them onto other projects that may or may not want them) is a very bad idea. Beginning in 2004-05, a non-native speaker on en.wp decided that he should import slightly-cleaned babelfish translations of foreign language articles that did not have articles on the

Re: [Foundation-l] Push translation

2010-07-27 Thread Aphaia
On Wed, Jul 28, 2010 at 7:26 AM, Michael Snow wikipe...@verizon.net wrote: Aphaia wrote: Ah, I omitted T, and I meant Toolkit. A toolkit with garbage could be called toolkit, but it doesn't change it is useless; it cannot deal with syntax properly, i.e. conjugation etc. at this moment.  

Re: [Foundation-l] Push translation

2010-07-27 Thread Ting Chen
Hello all, I am a heavy translator on Wikimedia projects. I would say more than 95% of my contributions on content is translation. But I am against a blind translation. For example mostly I would translate British or North American related content from en-wp to zh-wp, and not from other

Re: [Foundation-l] Push translation

2010-07-26 Thread Pavlo Shevelo
I don't know whether other wikipedias have similar policies, but on the Italian Wikipedia an article which is just a machine translation can be speedy deleted according to our policies. The reason is that machine translations are not good enough and the autotranslated text is too difficult to

Re: [Foundation-l] Push translation

2010-07-26 Thread stevertigo
Mark Williamson node...@gmail.com wrote: I would like to add to this that I think the worst part of this idea is the assumption that other languages should take articles from en.wp. The idea is that most of en.wp's articles are well-enough written, and written in accord with NPOV to a

Re: [Foundation-l] Push translation

2010-07-26 Thread Oliver Keyes
The idea is that most of en.wp's articles are well-enough written, and written in accord with NPOV to a sufficient degree to overcome any such criticism of 'imperial encyclopedism.' - really? It's a) not particularly well-written, mostly and b) referenced overwhelmingly to English-language

Re: [Foundation-l] Push translation

2010-07-26 Thread Shiju Alex
really? It's a) not particularly well-written, mostly and b) referenced overwhelmingly to English-language sources, most of which are, you guessed it.. Western in nature. Very much true. Now English Wikipedians want some one to translate and use the exact copy of en:wp in all other

Re: [Foundation-l] Push translation

2010-07-25 Thread Casey Brown
On Sun, Jul 25, 2010 at 1:39 AM, Mark Williamson node...@gmail.com wrote: Wikipedias are not for _cultures_, they are for languages. If I and I'm surprised to hear that coming from someone who I thought to be a student of languages. I think you might want to read an article from today's Wall

Re: [Foundation-l] Push translation

2010-07-25 Thread Ray Saintonge
stevertigo wrote: Translation between wikis currently exists as a largely pulling paradigm: Someone on the target wiki finds an article in another language (English for example) and then pulls it to their language wiki. These days Google and other translate tools are good enough to use as

Re: [Foundation-l] Push translation

2010-07-25 Thread Mark Williamson
On Sat, Jul 24, 2010 at 11:03 PM, Casey Brown li...@caseybrown.org wrote: On Sun, Jul 25, 2010 at 1:39 AM, Mark Williamson node...@gmail.com wrote: Wikipedias are not for _cultures_, they are for languages. If I and I'm surprised to hear that coming from someone who I thought to be a student

Re: [Foundation-l] Push translation

2010-07-25 Thread Mark Williamson
I would like to add to this that I think the worst part of this idea is the assumption that other languages should take articles from en.wp. I would be in favor of an international, language-free Wikipedia if/when perfect (or 99.99% accurate) MT software exists, but that is not currently the

[Foundation-l] Push translation

2010-07-24 Thread stevertigo
Translation between wikis currently exists as a largely pulling paradigm: Someone on the target wiki finds an article in another language (English for example) and then pulls it to their language wiki. These days Google and other translate tools are good enough to use as the starting basis for an

Re: [Foundation-l] Push translation

2010-07-24 Thread Pavlo Shevelo
These days Google and other translate tools are good enough to use as the starting basis for an translated article No, it's far not true - at least for such target language as Ukrainian etc. So any attempt of push translation will be almost the disaster... On Sat, Jul 24, 2010 at 3:57 AM,

Re: [Foundation-l] Push translation

2010-07-24 Thread Bence Damokos
As far as push translation goes, there are languages where it could almost work and where it couldn't. (Consider the experience of the Google team with the Bengali Wikipedia - http://googletranslate.blogspot.com/2010/07/translating-wikipedia.html ) Bence

Re: [Foundation-l] Push translation

2010-07-24 Thread Oliver Keyes
If there are issues, they can be overcome. The fact of the matter is that the vast majority of articles in English can be pushed over to other languages, and fill a need for those topics in those languages. - if there are vast swathes in other languages that aren't filled, it's normally

Re: [Foundation-l] Push translation

2010-07-24 Thread Casey Brown
On Sat, Jul 24, 2010 at 4:11 PM, Pavlo Shevelo pavlo.shev...@gmail.com wrote: These days Google and other translate tools are good enough to use as the starting basis for an translated article No, it's far not true - at least for such target language as Ukrainian etc. So any attempt of push

Re: [Foundation-l] Push translation

2010-07-24 Thread Cristian Consonni
2010/7/24 Casey Brown li...@caseybrown.org: On Sat, Jul 24, 2010 at 4:11 PM, Pavlo Shevelo pavlo.shev...@gmail.com wrote: These days Google and other translate tools are good enough to use as the starting basis for an translated article No, it's far not true - at least for such target

Re: [Foundation-l] Push translation

2010-07-24 Thread Oliver Keyes
Agreed. There's one wiki which artificially inflated the number of articles it had via a bot (I forget the specific language). That's not a way to increase the wiki's strength. There's an old phrase used on en-wiki; africa is not a redlink. It means that because we have articles on a lot of common

Re: [Foundation-l] Push translation

2010-07-24 Thread Mark Williamson
Wikipedias are not for _cultures_, they are for languages. If I and 1,000 other Americans suddenly learnt French (to the point of native-level fluency) and decided to read and edit the French Wikipedia, it would belong to us just as much as to anybody else. This came up recently in the debate

Re: [Foundation-l] Push translation

2010-07-24 Thread Mark Williamson
Bence, that's a different topic - MAT (Machine Aided Translation), and in the case of Bengali, I believe simply the use of a translation memory system. Some of the comments on that page seem to be quite misinformed, ranging from people who thought Google was inserting unrevised machine