On 17 August 2010 03:22, Samuel Klein meta...@gmail.com wrote:
On Fri, Aug 13, 2010 at 4:28 PM, Michael Galvez michae...@gmail.com wrote:
On Fri, Aug 6, 2010 at 5:56 PM, David Gerard dger...@gmail.com wrote:
And the data that GTTK gathers from its use in Wikipedia translations?
What would
Michael Galvez michae...@gmail.com wrote:
Translator Toolkit processes generic MediaWiki text, although this is not
an officially supported feature and is largely untested. If you upload a
UTF-8 file with extension .mediawiki, Translator Toolkit will try to
render the file in the same way
On Fri, Aug 13, 2010 at 4:28 PM, Michael Galvez michae...@gmail.com wrote:
On Fri, Aug 6, 2010 at 5:56 PM, David Gerard dger...@gmail.com wrote:
And the data that GTTK gathers from its use in Wikipedia translations?
What would need to happen for that to start coming back, in a usable
form?
On Fri, Aug 6, 2010 at 2:13 PM, Michael Snow wikipe...@verizon.net wrote:
Michael Galvez wrote:
2. Once the articles exist in multiple languages, the articles take on a
life of their own and become out of sync. If Wikipedians want to keep those
articles in sync, we would like to help
Hi Amir,
Apologies for the late reply. Replies inline below.
Mike
On Fri, Aug 6, 2010 at 3:14 PM, Amir E. Aharoni
amir.ahar...@mail.huji.ac.il wrote:
Dear Michael, I also thank you for joining the discussion. See my
question below.
2010/8/6 Michael Galvez michae...@gmail.com:
Also, as
On Fri, Aug 6, 2010 at 5:56 PM, David Gerard dger...@gmail.com wrote:
On 6 August 2010 18:47, Michael Galvez michae...@gmail.com wrote:
3. We acquire dictionaries on limited licenses from other parties. In
general, while we can surface this content on our own sites (e.g., Google
On Sat, Aug 7, 2010 at 1:38 AM, Mark Williamson node...@gmail.com wrote:
On Thu, Aug 5, 2010 at 2:22 PM, Mark Williamson node...@gmail.com wrote:
2) Implement spelling and punctuation check automatically within GTTK before
posting of the articles.
There is spell check in
On Sat, Aug 7, 2010 at 4:52 AM, Federico Leva (Nemo) nemow...@gmail.com wrote:
Michael Galvez, 05/08/2010 15:12:
Sorry for coming into this discussion a bit late. I'm one of the members of
Google's translation team, and I wanted to make myself available for
feedback/questions.
Thank
On Thu, Aug 5, 2010 at 9:45 PM, stevertigo stv...@gmail.com wrote:
Michael Galvez michae...@gmail.com wrote:
Sorry for coming into this discussion a bit late. I'm one of the members of
Google's translation team, and I wanted to make myself available for
feedback/questions.
Thanks for
On Sat, Aug 7, 2010 at 11:30 PM, Lars Aronsson l...@aronsson.se wrote:
On 08/06/2010 07:47 PM, Michael Galvez wrote:
3. We acquire dictionaries on limited licenses from other parties. In
general, while we can surface this content on our own sites (e.g., Google
Translate, Google
Andreas Kolbe, 07/08/2010 02:23:
If Google want to build up their translation memory, I suggest they pay
publishers for permission to analyse existing, published translations, and
read those into their memory. This will give them a database of translations
that the market judged good enough
Michael Galvez, 05/08/2010 15:12:
Sorry for coming into this discussion a bit late. I'm one of the members of
Google's translation team, and I wanted to make myself available for
feedback/questions.
Thank you, you've explained some important things.
There is spell check in Translator
Michael Galvez michae...@gmail.com wrote:
Sorry for coming into this discussion a bit late. I'm one of the members of
Google's translation team, and I wanted to make myself available for
feedback/questions.
Thanks for stopping by. A few questions: 1) Does GTTK have a specific
API for
On 08/06/2010 07:47 PM, Michael Galvez wrote:
3. We acquire dictionaries on limited licenses from other parties. In
general, while we can surface this content on our own sites (e.g., Google
Translate, Google Dictionary, Google Translator Toolkit), we don't have
permission to donate that data
Hi Lars,
Thanks for the detailed feedback. Some comments inline.
Mike
On Thu, Aug 5, 2010 at 1:39 PM, Lars Aronsson l...@aronsson.se wrote:
On 08/05/2010 03:12 PM, Michael Galvez wrote:
Sorry for coming into this discussion a bit late. I'm one of the members of
Google's translation
Hi Mark,
Responses inline.
Mike
On Thu, Aug 5, 2010 at 2:22 PM, Mark Williamson node...@gmail.com wrote:
2) Implement spelling and punctuation check automatically within GTTK before
posting of the articles.
There is spell check in Translator Toolkit, although it's not available for
Michael Galvez wrote:
2. Once the articles exist in multiple languages, the articles take on a
life of their own and become out of sync. If Wikipedians want to keep those
articles in sync, we would like to help them by enabling section-level
translation.
I'm guessing that few communities
Dear Michael, I also thank you for joining the discussion. See my
question below.
2010/8/6 Michael Galvez michae...@gmail.com:
Also, as far as Indic languages go, I would ask if there's any chance
you have any Oriya speakers - with 637 articles, the Oriya Wikipedia
is by far the most anemic of
On 6 August 2010 18:47, Michael Galvez michae...@gmail.com wrote:
3. We acquire dictionaries on limited licenses from other parties. In
general, while we can surface this content on our own sites (e.g., Google
Translate, Google Dictionary, Google Translator Toolkit), we don't have
permission
--- On Sat, 31/7/10, Nikola Smolenski smole...@eunet.rs wrote:
Interestingly, I have had a completely opposite experiences. When reading a
Google translation, it is easy for me to decipher what does it mean even if
it is not grammatically correct. When translating, I often hang on deciding
On Thu, Aug 5, 2010 at 2:22 PM, Mark Williamson node...@gmail.com wrote:
2) Implement spelling and punctuation check automatically within GTTK before
posting of the articles.
There is spell check in Translator Toolkit, although it's not available for
all languages. We don't have any
Sorry for coming into this discussion a bit late. I'm one of the members of
Google's translation team, and I wanted to make myself available for
feedback/questions.
Quoting some suggestions from Mark earlier in the thread:
1) Fix some of the formatting errors with GTTK. Would this really be so
On 08/05/2010 03:12 PM, Michael Galvez wrote:
Sorry for coming into this discussion a bit late. I'm one of the members of
Google's translation team, and I wanted to make myself available for
feedback/questions.
This is an unusual and most welcome step for Google. When I first
learned about
2) Implement spelling and punctuation check automatically within GTTK before
posting of the articles.
There is spell check in Translator Toolkit, although it's not available for
all languages. We don't have any punctuation checks today and I doubt that
we can release this anytime soon. (If
Aphaia, 27/07/2010 21:33:
I've noticed many of English Wikipedia articles cite only English
written articles even if the topics are of non-English world. And
normally, specially in the developing world, the most comprehend
sources are found in their own languages - how can those articles be
Nikola Smolenski smole...@eunet.rs wrote:
Interestingly, I have had a completely opposite experiences. When reading a
Google translation, it is easy for me to decipher what does it mean even if
it is not grammatically correct. When translating, I often hang on deciding
what sentence structure
On Friday 30 July 2010 02:31:44 Andreas Kolbe wrote:
Having tried it tonight, I don't find the Google translator toolkit all
that useful, at least not at this present level of development. To sum up:
First you read their translation.
Then you scratch your head: What the deuce is that
Muhammad Yahia shipmas...@gmail.com wrote:
Where is the community? where is the involvement and exchange of ideas and
continuous evolvement of articles? where's the wiki in wikipedia?
- I see it as POV to assume that wiki x has the 'perfect' article on a
certain subject such that
2010/7/29 Mark Williamson node...@gmail.com
I don't think that's completely unwise, though. I'm sure they get tons
of crackpot e-mails all the time. I was reading an official blog about
Google Translate, and in the post about their Wikipedia contests,
someone wrote an angry comment that
My 2c :
- I don't know where everyone came up with the notion that the tool
produces good results. Most of the articles on both Google's projects on the
Arabic wikipedia are barely intelligible, with broken sentences, weird
terminology and generally can be spotted right away (see my
From: Cool Hand Luke user.coolhandl...@gmail.com
Subject: Re: [Foundation-l] Push translation
To: Wikimedia Foundation Mailing List foundation-l@lists.wikimedia.org
Date: Wednesday, 28 July, 2010, 0:27
Mass machine translations (pushing them onto other projects that may or
may not want them
Yes, of course if it's not actually reviewed and corrected by a human
it's going to be bad. What I said was that if it's used as it was
meant to be used, the results should be indistinguishable from a
normal human translation, regardless of the language involved because
all mistakes would be fixed
Is anyone from Google reading this thread?
Because of this thread i tried to play with the Google Translator Toolkit a
little and found some technical problems. When i tried to send bug reports
about them through the Contact us form, i received after a few minutes a
bounce message from the
Google is, in my experience, very difficult for regular people to
get in touch with. Sometimes, when a product is in beta, they give you
a way to contact them. They used to have an e-mail to contact them at
if you had information about bilingual corpora (I found one online
from the Nunavut
Shiju Alex,
Stevertigo is just one en.wikipedian.
As far as using exact copies goes, I don't know about the policy at
your home wiki, but in many Wikipedias this sort of back-and-forth
translation and trading and sharing of articles has been going on
since day one, not just with English but with
Mark Williamson wrote:
Google Translator Toolkit is particularly problematic because it
messes up the existing article formatting (one example, it messes up
internal links by putting punctuation marks before double brackets
when they should be after) and it includes incompatible formatting
stevertigo wrote:
Mark Williamson node...@gmail.com wrote:
I would like to add to this that I think the worst part of this idea
is the assumption that other languages should take articles from
en.wp.
The idea is that most of en.wp's articles are well-enough written, and
written in
Google Translator Toolkit is particularly problematic because it
messes up the existing article formatting (one example, it messes up
internal links by putting punctuation marks before double brackets
when they should be after) and it includes incompatible formatting
such as redlinked
On Tue, Jul 27, 2010 at 1:36 AM, Shiju Alex shijualexonl...@gmail.com wrote:
1. Ban the project of Google as done by the Bengali wiki community (Bad
solution, and I am personally against this solution)
2. Ask Google to engage wiki community (As happened in the case of Tamil)
to find
On Tue, Jul 27, 2010 at 11:42 AM, Mark Williamson node...@gmail.com wrote:
4) Include a list of most needed articles for people to create, rather
than random articles that will be of little use to local readers. Some
articles, such as those on local topics, have the added benefit of
I've noticed many of English Wikipedia articles cite only English
written articles even if the topics are of non-English world. And
normally, specially in the developing world, the most comprehend
sources are found in their own languages - how can those articles be
assured in NPOV when they ignore
Aphaia, Shiju Alex and I are referring to Google Translator Toolkit,
not Google Translate. If the person using the Toolkit uses it as it
was _meant_ to be used, the results should be as good as a human
translation because they've been reviewed and corrected by a human.
-m.
On Tue, Jul 27, 2010
On Tue, Jul 27, 2010 at 3:44 PM, Mark Williamson node...@gmail.com wrote:
Aphaia, Shiju Alex and I are referring to Google Translator Toolkit,
not Google Translate. If the person using the Toolkit uses it as it
was _meant_ to be used, the results should be as good as a human
translation
Aphaia wrote:
Ah, I omitted T, and I meant Toolkit. A toolkit with garbage could be
called toolkit, but it doesn't change it is useless; it cannot deal
with syntax properly, i.e. conjugation etc. at this moment. Intended
to be reviewed and corrected by a human doesn't assure it was really
Mass machine translations (pushing them onto other projects that may or
may not want them) is a very bad idea.
Beginning in 2004-05, a non-native speaker on en.wp decided that he should
import slightly-cleaned babelfish translations of foreign language articles
that did not have articles on the
On Wed, Jul 28, 2010 at 7:26 AM, Michael Snow wikipe...@verizon.net wrote:
Aphaia wrote:
Ah, I omitted T, and I meant Toolkit. A toolkit with garbage could be
called toolkit, but it doesn't change it is useless; it cannot deal
with syntax properly, i.e. conjugation etc. at this moment.
Hello all,
I am a heavy translator on Wikimedia projects. I would say more than 95%
of my contributions on content is translation. But I am against a blind
translation. For example mostly I would translate British or North
American related content from en-wp to zh-wp, and not from other
I don't know whether other wikipedias have similar policies, but on
the Italian Wikipedia an article which is just a machine translation
can be speedy deleted according to our policies. The reason is that
machine translations are not good enough and the autotranslated text
is too difficult to
Mark Williamson node...@gmail.com wrote:
I would like to add to this that I think the worst part of this idea
is the assumption that other languages should take articles from
en.wp.
The idea is that most of en.wp's articles are well-enough written, and
written in accord with NPOV to a
The idea is that most of en.wp's articles are well-enough written, and
written in accord with NPOV to a sufficient degree to overcome any
such criticism of 'imperial encyclopedism.' - really? It's a) not
particularly well-written, mostly and b) referenced overwhelmingly to
English-language
really? It's a) not
particularly well-written, mostly and b) referenced overwhelmingly to
English-language sources, most of which are, you guessed it.. Western in
nature.
Very much true. Now English Wikipedians want some one to translate and use
the exact copy of en:wp in all other
On Sun, Jul 25, 2010 at 1:39 AM, Mark Williamson node...@gmail.com wrote:
Wikipedias are not for _cultures_, they are for languages. If I and
I'm surprised to hear that coming from someone who I thought to be a
student of languages. I think you might want to read an
article from today's Wall
stevertigo wrote:
Translation between wikis currently exists as a largely pulling
paradigm: Someone on the target wiki finds an article in another
language (English for example) and then pulls it to their language
wiki.
These days Google and other translate tools are good enough to use as
On Sat, Jul 24, 2010 at 11:03 PM, Casey Brown li...@caseybrown.org wrote:
On Sun, Jul 25, 2010 at 1:39 AM, Mark Williamson node...@gmail.com wrote:
Wikipedias are not for _cultures_, they are for languages. If I and
I'm surprised to hear that coming from someone who I thought to be a
student
I would like to add to this that I think the worst part of this idea
is the assumption that other languages should take articles from
en.wp.
I would be in favor of an international, language-free Wikipedia
if/when perfect (or 99.99% accurate) MT software exists, but that is
not currently the
Translation between wikis currently exists as a largely pulling
paradigm: Someone on the target wiki finds an article in another
language (English for example) and then pulls it to their language
wiki.
These days Google and other translate tools are good enough to use as
the starting basis for an
These days Google and other translate tools are good enough to use as
the starting basis for a translated article
No, it's far not true - at least for such target language as Ukrainian etc.
So any attempt of push translation will be almost the disaster...
On Sat, Jul 24, 2010 at 3:57 AM,
As far as push translation goes, there are languages where it could almost
work and where it couldn't. (Consider the experience of the Google team with
the Bengali Wikipedia -
http://googletranslate.blogspot.com/2010/07/translating-wikipedia.html )
Bence
If there are issues, they can be overcome. The fact of the matter is
that the vast majority of articles in English can be pushed over to
other languages, and fill a need for those topics in those languages. - if
there are vast swathes in other languages that aren't filled, it's normally
On Sat, Jul 24, 2010 at 4:11 PM, Pavlo Shevelo pavlo.shev...@gmail.com wrote:
These days Google and other translate tools are good enough to use as
the starting basis for a translated article
No, it's far not true - at least for such target language as Ukrainian etc.
So any attempt of push
2010/7/24 Casey Brown li...@caseybrown.org:
On Sat, Jul 24, 2010 at 4:11 PM, Pavlo Shevelo pavlo.shev...@gmail.com wrote:
These days Google and other translate tools are good enough to use as
the starting basis for a translated article
No, it's far not true - at least for such target
Agreed. There's one wiki which artificially inflated the number of articles
it had via a bot (I forget the specific language). That's not a way to
increase the wiki's strength. There's an old phrase used on en-wiki: Africa
is not a redlink. It means that because we have articles on a lot of common
Wikipedias are not for _cultures_, they are for languages. If I and
1,000 other Americans suddenly learnt French (to the point of
native-level fluency) and decided to read and edit the French
Wikipedia, it would belong to us just as much as to anybody else.
This came up recently in the debate
Bence, that's a different topic - MAT (Machine Aided Translation), and
in the case of Bengali, I believe simply the use of a translation
memory system. Some of the comments on that page seem to be quite
misinformed, ranging from people who thought Google was inserting
unrevised machine