Brian wrote:
> In the absence of a sentence aligned corpus one must be created.
It would be nice if such a corpus (or rather, the resulting
dictionary of translated words, phrases and sentences) could also
be "open content". Are you in talks with Google about this,
Brian? Would they be inter
In talks with Google? Oh I wish ;)
There are lots of algorithms that do sentence alignment automatically. The
different language articles don't have to be identical for Google to align
them. So we've basically already got what they've got in terms of Wikipedia
data.
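
To make the point about automatic sentence alignment concrete, here is a minimal sketch of one classic idea: align by sentence length with dynamic programming. This is an illustration only, not Google's method or any particular published algorithm. The function name `align_sentences` and the two penalty values are assumptions chosen for the example.

```python
def align_sentences(src, tgt, skip_penalty=50, merge_penalty=5):
    """Align two lists of sentences using character lengths only.

    Hypothetical, simplified length-based aligner; the penalties are
    arbitrary illustrative values, not tuned on real data.
    """
    n, m = len(src), len(tgt)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of src[:i] with tgt[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    # Allowed groupings: (source sentences used, target sentences used)
    beads = [(1, 1), (1, 0), (0, 1), (2, 1), (1, 2)]
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in beads:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                s_len = sum(len(s) for s in src[i:ni])
                t_len = sum(len(t) for t in tgt[j:nj])
                if di == 0 or dj == 0:
                    # A sentence with no counterpart in the other text
                    step = skip_penalty + s_len + t_len
                else:
                    # Similar total length suggests a translation pair
                    step = abs(s_len - t_len)
                    if di + dj > 2:
                        step += merge_penalty
                if cost[i][j] + step < cost[ni][nj]:
                    cost[ni][nj] = cost[i][j] + step
                    back[ni][nj] = (i, j)
    # Walk back from the end to recover the chosen groupings
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((src[pi:i], tgt[pj:j]))
        i, j = pi, pj
    pairs.reverse()
    return pairs
```

Because skipped sentences are allowed (at a cost), the two articles do not have to be identical for this to produce an alignment. A real system would use a better cost model and usually lexical cues as well.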
On Wed, Jun 10, 2009 at 1:05 A
2009/6/10 Brian :
> Not only did you not provide a critique of my more general claim (that the
> user does not enter into a contract with Google regarding Wikipedia's data)
> but you have not provided any sort of well founded critique of this one.
> You've basically said, in both cases, "I don't bel
On Tue, Jun 9, 2009 at 23:42, Brian wrote:
> Google has built in support for using its machine translation technology to
> help bootstrap human translations of Wikipedia articles.
>
> http://translate.google.com/toolkit/docupload
>
> The benefit to Google is clear - they need sentence-aligned text
On Wed, Jun 10, 2009 at 00:54, masti wrote:
> current level of sophistication of translation tools, especially of
> languages that do not belong to the same group as English, German,
> French, etc. is completely useless.
Let me disagree. Hungarian is not in the same group by far, and the
results mak
Amir E. Aharoni wrote:
> On Tue, Jun 9, 2009 at 23:42, Brian wrote:
>> Google has built in support for using its machine translation technology to
>> help bootstrap human translations of Wikipedia articles.
>>
>> http://translate.google.com/toolkit/docupload
>>
>> The benefit to Google is clear - t
tion contest, so I'll go back to the shadow.
:-)
grin
___
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
What I see as a great feature in the toolkit is the translation memory: in
practice (after you switch off the machine translation), common phrases in
Wikipedia articles - like "external links", "notes", "history", "early life"
etc. - are pretranslated once a human has already translated them; if mor
On Wed, Jun 10, 2009 at 14:46, Bence Damokos wrote:
> What I see as a great feature in the toolkit is the translation memory: in
> practice (after you switch off the machine translation), common phrases in
> Wikipedia articles - like "external links", "notes", "history", "early life"
> etc. - are pr
On Wed, Jun 10, 2009 at 1:56 PM, Amir E. Aharoni wrote:
> On Wed, Jun 10, 2009 at 14:46, Bence Damokos wrote:
> > What I see as a great feature in the toolkit is the translation memory:
> in
> > practice (after you switch off the machine translation), common phrases in
> > Wikipedia articles - like
> current level of sophistication of translation tools, especially of
> languages that do not belong to the same group as English, German,
> French, etc. is completely useless.
>
> Machine translations into Slavic languages are to be deleted from wiki
> immediately.
>
> masti
>
Just to confirm, yest
On Wed, Jun 10, 2009 at 06:22, David Goodman wrote:
> On Tue, Jun 9, 2009 at 6:01 PM, Amir E. Aharoni wrote:
>
>> An unedited machine-translated text is likely to be speedily deleted
>> as patent nonsense, before copyvio is even considered.
>
> If it is deleted as nonsense, that will be a gross er
Such an approach has a critical flaw. I don’t know whether this
applies to, say, English-French translations, but it is known to be
present for Cyrillic languages. The statistical approach sometimes
discovers false connections that result in factual errors. Examples of
“translating”, say, “50 USD” as
Kalan wrote:
> present for Cyrillic languages. The statistical approach sometimes
> discovers false connections that result in factual errors. Examples of
> “translating”, say, “50 USD” as “50 000 UAH” within a particular
> context are known; more of such things can arise unexpectedly. So, at
The funn
I cannot help sharing this with you.
I was looking for the name "devouard" in a little tool I just discovered
today (TouchGraph).
And I was surprised to discover that the word "devouard" was highly
linked to the Hoggar plateau (Ahaggar) in Algeria. I consequently
clicked on the central point ap
On Wednesday 10 June 2009 16:36:38 Florence Devouard wrote:
> But frankly, I am super pleased to find out that one of the pictures I
> uploaded 4 years ago is now featured in Britannica :-)
And they made an honest effort to be GFDL-compliant. I wonder how many more
such images are there.
On Wed, Jun 10, 2009 at 11:10 AM, Nikola Smolenski wrote:
> On Wednesday 10 June 2009 16:36:38 Florence Devouard wrote:
>> But frankly, I am super pleased to find out that one of the pictures I
>> uploaded 4 years ago is now featured in Britannica :-)
>
> And they made an honest effort to be GFDL-co
Sometimes cities are "translated" - "Koper" was translated to English
from Slovene as "Chicago" and "Kranj" as "Miami"... of course Kranj is
100km inland and Miami is largely beachfront and the opposite with
Chicago and Koper.
"Ljubljana" was translated to English in earlier phases of the
software
2009/6/10 Florence Devouard :
> I cannot help sharing this with you.
>
> I was looking for the name "devouard" in a little tool I just discovered
> today (TouchGraph).
>
> And I was surprised to discover that the word "devouard" was highly
> linked to the Hoggar plateau (Ahaggar) in Algeria. I conse
Of course these are now things that you are able to fix and which can be
shared with everyone.
On Wed, Jun 10, 2009 at 9:32 AM, Mark Williamson wrote:
> Sometimes cities are "translated" - "Koper" was translated to English
> from Slovene as "Chicago" and "Kranj" as "Miami"... of course Kranj is
On Wed, Jun 10, 2009 at 19:29, Brian wrote:
> Of course these are now things that you are able to fix and which can be
> shared with everyone.
Unfortunately it's Google, not Wikipedia. There's mysterious Google
code behind it all; not MediaWiki, whose code everyone is free to
study and fix.
Not e
On Wednesday 10 June 2009 17:32:00 Mark Williamson wrote:
> "Ljubljana" was translated to English in earlier phases of the
> software as "rape"... In Italian to English, "L'Italia" became
Well that is a correct translation :)
Bennó wrote:
> Let me agree with it completely (out of the shadow ;). This feature's aim is
> obviously to help understand totally "alien" texts to a certain [at least
> minimal?] extent. This whole thing has absolutely nothing to do with
> 'translation/interpretation' in its proper sense. It's a
I would just like to point out that every single critic has ignored the
premise that I started this thread with:
"This is a great example of machines helping people help machines help
people."
On Wed, Jun 10, 2009 at 10:53 AM, Ray Saintonge wrote:
> Bennó wrote:
> > Let me agree with it complet
Thanks Nikola, I just laughed enough to last me for the rest of the week.
Mark
On Wed, Jun 10, 2009 at 9:49 AM, Nikola Smolenski wrote:
> On Wednesday 10 June 2009 17:32:00 Mark Williamson wrote:
>> "Ljubljana" was translated to English in earlier phases of the
>> software as "rape"... In It
Brian wrote:
> Of course these are now things that you are able to fix and which can be
> shared with everyone.
>
Sure, the funny errors are the most obvious and most easily fixed. The
problematic ones are more subtle, remain unnoticed, and more readily
spread misunderstanding.
Ec
> On Wed,
Brian wrote:
> I would just like to point out that every single critic has ignored the
> premise that I started this thread with:
>
> "This is a great example of machines helping people help machines help
> people."
>
>
I don't disagree with that point, but I often note in real life that
many
> Sometimes cities are "translated" - "Koper" was translated to English
> from Slovene as "Chicago" and "Kranj" as "Miami"... of course Kranj is
> 100km inland and Miami is largely beachfront and the opposite with
> Chicago and Koper.
>
> "Ljubljana" was translated to English in earlier phases of t
On Wed, Jun 10, 2009 at 20:01, Brian wrote:
> I would just like to point out that every single critic has ignored the
> premise that I started this thread with:
>
> "This is a great example of machines helping people help machines help
> people."
That, again, would be Wikipedia, not Google. No-one
2009/6/9 Erik Moeller :
> All,
>
> after some internal discussion with the licensing update committee,
> I'm proposing the following final site terms to be implemented on all
> Wikimedia projects that currently use GFDL as their primary content
> license, as well as the relevant multimedia template
Ladies and Gentlemen, Boys and Girls, (wikimedia-au,
chapters-cultural-partnerships, foundation-l)
The event that you have (hopefully) heard about, "Galleries, Libraries,
Archives, Museums and Wikimedia: finding the common ground" is coming along
apace!
This is a Wikimedia Australia event, a worl
Machine translations are neither new works nor derivatives, as they are
produced by machines and not by humans.
Also, Google will have a hard time claiming that because some
unidentified person added text or a URL to an open service, they now have
the right to do whatever they want with the text.
I guess w
Compare such text to a photo of a painting changed by some automatic
algorithm. The copyright of the painting is unchanged and the algorithm
gets no part of any new copyright, yet the person applying the tool
_can_ have a part in the copyright for the new derived work.
If you translate a work thro
On Wed, Jun 10, 2009 at 11:57 PM, John at Darkstar wrote:
> Machine translations are neither new works nor derivatives, as they are
> produced by machines and not by humans.
This is probably the correct argument to make.
There are two trends in machine translation: rule-based translation
and statistical translation. Both have pros and cons. Rule-based
translation seems to be possible to integrate with Wiktionary in such a
way that it can support Wikipedia. Statistical translation seems to be
possible to integr