Re: [Wikimedia-l] New Affiliations Committee leadership

2014-04-10 Thread RYU Cheol
Thank you, Bence! I am also glad that you will remain on AffCom.

Congratulations, Carlos and Cynthia. I look forward to keeping in touch with
you about the Korean Wikimedia user group.

Kind regards,
Cheol, Chair of Korean Wikimedia Chapter Preparation Committee

On 11 Apr 2014, at 6:20 AM, Bence Damokos wrote:

> Hi all,
> 
> It is with great pleasure that I can announce today that the Affiliations
> Committee has appointed Carlos Colina, who previously served as vice-chair,
> as the chair of the committee, and Cynthia Ashley-Nelson as its vice-chair,
> for a one-year term ending in April 2015.
> 
> I am happy to leave the committee in the very capable hands of Carlos and
> Cynthia, and will remain an ordinary member of AffCom going forward.
> 
> Best regards,
> Bence Damokos
> Member, Affiliations Committee




Re: [Wikimedia-l] National Museum of Korea releases images of artifacts and old books.

2013-12-18 Thread RYU Cheol
Do you mean that even if we can get the metadata from the museum, we could
not import it with a general-purpose tool?

The museum is going to release the content by exposing it on their website.
I have advised them to publish it with Dublin Core (DC) attributes.

Would I then need to write code that can handle DC attributes, or the
extended attributes for museums?
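
If so, a minimal sketch of what I imagine is below (this assumes the museum
exposes plain Dublin Core XML; the mapping onto a Commons-style {{Artwork}}
template is only my illustration, not an agreed convention):

# Minimal sketch: read one Dublin Core XML record and emit Commons-style
# {{Artwork}} wikitext. The field mapping is illustrative only.
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

def parse_dc_record(xml_text):
    """Collect Dublin Core elements from one record into a dict of lists."""
    record = {}
    for elem in ET.fromstring(xml_text).iter():
        if elem.tag.startswith("{" + DC_NS + "}"):
            field = elem.tag.split("}", 1)[1]   # e.g. "title", "creator"
            record.setdefault(field, []).append((elem.text or "").strip())
    return record

def to_artwork_wikitext(record):
    """Map a few common DC fields onto {{Artwork}} parameters."""
    def first(field):
        return record.get(field, [""])[0]
    return ("{{Artwork\n"
            f" |title       = {first('title')}\n"
            f" |artist      = {first('creator')}\n"
            f" |date        = {first('date')}\n"
            f" |institution = {first('publisher')}\n"
            f" |permission  = {first('rights')}\n"
            "}}")

sample = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Celadon Prunus Vase</dc:title>
  <dc:creator>Unknown, Goryeo dynasty</dc:creator>
  <dc:date>12th century</dc:date>
  <dc:publisher>National Museum of Korea</dc:publisher>
</record>"""
print(to_artwork_wikitext(parse_dc_record(sample)))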

Cheol


2013/12/18 rupert THURNER 

> Just out of interest, is there any template on Wikimedia Commons which
> follows Dublin Core? The only reference I could find was:
> https://strategy.wikimedia.org/wiki/Proposal:Dublin_Core
>
> rupert.

Re: [Wikimedia-l] National Museum of Korea releases images of artifacts and old books.

2013-12-17 Thread RYU Cheol
Thank you for your attention, Asaf.

I am in contact with them about improving the release, but they seem very
busy with it. From my conversations with them, I think they need help from
GLAM-WIKI experts in the Wikimedia movement to keep the opening-up effort
going and make it a long-term success. In my view the release lacks some
important metadata, for example the location of each heritage item, and they
do not understand Dublin Core or its extension for museums. Korean
Wikimedians will start drafting our recommendations for better sharing. I
hope we can borrow some wisdom from those who have experience leading a
successful museum data release.
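
Even a tiny validation script could flag such gaps. In plain Dublin Core the
location would normally go in dc:coverage; the required-field list below is
just my own suggestion, not any standard profile:

# Minimal sketch: flag Dublin Core fields missing from a museum record.
REQUIRED_DC_FIELDS = [
    "title", "creator", "date", "rights",
    "coverage",   # dc:coverage would carry the location of the heritage item
]

def missing_fields(record):
    """Return required DC fields that are absent or empty in a record dict."""
    return [f for f in REQUIRED_DC_FIELDS if not record.get(f)]

record = {
    "title": "Gold Crown from Hwangnamdaechong",
    "creator": "Unknown",
    "date": "5th century",
}
print(missing_fields(record))   # ['rights', 'coverage']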

Cheol


2013/12/17 Asaf Bartov 

> This is wonderful news, Cheol!  Thanks for sharing it.
>
> Are you or any other Wikipedians in touch with them at all?  If not, it
> might be a good time to get in touch, congratulate them on this decision,
> and describe the ways the Wikimedia community (not just in Korea!) can help
> get more exposure for Korean heritage and art via articles and
> translations, and also (perhaps) to contribute corrections to metadata,
> photo captions, etc.
>
> Cheers,
>
> Asaf
> --
> Asaf Bartov
> Wikimedia Foundation <http://www.wikimediafoundation.org>
>
> Imagine a world in which every single human being can freely share in the
> sum of all knowledge. Help us make it a reality!
> https://donate.wikimedia.org

[Wikimedia-l] National Museum of Korea releases images of artifacts and old books.

2013-12-15 Thread RYU Cheol
Hello, folks.

The National Museum of Korea has announced that high-quality images of 7,300
artifacts will be released, along with 100,000 pages of old books. They said
the material will be available for commercial use, but the exact license
terms are not yet known.


http://www.museum.go.kr/program/board/detail.jsp?menuID=001009001&boardTypeID=32&originalBoardTypeID=28&boardID=19154

I hope we will be able to find these images on Commons.

Cheol


Re: [Wikimedia-l] The case for supporting open source machine translation

2013-04-26 Thread Ryu Cheol
Thanks to Jane for introducing CoSyne. But I feel that wikis do not, in
general, want to be synchronized with other wikis; rather than having
identical articles, I hope each would have its own. What I would like instead
is two more tabs, to the right of 'Article' and 'Talk' on English Wikipedia,
for the Korean language: 'Article in Korean' and 'Talk in Korean'. The
translations would carry the same information as the originals, and any edit
to an article or a talk page on the translated side would flow back to the
original. In this case the two need to be synchronized precisely.

I mean that all of this happens within the scope of English Wikipedia,
independently of Korean Wikipedia. But the Korean Wikipedia linked on the
left side of a page would eventually benefit from these translations,
whenever a Korean Wikipedia editor finds a good part of an English Wikipedia
article that could be carried over into Korean Wikipedia.

There are clear merits to an exact Korean translation of English Wikipedia,
or more generally to exact translations of the big Wikipedias. It would help
reach more potential contributors, and it would lower the language barrier
for those who want to contribute to a Wikipedia whose language they do not
speak very well. It could also provide better-aligned parallel corpora, and
it could track how human translators and reviewers improve the translations.
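
Because the scheme keeps the translation exactly synchronized with the
original, even a naive one-to-one pairing of sentences would already yield
usable training pairs. A rough sketch (the sentence splitter is deliberately
crude; a real system would use a language-aware one):

import re

def split_sentences(text):
    """Very crude splitter on sentence-final punctuation (., !, ?, 。)."""
    parts = re.split(r"(?<=[.!?\u3002])\s+", text.strip())
    return [p for p in parts if p]

def aligned_pairs(source_text, translated_text):
    """Pair sentences 1:1; valid only while the translation stays exact."""
    src = split_sentences(source_text)
    tgt = split_sentences(translated_text)
    if len(src) != len(tgt):
        raise ValueError("texts are out of sync; re-align before pairing")
    return list(zip(src, tgt))

en = "Seoul is the capital of South Korea. It lies on the Han River."
ko = "서울은 대한민국의 수도이다. 한강 유역에 위치한다."
for pair in aligned_pairs(en, ko):
    print(pair)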

Cheol

On 26 Apr 2013, at 9:04 PM, Jane Darnell wrote:

> We already have the translation options on the left side of the screen
> in any Wikipedia article.
> This choice is generally a smattering of languages, and a long term
> goal for many small-language Wikipedias is to be able to translate an
> article from related languages (say from Dutch into Frisian, where the
> Frisian Wikipedia has no article at all on the title subject) and the
> even longer-term goal is to translate into some other
> really-really-really foreign language.
> 
> Wouldn't it be easier, however, to start with a project that uses
> translatewiki and the related-language pairs? Usually there is a big
> difference in numbers of articles (like between the Dutch Wikipedia
> and the Frisian Wikipedia). Presumably the demand is larger on the
> destination wikipedia (because there are fewer articles in those
> languages), and the potential number of human translators is larger
> (because most editors active in the smaller Wikipedia are versed in
> both languages).
> 
> The Dutch Wikimedia chapter took part in a European multilingual
> synchronization tool project called CoSyne:
> http://cosyne.eu/index.php/Main_Page
> 
> It was not a success, because it was hard to figure out how this would
> be beneficial to Wikipedians actually joining the project. Some
> funding that was granted to the chapter to work on the project will be
> returned, because it was never spent.
> 
> In order to tackle this problem on a large scale, it needs to be
> broken down into words, sentences, paragraphs and perhaps other
> structures (category trees?). I think CoSyne was trying to do this. I
> think it would be easier to keep the effort to one-way traffic: try
> to offer machine translation from Dutch to Frisian and not the other
> way around, and then, as you go, define concepts that work both ways,
> so that eventually it would be possible to translate from Frisian
> into Dutch.
> 
> 2013/4/26, Mathieu Stumpf :
>> On 2013-04-25 at 20:56, Theo10011 wrote:
>>> As far as linguistic typology goes, it is far too unique and too varied
>>> for a language-independent form to develop easily. Perhaps it also
>>> depends on the perspective. For example, the majority of people
>>> commenting here (Americans, Europeans) might have exposure to a limited
>>> set of linguistic branches. Machine translations, as someone pointed
>>> out, are still not preferred in some languages; even with years of
>>> research and potentially unlimited resources at Google's disposal, they
>>> still come out sounding clunky in some ways. And perhaps they will never
>>> reach the level of the absolute, where they are truly
>>> language-independent.
>> 
>> To my mind, there's no such thing as "absolute" meaning. It's all about
>> interpretation in a given context by a given interpreter. I mean, I do
>> think that MT could probably become as good as a professional translator.
>> But even professional translators can't make "perfect" translations. I
>> already gave the example of poetry, but you may also take the example of
>> humour, which calls for some cultural background; otherwise you have to
>> explain why it's funny, and you know that if you have to explain a joke,
>> it's not a joke.
>> 
>>> If you read some of the discussions on linguistic relativity (the
>>> Sapir-Whorf hypothesis), there is research to suggest that the language
>>> a person is born with dictates their thought processes and their view of
>>> the world; there might not be absolutes when it comes to linguistic
>>> cognition. There is something inherently uniqu

Re: [Wikimedia-l] The case for supporting open source machine translation

2013-04-24 Thread Ryu Cheol
Thank you, Denny; I learned a lot about where you are heading with Wikidata.

I am a Korean Wikipedia contributor. I definitely agree with Erik that we
have to tackle the problem of information disparity between languages. But I
feel there are better choices than investing in open source machine
translation itself. Wikipedia content can be reused for commercial purposes,
and we know that this helps Wikipedia spread; I think the same logic applies
here. If proprietary machine translation can help remove the language
barrier, that would be great too. I hope we can support any machine
translation team, open source or not. But I believe that, in the end, open
source machine translation will prevail.

Wikidata-based approaches are great! But I hope Wikipedia could do more,
including providing well-aligned parallel corpora. I looked into Google's
translation workbench, which tried to provide a customized translation tool
for Wikipedia, and I tried translating a few English articles into Korean
myself. The tool has a translation memory and a customizable dictionary, but
it lacked many features needed for practical translation, and the interface
was clumsy.
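
To make the idea concrete: the heart of such a translation memory is just a
fuzzy lookup over stored segment pairs. A minimal sketch using only the
Python standard library (the 0.7 threshold is an arbitrary choice of mine):

import difflib

class TranslationMemory:
    """Minimal translation memory: store segment pairs, serve fuzzy matches."""

    def __init__(self):
        self.pairs = []   # list of (source_segment, target_segment)

    def add(self, source, target):
        self.pairs.append((source, target))

    def lookup(self, segment, threshold=0.7):
        """Return (source, target, score) for the best match, or None."""
        best, best_score = None, threshold
        for src, tgt in self.pairs:
            score = difflib.SequenceMatcher(None, segment, src).ratio()
            if score >= best_score:
                best, best_score = (src, tgt, score), score
        return best

tm = TranslationMemory()
tm.add("The museum is open every day.", "박물관은 매일 문을 엽니다.")
print(tm.lookup("The museum is open each day."))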

I believe translatewiki.net could do better than Google. I hope translatewiki
could provide a translation workbench not just for software messages but for
Wikipedia articles as well. Through such a workbench we could collect more
valuable data beyond the parallel corpus itself: we could track how a human
translator works. With more data on this editing activity, we could improve
the translation workflow and find new clues for automatic translation. A
translator will start from a stub and improve the draft; peer reviewers will
look over the draft and make it better. In other words, the logs of
collaborative translation over a parallel corpus could offer much more to
learn from.
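
As a toy illustration of the kind of log I mean: a unified diff between the
machine draft and the human revision already captures each correction a
reviewer made, and such diffs could be collected per segment:

import difflib

def revision_log(draft, revised, name="segment-001"):
    """Record how a reviewer changed a draft translation, as a unified diff."""
    return "\n".join(difflib.unified_diff(
        draft.splitlines(), revised.splitlines(),
        fromfile=name + "/machine-draft",
        tofile=name + "/human-revision",
        lineterm="",
    ))

draft = "서울은 한국의 수도입니다.\n그것은 한강에 있습니다."
revised = "서울은 대한민국의 수도이다.\n한강 유역에 위치한다."
print(revision_log(draft, revised))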

I think the Wikipedia community could start an initiative to provide raw
material for machine translation training. That material would be a common
asset for all machine translation systems.

Best regards

RYU Cheol
Chair of Wikimedia Korea Preparation Committee


On 24 Apr 2013, at 7:35 PM, Denny Vrandečić wrote:

> Erik, all,
> 
> sorry for the long mail.
> 
> Incidentally, I have been thinking in this direction myself for a while,
> and I have come to a number of conclusions:
> 1) the Wikimedia movement cannot, in its current state, tackle the problem
> of machine translation of arbitrary text from and to all of our supported
> languages
> 2) the Wikimedia movement is probably the single most important source of
> training data already. In research that I have done with colleagues,
> Wikimedia corpora used as training data easily beat other corpora, and
> others are routinely using Wikimedia corpora already. There is not much we
> can improve here, actually
> 3) Wiktionary could be an even more amazing resource if we would finally
> tackle the issue of structuring its content more appropriately. I think
> Wikidata opened a few avenues to start planning in this direction and to
> provide some software, but this would have the potential to provide more
> support for any external project than many other things we could tackle
> 
> Looking at the first statement, there are two ways we could constrain it to
> make it possibly feasible:
> a) constrain the number of supported languages. Whereas this would be
> technically the simpler solution, I think there is agreement that this is
> not in our interest at all
> b) constrain the kind of input text we want to support
> 
> If we constrain b) a lot, we could just go and develop "pages to display
> for pages that do not exist yet based on Wikidata" in the smaller
> languages. That's a far cry from machine translating the articles, but it
> would be a low hanging fruit. And it might help with a desire which is
> evidently strongly expressed by the mass creation of articles through bots
> in a growing number of languages. Even more constraints would still allow
> us to use Wikidata items for tagging and structuring Commons in a
> language-independent way (this was suggested by Erik earlier).
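>
> As a very rough sketch of what such generated pages could be built on: the
> existing wbgetentities API already returns the needed raw material (error
> handling is omitted here, and I assume the item carries a label and a
> description in the target language):
>
> # Rough sketch: fetch a Wikidata item's label and description in a target
> # language, as raw material for a generated placeholder page.
> import json
> import urllib.parse
> import urllib.request
>
> def fetch_item(item_id, lang):
>     params = urllib.parse.urlencode({
>         "action": "wbgetentities", "format": "json", "ids": item_id,
>         "props": "labels|descriptions", "languages": lang,
>     })
>     url = "https://www.wikidata.org/w/api.php?" + params
>     with urllib.request.urlopen(url) as resp:
>         entity = json.load(resp)["entities"][item_id]
>     return (entity["labels"][lang]["value"],
>             entity["descriptions"][lang]["value"])
>
> label, description = fetch_item("Q42", "ko")   # Q42: Douglas Adams
> print(label + ": " + description)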
> 
> Current machine translation research aims at massive systems supported by
> machine learning. They usually require big parallel corpora. We do not
> have big parallel corpora (Wikipedia articles are not translations of each
> other, in general), especially not for many languages, and there is no
> reason to believe this is going to change. I would question if we want to
> build an infrastructure for gathering those corpora from the Web
> continuously. I do not think we can compete in this arena, or that this is
> the best use of our resources to support projects in this area. We should
> use our unique features to our advantage.
> 
> How can we use the unique