Re: [Wikimediaindia-l] Tamil Wiktionary now Top 10 globally

2010-11-14 Thread Sunil
Great news. Congrats to all

On Mon, Nov 15, 2010 at 12:21 AM, theo10011 de10...@gmail.com wrote:

 Congratulations, Ravi and all other Tamil Wiktionarians, that is quite an
 achievement.


 Looking forward to seeing Tamil Wiktionary grow larger.


 Regards


 Salmaan Haroon


 On Mon, Nov 15, 2010 at 12:17 AM, Ravishankar ravidre...@gmail.com wrote:

 Hi,

 Tamil Wiktionarians added 70,000+ English words recently to make it the
 9th biggest Wiktionary globally. Now, we have 1,90,000+ words.

 http://www.wiktionary.org/

 This addition was done by a community-run AutoWikiBrowser bot account
 after three months of careful planning and a redesign of the page format
 for Tamil Wiktionary.

 The words are mainly technical in nature and were donated as an Excel
 sheet by the State Government of Tamil Nadu during the Tamil Internet
 Conference, 2010.

 We hope to bring more such publicly available sources to Tamil Wiki
 projects in future.

 Regards,

 Ravi

 (On behalf of Tamil Wiktionarians)

 ___
 Wikimediaindia-l mailing list
 Wikimediaindia-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikimediaindia-l







Re: [Wikimediaindia-l] Tamil Wiktionary now Top 10 globally

2010-11-14 Thread Gautam John
On 15 November 2010 00:17, Ravishankar ravidre...@gmail.com wrote:

 Tamil Wiktionarians added 70,000+ English words recently to make it the 9th
 biggest Wiktionary globally. Now, we have 1,90,000+ words.

Wow! That's fantastic, Ravi! Congratulations to everyone who helped
make this happen!

Thank you.

Best,

Gautam

http://social.prathambooks.org/



[Wikimediaindia-l] Tamil Encyclopedia merge into Wikipedia.

2010-11-14 Thread Murali Kumar

Dear Wikimedia India,
As you are probably aware, the Govt. of India, immediately post-Independence, started 
multiple Indian-language encyclopedia projects to stream in Science and 
Technology. The Tamil language encyclopedia was completed 
[http://en.wikipedia.org/wiki/Tamil_Encyclopedia].
I'm pleased to report Tamil Virtual University has scanned in the Tamil 
Kalaikalanjiam / Tamil Encyclopedia [Please see Reference 1 below].
I was able to download the material with the wonderful wget command and the 
'convert' tool (from ImageMagick) on GNU/Linux. However, each of the 10 volumes 
is close to 700 MB without compression.
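
For anyone who wants to repeat that download-and-convert step, here is a minimal sketch. It only builds the two command lines and does not run them. The index URL, the directory name and the page file names are placeholders I made up, not the real Tamil Virtual University paths, and the helper names wget_command and convert_command are hypothetical.

```python
import shlex


def wget_command(index_url, dest_dir):
    """Build a wget command that mirrors the page images under index_url.

    Assumption: the scans are plain image files linked below one index page.
    """
    return [
        "wget",
        "--recursive",            # follow links below the index page
        "--no-parent",            # never climb above the starting directory
        "--accept", "jpg,jpeg,png,gif",
        "--wait", "1",            # pause between requests to be polite
        "--directory-prefix", str(dest_dir),
        index_url,
    ]


def convert_command(page_images, output_pdf):
    """Build an ImageMagick 'convert' command that joins pages into one PDF.

    Pages are sorted by name, so this assumes zero-padded file names
    (page001.jpg, page002.jpg, ...).
    """
    pages = sorted(str(p) for p in page_images)
    return ["convert", *pages, str(output_pdf)]


if __name__ == "__main__":
    # Placeholder URL and file names, for illustration only.
    print(shlex.join(wget_command("http://example.org/volume01/", "scans")))
    print(shlex.join(convert_command(["page002.jpg", "page001.jpg"], "volume01.pdf")))
```

The printed lines can be pasted into a shell once the real URL and file names are known.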
I would imagine, the people behind this mammoth task (pre-internet era) would 
have liked it to be merged into a Wiki type format, which would make it a truly 
living document in-sync with the times.
I do not have any experience with 1) Tamil OCR software or 2) automated 
updates to Wikipedia. Can anyone take the lead on this project? It will help 
boost the number of quality articles in Indian languages. The Children's 
encyclopedia is being scanned and has a lot of great visual content.
I have uploaded a sample (10 MB) PDF file at 
https://sites.google.com/site/periasamythooran/kalaikalanjiam/kalaikalanjiamWikiMergeAttempt.pdf
 if you would like to give it a spin.
Thanks,
Murali.
1. http://www.tamilvu.org/library/libindex.htm and click on Kalaikalanjiam / 
Tamil Encyclopedia.


Re: [Wikimediaindia-l] Tamil Encyclopedia merge into Wikipedia.

2010-11-14 Thread John Vandenberg
On Mon, Nov 15, 2010 at 5:32 PM, Shiju Alex shijualexonl...@gmail.com wrote:
 I have a query.

 What is the license of Tamil Kalaikalanjiam? Has the Tamil Nadu government
 or Tamil Virtual University officially announced that this encyclopedia is
 released into the public domain or under some Creative Commons license so
 that we can reuse the content? If yes, we can very well reuse the content.
 Otherwise it will be a copyright violation. So kindly verify this.

 Let us not assume that since it is published by the Government it will be
 in the public domain. In India that is not the case.

If it is released under a free license, it can be transcribed on Tamil
Wikisource.

http://ta.wikisource.org/wiki/

The software to do transcriptions on Wikisource is not enabled on
Tamil Wikisource, but it can be enabled once these messages have been
translated.

http://translatewiki.net/w/i.php?title=Special%3ATranslate&task=untranslated&group=ext-proofreadpage&language=ta&limit=100

Here is an example of the Wikisource transcription software with a
completed work:

http://en.wikisource.org/wiki/Index:Sanskrit_Grammar_by_Whitney_p1.djvu

and an incomplete project:

http://en.wikisource.org/wiki/Index:Dictionary_of_National_Biography_volume_03.djvu

 In December 2008, the Kerala Government officially announced that it is
 changing the license of a similar encyclopedic project in Malayalam
 (sarvavijanakosam) to a free documentation license so that the Malayalam
 wiki community can reuse its content to develop Malayalam Wikipedia. The
 Kerala Government has also set up its own wiki (to help us) for
 Sarvavijanakosam, and they are slowly digitizing the content and posting it
 on that wiki (http://mal.sarva.gov.in). They have completed some 2,900
 articles now. We are reusing this content to enhance many of the existing
 articles. But we are not copy-pasting the entire content, for various
 reasons. The main reason is that the content needs to be rewritten in the
 style of Wikipedia.

The Wikisource transcription software is a MediaWiki extension, so it
could be added to mal.sarva.gov.in.

--
John Vandenberg



Re: [Wikimediaindia-l] Tamil Encyclopedia merge into Wikipedia.

2010-11-14 Thread BalaSundaraRaman
Yes, we need to get it under a suitable license. If the technical issue related 
to OCR is resolved, we can talk to them about releasing the content into the 
public domain.

- Sundar

 That language is an instrument of human reason, and not merely a medium for 
the expression of thought, is a truth generally admitted.
- George Boole, quoted in Iverson's Turing Award Lecture



From: Shiju Alex shijualexonl...@gmail.com
To: wikimediaindia-l@lists.wikimedia.org
Sent: Mon, November 15, 2010 12:02:42 PM
Subject: Re: [Wikimediaindia-l] Tamil Encyclopedia merge into Wikipedia.

I have a query.

What is the license of Tamil Kalaikalanjiam? Has the Tamil Nadu government or 
Tamil Virtual University officially announced that this encyclopedia is 
released into the public domain or under some Creative Commons license so that 
we can reuse the content? If yes, we can very well reuse the content. Otherwise 
it will be a copyright violation. So kindly verify this.

Let us not assume that since it is published by the Government it will be in 
the public domain. In India that is not the case.

In December 2008, the Kerala Government officially announced that it is 
changing the license of a similar encyclopedic project in Malayalam 
(sarvavijanakosam) to a free documentation license so that the Malayalam wiki 
community can reuse its content to develop Malayalam Wikipedia. The Kerala 
Government has also set up its own wiki (to help us) for Sarvavijanakosam, and 
they are slowly digitizing the content and posting it on that wiki 
(http://mal.sarva.gov.in). They have completed some 2,900 articles now. We are 
reusing this content to enhance many of the existing articles. But we are not 
copy-pasting the entire content, for various reasons. The main reason is that 
the content needs to be rewritten in the style of Wikipedia.

I really doubt the effectiveness of current OCR software for Indian languages. 
It is still under development, and the existing solutions are not good. I am 
not sure about Tamil OCR software.

Shiju Alex


On Mon, Nov 15, 2010 at 11:33 AM, Murali Kumar pthoo...@hotmail.com wrote:

Dear Wikimedia India,

As you are probably aware, the Govt. of India, immediately post-Independence, 
started multiple Indian-language encyclopedia projects to stream in Science and 
Technology. The Tamil language encyclopedia was completed 
[http://en.wikipedia.org/wiki/Tamil_Encyclopedia].

I'm pleased to report Tamil Virtual University has scanned in the Tamil 
Kalaikalanjiam / Tamil Encyclopedia [Please see Reference 1 below].

I was able to download the material with the wonderful wget command and the 
'convert' tool (from ImageMagick) on GNU/Linux. However, each of the 10 volumes 
is close to 700 MB without compression.

I would imagine, the people behind this mammoth task (pre-internet era) would 
have liked it to be merged into a Wiki type format, which would make it a truly 
living document in-sync with the times.

I do not have any experience with 1) Tamil OCR software or 2) automated 
updates to Wikipedia.

Can anyone take the lead on this project? It will help boost the number of 
quality articles in Indian languages. The Children's encyclopedia is being 
scanned and has a lot of great visual content.

I have uploaded a sample (10 MB) PDF file at 
https://sites.google.com/site/periasamythooran/kalaikalanjiam/kalaikalanjiamWikiMergeAttempt.pdf
 if you would like to give it a spin.

Thanks,

Murali.

1. http://www.tamilvu.org/library/libindex.htm and click on Kalaikalanjiam / 
Tamil Encyclopedia.


Re: [Wikimediaindia-l] Tamil Wiktionary now Top 10 globally

2010-11-14 Thread CherianTinu Abraham
Also glad to see an Indic language on the main page of a Wikimedia project
http://www.wiktionary.org/
Congratulations to all who made this happen.

Would love to see at least one Indic language in the top 10 of every Wikimedia
project in the next 5 years.

Regards
Tinu Cherian


On Mon, Nov 15, 2010 at 9:31 AM, Gautam John gau...@prathambooks.org wrote:

 On 15 November 2010 00:17, Ravishankar ravidre...@gmail.com wrote:

  Tamil Wiktionarians added 70,000+ English words recently to make it the
  9th biggest Wiktionary globally. Now, we have 1,90,000+ words.

 Wow! That's fantastic, Ravi! Congratulations to everyone who helped
 make this happen!

 Thank you.

 Best,

 Gautam
 
 http://social.prathambooks.org/
