Re: [Wikimediaindia-l] Releasing Wikipedia CD – "few concerns"

2011-03-01 Thread Shiju Alex
Even though Srikanth has clearly explained most of the points, let me add
a few things.

I am very much aware of the Wikipedia CD/DVD projects (and other offline
releases) across the world and of the different initiatives surrounding
them. In fact, I was the project coordinator for the Malayalam Wikipedia CD.
Malayalam Wikipedians did a thorough study of all the Wikipedia CD
releases and the Wikipedia for Schools project (and its release of a special
edition of the English Wikipedia CD) before they started working on the
Wikipedia CD project.

The rest of the comments in your blog post are unwarranted for the topic of the
blog post (comments like the IndiaTV channel story and all). It is really
difficult to understand things from the Indian perspective.

My blog post clearly says:

   - Indian language Wikipedias also need efforts like Wikipedia for Schools or
   the official English Wikipedia CD release. But for that, much
   effort and cooperation with the respective Indic language Wikipedians are
   required. It is not as simple as taking the dump, converting it into ZIM, and
   creating a Wikipedia CD/DVD.
   - The content that we give to school children needs to go through a thorough
   review process.
   - Even the official general release of a Wikipedia CD/DVD (that is, a
   CD/DVD that is not directed towards children) needs to have some process. For
   the best example, look at the Wikipedia CD effort by Martin Walker and team
   for the English Wikipedia CD/DVD (I must say I am forced to write DVD along with
   CD everywhere :) )
   - The copyright of content and images is most important. We are very bad
   at respecting copyrights.
   - The inclusion of explicit images and content in offline releases (be
   it CD/DVD/book/USB/Blu-ray disc, or any other offline material) will
   have serious after-effects in India.

Since you can find them yourself, I am not providing you the links to some of
the explicit and copyrighted images (and content also) that are contained in
the Hindi/Marathi/Gujarati Wikipedia CD/DVD that you already released.

Understanding things from the Indian context is a big thing. I think *Ashwin
Baindur* tried to explain this through 2 emails.
I request all of you who are involved in offline efforts to take that
feedback very seriously.

Before I close this reply, let me say one thing. The Malayalam Wikipedia CD was
NOT a Wikipedia CD released for school children. It was a general release.
We made every effort to make it error free and free from copyright issues.
Still, the CD release ran into controversies (over the inclusion/non-inclusion
of certain articles, images, and so on). So we do not want other Indian wiki
communities to run into similar issues, especially because many of the so-called
big languages in India (like Hindi, Marathi, Gujarati, and all) don't have
a big community to take up all this. See the statistics report to
understand the current state of the different Indic language Wikipedias.

The Tamil community understood the importance of all this. That is why they
have adopted a certain process to release the Tamil Wikipedia CD/DVD.

Shiju Alex

On Wed, Mar 2, 2011 at 8:40 AM, CherianTinu Abraham wrote:

> Dear Nikhil (et al.),
> Please do NOT take it personally. Whatever you are
> doing is a great initiative and really appreciable. These comments are
> not meant to undermine your efforts.
> Srikanth's reply should completely explain why
> Shiju was trying to make a point and, IMHO, the points are valid ones. These best
> practices are based on previous experience and are not knee-jerk comments
> (sorry, no pun intended).
> Please note that this is the practice even for the
> project for English wiki, supplied to
> schools of "more liberalized" countries.
> The question is neither about the format in which this
> information is provided nor about the number of articles. 500? 2500? 25000?
> Ship it on a CD, thumb drive, DVD, or even Blu-ray discs or any other medium.
> Any number of articles is acceptable, provided we are able to give them
> quality content. I can personally vouch that more than half the content on
> our Wikipedia is crap. Yes, I mean it. But the reason we are all here is
> to convert this crap into more useful stuff.
> Regards
> Tinu Cherian
> On Wed, Mar 2, 2011 at 7:18 AM, Srikanth Lakshmanan wrote:
>> On Wed, Mar 2, 2011 at 00:50, Nikhil Sheth wrote:
>>> Hi All,
>>> I had a great time reading the blogpost. Very amusing. But had to clarify
>>> some things before we all pick up our pitch-forks and torches, so..

Re: [Wikimediaindia-l] Releasing Wikipedia CD – "few concerns"

2011-03-01 Thread CherianTinu Abraham
Dear Nikhil (et al.),
Please do NOT take it personally. Whatever you are
doing is a great initiative and really appreciable. These comments are
not meant to undermine your efforts.
Srikanth's reply should completely explain why Shiju
was trying to make a point and, IMHO, the points are valid ones. These best
practices are based on previous experience and are not knee-jerk comments
(sorry, no pun intended).
Please note that this is the practice even for the project for English wiki, supplied to schools
of "more liberalized" countries.
The question is neither about the format in which this
information is provided nor about the number of articles. 500? 2500? 25000?
Ship it on a CD, thumb drive, DVD, or even Blu-ray discs or any other medium.
Any number of articles is acceptable, provided we are able to give them
quality content. I can personally vouch that more than half the content on
our Wikipedia is crap. Yes, I mean it. But the reason we are all here is
to convert this crap into more useful stuff.

Tinu Cherian

On Wed, Mar 2, 2011 at 7:18 AM, Srikanth Lakshmanan wrote:

> On Wed, Mar 2, 2011 at 00:50, Nikhil Sheth wrote:
>> Hi All,
>> I had a great time reading the blogpost. Very amusing. But had to clarify
>> some things before we all pick up our pitch-forks and torches, so..
> I think you took it a bit personally. Shiju's blog post, IMHO, is a good guide,
> especially for Indian-language CD creation projects. Giving out ZIM format dumps
> has some adverse effects, which is what Shiju tried to share as a best
> practice. I think they are valid concerns. Let me explain.
> I think he clearly mentioned the project
> for English wiki, and there is no problem with such approaches. I think the
> blog post was mainly addressing the impact of giving out whole dumps of Indian
> language Wikipedias, all [except English (yes, English is also an Indian
> language)] of which are in their early stages.
> Some time back I proposed a similar CD creation project on Tamil Wikipedia
> and thought that, given that Santhosh's software is available, we could get a CD
> in a couple of months' time. My timeline was rightfully mocked by the
> community, with valid concerns about quality, and we have just started
> collecting articles, which will then be peer reviewed, copy edited, etc.
> before we can use the software and create a CD. Along with Shiju's points,
> I am sharing some important viewpoints (which I learnt from the Tamil community)
> on why I think the whole exercise of fact checking before releasing is very
> important, instead of just dumping them, in the case of Indian language
> Wikipedias.
> ** Who has checked the validity of the content provided in the CD? Once the
> CD is created, the content is frozen forever. We never know in what ways
> the content in the CD is going to get duplicated.
> >> This is very important. Imagine students using it to write answers for
> exams: if there is a factual error and the teacher fails to give
> marks, the students will never turn back to Wikipedia again. We are
> losing young buds here. While you might say Wikipedia is not a reliable
> information source but you can still use it, for a school kid it's always
> binary. Indian language Wikipedias are bound to have more spelling
> mistakes (since most of us use a non-native keyboard / translation and we are
> human), so the error rate will be even higher. It's not just about schools:
> anyone who finds such errors in a published form (a CD is a published form
> you can't change) will find it annoying and may arrive at a prejudice which
> we wouldn't want.
> ** How are we going to handle the copyright violation of text and images
> included in the CD? Can we always point our fingers at WMF?
> >> This is a very serious issue. It's a different thing that we as individuals
> might not respect copyright because of the climate in India, but as a
> project Wikipedia holds copyright as one of the 5 pillars (do I need to tell
> this list about this? Sorry, *runs away* :D).
> ** Who will answer the queries related to the explicit images contained in
> many articles? The inclusion of explicit images, controversial articles, and
> factual errors in the CD supplied to the school children are not small
> issues.
> >> Not many know, Shiju may be sharing this because the Malayalam community
> actually burnt their fingers here. There was a (politically motivated?)
> article in a leading Malayalam daily/magazine criticizing the effort
> regarding inclusion criteria / some factually erroneous information. This may
> not have been shared widely, to avoid multiplying the negative publicity.
> While at a personal level it hurts the people who worked hard on this,
> spending weeks on the project, at the project level it

Re: [Wikimediaindia-l] Releasing Wikipedia CD – "few concerns"

2011-03-01 Thread Srikanth Lakshmanan
On Wed, Mar 2, 2011 at 00:50, Nikhil Sheth wrote:

> Hi All,
> I had a great time reading the blogpost. Very amusing. But had to clarify
> some things before we all pick up our pitch-forks and torches, so..

I think you took it a bit personally. Shiju's blog post, IMHO, is a good guide,
especially for Indian-language CD creation projects. Giving out ZIM format dumps
has some adverse effects, which is what Shiju tried to share as a best
practice. I think they are valid concerns. Let me explain.

I think he clearly mentioned the project
for English wiki, and there is no problem with such approaches. I think the
blog post was mainly addressing the impact of giving out whole dumps of Indian
language Wikipedias, all [except English (yes, English is also an Indian
language)] of which are in their early stages.

Some time back I proposed a similar CD creation project on Tamil Wikipedia
and thought that, given that Santhosh's software is available, we could get a CD
in a couple of months' time. My timeline was rightfully mocked by the
community, with valid concerns about quality, and we have just started
collecting articles, which will then be peer reviewed, copy edited, etc.
before we can use the software and create a CD. Along with Shiju's points,
I am sharing some important viewpoints (which I learnt from the Tamil community)
on why I think the whole exercise of fact checking before releasing is very
important, instead of just dumping them, in the case of Indian language
Wikipedias.
** Who has checked the validity of the content provided in the CD? Once the
CD is created, the content is frozen forever. We never know in what ways
the content in the CD is going to get duplicated.

>> This is very important. Imagine students using it to write answers for
exams: if there is a factual error and the teacher fails to give
marks, the students will never turn back to Wikipedia again. We are
losing young buds here. While you might say Wikipedia is not a reliable
information source but you can still use it, for a school kid it's always
binary. Indian language Wikipedias are bound to have more spelling
mistakes (since most of us use a non-native keyboard / translation and we are
human), so the error rate will be even higher. It's not just about schools:
anyone who finds such errors in a published form (a CD is a published form
you can't change) will find it annoying and may arrive at a prejudice which
we wouldn't want.

** How are we going to handle the copyright violation of text and images
included in the CD? Can we always point our fingers at WMF?

>> This is a very serious issue. It's a different thing that we as individuals
might not respect copyright because of the climate in India, but as a
project Wikipedia holds copyright as one of the 5 pillars (do I need to tell
this list about this? Sorry, *runs away* :D).

** Who will answer the queries related to the explicit images contained in
many articles? The inclusion of explicit images, controversial articles, and
factual errors in the CD supplied to the school children are not small
issues.

>> Not many know, Shiju may be sharing this because the Malayalam community
actually burnt their fingers here. There was a (politically motivated?)
article in a leading Malayalam daily/magazine criticizing the effort
regarding inclusion criteria / some factually erroneous information. This may
not have been shared widely, to avoid multiplying the negative publicity.
While at a personal level it hurts the people who worked hard on this,
spending weeks on the project, at the project level it creates negative
publicity / impact for the language Wikipedia, which none of us would want.

>> On the content side as well, Indian Wikipedias aren't that great (barring
a few). The stub ratio may be high, which will give people using the offline
ZIM dumps the wrong impression that language Wikipedias only have templated
content stubs. While it's good to have stubs online, since they help in the
improvement of articles, I don't fancy stubs on an offline Wikipedia.

>> While giving out ZIM format dumps might be useful for people without
connectivity, IMHO it's definitely not suitable for mass consumption without
a bunch of disclaimers (which might not be understood even by the well educated).

>> While the intentions may be good, the bottom line is that they shouldn't
cause a negative impact on the language Wikipedia. I hope you understand. (This was
probably missing in Shiju's blog post, which made you think it was pointing
fingers, but I am sure Shiju's intention was to share his "bottom line".)

Wikimediaindia-l mailing list

Re: [Wikimediaindia-l] Releasing Wikipedia CD – "few concerns"

2011-03-01 Thread Nikhil Sheth
Hi All,

I had a great time reading the blogpost. Very amusing. But had to clarify
some things before we all pick up our pitch-forks and torches, so..

I can fully understand the knee-jerk mechanism of reading one email in haste
and putting two unrelated things together to turn a totally open initiative
(the main website is a wiki-page where anyone can give feedback if anything
goes wrong or can jump in, give alternatives and change the course of the
project for the better without any controlling powers blocking anybody) into
a thrilling soap opera filled with villains preying on innocent children and
destroying our sacred culture by showing them things we desperately do not
want them to see like:

safe sex, condoms, facts about women's bodies that would end their
objectification and bring respect between the sexes, human rights abuses
practiced by our government under the garb of AFSPA, truths about Naxalism,
SEZs and raping of the nation's resources and forests by industrialists
that metro-dwellers don't want to admit, mass displacements caused by
development projects that don't really tip the cost-benefit scales in the
long run, ties of all major mainstream media with the establishment and
corporates that prevent real news from being printed or broadcast, that
Santa Claus doesn't really exist, that our and other governments are paying
lip service to mass-killer threats of climatic disruption while blowing
vast amounts on things like CWG, scandals and scams, ugly track records of
most politicians in our country, etc.

I fully understand that we should be really really scared of any of these
uncomfortable truths that are anyways present in wikipedia open for all to
see (and not any porn or racy material as we try to blanket them under when
using the term "objectionable content") from ever reaching the millions of
children on the other side of the digital and economic divide whom we don't
really give a damn about anyways.
And I completely agree we should always guard against such threats to our
society rather than actually risking empowering the people with knowledge of
the world they are living in.

btw, personal opinions do not influence practicality. As I've explained in
the reply blog post, ain't no way the unfiltered versions are getting
anywhere near any students as long as the initiative remains grass-roots and
no one is ordering any school from the top. Before any of you even start to
worry about content, the teachers and staff themselves audit it fully before
allowing it, being directly responsible. So I'd advise the community to stop
worrying so much and instead get to work on actually creating
student-friendly Indian language versions that have a little more than 500
articles.
What I cannot understand is how in the world can anybody expect a
made-for-education collection of wikipedia articles for school children from
age 8 to 17 that is supposed to cover everything they'll learn through
school to actually fit on just one little CD!! :P

And then after that fact how is one supposed to magically squeeze in porno
and what not into the same CD to make it unfit for school children, even if
we forget that the ZIM format isn't half as easily alterable as the other
more "preferred" formats that our own community has developed !


On Tue, Mar 1, 2011 at 7:54 PM, sankarshan wrote:

> On Tue, Mar 1, 2011 at 5:31 PM, CherianTinu Abraham wrote:
> > 1) Releasing a Wikipedia CD for schools and school children needs a larger
> > review of content. Not all content on Wikipedia may be suitable for children.
> > If any controversy occurs over Wikipedia content given to children, we risk
> > a very bad reputation and possibly a permanent ban from schools.
> Somewhat tangential, the OLPC folks had also worked on providing a
> dump of Wikipedia. Is there a way to access the criteria used to
> curate the content suitable for children?
> --
> sankarshan mukhopadhyay

Re: [Wikimediaindia-l] Releasing Wikipedia CD – "few concerns"

2011-03-01 Thread sankarshan
On Tue, Mar 1, 2011 at 5:31 PM, CherianTinu Abraham wrote:

> 1) Releasing a Wikipedia CD for schools and school children needs a larger
> review of content. Not all content on Wikipedia may be suitable for children.
> If any controversy occurs over Wikipedia content given to children, we risk
> a very bad reputation and possibly a permanent ban from schools.

Somewhat tangential, the OLPC folks had also worked on providing a
dump of Wikipedia. Is there a way to access the criteria used to
curate the content suitable for children?

sankarshan mukhopadhyay


Re: [Wikimediaindia-l] Releasing Wikipedia CD – "few concerns"

2011-03-01 Thread Anirudh Bhati
Hi Tinu,

Thank you for bringing this post to notice.  Shiju makes a few very
important points.

Since the dumps are publicly available and the content is licensed
under CC BY-SA (except some infringing media), users are free to
create ZIM files out of them.  But since some of these efforts may not be
associated with the specific language community, there may not be any
oversight of the content.  Therefore, I suggest that we make an effort
to connect with individuals who are taking up these interests, because
we need their support and enthusiasm to make such projects successful.
For instance, Nikhil Sheth is a Pune-based volunteer who has taken the
SOSCV approved edition of Wikipedia for Schools in English to two
schools in his city.  While at the GNUnify conference, he shared the
Marathi, Hindi and Gujarati ZIM files with me for review but these are
not for use in schools as far as I'm aware.

To ensure that we do not receive criticism based on the efforts of
individuals who are not directly associated with the Wikimedia
community (even when well-intentioned), we may:

(i) publicize the availability of works created by the local language
community;
(ii) use Wikimedia trademarks with approval from the Foundation or the
Chapter (like the Malayalam CD team);
(iii) actively attempt to engage volunteers to share ideas and best practices.

Yours sincerely,

Anirudh Bhati

00 91 9328712208
Skype: anirudhsbh

On Tue, Mar 1, 2011 at 5:31 PM, CherianTinu Abraham wrote:
> Happened to come across this blog post by Shiju Alex, "Releasing Wikipedia CD –
> few concerns"
> Thought it might need a larger discussion in the Wikicommunity.
> Here is my take on the same:
> 1) Releasing a Wikipedia CD for schools and school children needs a larger
> review of content. Not all content on Wikipedia may be suitable for children.
> If any controversy occurs over Wikipedia content given to children, we risk
> a very bad reputation and possibly a permanent ban from schools.
> 2) If we are releasing a Wikipedia in a particular language, it is always
> good to take the Wiki-community along. They will be more than happy to help.
> Please go through the blog post and share your thoughts.
> Regards
> Tinu Cherian