Raphael, to clarify, I am not trying to make a case against the availability
of fine-grained data, far from it; I'm with you on this argument, among others
that are made spuriously to restrict access. I might have stretched the
point, but then again - killing is just one extreme form of discrimination -
there are others that are less visible.

You summed it up very well: it's good to have a healthy caution and unease
when dealing with some of this data. There are probably no simple answers
here.

Will read the paper at leisure.

cs.


On Fri, Apr 11, 2014 at 12:37 PM, Raphael Susewind <
li...@raphael-susewind.de> wrote:

> Chandrashekhar,
>
> just on the specific issues of targeting communities, which I have
> thought about a great deal (my first book was on post-2002 Gujarat), my
> tentative conclusion is this:
>
> The fact that electoral rolls had been used in the past in riots before
> they were available online shows that rioters, if they want to, can
> access this data already. As Gautam pointed out, it IS public by law.
> What changes is merely the scale of data availability. Large-scale data
> would only be 'more useful' for large-scale targeting, however
> (small-scale targeting is possible already), which I don't see happening
> at this time (with the troublesome exception of Gujarat, particularly
> troublesome now that Mr Modi runs for PM - but here, too, the targeting
> happened in small units on the ground, even though coordination took
> place higher up). On the other hand, fine-grained large-scale data is
> absolutely necessary to understand a range of issues about (religious,
> caste) economic position. So in this specific case, we have
> additional benefits but no additional risk (beyond the worrisome risk
> already out there)...
>
> More detailed arguments about this in a forthcoming paper of mine at
> http://pub.uni-bielefeld.de/publication/2631138
>
> Best,
> Raphael
>
> On 11.04.2014 08:49, Chandrashekhar Raman wrote:
> > Raphael, you raise very pertinent issues.
> >
> > We as a community love open data and in this country there is a lot that
> > can be done to free all kinds of data so that it can be made use of in a
> > good way (election data in an aggregated form is one example). But at
> > the same time there are certain kinds of data which are not open (I
> > mean not open in a machine-readable format) for a good reason. I believe
> > voter roll data is one such type. In the past, voter lists have been
> > used to pinpoint members of specific communities, who were then
> > targeted with gruesome effect. I shudder to think what happens if it is
> > automated: a 'riot app'?
> >
> > As Raphael points out this is not just about privacy, but could be much
> > worse.
> >
> > This group is a fantastic initiative and as it evolves, it would be
> > great for us to involve more social scientists and policy experts, so
> > that as we advocate vociferously to free more data and make it open, we can
> > also bring in the technical expertise here to recommend where data needs
> > to be better protected and how.
> >
> > cs
> >
> >
> > On Fri, Apr 11, 2014 at 11:44 AM, Raphael Susewind
> > <li...@raphael-susewind.de> wrote:
> >
> >     Hi Devdatta and Avinash,
> >
> >     yes, I, too, am frankly surprised at the ease with which one can access
> >     sensitive data in bulk. Not only PDF rolls and voter details, but also
> >     things such as land records, BPL lists, and much more - I think we are
> >     in an exciting as well as dangerous phase of fairly uncontrolled,
> >     nascent e-Governance practices. But I think the ethical issues here are
> >     a little more complex than mere privacy concern.
> >
> >     Upfront, I must admit that I use all the above sources for academic
> >     research (in UP and across India). What Avinash described in principle
> >     and with the example of Delhi can indeed be done on an all-India scale,
> >     and I am sure there are more people than just me who do it.
> >
> >     But then the social sciences have long dealt with sensitive data and
> >     developed protocols to protect it. Even though the data is publicly
> >     available, I for instance have my own copy on a secure workstation with
> >     full disk encryption and two factor authentication. Whenever possible, I
> >     also work on anonymized subsets of data. Yet there are other potential
> >     uses - some of the more worrisome you pointed out - which are not bound
> >     by such data protection standards.
> >
> >     To me, this once more highlights the nascent stage of ethical standards
> >     around Big Data and eGovernance. On the plus side, I am happy to have
> >     that kind of access to conduct research which will ultimately be
> >     ethically beneficial, leading to better understanding of social issues
> >     and potentially to better policy advice. Also, there is a point to be
> >     made that transparency is an important asset in elections in particular,
> >     not only in terms of individual electoral search functions, but also in
> >     terms of publicly accessible (and cross-checkable, publicly verifiable)
> >     PDF rolls. Finally, a lot of this data had been available in the past as
> >     well, only in distributed and/or commercial form, which means there had
> >     been a hierarchy of access: small-time crooks could not use it, but
> >     large-time crooks were always able to use it. Likewise, scholars at
> >     large (often foreign) universities were able to use it, but not smaller
> >     ones (this is still true for some data, geodata in particular, which I
> >     can only access because of Ivy-League contacts and only process because
> >     of an association with Oxford University).
> >
> >     The ethical challenge as I see it thus comes not from data availability
> >     per se, but from the bulk accessibility and processability of data, as
> >     well as the potential to link otherwise disconnected datasets with each
> >     other (for instance a voter ID from the rolls to the online electoral
> >     search mechanism to that voter's polling booth locality to the ration
> >     card of a person with the same name registered at a ration shop in close
> >     spatial proximity to the amount of rice that person obtained last week,
> >     all coupled - in case of my own research - to that person's religious
> >     identity through a namematching algorithm). And this IS an ethical
> >     challenge indeed, particularly if one leaves the ivory tower of
> >     academia, where ethical standards for such data are more ingrained, and
> >     more adhered to. One need not go all the way to the various criminal
> >     uses of such data - are we all happy with commercial use, to start with?
> >
> >     I have no easy answers here, because I think the ethical issue is fairly
> >     complex, balancing privacy and personal security against transparency in
> >     the political process and legitimate academic use of data (also because
> >     I think the answer must be found in India through political
> >     deliberation, and not in German academia). Still, in the end, I have to
> >     admit that I often leave my desk in the evening with quite some unease
> >     over the sheer wealth of private data that I work with...
> >
> >     What do others think?
> >     Raphael
> >
> >     On 11.04.2014 06:57, Avinash Celestine wrote:
> >     > Hi Devdatta
> >     >
> >     > Yes, though (and in the current context, I suppose that's a good thing),
> >     > it's not so easy for some other states such as UP, due to certain
> >     > problems with the way the PDFs are encoded. Raphael, who is on this
> >     > group, will testify to that...
> >     >
> >     > I had alluded to this sometime back...
> >     >
> >     > https://storify.com/ac_soc/voter-profiling
> >     >
> >     > Avinash
> >     >
> >     >
> >     >
> >     >
> >     > On Fri, Apr 11, 2014 at 9:55 AM, Devdatta Tengshe
> >     > <devda...@tengshe.in> wrote:
> >     >
> >     >     Hi,
> >     >     I found this interesting article by a guy who downloaded and
> >     >     processed the Voter list of Delhi:
> >     >     https://medium.com/p/1aff55526881
> >     >
> >     >     I found this via a discussion on Reddit:
> >     >     http://www.reddit.com/r/programming/comments/22pn8u/i_wrote_a_few_simple_python_scripts_to_retrieve/
> >     >
> >     >     I'd like to quote his findings here:
> >     >
> >     >      1. It is possible to automate the retrieval of every single PDF
> >     >         roll all across India
> >     >      2. These PDFs can then be processed in a matter of minutes to
> >     >         produce details like Addresses, names, father's name, gender,
> >     >         age and voters ID number for every single registered voter
> >     >         of India
> >     >      3. Nearly 25% of the Voter IDs assigned within only Delhi fail to
> >     >         conform to the government format, and fail the Luhn Checksum
> >     >         test used to validate them. It is likely that other states are
> >     >         in a similar, if not worse condition
> >     >
> >     >
> >     >     Regards,
> >     >
> >     >     Devdatta Tengshe
> >     >
> >     >
> >     >     --
> >     >     For more details about this list
> >     >     http://datameet.org/discussions/
> >     >     ---
> >     >     You received this message because you are subscribed to the Google
> >     >     Groups "datameet" group.
> >     >     To unsubscribe from this group and stop receiving emails from it,
> >     >     send an email to datameet+unsubscr...@googlegroups.com.
> >     >     For more options, visit https://groups.google.com/d/optout.
> >     >
> >     >
> >
> >     --
> >     Raphael Susewind | BGHS Bielefeld University, CSASP University of Oxford
> >           Snail Mail | Melanchthonstr. 4a, 33615 Bielefeld, Germany
> >        Papers & Blog | http://www.raphael-susewind.de
> >
> >     Please do consider http://www.gnupg.org for encryption (key id A5ED49AE)
> >
> >
> >
>
> --
> Raphael Susewind | BGHS Bielefeld University, CSASP University of Oxford
>       Snail Mail | Melanchthonstr. 4a, 33615 Bielefeld, Germany
>    Papers & Blog | http://www.raphael-susewind.de
>
> Please do consider http://www.gnupg.org for encryption (key id A5ED49AE)
>
>

