Re: [datameet] Security Issues with the Voter List

2014-05-18 Thread Gautam John
Something I read today:

http://www.medianama.com/2014/05/223-modak-marketing-election-voter-india/

-- 
Datameet is a community of Data Science enthusiasts in India. Know more about 
us by visiting http://datameet.org
--- 
You received this message because you are subscribed to the Google Groups 
datameet group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to datameet+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [datameet] Security Issues with the Voter List

2014-05-18 Thread Snehashish Ghosh
Dear Gautam,

Thank you. This is very interesting. I wrote a piece on this issue right
after the failed Google-ECI deal in February: http://goo.gl/e9Xea0
The UK approach seems to be a good one. In the UK there are two voter lists:
the full list and the edited list. You can choose to be removed from the edited
list at the time of registration or at any time thereafter. The edited
list is available in the public domain and the full list is safeguarded by
purpose limitation and UK data protection law.

~Snehashish


Re: [datameet] Security Issues with the Voter List

2014-04-14 Thread Raphael Susewind
As a follow-up to this discussion:

electoralsearch.in began to implement rate limiting and selective IP
blocking yesterday. Sad as this is for my own research purposes, I
welcome the step from a privacy point of view...

Raphael
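
For readers unfamiliar with the term, per-IP rate limiting can be sketched
roughly as below. This is a generic token-bucket illustration and not a
description of how electoralsearch.in implements it; the class name
TokenBucketLimiter and the parameters rate and capacity are hypothetical,
chosen only for this example.

```python
import time


class TokenBucketLimiter:
    """Hypothetical per-client token bucket.

    Each client may make `rate` requests per second on average,
    with bursts of up to `capacity` requests.
    """

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        # client key -> (tokens remaining, time of last update)
        self.buckets = {}

    def allow(self, client_ip):
        """Return True if this request may proceed, False if it is throttled."""
        now = self.clock()
        tokens, last = self.buckets.get(client_ip, (self.capacity, now))
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[client_ip] = (tokens - 1, now)
            return True
        self.buckets[client_ip] = (tokens, now)
        return False
```

A bucket like this slows bulk downloading from a single address while
leaving occasional individual lookups unaffected. Selective IP blocking
would be a separate step on top of it.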


Re: [datameet] Security Issues with the Voter List

2014-04-11 Thread Raphael Susewind
Hi Devdatta and Avinash,

yes, I, too, am frankly surprised at the ease with which one can access
sensitive data in bulk. Not only PDF rolls and voter details, but also
things such as land records, BPL lists, and much more - I think we are
in an exciting as well as dangerous phase of fairly uncontrolled,
nascent e-Governance practices. But I think the ethical issues here are
a little more complex than mere privacy concerns.

Upfront, I must admit that I use all the above sources for academic
research (in UP and across India). What Avinash described in principle
and with the example of Delhi can indeed be done on an all-India scale,
and I am sure there are more people than just me who do it.

But then the social sciences have long dealt with sensitive data and
developed protocols to protect it. Even though the data is publicly
available, I for instance have my own copy on a secure workstation with
full disk encryption and two-factor authentication. Whenever possible, I
also work on anonymized subsets of data. Yet there are other potential
uses - some of the more worrisome you pointed out - which are not bound
by such data protection standards.
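
As an illustration of what working on a protected subset can look like, here
is a minimal sketch of keyed pseudonymization. It is an assumption for this
example only, not the protocol described in this message; the function name
pseudonymize and the field names voter_id, name and address are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(record, secret_key, id_field="voter_id",
                 drop_fields=("name", "address")):
    """Return a copy of `record` without direct identifiers.

    The ID is replaced by a keyed hash (HMAC-SHA256), so the same voter
    always maps to the same pseudonym, but the pseudonym cannot be
    recomputed without `secret_key`.
    """
    # Drop the fields that identify a person directly.
    out = {k: v for k, v in record.items() if k not in drop_fields}
    digest = hmac.new(secret_key, record[id_field].encode("utf-8"),
                      hashlib.sha256).hexdigest()
    # A shortened digest is enough to act as a stable join key.
    out[id_field] = digest[:16]
    return out
```

Note that this is pseudonymization rather than full anonymization: the
remaining fields (age, locality and so on) may still allow re-identification
when combined with other datasets.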

To me, this once more highlights the nascent stage of ethical standards
around Big Data and eGovernance. On the plus side, I am happy to have
that kind of access to conduct research which will ultimately be
ethically beneficial, leading to better understanding of social issues
and potentially to better policy advice. Also, there is a point to be
made that transparency is an important asset in elections in particular,
not only in terms of individual electoral search functions, but also in
terms of publicly accessible (and cross-checkable, publicly verifiable)
PDF rolls. Finally, a lot of this data had been available in the past as
well, only in distributed and/or commercial form, which means there had
been a hierarchy of access: small-time crooks could not use it, but
large-time crooks were always able to use it. Likewise, scholars at
large (often foreign) universities were able to use it, but not smaller
ones (this is still true for some data, geodata in particular, which I
can only access because of Ivy-League contacts and only process because
of an association with Oxford University).

The ethical challenge as I see it thus comes not from data availability
per se, but from the bulk accessibility and processability of data, as
well as the potential to link otherwise disconnected datasets with each
other (for instance a voter ID from the rolls to the online electoral
search mechanism to that voter's polling booth locality to the ration
card of a person with the same name registered at a ration shop in close
spatial proximity to the amount of rice that person obtained last week,
all coupled - in case of my own research - to that person's religious
identity through a name-matching algorithm). And this IS an ethical
challenge indeed, particularly if one leaves the ivory tower of
academia, where ethical standards for such data are more ingrained, and
more adhered to. One need not go all the way to the various criminal
uses of such data - are we all happy with commercial use, to start with?

I have no easy answers here, because I think the ethical issue is fairly
complex, balancing privacy and personal security against transparency in
the political process and legitimate academic use of data (also because
I think the answer must be found in India through political
deliberation, and not in German academia). Still, in the end, I have to
admit that I often leave my desk in the evening with quite some unease
over the sheer wealth of private data that I work with...

What do others think?
Raphael

On 11.04.2014 06:57, Avinash Celestine wrote:
 Hi Devdatta
 
 Yes, though (and in the current context, I suppose that's a good thing),
 it's not so easy for some other states such as UP, due to certain
 problems with the way the PDFs are encoded. Raphael, who is on this
 group, will testify to that...
 
 I had alluded to this sometime back...
 
 https://storify.com/ac_soc/voter-profiling
 
 Avinash
 
 
 
 
 On Fri, Apr 11, 2014 at 9:55 AM, Devdatta Tengshe devda...@tengshe.in wrote:
 
 Hi,
 I found this interesting article by a guy who downloaded and
 processed the Voter list of Delhi: https://medium.com/p/1aff55526881
 
 I found this via a discussion on Reddit:
 
 http://www.reddit.com/r/programming/comments/22pn8u/i_wrote_a_few_simple_python_scripts_to_retrieve/
 
 I'd like to quote his findings here:
 
  1. It is possible to automate the retrieval of every single PDF
 roll all across India
  2. These PDFs can then be processed in a matter of minutes to
 produce details like Addresses, names, father's name, gender,
 age and voters ID number for every single registered voter of India
  3. Nearly 25% of the Voter IDs assigned within only Delhi fail to
 

Re: [datameet] Security Issues with the Voter List

2014-04-11 Thread Gautam John
Leaving aside my earlier comment as perhaps tongue in cheek, the
electoral rolls are *meant* to be public. The Registration of Electors
Rules, 1960 makes that clear. However, your larger point is well made.
Maybe what needs to be done is to *de-centralise* the storage? That
would fulfil the requirements of the Registration of Electors Rules, 1960
while making it harder to do something like this.

It says: As soon as the roll for a constituency is ready, the
registration officer shall publish it in draft by making a copy
thereof available for inspection and displaying a notice in Form 5--
(a) at his office, if it is within the constituency, and (b) at such
place in the constituency as may be specified by him for the purpose,
if his office is outside the constituency; [or in the official
website of the Chief Electoral Officer of the concerned State:]
[Provided that where such draft contains names of overseas electors,
the copies of such rolls shall also be published in the Electronic
Gazette [or in the official website of the Chief Electoral Officer
of the concerned State].]

The Representation of the People Act, 1951 contains this: The
Government shall, at any election to be held for the purposes of
constituting the House of the People or the Legislative Assembly of a
State, supply, free of cost, to the candidates of recognised political
parties such number of copies of the electoral roll, as finally
published ...

Worth asking if we want political parties to have free access to it
but not citizens.

-- 
For more details about this list
http://datameet.org/discussions/


Re: [datameet] Security Issues with the Voter List

2014-04-11 Thread Chandrashekhar Raman
Raphael, you raise very pertinent issues.

We as a community love open data and in this country there is a lot that
can be done to free all kinds of data so that it can be made use of in a
good way (election data in an aggregated form is one example). But at the
same time there are certain kinds of data which are not open (I mean not
open in a machine-readable format) for a good reason. I believe voter roll
data is one such type. In the past voter lists have been used to pinpoint
members of specific communities, who were then targeted with gruesome
effect. I shudder to think what happens if it is automated: a 'riot app'?

As Raphael points out this is not just about privacy, but could be much
worse.

This group is a fantastic initiative and as it evolves, it would be great
for us to involve more social scientists and policy experts - so as we
advocate vociferously to free more data and make it open - we can also
bring in the technical expertise here to recommend where data needs to be
better protected and how.

cs



Re: [datameet] Security Issues with the Voter List

2014-04-11 Thread Avinash Celestine
Hi Gautam

I don't think the issue is with having the electoral roll available publicly
per se. Personally, I think it's better that the rolls are available in the
open, as compared with the alternative, where they are confidential, thus
leaving them open to other types of abuse.

But I do think that certain minimum safeguards should be in place - even
something as simple as a captcha code (as mentioned in the link which
started off this thread), to deter heavy bulk downloading... it seems to me
the bare minimum.

Now, will this stop me from searching for someone specific within the
voters list that I want to target, given that I have a rough idea of where
they live? Certainly not.

Coupled with this is the irony that other datasets for which there is
absolutely no reason for secrecy (at least I can't conceive of a reason for
it - maybe it's pure bureaucracy) are extremely difficult to get. A case in
point is any official version of the PC, AC shapefiles which Raphael and
others on this group have been trying so hard to create.

Raphael is right - these are complex issues. And we have barely begun to
scratch the surface of what should be done. Interestingly, in the reddit
thread linked above, there are references to the fact that New York or
Sweden too provide vast amounts of personal information for little or no
fee...

Avinash






Re: [datameet] Security Issues with the Voter List

2014-04-11 Thread Raphael Susewind
Chandrashekhar,

just on the specific issues of targeting communities, which I have
thought about a great deal (my first book was on post-2002 Gujarat), my
tentative conclusion is this:

The fact that electoral rolls had been used in the past in riots before
they were available online shows that rioters, if they want to, can
access this data already. As Gautam pointed out, it IS public by law.
What changes is merely the scale of data availability. Large-scale data
would only be 'more useful' for large-scale targeting, however
(small-scale targeting is possible already), which I don't see happening
at this time (with the troublesome exception of Gujarat, particularly
troublesome now that Mr Modi runs for PM - but here, too, the targeting
happened in small units on the ground, even though coordination took
place higher up). On the other hand, fine-grained large-scale data is
absolutely necessary to understand a range of issues about (religious,
caste) economic position. So that in this specific case, we have
additional benefits but no additional risk (beyond the worrisome risk
already out there)...

More detailed arguments about this in a forthcoming paper of mine at
http://pub.uni-bielefeld.de/publication/2631138

Best,
Raphael


Re: [datameet] Security Issues with the Voter List

2014-04-11 Thread Chandrashekhar Raman
Raphael, to clarify, I am not trying to make a case against availability of
fine-grained data. Far from it, I'm with you on this argument among others
that are made spuriously to restrict access. I might have stretched the
point but then again - killing is just one extreme form of discrimination -
there are others that are less visible.

You summed it up very well: it's good to have a healthy caution and unease
when dealing with some of this data; there are probably no simple answers
here.

will read the paper at leisure.

cs.



Re: [datameet] Security Issues with the Voter List

2014-04-10 Thread Gautam John
Not sure this is a flaw. Maybe it's a feature? :D
