Are there any statisticians out there? Here's a question: Suppose you have a two-candidate election with 10,000,000 voters, and the computer says that candidate A beats candidate B 51% to 49%. How many randomly-selected ballots would you need to inspect and correlate to the ballot database to confirm the election result with 99.99% confidence? On the other hand, if the ballots had no information associating them with their corresponding database records, how many randomly-selected ballots would you need to count to achieve the same 99.99% confidence level? (Maybe a partial recount based on a small random sampling of the ballots would be a viable alternative to ballot serialization.)
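A rough way to put numbers on both questions, offered as a back-of-envelope sketch rather than a vetted audit procedure. The function names are made up for illustration. The first case assumes that each misrecorded ballot can shift the margin by at most two votes, so a wrong winner at a 2% margin needs at least 1% of the records to be wrong, and it asks how many ballots must be checked before at least one bad record would almost surely turn up. The second case uses a normal approximation and assumes the sample itself splits about 51/49.

```python
import math
from statistics import NormalDist


def comparison_sample_size(margin, confidence):
    """Ballots to check against their database records (serialized case).

    Assumes a wrong outcome requires at least margin/2 of all records to
    be wrong, and treats draws as independent (with replacement), which
    is slightly conservative for a large election.
    """
    bad_fraction = margin / 2
    # Smallest n with (1 - bad_fraction)**n <= 1 - confidence
    return math.ceil(math.log(1 - confidence) / math.log(1 - bad_fraction))


def polling_sample_size(winner_share, confidence):
    """Ballots to count when they cannot be matched to records.

    One-sided normal approximation: the sample share must sit z standard
    errors above 50%. Assumes the sample split is close to winner_share.
    """
    z = NormalDist().inv_cdf(confidence)
    gap = winner_share - 0.5
    return math.ceil(z * z * winner_share * (1 - winner_share) / (gap * gap))


if __name__ == "__main__":
    print(comparison_sample_size(0.02, 0.9999))  # on the order of 900
    print(polling_sample_size(0.51, 0.9999))     # on the order of 35,000
```

Under these assumptions the serialized check needs roughly a thousand ballots and the unserialized count needs tens of thousands. Neither figure depends much on the 10,000,000 total.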
Ken Johnson
Dave Ketchum wrote:
Perhaps we fight too much:
After the election each ballot has a home on the CD, with a fixed position.
This position must get decided before the ballot gets recorded.
It could get decided before the backup copy gets printed.
Thus it could get printed on the backup copy.
BUT, it is NOT ACCEPTABLE for this value to be knowable. So, find a way to print it without it being seen before polls close and ballot box gets shuffled, and I have no complaints.
Note that this is an addendum, done after I drafted what follows.
On Mon, 17 Nov 2003 13:34:45 -0800 Ken Johnson wrote:
>
> Dave Ketchum wrote:
>
>>
>> Either:
>> There is information in attaching this serial number to a ballot,
>> in which case I object, or
>> There is no such information, in which case I cannot imagine why
>> you would care.
>
>
> There is no information connecting the ballot with the voter. The ballot
> serialization is analogous to storing the ballot storage location with
> each database record, and it serves no purpose other than enabling
> election officials to confirm that individual ballots are correctly
> recorded in the database.
>
If "There is no information" you have not explained why you need the serials.
If there is information that people can use, we can plan on the wrong people using it sooner or later.
What is needed is more effort than we have had as to getting it done right - especially getting vendors to WANT to get it done right - which present vendor secrecy weakens.
>>
>>
>> No one talks seriously of shuffling 10,000,000 paper ballots, but New
>> York law does demand that the few hundred that accumulate in a day at
>> a polling station SHALL be shuffled BEFORE looking at content.
>
>
> If only 10 people voted at the station, and they all voted Green Party,
> the shuffling won't do much good :) On the other hand, ballot
> serialization would be fully randomized so you can't tell from the
> database where the votes came from.
>
Agreed there will be cases where there is no secrecy - such as a single paper ballot for a precinct where most voting is done via lever or electronic machines. This has always existed. It also seems less destructive after study than it seems at first glance - either:
Election is far from a tie, so the evil doers have problems other than this one, or
There are other voters who also pleased or displeased the evil doers, so this one or few non-secret votes matter little.
>>
>> On this reflector I have been specifying that electronic ballots SHALL
>> be stored randomly, such that you cannot tell by storage position
>> where the first or last ballot of the election got stored. The only
>> detail I am willing to concede is that, if there are too many ballots
>> at a polling station to all be stored in a "reasonable" sized area,
>> make the area big enough for reasonable randomizing, and write an
>> area's worth each time it fills up.
>>
>> I have also specified that if a paper backup ballot gets printed,
>> those SHALL go in a ballot box, and the content SHALL get shuffled
>> just as would have been done with paper ballots without computers.
>>
>> I do know, from having voted this month, that there is a record made
>> as to my being the nth voter (and there was ZERO secrecy as to my
>> being the nth voter). If my ballot automatically ended up in the nth
>> position in an electronic file there would be ZERO secrecy.
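The "fill an area, randomize it, write it out" idea quoted above could look something like the sketch below. The class and its names are hypothetical, and a plain list stands in for the CD or other storage medium. It assumes the operating system's random source is good enough for the shuffle.

```python
import secrets


class ShuffledBallotStore:
    """Buffers ballots and writes them out one shuffled area at a time."""

    def __init__(self, area_size):
        self.area_size = area_size
        self._buffer = []
        self.written = []  # stand-in for the permanent storage medium
        self._rng = secrets.SystemRandom()  # OS-provided randomness

    def record(self, ballot):
        """Hold the ballot in memory; write an area once it fills up."""
        self._buffer.append(ballot)
        if len(self._buffer) >= self.area_size:
            self.flush()

    def flush(self):
        """Shuffle whatever is buffered and append it to storage.

        Call once more at poll close to write the final partial area.
        """
        self._rng.shuffle(self._buffer)
        self.written.extend(self._buffer)
        self._buffer.clear()
```

Position within an area says nothing about voting order, but which area a ballot landed in still does, so the area size sets how much secrecy the scheme provides.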
>
>
> I see no reason why "n" needs to be stored anywhere. Why should anyone
> care that you were the n-th voter? All that matters is that there is a
> certified record that (1) you are legally registered to vote, (2) you
> voted, and (3) you did not vote more than once.
>
And we have that without your serialization.
We have a serial of sorts - the ballot's position in the records - I just want ZERO relation of that to voter ID.
>>
>> Helps nothing if the information is distributed such that the computer
>> storing the voted ballots knows nothing of voter IDs, and the people
>> and/or computers recording which voter was nth in line are different.
>> Troublemakers would have no trouble correlating these two databases.
>
>
> n should not be stored and the two databases should be entirely separate
> with no way of cross-correlating them. Moreover, I don't think the
> second database (the one recording the fact that you voted) need be nor
> should be machine-readable. All they need to do is mark off (e.g.
> initial, stamp, or red-line) your name on a printed list of registered
> voters.
>
Machine readability is not that important. If the information is readable by machine for humans, or directly readable by humans, the wrong humans can do the reading.
> The ballot ID is only used to correlate database ballot records with
> paper ballots. This is necessary because the paper ballots - not the
> database - constitute the official, legal record of the votes. Any
> challenge to the election validity is resolved by inspecting the paper
> ballots, not the database, so the computer-generated election result
> should not be certified unless and until it is confirmed that the
> database correlates to the paper ballots.
>
You do not need a serial to answer whether:
There are the same quantity of ballots both places.
And, for each voting pattern, the same quantity both places.
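The serial-free check proposed here amounts to comparing two tallies of voting patterns. A minimal sketch, with a hypothetical function name and each ballot represented as any hashable value such as a string or tuple of selections:

```python
from collections import Counter


def pattern_counts_match(paper_ballots, db_records):
    """Compare how often each voting pattern occurs in both places.

    Returns (match, discrepancy), where discrepancy counts the patterns
    that appear more often on one side than the other.
    """
    paper = Counter(paper_ballots)
    db = Counter(db_records)
    discrepancy = (paper - db) + (db - paper)
    return paper == db, discrepancy
```

Equal tallies imply equal totals, so both conditions are covered. The paper-side tally, though, has to be built from every paper ballot.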
> I agree that voter secrecy is a concern, but I think ballot
> serialization can be implemented without compromising secrecy. My
> greater concern is the possibility that a clever hacker might find a way
> to alter the election results. Security concerns can be partially
> alleviated by mandating the use of open-source election software, but I
> think it's even more important that the raw data on which the software
> operates be freely available and subject to independent verification.
As to the hackers - DO NOT open the door when they knock. This is why I do not want to involve the internet.
> The verification means should be simple, transparent, and should not
> require a high level of computer expertise or training to understand
> and implement. The verification process should confirm the following:
> (1) Every printed ballot corresponds to a database ballot record
> containing the same vote selections.
As noted above, you can get this far by counting how many of each pattern exist.
But how do you do it without doing a full manual recount?
> (2) No two database records correspond to the same printed ballot.
Needs saying more carefully - defense against errors can include recording multiple copies of each ballot.
> (3) The number of printed ballots equals the number of database records.
Agreed - though there could be more problems making sure nothing destructive happened to the printed ballots.
> (4) The voting tallies generated from the database agree with the
> reported results.
> Without unique ballot ID's there is no way to confirm #1 and #2 without
> essentially doing a full manual recount and re-creating the full
> database. With ballot ID's #2 is a simple uniqueness test. #1 still
> requires inspection of the paper ballots, but a small random sample can
> be inspected to confirm #1 with very high statistical confidence. (Even
> if you counted all the ballots, statistical confidence would not improve
> because people make counting errors.) To fully validate the election, it
> also needs to be confirmed that
> (5) Only legally registered voters voted.
> (6) No one voted more than once.
> (7) The number of voters matches the number of ballots.
> These confirmations are made using records (preferably written) having
> no relation to the ballot database, except for the total ballot count.
>
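Checks (1) to (3) in the list above, as they might be run when every ballot carries an ID. This is only an illustrative sketch: the function name and data shapes are assumptions, with the database given as (ballot_id, selections) pairs and the paper ballots as a dict keyed by the printed ID.

```python
import random


def verify_with_ballot_ids(db_records, paper_ballots, sample_size, rng=None):
    """Run checks (2) and (3) in full and check (1) on a random sample.

    db_records:    list of (ballot_id, selections) pairs
    paper_ballots: dict mapping printed ballot_id -> selections
    Returns (ids_unique, counts_match, mismatched_ids).
    """
    rng = rng or random.Random()
    ids = [ballot_id for ballot_id, _ in db_records]

    ids_unique = len(set(ids)) == len(ids)         # check (2)
    counts_match = len(ids) == len(paper_ballots)  # check (3)

    # Check (1): pull random paper ballots and look up their records.
    db = dict(db_records)
    n = min(sample_size, len(paper_ballots))
    sample = rng.sample(sorted(paper_ballots), n)
    mismatched_ids = [b for b in sample if db.get(b) != paper_ballots[b]]

    return ids_unique, counts_match, mismatched_ids
```

Check (4) is then an ordinary re-tally of the database. Any mismatch in the sample is a reason to look further, not a measurement of how many records are wrong.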
Agreed that 5/6/7 are needed work, but they are outside current discussion.
Except to the extent that they relate to concerns about voter secrecy, and the problem of provably certifying accuracy of the election result without compromising secrecy.
> Ken Johnson
---- Election-methods mailing list - see http://electorama.com/em for list info
