Google Video and Privacy
Friday January 20, 2006, by Ed Felten
http://www.freedom-to-tinker.com/?p=956
Last week Google introduced its video service, which lets users download free or paid-for videos. The service's design is distinctive in many ways, not all of them desirable. One of the distinctive features is a DRM (anti-infringement) mechanism which is applied if the copyright owner asks for it. Today I want to discuss the design of Google Video's DRM, and especially its privacy implications.

First, some preliminaries. Google's DRM, like everybody else's, can be defeated without great difficulty. Like all DRM schemes that rely on encrypting files, it is vulnerable to capture of the decrypted file, or to capture of the keying information, either of which will let an adversary rip the video into unprotected form. My guess is that Google's decision to use DRM was driven by the insistence of copyright owners, not by any illusion that the DRM would stop infringement.

The Google DRM system works by trying to tether every protected file to a Google account, so that the account's username and password have to be entered every time the file is viewed. From the user's point of view, this has its pros and cons. On the one hand, an honest user can view his video on any Windows PC anywhere; all he has to do is move the file and then enter his username and password on the new machine. On the other hand, the system works only when connected to the net, and it carries privacy risks.

The magnitude of privacy risk depends on the details of the design. If you're going to have a DRM scheme that tethers content to user accounts, there are three basic design strategies available, which differ according to how much information is sent to Google's servers. As we'll see, Google apparently chose the design that sends the most information and so carries the highest privacy risk for users.

The first design strategy is to encrypt files so that they can be decrypted without any participation by the server.
You create an encryption key that is derived from the username and password associated with the user's Google account, and you encrypt the video under that key. When the user wants to play the video, software on the user's own machine prompts for the username and password, derives the key, decrypts the video, and plays it. The user can play the video as often as she likes, without the server being notified. (The server participates only when the user initially buys the video.)

This design is great from a privacy standpoint, but it suffers from two main drawbacks. First, if the user changes the password in her Google account, there is no practical way to update the user's video files. The videos can only be decrypted with the user's old password (the one that was current when she bought the videos), which will be confusing. Second, there is really no defense against account-sharing attacks, where a large group of users shares a single Google account and then passes around videos freely among themselves.

The second design tries to address both of these problems. In this design, a user's files are encrypted under a key that Google knows. Before the user can watch videos on a particular machine, she has to activate her account on that machine, by sending her username and password to a Google server, which then sends back a key that allows the unlocking of that user's videos on that machine. Activation of a machine can last for days, or weeks, or even forever.

This design addresses the password-change problem, because the Google server always knows the user's current password, so it can require the current password to activate an account. It also addresses the account-sharing attack, because a widely shared account will be activated on a suspiciously large number of machines. By watching where and how often an account is activated, Google can spot sharing of the account, at least if it is shared widely.
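The server side of this second design can be sketched in a few lines of Python. Everything here is hypothetical — the class name, the sharing threshold, and the key handling are my own illustration, since Google has not published its implementation — but it shows how checking the current password fixes the password-change problem, and how counting activated machines exposes a widely shared account:

```python
import time

# Hypothetical sketch of the second design's server-side logic.
SHARING_THRESHOLD = 5  # illustrative limit on distinct machines per account

class ActivationServer:
    def __init__(self, accounts):
        self.accounts = accounts   # username -> current password
        self.activations = {}      # username -> {machine_id: first activation time}
        self.user_keys = {}        # username -> key that unlocks that user's videos

    def activate(self, username, password, machine_id):
        # The server always checks the *current* password, which is what
        # fixes the password-change problem of the first design.
        if self.accounts.get(username) != password:
            raise PermissionError("bad credentials")
        machines = self.activations.setdefault(username, {})
        machines.setdefault(machine_id, time.time())
        # A widely shared account shows up as activations on many machines.
        if len(machines) > SHARING_THRESHOLD:
            raise PermissionError("account activated on too many machines")
        # The server knows the key, so it can re-issue it to any activated machine.
        return self.user_keys.setdefault(username, b"per-user-content-key")

server = ActivationServer({"alice": "s3cret"})
key = server.activate("alice", "s3cret", "laptop-1")  # first machine: allowed
```

Note that the server records only which machines an account was activated on, and when — nothing here logs individual viewings, which is exactly the privacy property discussed below.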
In this second design, more information flows to Google's servers: Google learns which machines the user watches videos on, and when the user first uses each of those machines. But Google doesn't learn which videos were watched when, or which videos were watched on which machine, or exactly when the user watches videos on a given machine (after the initial activation). This design does have privacy drawbacks for users, but I think few users would complain.

In the third design, the user's computer contacts Google's server every time the user wants to watch a protected video, transmitting the username and password, and possibly the identity of the video being watched. The server then provides the decryption key needed to watch that particular video; after showing the video, the software on the user's computer discards the key, so that another handshake with the server is needed if the user wants to watch the same video later.

Google hasn't revealed whether or not they send the identity of the video to the server. There are two pieces of evidence to suggest that they probably do send it. First, sending it is the simplest design strategy, given the other things we know about Google's design. Second, Google has not said that they don't send it, despite some privacy complaints about the system. It's a bit disappointing that they haven't answered this question one way or the other, either to disclose what information they're collecting, or to reassure their users. I'd be willing to bet that they do send the identity of the video, but that bet is not a sure thing.

This third design is the worst one from a privacy standpoint, giving the server a full log of exactly where and when the user watches videos, and probably which videos she watches. Compared to the second design, this one creates more privacy risk but has few if any advantages. The extra information sent to the server seems to have little if any value in stopping infringement.
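A sketch makes the privacy cost of this third design concrete. Again, all names here are illustrative assumptions, not Google's actual code; the point is that a per-play handshake necessarily leaves the server holding a complete viewing log:

```python
import time

# Illustrative sketch of the third design: the client must fetch a key
# from the server on every playback, so the server accumulates a full
# record of who watched which video, and when.

class PerPlayServer:
    def __init__(self, accounts):
        self.accounts = accounts   # username -> current password
        self.video_keys = {}       # video_id -> per-video key
        self.view_log = []         # (username, video_id, timestamp)

    def request_key(self, username, password, video_id):
        if self.accounts.get(username) != password:
            raise PermissionError("bad credentials")
        # This log entry is the privacy cost of the design: the server
        # learns exactly who watched which video, and when.
        self.view_log.append((username, video_id, time.time()))
        return self.video_keys.setdefault(video_id, b"key-" + video_id.encode())

def play(server, username, password, video_id):
    key = server.request_key(username, password, video_id)
    # ... decrypt and display the video with `key` ...
    del key  # the client discards the key after playback, so the next
             # viewing requires another handshake with the server

server = PerPlayServer({"alice": "s3cret"})
play(server, "alice", "s3cret", "video-42")
play(server, "alice", "s3cret", "video-42")  # watched twice, so logged twice
```

If the client omitted `video_id` from the request (the question Google hasn't answered), the log would shrink to who-watched-when; with it, the server sees the full viewing history.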
So why did Google choose a less privacy-friendly solution, even though it provided no real advantage over a more privacy-friendly one? Here I can only speculate. My guess is that Google is not as attuned to this kind of privacy issue as it should be. The company is used to logging lots of information about how customers use its services, so a logging-intensive solution would probably seem natural, or at least less unnatural, to its engineers.

In this regard, Google's famous "don't be evil" motto, and customers' general trust that the company won't be evil, may get Google into trouble. As more and more data builds up in the company's disk farms, the temptation to be evil only increases. Even if the company itself stays non-evil, its data trove will be a massive temptation for others to do evil. A rogue employee, an intruder, or just an accidental data leak could cause huge problems. And if customers ever decide that Google might be evil, or cause evil, or carelessly enable evil, the backlash would be severe.

Privacy is for Google what security is for Microsoft. At some point Microsoft realized that a chain of security disasters was one of the few things that could knock the company off its perch. And so Bill Gates famously declared security to be job one, thousands of developers were retrained, and Microsoft tried to change its culture to take security more seriously.

It's high time for Google to figure out that it is one or two privacy disasters away from becoming just another Internet company. The time is now for Google to become a privacy leader. Fixing the privacy issues in its video DRM would be a small step toward that goal.
