Being familiar with ISP practice in this area, I can say that what matters is why you examine the content and what you do with the knowledge of the content observed, be it stored in your head or on disk.
It's pretty well established that one may monitor traffic in a general way in order to figure out what's up, make and enforce policy, and so on. One cannot monitor or record a particular user's traffic and then disclose that traffic, or use it for or against them or for oneself.

A few examples. Think about what is or is not legal and why.

1a - Any content based IDS such as Bro.
1b - Any content based traffic shapers, balancers, etc.
1c - Any mail virus scanner, NetNanny, etc.
1d - Nagios, OpenNMS, netflow, HP OpenView, and so on.

You can sure bet the purveyors of these products do not develop these systems in a pristine air gap lab environment using only traffic they generated. And they are deployed on real data.

2 - Any ISP trying to figure out why their traffic just trended up by 50% the last month. Any LAN admin trying to figure out why their T1 is saturated.

3a - Any network research group, whether private, institutional or white/gray/black. Bugtraq/FullDisclosure, Defcon presentations, live demos, etc.
3b - That guy who snooped Tor and published embassy passwords.

4a - Employer x, checking up on adherence to corporate email policy, reading random mails in the process.
4b - Finding out that you enjoy watching the mating habits of penguins on PBS and then wondering why you have fewer friends in the lunchroom at work the next day.

5a - Social networking sites selling 'demographic and statistical' data to places like Intelius.
5b - Google trawling through your email to display targeted ads and do who knows what else with it.
5c - This call may be monitored or recorded.

These are all black areas that are hard to get internal facts about unless you work deep inside where it happens. Some is ok, some is untrustworthy, some is evil.

6 - The US govt itself, and other countries, with their tap-the-entire-internet projects. Some of this, and the handling of product from it, is known to be illegal; it's just so black that no one has been able to prove it yet.
7 - The thousands of networked entities that use netflow and other statistical and content analysis tools 24x7x365 without concern.

8 - Public records requests for netflow data from public institutions. Yes, they have had to disclose them.

It's safe to snoop port 43 for this purpose and say I found:

200 whois queries to known public servers x, y and z.
53 HTTP GETs.
34 plaintext irc sessions to these public ircnets.
22 initial ssh fingerprints.
16 encrypted sessions to somewhere inside the pentagon.

But not safe to say:

200 whois queries for these domains, some of which sent their domain passwords over port 80 to the registrar, here's the tokens.
53 HTTP GETs to a hapless bank x, here's Tony's info.
34 irc sessions of Linda and Mark cybering, check out this conversation.
38 encrypted sessions where I further MITM'd them and here's their contents.

For the most part, in the US, an exit node operator is an ISP. They are subject to common carrier, DMCA, ECPA and so on as it applies to their role as an ISP. And ISPs also have the right to protect, monitor, price and modify their nets as is standard industry practice. And to shield themselves from potential liability or legal expenditure and entanglement by dropping traffic that is too risky to handle, so long as it's done agnostically.

If I were running an exit in the US, I'd be VERY happy to distill any amount of stats, be it IP or content based, and post them here. Including the number of times I saw the phrase 'I eat boogers' on my wire. It's just stats. And heck no, I'd never save or post the raw content, that's nuts.

IANAL, jail may occur, subscribe to NANOG, ask your lawyer, EFF, etc.
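
To make the stats-versus-content distinction above concrete, here is a rough sketch of what "just stats" could look like in practice. It is an illustration only, not how any particular exit or ISP does it: the function name summarize, the SERVICES port map and the (dst_port, payload) input shape are all made up for the example, and it assumes something else has already handed you packets in that form. The point is that only counters leave the function; no payload is stored, logged or returned.

```python
from collections import Counter

# Hypothetical port-to-label map, invented for this sketch; extend as needed.
SERVICES = {22: "ssh", 43: "whois", 80: "http", 6667: "irc"}


def summarize(packets, phrase=b"I eat boogers"):
    """Reduce (dst_port, payload_bytes) pairs to aggregate counts only.

    Returns (per-service packet counts, number of times phrase was seen).
    Raw payloads are inspected in passing and never retained.
    """
    per_service = Counter()
    phrase_hits = 0
    for dst_port, payload in packets:
        per_service[SERVICES.get(dst_port, "other")] += 1
        # Count occurrences within this payload, then let it go.
        phrase_hits += payload.count(phrase)
    return dict(per_service), phrase_hits


if __name__ == "__main__":
    sample = [
        (43, b"example.com\r\n"),
        (80, b"GET / HTTP/1.1\r\n"),
        (6667, b"PRIVMSG #chan :I eat boogers\r\n"),
        (443, b"\x16\x03\x01"),
    ]
    stats, hits = summarize(sample)
    print(stats, hits)
```

Note the sketch counts packets rather than sessions and will miss a phrase split across two packets, so treat the numbers as approximate. What you would post is the returned counts, never the sample list itself.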

