On 10/19/06, Leandro Meiners <[EMAIL PROTECTED]> wrote:
> Can anybody point me to any good references regarding traffic analysis?
This is the only interesting page I found on it:
http://guh.nu/projects/ta/safeweb/safeweb.html

There are some historical incidents that are sufficiently old to be
unclassified. For example, the Japanese left their normal Morse
operators behind when setting sail for Pearl Harbor, and those
operators continued to send transmissions as though the fleet were
still in Japan's waters. Morse operators are fairly identifiable by
their rhythm and idiosyncrasies, known collectively as their "fist".
It's just like any other behavior performed subconsciously, like
typing or signing your name; at first there's a lot of variation, and
later it becomes fairly fixed and potentially identifying.

Also during WWII, a year before D-Day, the Allies in Scotland created
a radio net that purported to be a [nonexistent] 4th Army, ostensibly
preparing to feint towards southern Norway. The purpose was to
further dilute Axis forces and keep them far enough away that they
could not participate around Normandy (there were, obviously,
numerous deception operations around D-Day). This last bit is well
documented in "The Codebreakers", which also has numerous entries in
its appendix for Traffic Analysis.

I suspect that in many instances where traffic analysis was useful,
it was necessary to make (or learn) certain assumptions about typical
traffic patterns: that orders come from the top and are disseminated
down the military hierarchy; that requests for supplies, battle
damage assessments, and other feedback flow up from the front-line
troops to the logistics units or field commanders; that traffic
increases as a major military operation approaches; and so on. In
other words, it's context-specific, and may resist generalization
into easily-remembered axioms.

Also, the Mixmaster and Cypherpunk remailer projects, AT&T's Crowds,
and the onion-routing groups probably have some papers considering
various traffic-analysis and correlation attacks against those
systems, since the traffic is encrypted inside the mixes.

One thing I have been interested in is the security of typical
plaintext Internet protocols when "secured" with SSL/TLS/IPsec. If
they don't do any padding, then the length of each step of the
protocol is effectively given away; just count how much data passes
to the recipient before data starts flowing in the opposite direction
(see the first sketch below). There is also timing information, and
it is fairly well preserved even across the Internet (see the timing
side-channel attacks against SSL).

Even if there is padding, which is basically wasted bandwidth, it may
still be possible to discern information. I've been thinking about
this, and I am not sure how to avoid it entirely without running into
other problems. For example, Unix's configuration files and
application-level TCP/IP protocols are very easy to interpret and
troubleshoot thanks to their human-readable strings. The typical
encrypted protocol uses non-textual, constant-length messages, which
can make it difficult to extend without introducing incompatibilities
(or even making different responses different lengths again, the
worst of both worlds). One doesn't typically need very extensive
decoding algorithms to make plaintext data human-readable, which is
good, because those decoding libraries also process data from remote
(untrusted) entities, form part of the attack surface, and have
proven to be security holes on more than one occasion.

One alternative I came up with is to send the entire catalog of
possible responses at the beginning of the transmission, then refer
to them by a fixed-length index; in many cases this would be a lot of
overhead. Another alternative is to have a standard catalog,
something like an SNMP MIB, that may be cached between invocations.
Nevertheless, there are many times during a protocol when you wish to
construct a response dynamically, without knowing it a priori; it
would seem difficult to deal with those cases in any other way. These
approaches could be implemented simultaneously, and perhaps one only
needs to pad when sending variable-length messages, so that "normal"
common messages don't incur any overhead (at the cost of fixed-length
and variable-length messages being distinguishable as sets, though
not distinguishable individually). In this way it is similar to what
cryptologists were doing with telegraph codebooks, which encoded
standard phrases in relatively similarly-sized units, but had to
spell out anything not in the codebook using many codes, each
signifying one letter or part of a word. A rough sketch of this
appears as the second example below.
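To make the length-counting attack concrete, here is a minimal sketch
(in Python; the record format and the sample trace are invented for
illustration) of how an eavesdropper who sees only the sizes and
directions of encrypted records might recover the length of each step
of the protocol:

# Hypothetical illustration of the length-counting attack: given the
# sizes and directions of encrypted records observed on the wire,
# recover the length of each protocol "turn" by summing bytes until
# the direction of flow reverses.  Names and the trace are invented.

def turns(records):
    """Collapse (direction, nbytes) records into per-turn totals.

    records: iterable of (direction, nbytes), direction in {'c2s', 's2c'}.
    Returns a list of (direction, total_bytes), one entry per turn.
    """
    result = []
    for direction, nbytes in records:
        if result and result[-1][0] == direction:
            # Same direction as the previous record: same turn.
            result[-1] = (direction, result[-1][1] + nbytes)
        else:
            # Direction reversed: a new step of the protocol begins.
            result.append((direction, nbytes))
    return result

# A made-up trace of an SMTP-like dialogue over TLS: even though every
# payload is encrypted, the turn lengths alone suggest which commands
# were sent (e.g. a short command versus a long message body).
trace = [('s2c', 96), ('c2s', 37), ('s2c', 64), ('c2s', 21),
         ('c2s', 18), ('s2c', 48), ('c2s', 4096), ('c2s', 1312),
         ('s2c', 43)]

for direction, total in turns(trace):
    print(f"{direction}: {total} bytes")

Matching the recovered turn lengths against the dialogues of known
plaintext protocols is then a fairly ordinary classification problem.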
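And here is a rough sketch of the catalog-plus-index idea, under
assumed framing (the catalog contents, the escape index, and the
256-byte pad block are all made up for illustration, not anything
standardized):

# Both ends share a catalog of standard messages; a message in the
# catalog is sent as a fixed-length two-byte index, and anything else
# is escaped and padded up to a block boundary.  Variable-length
# messages thus form a distinguishable set, but individual messages
# within that set of similar size are not distinguishable.

import struct

CATALOG = [b"OK", b"ERROR", b"AUTH-REQUIRED", b"GOODBYE"]  # shared a priori
ESCAPE = 0xFFFF          # index reserved for non-catalog messages
PAD_BLOCK = 256          # escaped payloads are padded to this size

def encode(msg):
    if msg in CATALOG:
        return struct.pack(">H", CATALOG.index(msg))   # always 2 bytes
    # Round (length prefix + message) up to the next pad boundary.
    padded_len = -(-(len(msg) + 2) // PAD_BLOCK) * PAD_BLOCK
    body = struct.pack(">H", len(msg)) + msg
    return struct.pack(">H", ESCAPE) + body.ljust(padded_len, b"\x00")

def decode(wire):
    (index,) = struct.unpack(">H", wire[:2])
    if index != ESCAPE:
        return CATALOG[index]
    (length,) = struct.unpack(">H", wire[2:4])
    return wire[4:4 + length]

assert decode(encode(b"OK")) == b"OK"
assert decode(encode(b"Subject: hello")) == b"Subject: hello"
# Every catalog message is 2 bytes on the wire; every escaped message
# is 2 bytes plus a multiple of 256, regardless of its exact contents.

The escape path is what permits dynamically constructed responses at
all, at the price noted above: an observer can tell escaped messages
from cataloged ones, but not one common message from another.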
If you come across any other links, please let me know, as I'd like
to add them to my page on side-channel attacks:
http://www.subspacefield.org/~travis/side_channel_attacks.html
--
"It's not like I'm encrypting... it's just that my communications
developed a massive entropy deficiency." -><-
<URL:http://www.subspacefield.org/~travis/>
GPG fingerprint: 9D3F 395A DAC5 5CCC 9066 151D 0A6B 4098 0C55 1484