On 2014-07-18, Bear wrote:

Steganography has been around for a long time but the problem with these techniques is that they are easily defeated.

No. The trouble is that most people who build these stego applications don't seem to read their literature at *all*. For some reason, unlike in the rest of the crypto circuit, those who actually code stego work at the script kiddie level, instead of the PhD one -- which really does exist even for stego, as part of the information-theoretical viewpoint of things.

The objective with Steno.io is to bring the robustness of an electronic encryption algorithm to paper.

Seriously, that is just stupid. It has absolutely nothing to do with hardcore, statistical data hiding.

I mean, I've been thinking about how to do that for a couple of years now, so as to hide low rate text messaging within telephone audio calls. The best I've come up with are a couple of DFT and DHT compatible synchro waveforms, with baseband direct sequence spread spectrum, resynchro algorithms keyed on GSM's line protocol, stochastic waveshaping to throw any cheap, network-wide statistical recognizer off, and whatnot.

And yet I can be nowhere near sure they couldn't detect the transmission amongst the utility signal, en masse, in any case. So I shouldn't even start *coding* my solution as of now.
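For readers who haven't met the term, the bare idea of baseband direct sequence spread spectrum can be sketched in a few lines. This is only a toy under loud assumptions: it is not the scheme described above, it has no synchronization, no GSM framing and no waveshaping, and the names `pn_sequence`, `embed`, `extract`, `chips_per_bit` and `gain` are all made up for illustration. It also says nothing about detectability, which is exactly the hard part.

```python
import random


def pn_sequence(key, length):
    """Keyed pseudo-noise chips of +1/-1.

    Assumption: random.Random stands in for a proper keyed generator.
    It is NOT cryptographically secure.
    """
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(length)]


def embed(carrier, bits, key, chips_per_bit=1024, gain=0.05):
    """Spread each bit over chips_per_bit samples and add it at low amplitude."""
    needed = len(bits) * chips_per_bit
    if len(carrier) < needed:
        raise ValueError("carrier too short for this many bits")
    pn = pn_sequence(key, needed)
    out = list(carrier)
    for i, bit in enumerate(bits):
        sign = 1 if bit else -1
        for j in range(i * chips_per_bit, (i + 1) * chips_per_bit):
            out[j] += gain * sign * pn[j]
    return out


def extract(signal, n_bits, key, chips_per_bit=1024):
    """Recover bits by correlating each block against the same keyed chips."""
    pn = pn_sequence(key, n_bits * chips_per_bit)
    bits = []
    for i in range(n_bits):
        block = range(i * chips_per_bit, (i + 1) * chips_per_bit)
        corr = sum(signal[j] * pn[j] for j in block)
        bits.append(1 if corr > 0 else 0)
    return bits
```

The receiver needs the same key and the same block alignment. Per bit, the correlation grows with `gain * chips_per_bit`, while the carrier's contribution grows only with the square root of `chips_per_bit`, so a longer spread buys a quieter embedding at the cost of rate.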

So what *is* it with you people? Can't you see that steganography really starts and ends with information and coding theory, unlike cryptography? Its bounds really necessarily and from the start have to do with noise and uncertainty, whereas crypto protocols only deal with clean data and computational complexity (eventually, preferably, proven-to-be-hard one-way-functions). Steganography really is its own, separate field, even though it shares most of the randomness, signal processing, complexity and whatnot, framework, with current crypto proper. (Especially the symmetrical and streaming kind, BTW, which might be a problem and a subject for further study.)

Okay, I have a hypothetical. Let's call it the "Voynich alternative." Redirecting intellectual effort from cryptography as such to linguistics could plausibly result in an arguably practical system of storing handwritten information privately. It would be a system of limited utility at best because you'd have to actually spend up to a year or two internalizing the system in your own squishy brain before it would be usable to you, or your correspondent.

On the other hand, let's squish the brain and still do proper steganography. Also proper linguistics. With the silicon brain. How many covert bits can you really fit into a Twitter post, before NSA's silicon brain flags it as being terrorist? That's the steganographic competition for real! And we'd win it simply by numbers, if we just built a proper protocol...with the numbers actually utilizing it. So then, how do we build the protocol, and especially incentivize the numbers to adopt it?
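As a back-of-the-envelope take on the "how many bits" question, one crude upper bound treats a short text as a series of slots, each with some number of interchangeable, equally plausible wordings. The sketch below assumes exactly that and nothing more. The function name `capacity_bits` and the slot counts are hypothetical, and a real detector with a good language model would push the usable figure well below this bound.

```python
import math


def capacity_bits(choices_per_slot):
    """Upper bound in bits: a slot with k equally plausible choices carries log2(k)."""
    if any(k < 1 for k in choices_per_slot):
        raise ValueError("every slot needs at least one choice")
    return sum(math.log2(k) for k in choices_per_slot)


# Hypothetical post with four slots offering 2, 4, 1 and 8 alternatives.
print(capacity_bits([2, 4, 1, 8]))
```

A slot with only one plausible wording contributes nothing, which is why a terse or highly regular carrier leaves so little room.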

Let's imagine that there is a person who is a conlang hobbyist and has a diary which he keeps in an entirely made-up language.

Never use a conlang for this sort of thing. They're too easy to parse and much too dense to embed information in. Use something like my mother tongue, Finnish: rather regular even in orthography, but still a natural language ripe with opportunities for embedding. Especially in its various dialects.

[...] while there may be only one 'image' in the constructed language for a given proper noun, the constructed word could be the result of applying the process to any of billions of possible preimage strings - of which possibly only one or possibly as many as a few dozen are genuinely proper nouns from which it might have been derived.

That really only works with polysynthetic languages, like some of the Native American and Inuit ones. Even Finnish doesn't really carry that far. Klingon would work pretty well, as would Navajo and Taa, but pretty much anything else would be too easy to decrypt. And yet the latter actually leave quite a bit of free redundancy to be exploited.

And, to make matters worse than that, almost every *other* word in the language could also result from the same set of substitution rules, each with billions of possible preimages which might include zero, one, or as many as a few dozen completely unrelated proper nouns.

At the same time, do you actually know of a grammar framework which could actually encapsulate either of your English or of my Finnish language, fully? Generatively? Fully describing both of their natural statistics, starting from a computational model, and one which actually leads to a computationally efficient recognition framework? So that the NSA, or the GCHQ, or what was it now in Sweden and Russia, can actually find your stego in real time?

It represents a monumental amount of very much enjoyed but arguably wasted intellectual effort on his part, in much the same way that Tolkien's middle-earth languages did prior to the publication of his books.

Wasn't there supposed to be someone, somewhere, who was actually raised speaking Lojban as her (?) first language? Because while that'd be rather difficult to parse at first, over the long run the whole language has been designed to be machine parsable.

Not to mention the fact that might se'd might be a she. Oh my dear.

While not "encrypted" as such, I doubt that anyone who got their hands on his journal could, in any reasonable timeframe or possibly ever, read it.

Undoubtedly so. But then, that's not what real steganography is about. It's not about willy nilly pushing a bit or two here or there into text, or using funny words. It's about taking a statistically well-characterized carrier, in language/text, image, video, sound, whatever, and imprinting a surreptitious message upon it, without disturbing any of its visible/audible/computationally-unearthable qualities, to some chosen degree. Preferably one that you could prove to be fully undetectable, but as it goes with even symmetric cryptosystems, we don't have such unconditional proofs as of yet...
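One standard way to put a number on "to some chosen degree" comes from Cachin's information-theoretic model of steganography, which calls a scheme epsilon-secure when the relative entropy between the cover and stego distributions is at most epsilon. The sketch below is an illustrative estimate of that quantity from samples, not anything described in this message. The name `kl_divergence_bits` and the add-one smoothing are assumptions, and a per-symbol histogram is far weaker than the models a serious detector would use.

```python
import math
from collections import Counter


def kl_divergence_bits(cover_samples, stego_samples, smoothing=1.0):
    """Estimate D(cover || stego) in bits from two sample lists.

    Assumption: symbols are treated as independent draws, and additive
    smoothing keeps symbols unseen on one side from dividing by zero.
    """
    symbols = set(cover_samples) | set(stego_samples)
    cover_counts = Counter(cover_samples)
    stego_counts = Counter(stego_samples)
    cover_total = len(cover_samples) + smoothing * len(symbols)
    stego_total = len(stego_samples) + smoothing * len(symbols)
    divergence = 0.0
    for symbol in symbols:
        p = (cover_counts[symbol] + smoothing) / cover_total
        q = (stego_counts[symbol] + smoothing) / stego_total
        divergence += p * math.log2(p / q)
    return divergence
```

A value near zero only says that this particular statistic was left undisturbed. It is a necessary check rather than a proof, since a detector is free to look at higher-order structure.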

[...] it should be as impenetrable as the Voynich manuscript.

Quite. But the way you make it so is different. Nowadays you base it on information and coding theory, and also cryptography proper. Certainly not on esoteric glyphs, because they might arouse interest; no, you really ought to aim at kitteh-pic-meme-embedding or something akin to that. For carrier bandwidth, you see, and the endless variability.
--
Sampo Syreeni, aka decoy - [email protected], http://decoy.iki.fi/front
+358-40-3255353, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
_______________________________________________
cryptography mailing list
[email protected]
http://lists.randombit.net/mailman/listinfo/cryptography
