So, there's an interesting problem I ran into:

Suppose you need to obfuscate symbols in a symbol file, and still be
able to look them up later to analyze crash dumps.

Suppose further that the symbol file format is kinda weird, and so you
can't re-create the symbol file from scratch.

IOW, the obfuscated symbols have to be changed in-place.

Hashes are right out: a 160-bit digest is longer than many of the
symbols, so it would not fit in place.  A hash would be of limited use
anyway, since the input space is small enough that one could easily
guess all the symbols and hash them - a classic dictionary attack.
Hardening against guessing might help some, but a salt would have to
be stored somewhere, which only makes the space problem worse.

Assigning ordinal numbers to the symbols would work, but it is ruled
out for other, procedural reasons - namely, the obfuscated value
should not change from release to release, and propagating the old
ordinal values with each release is a pain.

The simple answer seems to be encryption, which can operate without
adding any metadata, but there's a catch: the symbol file format
expects the values to be symbols - not arbitrary octets.

So, we have a non-uniform input space - let's say printable characters.

It would seem simple to figure out how many characters there are (n),
and compute a base n value for the symbol, encrypt it, and reduce it
to base n.
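
A rough sketch of the base-n conversion, to make the idea concrete.
The alphabet ('!' through '~', so n = 94), the function names and the
9-character limit are all my own assumptions for illustration; real
symbols are longer and would need a bignum type instead of a 64-bit
integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Assumption: the alphabet is printable ASCII '!'..'~', so n = 94.
constexpr std::uint64_t kBase = 94;
constexpr char kFirst = '!';

// Interpret a symbol as a base-94 number, most significant digit first.
// Only symbols of up to 9 characters fit in 64 bits (94^9 < 2^64).
std::uint64_t symbol_to_number(const std::string& sym) {
    assert(sym.size() <= 9);
    std::uint64_t value = 0;
    for (unsigned char c : sym) {
        assert(c >= '!' && c <= '~');
        value = value * kBase + static_cast<std::uint64_t>(c - kFirst);
    }
    return value;
}

// Inverse: write `value` as exactly `length` base-94 digits.
std::string number_to_symbol(std::uint64_t value, std::size_t length) {
    std::string sym(length, kFirst);
    for (std::size_t i = length; i-- > 0;) {
        sym[i] = static_cast<char>(kFirst + value % kBase);
        value /= kBase;
    }
    assert(value == 0);  // value must fit in `length` digits
    return sym;
}
```

Note that the length has to be carried separately: '!' is the zero
digit here, so leading '!' characters vanish in the number.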

This could even be done with hairier input spaces, for example the
first octet must be a letter, but later octets can be underscores,
numbers or letters, etc.  I suppose you could assume that the
first octet is base 26, and so on, making a multibase number (is
that even a word?) out of it.
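
Here is a minimal sketch of that position-dependent numbering (the
usual name for it is a mixed-radix number).  The two alphabets are
assumptions picked to match the example above: lowercase letters only
for the first octet (radix 26), and letters, digits and underscore for
the rest (radix 37).  The names and the 12-character limit are
hypothetical as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Assumed alphabets, for illustration only.
const std::string kHead = "abcdefghijklmnopqrstuvwxyz";             // radix 26
const std::string kTail = "abcdefghijklmnopqrstuvwxyz0123456789_";  // radix 37

// Number of valid symbols of exactly `length` characters:
// 26 * 37^(length - 1).
std::uint64_t domain_size(std::size_t length) {
    assert(length >= 1 && length <= 12);  // 26 * 37^11 still fits in 64 bits
    std::uint64_t size = kHead.size();
    for (std::size_t i = 1; i < length; ++i) size *= kTail.size();
    return size;
}

// Map a symbol to its ordinal; the radix depends on the position.
std::uint64_t symbol_rank(const std::string& sym) {
    assert(!sym.empty() && sym.size() <= 12);
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < sym.size(); ++i) {
        const std::string& alphabet = (i == 0) ? kHead : kTail;
        std::size_t digit = alphabet.find(sym[i]);
        assert(digit != std::string::npos);  // character not allowed here
        value = value * alphabet.size() + digit;
    }
    return value;
}

// Inverse: map an ordinal back to a symbol of the given length.
std::string symbol_unrank(std::uint64_t value, std::size_t length) {
    assert(value < domain_size(length));
    std::string sym(length, 'a');
    for (std::size_t i = length; i-- > 0;) {
        const std::string& alphabet = (i == 0) ? kHead : kTail;
        sym[i] = alphabet[value % alphabet.size()];
        value /= alphabet.size();
    }
    return sym;
}
```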

But this has a problem: the number of possible inputs is not likely
to be 2^(blocksize of the cipher).  In fact, it's likely that it's not
even a power of two.  The cipher would therefore be a permutation on a
larger set of values than is actually used, and some of its outputs
could not be cast back into a valid multibase number of the same size.
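
To put numbers on the mismatch, here is a small sketch for 4-character
symbols, assuming (purely for illustration) 26 choices for the first
octet and 37 for each later one.  The helper names are made up.

```cpp
#include <cstdint>

// Assumed domain: 26 * 37^3 = 1,316,978 four-character symbols.
const std::uint64_t kDomain = 26ULL * 37 * 37 * 37;

// True if n is an exact power of two.
bool is_power_of_two(std::uint64_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Smallest power of two that is >= n (for n up to 2^63).
std::uint64_t next_power_of_two(std::uint64_t n) {
    std::uint64_t p = 1;
    while (p < n) p <<= 1;
    return p;
}
```

Even a hypothetical cipher with a block of exactly 21 bits would
permute 2,097,152 values, so 780,174 of its possible outputs fall
outside the domain and cannot be turned back into a symbol.  With a
real 64- or 128-bit block, almost every output is out of range.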

This sounds like a job for coding theory, but I'm not really
sure what the solution is.

Now that I think about it, it seems like you'd want a description of
the domain for a symbol of a given length, map that to ordinal values,
then do a cryptographically strong permutation on them, then map them
back into the (co-)domain.
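
One way this could look on the ordinal values is "cycle walking":
take a keyed permutation on a slightly larger power-of-two range and
re-apply it until the result lands back inside [0, n).  The sketch
below is only meant to show the shape of the idea.  The Feistel
construction, the round count, and above all the round function are
placeholders I made up; the mixing function is NOT cryptographically
secure, and a real design would use a proper keyed PRF and a vetted
format-preserving encryption mode.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kRounds = 8;  // arbitrary choice for the sketch

// Toy keyed round function.  NOT secure; stands in for a real PRF.
static std::uint32_t toy_round(std::uint32_t half, std::uint64_t key,
                               unsigned round) {
    std::uint64_t x = half + key + 0x9E3779B97F4A7C15ULL * (round + 1);
    x ^= x >> 30; x *= 0xBF58476D1CE4E5B9ULL;
    x ^= x >> 27; x *= 0x94D049BB133111EBULL;
    x ^= x >> 31;
    return static_cast<std::uint32_t>(x);
}

// Smallest half-width hb such that 2^(2*hb) >= n.
static unsigned half_bits_for(std::uint64_t n) {
    assert(n >= 1 && n <= (1ULL << 62));
    unsigned hb = 1;
    while ((1ULL << (2 * hb)) < n) ++hb;
    return hb;
}

// Balanced Feistel network: a permutation on [0, 2^(2*hb)).
static std::uint64_t feistel(std::uint64_t v, std::uint64_t key,
                             unsigned hb, bool decrypt) {
    const std::uint64_t mask = (1ULL << hb) - 1;
    std::uint64_t l = v >> hb, r = v & mask;
    for (unsigned i = 0; i < kRounds; ++i) {
        if (!decrypt) {
            // (l, r) -> (r, l ^ F_i(r))
            std::uint64_t t =
                l ^ (toy_round(static_cast<std::uint32_t>(r), key, i) & mask);
            l = r; r = t;
        } else {
            // Undo the rounds in reverse order.
            unsigned j = kRounds - 1 - i;
            std::uint64_t t =
                r ^ (toy_round(static_cast<std::uint32_t>(l), key, j) & mask);
            r = l; l = t;
        }
    }
    return (l << hb) | r;
}

// Cycle walking: re-apply the permutation until the value is in [0, n).
std::uint64_t permute(std::uint64_t x, std::uint64_t n, std::uint64_t key) {
    assert(x < n);
    const unsigned hb = half_bits_for(n);
    do { x = feistel(x, key, hb, false); } while (x >= n);
    return x;
}

// Inverse of permute(): walk the same cycle backwards.
std::uint64_t unpermute(std::uint64_t y, std::uint64_t n, std::uint64_t key) {
    assert(y < n);
    const unsigned hb = half_bits_for(n);
    do { y = feistel(y, key, hb, true); } while (y >= n);
    return y;
}
```

The walk always terminates because the underlying function is a
permutation: starting from a value below n, the cycle must eventually
return to a value below n.  The output of permute() is again an
ordinal in [0, n), so it can be mapped back to a symbol of the same
length and shape.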

Does that sound right?  What would be the easiest way to do such
a thing for, say, C or C++?
-- 
Effing the ineffable since 1997. | http://www.subspacefield.org/~travis/
