There is a serious flaw in cryptography based on semiprimes. While attempts at breaking such cryptography typically focus on factoring semiprimes, the approach I have identified sidesteps this challenge entirely. I have yet to evaluate my techniques performance on keys of practical length, a number of features suggest that it should be highly efficient.
What I recognized that the goal of decrypting a message does not require knowing the prime factors. Cryptography is possible due to an intrinsic characteristic of semiprimes that there exists an exponent 'k' such that x^k mod n = 1 where 'n' is a semiprime number and 'x' is an arbitrary integer with x < n. And a number can be encrypted with the formula x^e mod n = y where 'y' is the encrypted message. Then decrypted with the formula y^d mod n = x where d+e = k+1. (Really d+e = i*k+1 where i is a positive integer.) The formula x^r mod n (for positive integer 'r') acts as a clock function, which includes a predictable periodicity. For powers if i*k, the formula passes through one and the next power (i*k+1) gives the original power back. And for the public key exponent 'e', a message can be decrypted using i*k+1-e, regardless of what the private key exponent is. All of that is to say that given an encrypted message and the associated public key, you just need to find the loop exponent 'k' and the problem becomes trivial. The good news is that searching for k is highly efficient. All that is needed is two different exponents which provide the same result. The goal is two exponents 'r' and 's' such that x^r mod n = x^s mod n since the difference will be a power which produces one in the formula. This can be found by testing different exponents and storing them along with the results in two hashes. The hash keyed in exponents avoids recalculating previous attempts and the hash keyed in results makes finding duplicate values as quick as a hash lookup. This reduces the difficulty to a birthday problem. (If you have 'n' people in a room, what are the odds two or more have the same birthday. https://en.wikipedia.org/wiki/Birthday_problem) The tested exponents can be spread out to limit repetition of similar distances, this means that each new exponent is compared against all previous ones with a portion of the comparisons being unique. And the unique portion will be a fraction of the total number of previous attempts. Which means that the searched space increases geometrically in both time and memory space. And that's the type of thing you want when you're trying to break cryptography. (If you consider the space of semiprime 'n' to be what is searched, this approach is actually quite a bit better than the birthday problem. You need 367 people to guarantee that two have the same birthday but you don't need to try all 'n' possible exponents but all possible distances between exponents which can be contained in 'n'. I'm an applied mathematician and never really liked number theory so I'm just going to guess and say you need about sqrt n tries to cover the whole space. So far, outstanding.) The other piece of really good news is that the value which is being searched for isn't unique. Any exponent which gives unity is sufficient for decryption and there are guaranteed to be at least two which are less than n. And the number of potential values increases with increasing key length (not absolutely but on average). Which means that longer keys actually give more chances to find a suitable exponent and the odds increase with each exponent tried. There's some obvious detail I've added but I feel like the takeaway is that simply looking for the loop exponent 'k' rather than trying to factor semiprimes makes the challenge of breaking RSA and related methods far easier. In fact, I believe that this method will allow for the cracking of keys with a reasonable length in a reasonable time on existing hardware. (My guess is that this approach may break keys in sub-linear time relative to key length.) Now for the bad news. The exponent identified in the search I described above is not the same as the 'k' I've been discussing. Each value 'x' which is encrypted has it's own loop exponent which can be a factor (therefore smaller) than the 'k' specific to the controlling semiprime. If an exponent which gives a result of unity for a value 'x', that exponent can be used to decrypt that message. But it is likely that a different exponent will be needed for a different message 'x' (really, message chunk since the total message is broken up into pieces for encryption). In order to address this, it seems wise to begin searching with an x of 2 for greater computational efficiency. Once a working exponent is found it should be factored to find the smallest exponent for that message. Then, if actual encrypted data is available, it can be searched but only for exponents which are multiples of the one identified previously. And each time a new exponent is found, the process can be repeated to reduce the search time for remaining message chunks. (I know this sounds like a lot of work but the space searched is rapidly contracting and I'm still hoping for sublinear.) That's a bit of a drag but the good news about the bad news is that multiple message chunks can be searched in parallel without any memory sharing between them. So I'm guessing it should be stupid simple to distribute computation (for people who actually understand these things, which I do not). And that's it. I first stumbled across this four years ago and took it far enough to convince myself it works, then stopped and didn't tell anybody about it until last week. Now a bunch of people know, if they can recognize a good thing when they see it, but nobody is talking about it. Which is a little scary. But people should probably switch to elliptic curve based public key cryptography (with the strong caveat that those approaches may be broken by design). Feeling less secure yet? Thanks, Gabriel
