One of the challenges in the first set is to break a repeating-key XOR cipher (a Vigenère-style cipher). The cleartext is presumed to be ASCII-encoded.
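For concreteness, here is a minimal sketch of the cipher and of the column split used to attack it. It is not the code from my repository. It assumes Julia 1.x, where the byte type is spelled `UInt8`; the name `repeating_xor` is borrowed from the call in the quoted message, but the body is an assumption, and `transpose_blocks` is a hypothetical helper name.

```julia
# Repeating-key XOR: byte i of the input is XORed with key byte i mod length(key).
# Applying the function twice with the same key returns the original bytes.
function repeating_xor(data::Vector{UInt8}, key::Vector{UInt8})
    return [xor(b, key[mod1(i, length(key))]) for (i, b) in enumerate(data)]
end

# Hypothetical helper: column k holds every key_length-th byte starting at
# offset k (1-based), so each column was encrypted with a single key byte.
function transpose_blocks(data::Vector{UInt8}, key_length::Int)
    return [data[k:key_length:end] for k in 1:key_length]
end

key    = Vector{UInt8}("key")
cipher = repeating_xor(Vector{UInt8}("attack at dawn"), key)
plain  = String(repeating_xor(cipher, key))   # round trip back to the cleartext
cols   = transpose_blocks(UInt8[1, 2, 3, 4, 5, 6, 7], 3)
```

Each column can then be scored on its own against single-byte keys, which is where the frequency analysis comes in.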
One method to break this cipher is to guess the length of the key and split the cipher text into blocks of key_length. Then, for each offset N from 1 to key_length, you collect every key_length-th byte starting at offset N and evaluate each collection of bytes statistically against character frequency in a target language.

You write a scoring function that uses a dictionary mapping characters to their relative frequency. But your candidate key_length-transposed plaintext will have lowercase and uppercase characters, so let's convert them all to uppercase (method defined for ::ASCIIString).

You don't know what the key is, but each key character is a byte. Most candidate key bytes will yield a lot of non-ASCII characters after XOR, and even correct ones will yield common punctuation or who knows what, so for the sake of our statistical evaluation function let's write a function that filters non-ASCII characters out.

It's correct to say that the scoring function needs arguments of type ASCIIString, because methods will be called within it that are only defined for ::ASCIIString. But when we return an ::ASCIIString from our filter function, its type degenerates to something else, so we can no longer pass it into the next function without a type error.

Example code can be seen here:
https://github.com/cmoncure/crypto/blob/master/xor.jl
https://github.com/cmoncure/crypto/blob/master/scorelang.jl

On Saturday, September 12, 2015 at 3:42:22 PM UTC-4, Stefan Karpinski wrote:
>
> What encoding is the data in?
>
> On Fri, Sep 4, 2015 at 8:42 PM, Corey Moncure <[email protected]> wrote:
>
>> Extremely new to Julia. My background is in Python and C.
>> Working on implementing the Matasano crypto challenges in Julia to learn
>> the ins and outs. The implementations require heavy use of string
>> conversions, casting, and byte comparisons.
>>
>> Since Julia's built-in ascii() barfs on any byte that can't be
>> represented in ASCII, it became useful to define a function that filters
>> out all such bytes from a byte vector.
>>
>> function ascii_filter(s::Array{Uint8})
>>     if is_valid_ascii(s)
>>         return s
>>     end
>>     filter!(x -> is_valid_ascii([x]), s)
>>     @assert is_valid_ascii(s)
>>     s = ascii(s)
>>     @assert isa(s, ASCIIString)              <-- assertion OK
>>     return s
>> end
>>
>> The fact that is_valid_ascii() only has a method for vectors of bytes,
>> and not a single byte, is a minor annoyance that is worked around by an
>> anonymous function that wraps a Uint8 as a Vector{Uint8} of length 1.
>> However, I cannot seem to make this return a variable of type
>> ASCIIString, which is necessary for later use with uppercase(), etc.
>>
>> function detect_xor_encryption(cipher_text::Array{Uint8}, keys::Vector,
>>                                threshold::Int = 50)
>>     {...}
>>     clear_text = ascii_filter(repeating_xor(cipher_text, key))
>>     @assert isa(clear_text, ASCIIString)     <-- assertion fails
>>     s = score_candidate_language(clear_text, "english")
>>     {...}
>>
>> function score_candidate_language(test_str::ASCIIString, language::String)
>>     {...}
>>
>> At the time of assignment to clear_text, it seems the return value of
>> ascii_filter() has fallen back to Array{Uint8}. No amount of monkeying
>> around in ascii_filter() could solve the problem. I tried defining
>> s::ASCIIString, and explicitly returning ascii(s) after the assert. It
>> seems that no matter what I do, I have to explicitly define the type of a
>> variable as ::ASCIIString or wrap ascii() in the *calling function*
>> every time I want to use ascii_filter() to build an ASCIIString and pass it
>> to a function that takes an ASCIIString as an argument.
>>
>> Is this intended? Am I missing something obvious?
>>
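One possible reading of the quoted ascii_filter(): when the input is already valid ASCII, the early `return s` hands back the original byte array, and only the filtered path ever reaches ascii(s). That would match an inner assertion that passes and an outer one that fails. Below is a hedged sketch of a filter that returns a string on every path. It assumes Julia 1.x, where ASCIIString no longer exists as a separate type and `String` is used instead, so it is an illustration rather than a drop-in fix for the 2015 code.

```julia
# Hypothetical rewrite for Julia 1.x: every code path returns a String.
function ascii_filter(bytes::Vector{UInt8})
    # Keep only bytes in the 7-bit ASCII range (0x00 through 0x7f).
    kept = filter(b -> b < 0x80, bytes)
    # `kept` is a fresh vector, so building a String from it leaves `bytes` untouched.
    return String(kept)
end

clear_text = ascii_filter(UInt8[0x48, 0xff, 0x69, 0x80, 0x21])
already_ok = ascii_filter(Vector{UInt8}("abc"))   # already-valid input takes the same path
```

Because there is no early return, the caller gets the same type whether or not any bytes were dropped, and the result can be passed straight to uppercase() or a scoring function.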
