One of the challenges in the first set is to break a repeating-key XOR 
cipher (a Vigenere-style cipher).  The cleartext is presumed to be 
ASCII-encoded.

One method to break this cipher is to guess the length of the key, split 
the ciphertext into blocks of key_length bytes, and collect every Nth byte 
(for N = 1 up to key_length) into its own group.  Each group was XORed 
with the same key byte, so it can be evaluated statistically against 
character frequencies in a target language.  For that you write a scoring 
function that takes a dictionary mapping characters to their relative 
frequencies.  The candidate plaintext for each group will contain both 
lowercase and uppercase characters, so let's convert them all to 
uppercase (a method defined for ::ASCIIString).  You don't know what the 
key is, but each key character is a byte.  Most candidate key bytes will 
yield a lot of non-ASCII characters after XOR, and even correct ones will 
yield common punctuation or who knows what, so for the sake of the 
statistical evaluation let's write a function that filters non-ASCII 
bytes out.
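
To make the approach concrete, here is a minimal sketch of the grouping 
and scoring steps.  It assumes a current Julia (1.x), so it uses UInt8 and 
plain byte vectors rather than the older names in the quoted code.  The 
names english_freq, transpose_blocks, score and best_key_byte are 
hypothetical, and the frequency table is abbreviated for illustration.

```julia
# Hypothetical, abbreviated frequency table (percent); a real one would cover A-Z.
english_freq = Dict('E' => 12.7, 'T' => 9.1, 'A' => 8.2, 'O' => 7.5,
                    'I' => 7.0, 'N' => 6.7, ' ' => 13.0)

# Group i holds every key_length-th byte, starting at offset i (1..key_length).
transpose_blocks(ct::Vector{UInt8}, key_length::Int) =
    [ct[i:key_length:end] for i in 1:key_length]

# Score candidate plaintext bytes: skip non-ASCII bytes, uppercase the rest,
# and sum the table frequency of each character (0.0 if it is not in the table).
function score(bytes::Vector{UInt8}, freq::Dict{Char,Float64})
    total = 0.0
    for b in bytes
        b < 0x80 || continue
        total += get(freq, uppercase(Char(b)), 0.0)
    end
    return total
end

# Try all 256 single-byte keys on one group and keep the best-scoring one.
function best_key_byte(column::Vector{UInt8}, freq::Dict{Char,Float64})
    best, best_score = 0x00, -Inf
    for k in 0x00:0xff
        s = score(column .⊻ k, freq)
        if s > best_score
            best, best_score = k, s
        end
    end
    return best
end
```

Running best_key_byte on each group returned by transpose_blocks gives 
one candidate key byte per key position.  Scoring on raw bytes, as this 
sketch does, sidesteps the string conversion entirely.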

The scoring function needs an argument of type ASCIIString because it 
calls methods that are only defined for ::ASCIIString.  But the value 
returned from our filter function, which should be an ::ASCIIString, 
arrives in the caller as something else, so we can no longer pass it to 
the next function without a type error.
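
One thing worth checking, judging from the ascii_filter quoted below: its 
early "return s" hands back the input byte array unchanged whenever the 
input is already valid ASCII, so the function returns a byte array on that 
path and a string on the other.  That may be what the failing assertion is 
seeing.  A minimal sketch of a version with a single return type, assuming 
a current Julia (1.x) where String takes the place of ASCIIString:

```julia
# Hypothetical rewrite: every path returns a String, never the input vector.
function ascii_filter(bytes::Vector{UInt8})
    # filter returns a new vector, so the caller's array is left untouched.
    return String(filter(b -> b < 0x80, bytes))
end
```

With this shape the caller can pass the result straight to a function 
that expects a string, without wrapping the call in another conversion.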

Example code can be seen here:
https://github.com/cmoncure/crypto/blob/master/xor.jl
https://github.com/cmoncure/crypto/blob/master/scorelang.jl

On Saturday, September 12, 2015 at 3:42:22 PM UTC-4, Stefan Karpinski wrote:
>
> What encoding is the data in?
>
> On Fri, Sep 4, 2015 at 8:42 PM, Corey Moncure wrote:
>
>> Extremely new to Julia.  My background is in Python and C. 
>> Working on implementing the Matasano crypto challenges in Julia to learn 
>> the ins and outs.  The implementations require heavy use of string 
>> conversions, casting, and byte comparisons.
>>
>> Since Julia's built-in ascii() barfs on any byte that can't be 
>> represented in ASCII, it became useful to define a function that filters 
>> out all such bytes from a byte vector.
>>
>> function ascii_filter(s::Array{Uint8})
>>   if is_valid_ascii(s)
>>     return s
>>   end
>>   filter!(x -> is_valid_ascii([x]), s)
>>   @assert is_valid_ascii(s)
>>   s = ascii(s)
>>   @assert isa(s, ASCIIString)              <-- assertion OK
>>   return s
>> end
>>
>>
>>
>> The fact that is_valid_ascii() only has a method for vectors of bytes, 
>> and not a single byte, is a minor annoyance that is worked around by an 
>> anonymous function that wraps a Uint8 as a Vector{Uint8} of length 1.
>> However, I cannot seem to make this return a variable of type 
>> ASCIIString, which is necessary for later use with uppercase(), etc.  
>>
>> function detect_xor_encryption(cipher_text::Array{Uint8}, keys::Vector, threshold::Int = 50)
>>  {...}
>>     clear_text = ascii_filter(repeating_xor(cipher_text, key))
>>     @assert isa(clear_text, ASCIIString)   <-- assertion fails
>>     s = score_candidate_language(clear_text, "english")
>> {...}
>>
>>
>>
>> function score_candidate_language(test_str::ASCIIString, language::String)
>> {...}
>>
>>
>>
>> At the time of assignment to clear_text, it seems the return value of 
>> ascii_filter() has fallen back to Array{Uint8}.  No amount of monkeying 
>> around in ascii_filter() could solve the problem.  I tried defining 
>> s::ASCIIString, and explicitly returning ascii(s) after the assert.  It 
>> seems that no matter what I do, I have to explicitly define the type of a 
>> variable as ::ASCIIString or wrap ascii() in the *calling function* 
>> every time I want to use ascii_filter() to build an ASCIIString and pass it 
>> to a function that takes an ASCIIString as an argument.
>>
>> Is this intended?  Am I missing something obvious?
>>
>
>
