Oh yeah, I didn't notice that – having those be abstractly typed is going
to be a trainwreck for performance. It should be possible to eliminate all
allocation from this kind of code.
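
A minimal sketch of the difference, assuming current Julia syntax (where `struct` replaces `immutable`). The type and function names here are hypothetical and are not taken from aes.jl. Fields declared with an abstract type such as `Integer` are stored boxed, so the compiler generally cannot specialize code that reads them; a parametric type keeps every field concrete:

```julia
# Hypothetical illustration only; names are not from aes.jl.

# Abstractly typed fields: each field may hold any Integer subtype.
struct AbstractParams
    nk::Integer   # number of 32-bit words in the cipher key
    nr::Integer   # number of rounds
end

# Parametric version: field types are concrete once T is fixed.
struct ConcreteParams{T<:Integer}
    nk::T
    nr::T
end

# Toy loop that reads the fields repeatedly, as a cipher round would.
function total_words(p, n)
    acc = 0
    for _ in 1:n
        acc += p.nk * (p.nr + 1)
    end
    return acc
end

abstract_field_is_concrete = isconcretetype(fieldtype(AbstractParams, :nk))
concrete_field_is_concrete = isconcretetype(fieldtype(ConcreteParams{Int}, :nk))
words = total_words(ConcreteParams(4, 10), 3)
```

Running `@code_warntype total_words(AbstractParams(4, 10), 3)` against the same call with `ConcreteParams(4, 10)` should show where inference gives up on the abstract version, and `@allocated` on each call is a quick way to compare allocation.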

On Fri, Sep 11, 2015 at 10:46 PM, Tom Breloff <[email protected]> wrote:

> I looked at your code for a grand total of 20 seconds, but the one thing I
> would check is whether using concrete types (Int) in your immutable helps,
> or maybe making it parametric:
>
> immutable AES_cipher_params{T<:Unsigned}
>   bits::T # Cipher key length, bits
>   nk::T # Number of 32-bit words, cipher key
>   nb::T # Number of columns in State
>   nr::T # Number of rounds
>   block_size::T # byte length
>   block_x::T # block dimensions X
>   block_y::T # block dimensions Y
> end
>
> It's possible the compiler can't infer enough types because of this?  Just
> a guess...
>
> On Fri, Sep 11, 2015 at 10:38 PM, Corey Moncure <[email protected]>
> wrote:
>
>> The creator has descended from the shining heavens and responded to my
>> post.  Cool.
>>
>> Here are some stats naively gathered from taking time() before and after
>> the run and subtracting.  Second run is with gc_disable().  A 1.26x speedup
>> is noted.
>>
>> elapsed (s): 0.0916
>> Throughput, KB/s: 1705.97
>> Average time (μs) per iteration: 0.0
>> Estimated cycles / iteration @ 4.0 GHz: 36636.0
>>
>> elapsed (s): 0.0727
>> Throughput, KB/s: 2149.6
>> Average time (μs) per iteration: 7.2688
>> Estimated cycles / iteration @ 4.0 GHz: 29075.0
>>
>>
>> I'd like to know how to minimize the effect of the garbage collector and
>> allocations.  The algorithm is a tight loop with a handful of tiny
>> functions, none of which ought to require much allocation.  A few variables
>> for placeholder data are inevitable.  But I have read the warnings
>> discouraging the use of global state.  What is the Julia way to allocate a
>> few bytes of scratch memory that can be accessed within the scope of, say,
>> apply_ECB_mode!() without having to be reallocated each time gf_mult() or
>> mix_columns! are called?
>>
>> Also, can a function be forced to mutate variables passed to it,
>> eliminating a redundant assignment?  Julia seems happy to do this with
>> Arrays but not unit primitives.  Pass by reference (pointer)?  Does the
>> LLVM appreciate this?
>>
>> On Friday, September 11, 2015 at 6:15:52 PM UTC-4, Stefan Karpinski wrote:
>>>
>>> There's nothing obviously glaring here. I would definitely recommend
>>> using the built-in profiler to see where it's spending its time. There may
>>> be subtle type instabilities or some other non-obvious issue. You
>>> definitely ought to be able to get within striking distance of similar C
>>> code, which should land in the expected range of 4-10x slower than
>>> hand-coded assembly.
>>>
>>> On Fri, Sep 11, 2015 at 5:10 PM, Corey Moncure <[email protected]>
>>> wrote:
>>>
>>>> https://github.com/cmoncure/crypto/blob/master/aes.jl
>>>>
>>>> In the process of learning Julia (and crypto) I implemented the
>>>> Rijndael block cipher and inverse cipher.  I tried to write idiomatic yet
>>>> concise code, but the performance is not very desirable.  On my machine
>>>> (i5-2500k @ 4.0 GHz) the throughput is piddling, on the order of 10e6
>>>> bytes/sec, and memory allocation is at 3056 bytes / block, which I have not
>>>> been able to cut down any further.
>>>>
>>>> Obviously I do not intend to compete with hand-tuned assembler routines
>>>> that heavily exploit SIMD and pre-computed tables, but my completely
>>>> unfounded gut feeling is that given the right input, Julia should be able
>>>> to approach within a factor of 4-10 without such optimizations.  Currently
>>>> this routine is within a factor of 1000.
>>>>
>>>> Any Julia experts out there willing to take a peek at the code and
>>>> offer some tips for idiomatic (i.e. within the framework of Julia syntax
>>>> and style) optimizations?
>>>>
>>>> In the course of doing this I have run into several gripes with Julia,
>>>> particularly some of the built-in functions which are often confusing or
>>>> contradictory by virtue of the type declarations of certain methods (or
>>>> lack of needed ones).  For instance, Julia does not support negative
>>>> indexing of arrays... so then why do so many functions on arrays take only
>>>> signed integer types for dimensions?  To the noobie it seems like an
>>>> obvious choice to type data holding the calculation of matrix dimensions or
>>>> indices as unsigned integers, given that the language does not support
>>>> negative indexing.  Yet this fails unexpectedly in many built-ins such as
>>>> sub().
>>>>
>>>>
>>>>
>>>
>
