Thanks for your reply, Erik. You always have good suggestions.

On Friday, February 26, 2016 at 2:10:48 PM UTC+1, Erik Schnetter wrote:
>
> Kristoffer 
>
> You could try some simple "placeholder" operations, such as e.g. 
> creating a new tensor from an old tensor by only replacing a few 
> indices (or by adding them). Then you can compare speed between tuples 
> and arrays. Unless you see a speedup, this might not be worth 
> pursuing.  
>

I looked a bit at https://github.com/StephenVavasis/Modifyfield.jl and it 
seems the compiler is quite good at removing redundant writes when building 
a new immutable from an old one with a few fields replaced.
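
Roughly what I have in mind is something like this, sketched for a second 
order tensor (SecondOrderTensor and setentry are just names made up for the 
example): build a new tensor from the old one with a single entry replaced 
and let the compiler elide the copies that turn out to be redundant.

immutable SecondOrderTensor{T} <: AbstractArray{T, 2}
    data::NTuple{9, T}
end

Base.size(::SecondOrderTensor) = (3, 3)
Base.linearindexing{T}(::Type{SecondOrderTensor{T}}) = Base.LinearFast()
Base.getindex(S::SecondOrderTensor, i::Int) = S.data[i]

# Non-mutating "setindex": returns a new tensor with entry i replaced by v.
function setentry{T}(S::SecondOrderTensor{T}, v::T, i::Int)
    SecondOrderTensor{T}(ntuple(j -> j == i ? v : S.data[j], 9))
end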
 

> If the bottleneck in your code is allocating the tensors, then using 
> (or: automatically generating?) operations that re-use existing 
> tensors might be the answer. 
>

Yes, the bottleneck is allocation. Regarding mutating functions, they are 
definitely a possibility, but then you need to pass in a buffer, and the 
code gets a lot uglier because you basically give up infix operators.

Compare (exaggerated for effect):

(A ⊗ B) * C ⋅ a 

and 

dot!(ABCA_buff, dcontract!(ABC_buffer, C, otimes!(AB_buffer, A, B)), a)

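For reference, one of those mutating building blocks would look something 
like this (storing the fourth order result in a 9x9 buffer, with (i,j) 
indexing the rows and (k,l) the columns, is just one possible layout):

# Writes A ⊗ B into the preallocated 9x9 buffer AB and returns it.
function otimes!{T}(AB::Matrix{T}, A::Matrix{T}, B::Matrix{T})
    @inbounds for l in 1:3, k in 1:3, j in 1:3, i in 1:3
        AB[(j - 1) * 3 + i, (l - 1) * 3 + k] = A[i, j] * B[k, l]
    end
    return AB
end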

It is of course possible to create a dict to hold temporary arrays of the 
correct shape and type; however, you need to be careful not to get wrong 
results due to aliasing. I have also found that caching arrays in a dict is 
often quite slow.
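
For what it's worth, the dict approach I'm picturing is basically this (the 
names are made up); it is neither type stable nor safe against handing out 
the same buffer twice in one expression:

const BUFFER_CACHE = Dict{Any, Any}()

# Returns a cached buffer for the given element type and size, allocating
# it on first use. Nothing here prevents aliasing between two requests.
function getbuffer{T}(::Type{T}, dims)
    get!(BUFFER_CACHE, (T, dims)) do
        Array(T, dims)
    end
end

# e.g. AB_buffer = getbuffer(Float64, (9, 9))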

Maybe this whole thing would be solved 
by https://github.com/JuliaLang/julia/pull/8134 combined 
with https://github.com/JuliaLang/julia/pull/12205.
 

>
> To reduce code duplication, you can try using the `ntuple` function, 
> and passing an actual function as argument. If this function isn't 
> inlined, then every operation on a 9x9 tensor is unrolled to "only" 81 
> operations, not to 81x9 operations. (Or maybe ntuple actually uses a 
> loop internally?) 
>
> If you then represent a matrix as a 9-tuple of 9-tuples, you might 
> reduce unrolling to 9 operations per function. 
>

Yes, maybe computing one column at a time using a nested tuple is the way 
to go. Like you say, it would give 81 operations rather than the ~700 I 
currently get. I think this is actually how FixedSizeArrays does it.
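
Something like the following is what I picture for the nested tuple version 
(Mat9, matcol and matmul9 are made-up names): each entry of a column of A*B 
is the dot product of a row of A with the corresponding column of B, and as 
long as matcol is not inlined the outer function only unrolls to 9 calls.

typealias Mat9{T} NTuple{9, NTuple{9, T}}

# Column j of A*B: entry i is the dot product of row i of A with B[j].
function matcol{T}(A::Mat9{T}, bcol::NTuple{9, T})
    ntuple(i -> sum(j -> A[j][i] * bcol[j], 1:9), 9)
end

matmul9{T}(A::Mat9{T}, B::Mat9{T}) = ntuple(j -> matcol(A, B[j]), 9)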
 

>
> Finally -- you can use `llvmcall` to generate low-level operations, 
> and in these low-level operations you can explicitly program loops. Of 
> course, this is likely a tad more low-level than you'd want. 
>
> I'm not aware of another way. 
>
> -erik 
>
>
> On Fri, Feb 26, 2016 at 3:25 AM, Kristoffer Carlsson 
> <[email protected]> wrote: 
> > Botched the title a bit. Oh well. 
> > 
> > 
> > On Friday, February 26, 2016 at 9:17:54 AM UTC+1, Kristoffer Carlsson 
> wrote: 
> >> 
> >> Hello everyone. 
> >> 
> >> To avoid the XY problem I will first describe my actual use case. I am 
> >> writing a library for tensors to be used in continuum mechanics. These 
> are 
> >> in dimensions 1,2,3 and typically of order 1,2,4. 
> >> Since tensor expressions are usually chained together like (A ⊗ B) * C 
> ⋅ a 
> >> I want to make these stack allocated so I can get good performance 
> without 
> >> making a bunch of mutating functions and having to preallocate 
> everything 
> >> etc. 
> >> 
> >> The way you stack allocate vector like objects in julia right now seems 
> to 
> >> almost exclusively be with an immutable of a tuple. So my data 
> structure 
> >> right now looks similar to: 
> >> 
> >> immutable FourthOrderTensor{T} <: AbstractArray{T, 4} 
> >>     data::NTuple{81, T} 
> >> end 
> >> 
> >> Base.size(::FourthOrderTensor) = (3,3,3,3) 
> >> Base.linearindexing{T}(::Type{FourthOrderTensor{T}}) = 
> Base.LinearFast() 
> >> Base.getindex(S::FourthOrderTensor, i::Int) = S.data[i] 
> >> 
> >> If you'd like, you can think of the NTuple{81, T} as actually represent 
> a 
> >> 9x9 matrix. 
> >> 
> >> 
> >> The largest operations I will need to do is a double contraction 
> between 
> >> two fourth order tensors which is equivalent to a 9x9 matrix multiplied 
> with 
> >> another 9x9 matrix. 
> >> 
> >> My actual question is, is there a good way to do this without 
> generating a 
> >> huge amount of code? 
> >> 
> >> I have looked at for example the FixedSizeArrays package but there they 
> >> seem to simply unroll every operation. For a 9x9 * 9x9 matrix 
> multiplication 
> >> this unrolling creates a huge amount of code and doesn't feel right. 
> >> 
> >> I guess what I want is pretty much 
> >> https://github.com/JuliaLang/julia/issues/11902 but is there any 
> workaround 
> >> as of now? 
> >> 
> >> Thanks! 
> >> 
> >> // Kristoffer 
>
>
>
> -- 
> Erik Schnetter <[email protected]> 
> http://www.perimeterinstitute.ca/personal/eschnetter/ 
>
