Everyone,

A fairly common pattern in machine learning is to use a single
contiguous chunk of memory for all of your parameters, as this makes
operations such as regularisation trivial.  I took a quick stab at
achieving this using ArrayViews.jl, but I am not entirely pleased with
what I came up with.  I will inline code as we go; there is also a
link to a Gist with the full script below.

    https://gist.github.com/ninjin/6495e639a23ddfc76e30

To hold the data we can declare a simple type.

    using ArrayViews

    immutable Model
        data::Vector{Float64}        # backing storage for all parameters
        a::ContiguousView{Float64,2} # matrix view into `data`
        b::ContiguousView{Float64,1} # vector view into `data`
    end

We can then instantiate it; the details of the initialisation are not
that relevant.

    function Model(d)
        # Shape and lengths for the views.
        ashp = (d, 2 * d)
        alen = prod(ashp)
        bshp = (d,)
        blen = prod(bshp)
        # A single slice of contiguous memory for both `a` and `b`.
        data = zeros(alen + blen)
        # Sequential data to make debugging easier.
        data[:] = 1:length(data)
        data[:] /= 10^5
        # Create the views.
        a = contiguous_view(data, 0, ashp)
        b = contiguous_view(data, alen, bshp)
        return Model(data, a, b)
    end

We can create an instance and some input.

    d = 2
    m = Model(d)
    x = [2*d:-1:1] / 10^5
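
As a quick sanity check, the views really do alias the backing vector
rather than copying it (assuming the `Model` above):

    m.a[1, 1] == m.data[1]            # true
    m.b[1] == m.data[length(m.a) + 1] # true, `b` starts right after `a`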

Now, standard computations work just as expected.

    tanh(m.a * x + m.b)
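
And here is the payoff mentioned at the start: since every parameter
lives in `data`, regularisation needs no per-view bookkeeping.  A
minimal sketch, where `lambda` is a hypothetical regularisation
strength:

    lambda = 1e-4
    # L2 penalty over all parameters at once, regardless of which
    # view owns them.
    penalty = lambda * sum(m.data .^ 2) / 2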

When prototyping I frequently write vectorised code at first and then
de-vectorise it once I have it working.  So, say that we want to
update `m.a` with some new values `newa`.

    newa = rand(size(m.a))

The most natural thing would be a simple assignment to the view,
but...

    m.a[:] = newa # Does not work.

Using mutating variants would also be nice, but...

    rand!(m.a) # Does not work.

We can of course use direct indexing, but it is tedious and we must
give up on the idea of vectorised expressions.

    # Inner loop over the first index, matching Julia's column-major
    # memory layout.
    for j in 1:size(newa, 2)
        for i in 1:size(newa, 1)
            m.a[i, j] = newa[i, j]
        end
    end
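
One workaround that keeps vectorised syntax is to assign through the
backing vector instead, although tracking the offsets by hand rather
defeats the point of having the views.  A sketch, relying on `a`
starting at offset zero:

    # The `a` view covers the first prod(size(m.a)) entries of `data`,
    # in column-major order.
    m.data[1:length(m.a)] = vec(newa)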

The only approach I have found that regains some of the ability to
vectorise is to use Devectorize.jl.

    @devec m.a[:] = newa

But this still leads to less pretty code, since we have to declare a
temporary variable for each function call used inside the vectorised
expression, as shown below.

    @devec m.a[:] = rand(size(m.a)) # Does not work.
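
That is, one has to hoist each call into a temporary first:

    tmp = rand(size(m.a))
    @devec m.a[:] = tmp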

Has anyone else been in a similar situation?  I am sure that my
approach is by no means optimal, and I would be very thankful for any
and all feedback.  Until we have array views in Base (maybe for 0.4?)
it would be really helpful if there were a way to express contiguous
views that works reasonably well with vectorised expressions.

    Pontus
