I had the same problem and stumbled on this post. Any kernel- or
support-vector-based machine learning algorithm needs this.
Here is one more suggestion that might be useful in some circumstances:
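1. Preallocate a large 2D array X of size (a,b), where a is the fixed
   number of rows and b is the column capacity.
2. Keep track of the number of columns actually filled (c).
3. Use subarray(X, :, 1:c) for the actual operations.
4. To append a column, increment c and copy! the new data into column c.
5. Grow b as necessary (allocate a larger array and copy the filled
   columns over).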
This is basically implementing a 2D dynamic array and can be wrapped up in
a new type for clean code.
Advantages: no sneaky pointer manipulations, no reshape, no unnecessary
allocations.
Disadvantages: not easy to make the code pretty.
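
The steps above could be sketched roughly as follows. This assumes a recent
Julia (1.x), where view and copyto! play the roles of subarray and copy!.
The names GrowableMatrix, growable, filled and pushcol! are made up for
illustration, and doubling the capacity is just one common growth policy,
not something prescribed above.

```julia
# Hypothetical wrapper type; names and fields are illustrative only.
mutable struct GrowableMatrix{T}
    X::Matrix{T}   # backing storage of size (a, b)
    c::Int         # number of columns currently filled
end

# Allocate storage for `b` columns of `a` rows each; none are filled yet.
growable(::Type{T}, a::Integer, b::Integer) where {T} =
    GrowableMatrix{T}(Matrix{T}(undef, a, b), 0)

# View of the filled columns only; no data is copied.
filled(g::GrowableMatrix) = view(g.X, :, 1:g.c)

# Append one column, doubling the column capacity when the storage is full
# (assumed growth policy).
function pushcol!(g::GrowableMatrix{T}, col::AbstractVector) where {T}
    a, b = size(g.X)
    length(col) == a || throw(DimensionMismatch("column must have $a entries"))
    if g.c == b
        Xnew = Matrix{T}(undef, a, max(2b, 1))
        # Column-major layout: the filled columns are the first a*c elements.
        copyto!(Xnew, 1, g.X, 1, a * g.c)
        g.X = Xnew
    end
    g.c += 1
    copyto!(view(g.X, :, g.c), col)
    return g
end

g = growable(Float64, 3, 2)
pushcol!(g, ones(3))
pushcol!(g, 2 .* ones(3))
pushcol!(g, 3 .* ones(3))   # capacity of 2 exceeded, storage grows to 4 columns
A = filled(g)               # 3x3 view onto the filled part of g.X
```

Note that a view obtained from filled(g) before a growth step still points
at the old storage, so it is safest to ask for a fresh view after appending.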
best,
deniz
On Saturday, December 28, 2013 at 3:46:32 PM UTC-8, Sheehan Olver wrote:
>
> Thanks for the suggestions. It seems best to do the simplest thing first,
> and then optimize later if memory management is taking a significant cost.
> So I think I’ll stick with resize! and Array{Array{Float64,1},1}.
>
> On 29 Dec 2013, at 7:36 am, Stefan Karpinski wrote:
>
> If you make a mistake, segfault.
>
>
> On Sat, Dec 28, 2013 at 3:35 PM, Toivo Henningsson wrote:
>
>> So what happens if you use Tim's sneaky workaround and resize the 1d
>> array? I suppose that the pointer is no longer valid...
>>
>>
>> On Saturday, 28 December 2013 18:25:50 UTC+1, Stefan Karpinski wrote:
>>
>>> The issue was bounds check elimination, which is already a problem for
>>> 1d arrays. Currently it's very hard to eliminate them because arrays can
>>> get resized out from under you at any point.
>>>
>>> > On Dec 28, 2013, at 10:08 AM, Tim Holy wrote:
>>> >
>>> > Holding columns in separate entries is a great way. However, if you need
>>> > to do linear algebra on the matrix at intermediate stages during its
>>> > growth, then you'll have a lot of needless copying occurring while you
>>> > convert the column-storage into a matrix.
>>> >
>>> > In such circumstances, there's a sneaky workaround:
>>> >
>>> > reshape1(a::Vector, dims::Dims) = pointer_to_array(pointer(a), dims)
>>> >
>>> > a = zeros(3)
>>> > c = ones(3)
>>> > append!(a, c)
>>> > A = reshape1(a, (3, div(length(a),3)))
>>> > c += 1
>>> > append!(a, c)
>>> > A = reshape1(a, (3, div(length(a),3)))
>>> >
>>> > Using pointer_to_array circumvents the ordinary protections built into
>>> > resize!. There's still allocation occurring (it has to build a new Array
>>> > "wrapper" on each iteration), but it avoids copying any data, and for
>>> > large amounts of data this is a big win.
>>> >
>>> > Even better would be to generalize resize! to support the final dimension
>>> > of any array. I seem to remember Stefan had a reason why this might be
>>> > problematic, but I confess I forget what it is.
>>> >
>>> > --Tim
>>> >
>>> >
>>> >> On Friday, December 27, 2013 05:45:15 PM Sheehan Olver wrote:
>>> >> What's the "best" way of constructing an array that can grow adaptively?
>>> >> For example, it has fixed m rows but the number of columns grows as an
>>> >> algorithm proceeds. Unfortunately,
>>> >>
>>> >> resize!
>>> >>
>>> >> doesn't work for 2d arrays. It does work for Array{Array{Float64,1},1},
>>> >> but not sure that's optimal.
>>>
>>
>
>