I had the same problem and stumbled on this post.  Any kernel or support
vector based machine learning algorithm needs a matrix that can grow like this.

Here is one more suggestion that might be useful in some circumstances: 

1. Create a large 2D array X of size (a,b).  
2. Keep track of the actual columns that are filled (c).  
3. Use subarray(X, :, 1:c) for actual operations.
4. Increment c and use copy! for appending additional columns.
5. Grow b as necessary (allocate a larger X and copy the c filled columns over).

This is basically implementing a 2D dynamic array and can be wrapped up in 
a new type for clean code.
Advantages: no sneaky pointer manipulations, no reshape, no unnecessary 
allocations.
Disadvantages: not easy to make the code pretty.
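
For what it's worth, the five steps above might be wrapped up roughly as
follows. This is only a sketch, written in current Julia syntax, where `view`
and `copyto!` stand in for `subarray` and `copy!`. The names `GrowableMatrix`,
`push_column!` and `active` are made up for illustration, and doubling b when
the storage is full is an assumed growth policy, not something prescribed above.

```julia
# Hypothetical wrapper type: fixed number of rows, column count grows on demand.
mutable struct GrowableMatrix{T}
    data::Matrix{T}   # backing storage X of size (a, b)
    ncols::Int        # number of columns actually filled (c)
end

# Steps 1 and 2: allocate the backing array and start with zero filled columns.
function GrowableMatrix(::Type{T}, nrows::Integer; capacity::Integer = 16) where {T}
    return GrowableMatrix{T}(Matrix{T}(undef, nrows, max(capacity, 1)), 0)
end

# Step 3: a no-copy view of the filled columns, for use in actual operations.
active(g::GrowableMatrix) = view(g.data, :, 1:g.ncols)

# Steps 4 and 5: append one column, doubling b first if the storage is full.
function push_column!(g::GrowableMatrix{T}, col::AbstractVector) where {T}
    nrows, capacity = size(g.data)
    length(col) == nrows || throw(DimensionMismatch("column must have $nrows entries"))
    if g.ncols == capacity
        bigger = Matrix{T}(undef, nrows, 2 * capacity)
        bigger[:, 1:g.ncols] = active(g)      # carry over the filled columns
        g.data = bigger
    end
    g.ncols += 1
    copyto!(view(g.data, :, g.ncols), col)    # write the new column in place
    return g
end

# Usage: start small so that the growth path is exercised.
g = GrowableMatrix(Float64, 3; capacity = 2)
for k in 1:5
    push_column!(g, fill(Float64(k), 3))
end
A = active(g)   # 3x5 view onto the filled part of the storage
```

One thing to watch: a view obtained from `active` before the storage grows
still points at the old backing array, so it should be requested again after
each append.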

best,
deniz


On Saturday, December 28, 2013 at 3:46:32 PM UTC-8, Sheehan Olver wrote:
>
> Thanks for the suggestions.  It seems best to do the simplest thing first, and
> then optimize later if memory management turns out to be a significant cost.   So
> I think I’ll stick with resize! and Array{Array{Float64,1},1}.
>
> On 29 Dec 2013, at 7:36 am, Stefan Karpinski wrote:
>
> If you make a mistake, segfault.
>
>
> On Sat, Dec 28, 2013 at 3:35 PM, Toivo Henningsson wrote:
>
>> So what happens if you use Tim's sneaky workaround and resize the 1d 
>> array? I suppose that the pointer is no longer valid...
>>
>>
>> On Saturday, 28 December 2013 18:25:50 UTC+1, Stefan Karpinski wrote:
>>
>>> The issue was bounds check elimination, which is already a problem for 
>>> 1d arrays. Currently it's very hard to eliminate them because arrays can 
>>> get resized out from under you at any point. 
>>>
>>> > On Dec 28, 2013, at 10:08 AM, Tim Holy wrote:
>>> > 
>>> > Holding columns in separate entries is a great way. However, if you need
>>> > to do linear algebra on the matrix at intermediate stages during its
>>> > growth, then you'll have a lot of needless copying occurring while you
>>> > convert the column-storage into a matrix.
>>> > 
>>> > In such circumstances, there's a sneaky workaround: 
>>> > 
>>> >    reshape1(a::Vector, dims::Dims) = pointer_to_array(pointer(a), dims)
>>> > 
>>> >    a = zeros(3) 
>>> >    c = ones(3) 
>>> >    append!(a, c) 
>>> >    A = reshape1(a, (3, div(length(a),3))) 
>>> >    c += 1 
>>> >    append!(a, c) 
>>> >    A = reshape1(a, (3, div(length(a),3))) 
>>> > 
>>> > Using pointer_to_array circumvents the ordinary protections built into
>>> > resize!. There's still allocation occurring (it has to build a new Array
>>> > "wrapper" on each iteration), but it avoids copying any data, and for
>>> > large amounts of data this is a big win.
>>> > 
>>> > Even better would be to generalize resize! to support the final dimension
>>> > of any array. I seem to remember Stefan had a reason why this might be
>>> > problematic, but I confess I forget what it is.
>>> > 
>>> > --Tim 
>>> > 
>>> > 
>>> >> On Friday, December 27, 2013 05:45:15 PM Sheehan Olver wrote: 
>>> >> What's the "best" way of constructing an array that can grow adaptively?
>>> >> For example, it has fixed m rows but the number of columns grows as an
>>> >> algorithm proceeds.  Unfortunately,
>>> >> 
>>> >> resize! 
>>> >> 
>>> >> doesn't work for 2d arrays.  It does work for Array{Array{Float64,1},1},
>>> >> but not sure that's optimal.
>>>
>>
>
>
