I'm not an expert in this area, and haven't had to extend AbstractArray 
myself, but what you are saying makes sense.

I think the best place to continue this would be as a Github Issue though, 
as we can more easily cross-reference it with other relevant issues (of 
which there are many, if I recall correctly).

On Wednesday, August 6, 2014 10:10:43 AM UTC-4, Michael Grant wrote:
>
> This is an expansion of a comment I made in my previous email, in hopes we 
> might have a separate discussion of it. Consider the documentation 
> (emphasis mine).
>
> Operations on AbstractArray objects are defined using higher level 
>> operators and functions, *in a way that is independent of the underlying 
>> storage*. These operations generally work correctly as a fallback for 
>> any specific array implementation... The AbstractArray type includes 
>> anything vaguely array-like, and implementations of it might be quite 
>> different from conventional arrays.
>
>
> Given the intent to allow AbstractArray to be extremely flexible, some of 
> the default implementations seem unnecessarily specialized and potentially 
> very inefficient. For instance, in theory, hcat and vcat are just 
> cat(2,...) and cat(1,...), so an AbstractArray implementation could be 
> quite short for these methods. But hcat has the following specializations:
>
> function hcat{T}(V::AbstractVector{T}...)
> function vcat{T}(V::AbstractVector{T}...)
> function hcat{T}(A::AbstractVecOrMat{T}...)
> function vcat{T}(A::AbstractMatrix{T}...)
>
> I can *totally* understand why these specializations are worthwhile for 
> DenseArray or Array objects. For instance, in the first three of these 
> specalizations, you can take advantage of the fact that, when using Fortran 
> ordering, the input matrices will be laid out *linearly and consecutively 
> *in the output matrix. And indeed, if you look at the implementation of 
> these methods, the implementations are taking advantage of this fact.
>
> But what happens if the concrete matrix uses C ordering, or perhaps some 
> sort of novel/compressed/distributed storage method? These implementations 
> be suboptimal---and not just because of the layout assumption, but also 
> because they use straight for-loops or list comprehensions to move elements 
> *one 
> element at a time. *Again, for Array objects, those loops are going to be 
> nice and tight when compiled down. But consider how inefficient that is for 
> sparse matrices, for instance.
>
> Of course, there are a couple of potential solutions for the creator of a 
> concrete array class. One is to study abstractarray.jl carefully and 
> override every single specialized method provided. Of course, this 
> increases the work required of the type designer, and is somewhat counter 
> to the purpose of having a truly abstract class. And this issue is made 
> more complex by the challenge of what to do about *heterogeneous *mixtures 
> of abstract arrays? For instance, what to do about concatenations of sparse 
> and dense arrays? In this case, I would think that the sparse array class 
> should be able to decide for itself whether to emit a dense or a sparse 
> array, depending on the total sparsity level of the input. That's going to 
> be made more difficult by having to override multiple methods; I'm finding 
> that even the invoke approach is not covering all the cases.
>
> I would suggest, therefore, that the right solution in cases like this is:
>
>    - Move these "specialized" methods up to the DenseArray or Array 
>    level, truly keeping AbstractArray as light as possible.
>    - Reimplement the AbstractArray methods that remain to *avoid 
> *single-element 
>    moves, preferring block moves, in case the concrete types have a faster 
> way 
>    to handle those.
>    - Find an elegant solution to the heterogeneous array dispatch 
>    problem. 
>
> On this last point, perhaps a promote_array hierarchy is in order here. 
> Consider the implementation of cat in AbstractArray instance:
>
> cat(catdim::Integer, A::AbstractArray...) = cat_t(catdim, promote_eltype(A
> ...), A...)
>
> I like this: it doesn't do the *conversion *yet, but at least it 
> establishes what the eltype should be. Well, what if, instead of 
> promote_eltype, there was some sort of promote_array method that abstract 
> array classes could override. The purpose of this method would be to 
> determine not just the final eltype but the final *concrete* array class 
> as well. New array classes could override promote_array as needed. And if 
> the second argument of cat_t were the output of promote_array, subclasses 
> could override it and provide their own storage-optimized implementation 
> without having to reimplement all the dispatch.
>
> Thoughts are welcome. If this looks promising I could even come up with a 
> prototype of promote_array.
>

Reply via email to