Thanks.  I'm new to the Arrow community, so I was hoping to get feedback
on whether any of these points are controversial or subject to
constraints I'm likely not familiar with.  Point 2 is probably the
simplest, and I can start with that.

Point 3 isn't coherent as a PR concept, but it concerns a potential
audience whose relation to Arrow I don't understand (it could be an
explicit non-goal for all I know).

Wes McKinney <wesmck...@gmail.com> writes:

> hi Jed,
>
> Would you like to submit a pull request to propose the changes or
> additions you are describing?
>
> Thanks
> Wes
>
> On Sat, Mar 9, 2019 at 11:32 PM Jed Brown <j...@jedbrown.org> wrote:
>>
>> Wes asked me to bring this discussion here.  I'm a developer of PETSc
>> and, with Arrow getting into the sparse representation space, I would
>> like for the two to interoperate as well as possible.
>>
>> 1. Please guarantee support for 64-bit offsets and indices.  The current
>> spec uses "long", which is 32-bit on some common platforms (notably
>> Windows).  Specifying int64_t would guarantee 64-bit support even on
>> LLP64 platforms.
>>
>> Meanwhile, there is a significant performance benefit to using 32-bit
>> indices when sizes allow it.  If using 4-byte floats, a CSR format costs
>> 12 bytes per nonzero with int64, versus 8 bytes per nonzero with int32.
>> Sparse matrix operations are typically dominated by bandwidth, so this
>> is a substantial performance impact.
>>
>> 2. This relates to sorting of indices for CSR.  Many algorithms need to
>> operate on upper vs lower-triangular parts of matrices, which is much
>> easier if the CSR column indices are sorted within rows.  Typically, one
>> finds the diagonal (storing its location in each row, if it exists).
>> Given that the current spec says COO entries are sorted, it would be
>> simple to specify this also for CSR.
>>
>>   https://github.com/apache/arrow/blob/master/format/SparseTensor.fbs
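As an illustration of why sorted column indices help (a sketch with
made-up names, not an Arrow or PETSc API): with indices sorted within
each row, the diagonal entry of row r can be located by binary search
over that row's slice of the column-index array.

```c
#include <stdint.h>

/* Return the position of the diagonal entry of row r in a CSR matrix
   whose column indices are sorted within each row, or -1 if the row
   stores no diagonal.  rowptr has nrows+1 entries; colidx[rowptr[r]
   .. rowptr[r+1]) holds the sorted column indices of row r. */
static int64_t csr_find_diagonal(const int64_t *rowptr,
                                 const int32_t *colidx, int64_t r) {
  int64_t lo = rowptr[r], hi = rowptr[r + 1];
  while (lo < hi) {
    int64_t mid = lo + (hi - lo) / 2;
    if (colidx[mid] < (int32_t)r)      lo = mid + 1;
    else if (colidx[mid] > (int32_t)r) hi = mid;
    else return mid; /* found the diagonal */
  }
  return -1; /* row has no stored diagonal */
}
```

Without the sorting guarantee, this becomes a linear scan per row, and
splitting a row into lower/diagonal/upper parts requires an extra pass.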
>>
>>
>> 3. This is more nebulous and relates to partitioned representations of
>> matrices, such as arise when using distributed memory or when optimizing
>> for locality when using threads.  Some packages store "global" indices
>> in the CSR representation (in which case you can have column indices
>> that are larger than the specified sizes) -- I discourage this except as
>> intermediate representations in matrix-matrix computations.  Others
>> (including PETSc) store local matrices plus a "local-to-global"
>> translation of local indices to the distributed setting.  This may be a
>> no-op for Arrow (no support, but don't prevent), but I'm curious whether
>> Arrow has a philosophy regarding collective use of distributed memory
>> (e.g., via MPI).
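For readers unfamiliar with the local-to-global scheme, a minimal
sketch (illustrative only, not the PETSc interface): each rank stores a
local matrix with small local column indices plus a translation table
mapping them to global column numbers.

```c
#include <stdint.h>

/* Translate n local column indices to global indices using the
   rank-local l2g table.  Local indices stay small (and can be 32-bit)
   even when the global problem needs 64-bit indexing. */
static void local_to_global(const int64_t *l2g, const int32_t *local,
                            int64_t *global, int64_t n) {
  for (int64_t i = 0; i < n; i++)
    global[i] = l2g[local[i]];
}
```

The point is that the stored matrix never holds column indices larger
than the local size; the l2g table carries the distributed-memory
context.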
>>
>>
>> Finally, there are many other matrix formats to at least be aware of.
>> These include blocked CSR (fewer indices to store), sliced Ellpack-style
>> formats (better vectorization for some sparse matrix operations), and
>> compressed row offsets for CSR (good if many rows are empty).
>> Conventions regarding diagonals and how to express parallel matrices
>> vary by package, so using a different package often requires some
>> conversion (which can sometimes be done in place).  PETSc has
>> interfaces to many such packages, so we have many examples of such
>> conversions.
>>
>> Thanks for reading.  I'm happy to discuss further.
