Hi Jed,

Would you like to submit a pull request to propose the changes or additions you are describing?
Thanks
Wes

On Sat, Mar 9, 2019 at 11:32 PM Jed Brown <j...@jedbrown.org> wrote:
>
> Wes asked me to bring this discussion here. I'm a developer of PETSc
> and, with Arrow getting into the sparse representation space, would
> like for it to interoperate as well as possible.
>
> 1. Please guarantee support for 64-bit offsets and indices. The current
> spec uses "long", which is 32-bit on some common platforms (notably
> Windows). Specifying int64_t would bring that size support to LLP64
> architectures.
>
> Meanwhile, there is a significant performance benefit to using 32-bit
> indices when sizes allow it. With 4-byte floats, a CSR format costs
> 12 bytes per nonzero with int64 indices, versus 8 bytes per nonzero
> with int32. Sparse matrix operations are typically dominated by memory
> bandwidth, so this is a substantial performance impact.
>
> 2. This relates to sorting of indices for CSR. Many algorithms need to
> operate on the upper or lower triangular part of a matrix, which is much
> easier if the CSR column indices are sorted within each row. Typically,
> one finds the diagonal (storing its location in each row, if it exists).
> Given that the current spec says COO entries are sorted, it would be
> simple to specify this also for CSR.
>
> https://github.com/apache/arrow/blob/master/format/SparseTensor.fbs
>
> 3. This is more nebulous and relates to partitioned representations of
> matrices, such as arise when using distributed memory or when optimizing
> for locality when using threads. Some packages store "global" indices
> in the CSR representation (in which case column indices can be larger
> than the specified sizes) -- I discourage this except as an intermediate
> representation in matrix-matrix computations. Others (including PETSc)
> store local matrices plus a "local-to-global" translation of local
> indices to the distributed setting. This may be a no-op for Arrow (no
> support, but don't prevent it), but I'm curious whether Arrow has a
> philosophy regarding collective use of distributed memory (e.g., via MPI).
>
> Finally, there are many other matrix formats to at least be aware of.
> These include blocked CSR (fewer indices to store), sliced Ellpack-style
> formats (better vectorization for some sparse matrix operations), and
> compressed row offsets for CSR (good when many rows are empty).
> Conventions regarding diagonals and how to express parallel matrices
> vary by package, and moving between packages often requires some
> conversion (which can sometimes be done in place). PETSc has interfaces
> to many such packages, so we have many examples of such conversions.
>
> Thanks for reading. I'm happy to discuss further.
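
To make the per-nonzero arithmetic in point 1 concrete, here is a minimal sketch (not from the original mail) of the CSR storage cost with 4-byte float values, ignoring the row-offsets array, which amortizes to roughly zero for matrices with many nonzeros per row:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Approximate bytes per nonzero for CSR with float values: one value plus
// one column index. The row-offsets array is not counted here.
template <typename IndexType>
constexpr std::size_t csr_bytes_per_nonzero() {
  return sizeof(float) + sizeof(IndexType);
}

int main() {
  std::printf("int32 indices: %zu bytes/nonzero\n",
              csr_bytes_per_nonzero<std::int32_t>());  // 8
  std::printf("int64 indices: %zu bytes/nonzero\n",
              csr_bytes_per_nonzero<std::int64_t>());  // 12
}
```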
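
For point 2, a small sketch of why sorted column indices help: with the indices in each row sorted, the diagonal (and hence the upper/lower triangular split) can be found by binary search. The type and function names here are illustrative, not part of the Arrow spec or PETSc:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Illustrative CSR container; field names are hypothetical.
struct CsrMatrix {
  std::vector<std::int64_t> row_offsets;  // length nrows + 1
  std::vector<std::int64_t> col_indices;  // sorted within each row
  std::vector<double> values;
};

// Return the position of the diagonal entry of `row` in col_indices/values,
// or -1 if the row stores no diagonal. Requires sorted column indices.
std::int64_t find_diag(const CsrMatrix& A, std::int64_t row) {
  auto begin = A.col_indices.begin() + A.row_offsets[row];
  auto end   = A.col_indices.begin() + A.row_offsets[row + 1];
  auto it = std::lower_bound(begin, end, row);
  if (it != end && *it == row) return it - A.col_indices.begin();
  return -1;
}
```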
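
And for point 3, a rough sketch of the "local matrix plus local-to-global map" idea; the struct and names are made up for illustration and are not actual PETSc or Arrow APIs:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical partitioned matrix: each rank or thread owns a local CSR
// block whose column indices are local, plus a map translating local
// column indices to global ones. Note the local indices can stay 32-bit
// even when the global problem needs 64-bit, which ties back to point 1.
struct PartitionedCsr {
  std::vector<std::int64_t> row_offsets;      // local rows + 1
  std::vector<std::int32_t> col_indices;      // local column indices
  std::vector<double> values;
  std::vector<std::int64_t> local_to_global;  // local column -> global column
};

// Translate the stored local column index of nonzero `nz` to its global column.
inline std::int64_t global_col(const PartitionedCsr& A, std::size_t nz) {
  return A.local_to_global[A.col_indices[nz]];
}
```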