Thanks. I'm new to the Arrow community, so I was hoping to get feedback on whether any of these are controversial or subject to constraints that I'm likely not familiar with. Point 2 is probably the simplest, so I can start with that.
Point 3 isn't coherent as a PR concept, but it describes a potential audience whose relationship to Arrow I don't understand (it could be an explicit non-goal for all I know).

Wes McKinney <wesmck...@gmail.com> writes:

> hi Jed,
>
> Would you like to submit a pull request to propose the changes or
> additions you are describing?
>
> Thanks
> Wes
>
> On Sat, Mar 9, 2019 at 11:32 PM Jed Brown <j...@jedbrown.org> wrote:
>>
>> Wes asked me to bring this discussion here. I'm a developer of PETSc
>> and, with Arrow getting into the sparse representation space, I would
>> like for it to interoperate as well as possible.
>>
>> 1. Please guarantee support for 64-bit offsets and indices. The
>> current spec uses "long", which is 32-bit on some common platforms
>> (notably Windows). Specifying int64_t would extend that size support
>> to LLP64 architectures.
>>
>> Meanwhile, there is a significant performance benefit to using 32-bit
>> indices when sizes allow it. With 4-byte floats, a CSR format costs
>> 12 bytes per nonzero with int64 indices versus 8 bytes per nonzero
>> with int32. Sparse matrix operations are typically bandwidth-bound,
>> so this is a substantial performance impact.
>>
>> 2. This relates to sorting of indices for CSR. Many algorithms need
>> to operate on the upper- or lower-triangular part of a matrix, which
>> is much easier if the CSR column indices are sorted within each row.
>> Typically, one finds the diagonal (storing its location in each row,
>> if it exists). Given that the current spec says COO entries are
>> sorted, it would be simple to specify this for CSR as well.
>>
>> https://github.com/apache/arrow/blob/master/format/SparseTensor.fbs
>>
>> 3. This is more nebulous and relates to partitioned representations
>> of matrices, such as arise when using distributed memory or when
>> optimizing for locality with threads.
>> Some packages store "global" indices in the CSR representation (in
>> which case column indices can be larger than the specified sizes); I
>> discourage this except as an intermediate representation in
>> matrix-matrix computations. Others (including PETSc) store local
>> matrices plus a "local-to-global" translation of local indices to
>> the distributed setting. This may be a no-op for Arrow (no support,
>> but don't prevent it), but I'm curious whether Arrow has a
>> philosophy regarding collective use of distributed memory (e.g., via
>> MPI).
>>
>> Finally, there are many other matrix formats to at least be aware
>> of. These include blocked CSR (fewer indices to store), sliced
>> Ellpack-style formats (better vectorization for some sparse matrix
>> operations), and compressed row offsets for CSR (good if many rows
>> are empty). Conventions regarding diagonals and ways of expressing
>> parallel matrices vary by package, so using a different package
>> often requires some conversion (which can sometimes be done in
>> place). PETSc has interfaces to many such packages, so we have many
>> examples of such conversions.
>>
>> Thanks for reading. I'm happy to discuss further.