andishgar commented on issue #47520: URL: https://github.com/apache/arrow/issues/47520#issuecomment-3271622837
I think the current API for converting `SparseCooTensor` to `Tensor` is okay; however, the implementation forgot to address two things: 1-`SparseCooIndex` can be stored in either column-major or row-major order, and this can be determined from `SparseCooIndex` itself since it is essentially a tensor. However, the current implementation assumes that `SparseCooIndex` is always row-major. https://github.com/apache/arrow/blob/982d31f35fd2cfe87494698dae9ef67d3333658c/cpp/src/arrow/tensor/coo_converter.cc#L316-L321 2-These coordinates should be multiplied by the appropriate column- or row-major strides to determine the data locations in the final matrix. We also need to maintain a variable to track this stride and apply it to the result tensor. However, the current implementation assumes that the final stride is always row-major. https://github.com/apache/arrow/blob/982d31f35fd2cfe87494698dae9ef67d3333658c/cpp/src/arrow/tensor/coo_converter.cc#L308 https://github.com/apache/arrow/blob/982d31f35fd2cfe87494698dae9ef67d3333658c/cpp/src/arrow/tensor/coo_converter.cc#L319 https://github.com/apache/arrow/blob/982d31f35fd2cfe87494698dae9ef67d3333658c/cpp/src/arrow/tensor/coo_converter.cc#L327-L329 Let me investigate other sparse matrix formats (CSC, CSR, CSF) to see if they have the same logical error as case 2. Then I’ll give my final thoughts. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
