BACtaki commented on PR #1650: URL: https://github.com/apache/systemds/pull/1650#issuecomment-1201640792
Thanks for your review @phaniarnab! Comments inline:

> Are the optimizations only improve the naive cases where the input dimension is less than 1024 for the given direction (row/col)?

The cache-conscious iteration optimization applies only to the naive counting case, where `nnz < minimumSize = 1024`. Aside from this, I also refactored `MatrixSketch` and `KMVSketch` to replace the `getScalarValue() -> Integer` method with the `getValue() -> MatrixBlock` method, which unifies the handling of the RowCol, Row, and Col cases (as per @Baunsgaard's suggestion from an earlier PR).

> I see that you are now iterating the dense and sparse inputs in a more cache-conscious manner (reducing CPU cache misses). Are there any other optimizations you are employing (e.g. reducing the number of intermediates)?

Really good point re intermediates. We do create an intermediate `SmallestPriorityQueue` of max size `k = 64` per row/col. The way we construct the intermediates places an upper bound on the additional memory:

1. RowCol: max 64 double values (64 bits each)
2. Row: `SmallestPriorityQueue spq` is reset between iterations, so the intermediate holds at most 64 double values
3. Col: identical to Row, but the iteration occurs in column-major order and `spq` is reset between consecutive columns

There are no further optimizations beyond cache-conscious iteration in the CP case.
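
To illustrate the memory bound described above, here is a minimal sketch of the Row case, not the code in this PR. It assumes a plain `java.util.PriorityQueue` used as a max-heap in place of `SmallestPriorityQueue`, keeps the raw values rather than anything a KMV sketch would derive from them, and uses a hypothetical class and method name (`KSmallestSketch`, `kSmallestPerRow`). One queue is allocated once and cleared between rows, so the intermediate never holds more than `k` values at a time.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.PriorityQueue;

// Hypothetical illustration only; this class does not exist in SystemDS.
final class KSmallestSketch {

    // Returns, for each row, its (at most) k smallest values in ascending order.
    static double[][] kSmallestPerRow(double[][] data, int k) {
        if (k < 1)
            throw new IllegalArgumentException("k must be >= 1");

        // Max-heap: the largest retained value sits at the head, so it can be
        // evicted cheaply when a smaller value arrives.
        PriorityQueue<Double> heap =
            new PriorityQueue<Double>(k, Collections.<Double>reverseOrder());

        double[][] out = new double[data.length][];
        for (int r = 0; r < data.length; r++) {
            heap.clear(); // reset between rows: one bounded intermediate is reused

            // Row-major scan: consecutive elements of data[r] are adjacent in memory.
            for (double v : data[r]) {
                if (heap.size() < k) {
                    heap.add(v);
                } else if (v < heap.peek()) {
                    heap.poll(); // drop the current largest of the k retained values
                    heap.add(v);
                }
            }

            // Copy out the retained values; heap iteration order is unspecified,
            // so sort the (at most k) values before returning them.
            double[] row = new double[heap.size()];
            int i = 0;
            for (double v : heap)
                row[i++] = v;
            Arrays.sort(row);
            out[r] = row;
        }
        return out;
    }
}
```

Calling `KSmallestSketch.kSmallestPerRow(matrix, 64)` on a `double[][]` would mirror the `k = 64` setting mentioned above. The Col case would follow the same pattern with the loops arranged so that one column is consumed per queue reset.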