[ https://issues.apache.org/jira/browse/MATH-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653805#action_12653805 ]
Ismael Juma commented on MATH-230:
----------------------------------
Or we could just add the appropriate specialised Map to commons-math. Are there
objections to that? I could provide one; it's relatively easy to do. I think
the best approach might even be one not mentioned above: use a Map.Entry object
with two ints and a double. This has a few advantages:
(1) We use chaining instead of open addressing, so we can use a higher load
factor with good performance (e.g. 0.75 instead of 0.50). This should mitigate
the additional memory used by the Map.Entry object.
(2) We use a single array of Map.Entry objects instead of two arrays (one for
keys and one for values). In performance tests of Scala HashMap
implementations (which compile to bytecode very similar to what Java
generates), using two arrays performed quite a bit worse (although the test
used Strings instead of primitives for the values) [1].
(3) Gets don't require any boxing if we provide a get(int, int) method (this is
the same as using open addressing with primitives only though).
A rough sample of the implementation outline:
class IntPairToDoubleHashMap {
    static class Entry {
        final int firstKey;
        final int secondKey;
        double value;
    }
    int size;
    Entry[] entries;
    double get(int firstKey, int secondKey)
    double put(int firstKey, int secondKey, double value)
}
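To make the outline above a little more concrete, here is a minimal sketch of how the chained version could look. It is only an illustration, not a proposed patch: the `next` link in Entry, the constructor, the initial capacity of 16, the hash mixing, the `size()` accessor, and returning 0.0 for a missing key are all assumptions that the outline does not specify.

```java
/** Sketch: chained hash map from (int, int) keys to double values. */
class IntPairToDoubleHashMap {
    /** One node in a bucket's singly linked chain. */
    static final class Entry {
        final int firstKey;
        final int secondKey;
        double value;
        Entry next; // assumption: chaining link, not part of the outline

        Entry(int firstKey, int secondKey, double value, Entry next) {
            this.firstKey = firstKey;
            this.secondKey = secondKey;
            this.value = value;
            this.next = next;
        }
    }

    private static final float LOAD_FACTOR = 0.75f; // as suggested in point (1)
    private int size;
    private Entry[] entries = new Entry[16]; // assumption: length stays a power of two

    /** Assumed hash: combine both keys, then fold the high bits into the low bits. */
    private static int hash(int firstKey, int secondKey) {
        int h = 31 * firstKey + secondKey;
        return h ^ (h >>> 16);
    }

    int size() {
        return size;
    }

    /** Returns the stored value, or 0.0 if the key pair is absent (no boxing). */
    double get(int firstKey, int secondKey) {
        int index = hash(firstKey, secondKey) & (entries.length - 1);
        for (Entry e = entries[index]; e != null; e = e.next) {
            if (e.firstKey == firstKey && e.secondKey == secondKey) {
                return e.value;
            }
        }
        return 0.0;
    }

    /** Stores the value and returns the previous one, or 0.0 if there was none. */
    double put(int firstKey, int secondKey, double value) {
        int index = hash(firstKey, secondKey) & (entries.length - 1);
        for (Entry e = entries[index]; e != null; e = e.next) {
            if (e.firstKey == firstKey && e.secondKey == secondKey) {
                double old = e.value;
                e.value = value;
                return old;
            }
        }
        // New key: prepend to the bucket's chain, then grow if over the load factor.
        entries[index] = new Entry(firstKey, secondKey, value, entries[index]);
        if (++size > LOAD_FACTOR * entries.length) {
            resize();
        }
        return 0.0;
    }

    /** Doubles the table and relinks the existing Entry objects into it. */
    private void resize() {
        Entry[] old = entries;
        Entry[] bigger = new Entry[old.length * 2];
        for (Entry head : old) {
            for (Entry e = head; e != null; ) {
                Entry next = e.next;
                int index = hash(e.firstKey, e.secondKey) & (bigger.length - 1);
                e.next = bigger[index];
                bigger[index] = e;
                e = next;
            }
        }
        entries = bigger;
    }
}
```

Returning 0.0 for a missing key is convenient for a sparse matrix, where an absent entry means zero, but it also means get() alone cannot distinguish "absent" from "stored 0.0"; a containsKey(int, int) method could be added if that distinction matters.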
This would be an internal map used by SparseRealMatrixImpl. I think
HotSpot would be able to inline calls to it, but we could also just put the
relevant code directly in SparseRealMatrixImpl. I think it's important to
provide an implementation that has the right performance characteristics by
default.
What do you think?
[1] http://www.nabble.com/-scala--Open-versus-Chained-HashMap-td19254845.html
> Implement Sparse Matrix Support
> -------------------------------
>
> Key: MATH-230
> URL: https://issues.apache.org/jira/browse/MATH-230
> Project: Commons Math
> Issue Type: Improvement
> Affects Versions: 2.0
> Environment: N/A
> Reporter: Sujit Pal
> Assignee: Luc Maisonobe
> Priority: Minor
> Fix For: 2.0
>
> Attachments: patch.txt, RealMatrixImplPerformanceTest.java,
> SparseRealMatrixImpl.java, SparseRealMatrixImplTest.java
>
>
> I needed a way to deal with large sparse matrices using commons-math
> RealMatrix, so I implemented it. The SparseRealMatrixImpl is a subclass of
> RealMatrixImpl, and the backing data structure is a Map<Point,Double>, where
> Point is a struct-like inner class which exposes two int parameters row and
> column. I had to make some changes to the existing components to keep the
> code for SparseRealMatrixImpl clean. Here are the details.
> 1) RealMatrix.java:
> - added a new method setEntry(int, int, double) to set data into a matrix
> 2) RealMatrixImpl.java:
> - changed all internal calls to data[i][j] to getEntry(i,j).
> - for some methods such as add(), subtract(), premultiply(), etc, there
> was code that checked for ClassCastException and had two versions,
> one for a generic RealMatrix and one for a RealMatrixImpl. This has
> been changed to have only one that operates on a RealMatrix. The
> result is something like auto-type casting. So if:
> RealMatrixImpl.add(RealMatrix) returns a RealMatrixImpl
> SparseRealMatrixImpl.add(RealMatrix) returns a SparseRealMatrixImpl
> 3) SparseRealMatrixImpl added as a subclass of RealMatrixImpl.
> 4) LUDecompositionImpl changed to use a clone of the passed in RealMatrix
> instead of its data[][] block, and now it uses clone.getEntry(row,col)
> calls instead of data[row][col] calls.
> 5) LUDecompositionImpl returned RealMatrixImpl for getL(), getU(), getP()
> and solve(). It now returns the same RealMatrix impl that is passed
> in through its constructor for these methods.
> 6) New test for SparseRealMatrixImpl, mimics the tests in RealMatrixImplTest,
> but using SparseRealMatrixImpl.
> 7) New static method to create SparseRealMatrixImpl out of a double[][] in
> MatrixUtils.createSparseRealMatrix().
> 8) Verified that all JUnit tests pass.