Ismael Juma (JIRA) wrote:
[ https://issues.apache.org/jira/browse/MATH-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653805#action_12653805 ]
Ismael Juma commented on MATH-230:
----------------------------------

Or we could just add the appropriate specialised Map to commons-math. Are there 
objections to that? I could provide one; it's relatively easy to do. I think 
the best approach might even be one not mentioned above: use a Map.Entry-style 
object with two ints and a double. This has a few advantages:

(1) We use chaining instead of open addressing, so we can use a higher load 
factor with good performance (e.g. 0.75 instead of 0.50). This should mitigate 
the additional memory used by the Map.Entry object.

(2) We use a single array of Map.Entry objects instead of two arrays (one for 
keys and one for values). In performance tests of Scala HashMap 
implementations (which compile to bytecode very similar to what Java 
generates), using two arrays seemed to perform quite a bit worse (although 
the test used Strings instead of primitives for the values)[1].

(3) Gets don't require any boxing if we provide a get(int, int) method (this is 
the same as using open addressing with primitives only though).
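
To put rough numbers on point (1), here is a back-of-the-envelope estimate of 
the bytes used per stored element in each layout. Every size in it is an 
assumption (a 64-bit HotSpot JVM with compressed references, array headers 
ignored, both tables filled up to their load factor), and the class name is 
made up for this illustration, so treat it as a sketch rather than a 
measurement.

```java
// Rough per-element memory estimate; all sizes below are assumptions.
class MemoryEstimate {
    // Assumed entry layout: 12-byte object header + two 4-byte ints
    // + 8-byte double + 4-byte "next" reference = 32 bytes.
    static final int ENTRY_OBJECT_BYTES = 32;
    // Assumed size of one reference slot in the bucket array.
    static final int REFERENCE_BYTES = 4;
    // Open addressing: two int keys (8 bytes) plus one double (8 bytes) per slot.
    static final int OPEN_SLOT_BYTES = 16;

    // Chaining: one entry object plus its share of the bucket array.
    static double chainedBytesPerElement(double loadFactor) {
        return ENTRY_OBJECT_BYTES + REFERENCE_BYTES / loadFactor;
    }

    // Open addressing: every slot, used or not, costs a full key/value pair.
    static double openAddressingBytesPerElement(double loadFactor) {
        return OPEN_SLOT_BYTES / loadFactor;
    }
}
```

Under these assumptions, chaining at a load factor of 0.75 comes to roughly 
37 bytes per element against 32 for open addressing at 0.50, so the higher 
load factor narrows the gap without closing it entirely.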

A rough sample of the implementation outline:

class IntPairToDoubleHashMap {
  static class Entry {
    final int firstKey;
    final int secondKey;
    double value;
  }
  int size;
  Entry[] entries;
  double get(int firstKey, int secondKey)
  double put(int firstKey, int secondKey, double value)
}
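
To make the outline above concrete, here is one way it could be filled in. 
This is only a sketch: the "next" field, the initial capacity of 16, the hash 
mixing, the resize policy, and returning 0.0 for a missing key are all 
assumptions added for illustration, not part of the proposal itself.

```java
// Sketch only: a chained hash map from (int, int) keys to double values.
class IntPairToDoubleHashMap {
    static final class Entry {
        final int firstKey;
        final int secondKey;
        double value;
        Entry next; // next entry in the same bucket (assumed, for chaining)

        Entry(int firstKey, int secondKey, double value, Entry next) {
            this.firstKey = firstKey;
            this.secondKey = secondKey;
            this.value = value;
            this.next = next;
        }
    }

    private static final float LOAD_FACTOR = 0.75f;

    int size;
    Entry[] entries = new Entry[16]; // length is kept a power of two

    private static int hash(int firstKey, int secondKey) {
        int h = 31 * firstKey + secondKey;
        return h ^ (h >>> 16); // fold the high bits into the low bits
    }

    private static int indexFor(int hash, int length) {
        return hash & (length - 1); // valid because length is a power of two
    }

    /** Returns the stored value, or 0.0 if the pair is absent. No boxing. */
    double get(int firstKey, int secondKey) {
        int index = indexFor(hash(firstKey, secondKey), entries.length);
        for (Entry e = entries[index]; e != null; e = e.next) {
            if (e.firstKey == firstKey && e.secondKey == secondKey) {
                return e.value;
            }
        }
        return 0.0;
    }

    /** Stores the value and returns the previous one, or 0.0 if none. */
    double put(int firstKey, int secondKey, double value) {
        int index = indexFor(hash(firstKey, secondKey), entries.length);
        for (Entry e = entries[index]; e != null; e = e.next) {
            if (e.firstKey == firstKey && e.secondKey == secondKey) {
                double old = e.value;
                e.value = value;
                return old;
            }
        }
        // New key: prepend to the bucket's chain, then grow if over the load factor.
        entries[index] = new Entry(firstKey, secondKey, value, entries[index]);
        if (++size > LOAD_FACTOR * entries.length) {
            resize(entries.length * 2);
        }
        return 0.0;
    }

    private void resize(int newLength) {
        Entry[] newEntries = new Entry[newLength];
        for (Entry head : entries) {
            for (Entry e = head; e != null; ) {
                Entry next = e.next;
                int index = indexFor(hash(e.firstKey, e.secondKey), newLength);
                e.next = newEntries[index]; // relink the existing entry object
                newEntries[index] = e;
                e = next;
            }
        }
        entries = newEntries;
    }
}
```

Returning 0.0 for a missing key is convenient for a sparse matrix, where an 
absent entry means zero, but it does mean the map cannot distinguish "absent" 
from "stored zero"; a containsKey(int, int) method could be added if that 
matters.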

This would be an internal map for usage by SparseRealMatrixImpl. I think 
HotSpot would be able to inline calls to it, but we could also just have the 
relevant code in SparseRealMatrixImpl. I think it's important to provide an 
implementation that has the right performance characteristics by default.

What do you think?

[1] http://www.nabble.com/-scala--Open-versus-Chained-HashMap-td19254845.html

Implement Sparse Matrix Support
-------------------------------

                Key: MATH-230
                URL: https://issues.apache.org/jira/browse/MATH-230
            Project: Commons Math
         Issue Type: Improvement
   Affects Versions: 2.0
        Environment: N/A
           Reporter: Sujit Pal
           Assignee: Luc Maisonobe
           Priority: Minor
            Fix For: 2.0

        Attachments: patch.txt, RealMatrixImplPerformanceTest.java, 
SparseRealMatrixImpl.java, SparseRealMatrixImplTest.java


I needed a way to deal with large sparse matrices using commons-math RealMatrix, so I 
implemented it. The SparseRealMatrixImpl is a subclass of RealMatrixImpl, and the 
backing data structure is a Map<Point,Double>, where Point is a struct-like 
inner class which exposes two int parameters, row and column. I had to make some 
changes to the existing components to keep the code for SparseRealMatrixImpl clean. 
Here are the details.
1) RealMatrix.java:
   - added a new method setEntry(int, int, double) to set data into a matrix
2) RealMatrixImpl.java:
   - changed all internal calls to data[i][j] to getEntry(i,j).
   - for some methods such as add(), subtract(), premultiply(), etc., there
     was code that checked for ClassCastException and had two versions,
     one for a generic RealMatrix and one for a RealMatrixImpl. This has
     been changed to have only one that operates on a RealMatrix. The
     result is something like automatic type casting, so:
       RealMatrixImpl.add(RealMatrix) returns a RealMatrixImpl
       SparseRealMatrixImpl.add(RealMatrix) returns a SparseRealMatrixImpl
3) SparseRealMatrixImpl added as a subclass of RealMatrixImpl.
4) LUDecompositionImpl changed to use a clone of the passed in RealMatrix
   instead of its data[][] block, and now it uses clone.getEntry(row,col)
   calls instead of data[row][col] calls.
5) LUDecompositionImpl returned RealMatrixImpl for getL(), getU(), getP()
   and solve(). It now returns the same RealMatrix impl that is passed in
   through its constructor for these methods.
6) New test for SparseRealMatrixImpl, mimics the tests in RealMatrixImplTest,
   but using SparseRealMatrixImpl.
7) New static method to create SparseRealMatrixImpl out of a double[][] in
   MatrixUtils.createSparseRealMatrix().
8) Verified that all JUnit tests pass.
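
The idea behind the Map<Point,Double> backing store and item 2 (a single add() 
written against getEntry()/setEntry() that returns a matrix of the receiver's 
type) can be sketched roughly as follows. This is not the attached patch: 
MatrixSketch, AbstractMatrixSketch, DenseSketch, SparseSketch, and 
createSameType() are hypothetical names, the class hierarchy is simplified, 
and bounds and dimension checks are left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the RealMatrix interface.
interface MatrixSketch {
    int rows();
    int cols();
    double getEntry(int row, int col);
    void setEntry(int row, int col, double value);
    MatrixSketch createSameType(int rows, int cols);
    MatrixSketch add(MatrixSketch other);
}

// One add() for every implementation, using only getEntry/setEntry.
abstract class AbstractMatrixSketch implements MatrixSketch {
    public MatrixSketch add(MatrixSketch other) {
        MatrixSketch result = createSameType(rows(), cols());
        for (int i = 0; i < rows(); i++) {
            for (int j = 0; j < cols(); j++) {
                result.setEntry(i, j, getEntry(i, j) + other.getEntry(i, j));
            }
        }
        return result;
    }
}

// Struct-like key holding the row and column of a stored entry.
final class Point {
    final int row;
    final int column;
    Point(int row, int column) { this.row = row; this.column = column; }
    @Override public boolean equals(Object o) {
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return row == p.row && column == p.column;
    }
    @Override public int hashCode() { return 31 * row + column; }
}

class DenseSketch extends AbstractMatrixSketch {
    private final double[][] data;
    DenseSketch(int rows, int cols) { data = new double[rows][cols]; }
    public int rows() { return data.length; }
    public int cols() { return data.length == 0 ? 0 : data[0].length; }
    public double getEntry(int row, int col) { return data[row][col]; }
    public void setEntry(int row, int col, double value) { data[row][col] = value; }
    public MatrixSketch createSameType(int r, int c) { return new DenseSketch(r, c); }
}

class SparseSketch extends AbstractMatrixSketch {
    private final int rows;
    private final int cols;
    private final Map<Point, Double> entries = new HashMap<Point, Double>();
    SparseSketch(int rows, int cols) { this.rows = rows; this.cols = cols; }
    public int rows() { return rows; }
    public int cols() { return cols; }
    public double getEntry(int row, int col) {
        Double v = entries.get(new Point(row, col)); // absent means zero
        return v == null ? 0.0 : v;
    }
    public void setEntry(int row, int col, double value) {
        Point key = new Point(row, col);
        if (value == 0.0) {
            entries.remove(key); // never store zeros
        } else {
            entries.put(key, value);
        }
    }
    public MatrixSketch createSameType(int r, int c) { return new SparseSketch(r, c); }
    int storedEntries() { return entries.size(); }
}
```

With this shape, new SparseSketch(2, 2).add(dense) yields a SparseSketch and 
dense.add(sparse) yields a DenseSketch. Note that every getEntry() and 
setEntry() call on the sparse version allocates a Point and boxes a Double, 
which is the overhead a specialised primitive map would avoid.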

