Re: [math] RealMatrixFormat.parse()

Ole Ersoy Thu, 31 Dec 2015 16:38:08 -0800


On 12/31/2015 05:42 PM, Gilles wrote:

On Thu, 31 Dec 2015 12:54:00 -0600, Ole Ersoy wrote:

On 12/31/2015 11:10 AM, Gilles wrote:

On Wed, 30 Dec 2015 21:33:56 -0600, Ole Ersoy wrote:

Hi,

In RealMatrixFormat.parse() MatrixUtils makes the decision on what
type of RealMatrix instance to return.


Ideally, this is correct as the actual type is an "implementation detail".

Flexibility is gained if it
just returns double[][] letting the caller decide what type of
RealMatrix instance to create.


That could become a problem e.g. for sparse matrices where the persistent
format and the instance type could be optimized for space, but a "double[][]"
cannot be.

RealMatrixFormat.parse() first creates a double[][] and then it drops
it into the Matrix wrapper it thinks is best, per MatrixUtils. By
leaving out the last step, the caller can use MatrixUtils (or
hopefully MatrixFactory) to perform the next step. Or maybe there is
no next step.  Perhaps just having a double[][] is fine.
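
A minimal sketch of the idea above, assuming the default "{{1,2},{3,4}}" text format and ignoring locale handling.  The class name MatrixParseSketch and its parse() method are hypothetical and not part of Commons Math; the point is only that parsing stops at double[][].

```java
// Hypothetical sketch: illustrates "parse to double[][] and stop".
// This is not the RealMatrixFormat implementation.
final class MatrixParseSketch {

    private MatrixParseSketch() {}

    /** Parses text such as "{{1,2},{3,4}}" into a rectangular double[][]. */
    static double[][] parse(String source) {
        // Whitespace is not significant in this simplified format.
        String s = source.replaceAll("\\s+", "");
        if (s.length() < 4 || !s.startsWith("{{") || !s.endsWith("}}")) {
            throw new IllegalArgumentException("Expected {{a,b},{c,d}}: " + source);
        }
        // Strip the outer "{{" and "}}", then split the rows on "},{".
        String[] rowTexts = s.substring(2, s.length() - 2).split("\\},\\{");
        double[][] data = new double[rowTexts.length][];
        for (int i = 0; i < rowTexts.length; i++) {
            String[] cells = rowTexts[i].split(",");
            data[i] = new double[cells.length];
            for (int j = 0; j < cells.length; j++) {
                data[i][j] = Double.parseDouble(cells[j]);
            }
            // Reject ragged input so callers always get a rectangular array.
            if (data[i].length != data[0].length) {
                throw new IllegalArgumentException("Ragged row " + i);
            }
        }
        return data;
    }
}
```

The caller then picks the wrapper, for example new Array2DRowRealMatrix(data), new BlockRealMatrix(data), or MatrixUtils.createRealMatrix(data), or simply keeps the raw array.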


My opinion is that this code should be in a separate IO module,
where the external format can be made more flexible and more
correct (such as not doing unnecessary allocation).

Totally with you on that.  Ideally something along the lines of MatrixPersist 
and MatrixParse classes that support localized formatting.  Right now it's all 
bundled up into RealMatrixFormat...probably due to time constraints.  I'll look 
at modularizing that part later.  Right now I'm breaking up MatrixUtils into 
MatrixFactory and LinearExceptionFactory, and then once the dust settles I can 
look at the IO piece in more detail.

It's also better for modularity, as it
reduces RealMatrixFormat imports (MatrixUtils supports Field
matrices as well, and I'm attempting to separate real and field
matrices into two different modules).


For modularity, IO should not be in the same module as the core
algorithms.

I agree in general.  I'm sticking all the 'Real' (Excluding Field)
classes in one module (Vector and Matrix).  AbstractRealMatrix uses
RealMatrixFormat, so it's tightly coupled ATM and it seems like it
belongs with the real Vector and Matrix classes so...


Given the major refactoring which you are attempting, why not drop
everything that does not belong?

Good point.  I'll just strip out the formatting, etc. from AbstractRealMatrix 
and reintroduce it in the IO module.

Also just curious if Array2DRowRealMatrix is worth keeping?  It seems
like the performance of BlockRealMatrix might be just as good or
better regardless of matrix size ... although my testing is limited.


I recall having performed a benchmark years ago and IIRC, the
"BlockRealMatrix" started to be more efficient only for very large matrix
sizes (although I don't remember which).

That was what I was seeing as well.  Once matrix rows reach 100K - 10
million, performance goes up between 2X and 5X, but I did not really
see any difference in performance (multiplication only) for small
data sets.  So I'm assuming, like Luc indicated, that the
Array2DRowRealMatrix is only better when attempting to reuse the
underlying double[][] matrix a lot...


As I recall, for "small" matrices, the "Block" version was significantly
slower. Depends what we call "large" and "small"...

Hmm - That probably makes sense since Block has to create the block structure.  
I'll have a second look once I get a good profiling setup added to the module.
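
To make the "block structure" cost concrete, here is a rough sketch of copying a double[][] into square tiles, each stored as a flat row-major array.  BlockLayoutSketch, toBlocks() and the blockSize parameter are hypothetical names for illustration only, and this is not the BlockRealMatrix code.  Every element gets copied once before any arithmetic happens, which is one plausible source of overhead for small matrices.

```java
// Hypothetical sketch of a blocked layout; not the Commons Math implementation.
final class BlockLayoutSketch {

    private BlockLayoutSketch() {}

    /** Copies a rectangular array into row-major tiles of side blockSize. */
    static double[][] toBlocks(double[][] data, int blockSize) {
        int rows = data.length;
        int cols = data[0].length;
        // Number of tiles in each direction, rounding up for partial tiles.
        int blockRows = (rows + blockSize - 1) / blockSize;
        int blockCols = (cols + blockSize - 1) / blockSize;
        double[][] blocks = new double[blockRows * blockCols][];
        for (int bi = 0; bi < blockRows; bi++) {
            int rowStart = bi * blockSize;
            int height = Math.min(blockSize, rows - rowStart);
            for (int bj = 0; bj < blockCols; bj++) {
                int colStart = bj * blockSize;
                int width = Math.min(blockSize, cols - colStart);
                // Each tile is a flat array holding height * width entries.
                double[] block = new double[height * width];
                for (int r = 0; r < height; r++) {
                    System.arraycopy(data[rowStart + r], colStart,
                                     block, r * width, width);
                }
                blocks[bi * blockCols + bj] = block;
            }
        }
        return blocks;
    }
}
```

Timing toBlocks() against a plain reference to the original array, for a range of sizes, would be one way to see where the setup cost stops mattering.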

HAPPY NEW YEAR!!

Ole

