> FOR: > > It can be hard for an implementation to know the exact number of > non-default elements.
I can't quite see that. Can you elaborate please? > There is an analogy with the size() method. > > It is what it was and users may already use it. For me that's the strongest argument. I haven't thought of this at all! I'm just not used to code that people not comitting to the same repository like me might be using. So I assumed: if it's not used within Mahout, it's not used at all. ;-) I could add a new method, maybe getNumNonZeroElements(), after all that's what non-default means. Unless the matrix is dense ... Hm, maybe that name is not that great ... > AGAINST: > > The size() method was silly to begin with. rowSize() and columnSize() are > far more useful. Returning an array > implies a copy which means that there is lots of allocation going on or it > means returning a reference to internal > state which is worse. True. So let's deprecate size() then? > The worst cost is probably the row sparse representations. NumNonDefault > would have to iterate over all rows calling numNonDefault on each row. This is how the current getNumNondefaultElements() works. It gets the number of rows as rows.size() (where rows is a hashmap) and then iterates over the row vectors calling vectorEntry.getNumNondefaultElements() each time to find the maximum. To get the actual number of non-default elements I don't look for the maximum but calculate the sum--pretty much the same cost I guess. > That isn't soooo bad, especially if somebody is about to iterate over all > those elements. There's currently no way to iterate over the non-default elements, this I will have to implement as well. I need this for the MatrixWritable implementation. When writing/reading a sparse matrix, I have to put the number of elements to follow into the stream first. This morning I found a DenseMatrixWritable class that is able to read and write dense matrices. However, it's not related to MatrixWritable, but only extends DenseMatrix and implements Writable itself. I think it's best to delete this class. MatrixWritable should be able to read and write any Matrix implementation with no need for a special class for each type. But then again, maybe people are using it already ... > SUMMARY: > > I wouldn't mind a change, but others should speak up if they care about > the current behavior. I will implement a new method, finish MatrixWritable, and submit a patch. If that isn't alright, it can still be changed later. Cheers, Alex
