> FOR:
>
> It can be hard for an implementation to know the exact number of
> non-default elements.

I can't quite see that. Can you elaborate please?


> There is an analogy with the size() method.
>
> It is what it was and users may already use it.

For me that's the strongest argument. I haven't thought of this at all!
I'm just not used to code that people not comitting to the same repository
like me might be using. So I assumed: if it's not used within Mahout, it's
not used at all. ;-)

I could add a new method, maybe getNumNonZeroElements(), after all that's
what non-default means. Unless the matrix is dense ... Hm, maybe that name
is not that great ...


> AGAINST:
>
> The size() method was silly to begin with.  rowSize() and columnSize() are
> far more useful.  Returning an array
> implies a copy which means that there is lots of allocation going on or it
> means returning a reference to internal
> state which is worse.

True. So let's deprecate size() then?


> The worst cost is probably the row sparse representations.  NumNonDefault
> would have to iterate over all rows calling numNonDefault on each row.

This is how the current getNumNondefaultElements() works. It gets the
number of rows as rows.size() (where rows is a hashmap) and then iterates
over the row vectors calling vectorEntry.getNumNondefaultElements() each
time to find the maximum. To get the actual number of non-default elements
I don't look for the maximum but calculate the sum--pretty much the same
cost I guess.


> That isn't soooo bad, especially if somebody is about to iterate over all
> those elements.

There's currently no way to iterate over the non-default elements, this I
will have to implement as well.


I need this for the MatrixWritable implementation. When writing/reading a
sparse matrix, I have to put the number of elements to follow into the
stream first.

This morning I found a DenseMatrixWritable class that is able to read and
write dense matrices. However, it's not related to MatrixWritable, but
only extends DenseMatrix and implements Writable itself. I think it's best
to delete this class. MatrixWritable should be able to read and write any
Matrix implementation with no need for a special class for each type. But
then again, maybe people are using it already ...


> SUMMARY:
>
> I wouldn't mind a change, but others should speak up if they care about
> the current behavior.

I will implement a new method, finish MatrixWritable, and submit a patch.
If that isn't alright, it can still be changed later.


Cheers,

Alex


Reply via email to