There IS no issue with nrow being a lazy val. I never touch it; read below.

Creating a new matrix val is fine as long as it doesn’t cause a new RDD to be
created; I’ll look into that.
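
Roughly what I have in mind, as a sketch only (the CheckpointedDrmSpark
constructor parameters below are from memory and may not match the actual
signature, so treat the names and arity as assumptions):

    import org.apache.mahout.sparkbindings.drm.CheckpointedDrmSpark

    // Sketch: wrap the SAME underlying rdd in a new CheckpointedDrmSpark with
    // the row cardinality bumped by n. No new RDD and no Spark job; the added
    // rows are implicit zeros in a sparse DRM.
    def addEmptyRows(drm: CheckpointedDrmSpark[Int], n: Long): CheckpointedDrmSpark[Int] =
      new CheckpointedDrmSpark(drm.rdd, drm.nrow + n, drm.ncol)  // arg names/order assumed

If the constructor really does just hold a reference to the existing rdd then
nothing new gets created, which is the part I said I’d check.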

rbind, as I read it, requires me to construct the rows to be added. I don’t
know what their keys are and I don’t want to calculate them. If I’m right about
how the math works, the actual rows are never needed. That makes rbind a much
heavier-weight operation than just changing the row cardinality; it is meant
for the other cases, where you are adding real vectors.
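
For contrast, here is roughly what the rbind route would force, as I read it.
This is an untested sketch: sc is the SparkContext, n is the number of rows to
add, and I union RDDs directly since the rbind operator is only proposed at
this point.

    import org.apache.mahout.math.{RandomAccessSparseVector, Vector}

    // Invent sequential keys and materialize real (if empty) row vectors just
    // to have something to bind on, then union them into a brand-new RDD.
    val emptyRows = sc.parallelize(drm.nrow until drm.nrow + n)
      .map(i => (i.toInt, new RandomAccessSparseVector(drm.ncol): Vector))
    val paddedRdd = drm.rdd.union(emptyRows)  // heavier than just bumping _nrow

That unioned RDD is exactly the work I’m trying to avoid when the math never
reads the new rows.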

I’ll look deeper now that cross-cooccurrence seems to be fixed.

On Jul 15, 2014, at 7:40 PM, Ted Dunning <[email protected]> wrote:

The rbind approach also gives a new object and avoids all questions of lazy
evaluation.



On Tue, Jul 15, 2014 at 1:04 PM, Anand Avati <[email protected]> wrote:

> 
> 
> 
> On Tue, Jul 15, 2014 at 12:45 PM, Pat Ferrel <[email protected]> wrote:
> 
>> I appreciate the thoughts.
>> 
>> I don’t change nrow; it is still a lazy val. I change _nrow, which is a
>> var used to calculate nrow when it is needed. The only thing run on them is
>> the CheckpointedDrmSpark constructor. The class exists to guarantee the drm
>> is pinned down, and _nrow is changed after construction but before any math
>> is done on it. Changing _nrow may be safe on a CheckpointedDrmSpark, but
>> that open question is exactly why I’ll put it up as a PR.
>> 
>> BTW I was thinking of calling the method CheckpointedDrmSpark#addEmptyRows.
>> Since the matrix is sparse, it will just change _nrow; the name flags the
>> purpose of the method, and it sidesteps the question of reducing the number
>> of rows.
> 
> 
> 
> I would prefer a new rbind() operator instead of an addEmptyRows() method.
> It just feels more consistent.
> 
> Thanks
> 
