Re: Problem of dimensions

Anand Avati Mon, 14 Jul 2014 11:06:13 -0700

On Mon, Jul 14, 2014 at 10:58 AM, Ted Dunning <[email protected]> wrote:


> On Mon, Jul 14, 2014 at 9:47 AM, Pat Ferrel <[email protected]> wrote:
>
> > BTW that requires that drm.nrow be mutable. It is defined as immutable
> > in the DSL, so this will require a change to several traits. I’ve done this
> > but am still trying to decide on the cleanest way to do it.
>
>
> Hmmm.... immutability has lots of virtues.  And changing nrows is just the
> tip of the iceberg.  You also have to shuffle the rows to match the row
> partitioning between the two matrices.
>
> Or it requires more than one pass through the data.  Since you have to read
> both matrices before you can deal with either, and since one matrix is
> likely to be shuffled relative to the other, might it just be better to
> either do two read passes or pay the cost to shuffle the matrices after
> getting a consensus view?  Note that the second read pass will have to do a
> shuffle anyway, so the only saving from doing two passes is reduced
> memory usage.
>
> *Anand,*
>
> I think I remember you were addressing a shuffle problem in some of your
> earlier work.  What did you conclude?
>

I think the larger question is: what does it mean to make drm.nrow mutable?
If it is changed to a smaller value, which rows do you "sacrifice"? Why not
just do a RowRange operation to get a new DRM with fewer rows (instead of
mutating the given drm)? After that, if you care specifically about
partitioning, the Par operator can shuffle the data for you.
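
To make the idea concrete, here is a rough sketch as a toy model in plain
Python. It is not the Mahout DSL: ToyDrm, row_range and par are hypothetical
names, and par here only regroups rows into contiguous chunks by key. The
point is just that both operations return a new matrix and leave the
original, including its nrow, untouched.

```python
from dataclasses import dataclass
from typing import Tuple

# A row is (integer row key, tuple of values). Hypothetical toy layout.
Row = Tuple[int, Tuple[float, ...]]


@dataclass(frozen=True)
class ToyDrm:
    """Toy stand-in for a distributed row matrix (NOT the Mahout API).

    The instance is immutable; rows are grouped into partitions.
    """
    partitions: Tuple[Tuple[Row, ...], ...]

    @property
    def nrow(self) -> int:
        # Derived from the data, so there is nothing to mutate.
        return sum(len(part) for part in self.partitions)

    def row_range(self, start: int, stop: int) -> "ToyDrm":
        """Return a NEW matrix with the rows whose key is in [start, stop)."""
        kept = tuple(
            tuple(row for row in part if start <= row[0] < stop)
            for part in self.partitions
        )
        return ToyDrm(kept)

    def par(self, num_partitions: int) -> "ToyDrm":
        """Return a NEW matrix with rows regrouped into num_partitions chunks."""
        if num_partitions < 1:
            raise ValueError("num_partitions must be at least 1")
        rows = sorted(row for part in self.partitions for row in part)
        # Ceiling division; at least 1 so an empty matrix still works.
        size = max(1, -(-len(rows) // num_partitions))
        chunks = tuple(
            tuple(rows[i * size:(i + 1) * size])
            for i in range(num_partitions)
        )
        return ToyDrm(chunks)


# Six rows in two partitions of three.
a = ToyDrm((
    ((0, (1.0,)), (1, (2.0,)), (2, (3.0,))),
    ((3, (4.0,)), (4, (5.0,)), (5, (6.0,))),
))
b = a.row_range(0, 4)   # new matrix with fewer rows; a is unchanged
c = b.par(2)            # new matrix with the remaining rows rebalanced
```

After row_range the partitions are uneven (three rows and one row in this
example), which is why a separate repartitioning step is useful if the
layout matters for a later operation.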
