Re: Problem of dimensions

Anand Avati Mon, 14 Jul 2014 12:11:29 -0700

On Mon, Jul 14, 2014 at 11:56 AM, Pat Ferrel <[email protected]> wrote:


> In the application, the number of rows will always be increased, adding
> blank rows. I don’t think shuffle is necessary in this case because there
> is no actual row, no data in the drm; it’s just needed to make the
> cardinality match, and the IDs will take care of data matching. Maybe calling
> it something else is a good idea to emphasize the special case for its
> use. I went over this with Dmitriy and, though I haven’t checked actual
> values on large datasets, it works.
>


Does that mean the cardinality is faked at the logical layer with no
changes at the engine level? Does that mean the physical operators need to
be prepared to handle non-matching matrix multiplication by assuming the
missing rows or columns are 0's? Does that really work with no changes?
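
To make the "missing rows are 0's" assumption concrete, here is a minimal
plain-Scala sketch. It is not Mahout code: ZeroPadSketch, Matrix, padRows and
atb are hypothetical names, and the choice of an a.t-times-b product is an
assumption made only for illustration. It shows that explicitly padding the
shorter operand with zero rows gives the same product as ignoring the
unmatched rows of the longer one.

```scala
// Plain-Scala sketch; ZeroPadSketch, Matrix, padRows and atb are
// illustrative names, not Mahout operators.
object ZeroPadSketch {
  type Matrix = Vector[Vector[Double]]

  // Append all-zero rows until m has targetRows rows.
  def padRows(m: Matrix, targetRows: Int): Matrix =
    m ++ Vector.fill(targetRows - m.length)(Vector.fill(m.head.length)(0.0))

  // a.t times b without materializing the transpose:
  // entry (i, j) is the sum over rows r of a(r)(i) * b(r)(j).
  def atb(a: Matrix, b: Matrix): Matrix = {
    require(a.length == b.length, "row cardinalities must match")
    Vector.tabulate(a.head.length, b.head.length) { (i, j) =>
      a.indices.map(r => a(r)(i) * b(r)(j)).sum
    }
  }

  def main(args: Array[String]): Unit = {
    // a has 2 rows, b has 3 rows: the row cardinalities do not match.
    val a: Matrix = Vector(Vector(1.0, 2.0), Vector(3.0, 4.0))
    val b: Matrix = Vector(Vector(1.0, 0.0), Vector(0.0, 1.0), Vector(7.0, 9.0))

    val padded = atb(padRows(a, b.length), b)  // pad a with one zero row
    val truncated = atb(a, b.take(a.length))   // drop b's unmatched row
    assert(padded == truncated)
    println(padded)
  }
}
```

The extra row of b is multiplied by zeros and contributes nothing. Whether
the engine's physical operators behave this way without explicit padding is
exactly the open question above.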

This sounds like a need to introduce a new R-like rbind() operator. This
way you could fix up row cardinality like:

 drmAnew = drmA rbind drmParallelizeEmpty(extra_rows, drmA.ncol)

You could already do this, though in a twisted way:

 drmAnew = (drmA.t cbind drmParallelizeEmpty(drmA.ncol, extra_rows)).t
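
As a sanity check on the two forms, here is a small plain-Scala sketch on
in-memory vectors. It does not use the Mahout DSL: RbindSketch, Matrix, empty,
transpose, rbind and cbind are hypothetical stand-ins, with empty playing the
role of drmParallelizeEmpty. It only confirms that appending empty rows
directly and doing transpose, cbind, transpose yield the same matrix.

```scala
// Plain-Scala sketch; all names here are illustrative stand-ins,
// not Mahout DSL operators.
object RbindSketch {
  type Matrix = Vector[Vector[Double]]

  // All-zero matrix; stands in for drmParallelizeEmpty(nrow, ncol).
  def empty(nrow: Int, ncol: Int): Matrix =
    Vector.fill(nrow)(Vector.fill(ncol)(0.0))

  // Column count is passed explicitly so a matrix with zero rows keeps its width.
  def transpose(m: Matrix, ncol: Int): Matrix =
    Vector.tabulate(ncol)(j => m.map(row => row(j)))

  // Stack b underneath a; both are expected to have the same number of columns.
  def rbind(a: Matrix, b: Matrix): Matrix = a ++ b

  // Place b to the right of a; both must have the same number of rows.
  def cbind(a: Matrix, b: Matrix): Matrix = {
    require(a.length == b.length, "row counts must match")
    a.zip(b).map { case (ra, rb) => ra ++ rb }
  }

  def main(args: Array[String]): Unit = {
    val a: Matrix = Vector(Vector(1.0, 2.0), Vector(3.0, 4.0), Vector(5.0, 6.0))
    val ncol = 2
    val extraRows = 2

    // Proposed form: drmA rbind empty(extra_rows, ncol)
    val direct = rbind(a, empty(extraRows, ncol))

    // Existing form: (drmA.t cbind empty(ncol, extra_rows)).t
    val twisted =
      transpose(cbind(transpose(a, ncol), empty(ncol, extraRows)), a.length + extraRows)

    assert(direct == twisted)
    assert(direct.length == a.length + extraRows)
    println(direct.map(_.mkString(" ")).mkString("\n"))
  }
}
```

On a distributed matrix the twisted form pays for two transposes, which is
one reason a dedicated rbind operator looks attractive.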
