[
https://issues.apache.org/jira/browse/SYSTEMML-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matthias Boehm updated SYSTEMML-1977:
-------------------------------------
Description:
On Kmeans, the fusion heuristic fnr fails with an index-out-of-bounds error on
distributed (i.e., Spark) codegen row operations. The root cause is misplaced
metadata management that implicitly assumes the first side input is
broadcast, which fails if this side input is also large and is passed as an
additional RDD input. Specifically, it fails when executing the following
operator:
{code}
public final class TMP64 extends SpoofRowwise {
  public TMP64() {
    super(RowType.COL_AGG_B1_T, -1, false, 1);
  }
  protected void genexec(double[] a, int ai, SideInput[] b, double[] scalars,
      double[] c, int len, int rix) {
    LibSpoofPrimitives.vectOuterMultAdd(a, b[0].values(rix), c, ai,
      b[0].pos(rix), 0, len, b[0].clen);
  }
  protected void genexec(double[] avals, int[] aix, int ai, SideInput[] b,
      double[] scalars, double[] c, int alen, int len, int rix) {
    LibSpoofPrimitives.vectOuterMultAdd(avals, b[0].values(rix), c, aix, ai,
      b[0].pos(rix), 0, alen, len, b[0].clen);
  }
}
{code}
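As a rough illustration only (not the actual SystemML code path), the following self-contained sketch shows one way that metadata assuming a fully available, broadcast-like side input can lead to an out-of-bounds access once only a partition of that input is held locally. The names SideInputSketch, ToySideInput, rowOffset, outerMultAdd, and execRow are hypothetical stand-ins, and the dense outer-product loop is a simplified assumption about what the dense genexec above computes.

```java
import java.util.Arrays;

public class SideInputSketch {
    /** Hypothetical toy side input: dense row-major values plus metadata. */
    static final class ToySideInput {
        final double[] values;
        final int clen;       // number of columns per row
        final int rowOffset;  // global index of the first row held locally
        ToySideInput(double[] values, int clen, int rowOffset) {
            this.values = values;
            this.clen = clen;
            this.rowOffset = rowOffset;
        }
        /** Start position of global row rix within values. */
        int pos(int rix) {
            return (rix - rowOffset) * clen;
        }
    }

    /** Simplified dense outer product: c[ci + i*len2 + j] += a[ai+i] * b[bi+j]. */
    static void outerMultAdd(double[] a, double[] b, double[] c,
            int ai, int bi, int ci, int len1, int len2) {
        for (int i = 0; i < len1; i++)
            for (int j = 0; j < len2; j++)
                c[ci + i * len2 + j] += a[ai + i] * b[bi + j];
    }

    /** Mirrors the shape of the dense genexec: one main-input row times one side-input row. */
    static void execRow(double[] a, int ai, ToySideInput b, double[] c, int len, int rix) {
        outerMultAdd(a, b.values, c, ai, b.pos(rix), 0, len, b.clen);
    }

    public static void main(String[] args) {
        double[] a = {1, 2};               // row 2 of the main input (2 columns)
        double[] full = new double[4 * 3]; // 4x3 side input, fully available
        for (int i = 0; i < full.length; i++)
            full[i] = i;

        // Broadcast-like case: all rows are local, so offset 0 is correct.
        double[] c1 = new double[2 * 3];
        execRow(a, 0, new ToySideInput(full, 3, 0), c1, 2, 2);
        System.out.println(Arrays.toString(c1)); // [6.0, 7.0, 8.0, 12.0, 14.0, 16.0]

        // Partition-like case: only rows 2..3 are local.
        double[] part = Arrays.copyOfRange(full, 6, 12);

        // Stale metadata (offset 0, as if broadcast) points past the local data.
        try {
            execRow(a, 0, new ToySideInput(part, 3, 0), new double[2 * 3], 2, 2);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("index out of bounds with broadcast-style metadata");
        }

        // Metadata that matches the partition yields the same result as above.
        double[] c2 = new double[2 * 3];
        execRow(a, 0, new ToySideInput(part, 3, 2), c2, 2, 2);
        System.out.println(Arrays.equals(c1, c2)); // true
    }
}
```

In this sketch the computation itself is identical in all three calls; only the metadata describing where the side-input rows live differs, which is why the error surfaces inside the generated operator rather than where the metadata is set.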
was: On Kmeans, the fusion heuristic fnr fails with an index-out-of-bounds error on
distributed (i.e., Spark) codegen row operations. The root cause is misplaced
metadata management that implicitly assumes the first side input is
broadcast, which fails if this side input is also large and is passed as an
additional RDD input.
> Codegen spark row ops failing w/ index-out-of-bounds
> ----------------------------------------------------
>
> Key: SYSTEMML-1977
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1977
> Project: SystemML
> Issue Type: Bug
> Reporter: Matthias Boehm