Dmitriy Lyubimov created MAHOUT-1597:
----------------------------------------
Summary: A + 1.0 (element-wise scala operation) gives wrong result
if rdd is missing rows, Spark side
Key: MAHOUT-1597
URL: https://issues.apache.org/jira/browse/MAHOUT-1597
Project: Mahout
Issue Type: Bug
Affects Versions: 0.9
Reporter: Dmitriy Lyubimov
Assignee: Dmitriy Lyubimov
Fix For: 1.0
{code}
// Concoct an rdd with missing rows
val aRdd: DrmRdd[Int] = sc.parallelize(
    0 -> dvec(1, 2, 3) ::
    3 -> dvec(3, 4, 5) :: Nil
  ).map { case (key, vec) => key -> (vec: Vector) }

val drmA = drmWrap(rdd = aRdd)

// inCoreA is the in-core counterpart of drmA, with the implied
// missing rows (keys 1 and 2) present as zero rows
val controlB = inCoreA + 1.0
val drmB = drmA + 1.0

(drmB -: controlB).norm should be < 1e-10
{code}
This should not fail, but it was failing because the elementwise scalar
operator evaluates only the rows physically present in the dataset.
For Int-keyed row matrices, there may be implied rows that are not
present in the RDD (they stand for all-zero rows).
The goal is to detect this condition and materialize the missing rows
before applying physical operators that cannot handle implied missing
rows.
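The fill step above can be sketched with plain Scala collections standing in for the Spark RDD (all names here are hypothetical, not Mahout's actual internals): every key in `0 until nrow` that is absent from the dataset is materialized as a zero row before the scalar operator runs, so that `+ 1.0` touches the implied rows too.

```scala
// Sketch only: a Map[Int, Vector[Double]] stands in for DrmRdd[Int],
// and Vector[Double] for Mahout's Vector.
object FillMissingRows {
  type Row = Vector[Double]

  // Materialize implied rows: any key in 0 until nrow that is missing
  // from `rows` becomes an all-zero row of width ncol.
  def fillMissing(rows: Map[Int, Row], nrow: Int, ncol: Int): Map[Int, Row] = {
    val zero = Vector.fill(ncol)(0.0)
    (0 until nrow).map(i => i -> rows.getOrElse(i, zero)).toMap
  }

  // The elementwise scalar operator: evaluates only rows it is given,
  // which is why filling must happen first.
  def plusScalar(rows: Map[Int, Row], s: Double): Map[Int, Row] =
    rows.map { case (k, v) => k -> v.map(_ + s) }
}
```

With the dataset from the test (`keys 0 and 3 present, nrow = 4`), filling first makes `plusScalar(_, 1.0)` produce all-ones rows at keys 1 and 2, matching the in-core control; without the fill those rows would be silently skipped.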
--
This message was sent by Atlassian JIRA
(v6.2#6252)