On Sat, Jul 19, 2014 at 9:06 PM, Pat Ferrel <[email protected]> wrote:
> Using methods instead of symbolic ops returns different types so methods
> work, ops don’t. If math is math, they should do the same thing so I’d like
> to know what it is supposed to do, can you please allow me to ask specific
> people the question?
>
I don't see how I'm getting in the way of anybody else replying. By
posting to a public mailing list, you by definition allow anybody to review
and comment.

> test("plus one"){
> val a = dense(
> (1, 1),
> (0, 0))
>
> val drmA1 = drmParallelize(m = a, numPartitions = 2)
>
> // modified to return a new CheckpointedDrm so maintains immutability
> // but still only increases the row cardinality
> // by returning new CheckpointedDrmSpark[K](rdd, nrow + n, ncol,
> // _cacheStorageLevel ) Hack for now.
> val drmABigger1 = drmA1.addToRowCardinality(1)
>
> val drmABiggerPlusOne1 = drmABigger1.plus(1.0) // drmABigger has no
> // row 2 in the rdd but an empty row 1
> // drmABiggerPlusOne1 is a dense matrix
> println(drmABiggerPlusOne1)
>
> val drmA2 = drmParallelize(m = a, numPartitions = 2)
> val drmABigger2 = drmA2.addToRowCardinality(1)
> val drmABiggerPlusOne2 = drmABigger2 + 1.0
> drmABiggerPlusOne2.writeDRM("tmp/plus-one/drma-bigger-plus-one-ops/")
>
>
> val bp = 0
> }
>
> method #1 works, #2 doesn’t. Even when I create a new CheckpointedDrmSpark
> with larger _nrow than is in the data, even without the addToRowCardinality.
>
In Method #1, plus() is not even operating on the DRM. The plus() is
operating on an in-core Matrix, which is implicitly collect()ed because of the
drm2InCore implicit type converter. So drmABiggerPlusOne1 is neither a
DrmLike nor a CheckpointedDrm, but is actually just an in-core Matrix. You
will have to drmParallelize() it again in order to do any distributed
operations.
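
To illustrate the mechanism, here is a minimal sketch in plain Scala. It does not use Mahout at all: InCoreMatrix, DistributedMatrix and distributed2InCore are hypothetical stand-ins for the in-core Matrix, the DRM and the drm2InCore converter, and the only assumption carried over from above is that plus() exists on the in-core type while + is defined on the distributed one.

```scala
import scala.language.implicitConversions

// Hypothetical stand-in for an in-core matrix (not the Mahout class).
final case class InCoreMatrix(rows: Vector[Vector[Double]]) {
  // Named method that exists only on the in-core type.
  def plus(x: Double): InCoreMatrix = InCoreMatrix(rows.map(_.map(_ + x)))
}

// Hypothetical stand-in for a distributed matrix, modelled as a list of partitions.
final case class DistributedMatrix(partitions: Vector[Vector[Vector[Double]]]) {
  // Symbolic operator defined on the distributed type; the result stays distributed.
  def +(x: Double): DistributedMatrix =
    DistributedMatrix(partitions.map(_.map(_.map(_ + x))))

  // Gathers all partitions into a single in-core matrix.
  def collect: InCoreMatrix = InCoreMatrix(partitions.flatten)
}

object ImplicitCollectDemo {
  // Plays the role of the drm2InCore converter: silently collects.
  implicit def distributed2InCore(d: DistributedMatrix): InCoreMatrix = d.collect

  val drm: DistributedMatrix =
    DistributedMatrix(Vector(Vector(Vector(1.0, 1.0)), Vector(Vector(0.0, 0.0))))

  // DistributedMatrix has no plus(), so the compiler applies the implicit
  // conversion first; the result type is the in-core one.
  val viaMethod: InCoreMatrix = drm.plus(1.0)

  // + is a member of DistributedMatrix, so no conversion takes place.
  val viaOp: DistributedMatrix = drm + 1.0

  def main(args: Array[String]): Unit = {
    println(viaMethod)
    println(viaOp)
  }
}
```

The type annotations on viaMethod and viaOp are the point: the two calls look interchangeable, but they resolve against different types, so only the second one exercises the distributed code path.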

> I agree that in some cases it doesn’t, can you please allow me to ask if it
> _should_?
>
I don't think it is working in any case. The implicit converter is making
it feel like it is working.

> Thanks
> On Jul 19, 2014, at 8:56 PM, Anand Avati <[email protected]> wrote:
>
>
>
>
> On Sat, Jul 19, 2014 at 6:50 PM, Pat Ferrel <[email protected]> wrote:
>
>>
>> On another thread I’ll send you code that shows A + 1 works with blank
>> rows in A.
>>
>
> I don't see how that worked for you. See this:
>
> test("DRM addToRowCardinality - will fail") {
> val inCoreA = sparse(
> 0 -> 1 :: 1 -> 2 :: Nil,
> 0 -> 3 :: 1 -> 4 :: Nil,
> 0 -> 2 :: 1 -> 0.0 :: Nil
> )
>
> val inCoreBControl = sparse(
> 0 -> 2 :: 1 -> 3 :: Nil,
> 0 -> 4 :: 1 -> 5 :: Nil,
> 0 -> 3 :: 1 -> 1 :: Nil,
> 0 -> 1 :: 1 -> 1 :: Nil,
> 0 -> 1 :: 1 -> 1 :: Nil
> )
>
> val drmA = drmParallelize(inCoreA)
> drmA.addToRowCardinality(2)
> val drmB = (drmA + 1.0).checkpoint()
>
> (drmB.collect - inCoreBControl).norm should be < 1e-3
>
> }
>
> test("DRM addToRowCardinality - wont fail") {
> val inCoreA = sparse(
> 0 -> 1 :: 1 -> 2 :: Nil,
> 0 -> 3 :: 1 -> 4 :: Nil,
> 0 -> 2 :: 1 -> 0.0 :: Nil
> )
>
> val inCoreBWrong = sparse(
> 0 -> 2 :: 1 -> 3 :: Nil,
> 0 -> 4 :: 1 -> 5 :: Nil,
> 0 -> 3 :: 1 -> 1 :: Nil,
> 0 -> 0 :: 1 -> 0 :: Nil,
> 0 -> 0 :: 1 -> 0 :: Nil
> )
>
> val drmA = drmParallelize(inCoreA)
> drmA.addToRowCardinality(2)
> val drmB = (drmA + 1.0).checkpoint()
>
> (drmB.collect - inCoreBWrong).norm should be < 1e-3
> }
>
>
> And sure enough, inCoreBControl fails, and inCoreBWrong succeeds:
>
> - DRM addToRowCardinality - will fail *** FAILED ***
>
> 2.0 was not less than 0.001 (DrmLikeSuiteBase.scala:116)
> - DRM addToRowCardinality - wont fail
>
>
>> BTW this implies rbind will not solve the problem, it is firmly in data
>> prep. But until I know the rules I won’t know how to do the right thing.
>>
>
> Rbind expects both A and B to have their Int row keys filled from 0 to
> nrow-1, which is how they should be ideally.
>
>
>