Github user frreiss commented on the pull request:
https://github.com/apache/spark/pull/8563#issuecomment-143096254
Review comments:
--> Try making the portion of the examples with input data more condensed;
perhaps
by reading the matrices from string constants.
--> There's an orphan/duplicate ScalaDoc comment at line 356 of
BlockMatrix.scala
--> Remove carriage return on lines 370, 481, 569, 732
--> Recommend adding a call to the new subtract method to the MLLib
programmer's guide
--> New API calls to BlockMatrix should have corresponding PySpark APIs
--> Error message at line 394 should print out the block sizes that don't
match
--> The code at line 384 should multiply every element of b.head by -1 as
far
as I can see
--> Line 456 and 465-471 have wrong indentation
--> Scaladoc at 474 should state that blockRowRange and blockColRange are
block
indexes, not row/column indexes
--> In lines 460-463, consider making a single pass over the blocks instead
of
4 passes
--> Add a note to SchurComplement that the current implementation assumes
that
a11 fits into memory on the driver
--> Might want to use a case class in return type of blockLUtoSolver
--> Take a close look at the performance impact of the chains of
multiplications at line 811 when there are many levels of recursion
--> In recursiveSequencesBuild, you may want to break off more than one
block
from the upper left corner of the matrix; in many cases, the available
memory
on the driver can comfortably hold, say 10x10 blocks. You should be
able to
query the SparkContext's memory information to determine how much heap
space is available for the in-memory portion of the computation. On a
side note, it would be good to produce a user-friendly error if it looks
like there is not enough local heap to process one block's data locally.
--> Might want to reuse buffers for the local data allocated at lines
623-629
to avoid triggering global GC at each level of recursion
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]