just as a quick update: this issue has now been fixed in SystemML master - it was essentially a missing guard for recursive functions when checking for unary size-preserving functions during inter-procedural analysis (IPA).
However, while working with this recursive cholesky function I came to the conclusion that it may need some rework. The current top-down, depth-first, approach is inherently sequential. This is partially unnecessary because for the used recursive function U_triangular_inv (which is called many more times than cholesky), blocks per level are independent. Therefore, we should look into a bottom-up, breadth-first approach to parallelize over the blocks in each level, which could be done via parfor at script level. Regards, Matthias On Sat, Apr 21, 2018 at 6:59 PM, Matthias Boehm <mboe...@gmail.com> wrote: > thanks for catching this - I just ran a toy example and this seems to > be a rewrite issue (there are specific right indexing rewrites that > collapse U[1:k,1:k] and U[1:k,k+1:n] into a single access to U which > helps for large distributed matrices). As a workaround, you can set > "sysml.optlevel" to 1 (instead of default 2, where 1 disables all > rewrites), which worked fine for me. I'll fix this later today. Also > I'll fix the naming from "Choleskey" to "Cholesky". Thanks again. > > Regards, > Matthias > > > On Sat, Apr 21, 2018 at 6:28 PM, Qifan Pu <qifan...@gmail.com> wrote: >> Hi Matthias, >> >> Thanks for the fast response and detailed information. This is really >> helpful. >> >> I just tried to run it, and was tracing down a indexing bug that can be >> repeated by simply running the test script of triangle solve[1] >> Caused by: org.apache.sysml.runtime.DMLRuntimeException: Invalid values for >> matrix indexing: [1667:3333,1:1666] must be within matrix dimensions >> [1000,1000] >> >> >> Am I missing some configuration here? >> >> >> [1] >> https://github.com/apache/systemml/blob/master/scripts/staging/scalable_linalg/test/test_triangular_inv.dml >> >> >> Best, >> Qifan >> >> >> On Sat, Apr 21, 2018 at 4:06 PM, Matthias Boehm <mboe...@gmail.com> wrote: >>> >>> Hi Qifan, >>> >>> thanks for your feedback. You're right, the builtin functions >>> cholesky, inverse, eigen, solve, svd, qr, and lu are currently only >>> supported as single-node operations because they're still implemented >>> via Apache commons.math. >>> >>> However, there is an experimental script for distributed cholesky [1] >>> which uses a recursive approach (with operations that allow for >>> automatic distributed computation) for matrices larger than a >>> user-defined block size. Once blocks become small enough, we use again >>> the builtin cholesky. Graduating this script would require a broader >>> set of experiments (and potential improvements) but it simply did not >>> have the highest priority so far. You might want to give it a try >>> though. >>> >>> Thanks again for your feedback - we'll consider a higher priority for >>> these distributed operations when discussing the roadmap for the next >>> releases. >>> >>> [1] >>> https://github.com/apache/systemml/blob/master/scripts/staging/scalable_linalg/cholesky.dml >>> >>> Regards, >>> Matthias >>> >>> On Sat, Apr 21, 2018 at 2:15 PM, Qifan Pu <qifan...@gmail.com> wrote: >>> > Hi, >>> > >>> > I would love to do distributed cholesky on large matrix with SystemML. I >>> > found two related jiras (SYSTEMML-1213, SYSTEMML-1163), but AFAIK, this >>> > is >>> > currently not implemented? I just wanted to check. >>> > >>> > Best, >>> > Qifan >> >>