Invert large matrix
Hi all, I have a matrix X stored as RDD[SparseVector] that is high dimensional, say 800 million rows and 2 million columns, and more than 95% of the entries are zero. Is there a way to invert (X'X + eye) efficiently, where X' is the transpose of X and eye is the identity matrix? I am thinking of using RowMatrix, but I am not sure whether it is feasible. Any suggestions are highly appreciated. Thanks. Wayne
Re: Invert large matrix
Thanks for the advice. I figured out a way to solve this problem by avoiding the matrix representation.
Wayne

From: Sean Owen
Sent: Thursday, December 29, 2016 1:52 PM
To: Yanwei Wayne Zhang; user
Subject: Re: Invert large matrix

I think the best advice is: don't do that. If you're trying to solve a linear system, solve the linear system without explicitly constructing a matrix inverse. Is that what you mean?

On Thu, Dec 29, 2016 at 2:22 AM Yanwei Wayne Zhang <actuary_zh...@hotmail.com> wrote:
Hi all, I have a matrix X stored as RDD[SparseVector] that is high dimensional, say 800 million rows and 2 million columns, and more than 95% of the entries are zero. Is there a way to invert (X'X + eye) efficiently, where X' is the transpose of X and eye is the identity matrix? I am thinking of using RowMatrix but not sure if it is feasible. Any suggestion is highly appreciated. Thanks. Wayne
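As an illustration of the advice above (solve the linear system instead of forming the inverse), here is a minimal sketch of a matrix-free conjugate gradient solve of (X'X + eye) w = b. It is not the approach Wayne ended up using, which the thread does not describe. The right-hand side b, the function names normal_matvec and conjugate_gradient, and the representation of sparse rows as Python dicts are all assumptions made for this example. In Spark, the single pass over the rows would presumably be a distributed aggregation over the RDD, which is not shown here.

```python
# Sparse rows as {column_index: value} dicts: a local stand-in for
# RDD[SparseVector] (an assumption for this sketch, not Spark code).

def normal_matvec(rows, v):
    """Return (X'X + I) v in one pass over the rows, never forming X'X."""
    out = list(v)  # the identity term contributes v itself
    for row in rows:
        # (x_i . v) is a scalar; x_i * (x_i . v) is this row's share of X'(Xv)
        dot = sum(val * v[j] for j, val in row.items())
        for j, val in row.items():
            out[j] += val * dot
    return out


def conjugate_gradient(rows, b, tol=1e-10, max_iter=1000):
    """Solve (X'X + I) w = b by conjugate gradient (the matrix is SPD)."""
    w = [0.0] * len(b)
    r = list(b)  # residual b - A w, starting from w = 0
    p = list(r)  # initial search direction
    rs_old = sum(x * x for x in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:  # residual norm small enough: done
            break
        Ap = normal_matvec(rows, p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        w = [wi + alpha * pi for wi, pi in zip(w, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(x * x for x in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return w


# Tiny hypothetical example: 3 rows, 3 columns.
rows = [{0: 1.0, 2: 2.0}, {1: 3.0}, {0: 1.0, 1: 1.0}]
b = [1.0, 2.0, 3.0]
w = conjugate_gradient(rows, b)
residual = [ai - bi for ai, bi in zip(normal_matvec(rows, w), b)]
```

Each iteration touches X only through products with a vector, so memory stays proportional to the number of columns (2 million here) rather than to the 2 million by 2 million matrix X'X, which is generally dense even when X is sparse.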
Spark test error
I tried to run the tests in 'GeneralizedLinearRegressionSuite', and all of them passed except test("read/write"), which failed with the error message below. Any suggestions on why this happens and how to fix it? Thanks. BTW, I ran the test in IntelliJ.

The default jsonEncode only supports string and vector. org.apache.spark.ml.param.Param must override jsonEncode for java.lang.Double.
scala.NotImplementedError: The default jsonEncode only supports string and vector. org.apache.spark.ml.param.Param must override jsonEncode for java.lang.Double.
    at org.apache.spark.ml.param.Param.jsonEncode(params.scala:98)
    at org.apache.spark.ml.util.DefaultParamsWriter$$anonfun$1$$anonfun$2.apply(ReadWrite.scala:293)
    at org.apache.spark.ml.util.DefaultParamsWriter$$anonfun$1$$anonfun$2.apply(ReadWrite.scala:292)

Regards,
Wayne