[ https://issues.apache.org/jira/browse/SPARK-4900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312795#comment-14312795 ]
Sean Owen commented on SPARK-4900:
----------------------------------

So I think there is at least a small problem in the error reporting:

{code}
info.`val` match {
  case 1 => throw new IllegalStateException("ARPACK returns non-zero info = " + info.`val` +
    " Maximum number of iterations taken. (Refer ARPACK user guide for details)")
  case 2 => throw new IllegalStateException("ARPACK returns non-zero info = " + info.`val` +
    " No shifts could be applied. Try to increase NCV. " +
    "(Refer ARPACK user guide for details)")
  case _ => throw new IllegalStateException("ARPACK returns non-zero info = " + info.`val` +
    " Please refer ARPACK user guide for error message.")
}
{code}

Really, what's labeled case 2 here corresponds to return value 3, which is what you get:

{code}
=  0: Normal exit.
=  1: Maximum number of iterations taken.
      All possible eigenvalues of OP has been found. IPARAM(5)
      returns the number of wanted converged Ritz values.
=  2: No longer an informational error. Deprecated starting
      with release 2 of ARPACK.
=  3: No shifts could be applied during a cycle of the
      Implicitly restarted Arnoldi iteration. One possibility
      is to increase the size of NCV relative to NEV.
      See remark 4 below.
{code}

I can fix the error message. Remark 4, which it refers to, is:

{code}
4. At present there is no a-priori analysis to guide the selection
   of NCV relative to NEV.  The only formal requrement is that NCV > NEV.
   However, it is recommended that NCV .ge. 2*NEV.  If many problems of
   the same type are to be solved, one should experiment with increasing
   NCV while keeping NEV fixed for a given test problem.  This will
   usually decrease the required number of OP*x operations but it also
   increases the work and storage required to maintain the orthogonal
   basis vectors.  The optimal "cross-over" with respect to CPU time is
   problem dependent and must be determined empirically.
{code}

So I think that translates to "k is too big". Is the matrix low-rank?
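The mislabeled branch could be corrected along these lines — a minimal sketch, not the actual MLlib patch; `describeInfo` is a hypothetical helper that maps dsaupd's documented return codes to messages:

```scala
// Hypothetical helper: maps ARPACK dsaupd info codes to their documented
// meanings, so that info = 3 reports the NCV/NEV advice instead of
// falling through to the generic message. A sketch, not MLlib source.
def describeInfo(info: Int): String = info match {
  case 1 => s"ARPACK returns non-zero info = $info " +
    "Maximum number of iterations taken. (Refer ARPACK user guide for details)"
  case 3 => s"ARPACK returns non-zero info = $info " +
    "No shifts could be applied. Try to increase NCV. (Refer ARPACK user guide for details)"
  case _ => s"ARPACK returns non-zero info = $info " +
    "Please refer ARPACK user guide for error message."
}
```

With this mapping, the info = 3 the reporter saw would surface the "increase NCV" hint directly.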
In any event this is ultimately Breeze code, and I'm not sure there's much that will be done in Spark itself.

> MLlib SingularValueDecomposition ARPACK IllegalStateException
> --------------------------------------------------------------
>
>                 Key: SPARK-4900
>                 URL: https://issues.apache.org/jira/browse/SPARK-4900
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>    Affects Versions: 1.1.1, 1.2.0, 1.2.1
>        Environment: Ubuntu 1410, Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
>                     spark local mode
>            Reporter: Mike Beyer
>
> java.lang.reflect.InvocationTargetException
> ...
> Caused by: java.lang.IllegalStateException: ARPACK returns non-zero info = 3 Please refer ARPACK user guide for error message.
> 	at org.apache.spark.mllib.linalg.EigenValueDecomposition$.symmetricEigs(EigenValueDecomposition.scala:120)
> 	at org.apache.spark.mllib.linalg.distributed.RowMatrix.computeSVD(RowMatrix.scala:235)
> 	at org.apache.spark.mllib.linalg.distributed.RowMatrix.computeSVD(RowMatrix.scala:171)
> ...

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
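The "k is too big" diagnosis can be sanity-checked before calling computeSVD. A minimal sketch, under the assumption that NCV is chosen as min(2*k, n) for a requested rank k on an n-dimensional problem; `ncvFor` and `maxSafeK` are hypothetical helpers encoding remark 4's recommendation that NCV >= 2*NEV:

```scala
// ARPACK remark 4: formal requirement NCV > NEV; recommended NCV >= 2*NEV.
// Assumption (not quoted from source): the solver picks NCV = min(2*k, n).

// NCV that would be used for a requested k on an n-dimensional problem.
def ncvFor(k: Int, n: Int): Int = math.min(2 * k, n)

// Largest k for which the recommended NCV >= 2*k still holds:
// once 2*k exceeds n, NCV is capped at n and the recommendation breaks,
// which is the regime where info = 3 tends to show up.
def maxSafeK(n: Int): Int = n / 2
```

For example, requesting k = 60 singular values on a 100-dimensional problem caps NCV at 100, below the recommended 2*60 = 120; keeping k at or below n/2 avoids that.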