[ https://issues.apache.org/jira/browse/SPARK-4900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312795#comment-14312795 ]

Sean Owen commented on SPARK-4900:
----------------------------------

So I think there is at least a small problem in the error reporting:

{code}
      info.`val` match {
        case 1 => throw new IllegalStateException("ARPACK returns non-zero info = " + info.`val` +
            " Maximum number of iterations taken. (Refer ARPACK user guide for details)")
        case 2 => throw new IllegalStateException("ARPACK returns non-zero info = " + info.`val` +
            " No shifts could be applied. Try to increase NCV. " +
            "(Refer ARPACK user guide for details)")
        case _ => throw new IllegalStateException("ARPACK returns non-zero info = " + info.`val` +
            " Please refer ARPACK user guide for error message.")
      }
{code}

Really, what's called case 2 here corresponds to return value 3, which is the value you're actually getting. From the ARPACK documentation:

{code}
            =  0: Normal exit.
            =  1: Maximum number of iterations taken.
                  All possible eigenvalues of OP have been found. IPARAM(5)
                  returns the number of wanted converged Ritz values.
            =  2: No longer an informational error. Deprecated starting
                  with release 2 of ARPACK.
            =  3: No shifts could be applied during a cycle of the 
                  Implicitly restarted Arnoldi iteration. One possibility 
                  is to increase the size of NCV relative to NEV. 
                  See remark 4 below.
{code}
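
So the check should be against 3 rather than 2. A minimal sketch of the corrected match (the message wording is a suggestion, not the final patch):

{code}
      info.`val` match {
        case 1 => throw new IllegalStateException("ARPACK returns non-zero info = " + info.`val` +
            " Maximum number of iterations taken. (Refer ARPACK user guide for details)")
        case 3 => throw new IllegalStateException("ARPACK returns non-zero info = " + info.`val` +
            " No shifts could be applied. Try to increase NCV. " +
            "(Refer ARPACK user guide for details)")
        case _ => throw new IllegalStateException("ARPACK returns non-zero info = " + info.`val` +
            " Please refer ARPACK user guide for error message.")
      }
{code}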

I can fix the error message along these lines. The Remark 4 it refers to is:

{code}
    4. At present there is no a-priori analysis to guide the selection
       of NCV relative to NEV.  The only formal requirement is that NCV > NEV.
       However, it is recommended that NCV .ge. 2*NEV.  If many problems of
       the same type are to be solved, one should experiment with increasing
       NCV while keeping NEV fixed for a given test problem.  This will 
       usually decrease the required number of OP*x operations but it
       also increases the work and storage required to maintain the orthogonal
       basis vectors.   The optimal "cross-over" with respect to CPU time
       is problem dependent and must be determined empirically.
{code}

So I think that translates to "k is too big". Is the matrix low-rank?
In any event this is ultimately Breeze/ARPACK code, and I'm not sure there's much that will be done in Spark itself beyond fixing the message.
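
On the user side, the practical workaround is to ask for fewer singular values. If I'm reading symmetricEigs right, NCV is capped at min(2 * k, n), so a k close to the matrix dimension leaves ARPACK no room for the NCV .ge. 2*NEV rule of thumb above. A hypothetical spark-shell sketch (the toy matrix and k = 10 are purely illustrative):

{code}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// Toy 100 x 50 matrix; sc is the spark-shell SparkContext.
val rows = sc.parallelize(Seq.fill(100)(Vectors.dense(Array.fill(50)(math.random))))
val matrix = new RowMatrix(rows)

// Keep k well below the number of columns so ARPACK can work with NCV ~ 2 * NEV.
val k = 10
val svd = matrix.computeSVD(k, computeU = true)
{code}

Of course, if the matrix is genuinely low-rank, requesting k beyond the effective rank can still fail to converge, which is why I ask.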

> MLlib SingularValueDecomposition ARPACK IllegalStateException 
> --------------------------------------------------------------
>
>                 Key: SPARK-4900
>                 URL: https://issues.apache.org/jira/browse/SPARK-4900
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>    Affects Versions: 1.1.1, 1.2.0, 1.2.1
>         Environment: Ubuntu 14.10, Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
> spark local mode
>            Reporter: Mike Beyer
>
> java.lang.reflect.InvocationTargetException
>         ...
> Caused by: java.lang.IllegalStateException: ARPACK returns non-zero info = 3 Please refer ARPACK user guide for error message.
>         at org.apache.spark.mllib.linalg.EigenValueDecomposition$.symmetricEigs(EigenValueDecomposition.scala:120)
>         at org.apache.spark.mllib.linalg.distributed.RowMatrix.computeSVD(RowMatrix.scala:235)
>         at org.apache.spark.mllib.linalg.distributed.RowMatrix.computeSVD(RowMatrix.scala:171)
>               ...


