[jira] [Commented] (MAHOUT-319) SVD solvers should be gracefully stoppable/restartable

Hudson (JIRA) Thu, 05 May 2011 23:58:46 -0700

    [ 
https://issues.apache.org/jira/browse/MAHOUT-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029792#comment-13029792
 ]


Hudson commented on MAHOUT-319:
-------------------------------

Integrated in Mahout-Quality #800 (see
https://builds.apache.org/hudson/job/Mahout-Quality/800/)
    Fixes MAHOUT-319 by the following means:

  LanczosSolver now takes a LanczosState object as an argument to its
solve() method and operates on this state as it iterates.  One possible
side effect of completing an iteration is that the state is persisted to
disk (or HDFS, etc.).  When the solver is started with the path to the
intermediate state, and state has already been persisted there, it picks
up where it left off.
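
  As a rough illustration of the persist-and-resume pattern described
above, here is a minimal sketch.  It is not the Mahout code: the class
ResumableSolverSketch, its State holder, the binary file layout, and the
use of plain power iteration in place of Lanczos are all assumptions made
for the example.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration only; none of these names come from Mahout.
final class ResumableSolverSketch {

  /** Everything the loop needs in order to resume: step counter and current vector. */
  static final class State {
    int iteration;
    double[] vector;

    State(int iteration, double[] vector) {
      this.iteration = iteration;
      this.vector = vector;
    }
  }

  /** Writes the state to a temporary file, then renames it over the checkpoint. */
  static void save(State s, Path file) throws IOException {
    Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
    try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(tmp))) {
      out.writeInt(s.iteration);
      out.writeInt(s.vector.length);
      for (double v : s.vector) {
        out.writeDouble(v);
      }
    }
    Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
  }

  /** Reads back a state written by save(). */
  static State load(Path file) throws IOException {
    try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
      int iteration = in.readInt();
      double[] v = new double[in.readInt()];
      for (int i = 0; i < v.length; i++) {
        v[i] = in.readDouble();
      }
      return new State(iteration, v);
    }
  }

  /** One power-iteration step: multiply by the matrix, then normalize. */
  static double[] step(double[][] a, double[] x) {
    double[] y = new double[a.length];
    double normSquared = 0.0;
    for (int i = 0; i < a.length; i++) {
      for (int j = 0; j < x.length; j++) {
        y[i] += a[i][j] * x[j];
      }
      normSquared += y[i] * y[i];
    }
    double norm = Math.sqrt(normSquared);
    for (int i = 0; i < y.length; i++) {
      y[i] /= norm;
    }
    return y;
  }

  /** Resumes from the checkpoint if one exists, otherwise starts from scratch. */
  static State solve(double[][] a, double[] start, int maxIterations, Path file)
      throws IOException {
    State s = Files.exists(file) ? load(file) : new State(0, start.clone());
    while (s.iteration < maxIterations) {
      s.vector = step(a, s.vector);
      s.iteration++;
      save(s, file); // side effect of completing an iteration
    }
    return s;
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempDirectory("solver-state").resolve("state.bin");
    double[][] a = {{2.0, 0.0}, {0.0, 1.0}};
    double[] start = {1.0, 1.0};
    solve(a, start, 3, file);              // pretend the run stops after 3 steps
    State s = solve(a, start, 10, file);   // restart: continues from step 3
    System.out.println("finished at iteration " + s.iteration);
  }
}
```

  Because the values are written and read as raw doubles, a run that is
stopped and restarted in this sketch ends in the same state as one that
runs straight through.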

  This also improves the solver's scalability: no more than 3 singular
vectors need to be held in memory at any one time, instead of
2*desiredRank dense vectors of the same size.
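
  To give a feel for the difference, the sketch below compares the two
vector counts.  The dimension and desiredRank values are made-up numbers
chosen for the example, and the estimate counts only 8 bytes per double,
ignoring any per-object overhead.

```java
// Hypothetical back-of-the-envelope estimate; not part of Mahout.
final class LanczosMemoryEstimate {

  /** Bytes for `count` dense vectors of `dimension` doubles, ignoring object overhead. */
  static long denseVectorBytes(long count, long dimension) {
    return count * dimension * Double.BYTES;
  }

  public static void main(String[] args) {
    long dimension = 1_000_000L; // assumed vector length
    int desiredRank = 100;       // assumed rank

    long before = denseVectorBytes(2L * desiredRank, dimension);
    long after = denseVectorBytes(3, dimension);

    System.out.println("2*desiredRank vectors: " + before + " bytes");
    System.out.println("3 vectors:             " + after + " bytes");
  }
}
```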

  This API change to LanczosSolver is not backwards compatible, but
hopefully moving to a single packaged state object will make this kind
of change less likely to be needed on this class in the future.


> SVD solvers should be gracefully stoppable/restartable
> ------------------------------------------------------
>
>                 Key: MAHOUT-319
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-319
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Math
>    Affects Versions: 0.3
>            Reporter: Jake Mannix
>            Assignee: Jake Mannix
>             Fix For: 0.5
>
>         Attachments: MAHOUT-319.diff, MAHOUT-319.diff, MAHOUT-319.patch
>
>
> LanczosSolver, DistributedLanczosSolver, and HebbianSolver all keep copious 
> amounts of memory-resident data which is lost if the app crashes or is killed 
> (OOM, forgetting to run in a screen session, and losing net connectivity to 
> the server running it, etc...).  
> These algorithms (and many other Mahout processes!) should enable a pluggable 
> "persist state" mechanism (to HDFS, RDBMS, local disk, key-value store, etc), 
> and similarly, a way to pick up and start from such a state.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
