[ 
https://issues.apache.org/jira/browse/MAHOUT-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165354#comment-13165354
 ] 

Jake Mannix commented on MAHOUT-917:
------------------------------------

Yeah, the new LDA integration test is just slow.  It runs on a *tiny* data set, 
but to verify correctness it has to run an iterative map-reduce job three 
times: first with numExpectedTopics - 1, then numExpectedTopics, then 
numExpectedTopics + 1, and check that the perplexity is lowest for 
numExpectedTopics.  Each run currently hopes for convergence after only 5 
iterations, but that is still 15 map-reduce jobs launched from JUnit for 
that one test.  Not much can be done to speed it up, but it could certainly 
be run as a @nightly.
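
For illustration only, here is a minimal sketch of the check described above, 
not the actual Mahout test.  The class name, both method names, and the 
trainAndMeasure callback are hypothetical; the callback stands in for "run 
the iterative LDA job with k topics and return its perplexity", and the 
perplexity values in main() are made up.

```java
import java.util.function.IntToDoubleFunction;

public class PerplexityCheckSketch {

    /** One map-reduce job per iteration, for each topic count that is tried. */
    public static int totalJobs(int topicCountsTried, int iterationsPerRun) {
        return topicCountsTried * iterationsPerRun;
    }

    /**
     * Tries k - 1, k and k + 1 topics and returns the topic count whose
     * perplexity is lowest. trainAndMeasure is a hypothetical stand-in for
     * running the iterative job and computing perplexity afterwards.
     */
    public static int bestTopicCount(int numExpectedTopics,
                                     IntToDoubleFunction trainAndMeasure) {
        int best = -1;
        double bestPerplexity = Double.POSITIVE_INFINITY;
        for (int k = numExpectedTopics - 1; k <= numExpectedTopics + 1; k++) {
            double perplexity = trainAndMeasure.applyAsDouble(k);
            if (perplexity < bestPerplexity) {
                bestPerplexity = perplexity;
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int numExpectedTopics = 5; // assumed value, for illustration only

        // Fake perplexity curve with its minimum at numExpectedTopics.
        IntToDoubleFunction fakePerplexity =
            k -> Math.abs(k - numExpectedTopics) + 1.0;

        int best = bestTopicCount(numExpectedTopics, fakePerplexity);
        if (best != numExpectedTopics) {
            throw new AssertionError("perplexity was not lowest at " + numExpectedTopics);
        }

        // Three topic counts, five iterations each.
        System.out.println("map-reduce jobs launched: " + totalJobs(3, 5));
    }
}
```

The job count is just (topic counts tried) x (iterations per run), so the 
only ways to shrink it are fewer iterations or fewer topic counts, and 
either one weakens the check.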
                
> Build takes too long
> --------------------
>
>                 Key: MAHOUT-917
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-917
>             Project: Mahout
>          Issue Type: Improvement
>          Components: build
>            Reporter: Frank Scholten
>
> On my machine a full mvn clean install takes 55 minutes.
> As an experiment I put all MapReduce job tests for all clustering algorithms 
> on ignore. This reduces the build to 45 minutes. There are a lot of these 
> long-running tests in the project.
> What about creating a separate Maven profile for the nightly build that runs 
> all MapReduce job tests? For this we have to move these MapReduce tests
> to separate classes with a naming convention such as *JobTest or 
> *IntegrationTest and add some maven configuration.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
