[ https://issues.apache.org/jira/browse/MAHOUT-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15852612#comment-15852612 ]
ASF GitHub Bot commented on MAHOUT-1936:
----------------------------------------

GitHub user rawkintrevo opened a pull request:

    https://github.com/apache/mahout/pull/278

    MAHOUT-1936 fix AsFactor allReduce block

    The issue in the AsFactor fit method was that the max was being found in the "map" phase, not the reduce phase.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rawkintrevo/mahout mahout-1936

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/mahout/pull/278.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #278

----
commit 14c795c9b2ab0868e5acc281e9ce5f9710534df0
Author: rawkintrevo <trevor.d.gr...@gmail.com>
Date:   2017-02-04T05:50:07Z

    MAHOUT-1936 fix AsFactor allReduce block

----

> FactorMap finds column maximums incorrectly on large data sets
> --------------------------------------------------------------
>
>                 Key: MAHOUT-1936
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1936
>             Project: Mahout
>          Issue Type: Bug
>          Components: Algorithms
>    Affects Versions: 0.13.0
>            Reporter: Trevor Grant
>             Fix For: 0.13.0
>
>
> FactorMap's fit method does not properly find the maximum of the column.
> Likely due to an improper allreduceBlock here
> https://github.com/apache/mahout/blob/master/math-scala/src/main/scala/org/apache/mahout/math/algorithms/preprocessing/AsFactor.scala#L40
> Also, factorMap in this instance might be more appropriately named "factorMax"
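
For readers unfamiliar with the map/reduce distinction mentioned in the pull request, here is a minimal sketch of the general pattern. It uses plain Scala collections in place of a distributed matrix and is not the code from AsFactor.scala or the patch. Every name in it (ColumnMaxSketch, blockColMax, mergeMax, columnMax, columnMaxMapOnly) is hypothetical, and it assumes non-empty blocks with equal column counts. The faulty variant shows one way a maximum taken only per block can go wrong when there is more than one block, which would be consistent with the problem appearing only on large data sets.

```scala
// Illustrative only: plain Scala collections stand in for a distributed matrix.
// None of these names are Mahout APIs.
object ColumnMaxSketch {
  type Block = Array[Array[Double]] // the rows held by one partition

  // "Map" phase: each block reports its own per-column maximum.
  def blockColMax(block: Block): Array[Double] =
    block.transpose.map(col => col.reduce((x, y) => math.max(x, y)))

  // "Reduce" phase: merge two partial results with an element-wise max.
  def mergeMax(a: Array[Double], b: Array[Double]): Array[Double] =
    a.zip(b).map { case (x, y) => math.max(x, y) }

  // Correct pattern: the final maximum is decided while combining partials.
  def columnMax(blocks: Seq[Block]): Array[Double] =
    blocks.map(blockColMax).reduce(mergeMax)

  // Faulty pattern: the max is taken per block only, and the combine step
  // keeps the later partial result instead of comparing the two.
  def columnMaxMapOnly(blocks: Seq[Block]): Array[Double] =
    blocks.map(blockColMax).reduce((_, later) => later)

  def main(args: Array[String]): Unit = {
    val blocks: Seq[Block] = Seq(
      Array(Array(1.0, 9.0), Array(2.0, 3.0)),
      Array(Array(7.0, 4.0), Array(0.0, 5.0))
    )
    println(columnMax(blocks).mkString(", "))        // 7.0, 9.0
    println(columnMaxMapOnly(blocks).mkString(", ")) // 7.0, 5.0
  }
}
```

With a single block both functions agree, so a test on a small input would not expose the difference.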