[ https://issues.apache.org/jira/browse/MAHOUT-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505915#comment-14505915 ]
ASF GitHub Bot commented on MAHOUT-1693:
----------------------------------------
GitHub user andrewpalumbo opened a pull request:
https://github.com/apache/mahout/pull/121
MAHOUT-1693: Override .toString() in AbstractMatrix using VectorView
I still have to test this out a bit, but this is a fix for the memory
issues caused by the shell triggering .toString() after instantiating a
large DenseMatrix, SparseMatrix or FunctionalMatrixView (MAHOUT-1693). It
is set to display an (arbitrarily sized) 10x20 upper-left block of the matrix.
For sparse matrices this does change the printout, so it may not be acceptable.
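
As a rough illustration of the bounded printout described above, here is a minimal, self-contained Java sketch. It is not the code in the pull request and does not use the Mahout API: the class BlockPreview, the method preview() and the output format are all hypothetical, and a plain double[][] stands in for a Mahout matrix. It only shows the idea that formatting a fixed upper-left block keeps the cost of toString() independent of the matrix size.

```java
public class BlockPreview {

    /**
     * Formats at most maxRows x maxCols entries from the upper-left corner.
     * Hypothetical helper: the output format is made up for this sketch and
     * is not Mahout's.
     */
    static String preview(double[][] m, int maxRows, int maxCols) {
        int rows = Math.min(m.length, maxRows);
        StringBuilder sb = new StringBuilder();
        for (int r = 0; r < rows; r++) {
            int cols = Math.min(m[r].length, maxCols);
            sb.append(r).append(" => [");
            for (int c = 0; c < cols; c++) {
                if (c > 0) {
                    sb.append(", ");
                }
                sb.append(m[r][c]);
            }
            // Mark rows that were cut off on the right.
            if (cols < m[r].length) {
                sb.append(cols > 0 ? ", ..." : "...");
            }
            sb.append("]\n");
        }
        // Mark the rows that were not printed at all.
        if (rows < m.length) {
            sb.append("... (").append(m.length - rows).append(" more rows)\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Only the 10x20 upper-left block of a 1000x1000 matrix is formatted.
        double[][] big = new double[1000][1000];
        System.out.print(preview(big, 10, 20));
    }
}
```

The work done here is bounded by maxRows * maxCols no matter how large the matrix is, which is the property the shell needs when it echoes a freshly created value.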
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/andrewpalumbo/mahout shellMem
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/mahout/pull/121.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #121
----
commit ded13976ed3ee4f4e4e8f1e59629ccf8f6986b34
Author: Andrew Palumbo <[email protected]>
Date: 2015-04-21T20:27:31Z
use VectorView to display subVectors in AbstractVector.toString()
commit ffe9ee5701922fde1e897f21a7eb5b0896e461b0
Author: Andrew Palumbo <[email protected]>
Date: 2015-04-21T20:59:18Z
fixed check for numRows(). added work on SparseColumnMatrix
----
> FunctionalMatrixView materializes row vectors in scala shell
> ------------------------------------------------------------
>
> Key: MAHOUT-1693
> URL: https://issues.apache.org/jira/browse/MAHOUT-1693
> Project: Mahout
> Issue Type: Bug
> Components: Mahout spark shell, Math
> Affects Versions: 0.10.0
> Reporter: Suneel Marthi
> Assignee: Andrew Palumbo
> Priority: Blocker
> Fix For: 0.10.1
>
>
> FunctionalMatrixView materializes row vectors in scala shell.
> The problem was first reported by a user, Michael Alton of Intel:
> "When I first tried to make a large matrix, I got an out of Java heap space
> error. I increased the memory incrementally until I got it to work. “export
> MAHOUT_HEAPSIZE=8000” didn’t work, but “export MAHOUT_HEAPSIZE=64000” did.
> The question is why do we need so much memory? A 5000x5000 matrix of doubles
> should only take up ~200MB of space?"
> The problem has been narrowed down to FunctionalMatrixView not overriding
> the toString() method, which causes it to materialize all of the row vectors
> when run in the Mahout Spark Shell.
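
The ~200MB estimate quoted above can be checked with a short Java sketch. The class and method names are hypothetical, and the calculation assumes a dense layout of 8-byte doubles, ignoring JVM object headers and any per-row overhead.

```java
public class DenseMatrixFootprint {

    /** Raw payload of a dense rows x cols matrix of doubles, in bytes. */
    static long denseBytes(long rows, long cols) {
        return rows * cols * Double.BYTES; // Double.BYTES == 8
    }

    public static void main(String[] args) {
        long bytes = denseBytes(5000, 5000);
        // 5000 * 5000 * 8 = 200,000,000 bytes, i.e. about 200 MB.
        System.out.printf("%d bytes = %.1f MB%n", bytes, bytes / 1e6);
    }
}
```

So the raw data is far below the 8000 heap setting that failed, which is consistent with the extra memory going to the materialized row vectors rather than to the matrix itself.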
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)