[
https://issues.apache.org/jira/browse/MAHOUT-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karol Grzegorczyk updated MAHOUT-1557:
--------------------------------------
Attachment: mlp_sparse.diff
Yes, my patch was a bit buggy. I'm sorry. I've corrected it and added a unit
test to verify it.
Regarding the internal vector representation (dense vs. sparse) in the
{{NeuralNetwork}} class, I think that is a somewhat different issue. Maybe it
can be changed, maybe not; I do not know. My motivation was only to be able to
pass the input data file in a sparse format, in a similar way to how we pass it
to the Mahout Naive Bayes classifier.
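For context, the point of a sparse format is that memory and I/O scale with the
number of non-zero entries rather than with the cardinality (the number of
input units). A minimal illustrative sketch of that idea in plain Java follows;
it is not Mahout's actual {{Vector}} API (Mahout provides
{{RandomAccessSparseVector}} and related classes for this), just the underlying
representation:

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative sparse vector: only non-zero entries are stored, so
// memory scales with the number of non-zeros, not the cardinality.
// This is a sketch of the idea, not Mahout's Vector implementation.
class SparseVectorSketch {
    private final int cardinality;                    // logical length of the vector
    private final Map<Integer, Double> entries = new TreeMap<>();

    SparseVectorSketch(int cardinality) {
        this.cardinality = cardinality;
    }

    void set(int index, double value) {
        if (index < 0 || index >= cardinality) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        if (value == 0.0) {
            entries.remove(index);                    // keep the map strictly non-zero
        } else {
            entries.put(index, value);
        }
    }

    double get(int index) {
        return entries.getOrDefault(index, 0.0);      // absent entries read as zero
    }

    int nonZeroCount() {
        return entries.size();
    }
}
```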
In addition, as part of this patch, I've added a newline separator between the
predicted label indices (at the end of the
{{RunMultilayerPerceptron.main}} method). Previously, entries in the output
file were not separated at all. I've also added assertions to the unit tests
checking that the output is of the required size.
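The separator change amounts to writing one predicted label index per line
instead of concatenating them, so the number of output lines matches the number
of input records. A hedged sketch of that idea (the actual change is in the
attached {{mlp_sparse.diff}}; the class and method names here are illustrative):

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Illustrative only: formats predicted label indices one per line,
// mirroring the newline separator added to RunMultilayerPerceptron.main.
class LabelOutputSketch {
    static String format(int[] predictedIndices) {
        StringWriter out = new StringWriter();
        PrintWriter writer = new PrintWriter(out);
        for (int index : predictedIndices) {
            writer.println(index);   // println appends the platform line separator
        }
        writer.flush();
        return out.toString();
    }
}
```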
> Add support for sparse training vectors in MLP
> ----------------------------------------------
>
> Key: MAHOUT-1557
> URL: https://issues.apache.org/jira/browse/MAHOUT-1557
> Project: Mahout
> Issue Type: Improvement
> Components: Classification
> Reporter: Karol Grzegorczyk
> Priority: Minor
> Labels: mlp
> Fix For: 1.0
>
> Attachments: mlp_sparse.diff
>
>
> When the number of input units of an MLP is large, it is likely that the input
> vector will be sparse. It should be possible to read input files in a sparse format.
--
This message was sent by Atlassian JIRA
(v6.2#6252)