Dear All,

what do you think about scaling out the learning of Multi Layer Perceptrons (MLPs) with BSP? I heard Tommaso's talk at ApacheCon. At first glance, the BSP programming model seems to fit this purpose better than MapReduce. The basic idea for distributing the backprop algorithm is the following:
Distribution of the learning can be done as follows (batch learning):

1. Partition the data into x chunks.
2. On each worker node: learn the weight changes (as matrices) on its chunk.
3. Combine the matrices (weight changes) and simultaneously update the weights on every node, then go back to 2.

(A very rough sketch of one such superstep is appended below as a PS.) Maybe this procedure could also be done with random parts of the chunks (distributed quasi-online learning).

I wrote the (basic) backprop algorithm of a multi-layer perceptron (see Mahout patch https://issues.apache.org/jira/browse/MAHOUT-976). It uses the Mahout Matrix Library, which under the hood is the Colt Matrix Library from CERN. The Colt Matrix Library would probably also be suitable for Hama, so it should be easy to port the MLP to Hama. What do you think about it? Thanks for your response.

Cheers
Christian
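PS: to make steps 2 and 3 a bit more concrete, here is a rough, untested sketch of how one training superstep could look. I am not yet familiar with the exact Hama API, so the Peer interface below is hypothetical (modeled loosely on what I understand a BSP peer to offer: send to other peers, barrier sync, read messages); only the Mahout Matrix calls (plus, divide) are the real library API, and computeDelta stands in for the backprop code from the patch.

import java.util.List;
import org.apache.mahout.math.Matrix;

/**
 * Sketch of one batch-learning superstep: compute local weight changes,
 * exchange them, average, and apply the same update on every node.
 * For simplicity a single weight matrix is used; a real MLP would carry
 * one matrix per layer.
 */
public class MlpSuperstepSketch {

  /** Hypothetical stand-in for a BSP peer; not a real Hama class. */
  interface Peer {
    void sendToAllPeers(Matrix weightDelta);  // assumed messaging primitive
    void sync();                              // assumed barrier
    List<Matrix> receivedDeltas();            // assumed inbox of remote deltas after sync
    int numPeers();
  }

  /** One superstep: local backprop on this node's chunk, then global combine. */
  Matrix superstep(Peer peer, Matrix weights, double[][] localChunk) {
    // Step 2: accumulate the weight changes over the local chunk via backprop.
    // computeDelta would be the per-chunk backprop from MAHOUT-976 and already
    // includes sign and learning rate, so the result can simply be added.
    Matrix localDelta = computeDelta(weights, localChunk);

    // Step 3a: send the local weight-change matrix to every other worker.
    peer.sendToAllPeers(localDelta);
    peer.sync();  // barrier: all deltas have been sent and received

    // Step 3b: combine the matrices by averaging, then update the weights
    // identically on every node so all workers stay in sync.
    Matrix combined = localDelta;
    for (Matrix remote : peer.receivedDeltas()) {
      combined = combined.plus(remote);
    }
    combined = combined.divide(peer.numPeers());

    return weights.plus(combined);
  }

  private Matrix computeDelta(Matrix weights, double[][] chunk) {
    // placeholder for the per-chunk backprop from the Mahout patch
    throw new UnsupportedOperationException("see MAHOUT-976");
  }
}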
