[GitHub] spark pull request: [MLLIB] [spark-2352] Implementation of an Arti...

avulanov Fri, 08 May 2015 12:30:12 -0700

Github user avulanov commented on the pull request:

    https://github.com/apache/spark/pull/1290#issuecomment-100334613
  
    I ran a small test to compare the performance of the new implementation with the previous one.
    * 8 machines (Xeon 3.3GHz, 4 cores, 16GB RAM) with 7 workers total
    * mnist8m dataset, persisted in memory
    * Network topology 784x10 (no hidden layer = logistic regression)
    * LBFGS optimizer, 40 steps, tolerance 1e-4, batch size = 100
    * Accuracy on mnist test set: 0.9076
    
    Name | Time, hh:mm:ss
    -----|---------------
    Total time | 00:03:53
    Avg step time | 00:00:06
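
    As a quick sanity check on the table, the average step time can be recomputed from the total. This is only a sketch: it assumes all 40 configured L-BFGS iterations actually ran (the optimizer may stop earlier once the tolerance is met), and `StepTime` is a hypothetical helper, not part of the pull request.

    ```scala
    object StepTime {
      // Convert an "hh:mm:ss" string into a number of seconds.
      def toSeconds(hms: String): Int = {
        val Array(h, m, s) = hms.split(":").map(_.toInt)
        h * 3600 + m * 60 + s
      }

      // Average seconds per step, assuming `steps` iterations were executed.
      def avgStepSeconds(total: String, steps: Int): Double =
        toSeconds(total).toDouble / steps
    }
    ```

    With the reported total of 00:03:53 (233 seconds) and 40 steps this gives about 5.8 seconds, consistent with the rounded 00:00:06 in the table.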
    
    Code (for the new version, https://github.com/avulanov/spark/tree/ann-interface-gemm):
    
    ```
    import org.apache.spark.mllib.util.MLUtils
    import org.apache.spark.mllib.ann.{FeedForwardTrainer, Topology}
    import org.apache.spark.mllib.classification.ANNClassifier

    // Training set (mnist8m) and test set, both cached in memory
    val mnist = MLUtils.loadLibSVMFile(sc, "hdfs://my.net:9000/input/mnist8m.scale").persist()
    val mnist784 = MLUtils.loadLibSVMFile(sc, "hdfs://my.net:9000/input/mnist.scale.t.784").persist()
    // 784 inputs, 10 outputs, no hidden layer
    val topology = Topology.multiLayerPerceptron(Array[Int](784, 10), false)
    val trainer = new FeedForwardTrainer(topology, 784, 10).setBatchSize(100)
    trainer.LBFGSOptimizer.setNumIterations(40).setConvergenceTol(1e-4)
    val model40 = new ANNClassifier(trainer).train(mnist)
    // Fraction of test examples whose predicted label matches the true label
    val predictionAndLabels = mnist784.map(lp => (model40.predict(lp.features), lp.label))
    val accuracy = predictionAndLabels.map { case (p, l) => if (p == l) 1 else 0 }.sum() / predictionAndLabels.count()
    ```
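
    The accuracy figure in the snippet is the share of test examples whose predicted label equals the true label. The same calculation can be sketched on a plain Scala collection, without Spark. This assumes predictions and labels are both `Double` class indices, as the `p == l` comparison suggests; `AccuracySketch` is a hypothetical name used only for illustration.

    ```scala
    object AccuracySketch {
      // Fraction of (prediction, label) pairs that agree; 0.0 for empty input.
      def accuracy(predictionAndLabels: Seq[(Double, Double)]): Double =
        if (predictionAndLabels.isEmpty) 0.0
        else predictionAndLabels.count { case (p, l) => p == l }.toDouble /
          predictionAndLabels.size
    }
    ```

    For example, `AccuracySketch.accuracy(Seq((1.0, 1.0), (0.0, 1.0), (2.0, 2.0), (3.0, 3.0)))` counts three matches out of four pairs. Unlike the RDD version, this guards against an empty test set, where the original expression would divide by zero.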



