GitHub user bgreeven commented on the pull request:

    https://github.com/apache/spark/pull/1290#issuecomment-68241121
  
    I have compared the ANN with Support Vector Machine (SVM) and Logistic 
Regression.
    
    I have tested using a master "local[5]" configuration, applied to the 
MNIST dataset, using 60000 training examples and 10000 test examples.
    
    Since SVM and Logistic Regression are binary classifiers, I applied two 
methods to turn them into multiclass classifiers: majority vote (one-vs-rest) 
and an ad-hoc tree.
    
    For the majority vote, I trained 10 different models, each distinguishing a 
single class from the rest. Classification was done by selecting the class 
whose model gave the highest output. I performed 100 iterations per class, 
giving 1000 iterations in total.
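
    For reference, a minimal sketch of this one-vs-rest voting scheme, assuming 
MLlib's `SVMWithSGD` (the helper names and RDDs are illustrative, not the 
exact code I ran):

    ```scala
    import org.apache.spark.mllib.classification.{SVMModel, SVMWithSGD}
    import org.apache.spark.mllib.linalg.Vector
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.rdd.RDD

    // Train one binary model per digit: class k versus the rest.
    def trainOneVsRest(data: RDD[LabeledPoint], numClasses: Int): Array[SVMModel] =
      (0 until numClasses).map { k =>
        val relabelled = data.map(p =>
          LabeledPoint(if (p.label == k.toDouble) 1.0 else 0.0, p.features))
        val model = SVMWithSGD.train(relabelled, 100) // 100 iterations per class
        model.clearThreshold() // output raw scores instead of 0/1 labels
        model
      }.toArray

    // Classify by picking the class whose model gives the highest output.
    def predictOneVsRest(models: Array[SVMModel], features: Vector): Int =
      models.zipWithIndex.maxBy { case (m, _) => m.predict(features) }._2
    ```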
    
    For the ANN, I used a single hidden layer with 32 nodes (not counting the 
bias nodes) and performed 100 iterations.
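
    As a rough sketch of the ANN setup: I am assuming here that 
`ArtificialNeuralNetwork.train` from this PR takes (input, output) vector 
pairs, a hidden-layer topology, and an iteration count; the exact signature may 
differ from the PR code.

    ```scala
    import org.apache.spark.mllib.linalg.{Vector, Vectors}

    // One-hot encode digit k (0..9) as the ANN's target output vector.
    def oneHot(k: Int): Vector = Vectors.sparse(10, Array(k), Array(1.0))

    // Hypothetical call, signature assumed from this PR: trainPairs is an
    // RDD[(Vector, Vector)] of (features, oneHot(label)) pairs, one hidden
    // layer of 32 nodes, 100 iterations.
    val annModel = ArtificialNeuralNetwork.train(trainPairs, Array(32), 100)
    ```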
    
    For LBFGS, I used a convergence tolerance of 1e-5.
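
    For reference, a minimal sketch of setting this tolerance on MLlib's 
`LogisticRegressionWithLBFGS` (`trainingData` is an illustrative 
`RDD[LabeledPoint]`):

    ```scala
    import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS

    val lr = new LogisticRegressionWithLBFGS()
    lr.optimizer
      .setConvergenceTol(1e-5) // tolerance used in these runs
      .setNumIterations(100)
    val model = lr.run(trainingData)
    ```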
    
    Because of the poor performance of SVM+SGD, I re-ran it with 1000 
iterations per class (10000 in total); the accuracy was essentially unchanged.
    
    I found the following results for the test set:
    
    ```
    +-----------------------------+----------+-----------+-----------+-------------+
    | Algorithm                   | Accuracy | Time      | # correct | # incorrect |
    +-----------------------------+----------+-----------+-----------+-------------+
    | ANN (LBFGS)                 |    95.1% |      665s |      9510 |         490 |
    +-----------------------------+----------+-----------+-----------+-------------+
    | Logistic Regression (SGD)   |    72.0% |     1325s |      7202 |        2798 |
    +-----------------------------+----------+-----------+-----------+-------------+
    | Logistic Regression (LBFGS) |    86.6% |     1635s |      8658 |        1342 |
    +-----------------------------+----------+-----------+-----------+-------------+
    | SVM (SGD)                   |    18.6% |     1294s |      1860 |        8140 |
    +-----------------------------+----------+-----------+-----------+-------------+
    | (SVM (SGD) 1000 iterations) |    18.5% |    12658s |      1850 |        8150 |
    +-----------------------------+----------+-----------+-----------+-------------+
    | SVM (LBFGS)                 |    86.2% |     1453s |      8622 |        1378 |
    +-----------------------------+----------+-----------+-----------+-------------+
    ```
    
    I also created an ad-hoc tree model. This splits the collection of 
training examples into two approximately equal-sized partitions, where I tried 
to separate the digits based on how different they look. I then split each 
partition recursively, until each output class corresponded to a single digit.
    
    The partitioning choice was made manually and intuitively, as follows:
    
    ```
    0123456789 -> (04689, 12357)
    04689      -> (068, 49)
    068        -> (0, 68)
    68         -> (6, 8)
    49         -> (4, 9)
    12357      -> (17, 235)
    17         -> (1, 7)
    235        -> (2, 35)
    35         -> (3, 5)
    ```
    
    Notice that this leads to only nine classification runs, not ten as in the 
voting scheme.
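
    A minimal sketch of the descent logic for classification (the node layout 
and model type are illustrative; each internal node holds one binary model):

    ```scala
    import org.apache.spark.mllib.classification.SVMModel
    import org.apache.spark.mllib.linalg.Vector

    // Each internal node holds one binary model separating its two digit
    // groups; each leaf corresponds to a single digit.
    sealed trait Node
    case class Leaf(digit: Int) extends Node
    case class Split(model: SVMModel, left: Node, right: Node) extends Node

    // Descend from the root until a leaf is reached. With the default 0/1
    // threshold, a prediction below 0.5 selects the left branch.
    def classifyTree(node: Node, features: Vector): Int = node match {
      case Leaf(d) => d
      case Split(model, left, right) =>
        if (model.predict(features) < 0.5) classifyTree(left, features)
        else classifyTree(right, features)
    }
    ```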
    
    I then used the trained models to classify the test set, and got the 
following results (same parameters as with the voting scheme):
    
    ```
    +-----------------------------+----------+-----------+-----------+-------------+
    | Algorithm                   | Accuracy | Time      | # correct | # incorrect |
    +-----------------------------+----------+-----------+-----------+-------------+
    | ANN (LBFGS)                 |    95.1% |      665s |      9510 |         490 |
    +-----------------------------+----------+-----------+-----------+-------------+
    | Logistic Regression (SGD)   |    82.3% |     1146s |      8228 |        1772 |
    +-----------------------------+----------+-----------+-----------+-------------+
    | Logistic Regression (LBFGS) |    87.2% |     1273s |      8719 |        1281 |
    +-----------------------------+----------+-----------+-----------+-------------+
    | SVM (SGD)                   |    61.1% |     1148s |      6113 |        3887 |
    +-----------------------------+----------+-----------+-----------+-------------+
    | SVM (LBFGS)                 |    87.5% |     1182s |      8753 |        1247 |
    +-----------------------------+----------+-----------+-----------+-------------+
    ```
    
    Note that I kept ANN in the table, since the purpose is to compare ANN 
with the other algorithms. Because ANN is a multiclass classifier by nature, it 
did not use the ad-hoc tree, so its results are identical to the first table.
    
    It would be great if someone could verify my results. I am particularly 
surprised by the low performance of SVM+SGD with voting, and by its 
improvement with the ad-hoc tree. I used the same code for SGD and LBFGS, 
changing only the optimiser and related parameters.

