Github user avulanov commented on the pull request:

    https://github.com/apache/spark/pull/1290#issuecomment-70313952
  
    Below are the results of my batch-size experiments.
    
    First, I tested what kind of improvement batching makes per backpropagation iteration. I used a single machine (Core i5 2.5 GHz, 8 GB RAM, Win64) and tried different batch sizes and different versions of matrix multiplication (no native BLAS, reference-win64, and system-win64). The dataset was MNIST, with 780 features in each vector. The batch matrix size is num_features times batch_size. The graph is plotted on a log-log scale.
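    To make the setup concrete, here is a minimal Breeze sketch of what batching means here (my illustration only, not the code in this PR; `stackBatch` and `forward` are hypothetical helpers): the batch of input vectors is stacked into a (num_features x batch_size) matrix, so a layer's forward pass is a single BLAS dgemm instead of batch_size separate matrix-vector products.

```scala
import breeze.linalg.{DenseMatrix, DenseVector}
import breeze.numerics.sigmoid

// Hypothetical sketch, not the implementation in this PR.
object BatchForwardSketch {

  // Stack the input vectors column-wise into a (numFeatures x batchSize)
  // matrix; Breeze DenseMatrix is column-major, so a flat concat works.
  def stackBatch(vectors: Seq[DenseVector[Double]], numFeatures: Int): DenseMatrix[Double] =
    new DenseMatrix(numFeatures, vectors.length, vectors.flatMap(_.toArray).toArray)

  // weights: (numOutputs x numFeatures), bias: (numOutputs),
  // xBatch:  (numFeatures x batchSize) -> returns (numOutputs x batchSize)
  def forward(weights: DenseMatrix[Double],
              bias: DenseVector[Double],
              xBatch: DenseMatrix[Double]): DenseMatrix[Double] = {
    val z = weights * xBatch                     // one dgemm for the whole batch
    for (j <- 0 until z.cols) z(::, j) :+= bias  // add bias to every column
    sigmoid(z)                                   // element-wise activation
  }
}
```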
    
    
![image](https://cloud.githubusercontent.com/assets/939524/5783296/76089c90-9d76-11e4-8719-bb8c589d13a4.png)
    
    
    It turns out that the optimal batch size is around 100 for every version of the multiplication routine. For a batch size of 100 the batch matrix has 78,000 elements in this experiment. Perhaps one should pick a batch that forms a matrix with on the order of 10K-100K elements, i.e. batch_size ~ 10K / num_features or 100K / num_features. I confirmed this observation by an experiment with my other data. Also, if we look at the graph for matrix multiplication (dgemm) from https://github.com/fommil/netlib-java, we can see that the optimal matrix size is around 10^3 (the end of the horizontal line with constant time). This is more or less in line with my results, although backpropagation is heavier than the dgemm benchmark.
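    As a rough back-of-the-envelope check of that heuristic (the helper name below is mine, not from the PR):

```scala
// Rough illustration of the batch-size heuristic above: aim for roughly
// 10K-100K elements in the (num_features x batch_size) batch matrix.
def suggestedBatchSize(numFeatures: Int, targetElements: Int = 100000): Int =
  math.max(1, targetElements / numFeatures)

suggestedBatchSize(780)          // ~128 for this MNIST setup (780 features)
suggestedBatchSize(780, 10000)   // ~12 with the more conservative 10K target
```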
    
    Second, I tested how batching performs on a Linux cluster with reasonably sized data: a cluster of 6 machines (Xeon 3.3 GHz, 4 cores, 16 GB RAM) with 12 workers in total, using the mnist8m dataset. I trained ANNClassifier for 40 iterations with no hidden layer, i.e. a 784x10 topology. The batch size and the use of native system BLAS were the parameters of the experiment. The resulting error on the MNIST test set was around 9% in all experiments.
    
    
![image](https://cloud.githubusercontent.com/assets/939524/5783415/7c6cb764-9d77-11e4-9972-42c888999671.png)
    
    This graph also suggests that the batch size should be around 100, or proportional to 100K / num_features.

