Github user avulanov commented on the pull request:

    https://github.com/apache/spark/pull/1290#issuecomment-70313952
  
    Below are the results of my batch-size experiments.
    
    First, I tested what kind of improvement batching makes per backpropagation iteration. I used a single machine (Core i5 2.5 GHz, 8 GB RAM, Win64) and tried different batch sizes and different versions of matrix multiplication (no native BLAS, reference-win64, and system-win64). The dataset was MNIST, with 780 features in each vector. The batch matrix size is num_features times batch_size. The graph is plotted on a log-log scale.
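    To make the setup concrete, here is a minimal Breeze sketch of what batching means here (my illustration only, not the code in this PR; `stackBatch` and `forward` are hypothetical helpers): the batch of input vectors is stacked into a (num_features x batch_size) matrix, so a layer's forward pass is a single BLAS dgemm instead of batch_size separate matrix-vector products.

```scala
import breeze.linalg.{DenseMatrix, DenseVector}
import breeze.numerics.sigmoid

// Hypothetical sketch, not the implementation in this PR.
object BatchForwardSketch {

  // Stack the input vectors column-wise into a (numFeatures x batchSize)
  // matrix; Breeze DenseMatrix is column-major, so a flat concat works.
  def stackBatch(vectors: Seq[DenseVector[Double]], numFeatures: Int): DenseMatrix[Double] =
    new DenseMatrix(numFeatures, vectors.length, vectors.flatMap(_.toArray).toArray)

  // weights: (numOutputs x numFeatures), bias: (numOutputs),
  // xBatch:  (numFeatures x batchSize) -> returns (numOutputs x batchSize)
  def forward(weights: DenseMatrix[Double],
              bias: DenseVector[Double],
              xBatch: DenseMatrix[Double]): DenseMatrix[Double] = {
    val z = weights * xBatch                     // one dgemm for the whole batch
    for (j <- 0 until z.cols) z(::, j) :+= bias  // add bias to every column
    sigmoid(z)                                   // element-wise activation
  }
}
```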
    
    
![image](https://cloud.githubusercontent.com/assets/939524/5783296/76089c90-9d76-11e4-8719-bb8c589d13a4.png)
    
    
    It turns out that the optimal batch size is around 100 for every version of the multiplication routine. For a batch size of 100 the batch matrix has 78,000 elements in this experiment. Perhaps one should pick a batch that forms a matrix with on the order of 10K-100K elements, i.e. batch_size ~ 10K / num_features or 100K / num_features. I confirmed this observation by an experiment with my other data. Also, if we look at the graph for matrix multiplication (dgemm) from https://github.com/fommil/netlib-java, we can see that the optimal matrix size is around 10^3 (the end of the horizontal line with constant time). This is more or less in line with my results, although backpropagation is heavier than the dgemm benchmark.
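    As a rough back-of-the-envelope check of that heuristic (the helper name below is mine, not from the PR):

```scala
// Rough illustration of the batch-size heuristic above: aim for roughly
// 10K-100K elements in the (num_features x batch_size) batch matrix.
def suggestedBatchSize(numFeatures: Int, targetElements: Int = 100000): Int =
  math.max(1, targetElements / numFeatures)

suggestedBatchSize(780)          // ~128 for this MNIST setup (780 features)
suggestedBatchSize(780, 10000)   // ~12 with the more conservative 10K target
```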
    
    Second, I tested how batching performs on a Linux cluster with reasonably sized data: a cluster of 6 machines (Xeon 3.3 GHz, 4 cores, 16 GB RAM) with 12 workers in total, using the mnist8m dataset. I trained ANNClassifier for 40 iterations with no hidden layer, i.e. a 784x10 topology. The batch size and the use of native system BLAS were the parameters of the experiment. The resulting error on the MNIST test set was around 9% in all experiments.
    
    
![image](https://cloud.githubusercontent.com/assets/939524/5783415/7c6cb764-9d77-11e4-9972-42c888999671.png)
    
    This graph also suggests that the batch size should be around 100, or proportional to 100K / num_features.

