Repository: systemml
Updated Branches:
  refs/heads/gh-pages 0284f593f -> 135cf394c

[SYSTEMML-445] Refactored the shadow buffer and added documentation for newly added features

- Refactored the shadow buffer logic from GPUObject to ShadowBuffer class for maintenance.
- Added an additional timer to measure shadow buffer time.
- Updated the gpu documentation

Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/135cf394
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/135cf394
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/135cf394

Branch: refs/heads/gh-pages
Commit: 135cf394c0397a81d0638c68f4f524f85a637e74
Parents: 0284f59
Author: Niketan Pansare <[email protected]>
Authored: Mon Aug 6 09:40:08 2018 -0700
Committer: Niketan Pansare <[email protected]>
Committed: Mon Aug 6 09:42:45 2018 -0700

----------------------------------------------------------------------
 gpu.md | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/systemml/blob/135cf394/gpu.md
----------------------------------------------------------------------
diff --git a/gpu.md b/gpu.md
index e9d7bca..5e13e60 100644
--- a/gpu.md
+++ b/gpu.md
@@ -91,4 +91,30 @@ cd gcc-5.3.0
 num_cores=`grep -c ^processor /proc/cpuinfo`
 make -j $num_cores
 sudo make install
-```
\ No newline at end of file
+```
+
+# Advanced Configuration
+
+## Using single precision
+
+By default, SystemML uses double precision to store its matrices in the GPU memory.
+To use single precision, the user needs to set the configuration property 'sysml.floating.point.precision'
+to 'single'. However, with the exception of BLAS operations, SystemML always performs all CPU operations
+in double precision.
+
+## Training very deep networks
+
+### Shadow buffer
+To train very deep networks with double precision, no additional configuration is necessary.
+But to train very deep networks with single precision, the user can speed up eviction by
+using the shadow buffer. The fraction of the driver memory to be allocated to the shadow buffer can
+be set by using the configuration property 'sysml.gpu.eviction.shadow.bufferSize'.
+In the current version, the shadow buffer is not guarded by SystemML
+and can potentially lead to OOM if the network is deep as well as wide.
+
+### Unified memory allocator
+
+By default, SystemML uses CUDA's memory allocator and performs on-demand eviction
+using the eviction policy set by the configuration property 'sysml.gpu.eviction.policy'.
+To use CUDA's unified memory allocator that performs page-level eviction instead,
+please set the configuration property 'sysml.gpu.memory.allocator' to 'unified_memory'.
\ No newline at end of file
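
As a rough illustration of how the properties documented in this change might be collected into a configuration file, the sketch below uses Python's standard library to emit an XML document with one element per property. It assumes the `<root><name>value</name></root>` layout of SystemML-config.xml; the helper `build_config` is hypothetical, and the shadow buffer fraction `0.5` is an illustrative value only, not a recommendation from the documentation.

```python
import xml.etree.ElementTree as ET

# Property names come from the documentation added in this commit.
# 'single' is the value the text names for single precision;
# "0.5" is an illustrative fraction of driver memory (assumption).
GPU_PROPERTIES = {
    "sysml.floating.point.precision": "single",
    "sysml.gpu.eviction.shadow.bufferSize": "0.5",
}


def build_config(properties):
    """Return an XML string with one child element per property.

    Hypothetical helper; assumes the <root><name>value</name></root>
    layout used by SystemML-config.xml.
    """
    root = ET.Element("root")
    for name, value in properties.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


config_xml = build_config(GPU_PROPERTIES)
print(config_xml)
```

The resulting string could be saved to a file and supplied as the SystemML configuration. Because the shadow buffer is not guarded in the current version, any fraction chosen here should be checked against the driver memory actually available.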
