Repository: systemml
Updated Branches:
  refs/heads/gh-pages 0284f593f -> 135cf394c


[SYSTEMML-445] Refactored the shadow buffer and added documentation for newly added features

- Refactored the shadow buffer logic from GPUObject into the ShadowBuffer class for easier maintenance.
- Added an additional timer to measure shadow buffer time.
- Updated the GPU documentation.


Project: http://git-wip-us.apache.org/repos/asf/systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/systemml/commit/135cf394
Tree: http://git-wip-us.apache.org/repos/asf/systemml/tree/135cf394
Diff: http://git-wip-us.apache.org/repos/asf/systemml/diff/135cf394

Branch: refs/heads/gh-pages
Commit: 135cf394c0397a81d0638c68f4f524f85a637e74
Parents: 0284f59
Author: Niketan Pansare <[email protected]>
Authored: Mon Aug 6 09:40:08 2018 -0700
Committer: Niketan Pansare <[email protected]>
Committed: Mon Aug 6 09:42:45 2018 -0700

----------------------------------------------------------------------
 gpu.md | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/systemml/blob/135cf394/gpu.md
----------------------------------------------------------------------
diff --git a/gpu.md b/gpu.md
index e9d7bca..5e13e60 100644
--- a/gpu.md
+++ b/gpu.md
@@ -91,4 +91,30 @@ cd gcc-5.3.0
 num_cores=`grep -c ^processor /proc/cpuinfo`
 make -j $num_cores
 sudo make install
-```
\ No newline at end of file
+```
+
+# Advanced Configuration
+
+## Using single precision
+
+By default, SystemML uses double precision to store its matrices in the GPU memory.
+To use single precision, the user needs to set the configuration property 'sysml.floating.point.precision'
+to 'single'. However, with the exception of BLAS operations, SystemML always performs all CPU operations
+in double precision.
+
+## Training very deep networks
+
+### Shadow buffer
+To train a very deep network with double precision, no additional configuration is necessary.
+But to train a very deep network with single precision, the user can speed up eviction by
+using a shadow buffer. The fraction of the driver memory to be allocated to the shadow buffer can
+be set by using the configuration property 'sysml.gpu.eviction.shadow.bufferSize'.
+In the current version, the shadow buffer is not guarded by SystemML
+and can potentially lead to an out-of-memory (OOM) error if the network is both deep and wide.
+
+### Unified memory allocator
+
+By default, SystemML uses CUDA's memory allocator and performs on-demand eviction
+using the eviction policy set by the configuration property 'sysml.gpu.eviction.policy'.
+To use CUDA's unified memory allocator, which performs page-level eviction instead,
+set the configuration property 'sysml.gpu.memory.allocator' to 'unified_memory'.
\ No newline at end of file
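
For illustration only, the properties documented in this change could be combined in a configuration file roughly as sketched below. This assumes the properties are set as child elements of the root element of a SystemML-config.xml file, which the patch itself does not show. The shadow buffer value of 0.5 is a hypothetical fraction chosen for the example, not a recommended setting.

```xml
<root>
   <!-- Store GPU matrices in single precision (the default is double). -->
   <sysml.floating.point.precision>single</sysml.floating.point.precision>

   <!-- Fraction of driver memory given to the shadow buffer.
        0.5 is an illustrative value only; the buffer is not guarded,
        so a large value may cause OOM for deep and wide networks. -->
   <sysml.gpu.eviction.shadow.bufferSize>0.5</sysml.gpu.eviction.shadow.bufferSize>

   <!-- Optional: use CUDA's unified memory allocator, which performs
        page-level eviction, in place of the default allocator. -->
   <sysml.gpu.memory.allocator>unified_memory</sysml.gpu.memory.allocator>
</root>
```

Whether the shadow buffer and the unified memory allocator are meant to be enabled together is not stated in the patch, so treat the combination above as a layout example and not as a tested setup.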
