For those who don't know: Mocha is a Deep Learning framework for Julia <http://julialang.org/>, inspired by the C++ Deep Learning framework Caffe <http://caffe.berkeleyvision.org/>. Some highlights:
- *Modular Architecture*: Mocha has a clean architecture with isolated components like network layers, activation functions, solvers, regularizers, initializers, etc. The built-in components are sufficient for typical deep (convolutional) neural network applications, and more are being added in each release. All of them can be easily extended by defining custom sub-types.
- *High-level Interface*: Mocha is written in Julia <http://julialang.org/>, a high-level dynamic programming language designed for scientific computing. Combined with the expressive power of Julia and its package ecosystem, playing with deep neural networks in Mocha is easy and intuitive. See, for example, our IJulia Notebook of using a pre-trained ImageNet model to do image classification <http://nbviewer.ipython.org/github/pluskid/Mocha.jl/blob/master/examples/ijulia/ilsvrc12/imagenet-classifier.ipynb>.
- *Portability and Speed*: Mocha comes with multiple backends that can be switched transparently.
  - The *pure Julia backend* is portable -- it runs on any platform that supports Julia. It is reasonably fast on small models thanks to Julia's LLVM-based just-in-time (JIT) compiler and Performance Annotations <http://julia.readthedocs.org/en/latest/manual/performance-tips/#performance-annotations>, and can be very useful for prototyping.
  - The *native extension backend* can be turned on when a C++ compiler is available. It runs 2~3 times faster than the pure Julia backend.
  - The *GPU backend* uses NVidia® cuDNN <https://developer.nvidia.com/cuDNN>, cuBLAS and customized CUDA kernels to provide highly efficient computation. A 20~30 times or even greater speedup can be observed on a modern GPU device, especially on larger models.
- *Compatibility*: Mocha uses the widely adopted HDF5 format to store both datasets and model snapshots, making it easy to inter-operate with Matlab, Python (numpy) and other existing computational tools.
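To give a flavor of how the backends above are switched, here is a minimal sketch. It assumes the environment-variable flags and backend types described in the Mocha documentation (`MOCHA_USE_NATIVE_EXT`, `MOCHA_USE_CUDA`, `CPUBackend`, `GPUBackend`); treat the exact names as illustrative rather than authoritative for your installed version.

```julia
# Sketch: selecting a Mocha backend (names assumed from the Mocha docs).
# The flags must be set before the package is loaded.
ENV["MOCHA_USE_NATIVE_EXT"] = "true"   # build/use the C++ native extension
# ENV["MOCHA_USE_CUDA"] = "true"       # or: use the cuDNN/cuBLAS GPU backend

using Mocha

backend = CPUBackend()   # swap in GPUBackend() when CUDA is enabled
init(backend)

# ... define the net and solver against `backend`, then train ...

shutdown(backend)
```

Because the rest of the model definition only refers to `backend`, moving a script from the pure Julia backend to the native extension or the GPU is a matter of changing these few lines, not the network code.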
Mocha also provides tools to import trained model snapshots from Caffe.
- *Correctness*: the computational components of Mocha in all backends are extensively covered by unit tests.
- *Open Source*: Mocha is licensed under the MIT "Expat" License <https://github.com/pluskid/Mocha.jl/blob/master/LICENSE.md>.

And here is the changelog for v0.0.4:

v0.0.4 2014.12.09
- Network
  - Parameter (l2-norm) constraints (@stokasto)
  - Random shuffling for HDF5 data layer
  - ConcatLayer
- Infrastructure
  - Momentum policy (@stokasto)
  - Save training statistics to file and plot tools (@stokasto)
  - Coffee breaks now have a coffee lounge
  - Auto-detect whether the CUDA kernel needs an update
  - Stochastic Nesterov Accelerated Gradient solver
  - Solver refactoring:
    - Behaviors for coffee breaks are simplified
    - Solver state variables like iteration now have clearer semantics
  - Support loading external pre-trained models for fine-tuning
  - Support explicit weight-sharing layers
  - Behaviors of layers taking multiple inputs made clear and unit-tested
  - Refactoring:
    - Removed the confusing System type
    - CuDNNBackend renamed to GPUBackend
    - Cleaned up cuBLAS API (@stokasto)
    - Layers are now organized by characterization properties
- Robustness
  - Various explicit topology verifications for Net, with unit tests
  - Increased unit test coverage for rare cases
  - Updated dependency to HDF5.jl 0.4.7
- Documentation
  - A new MNIST example using fully connected and dropout layers (@stokasto)
  - Reproducible MNIST results with a fixed random seed (@stokasto)
  - Tweaked IJulia Notebook image classification example
  - Documentation for solvers and coffee breaks

Best,
pluskid
