On Tuesday, 6 January 2015 04:27:55 UTC+8, Christian Weilbach wrote:
>
> On 05.01.2015 03:34, Mike Anderson wrote:
> > Very cool stuff!
>
> Like yours! I wish nurokit were EPLed, then I could have had a look at
> it and tried to include it there. Do libraries like this have high
> commercial value? I thought the knowledge to apply them and tune them
> to the problem is still more expensive, which is why I picked the EPL.
> Also, the GPL and EPL don't seem to be compatible according to my
> research (which is a pity, because I like the GPL).
I think there isn't much commercial value in the library itself - there
are many free libraries for machine learning that work just fine.
Nobody with enough skill to use the library is going to pay you for
something they can get for free. The commercial value is all around it:

- Building a solution that solves a business problem *using* the library
- Integrating with other applications/services (Clojure shines here
  because of the JVM ecosystem)
- Professional services / consulting

> > I notice that you are specialising the RBM to a specific matrix
> > implementation (Clatrix / JBlas) in the file "jblas.clj". Are you
> > sure you need to do that? Part of the beauty of core.matrix is that
> > you should be able to write your algorithms in an
> > implementation-independent manner and still get the performance
> > benefits of the optimised implementation when you need it.
>
> I started with core.matrix operations and Clatrix and then tried to
> eliminate all the overhead showing up in the VisualVM sampling
> profiler. In my experiments the protocol overhead in the inner loop of
> `cond-prob-batch` was something like 10% or so, but I am not sure
> whether I did something wrong. In the meantime I have benchmarked my
> cryptographic hash function, which also uses protocols, and sometimes
> I have seen protocol overhead and sometimes not; maybe it was related
> to tiered compilation and the JIT sometimes not optimising it, but
> this is only guessing.

10% protocol overhead sounds like you must be doing quite a lot of
protocol calls. The usual trick to minimise this is to ensure that a
single protocol call does a lot of work, i.e. works on whole arrays at
a time rather than individual elements (rough sketch at the end of this
mail). If you do that, then the protocol overhead should be negligible.

> If you replace all the jBlas method calls with core.matrix fns in
> `cond-prob-batch` (3), which is quick to do, do you see a performance
> difference?
>
> I really like core.matrix, or in general sound, light protocols with
> separate implementations. Yesterday I found an improved fork of
> clj-hdf5 (1), for instance, which implements some of the core.matrix
> protocols, and fixed it to read double matrices for me; potentially
> this even allows reading tensors bigger than memory in parts. So I
> didn't want to inline jBlas, but really use core.matrix. The internal
> inlining seemed like a reasonable compromise, since it still allows
> Clatrix to be used alongside the jBlas implementation (otherwise it
> would just be a mini-batch implementation).
>
> For deep learning the most interesting thing would be GPU support in
> core.matrix for the typical BLAS routines, e.g. with jCuBLAS or
> clBLAS, but I just couldn't start work on this yet. You then have to
> be very careful about which memory you access, but if this could work
> through the core.matrix protocols it would be a major win.

It should certainly be possible to wrap GPU matrix support in a
core.matrix implementation - indeed I think there have been a couple of
"proof of concept" attempts already. I personally have in the back of
my mind a GPU-accelerated extension to Vectorz (i.e. GPU-backed
subclasses of AMatrix and AVector), using something like jCuBLAS. Then
the full core.matrix support would come for free via vectorz-clj. That
would possibly be the easiest way to get comprehensive GPU array
programming support in Clojure.
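In outline, a GPU-backed implementation would just be a type that
participates in the core.matrix protocols. A minimal, untested sketch -
`CuMatrix` and `gpu-mmul` here are hypothetical stand-ins for a real
JCublas wrapper, not an existing API:

(require '[clojure.core.matrix.protocols :as mp])

;; hypothetical handle to a matrix living in device memory
(defrecord CuMatrix [ptr nrows ncols])

;; assumed to call a JCublas GEMM on two device matrices (not shown)
(declare gpu-mmul)

(extend-type CuMatrix
  mp/PDimensionInfo
  (dimensionality [m] 2)
  (get-shape [m] [(:nrows m) (:ncols m)])
  (is-scalar? [m] false)
  (is-vector? [m] false)
  (dimension-count [m dim] (if (zero? (long dim)) (:nrows m) (:ncols m)))

  mp/PMatrixMultiply
  (matrix-multiply [m a] (gpu-mmul m a))
  (element-multiply [m a]
    (throw (UnsupportedOperationException. "not implemented yet"))))

A real implementation would also need the mandatory protocols
(PImplementation, PIndexedAccess etc.) plus a story for moving data on
and off the device, but that's the overall shape of it.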
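And to make the earlier point about protocol call granularity concrete,
this is the contrast I mean (untested sketch, function names just for
illustration):

(require '[clojure.core.matrix :as m])

;; fine-grained: one protocol dispatch per element via mget, so the
;; dispatch cost is paid rows*cols times and shows up in the profiler
(defn scale-slow [a k]
  (m/compute-matrix (m/shape a)
                    (fn [i j] (* k (m/mget a i j)))))

;; coarse-grained: a single protocol call, and the implementation
;; (Clatrix, Vectorz etc.) runs its own optimised loop internally
(defn scale-fast [a k]
  (m/scale a k))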
> boltzmann's CPU version is for me 1/3rd to 1/4th of the training
> speed of Theano's CPU version (which again is 1/5th the speed of its
> GPU version on my older gaming laptop). Theano uses a symbolic
> compute graph modelled after Python's numpy API and then emits that
> to either CPU or GPU (including some numeric optimisations). I guess
> my jBlas backend is still slower than theirs... netlib-java (2)
> recommends building a custom version of ATLAS (for Ubuntu here); do
> you have experience with this? I probably should do this for Clatrix
> (and also for numpy).

Not really - I generally do pure-JVM stuff (vectorz-clj etc.). I would
be interested to see how vectorz-clj stacks up against Clatrix / BLAS
if you get an opportunity to benchmark this - see the criterium sketch
at the end of this mail. Matrix multiplication is probably worse, since
BLAS shines there, but I believe most other operations are much faster
with vectorz. vectorz-clj has definitely had far more optimisation work
than Clatrix.

> > For example, the core.matrix protocols (mmul, add!, add,
> > inner-product, transpose etc.) should all call the right Clatrix
> > implementation without any noticeable loss of performance (if they
> > don't, that's an implementation issue in Clatrix... it would be
> > good to unearth these!).
>
> Indeed! I also missed outer-product, which I have implemented for
> jBlas, as it was at some point taking most of the time, seemingly
> falling back on a default core.matrix implementation that includes
> conversion to default types.

outer-product is tricky because the general case produces
higher-dimensional arrays - which JBlas sadly doesn't support.
outer-product is another operation that I think is much better in
vectorz-clj. (For the mini-batch case there is also a way to stay
entirely in 2D - see the second sketch below.)

> > If the core.matrix API is insufficient to implement what you need,
> > then I'd love to get issues / PRs (either for core.matrix or
> > Clatrix).
>
> OK. Maybe you can verify that you don't see a significant performance
> difference between the Clatrix and the jBlas versions of
> cond-prob-batch, so that I can remove the inlining; the rest should
> then be able to be patched into Clatrix.
>
> Christian
>
> (1) https://github.com/ghubber/clj-hdf5/tree/develop
> (2) https://github.com/fommil/netlib-java/
> (3) https://github.com/ghubber/boltzmann/blob/master/src/boltzmann/jblas.clj#L33
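P.S. For the benchmark mentioned above, something along these lines
with criterium would be a quick start (untested sketch; assumes both
clatrix and vectorz-clj are on the classpath):

(require '[clojure.core.matrix :as m]
         '[criterium.core :as c])

;; the same random 512x512 matrix in both implementations
(def data (vec (repeatedly 512 #(vec (repeatedly 512 rand)))))
(def mv (m/array :vectorz data))
(def mc (m/array :clatrix data))

;; matrix multiply - BLAS-backed Clatrix should win here
(c/quick-bench (m/mmul mc mc))
(c/quick-bench (m/mmul mv mv))

;; element-wise addition - pure-JVM Vectorz is often faster here
(c/quick-bench (m/add mc mc))
(c/quick-bench (m/add mv mv))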
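P.P.S. On outer-product in the mini-batch case: summing the per-sample
outer products is the same as a single (2D, BLAS-friendly) matrix
multiply, so the higher-dimensional fallback can be avoided entirely.
A small illustration with plain nested vectors:

(require '[clojure.core.matrix :as m])

;; rows of A = batch of visible activations,
;; rows of B = batch of hidden activations
(def A [[1 2] [3 4]])
(def B [[5 6] [7 8]])

;; sum_k outer(a_k, b_k) is exactly (transpose A) mmul B
(m/equals (reduce m/add (map m/outer-product (m/slices A) (m/slices B)))
          (m/mmul (m/transpose A) B))
;; => true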