Hey,
On 06.01.2015 05:04, Mike Anderson wrote:
> On Tuesday, 6 January 2015 04:27:55 UTC+8, Christian Weilbach wrote:
>> On 05.01.2015 03:34, Mike Anderson wrote:
>>>> Very cool stuff!
>
> Like yours! I wish nurokit were EPLed, then I could have had a look
> at it and tried to include it there. Do libraries like this have
> high commercial value? I thought the knowledge to apply them and
> tune them to the problem is still more expensive, which is why I
> picked the EPL. Also, GPL and EPL don't seem to be compatible
> according to my research (which is a pity, because I like the GPL).
>
>> I think there isn't much commercial value in the library itself -
>> there are many free libraries for machine learning that work just
>> fine. Nobody with enough skill to use the library is going to pay
>> you for something they can get for free.
>
>> The commercial value is all around:
>> - Building a solution that solves a business problem *using* the
>>   library
>> - Integrating with other applications/services (Clojure shines
>>   here because of the JVM ecosystem)
>> - Professional services / consulting

OK, thanks. I'd like to move in this direction.

>>>> I notice that you are specialising the RBM to a specific
>>>> matrix implementation (Clatrix / JBlas) in the file
>>>> "jblas.clj". Are you sure you need to do that? Part of the
>>>> beauty of core.matrix is that you should be able to write
>>>> your algorithms in an implementation-independent manner and
>>>> still get the performance benefits of the optimised
>>>> implementation when you need it.
>
> I started with core.matrix operations and Clatrix and then tried to
> eliminate all the overhead showing up in the VisualVM sampling
> profiler. In my experiments the protocol overhead in the inner loop
> of `cond-prob-batch` was something like 10%, but I am not sure
> whether I did something wrong. In the meantime I have benchmarked
> my cryptographic hash function, which also uses protocols, and I
> have sometimes seen protocol overhead and sometimes not; maybe it
> was related to tiered compilation and the JIT sometimes not
> optimizing it, but this is only guessing.
>
>> 10% protocol overhead sounds like you must be doing quite a lot
>> of protocol calls.
>
>> The usual trick to minimise this is to ensure that a single
>> protocol call does a lot of work (i.e. work on whole arrays at a
>> time rather than individual elements). If you do that, then the
>> protocol overhead should be negligible.

I only do a matrix multiplication and an element-wise calculation of
the sigmoid activation:

https://github.com/ghubber/boltzmann/blob/master/src/boltzmann/jblas.clj#L40

I haven't inlined anything without the profiler showing a significant
performance benefit, but I can recheck at some point.

> If you replace all the jBlas method calls with core.matrix fns in
> `cond-prob-batch` (3), which is quick to do, do you see a
> performance difference?
>
> I really like core.matrix, or in general sound, light protocols
> with separate implementations. Yesterday I found an improved fork
> of clj-hdf5, for instance, which implements some of the core.matrix
> protocols, and fixed it to read double matrices for me; potentially
> this even allows reading tensors bigger than memory partially. (1)
> So I didn't want to inline jBlas, but really use core.matrix. This
> internal inlining seemed to be a compromise, since it still allows
> using Clatrix when dealing with the jBlas implementation (otherwise
> it was just a mini-batch implementation).
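Coming back to the protocol-overhead question: the purely
core.matrix-based version of `cond-prob-batch` that I have in mind is
roughly the following. This is only a sketch, not the code in
jblas.clj; the names, argument shapes and the broadcasting of the bias
are my assumptions:

(ns boltzmann.sketch
  (:require [clojure.core.matrix :as m]))

;; Optionally select Clatrix as the backing implementation, e.g.
;; (m/set-current-implementation :clatrix)

(defn sigmoid
  "Element-wise logistic function. emap is a single protocol call per
  array, so protocol dispatch does not happen per element."
  [x]
  (m/emap #(/ 1.0 (+ 1.0 (Math/exp (- %)))) x))

(defn cond-prob-batch*
  "P(h | v-batch) for an RBM as one whole-array matrix multiplication
  plus an element-wise sigmoid. Assumes `weights` is [hidden x
  visible], `bias` a vector of length hidden, `v-batch` is
  [batch-size x visible], and that core.matrix broadcasts the bias
  across the rows."
  [weights bias v-batch]
  (sigmoid (m/add (m/mmul v-batch (m/transpose weights))
                  bias)))

If the jBlas-specialised version is still noticeably faster than
something like this, the difference should come from BLAS itself (or
from intermediate copies) rather than from protocol dispatch.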
> For deep learning the most interesting thing would be GPU support
> in core.matrix for the typical BLAS routines, e.g. with jCuBLAS or
> clBLAS, but I just couldn't start work on this yet. You then have
> to be very careful about which memory you access, but if this could
> work through core.matrix protocols it would be a major win.
>
>> It should certainly be possible to wrap GPU matrix support in a
>> core.matrix implementation, indeed I think there have been a
>> couple of "proof of concept" attempts already.
>
>> I personally have in the back of my mind a GPU-accelerated
>> extension to Vectorz (i.e. GPU-backed subclasses of AMatrix and
>> AVector), using something like jCuBLAS. Then the full core.matrix
>> support would come for free via vectorz-clj. Would possibly be
>> the easiest way to get comprehensive GPU array programming
>> support in Clojure.

Cool. Maybe we could also just wrap the NDArray library of
deeplearning4j; then we could wrap their API and use an
industry-grade deep-learning solution, as Shriphani Palakodety
suggested. While I still don't think it is nice to implement
machine-learning algorithms as giant Java frameworks - I'd prefer to
have them hackable in Clojure - it makes sense to start with some
state of the art. I also don't see enough momentum behind Clojure ML
libraries to make them compete with the Java ones at the moment.

> boltzmann's CPU version trains for me at 1/3 to 1/4 the speed of
> Theano (which in turn is about 1/5 of its GPU version on my older
> gaming laptop). Theano uses a symbolic compute graph modelled after
> Python's numpy API and then emits code for either the CPU or the
> GPU (including some numeric optimizations). I guess my jBlas
> backend is still slower than theirs... netlib-java (2) recommends
> building a custom version of ATLAS (for Ubuntu here); do you have
> experience with this? I probably should do this for Clatrix (and
> also for numpy).
>
>> Not really - I generally do pure-JVM stuff (vectorz-clj etc.).
>
>> Would be interested to see how vectorz-clj stacks up to Clatrix /
>> BLAS if you get an opportunity to benchmark this (matrix
>> multiplication is probably worse since BLAS shines here, but most
>> other operations I believe are much faster with vectorz).
>> vectorz-clj has definitely had far more optimisation work than
>> Clatrix.

>>>> For example, the core.matrix protocols (mmul, add!, add,
>>>> inner-product, transpose etc.) should all call the right
>>>> Clatrix implementation without any noticeable loss of
>>>> performance (if they don't that's an implementation issue in
>>>> Clatrix... would be good to unearth these!).
>
> Indeed! I also missed outer-product, which I have implemented for
> jBLAS, as it was at some point taking most of the time, seemingly
> falling back on a default core.matrix implementation, including
> conversion to default types.
>
>> outer-product is tricky because results require higher
>> dimensional arrays - which JBlas doesn't support sadly.
>> outer-product is another operation that I think is much better in
>> vectorz-clj.

OK, but if possible I wanted to use one library, without copying
matrices between vectorz-clj and Clatrix. I only need the
2-dimensional outer-product expansion.
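Concretely, the only thing I need is something like the sketch below,
where the 2-d outer product is phrased as an ordinary matrix
multiplication of an [m x 1] column with a [1 x n] row, so a
BLAS-backed implementation can handle it directly. Again just a
sketch with made-up names, not the code in jblas.clj:

(ns boltzmann.outer-sketch
  (:require [clojure.core.matrix :as m]))

(defn outer-product-2d
  "Returns the [m x n] matrix with entries (a_i * b_j) for the
  vectors a and b, using a single matrix multiplication instead of
  the generic n-dimensional outer-product."
  [a b]
  (m/mmul (m/column-matrix a)   ; shape [m 1]
          (m/row-matrix b)))    ; shape [1 n]

;; e.g. (outer-product-2d [1 2 3] [4 5])
;; => [[4.0 5.0] [8.0 10.0] [12.0 15.0]]  (exact result type depends
;;    on the active core.matrix implementation)

That would keep everything on the Clatrix/jBlas side without going
through the generic higher-dimensional code path.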
Christian