Re: Clojure with Tensorflow, Torch etc (call for participation, brainstorming etc)

Dragan Djuric Fri, 07 Oct 2016 01:18:06 -0700

Hi Kovas,


> One question:
>
> Is it possible to feed Neanderthal's matrix representation (the underlying 
> bytes) into one of these other libraries, to obtain 
> computations Neanderthal doesn't support? 
>

There are two parts to that question, I think: 1) How can you make 
Neanderthal work with any other library that you need to interoperate with? 
2) Does Neanderthal have the operation that you need, and what to do if it 
doesn't?
I'll start with 2:

2) Currently, Neanderthal supports all BLAS 1, 2 and 3 operations for 
vectors and dense matrices on CPU & GPU. The ultimate goal is to support 
all other standard matrix formats (TR, sparse, etc.) AND LAPACK, which has 
extensive support for linear algebra. The good news is that what is there 
works really, really well, because I concentrated my efforts on solving the 
hardest problem first (and succeeded!). Now it is a matter of putting in the 
grunt work: repeating what's done for the existing things to cover more of 
the CBLAS and LAPACK APIs, and even integrating with CUDA in a way similar 
to how I did the OpenCL integration. I could have even done it by now, but I 
preferred to work on other things, one of those being Bayadera, a Bayesian 
data analysis library that puts what Neanderthal offers to great use 
:) I have also seen that the Yieldbot people forked Neanderthal and 
implemented some part of LAPACK, but neither released anything nor issued a 
PR. So, if the methods you need fall into the scope of matrices and linear 
algebra (BLAS + LAPACK), there is a good chance it will be supported, 
either by you or some other user providing it, or bugging me often enough 
that I realize it is urgent that I add it :)
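
For readers unfamiliar with the BLAS levels mentioned above, here is a 
minimal plain-Python sketch of one representative operation per level. It is 
not Neanderthal's API. The function names follow the standard BLAS routine 
names (axpy, gemv, gemm), but the signatures are simplified assumptions made 
for illustration: real BLAS routines also take scaling factors, strides and 
layout flags.

```python
# Illustration only: plain-Python stand-ins for one routine per BLAS level.
# Signatures are simplified; this is not how any real BLAS binding is called.

def axpy(alpha, x, y):
    """Level 1 (vector-vector): return alpha*x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def gemv(a, x):
    """Level 2 (matrix-vector): return a*x, with a given as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def gemm(a, b):
    """Level 3 (matrix-matrix): return a*b, both given as lists of rows."""
    cols_of_b = list(zip(*b))
    return [[sum(aik * bkj for aik, bkj in zip(row, col)) for col in cols_of_b]
            for row in a]

a = [[1.0, 2.0], [3.0, 4.0]]
x = [1.0, 1.0]
level1 = axpy(2.0, x, [0.5, 0.5])   # [2.5, 2.5]
level2 = gemv(a, x)                 # [3.0, 7.0]
level3 = gemm(a, a)                 # [[7.0, 10.0], [15.0, 22.0]]
```

The levels differ in how much arithmetic is done per element read from 
memory, which is why level 3 routines benefit most from tuned 
implementations.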

1) There are at least two parts to the interoperation story. The first is the 
API for operations (like mm vs mmult or whatever), and that is the very easy 
part. The hard part is the multitude of matrix formats and those formats' 
representations in memory. This is what makes or breaks your performance, 
and not by a few percent but by a few orders of magnitude. The sad part is 
that almost all focus is always on the easy part, completely ignoring the 
hard part or just thinking that it will magically solve itself. So, suppose 
that you have data laid out in memory in format A. That format may or 
may not be suitable for operation X, and if it is not, it is often a bad 
idea to shoehorn it in for convenience, instead of thinking harder about 
the data flow and transitioning the data to format B, where it can be used 
appropriately. That means that even inside the same library, you often need 
to do data transformations to best suit what you want to do with the data. 
Long story short, you need to do data transformations anyway, so having 
Neanderthal and ND4J support the core.matrix mmult operation won't help you 
a bit here. You'll have to transform data from the one to the other. If you 
are lucky, they use the same underlying format, so the transformation is 
easy or even automatic (or can be made so), but the point is that someone 
needs to create explicit transformations to ensure the optimal path instead 
of relying on a generic interoperability layer (at least for now).
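
As a minimal illustration of why the in-memory format matters more than the 
API, the sketch below stores the same 2x3 matrix in two flat layouts, 
row-major and column-major. The helper names are hypothetical, and nothing 
here is meant to say which layout Neanderthal, ND4J or any other library 
actually uses. It only shows that handing a buffer to code that assumes a 
different layout silently produces wrong entries unless an explicit 
transformation is done first.

```python
# Illustration only: one logical matrix, two flat memory layouts.
# All function names are made up for this sketch.

def to_row_major(rows):
    """Flatten a list of rows into a row-major buffer."""
    return [v for row in rows for v in row]

def row_to_col_major(buf, m, n):
    """Explicitly convert an m x n row-major buffer to column-major."""
    return [buf[i * n + j] for j in range(n) for i in range(m)]

def get_col_major(buf, m, i, j):
    """Read entry (i, j) from a column-major buffer with m rows."""
    return buf[j * m + i]

matrix = [[1, 2, 3],
          [4, 5, 6]]
row_buf = to_row_major(matrix)              # [1, 2, 3, 4, 5, 6]
col_buf = row_to_col_major(row_buf, 2, 3)   # [1, 4, 2, 5, 3, 6]

# Reading the row-major buffer as if it were column-major gives a wrong entry.
wrong = get_col_major(row_buf, 2, 0, 1)     # 3, although matrix[0][1] is 2
right = get_col_major(col_buf, 2, 0, 1)     # 2
```

The conversion touches every element once, so it is cheap compared to 
running a heavy operation on an unsuitable layout, but it has to be written 
by someone who knows both formats.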
 

> My situation: Neanderthal covers some of the algorithms I'd like to 
> implement. It looks easier to get started with and understand than the 
> alternatives. But down the line I'll likely want to do convolution, 
> sigmoid, the typical Neural Network layers. Note, I don't care about 
> 'tensors' exactly; the bookkeeping to simulate a tensor can be automated, 
> but the fundamental operations cannot. So there is a question of whether to 
> just absorb the learning curve and shortcomings of these more general 
> libraries rather than to risk switching horses midstream. 
>

I think it is important to note that the operations you mentioned are not 
in the scope of a matrix library, but in the scope of a neural networks 
library. Neanderthal is simply not a library that should have those 
operations, nor can it have all operations for all ML techniques (which are 
countless :)
 
On the other hand, what I created Neanderthal for is exactly to be a building 
block for such libraries. The focus is on: if you need to build a NN 
library, Neanderthal should (ideally) give you standard matrix methods 
for computations and data shuffling, Clojure should enable you to create a 
great interface layer, and (if needed) ClojureCL should help you write 
custom optimized low-level algorithms for GPU and CPU. 

> I imagine I'm not alone in this.. if there was a straightforward story for 
> how to interop Neanderthal when necessary with some more general library 
> that would be powerful. Unfortunately I'm not sufficiently expert to 
> evaluate which of the choices would be most pragmatic and how to pull it 
> off. 
>

Today's state (in Clojure, Java, and probably elsewhere) is, IMO: if you 
need a ready-made solution for NNs/DL, you have to pick one library that 
has the stuff that you need, and go with what they recommend completely. 
For example, let's say that you need to use Deeplearning4j. It requires 
ND4J matrices. Let's also say that someone creates a decent core.matrix 
wrapper for ND4J, and you use it. Now, 95% of your code will be tied to 
DL4J, and you won't be able to just swap out ND4J and use Vectorz instead, 
unless you create an ND4J<->Vectorz transformation to help you on the API 
side. Even then, ND4J does not care about Vectorz, so it will require you 
to give it ND4J data. It may or may not be made to work, but it will not 
magically work well.

On the other hand, if you need to implement your own algorithms and help 
build a Clojure-based high-performance computing ecosystem, Neanderthal 
is exactly what I'd have in mind. 
