One of the tricky things to figure out is how to separate
statistics from machine learning, as they overlap heavily
(completely?) but with different terminology and goals. I
think it's really important that JuliaStats and
JuliaML/JuliaLearn play nicely together, and this probably
means that any ML interface should use StatsBase verbs whenever
possible. There has been a little tension (from my
perspective) and a slight turf war when it comes to
statistical processes and terminology... is it possible to
avoid that?
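To make that concrete, here's a minimal sketch of what "using StatsBase verbs" could look like. RidgeModel is entirely made up for illustration; the point is just that the model extends the StatsBase generics `fit` and `predict` rather than inventing its own training vocabulary:

```julia
# Hypothetical sketch: an ML model that extends the StatsBase generics
# `fit` and `predict` instead of defining new verbs like `train!`.
using StatsBase, LinearAlgebra

struct RidgeModel                    # made-up type, not an existing package
    coefs::Vector{Float64}
end

function StatsBase.fit(::Type{RidgeModel}, X::Matrix{Float64},
                       y::Vector{Float64}; lambda::Real = 1.0)
    # closed-form ridge solution: (X'X + lambda*I) \ X'y
    RidgeModel((X' * X + lambda * I) \ (X' * y))
end

StatsBase.predict(m::RidgeModel, X::Matrix{Float64}) = X * m.coefs
```

Then `fit(RidgeModel, X, y)` and `predict(m, Xnew)` read exactly the same as they would for any JuliaStats model.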
On Wed, Nov 11, 2015 at 9:49 AM, Stefan Karpinski <[email protected]> wrote:
This is definitely already in progress, but we've got a ways
to go before it's as easy as scikit-learn. I suspect
that having an organization will be more effective at
coordinating the various efforts than people might expect.
On Wed, Nov 11, 2015 at 9:46 AM, Tom Breloff <[email protected]> wrote:
Randy, see LearnBase.jl, MachineLearning.jl,
Learn.jl (just a README for now), Orchestra.jl, and
many others. Many people have the same goal, and
wrapping TensorFlow isn't going to change the need
for a high-level interface. I do agree that a good
high-level interface is higher on the priority list,
though.
On Wed, Nov 11, 2015 at 9:29 AM, Randy Zwitch <[email protected]> wrote:
Sure. I'm not against anyone doing anything,
just that it seems like Julia suffers from an
"expert/edge case" problem right now. For me,
it'd be awesome if there were just a scikit-learn
(Python) or caret (R) style mega-interface that
ties together the packages that have already been
written. From my cursory reading, TensorFlow seems
more like a low-level toolkit for
expressing/solving equations, whereas what I see
Julia lacking is an easy way to evaluate 3-5
different algorithms on the same dataset quickly.
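Something like this toy sketch is what I mean. Everything here is hypothetical (the model types and the shared `fit`/`predict` verbs are made up, not any existing package); the point is that once every model answers to the same two verbs, comparing several on one dataset is just a loop:

```julia
# Hypothetical sketch of a uniform fit/predict interface.
using Statistics

abstract type Model end

struct MeanModel <: Model            # toy baseline: predict the training mean
    mu::Float64
end
struct LinearModel <: Model          # toy model: ordinary least squares
    coefs::Vector{Float64}
end

fit(::Type{MeanModel}, X, y)   = MeanModel(mean(y))
fit(::Type{LinearModel}, X, y) = LinearModel(X \ y)   # least-squares solve

predict(m::MeanModel, X)   = fill(m.mu, size(X, 1))
predict(m::LinearModel, X) = X * m.coefs

rmse(yhat, y) = sqrt(mean((yhat .- y) .^ 2))

X, y = randn(200, 3), randn(200)
for M in (MeanModel, LinearModel)
    m = fit(M, X, y)
    println(M, " => rmse ", rmse(predict(m, X), y))
end
```

Train/test splitting, cross-validation, and so on would all layer on top of those same two verbs.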
A tweet I just saw sums it up pretty succinctly:
"TensorFlow already has more stars than
scikit-learn, and probably more stars than
people actually doing deep learning"
On Tuesday, November 10, 2015 at 11:28:32 PM
UTC-5, Alireza Nejati wrote:
Randy: To answer your question, I'd reckon
that the two major gaps in Julia that
TensorFlow could fill are:
1. Lack of automatic differentiation on
arbitrary graph structures.
2. Lack of the ability to map computations
across CPUs and clusters.
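For context on (1): differentiating an ordinary Julia function is already covered; it's doing it over an arbitrary, separately-built graph structure that's missing. A minimal taste of what exists, assuming ForwardDiff.jl's `gradient(f, x)` entry point from the JuliaDiff packages:

```julia
# Forward-mode AD of a plain Julia function via ForwardDiff.jl.
using ForwardDiff

f(x) = sum(x .^ 2) + prod(x)                 # any scalar-valued function

g = ForwardDiff.gradient(f, [1.0, 2.0, 3.0])
println(g)                                   # exact gradient: [8.0, 7.0, 8.0]
```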
Funnily enough, I was thinking about (1) for
the past few weeks, and I think I have an
idea about how to accomplish it using
existing JuliaDiff libraries. As for (2), I
have no idea, and that's probably going to
be the most important aspect of TensorFlow
moving forward (and also probably the
hardest to implement). So for the time
being, I think it's definitely worthwhile to
just have an interface to TensorFlow. There
are a few ways this could be done; some that
I can think of:
1. Just tell people to use PyCall directly.
Not an elegant solution.
2. A more Julia-integrated interface, à la
SymPy.
3. Using TensorFlow as the 'backend' of a
novel Julia-based machine learning library.
In this scenario, everything would be in
Julia, and TensorFlow would only be used to
map computations to hardware.
I think 3 is the most attractive option, but
also probably the hardest to do.
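For a sense of what option 1 looks like in practice, here's a minimal sketch driving TensorFlow's Python API through PyCall. It assumes PyCall can find a Python environment with the tensorflow package installed, and the calls are TensorFlow 1.x graph-style API (current as of this thread):

```julia
# Option 1 in miniature: calling TensorFlow's Python API through PyCall.
using PyCall

tf = pyimport("tensorflow")

a = tf.constant(3.0)          # graph nodes, built on the Python side
b = tf.constant(4.0)
c = tf.add(a, b)

sess = tf.Session()           # TF 1.x: evaluation happens in a session
println(sess.run(c))          # 7.0
```

It works, but nothing about it feels like Julia, which is exactly the gap options 2 and 3 try to close.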