[julia-users] Re: Google releases TensorFlow as open source
Question: Why do you need Theano? Aside from the benefits of symbolic graph optimization, what does Theano provide that Julia doesn't? With Julia you can write normal imperative code that is easier to read/write than Theano, and then do autodiff on that.

On Tuesday, November 24, 2015 at 2:43:46 PM UTC-5, Marcin Elantkowski wrote:
> Hi,
> I'm just learning Julia, but so far it looks amazing. What I'm missing the most, though, is *Theano*.
> Theano provides an extremely flexible symbolic API, much more mature than *mxnet* (e.g. *for* loops, *IfElse*). TensorFlow seems to replicate that, but it remains to be seen how it compares to other frameworks.
> Since there is not much info on the web, I'd like to ask you:
> - How hard do you think it would be to port Theano to Julia?
> - Or maybe you know of anyone already working on that?
> On Monday, November 9, 2015 at 22:02:36 UTC+1, Phil Tomson wrote:
>> Google has released its deep learning library called TensorFlow as open source code:
>> https://github.com/tensorflow/tensorflow
>> They include Python bindings. Any ideas about how easy/difficult it would be to create Julia bindings?
>> Phil
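For concreteness, "write imperative code, then autodiff it" looks roughly like this with ForwardDiff.jl (one of the JuliaDiff packages). A minimal sketch; the loss function here is invented for the example:

    using ForwardDiff   # forward-mode AD over plain Julia code

    # An ordinary imperative function: loops, no symbolic graph.
    function loss(w)
        s = zero(eltype(w))      # generic accumulator so dual numbers flow through
        for i in eachindex(w)
            s += (w[i] - 1)^2    # toy objective: squared distance from all-ones
        end
        return s
    end

    ForwardDiff.gradient(loss, [0.5, 2.0, 3.0])   # -> [-1.0, 2.0, 4.0]

There is no separate graph-construction step: the derivative is computed straight off the code you would have written anyway.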
[julia-users] Re: Google releases TensorFlow as open source
Hi Viral,

I want to be a part of JuliaML.

~ Ravish

On Wednesday, November 11, 2015 at 4:48:07 PM UTC+5:30, Viral Shah wrote:
> I think TensorFlow.jl is a great idea. Their distributed computation framework is also the kind of thing we want to have in Julia.
> I have created JuliaML. Send me email if you want to be part of it, and I will make you an owner. Perhaps we can even move some of the JuliaStats ML projects to JuliaML.
> -viral
Re: [julia-users] Re: Google releases TensorFlow as open source
JuliaML is a collection of repos, not people. If you create a package that ends up in JuliaML or make significant contributions to one of them, then the owner of some JuliaML package may give you commit access to that package.

On Mon, Nov 16, 2015 at 4:24 AM, Ravish Mishra wrote:
> Hi Viral,
> I want to be a part of JuliaML.
> ~ Ravish
Re: [julia-users] Re: Google releases TensorFlow as open source
Thanks, Stefan. Just starting on Julia. Hope to start contributing soon.

~ Ravish

On Mon, Nov 16, 2015 at 11:16 PM, Stefan Karpinski wrote:
> JuliaML is a collection of repos, not people. If you create a package that ends up in JuliaML or make significant contributions to one of them, then the owner of some JuliaML package may give you commit access to that package.
Re: [julia-users] Re: Google releases TensorFlow as open source
On Monday, November 16, 2015 at 11:46:14 AM UTC-8, George Coles wrote:
> Does MXNet provide features that are analogous with Theano? I would rather do machine learning in one language than a mix of Python + C + a DSL like Theano.

MXNet.jl is a wrapper around libmxnet, so there is C++ in the background. MXNet.jl would be analogous to Theano in some ways. It also seems similar to TensorFlow.

> It is always cool to be able to quickly wrap native libraries, but Julia would really gain momentum if it could obviate Theano et al. (as cool as the Python ecosystem is, it is all quite ungainly).

You should give some of the MXNet.jl examples a try.

Phil
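For readers who want a taste before running the examples, the network definition below is roughly the multilayer-perceptron example from the MXNet.jl README (reproduced from memory, so treat the exact keyword names as approximate rather than authoritative):

    using MXNet

    # Symbolic network definition: each => feeds one layer into the next.
    mlp = @mx.chain mx.Variable(:data)                 =>
          mx.FullyConnected(name=:fc1, num_hidden=128) =>
          mx.Activation(name=:relu1, act_type=:relu)   =>
          mx.FullyConnected(name=:fc2, num_hidden=64)  =>
          mx.Activation(name=:relu2, act_type=:relu)   =>
          mx.FullyConnected(name=:fc3, num_hidden=10)  =>
          mx.SoftmaxOutput(name=:softmax)

    # Bind the symbolic graph to hardware and train (data providers elided):
    model = mx.FeedForward(mlp, context=mx.cpu())
    # mx.fit(model, mx.SGD(lr=0.1, momentum=0.9), train_provider, n_epoch=20)

The graph is declared symbolically, as in Theano or TensorFlow, but the declaration itself is ordinary Julia.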
Re: [julia-users] Re: Google releases TensorFlow as open source
Does MXNet provide features that are analogous with Theano? I would rather do machine learning in one language than a mix of Python + C + a DSL like Theano. It is always cool to be able to quickly wrap native libraries, but Julia would really gain momentum if it could obviate Theano et al. (as cool as the Python ecosystem is, it is all quite ungainly).
[julia-users] Re: Google releases TensorFlow as open source
On Thursday, November 12, 2015 at 06:36:28 UTC+1, Alireza Nejati wrote:
> Anyway, the problem I'm facing right now is that even though TensorFlow's Python interface works fine, I can't get TensorFlow's C library to build! Has anyone else had any luck with this?

I managed to build from source on an Ubuntu machine at work, following the instructions at https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/get_started/os_setup.md. The most difficult part was getting the Python package installation dependencies in working order, which was not covered by the instructions. (The low mark was "pip install --no-use-wheel --upgrade distribute", which a Google search suggested for getting past the final hurdle, and which actually did work. Please don't emulate this in Julia.) Whether it actually built anything useful beyond what the Python package needed, I have no idea. There's a forest of Bazel-generated directories that I'm not particularly tempted to navigate unless there's something specific I should look into.
[julia-users] Re: Google releases TensorFlow as open source
Good to know that.

On Wednesday, November 11, 2015 at 12:18:07 PM UTC+1, Viral Shah wrote:
> I think TensorFlow.jl is a great idea. Their distributed computation framework is also the kind of thing we want to have in Julia.
> I have created JuliaML. Send me email if you want to be part of it, and I will make you an owner. Perhaps we can even move some of the JuliaStats ML projects to JuliaML.
> -viral
[julia-users] Re: Google releases TensorFlow as open source
I think TensorFlow.jl is a great idea. Their distributed computation framework is also the kind of thing we want to have in Julia.

I have created JuliaML. Send me email if you want to be part of it, and I will make you an owner. Perhaps we can even move some of the JuliaStats ML projects to JuliaML.

-viral

On Wednesday, November 11, 2015 at 11:27:21 AM UTC+5:30, Valentin Churavy wrote:
> It fits in the same niche that Mocha.jl and MXNet.jl are filling right now. MXNet is an ML library that shares many of the same design ideas as TensorFlow and has great Julia support: https://github.com/dmlc/MXNet.jl
> On Wednesday, 11 November 2015 01:04:00 UTC+9, Randy Zwitch wrote:
>> For me, the bigger question is how does TensorFlow fit in/fill in gaps in currently available Julia libraries? I'm not saying that someone who is sufficiently interested shouldn't wrap the library, but it'd be great to identify what major gaps remain in ML for Julia and figure out if TensorFlow is the right way to proceed.
>> We're certainly nowhere near the R duplication problem yet, but certainly we're already repeating ourselves in many areas.
>> On Monday, November 9, 2015 at 4:02:36 PM UTC-5, Phil Tomson wrote:
>>> Google has released its deep learning library called TensorFlow as open source code:
>>> https://github.com/tensorflow/tensorflow
>>> They include Python bindings. Any ideas about how easy/difficult it would be to create Julia bindings?
>>> Phil
[julia-users] Re: Google releases TensorFlow as open source
Sure. I'm not against anyone doing anything, just that it seems like Julia suffers from an "expert/edge case" problem right now. For me, it'd be awesome if there was just a scikit-learn (Python) or caret (R) type mega-interface that ties together the packages that have already been written. From my cursory reading, it seems like TensorFlow is more like a low-level toolkit for expressing/solving equations, whereas what I see Julia lacking is an easy method to evaluate 3-5 different algorithms on the same dataset quickly.

A tweet I just saw sums it up pretty succinctly: "TensorFlow already has more stars than scikit-learn, and probably more stars than people actually doing deep learning"

On Tuesday, November 10, 2015 at 11:28:32 PM UTC-5, Alireza Nejati wrote:
> Randy: To answer your question, I'd reckon that the two major gaps in Julia that TensorFlow could fill are:
> 1. Lack of automatic differentiation on arbitrary graph structures.
> 2. Lack of ability to map computations across CPUs and clusters.
> Funny enough, I was thinking about (1) for the past few weeks and I think I have an idea about how to accomplish it using existing JuliaDiff libraries. About (2), I have no idea, and that's probably going to be the most important aspect of TensorFlow moving forward (and also probably the hardest to implement). So for the time being, I think it's definitely worthwhile to just have an interface to TensorFlow. There are a few ways this could be done. Some ways that I can think of:
> 1. Just tell people to use PyCall directly. Not an elegant solution.
> 2. A more Julia-integrated interface *a la* SymPy.
> 3. Using TensorFlow as the 'backend' of a novel Julia-based machine learning library. In this scenario, everything would be in Julia, and TensorFlow would only be used to map computations to hardware.
> I think 3 is the most attractive option, but also probably the hardest to do.
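To make the "mega-interface" idea concrete, here is a hypothetical sketch of what a caret/scikit-learn-style layer could look like in Julia; every type and function name is invented for illustration and comes from no existing package:

    using Statistics   # for mean

    abstract type Learner end

    struct MeanLearner   <: Learner end     # baseline: always predict the mean
    struct LinregLearner <: Learner end     # ordinary least squares

    struct MeanModel;   mu::Float64;           end
    struct LinregModel; beta::Vector{Float64}; end

    # One shared pair of verbs, StatsBase-style:
    fit(::MeanLearner,   X, y) = MeanModel(mean(y))
    fit(::LinregLearner, X, y) = LinregModel(X \ y)

    predict(m::MeanModel,   X) = fill(m.mu, size(X, 1))
    predict(m::LinregModel, X) = X * m.beta

    # Evaluating several algorithms on one dataset becomes a loop:
    X = randn(100, 3); y = X * [1.0, 2.0, 3.0] + 0.1 * randn(100)
    for learner in (MeanLearner(), LinregLearner())
        m = fit(learner, X, y)
        println(typeof(m), "  MSE = ", mean(abs2, predict(m, X) - y))
    end

The point is only the shared verbs; filling in real learners behind them is the hard part this thread is discussing.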
Re: [julia-users] Re: Google releases TensorFlow as open source
This is definitely already in progress, but we've a ways to go before it's as easy as scikit-learn. I suspect that having an organization will be more effective at coordinating the various efforts than people might expect.

On Wed, Nov 11, 2015 at 9:46 AM, Tom Breloff wrote:
> Randy, see LearnBase.jl, MachineLearning.jl, Learn.jl (just a readme for now), Orchestra.jl, and many others. Many people have the same goal, and wrapping TensorFlow isn't going to change the need for a high-level interface. I do agree that a good high-level interface is higher on the priority list, though.
Re: [julia-users] Re: Google releases TensorFlow as open source
I have the same philosophy: "An end user should never have to type a Unicode character."

On 2015-11-11 17:11, Cedric St-Jean wrote:
> scikit-learn uses greek letters in its implementation, which I'm fine with since domain experts work on those, but I wish that in the visible interface they had consistently used more descriptive names (e.g. regularization_strength instead of alpha).
Re: [julia-users] Re: Google releases TensorFlow as open source
+1 to consistent interfaces for machine learning algorithms.

On Wed, Nov 11, 2015 at 9:29 AM, Randy Zwitch wrote:
> Sure. I'm not against anyone doing anything, just that it seems like Julia suffers from an "expert/edge case" problem right now. For me, it'd be awesome if there was just a scikit-learn (Python) or caret (R) type mega-interface that ties together the packages that have already been written.
Re: [julia-users] Re: Google releases TensorFlow as open source
I agree. I personally think the ML efforts should follow the StatsBase and Optim conventions where it makes sense.

The notational differences are inconvenient, but they are manageable. I think readability should be the goal there. For example, if you implement some algorithm, one should use the notation from the referenced paper. A package tailored towards use in a statistical context, such as GLMs, should probably follow the convention used in statistics (e.g. beta for the coefficients). A package for SVMs should follow the conventions for SVMs (e.g. w for the coefficients), and so forth. It's nice to streamline things, but let's not get carried away with this kind of micromanagement.

On 2015-11-11 16:01, Tom Breloff wrote:
> One of the tricky things to figure out is how to separate statistics from machine learning, as they overlap heavily (completely?) but with different terminology and goals. I think it's really important that JuliaStats and JuliaML/JuliaLearn play nicely together, and this probably means that any ML interface uses StatsBase verbs whenever possible. There has been a little tension (from my perspective) and a slight turf war when it comes to statistical processes and terminology... is it possible to avoid?
Re: [julia-users] Re: Google releases TensorFlow as open source
Randy, see LearnBase.jl, MachineLearning.jl, Learn.jl (just a readme for now), Orchestra.jl, and many others. Many people have the same goal, and wrapping TensorFlow isn't going to change the need for a high-level interface. I do agree that a good high-level interface is higher on the priority list, though.

On Wed, Nov 11, 2015 at 9:29 AM, Randy Zwitch wrote:
> Sure. I'm not against anyone doing anything, just that it seems like Julia suffers from an "expert/edge case" problem right now.
Re: [julia-users] Re: Google releases TensorFlow as open source
One of the tricky things to figure out is how to separate statistics from machine learning, as they overlap heavily (completely?) but with different terminology and goals. I think it's really important that JuliaStats and JuliaML/JuliaLearn play nicely together, and this probably means that any ML interface uses StatsBase verbs whenever possible. There has been a little tension (from my perspective) and a slight turf war when it comes to statistical processes and terminology... is it possible to avoid?

On Wed, Nov 11, 2015 at 9:49 AM, Stefan Karpinski wrote:
> This is definitely already in progress, but we've a ways to go before it's as easy as scikit-learn. I suspect that having an organization will be more effective at coordinating the various efforts than people might expect.
Re: [julia-users] Re: Google releases TensorFlow as open source
> if you implement some algorithm one should use the notation from the referenced paper

This can be easier to implement (essentially just copy from the paper) but will make for a mess and a maintenance nightmare. I don't want to have to read a paper just to understand what someone's code is doing. Not to mention there are lots of "unique findings" and algorithms in papers that have actually already been found/implemented, but with different terminology in a different field. My research has taken me down lots of rabbit holes, and I'm always amazed at how very different fields/applications all have the same underlying math. We should do everything we can to unify the algorithms in the most Julian way. It's not always easy, but it should at least be the goal.

This is most important with terminology and using greek letters. I don't want one algorithm to represent a learning rate with eta and another to use alpha. It may match the paper, but it makes for mass confusion when you're not using the paper as a reference. (The obvious solution is to never use greek letters, of course.)

On Wed, Nov 11, 2015 at 10:34 AM, Christof Stocker wrote:
> I agree. I personally think the ML efforts should follow the StatsBase and Optim conventions where it makes sense.
Re: [julia-users] Re: Google releases TensorFlow as open source
I'm afraid it is not as easy as simply wrapping "existing" functionality, unless one is OK with a lot of wrapper packages for C backends. I do realize that a lot of people might be OK with this, but to some (me included) that would defeat the purpose of using Julia in the first place. I really love Julia, but I am not going to use Julia for the sake of using Julia. I do agree, though, that it might be a good first step to wrap the C backends. The thing is, one has to find someone interested in implementing that.

There are a few people working towards the goal of a scikit-learn/caret-like interface nonetheless, but some basic things have to be implemented in Julia first (such as a detailed treatment of SVMs). A couple of us are interested and actively gravitating towards a common code base (e.g. loss functions, class encodings), but this takes time to flesh out and get right.

On 2015-11-11 15:29, Randy Zwitch wrote:
> Sure. I'm not against anyone doing anything, just that it seems like Julia suffers from an "expert/edge case" problem right now.
Re: [julia-users] Re: Google releases TensorFlow as open source
I understand that. But that would imply that a group of people who are used to different notation would need to reach a consensus. Also, there would be an ugliness to it. For example, SVMs have a pretty standardized notation for most things. I think it would not help anyone if we started to change that just to make the whole codebase more unified. It would also be confusing to newcomers. To me it would make most sense if a domain expert has an easy time seeing what's going on. I think it's unlikely that someone comes along and wants to work on 10 packages at the same time. It seems more likely that a newcomer wants to work on something from the special domain he/she is familiar with.

On 2015-11-11 16:49, Tom Breloff wrote:
>> if you implement some algorithm one should use the notation from the referenced paper
> This can be easier to implement (essentially just copy from the paper) but will make for a mess and a maintenance nightmare. I don't want to have to read a paper just to understand what someone's code is doing.
Re: [julia-users] Re: Google releases TensorFlow as open source
scikit-learn uses greek letters in its implementation, which I'm fine with since domain experts work on those, but I wish that in the visible interface they had consistently used more descriptive names (e.g. regularization_strength instead of alpha).

On Wednesday, November 11, 2015 at 11:00:56 AM UTC-5, Christof Stocker wrote:
> I understand that. But that would imply that a group of people who are used to different notation would need to reach a consensus. Also, there would be an ugliness to it. For example, SVMs have a pretty standardized notation for most things.
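To see the two conventions side by side, here is a throwaway sketch (both function names and their keyword signatures are invented for illustration; the bodies are stubs):

    # Paper-style: terse symbols that match the reference paper.
    fit_paper_style(X, y; C = 1.0, η = 0.01) = (C, η)

    # Interface-style: descriptive names, self-documenting at the call site.
    fit_interface_style(X, y; regularization_strength = 1.0, learning_rate = 0.01) =
        (regularization_strength, learning_rate)

    # The call sites read very differently:
    fit_paper_style(rand(10, 2), rand(10), C = 0.1, η = 0.05)
    fit_interface_style(rand(10, 2), rand(10),
                        regularization_strength = 0.1, learning_rate = 0.05)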
[julia-users] Re: Google releases TensorFlow as open source
Awesome. Feel free to open up a LightGraphs issue to track. On Wednesday, November 11, 2015 at 2:24:13 PM UTC-8, Alireza Nejati wrote: > > Both! :)
[julia-users] Re: Google releases TensorFlow as open source
On Tuesday, November 10, 2015 at 9:57:21 PM UTC-8, Valentin Churavy wrote:
> It fits in the same niche that Mocha.jl and MXNet.jl are filling right now. MXNet is an ML library that shares many of the same design ideas as TensorFlow and has great Julia support: https://github.com/dmlc/MXNet.jl

MXNet and TensorFlow look like very similar frameworks. Both use symbolic computation, which means they both create a DAG that can be manipulated and optimized for the underlying hardware (CPU or GPU). It would be interesting to see some comparisons between the two. I've read on another forum that MXNet is probably faster than TensorFlow at this point, but nobody has done any benchmarks yet (I'd try, but I don't have a GPU available at the moment).

This DAG optimization step is pretty much a compiler in itself; I wonder how many similarities there are to ParallelAccelerator.jl. One could imagine borrowing some ideas from it and taking advantage of Julia's macro features (which Python and C++ lack) to create a native Julia ML toolkit that could also have very high performance... the problem is, there are so many ML toolkits coming out now that things are already getting pretty fragmented in the space.
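A concrete taste of the macro point: Julia code is itself a manipulable expression tree, which is the raw material a native symbolic toolkit would work on. A self-contained sketch, not from any package:

    # Julia parses code into Expr trees that functions and macros can walk and
    # rewrite -- the same kind of DAG a symbolic framework builds and optimizes.
    ex = :(w * x + b)
    dump(ex)    # prints the nested Expr(:call, ...) structure

    # A toy "compiler pass": substitute one symbol for another everywhere.
    rename(e, from, to) = e === from ? to :
        e isa Expr ? Expr(e.head, map(a -> rename(a, from, to), e.args)...) : e

    rename(ex, :x, :input)    # returns :(w * input + b)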
[julia-users] Re: Google releases TensorFlow as open source
> From reading through some of the TensorFlow docs, it seems to currently only run on one machine. This is where MXNet has an advantage (and MXNet.jl), as it can run across multiple machines/GPUs

I think it's fair to assume that Google will soon release a distributed version.

> problem is, there are so many ML toolkits coming out now that things are already getting pretty fragmented in the space.

Let them fight it out until one wins, I say.

Anyway, the problem I'm facing right now is that even though TensorFlow's Python interface works fine, I can't get TensorFlow's C library to build! Has anyone else had any luck with this? I've had to update Java AND gcc just to make some progress in building (they use C++11 features, don't ask). Plus I had to install Google's own bizarre and buggy build manager (Bazel). TensorFlow.jl would be kind of pointless if everyone faced the same build issues...
[julia-users] Re: Google releases TensorFlow as open source
On Tuesday, November 10, 2015 at 8:28:32 PM UTC-8, Alireza Nejati wrote:
> Randy: To answer your question, I'd reckon that the two major gaps in Julia that TensorFlow could fill are:
> 1. Lack of automatic differentiation on arbitrary graph structures.
> 2. Lack of ability to map computations across CPUs and clusters.

From reading through some of the TensorFlow docs, it seems to currently only run on one machine. This is where MXNet (and MXNet.jl) has an advantage, as it can run across multiple machines/GPUs (see https://mxnet.readthedocs.org/en/latest/distributed_training.html for example).

> Funny enough, I was thinking about (1) for the past few weeks and I think I have an idea about how to accomplish it using existing JuliaDiff libraries. About (2), I have no idea, and that's probably going to be the most important aspect of TensorFlow moving forward (and also probably the hardest to implement). So for the time being, I think it's definitely worthwhile to just have an interface to TensorFlow. There are a few ways this could be done. Some ways that I can think of:
> 1. Just tell people to use PyCall directly. Not an elegant solution.
> 2. A more Julia-integrated interface *a la* SymPy.
> 3. Using TensorFlow as the 'backend' of a novel Julia-based machine learning library. In this scenario, everything would be in Julia, and TensorFlow would only be used to map computations to hardware.
> I think 3 is the most attractive option, but also probably the hardest to do.

So if I understand correctly: we need bindings to TensorFlow. They use SWIG to generate the Python bindings, but there is no Julia backend for SWIG. Then, using the #3 approach, we'd build something more general on top of those bindings. Julia's macros should allow for some features that would be difficult in C++ or Python.
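Worth noting: Julia bindings to a C library are normally written by hand with ccall rather than generated by SWIG. A hedged sketch of the lowest layer, assuming a built libtensorflow shared library that exports TF_NewSessionOptions and TF_DeleteSessionOptions (function names from TensorFlow's C API header; whether the library builds and exports them as shown is exactly the open question in this thread):

    # Hand-written C bindings via ccall: no SWIG needed. Untested.
    const libtf = "libtensorflow"

    mutable struct SessionOptions
        ptr::Ptr{Cvoid}                  # opaque TF_SessionOptions* handle
        function SessionOptions()
            p = ccall((:TF_NewSessionOptions, libtf), Ptr{Cvoid}, ())
            opts = new(p)
            # free the C object when the Julia wrapper is garbage-collected
            finalizer(o -> ccall((:TF_DeleteSessionOptions, libtf),
                                 Cvoid, (Ptr{Cvoid},), o.ptr), opts)
            return opts
        end
    end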
Re: [julia-users] Re: Google releases TensorFlow as open source
I think rather than always matching papers we should endeavor to use consistent and standard terminology and notation. When there is disagreement, we need to have a discussion and come to some kind of agreement within our own community at least. So far that's gone quite well in StatsBase (and ColorTypes, etc.) – let's continue that trend. In some cases when an algorithm is sufficiently novel, the paper that introduces it defines the standard notation and terminology. In that case, yes, we should follow it until some other standard emerges. On Wed, Nov 11, 2015 at 10:49 AM, Tom Breloffwrote: > if you implement some algorithm one should use the notation from the >> referenced paper > > > This can be easier to implement (essentially just copy from the paper) but > will make for a mess and a maintenance nightmare. I don't want to have to > read a paper just to understand what someone's code is doing. Not to > mention there are lots of "unique findings" and algorithms in papers that > have actually already been found/implemented, but with different > terminology in a different field. My research has taken me down lots of > rabbit holes, and I'm always amazed at how very different > fields/applications all have the same underlying math. We should do > everything we can to unify the algorithms in the most Julian way. It's not > always easy, but it should at least be the goal. > > This is most important with terminology and using greek letters. I don't > want one algorithm to represent a learning rate with eta, and another to > use alpha. It may match the paper, but it makes for mass confusion when > you're not using the paper as a reference. (the obvious solution is to > never use greek letters, of course) > > On Wed, Nov 11, 2015 at 10:34 AM, Christof Stocker < > stocker.chris...@gmail.com> wrote: > >> I agree. I personally think the ML efforts should follow the StatsBase >> and Optim conventions where it makes sense. >> >> The notational differences are inconvenient, but they are manageable. I >> think readability should be the goal there. For example if you implement >> some algorithm one should use the notation from the referenced paper. A >> package tailored towards use in a statistical context such as GLMs should >> probably follow the convention used in statistics (e.g. beta for the >> coefficients). A package for SVMs should follow the conventions for SVMs >> (e.g. w for the coefficients) and so forth. It's nice to streamline things, >> but lets not get carried away with this kind of micromanagement >> >> >> On 2015-11-11 16:01, Tom Breloff wrote: >> >> One of the tricky things to figure out is how to separate statistics from >> machine learning, as they overlap heavily (completely?) but with different >> terminology and goals. I think it's really important that JuliaStats and >> JuliaML/JuliaLearn play nicely together, and this probably means that any >> ML interface uses StatsBase verbs whenever possible. There has been a >> little tension (from my perspective) and a slight turf war when it comes to >> statistical processes and terminology... is it possible to avoid? >> >> On Wed, Nov 11, 2015 at 9:49 AM, Stefan Karpinski >> wrote: >> >>> This is definitely already in progress, but we've a ways to go before >>> it's as easy as scikit-learn. I suspect that having an organization will be >>> more effective at coordinating the various efforts than people might expect. 
>>> >>> On Wed, Nov 11, 2015 at 9:46 AM, Tom Breloff < >>> t...@breloff.com> wrote: >>> Randy, see LearnBase.jl, MachineLearning.jl, Learn.jl (just a readme for now), Orchestra.jl, and many others. Many people have the same goal, and wrapping TensorFlow isn't going to change the need for a high-level interface. I do agree that a good high-level interface is higher on the priority list, though. On Wed, Nov 11, 2015 at 9:29 AM, Randy Zwitch < randy.zwi...@fuqua.duke.edu> wrote: > Sure. I'm not against anyone doing anything, just that it seems like > Julia suffers from an "expert/edge case" problem right now. For me, it'd > be > awesome if there were just a scikit-learn (Python) or caret (R) type > mega-interface that ties together the packages that have already been > coded. From my cursory reading, it seems like TensorFlow is more like a > low-level toolkit for expressing/solving equations, where I see Julia > lacking an easy method to evaluate 3-5 different algorithms on the same > dataset quickly. > > A tweet I just saw sums it up pretty succinctly: "TensorFlow already > has more stars than scikit-learn, and probably more stars than people > actually doing deep learning" > > > > On Tuesday, November 10, 2015 at 11:28:32 PM UTC-5, Alireza Nejati > wrote: >> >> Randy: To answer your question, I'd reckon that the two major gaps in >>
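To put the Greek-letter point in code: the convention being argued for is one agreed-upon name at the API surface, whatever symbol the source paper uses internally. A minimal hypothetical sketch (the keyword name learnrate is an invented placeholder, not an established standard):

    # Two update rules agreeing on one exposed keyword, `learnrate`,
    # even though their source papers call it eta and alpha respectively.
    sgd_step(w, grad; learnrate = 0.01) = w .- learnrate .* grad

    function adagrad_step!(w, grad, cache; learnrate = 0.01, eps = 1e-8)
        cache .+= grad .^ 2                                # accumulated squared gradients
        w .-= learnrate .* grad ./ (sqrt.(cache) .+ eps)   # per-coordinate scaled step
        return w
    end

A caller can then switch algorithms without relearning the vocabulary: sgd_step(w, g; learnrate = 0.1) reads the same as adagrad_step!(w, g, c; learnrate = 0.1).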
Re: [julia-users] Re: Google releases TensorFlow as open source
Sounds fine to me... are you volunteering to do it, or just suggesting a plan? On Wed, Nov 11, 2015 at 5:09 PM, Alireza Nejati wrote: > So I had a look at the C API. Seems simple enough. I propose a basic > TensorFlow.jl package that does the following: > >- Defines the types TensorFlow.Status, TensorFlow.Tensor, >TensorFlow.Session, and TensorFlow.SessionOptions. >- Defines constructors such that e.g. a tensor can be created with >Tensor(data::Array{T,N}) and a session can also be created. >- Provides custom show() functions for displaying basic properties >about tensors and sessions. > > And that's it. Further functionality (such as defining graphs, etc. using > other Julia packages such as LightGraphs.jl) can be done later once this > basic interface is in place. Thoughts?
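For concreteness, here is a rough pure-Julia sketch of the type layer being proposed. The actual hookup to libtensorflow's C API (via ccall) is deliberately omitted, and the fields and defaults below are assumptions for illustration, not the real C interface:

    # Hypothetical skeleton of the four proposed types.
    mutable struct Status
        code::Int          # 0 taken to mean OK in this sketch
        message::String
    end
    Status() = Status(0, "")

    mutable struct Tensor{T,N}
        data::Array{T,N}   # a real wrapper would shadow a TF buffer
    end

    mutable struct SessionOptions
        target::String     # "" for in-process execution, say
    end
    SessionOptions() = SessionOptions("")

    mutable struct Session
        options::SessionOptions
        handle::Ptr{Cvoid} # would hold the TF_Session* in a real binding
    end
    Session(opts::SessionOptions = SessionOptions()) = Session(opts, C_NULL)

    # Custom show() methods, as proposed.
    Base.show(io::IO, t::Tensor{T,N}) where {T,N} =
        print(io, "Tensor{", T, "}, size ", size(t.data))
    Base.show(io::IO, s::Session) =
        print(io, "Session(target = \"", s.options.target, "\")")

With this in place, Tensor(rand(2, 3)) and Session() behave as described, and graph construction can be layered on later.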
[julia-users] Re: Google releases TensorFlow as open source
Both! :)
Re: [julia-users] Re: Google releases TensorFlow as open source
I'm interested as well. Who wants to claim TensorFlow.jl? On Tue, Nov 10, 2015 at 9:11 AM, Ben Moran wrote: > I'm very interested in this. I haven't gone through the details yet, but > they say that the C++ API currently only supports a subset of the Python API > (weird!). > > One possibility is to use PyCall to wrap the Python version, like was done > for PyPlot and SymPy, and like I began tentatively for Theano here - > https://github.com/benmoran/MochaTheano.jl > > > On Monday, 9 November 2015 21:06:41 UTC, Phil Tomson wrote: >> >> Looks like they used SWIG to create the Python bindings. I don't see >> Julia listed as an output target for SWIG. >> >> >> >> On Monday, November 9, 2015 at 1:02:36 PM UTC-8, Phil Tomson wrote: >>> >>> Google has released its deep learning library called TensorFlow as open >>> source code: >>> >>> https://github.com/tensorflow/tensorflow >>> >>> They include Python bindings. Any ideas about how easy/difficult it >>> would be to create Julia bindings? >>> >>> Phil
[julia-users] Re: Google releases TensorFlow as open source
I'm very interested in this. I haven't gone through the details yet, but they say that the C++ API currently only supports a subset of the Python API (weird!). One possibility is to use PyCall to wrap the Python version, like was done for PyPlot and SymPy, and like I began tentatively for Theano here - https://github.com/benmoran/MochaTheano.jl On Monday, 9 November 2015 21:06:41 UTC, Phil Tomson wrote: > > Looks like they used SWIG to create the Python bindings. I don't see > Julia listed as an output target for SWIG. > > > > On Monday, November 9, 2015 at 1:02:36 PM UTC-8, Phil Tomson wrote: >> >> Google has released its deep learning library called TensorFlow as open >> source code: >> >> https://github.com/tensorflow/tensorflow >> >> They include Python bindings. Any ideas about how easy/difficult it would >> be to create Julia bindings? >> >> Phil
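For anyone who wants to try that route right away, a minimal PyCall sketch. It assumes the TensorFlow Python package is importable from the Python that PyCall uses, and it targets the original session-based Python API (later removed in TensorFlow 2.x):

    using PyCall

    tf = pyimport("tensorflow")

    a = tf.constant(3.0)
    b = tf.constant(4.0)
    c = tf.add(a, b)      # builds a graph node; nothing executes yet

    sess = tf.Session()   # TF 1.x-era API
    println(sess.run(c))  # prints 7.0
    sess.close()

A fuller wrapper would hide the session plumbing behind Julian functions, which is roughly what the PyPlot and SymPy wrappers do for their underlying libraries.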
[julia-users] Re: Google releases TensorFlow as open source
For me, the bigger question is: how does TensorFlow fit in, or fill gaps in, the currently available Julia libraries? I'm not saying that someone who is sufficiently interested shouldn't wrap the library, but it'd be great to identify what major gaps remain in ML for Julia and figure out if TensorFlow is the right way to proceed. We're certainly nowhere near the R duplication problem yet, but we're already repeating ourselves in many areas. On Monday, November 9, 2015 at 4:02:36 PM UTC-5, Phil Tomson wrote: > > Google has released its deep learning library called TensorFlow as open > source code: > > https://github.com/tensorflow/tensorflow > > They include Python bindings. Any ideas about how easy/difficult it would > be to create Julia bindings? > > Phil
[julia-users] Re: Google releases TensorFlow as open source
It fits in the same niche that Mocha.jl and MXNet.jl are filling right now. MXNet is an ML library that shares many of the same design ideas as TensorFlow and has great Julia support: https://github.com/dmlc/MXNet.jl On Wednesday, 11 November 2015 01:04:00 UTC+9, Randy Zwitch wrote: > > For me, the bigger question is how does TensorFlow fit in/fill in gaps in > currently available Julia libraries? I'm not saying that someone who is > sufficiently interested shouldn't wrap the library, but it'd be great to > identify what major gaps remain in ML for Julia and figure out if > TensorFlow is the right way to proceed. > > We're certainly nowhere near the R duplication problem yet, but we're > already repeating ourselves in many areas. > > On Monday, November 9, 2015 at 4:02:36 PM UTC-5, Phil Tomson wrote: >> >> Google has released its deep learning library called TensorFlow as open >> source code: >> >> https://github.com/tensorflow/tensorflow >> >> They include Python bindings. Any ideas about how easy/difficult it would >> be to create Julia bindings? >> >> Phil
[julia-users] Re: Google releases TensorFlow as open source
If anyone draws up an initial implementation (or a pathway to implementation, even), I'd gladly contribute. I think it's highly strategically important to have a Julia interface to TensorFlow.
[julia-users] Re: Google releases TensorFlow as open source
Randy: To answer your question, I'd reckon that the two major gaps in Julia that TensorFlow could fill are: 1. Lack of automatic differentiation on arbitrary graph structures. 2. Lack of ability to map computations across CPUs and clusters. Funnily enough, I was thinking about (1) for the past few weeks and I think I have an idea about how to accomplish it using existing JuliaDiff libraries. About (2), I have no idea, and that's probably going to be the most important aspect of TensorFlow moving forward (and also probably the hardest to implement). So for the time being, I think it's definitely worthwhile to just have an interface to TensorFlow. There are a few ways this could be done. Some ways that I can think of: 1. Just tell people to use PyCall directly. Not an elegant solution. 2. A more Julia-integrated interface *à la* SymPy. 3. Using TensorFlow as the 'backend' of a novel Julia-based machine learning library. In this scenario, everything would be in Julia, and TensorFlow would only be used to map computations to hardware. I think 3 is the most attractive option, but also probably the hardest to do.
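On gap (1), the JuliaDiff route is already workable today. For instance, ForwardDiff.jl differentiates plain imperative Julia code, control flow included, with no symbolic graph at all. A small sketch (the function f below is just an arbitrary example):

    using ForwardDiff

    # Ordinary Julia with branching, awkward to express as a static graph.
    function f(x::Vector)
        s = zero(eltype(x))
        for xi in x
            s += xi > 0 ? xi^2 : sin(xi)
        end
        return s
    end

    g = ForwardDiff.gradient(f, [1.0, -2.0, 3.0])   # gradient at that point

What's missing relative to TensorFlow is not the differentiation itself but distributing the resulting computation across devices, which is gap (2).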
[julia-users] Re: Google releases TensorFlow as open source
Looks like they used SWIG to create the Python bindings. I don't see Julia listed as an output target for SWIG. On Monday, November 9, 2015 at 1:02:36 PM UTC-8, Phil Tomson wrote: > > Google has released its deep learning library called TensorFlow as open > source code: > > https://github.com/tensorflow/tensorflow > > They include Python bindings. Any ideas about how easy/difficult it would > be to create Julia bindings? > > Phil
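One note on the SWIG point: the lack of a Julia backend for SWIG is not actually a blocker, because Julia's ccall talks to C shared libraries directly, with no generated glue code. A trivial illustration against libc (bindings to TensorFlow's C API would follow the same pattern, just against libtensorflow):

    # Call a C function directly; no wrapper generator involved.
    len = ccall(:strlen, Csize_t, (Cstring,), "tensorflow")
    println(len)   # prints 10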