Good morning. I think we should make a wiki page giving an overview of topics relevant to Machine Learning in J.
I have already implemented various machine-learning-related algorithms, such as PCA, the Minimum Noise Fraction transform, k-means clustering, k-nearest-neighbour classification, and Kohonen self-organising maps. I can put those online if I can find some time. Maybe an addon and a lab would be a good idea, but I think we should think about how to structure those well in advance.

As for books, I would also like to mention Introduction to Statistical Learning (http://www-bcf.usc.edu/~gareth/ISL/) and Elements of Statistical Learning (https://web.stanford.edu/~hastie/ElemStatLearn/), which are both freely (and legally) available as PDFs.

Best regards,
Jan-Pieter

On Fri, 16 Mar 2018, 01:27 'Jon Hough' via Programming, <[email protected]> wrote:

> Many ML algorithms are very suitable to write in J. I have written a few myself, which I will put on Github when I iron out some issues and clean it up, since it is a mess at the moment - maybe this month or next. But having said that, I am definitely not an expert in ML (or J for that matter), and am more interested in understanding and building the algorithms than actually using them on real-world data. From that point of view, J is great, because implementing a lot of the algorithms is essentially manipulating matrices, i.e. what J is built for. And even convolutions have their own built-in conjunction (;._3 or ;.3).
>
> I noticed you have missed a few popular ML algorithms:
> SVM
> multi-layer perceptron
> PCA
> SOM (Kohonen nets)
> Gaussian Processes
> ...
>
> One issue that has been mentioned before is SVMs. These are somewhat popular, but difficult to write, since a quadratic programming solver is necessary, and as far as I know nobody has written one in J.
>
> I have written a somewhat shabby convolutional net in J (for 2D convolutions, i.e. image data). I could get a 90%-ish accuracy rate on the MNIST dataset using my convnet...
> the downside being that it took over 15 hours to do the training (on a CPU, obviously). I will add that to Github too, merely as a reference, or as something of interest.
>
> My current goal (I am just going through various ML algos and trying to implement them for the sake of my own learning, not to solve any specific problem) is to write an LSTM network. I will, time permitting, add that to Github too.
>
> It would be good to have a whole section of the Wiki devoted to ML in future.
>
> Other sources of information:
> This book is very good, better in hardback than reading online: http://www.deeplearningbook.org/
> The scikit-learn source code is very readable, if you know Python, and sometimes easily applicable to J: https://github.com/scikit-learn/scikit-learn
> I also found Hands-On Machine Learning with Scikit-Learn and TensorFlow to be a very good book.
>
> Jon
>
> --------------------------------------------
> On Fri, 3/16/18, Skip Cave <[email protected]> wrote:
>
> Subject: [Jprogramming] J for ML
> To: "[email protected]" <[email protected]>
> Date: Friday, March 16, 2018, 5:56 AM
>
> All,
>
> If J would like to stay relevant in today's programming world, providing J code for the most common machine learning and deep learning algorithms, such as gradient descent, neural networks, word2vec, etc., would likely attract some attention. Many of the basic ML algorithms are already published in J, but they are scattered in various locations on the J website and other places. Collecting them together in one place would help show J's relevance to the hot field of ML research.
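To make the first item on that wish list concrete, here is a minimal sketch of linear regression fitted by batch gradient descent, written with NumPy since the thread compares J to it. The toy data, learning rate, and iteration count are my own choices for illustration, not anything from the thread; a J version would be a few lines of matrix arithmetic in the same spirit.

```python
import numpy as np

# Toy data: y = 2x + 1 plus a little noise (coefficients invented for this sketch)
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.01, size=100)

# Fit y ~ w*x + b by minimising mean squared error with batch gradient descent
w, b = 0.0, 0.0
lr = 0.5                             # learning rate, arbitrary for this example
for _ in range(2000):
    err = (w * x + b) - y
    w -= lr * 2 * np.mean(err * x)   # d(MSE)/dw
    b -= lr * 2 * np.mean(err)       # d(MSE)/db

print(round(w, 2), round(b, 2))      # should land near 2.0 and 1.0
```

The whole update step is two array expressions, which is exactly the point being made about matrix languages.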
> Here's a list of some of the most basic ML algorithms:
> Linear Regression
> Linear Regression w/ Gradient Descent
> Logistic Function
> Logistic Regression
> Linear Discriminant Analysis
> Gini Coefficient
> Classification and Regression Trees
> Naive Bayes
> Gaussian
> Gaussian Naive Bayes
> Nearest Neighbors
> Vector Quantization
> Support Vector Machines
> Bagged Decision Trees
> Adaptive Boosting
>
> Jason Brownlee, Ph.D. maintains a website focused on ML, called "Machine Learning Mastery": https://machinelearningmastery.com/
> On his website, Brownlee sells several books that he has written on various aspects of ML, Deep Learning, and Natural Language Processing. In one book, entitled A Gentle Step-by-Step Introduction to 10 Top Machine Learning Algorithms <https://machinelearningmastery.com/master-machine-learning-algorithms/> (for the ML beginner), *he provides Excel spreadsheets for all the basic ML algorithms I mentioned above.*
>
> Here are two more of Brownlee's books, where he shows how R and Python (with NumPy) can be used for ML algorithms:
>
> Machine Learning Mastery with R <https://machinelearningmastery.com/machine-learning-with-r/> (for the ML intermediate)
>
> Deep Learning with Python <https://machinelearningmastery.com/deep-learning-with-python/> (for the Deep Learning aficionado)
>
> IMO, J's implementation of these algorithms would be clearer and more concise, with much less reliance on external routines. Making a J adjunct workbook to Brownlee's books, though a huge task, would be a showcase for why a true matrix language is the optimal way to describe these algorithms.
>
> Also, attached below is an email I just received containing a topical discussion about operations on sparse matrices using Python's NumPy addon, as well as an interesting article about math operations on different-sized arrays called "broadcasting".
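For readers who have not met the broadcasting idea mentioned above, here is a small NumPy sketch (the arrays are invented for illustration). J users will recognise it as close in spirit to applying a verb at different ranks.

```python
import numpy as np

# A 3x1 column and a length-4 row: broadcasting virtually stretches both
# to a common 3x4 shape before the addition, without copying any data.
col = np.array([[0], [10], [20]])      # shape (3, 1)
row = np.array([1, 2, 3, 4])           # shape (4,)

table = col + row                      # result has shape (3, 4)
print(table)
# [[ 1  2  3  4]
#  [11 12 13 14]
#  [21 22 23 24]]
```

The rule is that trailing dimensions must either match or be 1; dimensions of size 1 are repeated as needed.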
> Jason sends these emails out as a weekly ML newsletter.
>
> Skip Cave
> Cave Consulting LLC
>
> <<<>>>
>
> ---------- Forwarded message ----------
> From: Jason @ ML Mastery <[email protected]>
> Date: Thu, Mar 15, 2018 at 1:11 PM
> Subject: Broadcasting, Sparsity and Deep Learning
> To: [email protected]
>
> Hi, this week we have two important tutorials and an overview of linear algebra for deep learning.
>
> Broadcasting is a handy shortcut to performing arithmetic operations on arrays with differing sizes. Discover how broadcasting works in this tutorial:
>
>> A Gentle Introduction to Broadcasting with NumPy Arrays
> <http://t.dripemail2.com/c/eyJhY2NvdW50X2lkIjoiOTU1NjU4OCIsImRlbGl2ZXJ5X2lkIjoiMjI5MzQ1MDYyOSIsInVybCI6Imh0dHBzOi8vbWFjaGluZWxlYXJuaW5nbWFzdGVyeS5jb20vYnJvYWRjYXN0aW5nLXdpdGgtbnVtcHktYXJyYXlzLz9fX3M9dWIxYnBpaG9la3Fic3BmdnF6cnMifQ>
>
> Sparse vectors and matrices are an important and under-discussed area of applied machine learning. Discover sparsity and how to work with sparse data in this tutorial:
>
>> A Gentle Introduction to Sparse Matrices for Machine Learning
> <http://t.dripemail2.com/c/eyJhY2NvdW50X2lkIjoiOTU1NjU4OCIsImRlbGl2ZXJ5X2lkIjoiMjI5MzQ1MDYyOSIsInVybCI6Imh0dHBzOi8vbWFjaGluZWxlYXJuaW5nbWFzdGVyeS5jb20vc3BhcnNlLW1hdHJpY2VzLWZvci1tYWNoaW5lLWxlYXJuaW5nP19fcz11YjFicGlob2VrcWJzcGZ2cXpycyJ9>
>
> Linear algebra is a required tool for understanding precise descriptions of deep learning methods. Discover the linear algebra topics required for deep learning in this post:
>
>> Linear Algebra for Deep Learning
> <http://t.dripemail2.com/c/eyJhY2NvdW50X2lkIjoiOTU1NjU4OCIsImRlbGl2ZXJ5X2lkIjoiMjI5MzQ1MDYyOSIsInVybCI6Imh0dHBzOi8vbWFjaGluZWxlYXJuaW5nbWFzdGVyeS5jb20vbGluZWFyLWFsZ2VicmEtZm9yLWRlZXAtbGVhcm5pbmc_X19zPXViMWJwaWhvZWtxYnNwZnZxenJzIn0>
>
> I'll speak to you soon.
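To make the sparsity topic above concrete: the idea is to store only the nonzero entries of a mostly-zero array. SciPy's scipy.sparse module provides ready-made formats (CSR, COO, etc.); the sketch below shows the underlying coordinate idea with plain NumPy, on a tiny matrix invented for illustration.

```python
import numpy as np

# A mostly-zero matrix stored densely
dense = np.array([[0, 0, 3],
                  [4, 0, 0],
                  [0, 0, 0]])

# Sparsity: the fraction of entries that are zero
sparsity = 1.0 - np.count_nonzero(dense) / dense.size
print(sparsity)                        # 7 of 9 entries are zero

# Coordinate (COO) form: keep only (row, col, value) triples
rows, cols = np.nonzero(dense)
vals = dense[rows, cols]
print(list(zip(rows.tolist(), cols.tolist(), vals.tolist())))
# [(0, 2, 3), (1, 0, 4)]
```

For large matrices this triple representation (or a compressed variant of it) is what makes storage and matrix products affordable; J has its own sparse array support via the $. verb.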
> Jason
>
> <<<>>>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
