Hi All,

I was looking into one of the open issues in mlpack, "Tests for Stochastic Optimization" <https://github.com/mlpack/mlpack/issues/894>. The idea is to implement a set of unit tests for evaluating variants of the Stochastic Gradient Descent (SGD) algorithm on a diverse set of loss functions; passing them would be a necessary condition for an SGD variant to demonstrate its generality. This work is inspired by Schaul et al.'s paper "Unit Tests for Stochastic Optimization" <https://arxiv.org/abs/1312.6055>. For those who want to skip the tedious task of going through the paper, here are some bullet points that briefly cover its most important aspects:
* Each prototype function to be evaluated as a unit test is composed of very simple one-dimensional mathematical functions, such as linear, quadratic, Gaussian, Laplacian, absolute value, ReLU, and sigmoid functions. Each function is defined on a particular interval.

* These simple 1D prototype functions can be concatenated to form more complex 1D prototypes, e.g., a line followed by a quadratic bowl followed by a cliff.

* 1D function prototypes can further be expanded into multi-dimensional functions by using suitable norms.

* Different noise prototypes, such as additive Gaussian noise, can be added to better mimic the behavior of real-world loss functions.

* There are also mechanisms to introduce different amounts of curl into a multi-dimensional vector field, creating loss functions similar to those produced by temporal difference learning in reinforcement learning.

* Finally, there is functionality to create non-stationary objective functions, which are typically observed in real-world scenarios.

Obviously, the paper itself contains much richer information than can be covered in a few bullet points, so please consider giving it a look if you want an in-depth understanding of the problem at hand. One of the most important points I want to mention is that the authors have already open sourced a reference implementation of their work on GitHub <https://github.com/IoannisAntonoglou/optimBench>, written in Lua. Already having a reference implementation in hand makes our life a lot easier, because all of the function logic, including the mathematical intricacies, can simply be ported to C++. This gives us a lot of time to put more effort into designing the framework to better suit the styles of C++ and mlpack. Some of the design decisions taken by the authors, although well suited to a scripting language like Lua, simply don't match the standards of a general-purpose programming language like C++.
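To make two of the expansion mechanisms above concrete, here is a minimal C++ sketch (all names here are my own illustration, not the paper's Lua code or existing mlpack API): lifting a 1D prototype to d dimensions through the Euclidean norm, and wrapping a loss with additive Gaussian noise.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <memory>
#include <random>
#include <vector>

// Lift a 1D prototype f to d dimensions by evaluating it on the Euclidean
// norm of the input: F(x) = f(||x||_2).  (Name is illustrative only.)
std::function<double(const std::vector<double>&)>
ExpandByNorm(std::function<double(double)> f)
{
  return [f](const std::vector<double>& x)
  {
    double sq = 0.0;
    for (double xi : x)
      sq += xi * xi;
    return f(std::sqrt(sq));
  };
}

// Wrap a loss with additive Gaussian noise to mimic the stochastic
// evaluations seen with real-world loss functions.
std::function<double(const std::vector<double>&)>
AddGaussianNoise(std::function<double(const std::vector<double>&)> f,
                 double stddev, unsigned seed = 42)
{
  auto gen = std::make_shared<std::mt19937>(seed);
  auto noise =
      std::make_shared<std::normal_distribution<double>>(0.0, stddev);
  return [f, gen, noise](const std::vector<double>& x)
  {
    return f(x) + (*noise)(*gen);
  };
}
```

For instance, `ExpandByNorm([](double r) { return r * r; })` turns the 1D quadratic bowl into the usual d-dimensional sphere function.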
This is especially true for a framework like mlpack, which touts template metaprogramming as one of its most important features. Take, for example, the way 1D concatenation and multi-dimensional scaling are handled by the reference framework: the function prototype, noise prototype, and corresponding dimensions are passed in a specially formatted string, which is then parsed to generate the corresponding function. In a language like C++, this would be better achieved either by method chaining or by overloading the '+' operator. Ex:-

    FunctionPrototype f =
        LinearUnit(starting_point, ending_point, dimension#1)
            .add(ReLUUnit(starting_point, ending_point, dimension#1))
            .add(QuadraticUnit(starting_point, ending_point, dimension#2))
            .add(NoisePrototype(starting_point, ending_point, dimension#1))
            .curl(rotation_matrix);

or

    FunctionPrototype f =
        (LinearUnit(starting_point, ending_point, dimension#1) +
         ReLUUnit(starting_point, ending_point, dimension#1) +
         QuadraticUnit(starting_point, ending_point, dimension#2) +
         NoisePrototype(starting_point, ending_point, dimension#1))
            .curl(rotation_matrix);

There is also the consideration of designing the function prototypes so that they match the FunctionType parameter taken by the existing implementations of SGD variants in mlpack (although this shouldn't be much of a problem). My point is that, since we already have a reference implementation and hence all the function logic, it would be wiser to spend some extra time designing the problem carefully rather than jumping straight into the implementation. I'll follow up with another mail demonstrating a basic class hierarchy that I think would better suit this situation. Please feel free to add any comments or suggestions in the meantime.

Regards,
Saswat
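As a rough, compilable sketch of what the chaining and operator-overloading styles could look like, here is one possible shape (again, FunctionPrototype, LinearUnit, and QuadraticUnit are hypothetical names I'm using for illustration, not existing mlpack types; the real design would also need Gradient() etc. to satisfy the FunctionType policy):

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// A 1D prototype built by concatenating simple units, each defined on its
// own interval.  (Illustrative sketch only.)
class FunctionPrototype
{
 public:
  FunctionPrototype(double begin, double end, std::function<double(double)> f)
  {
    units.push_back({begin, end, std::move(f)});
  }

  // Method chaining: append another prototype's units and return *this,
  // so calls compose as f.Add(a).Add(b).
  FunctionPrototype& Add(const FunctionPrototype& other)
  {
    units.insert(units.end(), other.units.begin(), other.units.end());
    return *this;
  }

  // Operator-overloading alternative: f = a + b.
  friend FunctionPrototype operator+(FunctionPrototype lhs,
                                     const FunctionPrototype& rhs)
  {
    lhs.Add(rhs);
    return lhs;
  }

  // Evaluate the concatenated prototype at x (first matching interval).
  double Evaluate(double x) const
  {
    for (const auto& u : units)
      if (x >= u.begin && x <= u.end)
        return u.f(x);
    return 0.0; // Outside every interval.
  }

 private:
  struct Unit
  {
    double begin, end;
    std::function<double(double)> f;
  };
  std::vector<Unit> units;
};

// Hypothetical factory helpers mirroring the units named above.
inline FunctionPrototype LinearUnit(double begin, double end, double slope)
{
  return FunctionPrototype(begin, end,
      [slope](double x) { return slope * x; });
}

inline FunctionPrototype QuadraticUnit(double begin, double end)
{
  return FunctionPrototype(begin, end, [](double x) { return x * x; });
}
```

With this shape, both styles build the same object, e.g. a line descending into a quadratic bowl:

    FunctionPrototype f = LinearUnit(-2.0, 0.0, -1.0).Add(QuadraticUnit(0.0, 2.0));
    FunctionPrototype g = LinearUnit(-2.0, 0.0, -1.0) + QuadraticUnit(0.0, 2.0);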
_______________________________________________
mlpack mailing list
[email protected]
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
