Hello Konstantin,

> Something like this:
> https://thesundayprogrammer.wordpress.com/2016/01/27/neural-networks-using-mlpack/,
> but written for the up-to-date mlpack version, rather than the version from
> last year. Of course, I tried hard to google it, but failed :)

I'm working on a tutorial; for now, you could take a look at the unit test
cases, where different architectures are tested on simple problems:

https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/feedforward_network_test.cpp
https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/recurrent_network_test.cpp
https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/convolutional_network_test.cpp

Let me know if I should translate the model used in the blog post.

> I think it's time to sum up the discussed API: we create namespace
> mlpack::ann::augmented, and do all the work in it. For instance, we create an
> NTM class with Train() / Evaluate() methods (standard for mlpack). Also, we
> create classes for evaluating models (for instance, CopyTask). As an idea:
> what do you think about creating yet another namespace for such tasks (for
> instance, the full class name will be
> mlpack::ann::augmented::tasks::CopyTask)? Earlier I offered to inherit them
> from BaseBenchmark, but now I see it's not the only way. Right now, I'm more
> inclined to think that not doing inheritance is better because of that
> argument with the virtual functions. What do you think?

I agree, using a subnamespace of augmented is a good idea; it makes things
clearer. Also, I think we can use another template parameter for e.g. the
Controller, which would probably make it easier to switch between a
feed-forward and an LSTM controller. There are some other places where we
could use templates to generalize the API.
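
To make that a bit more concrete, here is a rough sketch of the design we
have discussed so far. All of the names below are placeholders - nothing of
this exists in mlpack yet - and FFN<> just stands in for whatever controller
type we end up using:

Code:
=====

#include <mlpack/core.hpp>

#include <utility>

namespace mlpack {
namespace ann {
namespace augmented {

// The controller is a template parameter, so switching between a
// feed-forward and an LSTM controller is just a different instantiation;
// no virtual functions are involved.
template<typename ControllerType>
class NTM
{
 public:
  NTM(ControllerType controller, const size_t memorySize) :
      controller(std::move(controller)), memorySize(memorySize) { }

  void Train(const arma::mat& predictors, const arma::mat& responses)
  {
    // Train the controller together with the read/write heads.
  }

  void Predict(arma::mat& predictors, arma::mat& responses)
  {
    // Run the model forward and write the predictions into responses.
  }

 private:
  ControllerType controller;
  size_t memorySize;
};

namespace tasks {

// Tasks are plain classes: no common base class, no virtual functions.
// Evaluate() accepts any model type that provides Train() and Predict().
class CopyTask
{
 public:
  // repeats > 1 turns the copy task into the repeat-copy task.
  CopyTask(const size_t maxLength = 5, const size_t repeats = 1) :
      maxLength(maxLength), repeats(repeats) { }

  template<typename ModelType>
  double Evaluate(ModelType& model)
  {
    // Generate the copy task data, call model.Train() on it, then call
    // model.Predict() on held-out sequences and return a score.
    return 0.0;
  }

 private:
  size_t maxLength;
  size_t repeats;
};

} // namespace tasks
} // namespace augmented
} // namespace ann
} // namespace mlpack

Usage:
======

using namespace mlpack::ann;
using namespace mlpack::ann::augmented;

NTM<FFN<>> model(FFN<>(), 128);
tasks::CopyTask copyTask(10, 2);
const double score = copyTask.Evaluate(model);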

> I'm interested in continuing the discussion and getting deeper into that
> project. By the way, sorry for the delay with the first API idea. The reason
> is that I'm a freshman student, and for this reason my university studies
> often get quite stressful. However, right now I'm trying to submit all of my
> coursework and finish the semester early - with the goal of fully
> transitioning to GSoC work :)

No worries, take all the time you need for your studies; there is plenty of
time in which we can discuss ideas. One recommendation, though: take a look
at other augmented memory models as well, because the NTM alone is probably
not enough work for GSoC.

Thanks,
Marcus

> On 7 Mar 2017, at 16:47, Сидоров Константин <[email protected]> wrote:
>
> Hello Marcus,
>
>> Regarding the BaseAugmented class, we should avoid inheritance (or, at
>> least, virtual inheritance) in deference to templates. This is because
>> virtual functions incur runtime overhead. In this case I don't see a
>> compelling reason to introduce a BaseAugmented class, maybe I missed
>> something? We could do:
>>
>> Code:
>> =====
>>
>> class CopyTask
>> {
>>   template<typename ModelType>
>>   void Evaluate(ModelType& model)
>>   {
>>     // Call model.Train() using the generated copy task data.
>>     // Call model.Evaluate() on the generated test data.
>>   }
>> };
>>
>> Usage:
>> ======
>>
>> NTM ntm;
>> CopyTask copyTask;
>> copyTask.Evaluate(ntm);
>>
>> What do you think about the design?
>
> This is definitely an excellent idea :) Actually, this shows that I have
> forgotten most of C++ - here, I forgot that after using the template
> keyword, I can call any functions I like (just like in Python, actually).
> So, it's about time to refresh C++ - and thanks for your advice on
> resources, since my C++ experience was biased towards competitive
> programming, which mostly ignores these language features.
>
>>> - Can you show me a working example of mlpack::ann::FFN compatible with
>>> the current upstream C++ API?
>>
>> Not sure what you mean, can you elaborate more?
>
> Something like this:
> https://thesundayprogrammer.wordpress.com/2016/01/27/neural-networks-using-mlpack/,
> but written for the up-to-date mlpack version, rather than the version from
> last year. Of course, I tried hard to google it, but failed :)
>
> I think it's time to sum up the discussed API: we create namespace
> mlpack::ann::augmented, and do all the work in it. For instance, we create
> an NTM class with Train() / Evaluate() methods (standard for mlpack). Also,
> we create classes for evaluating models (for instance, CopyTask).
> As an idea: what do you think about creating yet another namespace for such
> tasks (for instance, the full class name will be
> mlpack::ann::augmented::tasks::CopyTask)? Earlier I offered to inherit them
> from BaseBenchmark, but now I see it's not the only way. Right now, I'm more
> inclined to think that not doing inheritance is better because of that
> argument with the virtual functions. What do you think?
>
> I'm interested in continuing the discussion and getting deeper into that
> project. By the way, sorry for the delay with the first API idea. The reason
> is that I'm a freshman student, and for this reason my university studies
> often get quite stressful. However, right now I'm trying to submit all of my
> coursework and finish the semester early - with the goal of fully
> transitioning to GSoC work :)
>
> --
> Best Regards,
> Konstantin.
>
> 07.03.2017, 17:54, "Marcus Edel" <[email protected]>:
>> Hello Konstantin,
>>
>> thanks for getting back.
>>
>>> Since we have to make several different augmented RNN models, I think it
>>> is a good idea to make a namespace mlpack::ann::augmented and a class
>>> BaseAugmented (will be useful for benchmarking later).
>>
>> Putting different networks under a unified namespace is a good idea.
>> Regarding the BaseAugmented class, we should avoid inheritance (or, at
>> least, virtual inheritance) in deference to templates. This is because
>> virtual functions incur runtime overhead. In this case I don't see a
>> compelling reason to introduce a BaseAugmented class, maybe I missed
>> something? We could do:
>>
>> Code:
>> =====
>>
>> class CopyTask
>> {
>>   template<typename ModelType>
>>   void Evaluate(ModelType& model)
>>   {
>>     // Call model.Train() using the generated copy task data.
>>     // Call model.Evaluate() on the generated test data.
>>   }
>> };
>>
>> Usage:
>> ======
>>
>> NTM ntm;
>> CopyTask copyTask;
>> copyTask.Evaluate(ntm);
>>
>> What do you think about the design?
>>
>>> P.S. Some questions that arose while trying to get to grips with mlpack:
>>> - What resources can you recommend to brush up on *advanced/modern* C++
>>> (e.g., templates, && in functions)?
>>
>> There are some really nice books that might help to refresh your knowledge:
>>
>> - "Modern C++ Design: Generic Programming and Design Patterns Applied" by
>>   Andrei Alexandrescu
>> - "Effective C++" by Scott Meyers
>> - "Effective STL" by Scott Meyers
>>
>> There are also some references on http://mlpack.org/gsoc.html
>>
>>> - Can you show me a working example of mlpack::ann::FFN compatible with
>>> the current upstream C++ API?
>>
>> Not sure what you mean, can you elaborate more?
>>
>> I hope this is helpful, let us know if you have any more questions.
>>
>> Thanks,
>> Marcus
>>
>>> On 5 Mar 2017, at 10:12, Сидоров Константин <[email protected]> wrote:
>>>
>>> Hello Marcus!
>>> Right now, I'm thinking quite a lot about the "Augmented RNNs" project.
>>> For example, I'm trying to convert ideas from the project description into
>>> something more concrete and mlpack-specific. Here is my first shot at a
>>> C++ API for the NTM.
>>> Disclaimer: I haven't written in C++ for ~10 months. My last experience of
>>> C++ coding is our (Russian) national programming olympiad (April '16),
>>> after which I've been coding almost exclusively in Python.
>>> Since we have to make several different augmented RNN models, I think it
>>> is a good idea to make a namespace mlpack::ann::augmented and a class
>>> BaseAugmented (will be useful for benchmarking later).
>>> Inside it, we can define class
>>> NTM<OutputLayerType, InitializationRuleType> : BaseAugmented with the
>>> standard mlpack interface:
>>> - template<typename NetworkType, typename Lambda> NTM(NetworkType
>>>   controller, arma::mat& memory, Lambda similarity)
>>> - Predict(arma::mat& predictors, arma::mat& responses)
>>> - Train(const arma::mat& predictors, const arma::mat& responses)
>>> For benchmarking, we can define class
>>> mlpack::ann::augmented::BaseBenchmark with the interface:
>>> - BaseBenchmark() = 0
>>> - Evaluate() = 0 (the method that will accept task parameters as arguments
>>>   and run the augmented model)
>>> As an example, an API of CopyBenchmark : BaseBenchmark:
>>> - CopyBenchmark()
>>> - Evaluate(BaseAugmented model, int maxLength = 5, int repeats = 1) //
>>>   repeats is a parameter to convert copy to the repeat-copy task.
>>> So, that's some kind of API. I would be interested in discussing and
>>> analyzing it with you.
>>>
>>> P.S. Some questions that arose while trying to get to grips with mlpack:
>>> - What resources can you recommend to brush up on *advanced/modern* C++
>>> (e.g., templates, && in functions)?
>>> - Can you show me a working example of mlpack::ann::FFN compatible with
>>> the current upstream C++ API?
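
As a rough C++ rendering of the benchmark interface proposed above (purely
hypothetical; the task parameters move into the constructor, since a
constructor cannot be pure virtual and a virtual Evaluate() needs a fixed
signature - this is also the inheritance-based variant that the rest of the
thread argues against):

Code:
=====

#include <mlpack/core.hpp>

// Models are passed to tasks through a common base class, which forces
// virtual dispatch on every Train()/Predict() call.
class BaseAugmented
{
 public:
  virtual ~BaseAugmented() { }
  virtual void Train(const arma::mat& predictors,
                     const arma::mat& responses) = 0;
  virtual void Predict(arma::mat& predictors, arma::mat& responses) = 0;
};

class BaseBenchmark
{
 public:
  virtual ~BaseBenchmark() { }
  virtual double Evaluate(BaseAugmented& model) = 0;
};

class CopyBenchmark : public BaseBenchmark
{
 public:
  // repeats is the parameter that converts copy to the repeat-copy task.
  CopyBenchmark(const int maxLength = 5, const int repeats = 1) :
      maxLength(maxLength), repeats(repeats) { }

  double Evaluate(BaseAugmented& model) override
  {
    // Generate the task data, call model.Train() and model.Predict(),
    // and return a score.
    return 0.0;
  }

 private:
  int maxLength;
  int repeats;
};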

>>> 28.02.2017, 16:43, "Marcus Edel" <[email protected]>:
>>>> Hello Konstantin,
>>>>
>>>>> My name is Konstantin Sidorov, and I am an undergraduate student at
>>>>> Astrakhan State University (Russia). I'm glad to know that mlpack was
>>>>> accepted in GSoC'17 - as a side note, congratulations :)
>>>>
>>>> thanks and welcome!
>>>>
>>>>> I'm already fairly familiar with deep learning. For example, recently I
>>>>> implemented optimality tightening from "Learning to Play in a Day"
>>>>> (https://arxiv.org/abs/1611.01606) for AgentNet ("Deep Reinforcement
>>>>> Learning library for humans",
>>>>> https://github.com/yandexdataschool/AgentNet).
>>>>
>>>> Sounds really interesting, the "Learning to Play in a Day" paper is on my
>>>> reading list, looks like I should move it up.
>>>>
>>>>> Of course, at such an early stage I have no detailed plan of what (and
>>>>> how) to do - only some ideas. In the beginning, for example, I'm
>>>>> planning to implement NTMs as described in the arXiv paper and implement
>>>>> *reusable* benchmarking code (e.g., copy, repeat copy, n-grams). I would
>>>>> like to discuss this project more thoroughly if possible. In addition,
>>>>> this is my first participation in GSoC. So, excuse me in advance if I've
>>>>> done something inappropriate.
>>>>
>>>> Implementing the NTM tasks from the paper, so that they can be used for
>>>> other models as well, is a great idea. In fact, a lot of other papers at
>>>> least reuse the copy task. There are a bunch of other interesting tasks
>>>> that could be implemented, like the MNIST pen stroke classification task
>>>> recently introduced by Edwin D. de Jong in his "Incremental Sequence
>>>> Learning" paper. The Stanford Natural Language Inference task proposed by
>>>> Samuel R. Bowman et al. in "A large annotated corpus for learning natural
>>>> language inference" can also be transformed into a long-term dependency
>>>> task, which might be interesting.
>>>>
>>>> Regarding the project itself, take a look at other models as well;
>>>> depending on the model you choose, I think there is some time left for
>>>> another model. Also, about the implementation: mlpack's architecture is
>>>> kinda different to Theano's graph construction and compilation work, but
>>>> if you managed to work with Theano you shouldn't have a problem.
>>>>
>>>> If you like, we can discuss any details over the mailing list and
>>>> brainstorm some ideas, discuss an initial class design, etc.
>>>>
>>>> I hope this is helpful, let us know if you have any more questions.
>>>>
>>>> Thanks,
>>>> Marcus
>>>>
>>>>> On 28 Feb 2017, at 07:06, Сидоров Константин <[email protected]> wrote:
>>>>>
>>>>> Hello Marcus,
>>>>> My name is Konstantin Sidorov, and I am an undergraduate student at
>>>>> Astrakhan State University (Russia). I'm glad to know that mlpack was
>>>>> accepted in GSoC'17 - as a side note, congratulations :)
>>>>> I'm interested in working on the project "Augmented Recurrent Neural
>>>>> Networks". I'm already fairly familiar with deep learning. For example,
>>>>> recently I implemented optimality tightening from "Learning to Play in a
>>>>> Day" (https://arxiv.org/abs/1611.01606) for AgentNet ("Deep
>>>>> Reinforcement Learning library for humans",
>>>>> https://github.com/yandexdataschool/AgentNet). Here is the merged pull
>>>>> request: https://github.com/yandexdataschool/AgentNet/pull/88.
>>>>> As you see, I'm quite familiar with deep learning and Theano. Even
>>>>> though my main field of interest is RL, I would be very interested in
>>>>> doing something new - that is why I've chosen "Augmented RNNs".
>>>>> Of course, at such an early stage I have no detailed plan of what (and
>>>>> how) to do - only some ideas. In the beginning, for example, I'm
>>>>> planning to implement NTMs as described in the arXiv paper and implement
>>>>> *reusable* benchmarking code (e.g., copy, repeat copy, n-grams).
>>>>> I would like to discuss this project more thoroughly if possible. In
>>>>> addition, this is my first participation in GSoC. So, excuse me in
>>>>> advance if I've done something inappropriate.
>>>>> ---
>>>>> Best Regards,
>>>>> Konstantin.
>>>
>>> --
>>> Best Regards,
>>> Konstantin.
