Hello Konstantin,

> Of course, it's a good idea - I have implemented it. Updated gist:
> https://gist.github.com/partobs-mdp/1da986071430000c97a7021723cc90d5
> However, I don't completely understand what the Backward method returns. I
> have written my understanding of it in the code comments - but I strongly
> doubt that I'm right. Can you review that comment and explain what it
> returns?

The Forward function computes the output of the implemented module for a given input, using the current parameters. The Backward function computes the gradient of the error with respect to the module's own input - that gradient is what it "returns" (through its output argument), and it is what the layer below receives as its incoming error. The Gradient function computes the gradient of the error with respect to the module's own parameters; a bunch of modules don't have any parameters, so they simply don't implement this function.
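To make that more concrete, here is a minimal, untested sketch of the function set a module has to provide; I've simplified the signatures, so please take the real conventions from the existing layer implementations rather than from this sketch:

#include <armadillo>

// Rough sketch of a module, not tested - see the existing mlpack layers
// for the real signatures and conventions.
class ExampleModule
{
 public:
  // Forward: given the input x, compute the module's output f(x).
  void Forward(const arma::mat& input, arma::mat& output)
  {
    output = input; // Identity mapping, as a placeholder.
  }

  // Backward: given the input and the error dE/df coming from the layer
  // above (gy), compute the error with respect to this module's own
  // input, dE/dx, and store it in g - that is what Backward "returns",
  // and it becomes the gy of the layer below.
  void Backward(const arma::mat& input, const arma::mat& gy, arma::mat& g)
  {
    g = gy; // Identity: df/dx = 1, so the error passes through unchanged.
  }

  // Gradient: compute dE/dW for the module's own parameters. This module
  // has no parameters, so it simply omits the function.
};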
Also, I noticed you described the REINFORCE version of HAM; I think it would be easier to go with the fully differentiable version of HAM, where the stochastic attention decisions are replaced by soft weights so the whole model can be trained with plain backpropagation - what do you think? I've implemented the model from the "Recurrent Models of Visual Attention" paper, which also uses the REINFORCE method. I'll have to update the code slightly, but if you do opt for the REINFORCE HAM, it might help to take a look at what I did.

> I looked in the core/tree directory
> (https://github.com/mlpack/mlpack/tree/master/src/mlpack/core/tree). There
> are some tree structures, but I can't see a good way to adapt them to the
> HAM architecture - but even if there is, I think it will still be more
> convenient to implement the gist from my previous letter. The reason is that
> the tree structures from core/tree were created with different problems in
> mind and maintain problem-specific information, which is definitely not
> going to help because it will cost both execution and developer time. Maybe
> I was just unable to find the right tree structure - if so, give me a link
> to it, because (as you mentioned) having a ready and tested implementation
> is helpful.

I agree; I'll see if I have some time to play with some ideas that use the existing tree structures and let you know what I find out. But as I said, I'm not against implementing our own structure; on the other hand, it would make sense to reuse some code if it turns out to be a good idea.
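Also, just to make sure we're talking about the same structure: here's a rough, untested sketch of how I read the vector-based tree idea from your gist (the names here are mine, not taken from your gist) - a complete binary tree over the fixed-size memory, stored in a flat std::vector, with size_t indices instead of node pointers:

#include <cstddef>
#include <vector>

// Implicit complete binary tree: node i has children 2 * i + 1 and
// 2 * i + 2, so no node pointers are needed. Assumes leafCount is a
// power of two (pad the memory up to one if needed), matching HAM's
// fixed memory size. Untested sketch.
template<typename T>
class FixedBinaryTree
{
 public:
  // The number of leaves (the memory size) is fixed up front.
  explicit FixedBinaryTree(const size_t leafCount) :
      leafCount(leafCount),
      nodes(2 * leafCount - 1)
  { }

  // Index arithmetic replaces child/parent pointers.
  size_t Left(const size_t i) const { return 2 * i + 1; }
  size_t Right(const size_t i) const { return 2 * i + 2; }
  size_t Parent(const size_t i) const { return (i - 1) / 2; }
  bool IsLeaf(const size_t i) const { return Left(i) >= nodes.size(); }

  // Node values; the leaves live at indices leafCount - 1 and upwards.
  T& operator[](const size_t i) { return nodes[i]; }

 private:
  size_t leafCount;
  std::vector<T> nodes;
};

If that matches your gist, I guess the HAM operations would simply walk these indices from the root down to a leaf and back up.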
Let me know what you think.

Thanks,
Marcus

> On 14 Mar 2017, at 18:23, Сидоров Константин <[email protected]> wrote:
>
> Hello Marcus,
>
>> The addition is simple: all we have to do is provide a minimal function
>> set - Forward, Backward, and Gradient; for more information, take a look
>> at the existing layers. What do you think about that? I think this would
>> allow us to use a bunch of already existing features.
>
> Of course, it's a good idea - I have implemented it. Updated gist:
> https://gist.github.com/partobs-mdp/1da986071430000c97a7021723cc90d5
> However, I don't completely understand what the Backward method returns. I
> have written my understanding of it in the code comments - but I strongly
> doubt that I'm right. Can you review that comment and explain what it
> returns?
>
> That's right, the HAM module uses a fixed memory size, and the gist looks
> clean and minimal; however, I would also take a look at the existing mlpack
> implementations. If we implement some tree that we use for the HAM model, we
> have to make sure it's well tested, and that sometimes takes more time than
> the actual implementation; if we can reuse some existing code, it's already
> tested. But as I said, if it turns out that implementing a specific
> structure is the better way, we can do that. There's no need to force code
> into a use it wasn't designed for.
>
> I looked in the core/tree directory
> (https://github.com/mlpack/mlpack/tree/master/src/mlpack/core/tree). There
> are some tree structures, but I can't see a good way to adapt them to the
> HAM architecture - but even if there is, I think it will still be more
> convenient to implement the gist from my previous letter. The reason is that
> the tree structures from core/tree were created with different problems in
> mind and maintain problem-specific information, which is definitely not
> going to help because it will cost both execution and developer time. Maybe
> I was just unable to find the right tree structure - if so, give me a link
> to it, because (as you mentioned) having a ready and tested implementation
> is helpful.
>
> --
> Best Regards,
> Konstantin.
>
> 13.03.2017, 18:25, "Marcus Edel" <[email protected]>:
>> Hello Konstantin,
>>
>>> gist @ GitHub is a great idea, so the API (with changes in task
>>> definitions) is now here:
>>> https://gist.github.com/partobs-mdp/1da986071430000c97a7021723cc90d5
>>
>> Great, now if you modify the gist it automatically creates a new revision,
>> and we can switch between revisions.
>>
>>> Regarding the NTM and HAM API, I was thinking that it might be a good
>>> idea to implement the models as layers; that way someone could reuse the
>>> implementation inside other architectures.
>>> Sorry, I didn't quite understand your idea - can you restate it?
>>
>> The HAM module was designed with the intention to be generic enough that
>> it can be used as a building block of larger neural architectures. So, for
>> example, I have this recurrent neural network:
>>
>> RNN<MeanSquaredError<> > model(rho);
>> model.Add<IdentityLayer<> >();
>> model.Add<Linear<> >(inputSize, 20);
>> model.Add<LSTM<> >(20, 7, rho);
>> model.Add<Linear<> >(7, outputSize);
>> model.Add<SigmoidLayer<> >();
>>
>> Instead of using the LSTM, I'd like to use the HAM module or the NTM. It
>> would be great if we could design the classes so that they can be
>> integrated into the current infrastructure. The addition is simple: all we
>> have to do is provide a minimal function set - Forward, Backward, and
>> Gradient; for more information, take a look at the existing layers. What
>> do you think about that? I think this would allow us to use a bunch of
>> already existing features.
>>
>>> Also, I haven't completely thought this through, but maybe it's possible
>>> to use the existing binary or decision tree for the HAM model; it would
>>> save us a lot of work if that's manageable. In the HAM model, memory has
>>> a fixed size. In this situation, as a competitive programmer, I would
>>> write the code using std::vector as storage for the data and int (size_t,
>>> actually) instead of node pointers. Gist showcasing this idea:
>>> https://gist.github.com/partobs-mdp/411df153a6067008d27c255ebd0fb0cb. As
>>> you can see, the code is rather small (without STL iterator support,
>>> though) and quite easy to implement quickly and properly (the kind of
>>> thing that is all-important in such competitions).
>>
>> That's right, the HAM module uses a fixed memory size, and the gist looks
>> clean and minimal; however, I would also take a look at the existing
>> mlpack implementations.
>> If we implement some tree that we use for the HAM model, we have to make
>> sure it's well tested, and that sometimes takes more time than the actual
>> implementation; if we can reuse some existing code, it's already tested.
>> But as I said, if it turns out that implementing a specific structure is
>> the better way, we can do that. There's no need to force code into a use
>> it wasn't designed for.
>>
>> Let us know what you think.
>>
>> Best,
>> Marcus
>>
>>> On 10 Mar 2017, at 18:54, Сидоров Константин <[email protected]> wrote:
>>>
>>> Hello Marcus,
>>>
>>> gist @ GitHub is a great idea, so the API (with changes in task
>>> definitions) is now here:
>>> https://gist.github.com/partobs-mdp/1da986071430000c97a7021723cc90d5
>>>
>>> Regarding the NTM and HAM API, I was thinking that it might be a good
>>> idea to implement the models as layers; that way someone could reuse the
>>> implementation inside other architectures.
>>> Sorry, I didn't quite understand your idea - can you restate it?
>>>
>>> Also, I haven't completely thought this through, but maybe it's possible
>>> to use the existing binary or decision tree for the HAM model; it would
>>> save us a lot of work if that's manageable.
>>> In the HAM model, memory has a fixed size. In this situation, as a
>>> competitive programmer, I would write the code using std::vector as
>>> storage for the data and int (size_t, actually) instead of node pointers.
>>> Gist showcasing this idea:
>>> https://gist.github.com/partobs-mdp/411df153a6067008d27c255ebd0fb0cb. As
>>> you can see, the code is rather small (without STL iterator support,
>>> though) and quite easy to implement quickly and properly (the kind of
>>> thing that is all-important in such competitions).
>>>
>>> Right now, since I have a detailed version of the API, I have an idea to
>>> switch to writing the application. Do you mind if I send the draft of my
>>> application for review?
>>>
>>> --
>>> Thanks in advance,
>>> Konstantin
>>>
>>> 10.03.2017, 18:39, "Marcus Edel" <[email protected]>:
>>>> Hello Konstantin,
>>>>
>>>> you put some really good thoughts into the first draft; do you think
>>>> switching to a GitHub gist or something else would make it easier to
>>>> share the code?
>>>>
>>>> About the tasks, I think we should put all task-specific parameters into
>>>> the constructor instead of into the Evaluate function. If we provide a
>>>> unified function like Evaluate for each task, it's probably easier to
>>>> use. Besides that, the task API looks really clean.
>>>>
>>>> Regarding the NTM and HAM API, I was thinking that it might be a good
>>>> idea to implement the models as layers; that way someone could reuse the
>>>> implementation inside other architectures. I like the idea of
>>>> implementing Highway Networks, but instead of implementing that idea
>>>> first, I would recommend implementing it if there is time left and
>>>> proposing it as an improvement. Also, I haven't completely thought this
>>>> through, but maybe it's possible to use the existing binary or decision
>>>> tree for the HAM model; it would save us a lot of work if that's
>>>> manageable.
>>>>
>>>> Let me know what you think.
>>>>
>>>> Thanks,
>>>> Marcus
>>>>
>>>>> On 9 Mar 2017, at 12:09, Сидоров Константин <[email protected]> wrote:
>>>>>
>>>>> Hello Marcus,
>>>>>
>>>>> As a continuation of the API discussion, I've put together a C++ header
>>>>> file with the ideas we've previously discussed, plus the ideas you
>>>>> pointed out in your last letter; I attach it to this letter. Speaking
>>>>> about non-NTM augmented models, I'm inclined towards choosing
>>>>> Hierarchical Attentive Memory - because I have some experience in
>>>>> reinforcement learning, and this idea by itself seems very interesting.
>>>>> However, I have the feeling that I've somewhat messed up the HAM code
>>>>> (partly because I haven't quite understood the paper yet, partly
>>>>> because I'm having a hard time getting back into the C++ world). For
>>>>> this reason, can you review the API / C++ declaration file sketch?
>>>>>
>>>>> --
>>>>> Thanks in advance,
>>>>> Konstantin
>>>>>
>>>>> 08.03.2017, 16:57, "Marcus Edel" <[email protected]>:
>>>>>> Hello Konstantin,
>>>>>>
>>>>>>> Something like this:
>>>>>>> https://thesundayprogrammer.wordpress.com/2016/01/27/neural-networks-using-mlpack/,
>>>>>>> but written for the up-to-date mlpack version rather than the version
>>>>>>> from last year. Of course, I tried hard to google it, but failed :)
>>>>>>
>>>>>> I'm working on a tutorial; for now, you could take a look at the unit
>>>>>> test cases, where different architectures are tested on simple
>>>>>> problems:
>>>>>>
>>>>>> https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/feedforward_network_test.cpp
>>>>>> https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/recurrent_network_test.cpp
>>>>>> https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/convolutional_network_test.cpp
>>>>>>
>>>>>> Let me know if I should translate the model used in the blog post.
>>>>>>
>>>>>>> I think it's time to sum up the discussed API: we create a namespace
>>>>>>> mlpack::ann::augmented and do all the work in it. For instance, we
>>>>>>> create an NTM class with Train() / Evaluate() methods (standard for
>>>>>>> mlpack). Also, we create classes for evaluating models (for instance,
>>>>>>> CopyTask). As an idea: what do you think about creating yet another
>>>>>>> namespace for such tasks (so that, for instance, the full class name
>>>>>>> would be mlpack::ann::augmented::tasks::CopyTask)? Earlier I offered
>>>>>>> to inherit them from BaseBenchmark, but now I see that's not the only
>>>>>>> way. Right now, I'm more inclined to think that not using inheritance
>>>>>>> is better, because of that argument about virtual functions. What do
>>>>>>> you think?
>>>>>>
>>>>>> I agree, using a subnamespace of augmented is a good idea; it makes
>>>>>> things clearer. Also, I think we can use another template parameter
>>>>>> for, e.g., the controller; that probably makes it easier to switch
>>>>>> between a feed-forward and an LSTM controller. There are some other
>>>>>> places where we could use templates to generalize the API.
>>>>>>
>>>>>>> I'm interested in continuing the discussion and getting deeper into
>>>>>>> that project. By the way, sorry for the delay with the first API
>>>>>>> idea. The reason is that I'm a freshman student, and for this reason
>>>>>>> my university studies often get quite stressful. However, right now
>>>>>>> I'm trying to submit all of my coursework and finish the semester
>>>>>>> early - with the goal of fully transitioning to GSoC work :)
>>>>>>
>>>>>> No worries, take all the time you need for your studies; there is
>>>>>> plenty of time in which we can discuss ideas. Just one note and
>>>>>> recommendation: take a look at other augmented memory models as well,
>>>>>> because the NTM alone is probably not enough work for the GSoC.
>>>>>>
>>>>>> Thanks,
>>>>>> Marcus
>>>>>>
>>>>>>> On 7 Mar 2017, at 16:47, Сидоров Константин <[email protected]> wrote:
>>>>>>>
>>>>>>> Hello Marcus,
>>>>>>>
>>>>>>> Regarding the BaseAugmented class, we should avoid inheritance (or,
>>>>>>> at least, virtual inheritance) in deference to templates. This is
>>>>>>> because virtual functions incur runtime overhead. In this case I
>>>>>>> don't see a compelling reason to introduce a BaseAugmented class;
>>>>>>> maybe I missed something? We could do:
>>>>>>>
>>>>>>> Code:
>>>>>>> =====
>>>>>>>
>>>>>>> class CopyTask
>>>>>>> {
>>>>>>>   template<typename ModelType>
>>>>>>>   void Evaluate(ModelType& model)
>>>>>>>   {
>>>>>>>     // Call model.Train() using the generated copy task data.
>>>>>>>     // Call model.Evaluate().
>>>>>>>   }
>>>>>>> };
>>>>>>>
>>>>>>> Usage:
>>>>>>> ======
>>>>>>>
>>>>>>> NTM ntm;
>>>>>>> CopyTask task;
>>>>>>> task.Evaluate(ntm);
>>>>>>>
>>>>>>> What do you think about the design?
>>>>>>>
>>>>>>> This is definitely an excellent idea :) Actually, this shows that I
>>>>>>> have forgotten most of my C++ - here, I forgot that after using the
>>>>>>> template keyword, I can call any functions I like (just like in
>>>>>>> Python, actually). So, it's about time to refresh my C++ - and thanks
>>>>>>> for your advice on resources, since my C++ experience was biased
>>>>>>> towards competitive programming, which mostly ignores these language
>>>>>>> features.
>>>>>>>
>>>>>>> - Can you show me a working example of mlpack::ann::FFN compatible
>>>>>>> with the current upstream C++ API?
>>>>>>>
>>>>>>> Not sure what you mean, can you elaborate more?
>>>>>>>
>>>>>>> Something like this:
>>>>>>> https://thesundayprogrammer.wordpress.com/2016/01/27/neural-networks-using-mlpack/,
>>>>>>> but written for the up-to-date mlpack version rather than the version
>>>>>>> from last year. Of course, I tried hard to google it, but failed :)
>>>>>>>
>>>>>>> I think it's time to sum up the discussed API: we create a namespace
>>>>>>> mlpack::ann::augmented and do all the work in it. For instance, we
>>>>>>> create an NTM class with Train() / Evaluate() methods (standard for
>>>>>>> mlpack). Also, we create classes for evaluating models (for instance,
>>>>>>> CopyTask).
>>>>>>> As an idea: what do you think about creating yet another namespace
>>>>>>> for such tasks (so that, for instance, the full class name would be
>>>>>>> mlpack::ann::augmented::tasks::CopyTask)? Earlier I offered to
>>>>>>> inherit them from BaseBenchmark, but now I see that's not the only
>>>>>>> way.
>>>>>>> Right now, I'm more inclined to think that not using inheritance is
>>>>>>> better, because of that argument about virtual functions. What do you
>>>>>>> think?
>>>>>>>
>>>>>>> I'm interested in continuing the discussion and getting deeper into
>>>>>>> that project. By the way, sorry for the delay with the first API
>>>>>>> idea. The reason is that I'm a freshman student, and for this reason
>>>>>>> my university studies often get quite stressful. However, right now
>>>>>>> I'm trying to submit all of my coursework and finish the semester
>>>>>>> early - with the goal of fully transitioning to GSoC work :)
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards,
>>>>>>> Konstantin.
>>>>>>>
>>>>>>> 07.03.2017, 17:54, "Marcus Edel" <[email protected]>:
>>>>>>>> Hello Konstantin,
>>>>>>>>
>>>>>>>> thanks for getting back.
>>>>>>>>
>>>>>>>>> Since we have to make several different augmented RNN models, I
>>>>>>>>> think it is a good idea to make a namespace mlpack::ann::augmented
>>>>>>>>> and a class BaseAugmented (it will be useful for benchmarking
>>>>>>>>> later).
>>>>>>>>
>>>>>>>> Putting the different networks under a unified namespace is a good
>>>>>>>> idea. Regarding the BaseAugmented class, we should avoid inheritance
>>>>>>>> (or, at least, virtual inheritance) in deference to templates. This
>>>>>>>> is because virtual functions incur runtime overhead. In this case I
>>>>>>>> don't see a compelling reason to introduce a BaseAugmented class;
>>>>>>>> maybe I missed something? We could do:
>>>>>>>>
>>>>>>>> Code:
>>>>>>>> =====
>>>>>>>>
>>>>>>>> class CopyTask
>>>>>>>> {
>>>>>>>>   template<typename ModelType>
>>>>>>>>   void Evaluate(ModelType& model)
>>>>>>>>   {
>>>>>>>>     // Call model.Train() using the generated copy task data.
>>>>>>>>     // Call model.Evaluate().
>>>>>>>>   }
>>>>>>>> };
>>>>>>>>
>>>>>>>> Usage:
>>>>>>>> ======
>>>>>>>>
>>>>>>>> NTM ntm;
>>>>>>>> CopyTask task;
>>>>>>>> task.Evaluate(ntm);
>>>>>>>>
>>>>>>>> What do you think about the design?
>>>>>>>>
>>>>>>>>> P.S. Some questions that arose while trying to get to grips with
>>>>>>>>> mlpack:
>>>>>>>>> - What resources can you advise for brushing up on
>>>>>>>>> *advanced/modern* C++ (e.g., templates, && in functions)?
>>>>>>>>
>>>>>>>> There are some really nice books that might help to refresh your
>>>>>>>> knowledge:
>>>>>>>>
>>>>>>>> - "Modern C++ Design: Generic Programming and Design Patterns
>>>>>>>> Applied" by Andrei Alexandrescu
>>>>>>>> - "Effective C++" by Scott Meyers
>>>>>>>> - "Effective STL" by Scott Meyers
>>>>>>>>
>>>>>>>> There are also some references on http://mlpack.org/gsoc.html
>>>>>>>>
>>>>>>>>> - Can you show me a working example of mlpack::ann::FFN compatible
>>>>>>>>> with the current upstream C++ API?
>>>>>>>>
>>>>>>>> Not sure what you mean, can you elaborate more?
>>>>>>>>
>>>>>>>> I hope this is helpful; let us know if you have any more questions.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Marcus
>>>>>>>>
>>>>>>>>> On 5 Mar 2017, at 10:12, Сидоров Константин <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Hello Marcus!
>>>>>>>>>
>>>>>>>>> Right now, I'm thinking quite a lot about the "Augmented RNNs"
>>>>>>>>> project. For example, I'm trying to convert the ideas from the
>>>>>>>>> project description into something more concrete and
>>>>>>>>> mlpack-specific. Here is my first shot at a C++ API for the NTM.
>>>>>>>>> Disclaimer: I haven't written C++ for ~10 months.
>>>>>>>>> My last experience of C++ coding was our (Russian) national
>>>>>>>>> programming olympiad (April '16), after which I've been coding
>>>>>>>>> almost exclusively in Python.
>>>>>>>>>
>>>>>>>>> Since we have to make several different augmented RNN models, I
>>>>>>>>> think it is a good idea to make a namespace mlpack::ann::augmented
>>>>>>>>> and a class BaseAugmented (it will be useful for benchmarking
>>>>>>>>> later).
>>>>>>>>> Inside it, we can define class NTM<OutputLayerType,
>>>>>>>>> InitializationRuleType> : BaseAugmented with the standard mlpack
>>>>>>>>> interface:
>>>>>>>>> - template<typename NetworkType, typename Lambda> NTM(NetworkType
>>>>>>>>> controller, arma::mat& memory, Lambda similarity)
>>>>>>>>> - Predict(arma::mat& predictors, arma::mat& responses)
>>>>>>>>> - Train(const arma::mat& predictors, const arma::mat& responses)
>>>>>>>>> For benchmarking, we can define class
>>>>>>>>> mlpack::ann::augmented::BaseBenchmark with the interface:
>>>>>>>>> - BaseBenchmark() = 0
>>>>>>>>> - Evaluate() = 0 (the method that will accept task parameters as
>>>>>>>>> arguments and run the augmented model)
>>>>>>>>> As an example, an API of CopyBenchmark : BaseBenchmark:
>>>>>>>>> - CopyBenchmark()
>>>>>>>>> - Evaluate(BaseAugmented model, int maxLength = 5, int repeats = 1)
>>>>>>>>> // repeats is a parameter to convert the copy task into the
>>>>>>>>> repeat-copy task.
>>>>>>>>> So, that's some kind of API. I would be interested to discuss and
>>>>>>>>> analyze it with you.
>>>>>>>>>
>>>>>>>>> P.S. Some questions that arose while trying to get to grips with
>>>>>>>>> mlpack:
>>>>>>>>> - What resources can you advise for brushing up on
>>>>>>>>> *advanced/modern* C++ (e.g., templates, && in functions)?
>>>>>>>>> - Can you show me a working example of mlpack::ann::FFN compatible
>>>>>>>>> with the current upstream C++ API?
>>>>>>>>>
>>>>>>>>> 28.02.2017, 16:43, "Marcus Edel" <[email protected]>:
>>>>>>>>>> Hello Konstantin,
>>>>>>>>>>
>>>>>>>>>>> My name is Konstantin Sidorov, and I am an undergraduate student
>>>>>>>>>>> at Astrakhan State University (Russia). I'm glad to know that
>>>>>>>>>>> mlpack was accepted into GSoC'17 - as a side note,
>>>>>>>>>>> congratulations :)
>>>>>>>>>>
>>>>>>>>>> thanks and welcome!
>>>>>>>>>>
>>>>>>>>>>> I'm already fairly familiar with deep learning. For example,
>>>>>>>>>>> recently I implemented optimality tightening from "Learning to
>>>>>>>>>>> Play in a Day" (https://arxiv.org/abs/1611.01606) for AgentNet
>>>>>>>>>>> ("Deep Reinforcement Learning library for humans",
>>>>>>>>>>> https://github.com/yandexdataschool/AgentNet).
>>>>>>>>>>
>>>>>>>>>> Sounds really interesting; the "Learning to Play in a Day" paper
>>>>>>>>>> is on my reading list - looks like I should move it up.
>>>>>>>>>>
>>>>>>>>>>> Of course, at such an early stage I have no detailed plan of what
>>>>>>>>>>> (and how) to do - only some ideas. In the beginning, for example,
>>>>>>>>>>> I'm planning to implement NTMs as described in the arXiv paper
>>>>>>>>>>> and to implement *reusable* benchmarking code (e.g., copy, repeat
>>>>>>>>>>> copy, n-grams). I would like to discuss this project more
>>>>>>>>>>> thoroughly if possible. In addition, this is my first
>>>>>>>>>>> participation in GSoC, so excuse me in advance if I've done
>>>>>>>>>>> something inappropriate.
>>>>>>>>>>
>>>>>>>>>> Implementing the NTM tasks from the paper so that they can be used
>>>>>>>>>> for other models as well is a great idea. In fact, you see a lot
>>>>>>>>>> of other papers that at least reuse the copy task. There are a
>>>>>>>>>> bunch of other interesting tasks that could be implemented, like
>>>>>>>>>> the MNIST pen stroke classification task recently introduced by
>>>>>>>>>> Edwin D. de Jong in his "Incremental Sequence Learning" paper. The
>>>>>>>>>> Stanford Natural Language Inference task proposed by Samuel R.
>>>>>>>>>> Bowman et al. in "A large annotated corpus for learning natural
>>>>>>>>>> language inference" can also be transformed into a long-term
>>>>>>>>>> dependency task, which might be interesting.
>>>>>>>>>>
>>>>>>>>>> Regarding the project itself, take a look at other models as well;
>>>>>>>>>> depending on the model you choose, I think there is some time left
>>>>>>>>>> for another model. Also, about the implementation: mlpack's
>>>>>>>>>> architecture is kinda different from Theano's graph construction
>>>>>>>>>> and compilation work, but if you managed to work with Theano, you
>>>>>>>>>> shouldn't have a problem.
>>>>>>>>>>
>>>>>>>>>> If you like, we can discuss any details over the mailing list,
>>>>>>>>>> brainstorm some ideas, discuss an initial class design, etc.
>>>>>>>>>>
>>>>>>>>>> I hope this is helpful; let us know if you have any more
>>>>>>>>>> questions.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Marcus
>>>>>>>>>>
>>>>>>>>>>> On 28 Feb 2017, at 07:06, Сидоров Константин <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hello Marcus,
>>>>>>>>>>>
>>>>>>>>>>> My name is Konstantin Sidorov, and I am an undergraduate student
>>>>>>>>>>> at Astrakhan State University (Russia). I'm glad to know that
>>>>>>>>>>> mlpack was accepted into GSoC'17 - as a side note,
>>>>>>>>>>> congratulations :)
>>>>>>>>>>> I'm interested in working on the project "Augmented Recurrent
>>>>>>>>>>> Neural Networks". I'm already fairly familiar with deep learning.
>>>>>>>>>>> For example, recently I implemented optimality tightening from
>>>>>>>>>>> "Learning to Play in a Day" (https://arxiv.org/abs/1611.01606)
>>>>>>>>>>> for AgentNet ("Deep Reinforcement Learning library for humans",
>>>>>>>>>>> https://github.com/yandexdataschool/AgentNet). Here is the merged
>>>>>>>>>>> pull request:
>>>>>>>>>>> https://github.com/yandexdataschool/AgentNet/pull/88.
>>>>>>>>>>> As you can see, I'm quite familiar with deep learning and Theano.
>>>>>>>>>>> Even though my main field of interest is RL, I would be very
>>>>>>>>>>> interested in doing something new - that is why I've chosen
>>>>>>>>>>> "Augmented RNNs".
>>>>>>>>>>> Of course, at such an early stage I have no detailed plan of what
>>>>>>>>>>> (and how) to do - only some ideas. In the beginning, for example,
>>>>>>>>>>> I'm planning to implement NTMs as described in the arXiv paper
>>>>>>>>>>> and to implement *reusable* benchmarking code (e.g., copy, repeat
>>>>>>>>>>> copy, n-grams).
>>>>>>>>>>> I would like to discuss this project more thoroughly if possible.
>>>>>>>>>>> In addition, this is my first participation in GSoC, so excuse me
>>>>>>>>>>> in advance if I've done something inappropriate.
>>>>>>>>>>> ---
>>>>>>>>>>> Best Regards,
>>>>>>>>>>> Konstantin.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best Regards,
>>>>>>>>> Konstantin.
>>>>>
>>>>> --
>>>>> Сидоров Константин
>>>>> Phone: +7 917 095-31-27
>>>>> E-mail: [email protected]
>>>>>
>>>>> <api.hpp>
>>>
>>> --
>>> Сидоров Константин
>>> Phone: +7 917 095-31-27
>>> E-mail: [email protected]
>
> --
> Сидоров Константин
> Phone: +7 917 095-31-27
> E-mail: [email protected]
_______________________________________________
mlpack mailing list
[email protected]
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
