Re: [mlpack] GSoC proposal improvements and tips

2022-05-29 Thread Ryan Curtin
On Sun, May 29, 2022 at 12:54:22AM +0530, Suvarsha Chennareddy wrote:
> Hello everyone,
> 
> First of all, I would like to thank those who took the time to
> review my 2022 GSoC proposal before I submitted it. Unfortunately, it
> was rejected. However, I understand GSoC is usually very competitive.
> I would like to know how I could improve for my next attempt. If possible,
> I would also like to know if there was a particular reason (or reasons) why
> my proposal was rejected.
> Thank you for taking the time to review my proposal and this email.

Hey Suvarsha,

I'll respond off-list in a moment.

Ryan

-- 
Ryan Curtin| "Get out of the way!"
r...@ratml.org |   - Train Engineer Assistant
___
mlpack mailing list
mlpack@lists.mlpack.org
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack


Re: [mlpack] GSoC Proposal Discussion: DAG class & InceptionV3 model

2022-03-23 Thread Ryan Curtin
Sorry for the slow response on this one.  The general idea of the MATLAB
DAGNetwork you linked to is the same basic idea, although I wouldn't
suggest taking design and API ideas from MATLAB. :)

In general I do think that what you proposed is sufficient for a
large-sized project, but I say that because I am thinking about all the
complexities of making an easy-to-use, well-documented `DAGNetwork`
class (or whatever we choose to call it... `DAGN`, like `FFN` and
`RNN`?).  That will take quite a bit of additional time beyond just
implementing the core support.

When you put together a proposal, it's probably worth thinking through
those details: what methods will be necessary for a user to do the kinds
of tasks you expect them to do?  How can we make the API nice?  What
kinds of documentation should we provide and how should we provide it?
How should we structure the code to maximize reuse?  Those are some
questions I would be thinking about if I were putting together a
proposal like this.  (Of course, there are more that could be asked!
That's just a few.)
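To make that concrete, here is one very rough sketch of what such a class could look like.  This is not a design proposal, just an illustration: the names (`DAGNetwork`, `Add()`, `AddEdge()`, `Forward()`) are hypothetical, scalar `double`s stand in for real layer types and matrices, the outputs of multiple parents are simply summed, and a single sink node is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Minimal sketch of a hypothetical DAG network container.  Each "layer"
// here is just a function on doubles; in a real implementation it would
// be an actual layer type, and inputs from multiple parents might be
// concatenated rather than summed.
class DAGNetwork
{
 public:
  // Add a node (layer); returns its id.
  size_t Add(std::function<double(double)> layer)
  {
    layers.push_back(std::move(layer));
    parents.emplace_back();
    children.emplace_back();
    return layers.size() - 1;
  }

  // Connect the output of `from` to the input of `to`.
  void AddEdge(size_t from, size_t to)
  {
    children[from].push_back(to);
    parents[to].push_back(from);
  }

  // Forward pass: process nodes in topological order (Kahn's algorithm),
  // summing the outputs of all parents as each node's input.
  double Forward(double input)
  {
    std::vector<size_t> indegree(layers.size());
    std::vector<double> activation(layers.size(), 0.0);
    std::queue<size_t> ready;
    for (size_t i = 0; i < layers.size(); ++i)
    {
      indegree[i] = parents[i].size();
      if (indegree[i] == 0)
      {
        activation[i] = input;  // Source nodes receive the network input.
        ready.push(i);
      }
    }

    double output = 0.0;
    while (!ready.empty())
    {
      const size_t node = ready.front();
      ready.pop();
      const double out = layers[node](activation[node]);
      if (children[node].empty())
        output = out;  // Simplification: assume a single sink node.
      for (const size_t child : children[node])
      {
        activation[child] += out;
        if (--indegree[child] == 0)
          ready.push(child);
      }
    }
    return output;
  }

 private:
  std::vector<std::function<double(double)>> layers;
  std::vector<std::vector<size_t>> parents;
  std::vector<std::vector<size_t>> children;
};
```

With a diamond-shaped graph (one source `a` feeding two branches `b` and `c` that merge into `d`), `Forward()` visits the nodes in topological order and accumulates the branch outputs at the merge point, which is the basic behavior an Inception-style block needs.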

I'm doing my best to get the refactored RNN working and then we can
merge #2777.  I think it's pretty important for the success of this
project and other ones that we are working against master, not against
the `ann-vtable` branch.  So, I really hope to have that ready soon.  At
least personally my goal is just to get RNNs + LSTMs working, so that
the `lstm_stock_prediction` example works.  Then we can merge #2777 with
very limited layer support, and then slowly re-add layers.

That approach may also have ramifications for #2963, because I may not
yet have adapted layers that InceptionV3 depends on (I haven't looked at
that PR so I can't say).  In general adapting layers to the new approach
ranges from "takes 5 minutes" to "start by reading the paper..."
depending on the complexity of the layer.  So, that could be worth
thinking about too.

I hope this information is helpful!  I really want #2777 done soon, so
if my response seems a bit all-over-the-place, it's because I'm actually
thinking about what that bug with the backward passes might be. :)

On Fri, Mar 11, 2022 at 10:29:08PM -0500, Marcus Edel wrote:
> Hello Anwaar,
> 
> first of all, thanks again for all the great contributions over the last 
> months.
> I like the idea; as you already pointed out, Ryan is working on a major
> update, so I'll let him comment on the feasibility and how the approach
> fits into the updated architecture.
> 
> Thanks,
> Marcus
> 
> > On Mar 10, 2022, at 8:09 PM, Anwaar khalid  wrote:
> > 
> > Hello everyone,
> > 
> > I'm Anwaar Khalid, currently a 4th year dual degree CS student at IIITDM, 
> > India. I've been contributing to mlpack for quite some time now, 
> > particularly in the ANN codebase & I wish to spend the summer working with 
> > mlpack under GSoC-22. I hope to propose a potential idea for a large 
> > project (~350 hours) through this thread & get the community's feedback to 
> > help build my proposal.
> > 
> > I really like the idea of building a DAG class for the ANN module. I've 
> > been researching and found out that MATLAB also has a DAG framework 
> >  for 
> > building complex neural network architectures. I really like how they have 
> > approached this & I think we can build a similar interface for mlpack. In 
> > PR #2777, Ryan added a `MultiLayer 
> > `
> >  class which can be used as the vertices of this DAG network. And we can 
> > add an `AddEdge` module which will allow layers to have input from multiple 
> > layers and also redirect their output to multiple layers.
> > 
> > Once the DAG class is built, I can adapt my InceptionV3 layers PR 
> >  to use that class & then 
> > finally we can add the InceptionV3 model to the models repository. And if 
> > there's time left in the end, I can demonstrate the usage of InceptionV3 
> > model in the examples repo by solving a simple image classification task.
> > 
> > I wanted to get the community's opinions on these ideas. Is the number of
> > deliverables sufficient for a large-sized project? Looking forward to 
> > hearing back from the community :)
> > 
> > 
> > Best
> > Anwaar Khalid
> > Github Username: hello-fri-end
> 



-- 
Ryan Curtin| "Oh man, I shot Marvin in the face."
r...@ratml.org |   - Vincent


Re: [mlpack] GSoC Proposal Discussion: DAG class & InceptionV3 model

2022-03-11 Thread Marcus Edel
Hello Anwaar,

first of all, thanks again for all the great contributions over the last months.
I like the idea; as you already pointed out, Ryan is working on a major update,
so I'll let him comment on the feasibility and how the approach fits into the
updated architecture.

Thanks,
Marcus

> On Mar 10, 2022, at 8:09 PM, Anwaar khalid  wrote:
> 
> Hello everyone,
> 
> I'm Anwaar Khalid, currently a 4th year dual degree CS student at IIITDM, 
> India. I've been contributing to mlpack for quite some time now, particularly 
> in the ANN codebase & I wish to spend the summer working with mlpack under 
> GSoC-22. I hope to propose a potential idea for a large project (~350 hours) 
> through this thread & get the community's feedback to help build my proposal.
> 
> I really like the idea of building a DAG class for the ANN module. I've been 
> researching and found out that MATLAB also has a DAG framework 
>  for 
> building complex neural network architectures. I really like how they have 
> approached this & I think we can build a similar interface for mlpack. In PR 
> #2777, Ryan added a `MultiLayer 
> `
>  class which can be used as the vertices of this DAG network. And we can add 
> an `AddEdge` module which will allow layers to have input from multiple 
> layers and also redirect their output to multiple layers.
> 
> Once the DAG class is built, I can adapt my InceptionV3 layers PR 
>  to use that class & then finally 
> we can add the InceptionV3 model to the models repository. And if there's 
> time left in the end, I can demonstrate the usage of InceptionV3 model in the 
> examples repo by solving a simple image classification task.
> 
> I wanted to get the community's opinions on these ideas. Is the number of
> deliverables sufficient for a large-sized project? Looking forward to hearing 
> back from the community :)
> 
> 
> Best
> Anwaar Khalid
> Github Username: hello-fri-end



Re: [mlpack] Gsoc proposal discussion

2021-03-18 Thread Marcus Edel
Hello Abhinav,

the scope of the project looks reasonable to me; if you like, you can add one
or two layers to the proposal as potential work that can be done if there is
time left at the end.

Also in case you haven't seen it we have an application guide:

https://github.com/mlpack/mlpack/wiki/Google-Summer-of-Code-Application-Guide

that could be helpful. That said, once the GSoC application submission
platform is open, you can submit drafts, and we can, if time permits, provide
feedback.

Thanks,
Marcus

> On 17. Mar 2021, at 12:42, Abhinav Anand  wrote:
> 
> Hi, I am Abhinav from India. I have been contributing to mlpack for
> quite some time, and I am interested in applying for GSoC this year with
> mlpack. I have discussed my idea on Slack and received positive feedback
> with some useful suggestions. Keeping in mind this year's shortened GSoC
> commitment, I have reduced my proposal idea to the three layers below:
> 1. Upsample Layer
> 2. Group Normalization
> 3. Channel Shuffle
> If you believe that this might be too little work for GSoC, let me know. I
> have a couple more good layers that could be included in the proposal.
> I have attached my first draft of the proposal; please let me know what you
> think. Feel free to give any suggestions.
> 
> Best Regards,
> Abhinav 
> 



Re: [mlpack] GSOC proposal for Profiling for parallelism.

2020-03-16 Thread Ryan Curtin
On Sun, Mar 15, 2020 at 06:20:31PM +0530, Param Mirani wrote:
> Hello Marcus/Ryan,
> 
> I am Param Mirani and I want to contribute to mlpack by participating in
> GSoC 2020 through the Profiling for Parallelism project. I have made some
> pull requests that try to improve the performance of the algorithms on
> multi-core processors using OpenMP. I had an idea and wanted to know your
> view on it.

Hi Param,

Thanks for getting in touch and sorry that I still have not had time to
look into these PRs in detail.  I have some comments on these
algorithms:

> I wanted to work on the following algorithms in the 12-week program based on my
> knowledge of machine learning algorithms.
> 
> 1)KNN

Dual-tree (and single-tree) algorithms can actually be parallelized
pretty well via the use of OpenMP tasks.  There was some trickiness in
tuning it, and I never tried it on multiple different systems.  I did
this experiment back in 2015, but for some reason I never actually
opened a PR for it or merged it in.  It gave not-quite-perfect speedup,
but fairly close to it, with ~4-8 cores.  So the strategy here may not
be too hard.
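For the curious, the task-based strategy looks roughly like the sketch below.  This is a toy illustration, not the 2015 experiment code: the tree here is a stand-in for a real k-d tree, and the cutoff of 64 is an arbitrary placeholder for exactly the kind of tuning parameter mentioned above.  (Without `-fopenmp`, the pragmas are ignored and the code simply runs serially.)

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A toy binary tree node standing in for a k-d tree node.
struct Node
{
  double value = 0.0;   // Payload at this node.
  Node* left = nullptr;
  Node* right = nullptr;
  size_t count = 1;     // Number of nodes in this subtree.
};

// Build a balanced tree over values[lo, hi) (hypothetical demo helper).
// NOTE: the caller must reserve() the pool so node pointers are not
// invalidated as the pool grows.
Node* Build(std::vector<Node>& pool, const std::vector<double>& values,
            size_t lo, size_t hi)
{
  if (lo >= hi)
    return nullptr;
  const size_t mid = lo + (hi - lo) / 2;
  pool.emplace_back();
  Node* node = &pool.back();
  node->value = values[mid];
  node->left = Build(pool, values, lo, mid);
  node->right = Build(pool, values, mid + 1, hi);
  node->count = 1 + (node->left ? node->left->count : 0)
                  + (node->right ? node->right->count : 0);
  return node;
}

// Recursive traversal parallelized with OpenMP tasks.  A task is only
// spawned when the subtree is large enough to be worth the overhead.
double TreeSum(const Node* node)
{
  if (node == nullptr)
    return 0.0;

  double leftSum = 0.0, rightSum = 0.0;
  if (node->count > 64)  // Spawn tasks only for large subtrees.
  {
    #pragma omp task shared(leftSum)
    leftSum = TreeSum(node->left);
    #pragma omp task shared(rightSum)
    rightSum = TreeSum(node->right);
    #pragma omp taskwait
  }
  else  // Small subtree: plain serial recursion.
  {
    leftSum = TreeSum(node->left);
    rightSum = TreeSum(node->right);
  }
  return node->value + leftSum + rightSum;
}

// Start the task region once at the root, then let recursion spawn tasks.
double ParallelTreeSum(const Node* root)
{
  double result = 0.0;
  #pragma omp parallel
  #pragma omp single
  result = TreeSum(root);
  return result;
}
```

The important design choice is the subtree-size cutoff: spawning a task per node drowns the traversal in scheduling overhead, while too large a cutoff leaves cores idle, which is why this needs per-system tuning.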

> 2)Decision Trees

I bet tasks would work well for this also, probably more so than
low-level parallelization of building each individual node.  Note that
random forests actually should be parallelized already at the highest
level.

> 3)Perceptron

I wonder if the best thing to do here is refactor the Perceptron code to
use ensmallen optimizers; some of those are parallel.  Implementing
parallelized optimizers may also be useful.

> 4)Back-propagation

I'm not sure how useful OpenMP will be here.  Most of the heavy lifting
in training neural networks will already be in things like matrix
multiplication, etc., and since most people will be using OpenBLAS,
these will already be parallelized.

> I'd like to know if the timeline I've proposed is realistic, if all
> assumptions are correct, and if there are any more algorithms that you
> would like me to work on while working on this project.

Yes, I think that what you've proposed is reasonably realistic.  There
will certainly be a lot of trial and error and experimentation during
the process of figuring out the best way to parallelize something.  Some
techniques may work well on some systems and not well on others, making
the problem more difficult.

I hope the comments I've added here help.  It might be reasonable to
consider different algorithms in some cases.

Thanks!

Ryan

-- 
Ryan Curtin| "No... not without incident."
r...@ratml.org |   - John Preston


Re: [mlpack] GSoC Proposal: Graph Convolutional Networks (GCN)

2020-03-16 Thread Ryan Curtin
On Sun, Mar 15, 2020 at 11:21:48AM +0530, Hemal Mamtora wrote:
> Hello Mentors,
> 
> Thank you for your reviews regarding the project on IRC. They've helped
> me shape the proposal well.

Hey there Hemal,

Glad to hear that things have been going well.  The basic idea of what
you wrote seems reasonable to me.  The use of arma::sp_mat to represent
graphs *should* be a reasonable approach, but it might be worth some
simple validation tests to make sure that arma::sp_mat is fast enough.
(Sometimes, a sparse representation of data can be slow.  However, I
think that arma::sp_mat *should* be fine... still, good to check.)

> *One query:*
> The Graph datasets available online have different formats like GraphML,
> Adjacency List in a CSV, Edge List in a CSV, etc.
> They would have to be converted to Adjacency Matrix.
> Most of the time the conversion is *straightforward* in O(E) time, where E
> is the number of edges.
> What are your views on having a ToAdjacencyMatrix() function in
> mlpack?
> I think it should not be in mlpack. We could say that mlpack expects an
> adjacency matrix (sparse or dense) for graph networks.
> But if you think that this functionality should exist, I could start
> implementing it ASAP.

Personally, I think that we could provide support in data::Load() to
load GraphML or adjacency lists or whatever sparse format, and I think
this could be useful to put into the mlpack codebase.  As it stands, I
think Armadillo supports loading coordinate list CSV files, so probably
a good bit of the support is already there.
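As a sketch of why the conversion is cheap, here is an O(E) edge-list-to-adjacency pass using a plain `std::map` as a stand-in for `arma::sp_mat` (the names `EdgeList`, `SparseAdjacency`, and `ToAdjacencyMatrix()` are illustrative only, not a proposed mlpack API):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// An edge list as it might come out of a coordinate-list CSV: (from, to).
using EdgeList = std::vector<std::pair<size_t, size_t>>;

// A map keyed on (row, col) stands in here for a sparse adjacency matrix
// such as arma::sp_mat; only nonzero entries are stored.
using SparseAdjacency = std::map<std::pair<size_t, size_t>, double>;

// Single O(E) pass over the edge list.
SparseAdjacency ToAdjacencyMatrix(const EdgeList& edges,
                                  const bool undirected = true)
{
  SparseAdjacency adj;
  for (const auto& [from, to] : edges)
  {
    adj[{from, to}] = 1.0;
    if (undirected)
      adj[{to, from}] = 1.0;  // Mirror the edge for undirected graphs.
  }
  return adj;
}
```

The same one-pass shape applies whether the target is a coordinate-format sparse matrix or an adjacency list, which is why loader-level support seems like a small addition.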

Hope the input helps.

Thanks!

Ryan

-- 
Ryan Curtin| "It's too bad she won't live!  But then again, who
r...@ratml.org | does?" - Gaff


Re: [mlpack] GSoC Proposal for implementing HyperNEAT and es-HyperNEAT

2020-03-14 Thread Rahul Prabhu
Hey Pranav,
It would be great if my NEAT implementation could be extended. I think you
should focus on CPPNs and HyperNEAT, since it seems infeasible to also
create ES-HyperNEAT in the time left.

CPPNs are pretty nice - off the top of my head, it would be cool if we
could add something to the website where people could use CPPNs to generate
art like on PicBreeder or EndlessForms. It would also be nice to have
HyperNEAT. An issue important to address would be how you could use the
existing NEAT implementation and extend it to HyperNEAT - what changes
would be required, if any.



On Sat, Mar 14, 2020 at 2:34 AM PESALADINNE PRANAV REDDY . <
f20180...@hyderabad.bits-pilani.ac.in> wrote:

> Hey everyone, my name is Pranav Reddy, and my idea for GSoC is to implement
> HyperNEAT and, if time permits, ES-HyperNEAT as well. I feel like this is a
> good idea since, as far as I've seen, there are very few HyperNEAT
> implementations out there.
>
> All of this would be using the NEAT implementation that was added last
> year as HyperNEAT relies on it. HyperNEAT also involves CPPNs which I plan
> to implement first. Since CPPNs are very similar to ANNs this shouldn't be
> too much of a problem.
> Following which I will implement HyperNEAT based off of the paper
> http://eplex.cs.ucf.edu/publications/2009/stanley-alife09. For this we
> would mainly be applying the NEAT algorithm to a CPPN. I will also be
> implementing a user defined substrate as described in the aforementioned
> paper.
>
> On completion of HyperNEAT, if time permits I would also like to implement
> Evolvable Substrate HyperNEAT (ES-HyperNEAT) as it builds off of HyperNEAT
> directly. For
> this, the substrate would also have to evolve with each generation. Further
> details can be found in this paper:
> http://eplex.cs.ucf.edu/publications/2012/risi-alife12. I will only
> complete this if there is time of course but I hope that I am able to.
>
> Of course testing is also a very important part and I will test each
> method in the following ways:
> CPPN:
> I think the best test for this would be creating images using CPPNs to
> view spatial patterns such as bilateral symmetry,  imperfect symmetry,
> repetition with variation, etc. as can be seen here :
> http://picbreeder.org/.
> HyperNEAT:
> For now my idea is to test this using the visual discrimination experiment
> in the paper http://eplex.cs.ucf.edu/publications/2009/stanley-alife09.
> If I can think of a better experiment or if anyone has any suggestions I
> will do that.
> es-HyperNEAT:
> As of yet, I have not been able to find any experiment that does not
> involve using robots in a controlled environment so any suggestions for
> this test would be greatly appreciated.
>
> Another reason I think this project would be appropriate is that it is a
> very sequential project which will result in at least something solid being
> merged into the codebase in case everything planned is not completed on
> time. I will provide a more detailed phase-by-phase implementation plan,
> hopefully in a few days.
> Any suggestions are greatly appreciated. Also sorry if it was a long read.
> Thanks in advance.
>
>
>
>


Re: [mlpack] GSoC Proposal

2019-04-05 Thread Ryan Curtin
On Thu, Apr 04, 2019 at 06:34:17PM +0300, Дворкин Игорь wrote:
>Hello!
> 
>My name is Igor, I'm a student at Moscow State University. It's basically
>my first time trying to work on an open-source project and I might be
>asking some dumb questions. I'm very interested in the idea of
>implementing the NEAT algorithm, so I wrote a proposal and I ask you to
>give some feedback on it
>
> ([1]https://docs.google.com/document/d/13O_bMCSO1UhqN8kl415pKAu28h5YtolxjKu6PbGxlLQ/edit?usp=sharing).
>But at the same time I have a concern about the whole project.

Hey Igor,

Thanks for getting in touch.  I took a quick look at the proposal; I
can't leave comments directly on it, but I might suggest spending some
more time thinking about the testing.  You're right that MarI/O would be
a nice test to have, but from the perspective of unit testing it may be
good to focus on simpler situations that we can run very quickly to
hopefully help ensure that NEAT is working right.  This also means
structuring the NEAT code so that it is easy to test.

>I have found a little problem. I was looking through the optimization API
>and it doesn't really line up with how I understand NEAT works. As I
>understand it, NEAT only optimizes objective functions which take neural
>networks as inputs (and finds the network which gives the best result on
>that function). This doesn't correspond with the optimization API, as the
>API doesn't seem to place any limitations on the function input. Wouldn't
>this issue make it impossible to implement NEAT with those requirements?

We talked about this in IRC too but just for the sake of the rest of the
list, the outcome of the discussion was basically that NEAT may not fit
nicely into ensmallen because it is specific to neural networks and thus
we may be best off keeping it in mlpack separately.

However, it may be interesting to think about how it might fit into
ensmallen without changing any of the existing FunctionTypes; I am not
sure how feasible that is, though.
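The shape of that mismatch can be sketched in a few lines.  Everything below is an illustrative stand-in: `std::vector<double>` plays the role of `arma::mat`, and `Genome` is a hypothetical placeholder for a real NEAT genome type.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// What ensmallen-style optimizers expect: an objective over a fixed-size
// vector of coordinates (here, the sphere function).
double SphereEvaluate(const std::vector<double>& coordinates)
{
  double sum = 0.0;
  for (const double x : coordinates)
    sum += x * x;
  return sum;
}

// What NEAT needs: an objective over a *network* (genome) whose size and
// structure change during optimization, so there is no fixed coordinate
// vector to hand to an optimizer.
struct Genome
{
  size_t numNodes;
  std::vector<std::pair<size_t, size_t>> connections;
};

double NEATFitness(const Genome& genome)
{
  // Toy fitness that prefers smaller networks; a real fitness would run
  // the network on some task and score its behavior.
  return 1.0 / (1.0 + genome.connections.size());
}
```

Since the second signature cannot be flattened into the first without fixing the network structure in advance, keeping NEAT in mlpack rather than in ensmallen seems like the natural split.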

Hope this helps.

Thanks!

Ryan

-- 
Ryan Curtin| "I know... but I really liked those ones."
r...@ratml.org |   - Vincent


Re: [mlpack] gsoc proposal

2018-03-26 Thread Ryan Curtin
On Mon, Mar 26, 2018 at 10:52:05AM +0300, Артём Лян wrote:
> Hello mlpack mentors. 
> My name is Lyan Artyom. 
> Could you please review my proposal, which I uploaded as a draft at
> summerofcode.withgoogle.com?
> Thanks in advance. 

Hi Artyom,

Thanks for submitting a proposal.  I took a look at it.  I would point
out that our expectation for strong proposals is that they have more
detail; basically, I see a detailed timeline (that is nice, thank you
for that) but not much detail on how exactly the JNI bindings would
fit into the existing automatic bindings infrastructure.

It looks like you have left some time already in the timeline for
studying the existing automatic binding implementations, but most
proposals for GSoC submitted to mlpack will already have done a study of
the existing code and will have an in-depth plan for how the work will
be implemented, so you might consider doing the same.

I hope this is helpful.

Thanks!

Ryan

-- 
Ryan Curtin| "Not like this...  not like this."
r...@ratml.org |   - Switch