[mlpack] (no subject)

2020-03-27 Thread Alina Boshchenko
Dear mentors,

I submitted my draft proposal for Essential Deep Learning Modules around a
week ago, but haven't received any feedback yet. As the deadline is
approaching, I would be very grateful for any comments so that I can take
them into account before the final submission.
Thank you.

-- 
Best regards,
Alina
___
mlpack mailing list
mlpack@lists.mlpack.org
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack


Re: [mlpack] GSoC 2020 Ideas

2020-03-27 Thread Andrei M
Hello,

Thank you for the feedback. I submitted a draft proposal based on it.

Best,
Andrei

On Sat, Mar 21, 2020, 00:43 Marcus Edel  wrote:

> Hello Andrei,
>
> thanks for the update, I don't have anything to add, sounds totally
> reasonable
> to me. As an overall timeline, this could definitely work.
>
> Thanks,
> Marcus
>
> On 19. Mar 2020, at 13:07, Andrei M  wrote:
>
> Hello again,
>
> Thank you for the feedback.
>
> After a longer thought process, I decided I would like to implement the
> DeepLabv3+ model for semantic segmentation as part of the ANN models
> project. This implies several phases of implementation and this is the
> split I propose:
>
> Step 1:
> Implement a dataloader for a semantic segmentation dataset: This will be
> either Pascal VOC 2012 or ADE20K.
>
> Step 2:
> Implement an Xception model backbone with atrous depth-wise separable
> convolutions. This is the backbone that makes the model yield the best
> performance according to the original paper, surpassing the ResNet-101
> backbone.
>
> Step 3:
> a. Implement the encoder architecture of the model, which is a DeepLabv3,
> that uses the previously mentioned Xception backbone. This task also
> implies the building of the atrous spatial pyramid pooling module.
> b. Implement the decoder architecture, which is a simple convolutional
> architecture that refines the segmentation results of the encoder.
>
> Step 4:
> Train and test the implemented model on the selected dataset, then compare
> the results with the ones obtained in the paper. Visualize the results and
> create relevant plots and statistics.
>
> That would be a shorter version of my proposal.
>
> Best,
> Andrei
>
> On Wed, 11 Mar 2020 at 23:35, Marcus Edel 
> wrote:
>
>> Hello Andrei,
>>
>> 1. RL: I've taken a more in-depth look at the reinforcement learning
>> module. The DQN, Double DQN, and prioritized replay are already
>> implemented, so the remaining Rainbow components are Dueling networks,
>> Multi-step learning, Distributional RL, and Noisy Nets. Therefore, I
>> suggest finishing the implementation of the Rainbow DQN and then an
>> implementation of the ACKTR algorithm.
>>
>> Sounds totally reasonable to me.
>>
>> 2. Applications of ANN: Implementing a U-Net or DeepLabv3 architecture for
>> semantic segmentation.
>>
>> I like both models, also good that you mentioned you like to focus on
>> either
>> U-Net or DeepLabv3.
>>
>> I would like to know if the ideas above would make enough for a summer
>> project
>> for each of the two sections.
>>
>> Definitely, a big part of each project is documentation and testing,
>> writing
>> good tests takes time.
>>
>> Let me know if I should clarify anything further.
>>
>> Thanks,
>> Marcus
>>
>> On 10. Mar 2020, at 15:50, Andrei M  wrote:
>>
>> Hello,
>>
>> Thank you for the response.
>>
>> I've been thinking more about the ideas for the GSoC and I've established
>> a top 2 I'd like to work on: Reinforcement learning or Applications of ANN.
>> (I'll only select one for the final proposal)
>>
>> 1. RL: I've taken a more in-depth look at the reinforcement learning
>> module. The DQN, Double DQN, and prioritized replay are already
>> implemented, so the remaining Rainbow components are Dueling networks,
>> Multi-step learning, Distributional RL, and Noisy Nets. Therefore, I
>> suggest finishing the implementation of the Rainbow DQN and then an
>> implementation of the ACKTR algorithm.
>>
>> 2. Applications of ANN: Implementing a U-Net or DeepLabv3 architecture
>> for semantic segmentation.
>>
>> I would like to know if the ideas above would make enough for a summer
>> project for each of the two sections.
>>
>> Thank you,
>> Andrei
>>
>> On Mon, Mar 9, 2020 at 1:22 AM Marcus Edel 
>> wrote:
>>
>>> Hello Andrei,
>>>
>>> welcome and thanks for your interest. Looks like you have already
>>> brainstormed about the ideas; that's great. I think each method you
>>> proposed makes sense. There is already a PR open for PPO
>>> (https://github.com/mlpack/mlpack/pull/1912) which is very close to being
>>> merged, so I think you can remove that from the list.
>>>
>>> Also, I think both ideas could be combined, for example if you add a new
>>> layer to the codebase. That said, we don't have project priorities, so
>>> you are free to go with anything you find interesting.
>>>
>>> Let me know if I should clarify anything.
>>>
>>> Thanks,
>>> Marcus
>>>
>>> On 6. Mar 2020, at 15:47, Andrei M  wrote:
>>>
>>> Hello,
>>>
>>> I'm a second year master's degree student in the field of artificial
>>> intelligence and I've been thinking about applying to Google Summer of Code
>>> for this summer and mlpack is the project I want to work on.
>>>
>>> I've spent the last few weeks getting familiar with the codebase and
>>> writing some code for a new feature (a loss function that wasn't
>>> implemented). There are several ideas in the list that piqued my interest
>>> and I consider them equally interesting: reinforcement learning, 

Re: [mlpack] Regarding GSoC'20 Project as well as Proposal

2020-03-27 Thread Rahul Prabhu
You can contact mentors on this mailing list or on IRC (sorry for the
previous mail; I forgot to hit Reply All).


On Fri, Mar 27, 2020, 10:26 PM Garv Tambi  wrote:

> Hello Mentors,
> After reading many articles and research papers and studying the mlpack
> codebase, I came up with my own idea, completed my GSoC proposal to the
> best of my ability, and submitted it on the GSoC website.
> Could anyone please guide me on how to contact the mentors?
> It would be a great help for me.
>
> Thank you!
> With regards.
>
>


[mlpack] Regarding GSoC'20 Project as well as Proposal

2020-03-27 Thread Garv Tambi
Hello Mentors,
After reading many articles and research papers and studying the mlpack
codebase, I came up with my own idea, completed my GSoC proposal to the
best of my ability, and submitted it on the GSoC website.
Could anyone please guide me on how to contact the mentors?
It would be a great help for me.

Thank you!
With regards.


[mlpack] Fwd: (no subject)

2020-03-27 Thread Aman Pandey
Hi Ryan/Marcus,
I just saw the 2017 work on parallelisation by Shikhar Bhardwaj:
https://www.mlpack.org/gsocblog/profiling-for-parallelization-and-parallel-stochastic-optimization-methods-summary.html

He has done impressive work, with excellent documentation, I must say.
-

In this email, I'll cover:

   - Which algorithms I'll be implementing
   - My thoughts on parallelisation
   - My rough plan to complete the proposed work in mlpack
   - A thought about adding something like *Federated Learning* to
   mlpack (this could be very complex, though!)


I want a slightly clearer understanding of what I am going to do; please
check whether I am planning correctly and whether my approach is feasible.

I will be focusing on the following algorithms during my GSoC period:
1) Random Forest
2) KNN
3) A few Gradient Boosting Algorithms

Either I can parallelize the algorithms according to their computation
tasks (e.g., in Random Forest, I can try training its N trees in parallel),
or I can distribute tasks via MapReduce or other distributed computation
frameworks (https://stackoverflow.com/a/8727235/9982106 lists a few pretty
well). MapReduce only works well if very small amounts of data move across
machines very few times. *This could be a reason why we should look at a
few better alternatives.*
For example, after each iteration, derivative-based methods have to compute
the gradient over the complete training data, which in general requires
moving the complete data to a single machine. As the number of iterations
increases, this could become a bottleneck.

I am in favour of working with OpenMP, before trying any such thing.

Something similar can occur with tree-based algorithms, where splits have
to be computed on the complete data repeatedly.

I would follow the given rough timeline
*(I haven't made it complex or unrealistic)*:

1) Profiling algorithms to find their bottlenecks, training on a variety of
example datasets (small to big, which brings in a heavy difference in
calculations) - *Week 1-2*
2) Working on at least one gradient boosting algorithm, to check that my
approach is sound and fully in accordance with mlpack; in parallel, working
on profiling and designing parallelism for Random Forest - *Week 2-3*
3) Working on Random Forest and KNN - *Week 4-8*
4) *Building out distributed computing alternatives to MapReduce.* If this
works well, it could transform mlpack into an actual *distributed killer*.
However, working randomly on different algorithms with different
distributed computation techniques may lead to randomness in mlpack
development. (*I still have to be sure about this.*)

- *An additional idea -*
I don't know if this has been discussed before, as I have been away from
mlpack for almost a year.
Have you ever thought of adding *Federated Learning* support to mlpack?
Something like *PySyft* (https://github.com/OpenMined/PySyft) could bring a
tremendous improvement to mlpack, and it would really help people working
on big deep learning problems, as well as researchers.

Please let me know if we can discuss this idea!

--
The reason I chose mlpack is that I have knowledge of its codebase, as I
tried mlpack last year, and of course the team is awesome; I have always
found good support from everyone here.

Also, amid this COVID-19 situation, I *will not* be able to complete the
internship I earned at *NUS Singapore*, so I need something of that level
to work on and make use of this summer.
I am very comfortable with any kind of code; as an example, I worked on
completely unfamiliar Haskell code as an undergraduate researcher at IITK
(one of the finest CSE departments in India).
Plus, knowledge of advanced C++ can help me be quick and efficient.

I have started drafting a proposal. Please, let me know your thoughts.

Will update you soon within the next 2 days.


Please be safe!
Looking forward to a wonderful experience with MLPACK. :)



*Aman Pandey*
*amanpandey.codes*

On Mon, Mar 16, 2020 at 7:52 PM Aman Pandey 
wrote:

> Hi Ryan,
> I think that is enough information.
> Thanks a lot.
> I tried mlpack last year, on QGMM; unfortunately, I couldn't make it.
>
> Will try once again, with a possibly better proposal. ;)
> In parallelisation this time.
>
> Thanks.
> Aman Pandey
> GitHub Username: johnsoncarl
>
> On Mon, Mar 16, 2020 at 7:33 PM Ryan Curtin  wrote:
>
>> On Sun, Mar 15, 2020 at 12:38:09PM +0530, Aman Pandey wrote:
>> > Hey Ryan/Marcus,
>> > Are there any current pointers to start with, in "Profiling for
>> > Parallelization"?
>> > I'd like to know of any, to avoid redundant work.
>>
>> Hey Aman,
>>
>> I don't think that there are any particular directions.  You could
>> consider looking at previous messages from previous years in the mailing
>> list archives (this project has been proposed in the past and there has
>> been some discussion).  My suggestion would be to find