Hello Kartik,

> Phase 1 would be implementing the following:
> 
> 1. DataLoader for Pascal VOC (I chose this because it has 20 classes and
> roughly 10k training images, which should make it easier to train a model
> than COCO or ImageNet would be.)

Right, sounds like a good idea to me.

> 2. Addition of Non-Maximal Suppression to mlpack.
> 3. Adding a Darknet class (here Darknet-53) to the models repo.

The differences between the models are minor so I think supporting other models
should be straightforward.
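Since NMS is slated for mlpack proper, here is a minimal sketch of the greedy algorithm in plain C++. The `Box` struct and function names are illustrative only; an actual mlpack implementation would presumably operate on Armadillo matrices and follow the library's conventions:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical box type for illustration.
struct Box
{
  double x1, y1, x2, y2; // Corner coordinates.
  double score;          // Detection confidence.
};

// Intersection-over-Union of two axis-aligned boxes.
double IoU(const Box& a, const Box& b)
{
  const double ix1 = std::max(a.x1, b.x1);
  const double iy1 = std::max(a.y1, b.y1);
  const double ix2 = std::min(a.x2, b.x2);
  const double iy2 = std::min(a.y2, b.y2);
  const double inter = std::max(0.0, ix2 - ix1) * std::max(0.0, iy2 - iy1);
  const double areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
  const double areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
  return inter / (areaA + areaB - inter);
}

// Greedy NMS: keep the highest-scoring box, drop any box that overlaps
// a kept box by more than `threshold`, and repeat down the score order.
std::vector<Box> NonMaximalSuppression(std::vector<Box> boxes,
                                       const double threshold)
{
  std::sort(boxes.begin(), boxes.end(),
            [](const Box& a, const Box& b) { return a.score > b.score; });
  std::vector<Box> kept;
  for (const Box& candidate : boxes)
  {
    bool suppressed = false;
    for (const Box& k : kept)
    {
      if (IoU(candidate, k) > threshold)
      {
        suppressed = true;
        break;
      }
    }
    if (!suppressed)
      kept.push_back(candidate);
  }
  return kept;
}
```

For example, two heavily overlapping detections of the same object collapse to the single higher-scoring box, while a disjoint detection elsewhere in the image survives.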

> Adding Darknet is necessary to facilitate training YOLOv3. It would be a
> little time-consuming to train tinyYOLOv3 using just NVBLAS, so I think we
> could take an approach similar to ladder training: first train Darknet, and
> then use those weights for the Darknet portion of tinyYOLOv3. This way we
> can break the training down and get good results faster. This is something
> I did when I had to build and train a RetinaNet and a YOLO from scratch for
> a month-long hackathon.

Yeah, time-consuming indeed, but possible; we have a bunch of machines we
could provide for training models.
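The weight-reuse step described above can be sketched roughly as follows. This is an illustrative stand-in using flat std::vector weights, not mlpack's actual API (in mlpack the flattened parameters would presumably come from something like FFN::Parameters(); that should be checked against the real interface), and it assumes the detector lays out its backbone parameters first:

```cpp
#include <algorithm>
#include <stdexcept>
#include <vector>

// Illustrative stand-in for a network's flattened weight vector.
using Weights = std::vector<double>;

// Copy pretrained backbone weights into the front of the detector's
// parameter vector, leaving the randomly initialized detection-head
// weights untouched. Assumes the detector stores its backbone
// parameters first; that layout must be verified against the actual
// network definition.
void TransferBackbone(const Weights& backbone, Weights& detector)
{
  if (backbone.size() > detector.size())
    throw std::invalid_argument("backbone larger than detector");
  std::copy(backbone.begin(), backbone.end(), detector.begin());
}
```

After the transfer, only the detection head starts from scratch, which is what makes the staged training cheaper than training the full detector end to end.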

> Phase 2 would be implementing the following:
> 1. Addition of tinyYOLOv3.
> 2. During this phase I will also be training Darknet, incorporating your
> suggestions.
> 
> Phase 3 would result in the following:
> 1. Adding support for YOLOv3.
> 2. Add support for COCO for future use.
> 3. Here I will be training tinyYOLOv3.
> 
> Phase 4:
> 1. Some cool visualization tools using matplotlib-cpp to show off our results.
> 2. Adding inference support for videos.

Sounds good. One thing that is currently missing is testing and documentation.
Writing good tests takes a lot of time, so the timeline should account for
that. Same for documentation; I guess a neat tutorial would be great to have
as well.

> After this project, the models repo would be able to check one more task
> off its list. For users who want to run real-time inference on IoT devices
> that don't necessarily support Python, this will be incredibly useful. I
> think we can really show off mlpack in a nice object localization video.
> Users would also be able to participate in the COCO challenge. I know this
> might not be the best possible representation of the project, so I would
> love to incorporate any suggestions that you have.

I think that is a great way to show what we can do with mlpack. The models
repository could be a nice playground for different ideas, and I think the
project you proposed would fit perfectly.

Thanks,
Marcus

> On 10. Mar 2020, at 19:14, Kartik Dutt <[email protected]> wrote:
> 
> Hello Marcus,
>   Thank you for the reply; it encouraged me to come up with a better
> project proposal. Also, thanks a lot for all the help and the code reviews.
> 
> I think since the models repo has object detection models, the next step
> should be object localization. For this I propose YOLOv3 and tinyYOLOv3.
> Some simple changes should allow inter-conversion between the two. I think
> this makes sense because they have the fastest inference time and we can
> train them using the procedure I have mentioned below. I think I can break
> the work down into phases so that it's a bit more coherent.
> 
> Phase 1 would be implementing the following:
> 
> 1. DataLoader for Pascal VOC (I chose this because it has 20 classes and
> roughly 10k training images, which should make it easier to train a model
> than COCO or ImageNet would be.)
> 2. Addition of Non-Maximal Suppression to mlpack.
> 3. Adding a Darknet class (here Darknet-53) to the models repo.
> 
> Adding Darknet is necessary to facilitate training YOLOv3. It would be a
> little time-consuming to train tinyYOLOv3 using just NVBLAS, so I think we
> could take an approach similar to ladder training: first train Darknet, and
> then use those weights for the Darknet portion of tinyYOLOv3. This way we
> can break the training down and get good results faster. This is something
> I did when I had to build and train a RetinaNet and a YOLO from scratch for
> a month-long hackathon.
> 
> Phase 2 would be implementing the following:
> 1. Addition of tinyYOLOv3.
> 2. During this phase I will also be training Darknet, incorporating your
> suggestions.
> 
> Phase 3 would result in the following:
> 1. Adding support for YOLOv3.
> 2. Add support for COCO for future use.
> 3. Here I will be training tinyYOLOv3.
> 
> Phase 4:
> 1. Some cool visualization tools using matplotlib-cpp to show off our results.
> 2. Adding inference support for videos.
> 
> Another thing that I can do is get inference timings of tinyYOLOv3 and
> YOLOv3 on a Raspberry Pi, and we can add the results and a nice video /
> image to the models repo.
> 
> Till then I will be completing all the PRs that I have open and
> implementing the upsampling layer before GSoC.
> As for FPN, I will add it after GSoC, or hopefully before GSoC ends if I
> finish the above tasks early.
> 
> After this project, the models repo would be able to check one more task
> off its list. For users who want to run real-time inference on IoT devices
> that don't necessarily support Python, this will be incredibly useful. I
> think we can really show off mlpack in a nice object localization video.
> Users would also be able to participate in the COCO challenge.
> I know this might not be the best possible representation of the project,
> so I would love to incorporate any suggestions that you have.
> 
> Regards,
> Kartik Dutt
> GitHub id: kartikdutt18

_______________________________________________
mlpack mailing list
[email protected]
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
