Hello Marcus,
  Thank you for the reply; it encouraged me to come up with a better project 
proposal. Also, thanks a lot for all the help and the code reviews.

Since the models repo has object detection models, I think the next step should 
be object localization. For this I propose YOLOv3 and tinyYOLOv3; some simple 
changes should allow inter-conversion between the two. I think this makes sense 
because they have the fastest inference times, and we can train them using the 
procedure mentioned below. I have broken the work down into phases so that it's 
a bit more coherent.

Phase 1 would be implementing the following:

1. DataLoader for Pascal VOC (I chose this because it has 20 classes and 
roughly 10k training images, which should make training a model easier than 
training on COCO or ImageNet).
2. Addition of Non-Maximum Suppression to mlpack.
3. Adding a Darknet class (here, Darknet-53) to the models repo.
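For item 2, the greedy NMS step could look roughly like the minimal C++ sketch 
below. The `Box` struct and the function names are just placeholders for 
illustration, not a proposed mlpack interface — the actual design would be 
worked out in the PR:

```cpp
#include <algorithm>
#include <vector>

// A box: (x1, y1, x2, y2) corners plus a confidence score.
// This layout is only illustrative.
struct Box
{
  double x1, y1, x2, y2, score;
};

// Intersection over Union of two axis-aligned boxes.
double IoU(const Box& a, const Box& b)
{
  const double iw = std::max(0.0,
      std::min(a.x2, b.x2) - std::max(a.x1, b.x1));
  const double ih = std::max(0.0,
      std::min(a.y2, b.y2) - std::max(a.y1, b.y1));
  const double inter = iw * ih;
  const double areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
  const double areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
  return inter / (areaA + areaB - inter);
}

// Greedy NMS: keep the highest-scoring box, drop any box that
// overlaps a kept box by more than `threshold`, repeat.
std::vector<Box> NMS(std::vector<Box> boxes, const double threshold)
{
  std::sort(boxes.begin(), boxes.end(),
      [](const Box& a, const Box& b) { return a.score > b.score; });

  std::vector<Box> kept;
  for (const Box& candidate : boxes)
  {
    bool suppressed = false;
    for (const Box& k : kept)
    {
      if (IoU(candidate, k) > threshold)
      {
        suppressed = true;
        break;
      }
    }
    if (!suppressed)
      kept.push_back(candidate);
  }
  return kept;
}
```

In mlpack proper this would of course operate on armadillo matrices of 
predictions rather than a `Box` struct, but the suppression logic stays the same.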

Adding Darknet is necessary to facilitate training of YOLOv3. Training 
tinyYOLOv3 using just NVBLAS would be a little time-consuming, so I think we 
could take an approach similar to ladder training: first train the Darknet, 
then reuse those weights for the Darknet portion of tinyYOLOv3. This way we can 
break the training down and get better results faster. This is something I did 
when I had to build and train a RetinaNet and a YOLO from scratch for a 
month-long hackathon.

Phase 2 would be implementing the following:
1. Addition of tinyYOLOv3.
2. During this phase I will also be training Darknet, incorporating your 
suggestions.

Phase 3 would include the following:
1. Adding support for YOLOv3.
2. Adding support for COCO for future use.
3. Here I will be training tinyYOLOv3.

Phase 4:
1. Some cool visualization tools using matplotlib-cpp to show off our results.
2. Adding inference support for videos.

Another thing I can do is get inference timings for tinyYOLOv3 and YOLOv3 on a 
Raspberry Pi, and we can add the results along with a nice video / image to the 
models repo.
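The timing itself would be straightforward with `std::chrono`; a small sketch 
(the `runInference` callable here is a stand-in for an actual forward pass, not 
real model code):

```cpp
#include <chrono>

// Measure average wall-clock time per call over several trials.
// `runInference` stands in for a single forward pass through
// tinyYOLOv3 / YOLOv3 on a fixed input image.
template<typename Fn>
double AverageMilliseconds(Fn&& runInference, const int trials)
{
  using Clock = std::chrono::steady_clock;

  const auto start = Clock::now();
  for (int i = 0; i < trials; ++i)
    runInference();
  const auto stop = Clock::now();

  const auto us = std::chrono::duration_cast<std::chrono::microseconds>(
      stop - start).count();
  return (us / 1000.0) / trials;
}
```

Averaging over several trials (after a warm-up run) should give stable numbers 
on the Pi.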

Until then, I will be completing all the PRs I have open and implementing the 
upsampling layer before GSoC.
As for the FPN, I will add it after GSoC, or hopefully before GSoC ends if I 
finish the above tasks early.

After this project, the models repo would be able to check one more task off 
its list. For users who want to run real-time inference on IoT devices that 
don't necessarily support Python, this will be incredibly useful. I think we 
can really show off mlpack in a nice object localization video. Users would 
also be able to participate in the COCO challenge.
I know this might not be the best possible presentation of the project, so I 
would love to incorporate any suggestions that you have.

Regards,
Kartik Dutt,
Github id: kartikdutt18
_______________________________________________
mlpack mailing list
[email protected]
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
