Hello Kartik,

> Phase 1 would be implementing the following:
>
> 1. DataLoader for Pascal VOC (I chose this because it has 20 classes and
>    roughly 10k training images, which should be easier to train a model on
>    than COCO or ImageNet.)

Right, sounds like a good idea to me.
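In case it helps with scoping the loader: VOC ships one small XML annotation
file per image, so the parsing core is tiny. Here is a rough sketch of pulling
the first object's class and box out of an annotation (the file name and the
TagValue helper are made up for illustration; a real loader should use a
proper XML parser and iterate over every <object> entry):

// Minimal sketch: extract <name> and <xmin>/<ymin>/<xmax>/<ymax> from a
// Pascal VOC annotation file.  Hand-rolled string search, illustration only.
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

// Return the text between <tag> and </tag>, or "" if the tag is absent.
std::string TagValue(const std::string& xml, const std::string& tag)
{
  const std::string open = "<" + tag + ">", close = "</" + tag + ">";
  const size_t b = xml.find(open);
  if (b == std::string::npos)
    return "";
  const size_t e = xml.find(close, b);
  if (e == std::string::npos)
    return "";
  return xml.substr(b + open.size(), e - b - open.size());
}

int main()
{
  std::ifstream file("000005.xml"); // hypothetical VOC annotation file.
  std::stringstream buffer;
  buffer << file.rdbuf();
  const std::string xml = buffer.str();

  // First object only, for brevity; real code loops over <object> blocks.
  std::cout << "class: " << TagValue(xml, "name") << "\n"
            << "box:   " << TagValue(xml, "xmin") << " "
            << TagValue(xml, "ymin") << " "
            << TagValue(xml, "xmax") << " "
            << TagValue(xml, "ymax") << std::endl;
}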
> 2. Addition of Non-Maximal Suppression to mlpack.
> 3. Adding a Darknet class (here Darknet-53) to the models repo. The
>    differences between the models are minor, so I think supporting other
>    models should be straightforward.
>
> Addition of Darknet is necessary to facilitate training of YOLOv3. It would
> be a little time-consuming to train tinyYOLOv3 using just NVBLAS, so I think
> we could take an approach similar to ladder training. We can first train the
> Darknet and then use those weights in tinyYOLOv3 for the Darknet portion.
> This way we can break down training and get better results faster. This is
> something I did when I had to build and train a RetinaNet and YOLO from
> scratch for a month-long hackathon.

Yeah, time-consuming indeed, but possible; we have a bunch of machines we
could provide for training models.
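On point 2, just so we agree on scope before you write it: I'd expect the
usual greedy IoU-based variant. A minimal sketch in plain Armadillo (the
column-wise box layout and the function names are placeholders, not an
interface we have settled on):

// Greedy non-maximal suppression sketch: keep the highest-scoring box,
// suppress every remaining box whose IoU with it exceeds `threshold`, repeat.
#include <armadillo>
#include <algorithm>
#include <vector>

// Boxes are columns of a 4 x N matrix: (x1, y1, x2, y2).
double IoU(const arma::vec& a, const arma::vec& b)
{
  const double ix = std::max(0.0, std::min(a[2], b[2]) - std::max(a[0], b[0]));
  const double iy = std::max(0.0, std::min(a[3], b[3]) - std::max(a[1], b[1]));
  const double inter = ix * iy;
  const double areaA = (a[2] - a[0]) * (a[3] - a[1]);
  const double areaB = (b[2] - b[0]) * (b[3] - b[1]);
  return inter / (areaA + areaB - inter);
}

std::vector<arma::uword> NMS(const arma::mat& boxes,
                             const arma::vec& scores,
                             const double threshold)
{
  const arma::uvec order = arma::sort_index(scores, "descend");
  std::vector<bool> suppressed(boxes.n_cols, false);
  std::vector<arma::uword> keep;
  for (arma::uword i = 0; i < order.n_elem; ++i)
  {
    const arma::uword p = order[i];
    if (suppressed[p])
      continue;
    keep.push_back(p);
    for (arma::uword j = i + 1; j < order.n_elem; ++j)
    {
      const arma::uword q = order[j];
      if (!suppressed[q] && IoU(boxes.col(p), boxes.col(q)) > threshold)
        suppressed[q] = true;
    }
  }
  return keep;
}

The real version would also want a batched interface and edge-case handling
(empty input, degenerate boxes), which is exactly where a lot of the
test-writing time I mention below goes.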
> Phase 2 would be implementing the following:
> 1. Addition of tinyYOLOv3.
> 2. During this phase I will be training Darknet as well, incorporating your
>    suggestions.
>
> Phase 3 would result in the following:
> 1. Adding support for YOLOv3.
> 2. Adding support for COCO for future use.
> 3. Here I will be training tinyYOLOv3.
>
> Phase 4:
> 1. Some cool visualization tools using matplotlib-cpp to show off our
>    results.
> 2. Adding inference support for videos.

Sounds good. One thing that is currently missing is testing and
documentation. Writing some good tests takes a lot of time, so the timeline
should account for that. Same for documentation; I guess a neat tutorial
would be great to have as well.
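To make the testing point a bit more concrete: even a sanity check as small
as the one below (plain asserts, reusing the IoU/NMS helpers from the sketch
above; the real suite would use whatever test framework the repo standardizes
on) already pins down the expected ordering and threshold behaviour.

// Tiny sanity test for the NMS sketch above: two heavily overlapping boxes
// and one far away; with a 0.5 threshold, NMS should keep columns 0 and 2.
// Assumes the IoU/NMS helpers from the earlier sketch are in scope.
#include <armadillo>
#include <cassert>
#include <iostream>
#include <vector>

int main()
{
  // Columns are boxes (x1, y1, x2, y2); rows are the four coordinates.
  arma::mat boxes = { { 0.0, 0.5, 10.0 },    // x1
                      { 0.0, 0.5, 10.0 },    // y1
                      { 4.0, 4.5, 14.0 },    // x2
                      { 4.0, 4.5, 14.0 } };  // y2
  arma::vec scores = { 0.9, 0.8, 0.7 };

  // Boxes 0 and 1 overlap with IoU ~0.62, so box 1 must be suppressed.
  const std::vector<arma::uword> keep = NMS(boxes, scores, 0.5);
  assert(keep.size() == 2);
  assert(keep[0] == 0 && keep[1] == 2);
  std::cout << "NMS sanity check passed." << std::endl;
}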
> After this project, the models repo would be able to check one more task off
> its list. For users that want to run real-time inference on IoT devices that
> don't necessarily support Python, this will be incredibly useful. I think we
> can really show off mlpack in a nice object localization video. They would
> also be able to participate in the COCO challenge. I know this also might
> not be the best possible representation of the project, so I would love to
> incorporate any suggestions that you have.

I think that is a great way to show what we can do with mlpack. The models
repository could be a nice playground for different ideas, and I think the
project you proposed would fit perfectly.

Thanks,
Marcus

> On 10. Mar 2020, at 19:14, Kartik Dutt <[email protected]> wrote:
>
> Hello Marcus,
> Thank you for the reply; it encouraged me to come up with a better project
> proposal. Also, thanks a lot for all the help and the code reviews.
>
> I think, since the models repo has object detection models, the next step
> should be object localization. For this I propose YOLOv3 and tinyYOLOv3.
> Some simple changes should allow inter-conversion between the two. I think
> this makes sense because they have the fastest inference time and we can
> train them using the procedure I have mentioned below. I think I can break
> the work down into phases so that it's a bit more coherent.
>
> Phase 1 would be implementing the following:
>
> 1. DataLoader for Pascal VOC (I chose this because it has 20 classes and
>    roughly 10k training images, which should be easier to train a model on
>    than COCO or ImageNet.)
> 2. Addition of Non-Maximal Suppression to mlpack.
> 3. Adding a Darknet class (here Darknet-53) to the models repo.
>
> Addition of Darknet is necessary to facilitate training of YOLOv3. It would
> be a little time-consuming to train tinyYOLOv3 using just NVBLAS, so I think
> we could take an approach similar to ladder training. We can first train the
> Darknet and then use those weights in tinyYOLOv3 for the Darknet portion.
> This way we can break down training and get better results faster. This is
> something I did when I had to build and train a RetinaNet and YOLO from
> scratch for a month-long hackathon.
>
> Phase 2 would be implementing the following:
> 1. Addition of tinyYOLOv3.
> 2. During this phase I will be training Darknet as well, incorporating your
>    suggestions.
>
> Phase 3 would result in the following:
> 1. Adding support for YOLOv3.
> 2. Adding support for COCO for future use.
> 3. Here I will be training tinyYOLOv3.
>
> Phase 4:
> 1. Some cool visualization tools using matplotlib-cpp to show off our
>    results.
> 2. Adding inference support for videos.
>
> Another thing that I can do is get inference timings of tinyYOLOv3 and
> YOLOv3 on a Raspberry Pi, and we can add the result and a nice video/image
> to the models repo.
>
> Till then, I will be completing all the PRs that I have open and
> implementing the upsampling layer before GSoC.
> For FPN, I will add it after GSoC, or hopefully before GSoC ends if I
> finish the above tasks early.
>
> After this project, the models repo would be able to check one more task off
> its list. For users that want to run real-time inference on IoT devices that
> don't necessarily support Python, this will be incredibly useful. I think we
> can really show off mlpack in a nice object localization video. They would
> also be able to participate in the COCO challenge.
> I know this also might not be the best possible representation of the
> project, so I would love to incorporate any suggestions that you have.
>
> Regards,
> Kartik Dutt,
> GitHub id: kartikdutt18

_______________________________________________
mlpack mailing list
[email protected]
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
