Hi,

For an image classification task, I need to extract random patches and their coordinates from images.
Until now, I used custom code to extract both at the same time. I recently tested the extract_patches_2d function from scikit-learn and it seems very fast. To extract the coordinates along with the patches, I wrote this test script <https://gist.github.com/NicolasTr/5429897>. Unsurprisingly but unfortunately, it uses 3 times more memory than the same script without the coordinate extraction. I want to create a better solution but I need your opinion:
* I could modify extract_patches_2d to return a tuple (patches, coordinates)
    o The memory consumption would probably be the same since the coordinates are already computed in the function (here <https://github.com/scikit-learn/scikit-learn/blob/85ec0fd1ae904f275f608b11044a2476ed4723e6/sklearn/feature_extraction/image.py#L322-L323>). If max_patches is not specified, the function could return an itertools.product
    o It could break existing code because the return value would be different
* I could create a new kind of PatchExtractor:
    o The existing code wouldn't break
    o The random_state would need to be copied before any extraction to have the correct coordinates with randint

What do you think?

Regards,
Nicolas Trésegnie
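
P.S. To make the first option concrete, here is a rough sketch of the kind of return value I have in mind. It is a standalone NumPy function, not the actual scikit-learn code: the name extract_patches_with_coords and its signature are made up for illustration, and it only assumes that the top-left corners are drawn with two randint calls, as the linked lines appear to do. When max_patches is None it enumerates all positions with itertools.product.

```python
from itertools import product

import numpy as np


def extract_patches_with_coords(image, patch_size, max_patches=None,
                                random_state=None):
    """Hypothetical helper (not part of scikit-learn).

    Returns (patches, coords), where coords[k] is the (row, col) of the
    top-left corner of patches[k] in `image`.
    `random_state` is an int seed or None.
    """
    rng = np.random.RandomState(random_state)
    i_h, i_w = image.shape[:2]
    p_h, p_w = patch_size
    n_i = i_h - p_h + 1  # number of valid top-left rows
    n_j = i_w - p_w + 1  # number of valid top-left columns

    if max_patches is None:
        # All positions, in row-major order.
        coords = np.array(list(product(range(n_i), range(n_j))))
    else:
        # Random corners; the coordinates cost only two ints per patch.
        i_s = rng.randint(n_i, size=max_patches)
        j_s = rng.randint(n_j, size=max_patches)
        coords = np.column_stack((i_s, j_s))

    patches = np.stack([image[i:i + p_h, j:j + p_w] for i, j in coords])
    return patches, coords


image = np.arange(10 * 12).reshape(10, 12)
patches, coords = extract_patches_with_coords(
    image, (3, 4), max_patches=5, random_state=0)
all_patches, all_coords = extract_patches_with_coords(image, (3, 4))
```

The extra memory here is just the small coords array, so the cost should stay close to that of extracting the patches alone.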
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general