Can you open a PR with an example and attach the plots, output, and timing?
The example needs to download the dataset.

Cheers,
Andy

On 02/29/2016 03:15 PM, Guillaume Lemaître wrote:
Hi guys,

I found a bit of time while at a conference: https://github.com/glemaitre/bow-sklearn-example The extraction is in the source. I still have to add the part that measures the timing. Let me know if that makes sense.

Cheers,

On 23 February 2016 at 20:39, Nadim Farhat <nadim.far...@gmail.com> wrote:

    Hi Andreas,

    Sorry for jumping into the conversation and going a bit off
    topic: what does a "flat" dataset mean in sklearn?

    Bests

    Nadim Farhat
    Phd Bioengineering candidate
    Center for Ultrasound and therapeutics
    University of Pittsburgh


    On Mon, Feb 22, 2016 at 12:12 PM Andreas Mueller <t3k...@gmail.com> wrote:

        Hi Guillaume.

        I was a big user of BoW myself, but I don't think it should go
        into scikit-learn.
        BoW doesn't really operate on a "flat" dataset, as
        scikit-learn usually does. It works on groups of data points.
        Each sample is usually a concatenation of feature vectors,
        which you summarize as a histogram.
        That doesn't really fit into the scikit-learn API.
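
To illustrate the distinction, here is a minimal sketch assuming NumPy only. The array sizes and the names `X_flat`, `groups`, and `group_ids` are made up for illustration and do not come from the thread. A "flat" dataset has one fixed-length row per sample, whereas a BoW-style dataset has a variable-size group of descriptors per sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Flat" dataset: one fixed-length feature vector per sample.
X_flat = rng.normal(size=(3, 4))  # shape (n_samples, n_features)

# BoW-style dataset: each sample is a group of local descriptors,
# and the group size differs from sample to sample (hypothetical sizes).
groups = [rng.normal(size=(n, 4)) for n in (5, 2, 7)]

# One possible workaround: stack all descriptors into a single array
# and keep track of which sample each descriptor belongs to.
descriptors = np.vstack(groups)
group_ids = np.repeat(np.arange(len(groups)), [len(g) for g in groups])

# Number of descriptors per sample, recovered from the group ids.
counts = np.bincount(group_ids)
```

The stacked array is flat in shape, but its rows are descriptors rather than samples, which is why an extra grouping step is needed before a classifier can use it.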

        For any particular application (I did bag of visual words),
        creating an implementation using the kmeans or sparse coding
        in scikit-learn
        is only a couple of lines (you can find my visual bow for
        per-superpixel descriptors here
        https://github.com/amueller/segmentation/blob/master/bow.py#L184)

        Cheers,
        Andy



        On 02/14/2016 09:03 PM, Guillaume Lemaître wrote:
        Dear all,

        My group and I are currently working on image classification
        applied to medical images. We are using Bag-of-Features
        (also called Bag-of-Visual-Words or Bag-of-Words), which was
        originally inspired by text classification. We have a
        rather rough implementation
        [here](https://github.com/glemaitre/protoclass/blob/master/protoclass/extraction/codebook.py)
        which I would like to integrate into scikit-learn somehow,
        even if only in a personal branch.

        However, I have some philosophical questions before diving
        in, which have been fueling some discussions in our
        lab. Looking at the API, the BoF approach could be part of the
        `feature_extraction` module, since BoF is very similar to the
        BoW implementation for text mentioned above.

        Nevertheless, I wonder whether BoF should instead be
        integrated into the `decomposition` module. The method
        consists of: (i) dictionary learning (based on
        K-Means, Mean-Shift, etc.), (ii) encoding (or voting, in this
        case using k-NN), and (iii) pooling (histogram).
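
The three steps above can be sketched as follows. This is a minimal illustration, not the implementation linked in the thread. It assumes scikit-learn's `KMeans` for the dictionary, nearest-centroid assignment via `predict` for the encoding, and a count histogram for the pooling. The helper name `bow_histograms` and the random data are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans


def bow_histograms(groups, n_words=8, random_state=0):
    """Return a fitted codebook and one word-count histogram per group."""
    # (i) Dictionary learning: cluster all descriptors from all samples.
    all_descriptors = np.vstack(groups)
    codebook = KMeans(n_clusters=n_words, n_init=10,
                      random_state=random_state).fit(all_descriptors)

    histograms = np.zeros((len(groups), n_words))
    for i, descriptors in enumerate(groups):
        # (ii) Encoding: assign each descriptor to its nearest centroid.
        words = codebook.predict(descriptors)
        # (iii) Pooling: count how often each word occurs in the sample.
        histograms[i] = np.bincount(words, minlength=n_words)
    return codebook, histograms


# Hypothetical data: three samples with 30, 12, and 45 descriptors each.
rng = np.random.default_rng(0)
groups = [rng.normal(size=(n, 16)) for n in (30, 12, 45)]
codebook, X_bow = bow_histograms(groups)
```

The resulting `X_bow` has one fixed-length row per sample, so it can be passed to any scikit-learn classifier. In practice the histograms are often normalized, which this sketch leaves out.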

        Thus, in a sense, BoF can be seen as a kind of
        decomposition (most similar to sparse coding). For
        instance, sparse coding follows exactly the same scheme:
        dictionary learning with K-SVD, encoding, and pooling
        (min/max/etc.). The same holds for PCA, if you treat the
        dictionary problem as finding the eigenvectors/eigenvalues.
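
The sparse-coding variant of the same scheme might look like the sketch below. Note the substitution: scikit-learn does not ship K-SVD, so this uses `MiniBatchDictionaryLearning` instead, with OMP for the encoding and max pooling over absolute coefficients. The data, sizes, and variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

# Hypothetical data: three samples with 20, 15, and 25 descriptors each.
rng = np.random.default_rng(0)
groups = [rng.normal(size=(n, 6)) for n in (20, 15, 25)]

# Dictionary learning (not K-SVD; scikit-learn's mini-batch solver).
dico = MiniBatchDictionaryLearning(
    n_components=5,
    transform_algorithm="omp",
    transform_n_nonzero_coefs=2,
    random_state=0,
)
dico.fit(np.vstack(groups))

# Encoding (sparse codes per descriptor), then max pooling per sample.
pooled = np.array([np.abs(dico.transform(g)).max(axis=0) for g in groups])
```

As with the histogram version, `pooled` ends up with one fixed-length row per sample, here of length `n_components`.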

        My questions are thus the following:
        - what do you think about this idea;
        - where would a BoF implementation fit best;
        - would it make sense to think of the different
        decomposition methods in terms of the three steps mentioned
        above, or would that not be intuitive at all?

        Hope that the topic is not too weird.

        Cheers,
        --
        LEMAÎTRE Guillaume
        PhD Candidate
        MSc Erasmus Mundus ViBOT (Vision-roBOTic)
        MSc Business Innovation and Technology Management
        g.lemaitr...@gmail.com

        ViCOROB - Computer Vision and Robotic Team
        Universitat de Girona, Campus Montilivi, Edifici P-IV 17071
        Girona
        Tel. +34 972 41 98 12 - Fax. +34 972 41 82 59
        http://vicorob.udg.es/
        LE2I - Le Creusot
        IUT Le Creusot, Laboratoire LE2I, 12 rue de la Fonderie,
        71200 Le Creusot
        Tel. +33 3 85 73 10 90 - Fax. +33 3 85 73 10 97
        http://le2i.cnrs.fr

        https://sites.google.com/site/glemaitre58/
        Vice - Chairman of A.S.C. Fours UFOLEP
        Chairman of A.S.C. Fours FFC
        Webmaster of http://ascfours.free.fr


        
------------------------------------------------------------------------------
        Site24x7 APM Insight: Get Deep Visibility into Application Performance
        APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
        Monitor end-to-end web transactions and take corrective actions now
        Troubleshoot faster and improve end-user experience. Signup Now!
        http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140


        _______________________________________________
        Scikit-learn-general mailing list
        Scikit-learn-general@lists.sourceforge.net
        <mailto:Scikit-learn-general@lists.sourceforge.net>
        https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

        





