I really want to push this approach, and hope I have time to establish it early next year. And I think that a zero-intersection approach would be better than a fork, as it avoids incompatible changes.


On 12/03/2014 10:31 AM, Mathieu Blondel wrote:
As you mentioned, popular methods from scikit-learn-contrib could be promoted to scikit-learn.

Conversely, methods which became obsolete in scikit-learn could move to scikit-learn-contrib to lower the maintenance burden.

Mathieu

On Thu, Dec 4, 2014 at 12:26 AM, Mathieu Blondel <math...@mblondel.org> wrote:

    Hi Satra,

    I can't find the link but there was a discussion some time ago
    about creating a scikit-learn-contrib organization on github for
    this purpose.

    Two differences with what you suggest:
    1) this wouldn't be a fork, i.e., the intersection with
    scikit-learn would be empty
    2) we were thinking of creating repositories for different
    sub-topics (multilabel classification, kernel approximations, etc)

    2) might require too much work in terms of making releases so a
    global scikit-learn-contrib might be more realistic.

    scikit-learn-contrib would have its own website
    http://contrib.scikit-learn.org.

    There would still be some work involved for minimal reviewing and
    releasing, though.

    Mathieu

    On Wed, Dec 3, 2014 at 11:56 PM, Satrajit Ghosh <sa...@mit.edu> wrote:

        hi folks,

        since this comes up from time to time and i completely
        understand the needed focus and limited resources within
        scikit-learn, how about the following approach:

        - let the community (to put zero additional burden on the
        current maintainers) maintain a fork of scikit-learn that
        provides no guarantees other than it is kept up to date with
        scikit-learn/master.
        - people are welcome to add any algorithms to this (trivial,
        non-trivial, recent)
        - if things prove useful within this branch/fork/labs they can
        be incorporated into the main stream through the current
        standard PR mechanism

        people will use it at their own discretion, but what it would
        allow is for people to have a single place within which to toy
        with things while still maintaining the core benefits of
        scikit-learn.

        with the different kinds of data (types and size) coming
        online, algorithm development has gone in many different
        directions. some variants are on speed/hardware, others on
        generalizability, yet others on domain-specific apps, etc.
        what works in one domain/app may completely fail in another.

        the hope here is that this fork would let interested people
        toy with this developmental eco-system as opposed to the
        stable maintained ecosystem. the key advantages of having a
        fork are that:
        - folks don't have to recreate packaging
        - it brings all the folks who are forking anyway together
        instead of splitting off into forks (multiple forks are harder
        to use)
        - it makes for increased availability of algorithms that may
        be useful in practice but never make it out because the world
        is biased towards loudspeakers
        - it doesn't add anything to the current maintainers' plates,
        nor take away anything from the main project. perhaps those
        wishing to add things will take it upon themselves to maintain
        this fork.
        - and if you find that more people are using this fork rather
        than the mainstream (that might tell you something about the
        current culture of science and engineering in practice).
        - there might be fixes that can be incorporated into master
        coming into this fork because more people end up toying within it
        - if this fork goes bust, nobody cares.

        you could even call the fork:

        scikit-learn-minefield
        scikit-learn-teenage-mutants
        ...
        scikit-learn-labs

        cheers,

        satra

        On Wed, Dec 3, 2014 at 5:25 AM, Joel Nothman <joel.noth...@gmail.com> wrote:


                I agree. We should amend this sentence to say that if
                the paper is a clear-cut improvement on top of a widely
                used method, it should be examined.

            Done <http://scikit-learn.org/dev/faq.html>.


            On 3 December 2014 at 20:07, Gael Varoquaux <gael.varoqu...@normalesup.org> wrote:

                On Wed, Dec 03, 2014 at 06:04:58PM +0900, Mathieu Blondel wrote:
                > I think 1000 citations is a bit too much to ask. We should
                > probably update the FAQ with something more reasonable, like
                > say 200 citations. That said, I agree that the citation
                > threshold is just an indicator. For example, SAG and AdaGrad,
                > which are being considered for inclusion, have around 75 and
                > 250 citations currently.

                I agree. We should amend this sentence to say that if
                the paper is a clear-cut improvement on top of a widely
                used method, it should be examined.

                >     Perhaps scikit-learn needs to strengthen and formalise its
                >     support for external related projects that adopt its API
                >     design to implement less established techniques. The listing at
                >     https://github.com/scikit-learn/scikit-learn/wiki/Third-party-projects-and-code-snippets
                >     lacks glamour, and could be easier to find and navigate.

                > +1

                +1

                > We need to bring this page to the main documentation and
                > make it more sexy.

                Good with me.

                G

                
                _______________________________________________
                Scikit-learn-general mailing list
                Scikit-learn-general@lists.sourceforge.net
                
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general



            



        





