Hi Mathieu,

Thanks for the feedback, and great to hear that I am not alone in this
pursuit!  More comments are below.

On Sat, 29 Aug 2015, Mathieu Blondel wrote:

>    I am a bit concerned that most people wouldn't want or wouldn't be able to
>    install an external program, though. 

As long as installation is straightforward, I think it should be a minor
hurdle.  It will be installed by default (via Recommends) with scikit-learn,
pymvpa, and any other related package I maintain in Debian/Ubuntu.  It is
already available from PyPI, although installation from there can indeed be a
bit problematic due to external dependencies.  We will look into minimizing
the possibility for issues and also into packaging it for conda.  Once it is a
no-brainer to have it installed, installation of an external tool, especially
one recommended by the project, should not be a big issue.

> For this reason, I think the ideal
>    solution should be web based. This could for example take the form of a
>    sphinx plugin for easily integrating with the project's documentation. We
>    could maintain a BibTeX file and reference BibTeX entries from within the
>    documentation. The sphinx plugin would make it easier to find relevant
>    citations from various places in the documentation (class reference, user
>    guide).

Although it is a sound idea on its own, and even complementary to duecredit,
it IMHO would not be as "productive".  Sure, some determined users will look
up references for the pieces they used, but not exhaustively, and not for
core functions which they might not even know were called (indirectly).
That is exactly what duecredit tries to address: automating the collection
of those references.

>    One difficulty, though, is that the relevant citations in scikit-learn
>    estimators often depends on constructor options. For example, in
>    LinearSVC, the paper to cite is not the same whether we use dual=True or
>    dual=False, penalty="l1" or penalty="l2", etc.

That is already partially handled, e.g. 

https://github.com/duecredit/duecredit/blob/master/duecredit/injections/mod_scipy.py#L134
    injector.add('scipy.cluster.hierarchy', 'linkage', BibTeX("""
    @article{ward1963hierarchical,
        title={Hierarchical grouping to optimize an objective function},
        author={Ward Jr, Joe H},
        journal={Journal of the American statistical association},
        volume={58},
        number={301},
        pages={236--244},
        year={1963},
        publisher={Taylor \& Francis}
    }"""),
                 conditions={(1, 'method'): {'ward'}},
                 description="Ward hierarchical clustering",
                 min_version='0.4.3',
                 tags=['reference'])

says to reference that publication only if method='ward' is passed to the
linkage call.  Similarly, I could decorate __init__.  But that is why I say
"partially": I don't want to cite merely because __init__ was called, I would
like to cite only if actual computation has happened, so the citation should
also be conditioned on some methods of the class being called...  We will
look into supporting that.
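
A rough sketch of that idea, in plain Python rather than duecredit's actual
API (cite_on_call, CITED, ToyLinearSVC and the reference keys are all made-up
names for illustration only): the citation is recorded when a computing
method runs, and only if the constructor options stored on the instance match.

```python
import functools

# Hypothetical collector of references; NOT part of duecredit.
CITED = set()


def cite_on_call(reference, init_conditions=None):
    """Record `reference` when the decorated method is called, provided the
    instance attributes match `init_conditions` (attribute -> allowed values)."""
    init_conditions = init_conditions or {}

    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            # Cite only if every conditioned constructor option matches.
            if all(getattr(self, name, None) in allowed
                   for name, allowed in init_conditions.items()):
                CITED.add(reference)
            return method(self, *args, **kwargs)
        return wrapper
    return decorator


class ToyLinearSVC:
    """Stand-in estimator; it does no real fitting."""

    def __init__(self, penalty="l2", dual=True):
        self.penalty = penalty
        self.dual = dual

    # Placeholder reference keys, not real BibTeX entries.
    @cite_on_call("hypothetical-l1-paper", {"penalty": {"l1"}})
    @cite_on_call("hypothetical-dual-paper", {"dual": {True}})
    def fit(self, X, y):
        return self
```

With this, merely constructing ToyLinearSVC(penalty="l1") records nothing;
the reference is added to CITED only once fit() is actually called.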

-- 
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        
