Re: [Scikit-learn-general] RFCC: duecredit citations for sklearn (and anything else you like ; ) )

Mathieu Blondel Sat, 29 Aug 2015 21:38:23 -0700

On Sun, Aug 30, 2015 at 7:27 AM, Yaroslav Halchenko <s...@onerussian.com>
wrote:


>
> As long as installation is straightforward, I think it should be a minor
> hurdle. It will be installed by default (Recommends) with scikit-learn,
> pymvpa, and any other related package I maintain in Debian/Ubuntu.  It is
> already available from PyPI, although installation there could be a bit
> problematic due to external dependencies.  We will look into minimizing
> the possibility of issues and will also look into packaging within the
> conda universe.  If it is a no-brainer to install, then installing an
> external tool, especially one recommended by the project, should not be
> a big issue.
>

Even if installation is easy, people also have to know that the project
even exists.

> >    For this reason, I think the ideal solution should be web based.
> >    This could for example take the form of a sphinx plugin for easily
> >    integrating with the project's documentation. We could maintain a
> >    BibTeX file and reference BibTeX entries from within the
> >    documentation. The sphinx plugin would make it easier to find
> >    relevant citations from various places in the documentation (class
> >    reference, user guide).
>
> Although a sound idea on its own, and even if complementary to duecredit,
> IMHO it would not be as "productive".  Sure, some determined users will
> look up references for the pieces they used, but not exhaustively, and
> not for core functions which they might not even know they have called
> (indirectly).
>

Indeed, both approaches are complementary. Even if duecredit succeeds, I
think it would still be nice to make it easier to find relevant citations
from the online documentation. Ideally, the citation annotations would be
reused by both duecredit and the sphinx plugin.


> That is exactly what duecredit tries to address -- automate that
> collection of references.
>

We also need to give an idea to users as to *why* they should cite a
certain paper. For example, cite paper [...] because it is the solver used
by LinearSVC(dual=True) for solving the SVM dual objective.
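
To make this concrete, here is a minimal sketch of one annotation record
that carries the "why" and is read by two consumers, a run-time report and
a documentation listing. Everything in it is an assumption for illustration:
the CitationNote class, the two helper functions and the placeholder BibTeX
key are hypothetical, and none of it is duecredit's or sphinx's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CitationNote:
    """One citation annotation (hypothetical structure, not duecredit's API)."""
    bibtex_key: str  # key into a BibTeX file maintained by the project
    target: str      # dotted path of the annotated object
    reason: str      # why a user of `target` should cite the entry


# Illustrative entries; "placeholder_dual_solver" is a made-up key.
NOTES = [
    CitationNote("placeholder_dual_solver", "sklearn.svm.LinearSVC",
                 "solver used by LinearSVC(dual=True) for the SVM dual objective"),
    CitationNote("ward1963hierarchical", "scipy.cluster.hierarchy.linkage",
                 "Ward hierarchical clustering"),
]


def runtime_report(used_targets, notes=NOTES):
    """Run-time consumer: citations for the objects a script actually used."""
    return [f"[{n.bibtex_key}] {n.reason}"
            for n in notes if n.target in used_targets]


def docs_snippet(target, notes=NOTES):
    """Documentation consumer: plain-text reference lines for one object."""
    lines = [f"References for {target}:"]
    lines += [f"  * {n.bibtex_key}: {n.reason}"
              for n in notes if n.target == target]
    return "\n".join(lines)
```

Both consumers read the same NOTES list, so the reason is written once and
shows up in the run-time report as well as in the documentation.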


>
> >    One difficulty, though, is that the relevant citations in scikit-learn
> >    estimators often depend on constructor options. For example, in
> >    LinearSVC, the paper to cite is not the same whether we use dual=True
> >    or dual=False, penalty="l1" or penalty="l2", etc.
>
> That is already partially handled, e.g.
>
>
> https://github.com/duecredit/duecredit/blob/master/duecredit/injections/mod_scipy.py#L134
>     injector.add('scipy.cluster.hierarchy', 'linkage', BibTeX("""
>     @article{ward1963hierarchical,
>         title={Hierarchical grouping to optimize an objective function},
>         author={Ward Jr, Joe H},
>         journal={Journal of the American statistical association},
>         volume={58},
>         number={301},
>         pages={236--244},
>         year={1963},
>         publisher={Taylor \& Francis}
>     }"""),
>                  conditions={(1, 'method'): {'ward'}},
>                  description="Ward hierarchical clustering",
>                  min_version='0.4.3',
>                  tags=['reference'])
>
> says to reference that publication only if method='ward' is passed to the
> linkage call.  Similarly I can decorate __init__.  But that covers it only
> partially: I don't want to cite merely because __init__ was called, I
> would like to cite only if actual computation has happened, so it should
> also be conditioned on some methods of the class being called...  We will
> look into supporting that.
>
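
The method-conditioned case described above could be sketched roughly as
follows: a decorator on fit() that records a citation only when fit() runs
and only when the constructor options stored on the instance match. This is
an assumption-laden illustration, not how duecredit implements it; cite_when,
COLLECTED, ToyLinearSVC and the placeholder keys are all hypothetical names.

```python
import functools

COLLECTED = []  # citations recorded so far (hypothetical global collector)


def cite_when(bibtex_key, reason, **required_params):
    """Record a citation when the decorated method runs, and only if the
    instance attributes equal ``required_params``."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            matches = all(getattr(self, name, None) == value
                          for name, value in required_params.items())
            entry = (bibtex_key, reason)
            if matches and entry not in COLLECTED:
                COLLECTED.append(entry)
            return method(self, *args, **kwargs)
        return wrapper
    return decorator


class ToyLinearSVC:
    """Stand-in estimator for illustration; not scikit-learn's LinearSVC."""

    def __init__(self, dual=True, penalty="l2"):
        # Constructing the object records nothing.
        self.dual = dual
        self.penalty = penalty

    @cite_when("placeholder_dual_solver", "dual solver", dual=True)
    @cite_when("placeholder_primal_solver", "primal solver", dual=False)
    def fit(self, X, y):
        return self  # the real computation would go here
```

With this, ToyLinearSVC(dual=False) alone leaves COLLECTED empty, and only
a subsequent call to fit() adds the primal-solver entry.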

Ideally the citation annotations should be as concise as possible. For the
BibTeX part, I would prefer to reference an external BibTeX file. For
example, the file could sit next to __init__.py at the project root.
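
As a rough sketch of what that could look like, the function below splits
BibTeX source into entries indexed by key, so an annotation only needs to
name the key. It is a deliberately minimal brace-matching parser written for
illustration (load_bibtex_entries is a hypothetical helper, and a real
project would more likely use an existing BibTeX library); the sample entry
is the one quoted above.

```python
def load_bibtex_entries(text):
    """Split BibTeX source into {key: raw_entry} by matching braces.

    Minimal sketch: handles '@type{key, ...}' entries with balanced
    braces and nothing more (no comments, no @string macros).
    """
    entries = {}
    pos = 0
    while True:
        start = text.find("@", pos)
        if start == -1:
            break
        open_brace = text.find("{", start)
        if open_brace == -1:
            break
        depth = 0
        end = None
        for i in range(open_brace, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    end = i + 1
                    break
        if end is None:
            break  # unbalanced entry: stop rather than guess
        key = text[open_brace + 1:end].split(",", 1)[0].strip()
        entries[key] = text[start:end]
        pos = end
    return entries


SAMPLE = r"""
@article{ward1963hierarchical,
    title={Hierarchical grouping to optimize an objective function},
    author={Ward Jr, Joe H},
    journal={Journal of the American statistical association},
    volume={58},
    number={301},
    pages={236--244},
    year={1963},
    publisher={Taylor \& Francis}
}
"""

entries = load_bibtex_entries(SAMPLE)
```

In a package, the text would be read from the BibTeX file sitting next to
__init__.py, and an annotation would then refer to "ward1963hierarchical"
instead of embedding the whole entry.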

Mathieu

