On Wed, Jul 31, 2013 at 4:08 PM, Lars Buitinck <[email protected]> wrote:
> 2013/7/31 Joel Nothman <[email protected]>:
> > I am wondering why there is a need to support the indices=False case in
> > cross_validation. Indices are superior in that they can be used with
> > np.take and with sparse matrices. And most of the standard cv
> > implementations output indices that are converted into boolean masks
> > and back to indices.
> >
> > Moreover, generic tools that take cv implementations as input need to
> > handle both cases (or make assumptions).
> >
> > What is the intention behind indices=False; why not deprecate it and
> > simplify the API and code? (And speed up indexing by using np.take.)
>
> Funny, I was wondering the same thing yesterday. IIRC, we originally
> used only masks and indices were added to please the sparse
> matrix-pushing crowd (yours truly). Then safe_mask got introduced to
> accept both at the consumer side.
>
> Arguably, masks are easier to interpret, though, esp. in feature
> selection code; you can multiply them with your coef_ before plotting
> it to see which features are deactivated.
>
But that isn't really meaningful for cv.
> Do you have any timings for np.take?
>
See http://wesmckinney.com/blog/?p=215
Ideally the slowness of fancy indexing relative to np.take is a bug that
will disappear from numpy anyway -- for all I know it already has -- so it
should be less of the focus than a simplified API.
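
To make the indices-vs-masks point concrete, here is a minimal sketch. It is
not code from scikit-learn: as_indices is a hypothetical helper name and the
toy data is made up. It shows how a consumer could normalize either form to
integer indices and then select rows with np.take:

```python
import numpy as np


def as_indices(selector, n_samples):
    """Hypothetical helper: accept a boolean mask or integer indices
    and always return integer indices."""
    selector = np.asarray(selector)
    if selector.dtype == bool:
        if selector.shape[0] != n_samples:
            raise ValueError("mask length must equal n_samples")
        # Positions of the True entries, in increasing order.
        return np.flatnonzero(selector)
    return selector


# Toy data: 6 samples, 2 features.
X = np.arange(12).reshape(6, 2)
mask = np.array([True, False, True, False, False, True])

idx = as_indices(mask, X.shape[0])

# Three spellings of the same row selection on a dense array.
rows_mask = X[mask]                  # boolean mask indexing
rows_fancy = X[idx]                  # integer (fancy) indexing
rows_take = np.take(X, idx, axis=0)  # np.take along the sample axis
```

All three selections return the same rows here; only the indices form can be
passed to np.take, which is why a single representation would simplify
generic consumers.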
- Joel