> +1 for starting with a first patch on the current CD implementation to
> get familiar with the existing code base.
Just want to let you know that I'm on it; I hope I can write the patch
over the weekend.
>
> As for the content of the proposal itself, it would be good to include
> extensive profiling sessions on realistic datasets (e.g. microarray
> data) both on individual estimator runs and on regularization paths
> with warm restarts.
>
> Also David experienced poor performance compared to other
> implementations when using the CD models for sparse coding. Would be
Do you mean that the data matrix X has a lot of zero entries? There is a
comment on this case in section 2.3 of the glmnet paper
(www.stanford.edu/~hastie/Papers/glmnet.pdf).
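To make the sparse case concrete, here is a rough sketch of what I have in
mind. It is only an illustration under my own assumptions, not the current
scikit-learn code: the names soft_threshold and sparse_lasso_cd and the
column-wise list-of-(row, value) storage are made up for this example. The
point is that each coordinate update only touches the nonzero entries of
one column of X.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator: sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def sparse_lasso_cd(columns, y, alpha, n_iter=100):
    """Cyclic coordinate descent for (1/2n)||y - Xw||^2 + alpha * ||w||_1.

    columns[j] is a list of (row, value) pairs holding only the nonzero
    entries of column j of X (a hypothetical storage format for this sketch).
    """
    n = len(y)
    w = [0.0] * len(columns)
    residual = list(y)  # r = y - Xw, and w starts at zero
    # Squared column norms, computed over nonzero entries only.
    sq_norms = [sum(v * v for _, v in col) for col in columns]
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            if sq_norms[j] == 0.0:
                continue  # an all-zero column never enters the model
            # x_j^T (r + x_j w_j), summing over the nonzeros of x_j only.
            rho = sum(v * residual[i] for i, v in col) + w[j] * sq_norms[j]
            w_new = soft_threshold(rho / n, alpha) / (sq_norms[j] / n)
            delta = w_new - w[j]
            if delta != 0.0:
                # Residual update also costs O(nnz of column j).
                for i, v in col:
                    residual[i] -= delta * v
                w[j] = w_new
    return w


# Tiny example: X is the 2x2 identity, stored column by column.
w = sparse_lasso_cd([[(0, 1.0)], [(1, 1.0)]], [3.0, -0.5], alpha=0.5)
```

With this layout one full sweep costs O(number of nonzeros in X) instead of
O(n * p), which is where I would expect most of the gain on sparse data.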
> great to ensure comparable performance with state of the art for this
> use case as well. Investigating with OpenMP via cython prange might be
> a possible solution.
I'm not sure the algorithm parallelizes well; I think there are other
speed-up tricks not used yet, but I will look into it too.
Thanks for the suggestions. I will start drafting the proposal as soon
as I'm done with the patch.