Hi Kemal,
Thanks a lot for the modifications. The introduction is now much better and
the figure is really helpful to visualize what biclustering can do!
Some further comments...
To keep the "proposal timeline" section more concise and focused on your
schedule during the summer, I would move the descriptions of data
preprocessing, data generation and evaluation metrics to the previous
section (you can introduce subsections). While doing that, can you also
describe in more detail what kind of data generation tool you want to add?
Regarding nsNMF, following the previous discussion, I feel that it may not
be a good fit for this GSOC: you cannot reuse / depend on GPL code and
implementing an NMF method will be time-consuming.
I'm a bit concerned about adding to scikit-learn a method from a 2012 paper
with only 1 citation. For this reason, I would prefer that you implement the
BiMax paper, which has 427 citations.
Regarding fit_predict / predict, the output shape is not compatible with
the rest of scikit-learn. Therefore, I think we should just expect users to
directly access the fitted attributes. Can you give an actual code snippet
and use the same notation as in scikit-learn (e.g., n -> n_samples)?
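To illustrate the kind of snippet I have in mind, here is a minimal sketch.
The class ToyBiclustering and the attribute names rows_ / columns_ are
hypothetical, not an existing API, and the "algorithm" is a placeholder (one
bicluster per distinct positive value). The point is only that membership is
stored per bicluster, with shapes (n_biclusters, n_samples) and
(n_biclusters, n_features), which does not fit the (n_samples,) output
expected from predict.

```python
# Hypothetical sketch: ToyBiclustering is NOT an existing scikit-learn class.
# It only illustrates exposing biclusters as fitted attributes instead of
# returning them from predict / fit_predict.


class ToyBiclustering:
    """Toy estimator: one bicluster per distinct positive value in X."""

    def fit(self, X):
        n_samples = len(X)
        n_features = len(X[0])
        values = sorted({v for row in X for v in row if v > 0})
        # rows_[k][i] is True if sample i belongs to bicluster k.
        self.rows_ = [
            [any(X[i][j] == v for j in range(n_features))
             for i in range(n_samples)]
            for v in values
        ]
        # columns_[k][j] is True if feature j belongs to bicluster k.
        self.columns_ = [
            [any(X[i][j] == v for i in range(n_samples))
             for j in range(n_features)]
            for v in values
        ]
        return self


# X has shape (n_samples, n_features) = (3, 4).
X = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 2],
]

model = ToyBiclustering().fit(X)

# Users read the result directly from the fitted attributes.
for k, (rows, columns) in enumerate(zip(model.rows_, model.columns_)):
    sample_idx = [i for i, member in enumerate(rows) if member]
    feature_idx = [j for j, member in enumerate(columns) if member]
    print("bicluster", k, "samples", sample_idx, "features", feature_idx)
```

A sample can belong to several biclusters or to none, which is why a single
label per sample would lose information here.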
Regarding missing value imputation, I think that it would be a more natural
fit in the matrix completion project.
Mathieu