This is very interesting. I have been playing recently with learning
to rank. Right now I have only used point-wise regressors and
implemented NDCG as a ranking metric to compare the models. I
experimented with parallelizing extra trees here:

  
http://nbviewer.ipython.org/urls/raw.github.com/ogrisel/notebooks/master/Learning%20to%20Rank.ipynb
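For reference, the NDCG metric I use for model comparison can be sketched roughly like this (a minimal NumPy version, not the exact code from the notebook):

```python
import numpy as np

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of a ranked relevance list, truncated at k."""
    rel = np.asarray(relevances, dtype=float)[:k]
    if rel.size == 0:
        return 0.0
    discounts = np.log2(np.arange(2, rel.size + 2))
    return float(np.sum(rel / discounts))

def ndcg_at_k(y_true, y_score, k=10):
    """NDCG: DCG of the model-induced ordering, normalized by the ideal DCG."""
    order = np.argsort(y_score)[::-1]          # rank documents by predicted score
    ranked = np.take(np.asarray(y_true), order)
    best = dcg_at_k(np.sort(y_true)[::-1], k)  # ideal ordering by true relevance
    return dcg_at_k(ranked, k) / best if best > 0 else 0.0
```

A perfect ranking scores 1.0, and any mis-ordering of relevant documents lowers the score, which is what makes it usable for comparing point-wise regressors.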

I think a GradientBoostingRegressor model can reach better accuracy,
but it is not parallelizable on its own. Of course, if you use a
list-wise approach that directly optimizes the target cost (e.g. NDCG,
as LambdaMART does), you should be able to reach the state of the art.
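To illustrate the point-wise setup and the parallelism contrast (synthetic data here, not the LETOR features from the notebook):

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor, GradientBoostingRegressor

rng = np.random.RandomState(0)
X = rng.randn(200, 5)
y = X[:, 0] + 0.1 * rng.randn(200)  # stand-in for graded relevance labels

# Extra trees: each tree is independent, so training parallelizes with n_jobs.
et = ExtraTreesRegressor(n_estimators=50, n_jobs=-1, random_state=0).fit(X, y)

# Gradient boosting: each tree is fit on the residuals of the previous ones,
# so the boosting stages are inherently sequential.
gbr = GradientBoostingRegressor(n_estimators=50, random_state=0).fit(X, y)
```

The predicted scores from either model can then be fed to a per-query NDCG to compare them as rankers.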

The data was parsed once and saved in a compressed format here:

http://nbviewer.ipython.org/url/raw.github.com/ogrisel/notebooks/master/Data%20Preprocessing%20for%20the%20%20Learning%20to%20Rank%20example.ipynb
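The pattern is simply: parse the raw text format once, then reload the arrays cheaply on every experiment. A sketch using joblib's compressed persistence (my assumption for the storage mechanism; the notebook above has the actual details):

```python
import os
import tempfile
import numpy as np
from joblib import dump, load

# Hypothetical arrays standing in for the parsed feature matrix and labels.
rng = np.random.RandomState(42)
X = rng.randn(1000, 136)
y = rng.randint(0, 5, size=1000)

# Persist both in one zlib-compressed file; reload is much faster than re-parsing.
path = os.path.join(tempfile.mkdtemp(), "ranking_data.joblib")
dump({"X": X, "y": y}, path, compress=3)

data = load(path)
```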

Here are the slides I am going to present this afternoon at Budapest BI Forum:

  https://speakerdeck.com/ogrisel/growing-randomized-trees-in-the-cloud-1

About the API, properly supporting learning to rank will have an
impact on the scorer API and on cross-validation / grid search. I am
not yet sure how to best address all of this.
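The core difficulty is that ranking metrics need the query grouping, which the current scorer signature does not carry. A minimal sketch of what a group-aware metric would look like (the `query_ids` argument and `mean_ndcg` name are hypothetical, not an existing scikit-learn API):

```python
import numpy as np

def ndcg(y_true, y_score, k=10):
    """Per-query NDCG at k (simplified linear-gain variant)."""
    order = np.argsort(y_score)[::-1][:k]
    gains = np.take(np.asarray(y_true, dtype=float), order)
    discounts = np.log2(np.arange(2, gains.size + 2))
    ideal = np.sort(np.asarray(y_true, dtype=float))[::-1][:k]
    idcg = np.sum(ideal / np.log2(np.arange(2, ideal.size + 2)))
    return float(np.sum(gains / discounts) / idcg) if idcg > 0 else 0.0

def mean_ndcg(y_true, y_score, query_ids, k=10):
    """Average NDCG over queries. The extra query_ids argument is exactly
    what a plain (y_true, y_pred) scorer cannot pass through today, and CV
    splitting would additionally need to keep each query's documents together."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    query_ids = np.asarray(query_ids)
    return float(np.mean([
        ndcg(y_true[query_ids == qid], y_score[query_ids == qid], k)
        for qid in np.unique(query_ids)
    ]))
```

Grid search would then need both a scorer that accepts this extra argument and CV iterators that split on query boundaries rather than on individual samples.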

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
