Hello scikit-learn,
I recently wrote up an implementation of the LambdaMART algorithm on top of
the existing gradient boosting code (thanks for the great base of code to
work with, btw). It currently only supports NDCG, but it would be easy to
generalize. That's somewhat beside the point, however. Before I even think
about putting together a PR I wanted to compare it against the gbm package.
I'm aware of java implementations like jforest and ranklib but gbm's
interface seems closest to sklearn's so that's what I want to use.
Unfortunately, whenever I try to use NDCG, it either segfaults or I get an
error in split.default, depending on where I specify the group variable. I
realize this isn't an R list, but I was hoping someone could shed some light
on this for me.
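For context, by NDCG I mean the usual formulation with 2^rel - 1 gains and a
log2 position discount, normalized by the ideal ordering. A minimal sketch in
Python (the function names are mine, not from my gist):

```python
import numpy as np

def dcg_at_k(rel, k):
    """Discounted cumulative gain of a ranked list of relevance labels."""
    rel = np.asarray(rel, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))  # log2(rank + 1)
    return np.sum((2.0 ** rel - 1.0) / discounts)

def ndcg_at_k(rel, k):
    """NDCG: DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(rel, reverse=True), k)
    return dcg_at_k(rel, k) / ideal if ideal > 0 else 0.0
```

A perfectly ordered list scores 1.0; any worse ordering scores strictly less.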
I'm using the supervised MQ2007 and MQ2008 datasets from (
https://research.microsoft.com/en-us/um/beijing/projects/letor//letor4download.aspx)
and my test code is here (https://gist.github.com/jwkvam/7332448).
I simply use Python to transform the given train.txt file into a CSV so I
can load it in R. I'm using gbm 2.1, and I've tried R 2.15.3 and 3.0.2.
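The conversion is roughly the following (a sketch assuming the standard LETOR
line format `rel qid:Q 1:v1 2:v2 ... #comment`; the function name and column
headers here are mine and may differ from the gist):

```python
import csv

def letor_to_csv(src, dst, n_features=46):
    """Flatten LETOR-formatted lines into a CSV with one column per feature."""
    with open(src) as fin, open(dst, "w", newline="") as fout:
        writer = csv.writer(fout)
        header = ["relevance", "qid"] + ["f%d" % i for i in range(1, n_features + 1)]
        writer.writerow(header)
        for line in fin:
            line = line.split("#")[0].strip()  # drop trailing docid comment
            if not line:
                continue
            tokens = line.split()
            rel = tokens[0]                    # graded relevance label
            qid = tokens[1].split(":")[1]      # query id from "qid:N"
            feats = [0.0] * n_features         # absent features default to 0
            for tok in tokens[2:]:
                idx, val = tok.split(":")
                feats[int(idx) - 1] = float(val)
            writer.writerow([rel, qid] + feats)
```

MQ2007/MQ2008 have 46 features per query-document pair, hence the default.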
Alternatively, can I easily translate my gbm.fit() call into the formula-based
gbm() interface? Sorry, I'm kind of a newbie when it comes to R.
I saw there's also this standing issue, but it doesn't look like there's
been a lot of movement on it.
https://code.google.com/p/gradientboostedmodels/issues/detail?id=28&q=pairwise
Thanks,
Jacques
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general