>> Do you believe that it is a major tool that is very useful in general?
I'm not sure it's the best option, but my main motive for sending this is
my desire to add new features to the ensemble package of scikit-learn.
>> Have you had a lot of success using it?
I've tried it with the tw
Hi Magellane,
> I would like to provide an implementation of the Ensemble Selection
> technique as described by the following paper: Ensemble Selection from
> Libraries of Models by Rich Caruana, Alexandru Niculescu-Mizil, Geoff
> Crew, Alex Ksikes (
> www.cs.cornell.edu/~caruana/ctp/ct.papers/car
Hi Caleb,
you need to extract the path from the decision tree structure
``DecisionTreeClassifier.tree_`` - take a look at the attributes
``children_left`` and ``children_right`` - these encode the parent-child
relationship.
Extracting the path is very similar to finding the leaf node; you just nee
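Something along these lines should do it (an untested sketch; the iris fit is
just for illustration): start at the root, node 0, and follow the split
decisions until you reach a node whose ``children_left`` is -1, i.e. a leaf.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)
tree = clf.tree_

def leaf_path(sample):
    # walk from the root to the leaf this sample falls into,
    # collecting the node ids along the way
    node = 0
    path = [node]
    while tree.children_left[node] != -1:  # -1 marks a leaf
        if sample[tree.feature[node]] <= tree.threshold[node]:
            node = tree.children_left[node]
        else:
            node = tree.children_right[node]
        path.append(node)
    return path

print(leaf_path(iris.data[0]))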
Hi,
For polynomial regression, take a look at this PR, which was recently
merged:
https://github.com/scikit-learn/scikit-learn/pull/2585
An example of this new feature in use is here:
https://github.com/scikit-learn/scikit-learn/blob/master/examples/linear_model/plot_polynomial_interpolation
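If I remember correctly, the transformer added there is ``PolynomialFeatures``;
here is a rough sketch of how it combines with a linear model (the data below
is made up for illustration):

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# toy 1-d data following a quadratic trend
x = np.linspace(0, 1, 20)[:, np.newaxis]
y = 1.5 * x.ravel() ** 2 - 0.5 * x.ravel() + 0.2

# degree-2 polynomial regression as a two-step pipeline
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(x, y)
print(model.predict(x[:5]))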
hi,
have a look at:
https://github.com/scikit-learn/scikit-learn/pull/2285
Any help testing or reviewing this PR is very welcome.
Alex
On Sun, Dec 8, 2013 at 1:53 PM, Chen Wang wrote:
>
> Dear all,
>
> I mainly use scikit-learn to do regression analysis. I found that this package
> didn't have polyno
Dear all,
I mainly use scikit-learn to do regression analysis. I found that this package
didn't have polynomial regression or Multivariate Adaptive Regression
Splines (MARS), regression methods that are well known in my research area. If
the authors have no time, I want to contribute these two method
Forgot to mention that true labels are *not* present. Thus, I chose the
*Silhouette* coefficient.
On Sun, Dec 8, 2013 at 5:02 PM, nipun batra wrote:
> Hi,
>
> I am using *k-means++* to cluster my data series. From my domain expertise,
> I know that the number of clusters varies between 2 and 4. T
Hi,
I am using *k-means++* to cluster my data series. From my domain expertise, I
know that the number of clusters varies between 2 and 4. To find this
*optimal* number of clusters, I was doing the following (pseudocode):
for num_cluster in [2, 3, 4]:
    cluster_using_kmeans(num_cluster, data)
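Spelled out, what I have in mind is roughly the following (untested sketch;
random data stands in for my series), keeping the num_cluster with the best
Silhouette coefficient:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.RandomState(0)
data = rng.rand(100, 3)  # placeholder for the real data series

best_k, best_score = None, -1.0
for num_cluster in [2, 3, 4]:
    km = KMeans(n_clusters=num_cluster, init='k-means++', random_state=0)
    labels = km.fit_predict(data)
    score = silhouette_score(data, labels)
    if score > best_score:
        best_k, best_score = num_cluster, score

print(best_k, best_score)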
Hi,
I would like to provide an implementation of the *Ensemble Selection*
technique as described by the following paper: *Ensemble Selection from
Libraries of Models* by Rich Caruana, Alexandru Niculescu-Mizil, Geoff
Crew, Alex Ksikes (
www.cs.cornell.edu/~caruana/ctp/ct.papers/caruana.icml04.icdm0
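To give a rough idea of the core of the method (my own sketch of the greedy
forward-selection step, not the paper's exact procedure): on a hold-out set,
repeatedly add, with replacement, whichever library model most improves the
averaged ensemble.

import numpy as np
from sklearn.metrics import accuracy_score

def ensemble_select(proba_library, y_valid, n_steps=10):
    # proba_library: list of (n_samples, n_classes) probability predictions
    # of the library models on a hold-out set; y_valid: integer class labels
    selected = []                                  # chosen model indices (repeats allowed)
    ensemble_sum = np.zeros_like(proba_library[0])
    for _ in range(n_steps):
        best_idx, best_acc = None, -1.0
        for idx, proba in enumerate(proba_library):
            # hold-out accuracy of the ensemble if this model were added
            candidate = (ensemble_sum + proba) / (len(selected) + 1)
            acc = accuracy_score(y_valid, candidate.argmax(axis=1))
            if acc > best_acc:
                best_idx, best_acc = idx, acc
        selected.append(best_idx)
        ensemble_sum += proba_library[best_idx]
    return selected, ensemble_sum / len(selected)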