Hello,

On 27-06-2016 12:37, Joel Nothman wrote:
Hi Hugo,

Andrew's approach -- using a list of dicts to specify multiple parameter
grids -- is the correct one.

However, Andrew, you don't need to include parameters that will be
ignored in your parameter grid. The following will be effectively the
same:

params = [
    {'kernel': ['poly'], 'degree': [1, 2, 3], 'gamma': [1/p, 1, 2], 'coef0': [-1, 0, 1]},
    {'kernel': ['rbf'], 'gamma': [1/p, 1, 2]},
    {'kernel': ['sigmoid'], 'gamma': [1/p, 1, 2], 'coef0': [-1, 0, 1]}]
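
One way to check that the two grids are effectively the same is to count the combinations each expands to. The sketch below does this in plain Python with a hypothetical `expand` helper (it does not call scikit-learn), and assumes `p = 4` purely for illustration, since `p` is not defined in the thread:

```python
from itertools import product


def expand(grids):
    """Yield one settings dict per combination, grid by grid (hypothetical helper)."""
    for grid in grids:
        keys = sorted(grid)
        for values in product(*(grid[k] for k in keys)):
            yield dict(zip(keys, values))


p = 4  # assumption: e.g. the number of features, so that 1/p is a gamma value

# The longer grid, with single-valued entries for parameters a kernel ignores.
full = [
    {'kernel': ['poly'], 'degree': [1, 2, 3], 'gamma': [1/p, 1, 2], 'coef0': [-1, 0, 1]},
    {'kernel': ['rbf'], 'gamma': [1/p, 1, 2], 'degree': [3], 'coef0': [0]},
    {'kernel': ['sigmoid'], 'gamma': [1/p, 1, 2], 'coef0': [-1, 0, 1], 'degree': [3]},
]

# The shorter grid, with the ignored parameters left out.
trimmed = [
    {'kernel': ['poly'], 'degree': [1, 2, 3], 'gamma': [1/p, 1, 2], 'coef0': [-1, 0, 1]},
    {'kernel': ['rbf'], 'gamma': [1/p, 1, 2]},
    {'kernel': ['sigmoid'], 'gamma': [1/p, 1, 2], 'coef0': [-1, 0, 1]},
]

n_full = len(list(expand(full)))
n_trimmed = len(list(expand(trimmed)))
print(n_full, n_trimmed)
```

Both grids yield 39 candidates (27 for poly, 3 for rbf, 9 for sigmoid). The single-valued entries in the longer grid add keys but no extra combinations.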


I tried to do this but am getting errors. It seems I need to use the
'metric_params' parameter, but I cannot get it right. Here are some of the attempts I made:

{'metric': ['wminkowski'], 'metric_params':[{ 'w': [0.01, 0.1, 1, 10, 100], 'p': [1,2,3,4,5]}], 'n_neighbors': list(k_range), 'weights': weights, 'algorithm': algos, 'leaf_size': list(leaf_sizes) }

{'metric': ['wminkowski'], 'metric_params':[{ 'w': 0.01, 'p': 1}], 'n_neighbors': list(k_range), 'weights': weights, 'algorithm': algos, 'leaf_size': list(leaf_sizes) }

{'metric': ['wminkowski'], 'metric_params':[dict(w=0.01,p=1)], 'n_neighbors': list(k_range), 'weights': weights, 'algorithm': algos, 'leaf_size': list(leaf_sizes) }

The last two give me the following error:

Exception ignored in: 'sklearn.neighbors.dist_metrics.get_vec_ptr'
ValueError: Buffer has wrong number of dimensions (expected 1, got 0)

Can anyone see what I am doing wrong?
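
For what it is worth, the error message (dimensions "expected 1, got 0") suggests that `w` reached the metric as a scalar where a 1-D vector was expected. A minimal sketch, assuming `w` should hold one weight per feature and that each entry of the 'metric_params' list must itself be one complete dict. `n_features` and the candidate weight vectors are made up for illustration:

```python
from itertools import product

n_features = 4  # assumption: the number of columns in X

# Hypothetical candidate weight vectors, each with one weight per feature.
w_candidates = [
    [1.0] * n_features,
    [1.0, 0.5, 0.5, 0.1],
]
p_values = [1, 2, 3, 4, 5]

# Lists nested inside a metric_params dict are not expanded by the grid,
# so the combinations of w and p are enumerated by hand.
metric_params_candidates = [
    {'w': w, 'p': p} for w, p in product(w_candidates, p_values)
]

wminkowski_grid = {
    'metric': ['wminkowski'],
    'metric_params': metric_params_candidates,
    'n_neighbors': list(range(1, 31)),
}
print(len(metric_params_candidates))
```

A dict like `wminkowski_grid` could then serve as one entry of the param_grid list in place of the attempts above. The sketch does not check whether every value of 'algorithm' supports this metric.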

TIA,


Joel

On 27 June 2016 at 20:59, Andrew Howe <ahow...@gmail.com> wrote:

    I did something similar where I was using GridSearchCV over
    different kernel functions for SVM and not all kernel functions use
    the same parameters.  For example, the *degree* parameter is only
    used by the *poly* kernel.

    from sklearn import svm
    from sklearn import cross_validation
    from sklearn import grid_search

    params = [
        {'kernel': ['poly'], 'degree': [1, 2, 3], 'gamma': [1/p, 1, 2], 'coef0': [-1, 0, 1]},
        {'kernel': ['rbf'], 'gamma': [1/p, 1, 2], 'degree': [3], 'coef0': [0]},
        {'kernel': ['sigmoid'], 'gamma': [1/p, 1, 2], 'coef0': [-1, 0, 1], 'degree': [3]}]
    GSC = grid_search.GridSearchCV(estimator=svm.SVC(), param_grid=params,
                                   cv=cvrand, n_jobs=-1)

    This worked in this instance because the svm.SVC() object only
    passes parameters to the kernel functions as needed.

    Hence, even though my list of dicts includes all three parameters
    for all types of kernels I used, they were selectively ignored.  I'm
    not sure about parameters for the distance metrics for the KNN
    object, but it's a good bet it works the same way.

    Andrew

    <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
    J. Andrew Howe, PhD
    Editor-in-Chief, European Journal of Mathematical Sciences
    Executive Editor, European Journal of Pure and Applied Mathematics
    www.andrewhowe.com <http://www.andrewhowe.com>
    http://www.linkedin.com/in/ahowe42
    https://www.researchgate.net/profile/John_Howe12/
    I live to learn, so I can learn to live. - me
    <~~~~~~~~~~~~~~~~~~~~~~~~~~~>

    On Mon, Jun 27, 2016 at 1:27 PM, Hugo Ferreira <h...@inesctec.pt> wrote:

        Hello,

        I have posted this question on Stack Overflow and did not get an
        answer. It seems to be a basic usage question, so I am sending
        it here.

        I have the following code snippet, which attempts a grid search
        in which one of the grid parameters is the distance metric to
        be used for the KNN algorithm. The example below fails if I use
        the "wminkowski", "seuclidean" or "mahalanobis" distance metrics.

        # Define the parameter values that should be searched
        k_range    = range(1,31)
        weights    = ['uniform' , 'distance']
        algos      = ['auto', 'ball_tree', 'kd_tree', 'brute']
        leaf_sizes = range(10, 60, 10)
        metrics = ["euclidean", "manhattan", "chebyshev", "minkowski",
        "mahalanobis"]

        param_grid = dict(n_neighbors = list(k_range), weights =
        weights, algorithm = algos, leaf_size = list(leaf_sizes),
        metric=metrics)
        param_grid

        # Instantiate the algorithm
        knn = KNeighborsClassifier(n_neighbors=10)

        # Instantiate the grid
        grid = GridSearchCV(knn, param_grid=param_grid, cv=10,
        scoring='accuracy', n_jobs=-1)

        # Fit the models using the grid parameters
        grid.fit(X,y)

        I assume this is because I have to set or define the ranges for
        the various distance parameters (for example p and w for
        "wminkowski", i.e. WMinkowskiDistance). The "minkowski" distance
        may be working because its "p" parameter defaults to 2.

        So my questions are:

        1. Can we set the range of parameters for the distance metrics
        for the grid search and if so how?
        2. Can we set the value of a parameter for the distance metrics
        for the grid search and if so how?
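
Both questions can be sketched with a list of grids, one per group of metrics. This is an illustration under stated assumptions rather than a tested recipe: it supposes that a range for "minkowski" can be searched through the estimator's own `p` parameter, and that "seuclidean" accepts a fixed per-feature variance vector `V` via 'metric_params'. The toy `X` is made up:

```python
from statistics import variance

# Toy data standing in for X (assumption: rows are samples, columns are features).
X = [[1.0, 10.0], [2.0, 14.0], [4.0, 12.0], [5.0, 20.0]]

# Per-feature sample variances, used as the fixed 'V' vector for "seuclidean".
V = [variance(column) for column in zip(*X)]

shared = {'n_neighbors': list(range(1, 31)), 'weights': ['uniform', 'distance']}

param_grid = [
    # Metrics that need no extra arguments can share one grid.
    dict(shared, metric=['euclidean', 'manhattan', 'chebyshev']),
    # Question 1: a range of values, here the Minkowski exponent p.
    dict(shared, metric=['minkowski'], p=[1, 2, 3]),
    # Question 2: a single fixed value, passed as a one-element list.
    dict(shared, metric=['seuclidean'], metric_params=[{'V': V}]),
]
print(len(param_grid), V)
```

Each dict in the list is searched separately, so a parameter only appears alongside the metrics that use it.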

        Hope the question is clear.
        TIA
        _______________________________________________
        scikit-learn mailing list
        scikit-learn@python.org
        https://mail.python.org/mailman/listinfo/scikit-learn


