Re: [Scikit-learn-general] Data format
Why do you want to convert libsvm to another structure? I don't quite get it. If you want to use the examples: scikit-learn includes datasets that can be loaded directly. I think this section should help: http://scikit-learn.org/stable/datasets/index.html

On 08.03.2013 18:44, Mohamed Radhouane Aniba wrote:
> Hello!
> I am wondering if someone has developed a snippet or a script that converts libsvm format into a format directly usable by scikit, without the need to use load_svmlight_file. The reason is that I am trying to use the examples provided on the website, but all of them are written in a format that is not a libsvm one.
> Thanks
> Rad

--
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
Re: [Scikit-learn-general] Data format
Simply because I am new to both Python and scikit (coming from the R world). The problem is that I tried using load_svmlight_file, in particular with the RBF parameters example http://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html#example-svm-plot-rbf-parameters-py and I got a lot of problems and errors, so I thought it would be easier to use the native format, which preserves the structure of the variables used. What I tested in this script in particular is something like:

    #iris = load_iris()
    #X = iris.data
    #Y = iris.target
    X, Y = load_svmlight_file(toto)
    X = X.toarray()
    # dataset for decision function visualization
    X_2d = X[:, :2]
    X_2d = X_2d[Y > 0]
    Y_2d = Y[Y > 0]
    Y_2d -= 1

What does the _2d section mean? It is not very clear.

Rad

On Mar 8, 2013, at 1:21 PM, Philipp Singer kill...@gmail.com wrote:
> Why do you want to convert libsvm to another structure? I don't quite get it. If you want to use the examples: scikit-learn includes datasets that can be loaded directly. I think this section should help: http://scikit-learn.org/stable/datasets/index.html
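A minimal sketch of how a libsvm file ends up in the same X/y layout that load_iris provides, which is what the examples on the website expect. The three sample rows are copied from the data posted later in this thread; reading them from an in-memory io.BytesIO object and passing n_features=6 are assumptions made for illustration, not part of the original script.

```python
import io

from sklearn.datasets import load_iris, load_svmlight_file

# Three rows in libsvm format: "<label> <index>:<value> ..." with 1-based indices.
sample = (
    b"-1 1:0.0256992 2:0.89 3:16.2094 4:3.17376 5:1.03704 6:0.161745\n"
    b"1 1:0.0908366 2:10.2772 3:8.25109 4:31.2472 5:47.3532 6:0.163662\n"
    b"-1 1:0.00695965 2:0 3:0.89 4:0.88 5:1.23188 6:0\n"
)

# load_svmlight_file accepts a path or a binary file-like object and returns
# a SciPy sparse matrix together with a NumPy array of labels.
X_sparse, y = load_svmlight_file(io.BytesIO(sample), n_features=6)

# Densify so that X behaves like iris.data in the website examples.
X = X_sparse.toarray()

iris = load_iris()
print(type(X), X.shape, y)                 # dense array with 3 rows and 6 columns
print(type(iris.data), iris.data.shape)    # dense array with 150 rows and 4 columns
```

After the toarray() call, X and y can be dropped into an example in place of iris.data and iris.target, as long as the rest of the example does not depend on iris-specific details such as the number of features or the label values.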
Re: [Scikit-learn-general] Data format
That uses the Boolean indexing feature of NumPy arrays, IIRC.

On Mar 8, 2013 1:28 PM, Mohamed Radhouane Aniba arad...@gmail.com wrote:
> What does the _2d section mean? It is not very clear.
Re: [Scikit-learn-general] Data format
Suppose you have

    x = np.array([1, 2, 3, 4])

Then x > 2 evaluates to [False, False, True, True]. Using boolean indexing, x[x > 2] gives [3, 4].

--
Flavio

On Fri, Mar 8, 2013 at 4:41 PM, Ronnie Ghose ronnie.gh...@gmail.com wrote:
> That uses the Boolean indexing feature of NumPy arrays, IIRC.
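The same idea applied to the _2d lines from the script can be sketched as follows. The arrays here are toy stand-ins (hypothetical values, not the poster's data), with iris-like labels 0, 1 and 2. The last two lines illustrate a likely pitfall when the labels are -1 and +1 instead: the Y > 0 filter then keeps only one class.

```python
import numpy as np

# Toy stand-ins: 6 samples, 3 features, iris-like labels 0/1/2.
X = np.arange(18, dtype=float).reshape(6, 3)
Y = np.array([0, 1, 2, 0, 1, 2])

X_2d = X[:, :2]      # all rows, first two columns only
mask = Y > 0         # boolean array, True where the label is 1 or 2
X_2d = X_2d[mask]    # keep only the rows where mask is True
Y_2d = Y[mask]       # boolean indexing returns a copy, so Y itself is unchanged
Y_2d -= 1            # relabel {1, 2} as {0, 1}

print(mask)
print(X_2d.shape)
print(Y_2d)

# With labels in {-1, +1}, the same filter leaves a single class behind.
Y_pm = np.array([-1, -1, 1, -1, 1, -1])
print(np.unique(Y_pm[Y_pm > 0]))
```

In other words, the _2d block builds a smaller two-feature, two-class copy of the data for plotting. The filter and the relabeling are tied to the iris label values, so they probably need adjusting or removing for data labeled differently.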
Re: [Scikit-learn-general] Data format
Thank you guys, it makes more sense now. I slightly changed the code to fit my data (I have 6 features), and then I got an error message saying:

    File "plot_rbf_parameters.py", line 109, in <module>
      Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
    File "/Library/Python/2.7/site-packages/scikit_learn-0.13.1-py2.7-macosx-10.8-intel.egg/sklearn/svm/base.py", line 365, in decision_function
      X = self._validate_for_predict(X)
    File "/Library/Python/2.7/site-packages/scikit_learn-0.13.1-py2.7-macosx-10.8-intel.egg/sklearn/svm/base.py", line 412, in _validate_for_predict
      (n_features, self.shape_fit_[1]))
    ValueError: X.shape[1] = 2 should be equal to 6, the number of features at training time

around this line:

    Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])

How can I make X.shape[1] equal to 6 instead of 2? Sorry if that sounds like a newbie request.

Thanks
Rad

On Mar 8, 2013, at 2:54 PM, Flavio Vinicius flavio...@gmail.com wrote:
> Suppose you have x = np.array([1, 2, 3, 4]). Then x > 2 evaluates to [False, False, True, True]. Using boolean indexing, x[x > 2] gives [3, 4].
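A short sketch of where the "2" in the error message comes from. The grid bounds and resolution below are hypothetical; the point is only the shape of the array that is handed to decision_function.

```python
import numpy as np

# Hypothetical plotting grid over two axes (bounds and resolution are made up).
xx, yy = np.meshgrid(np.linspace(-5, 5, 200), np.linspace(-5, 5, 200))

# Each grid point becomes one row: (x coordinate, y coordinate).
grid = np.c_[xx.ravel(), yy.ravel()]

print(xx.shape)    # shape of one coordinate matrix
print(grid.shape)  # one row per grid point, always two columns
```

Because the grid describes points in a plane, it has two columns no matter how many features the training data had. A classifier evaluated on it therefore has to be one that was fitted on exactly two features.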
Re: [Scikit-learn-general] Data format
X_2d = X[:, :6] if the data is formatted correctly. It's rows by columns, and then slicing. The NumPy docs should help.

On Mar 8, 2013 3:12 PM, Mohamed Radhouane Aniba arad...@gmail.com wrote:
> ValueError: X.shape[1] = 2 should be equal to 6, the number of features at training time
> How can I make X.shape[1] equal to 6 instead of 2?
Re: [Scikit-learn-general] Data format
Ronnie,

This is exactly what I did, and that is what produces the error message saying:

    X.shape[1] = 2 should be equal to 6, the number of features at training time

The training completed successfully and the best parameters were printed successfully, but then I think there is a bug when rendering the plots. I might be missing something.

Rad

On Mar 8, 2013, at 3:17 PM, Ronnie Ghose ronnie.gh...@gmail.com wrote:
> X_2d = X[:, :6] if the data is formatted correctly. It's rows by columns, and then slicing. The NumPy docs should help.
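One possible way around the mismatch, sketched under assumptions: keep a six-feature model for the actual task, and fit a separate two-feature model only for the decision-surface picture. The random data, the choice of the first two columns, and the gamma and C values are all hypothetical; this is not taken from the original example script.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)

# Hypothetical stand-in for the poster's data: 40 samples, 6 features, labels -1/+1.
X = rng.randn(40, 6)
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

# Model for the actual task: fitted on all 6 features.
clf_full = SVC(kernel="rbf", gamma=0.1, C=1.0).fit(X, y)

# Model for the picture only: fitted on the two features being plotted.
X_2d = X[:, :2]
clf_2d = SVC(kernel="rbf", gamma=0.1, C=1.0).fit(X_2d, y)

# Two-column grid covering the plotting area.
xx, yy = np.meshgrid(np.linspace(-3, 3, 50), np.linspace(-3, 3, 50))
grid = np.c_[xx.ravel(), yy.ravel()]

# Works: the grid has 2 columns and clf_2d was fitted on 2 features.
Z = clf_2d.decision_function(grid).reshape(xx.shape)
print(Z.shape)

# Fails: the grid has 2 columns but clf_full was fitted on 6 features.
try:
    clf_full.decision_function(grid)
except ValueError as err:
    print("6-feature model on the 2-column grid:", err)
```

Seen this way, the error is probably not a plotting bug: slicing with X[:, :6] makes the visualization classifiers six-dimensional, while the plot can only show a two-dimensional surface.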
Re: [Scikit-learn-general] Data format
Could you by chance upload part of your data, if not all of it, or a representation or the like?

On Fri, Mar 8, 2013 at 3:21 PM, Mohamed Radhouane Aniba arad...@gmail.com wrote:
> Ronnie,
> This is exactly what I did, and that is what produces the error message saying: X.shape[1] = 2 should be equal to 6, the number of features at training time
> The training completed successfully and the best parameters were printed successfully, but then I think there is a bug when rendering the plots. I might be missing something.
> Rad
Re: [Scikit-learn-general] Data format
Sorry for the format, but this is what it looks like:

    -1 1:0.0256992 2:0.89 3:16.2094 4:3.17376 5:1.03704 6:0.161745
    -1 1:0.0382503 2:7.159 3:44.5586 4:65.4716 5:24.0289 6:0.168695
    1 1:0.0908366 2:10.2772 3:8.25109 4:31.2472 5:47.3532 6:0.163662
    -1 1:0.0158669 2:1.87153 3:8.5248 4:2.775 5:0.888333 6:0
    1 1:0.0322297 2:7.76297 3:32.3831 4:32.0085 5:15.8588 6:0.176949
    -1 1:0.026197 2:9.55476 3:16.128 4:64.2671 5:39.3876 6:0.161745
    -1 1:0.00695965 2:0 3:0.89 4:0.88 5:1.23188 6:0
    -1 1:0.00801151 2:1.21212 3:1.37105 4:0.925 5:0.846154 6:0
    1 1:0.0166757 2:113.734 3:36.022 4:25.009 5:7.25888 6:2.12453
    1 1:0.0338014 2:10.3112 3:67.5516 4:9.40092 5:4.40648 6:0.190324
    1 1:0.122874 2:5.31028 3:12.2217 4:46.5857 5:47.8841 6:0.213615
    1 1:0.0203669 2:42.8012 3:27.4512 4:48.2356 5:28.9404 6:0.609883
    1 1:0.172107 2:77.7534 3:4.19294 4:61.5518 5:61.5732 6:0.191141
    -1 1:0.0352764 2:22.364 3:5.66 4:62.521 5:68.556 6:0
    -1 1:0.0026989 2:0 3:0.89 4:1 5:0.870968 6:0
    -1 1:0.0116963 2:1.23077 3:1.58024 4:1.48475 5:1.38604 6:0
    -1 1:0.00663107 2:0.859779 3:4.45632 4:1.04706 5:0 6:0
    1 1:0.00396688 2:1.1296 3:13.006 4:4.16164 5:1.16139 6:0
    1 1:0.0426954 2:6.19684 3:34.5593 4:40.9415 5:11.607 6:0.238037
    -1 1:0.00768275 2:1.0659 3:14.1148 4:11.9666 5:3.9526 6:0

On Mar 8, 2013, at 3:24 PM, Ronnie Ghose ronnie.gh...@gmail.com wrote:
> Could you by chance upload part of your data, if not all of it, or a representation or the like?
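For the question that opened the thread, a converter from libsvm text to a dense array can be sketched with NumPy alone. The parse_libsvm helper below is hypothetical, written for illustration: it assumes 1-based feature indices, a known feature count, and no qid: fields. load_svmlight_file remains the supported route in scikit-learn.

```python
import numpy as np


def parse_libsvm(text, n_features):
    """Parse libsvm-formatted text into a dense (X, y) pair.

    Hypothetical helper for illustration; assumes 1-based indices and no qid fields.
    """
    rows, labels = [], []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()    # drop trailing comments and blank lines
        if not line:
            continue
        label, *pairs = line.split()
        row = np.zeros(n_features)              # features not listed stay at zero
        for pair in pairs:
            index, value = pair.split(":")
            row[int(index) - 1] = float(value)  # 1-based index -> 0-based column
        labels.append(float(label))
        rows.append(row)
    return np.vstack(rows), np.array(labels)


# Three rows copied from the data posted above.
sample = """\
-1 1:0.0256992 2:0.89 3:16.2094 4:3.17376 5:1.03704 6:0.161745
1 1:0.0908366 2:10.2772 3:8.25109 4:31.2472 5:47.3532 6:0.163662
-1 1:0.0158669 2:1.87153 3:8.5248 4:2.775 5:0.888333 6:0
"""

X, y = parse_libsvm(sample, n_features=6)
print(X.shape)
print(y)
```

The result has one row per sample and one column per feature, with the labels in a separate array, which is the same layout as iris.data and iris.target in the website examples.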