Re: [Scikit-learn-general] Data format

2013-03-08 Thread Philipp Singer

Why do you want to convert libsvm to another structure?

I don't quite get it.

If you want to run the examples: scikit-learn ships with datasets that can 
be loaded directly. I think this section should help:
http://scikit-learn.org/stable/datasets/index.html

On 08.03.2013 18:44, Mohamed Radhouane Aniba wrote:
 Hello!

 I am wondering if someone has developed a snippet or a script that converts 
 libsvm format into a format directly usable by scikit-learn, without the need 
 to use load_svmlight_file.

 The reason is that I am trying to use the examples provided on the website, 
 but all of them load their data in a format other than libsvm.

 Thanks

 Rad
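
For readers who want to see what such a conversion involves, here is a minimal sketch in plain Python. It is not part of scikit-learn; the helper names parse_libsvm_line and parse_libsvm are hypothetical, and the sketch assumes 1-based feature indices and a known number of features. In practice load_svmlight_file already does this job and returns a sparse matrix.

```python
def parse_libsvm_line(line, n_features):
    """Parse one 'label index:value ...' line into (label, dense_row)."""
    parts = line.split()
    label = float(parts[0])
    row = [0.0] * n_features          # features that are not listed stay at 0
    for item in parts[1:]:
        index, value = item.split(":")
        row[int(index) - 1] = float(value)  # assumes 1-based libsvm indices
    return label, row


def parse_libsvm(text, n_features):
    """Parse several lines; returns (X, y) as plain Python lists."""
    X, y = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                  # skip blank lines and comment lines
        label, row = parse_libsvm_line(line, n_features)
        X.append(row)
        y.append(label)
    return X, y


# Hypothetical two-line input, only for illustration.
sample = "1 1:0.5 3:2.0\n-1 2:1.5\n"
X, y = parse_libsvm(sample, n_features=3)
print(X)  # [[0.5, 0.0, 2.0], [0.0, 1.5, 0.0]]
print(y)  # [1.0, -1.0]
```

The result is an ordinary rows-by-columns table, which is the layout the website examples expect once it is wrapped in a NumPy array.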




 --
 Symantec Endpoint Protection 12 positioned as A LEADER in The Forrester
 Wave(TM): Endpoint Security, Q1 2013 and remains a good choice in the
 endpoint security space. For insight on selecting the right partner to
 tackle endpoint security challenges, access the full report.
 http://p.sf.net/sfu/symantec-dev2dev
 ___
 Scikit-learn-general mailing list
 Scikit-learn-general@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/scikit-learn-general



Re: [Scikit-learn-general] Data format

2013-03-08 Thread Mohamed Radhouane Aniba

Simply because I am new to both Python and scikit-learn (I come from the R world).

The problem is that I tried using load_svmlight_file with, in particular, the RBF 
parameters example 
http://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html#example-svm-plot-rbf-parameters-py

and I got a lot of errors, so I thought it would be easier to use the native 
format, which preserves the structure of the variables used.

What I tested in this script, in particular, is something like:

#iris = load_iris()
#X = iris.data
#Y = iris.target
X, Y = load_svmlight_file(toto)
X = X.toarray()

# dataset for decision function visualization

X_2d = X[:, :2]
X_2d = X_2d[Y > 0]
Y_2d = Y[Y > 0]
Y_2d -= 1

What does the _2d section mean? It is not very clear to me.
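
A small walk-through of what those four _2d lines do may help here. This is only a sketch on hypothetical toy data (5 samples, 4 features, labels 0, 1 and 2), not the data from the example itself.

```python
import numpy as np

# Hypothetical toy data: 5 samples, 4 features, labels in {0, 1, 2}.
X = np.arange(20, dtype=float).reshape(5, 4)
Y = np.array([0, 1, 2, 1, 0])

X_2d = X[:, :2]      # all rows, first two columns only
mask = Y > 0         # boolean array, True where the label is 1 or 2
X_2d = X_2d[mask]    # keep only the rows whose label is > 0
Y_2d = Y[mask]       # boolean indexing returns a copy, so Y is unchanged
Y_2d -= 1            # relabel {1, 2} as {0, 1}

print(X_2d.shape)    # (3, 2)
print(Y_2d)          # [0 1 0]
```

So the block builds a two-feature, two-class subset that can be drawn on a plane. If the labels were -1 and 1 instead, the same Y > 0 filter would keep only the positive class, so that step would probably need adapting.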

Rad


Re: [Scikit-learn-general] Data format

2013-03-08 Thread Ronnie Ghose

That uses the boolean indexing feature of NumPy arrays, IIRC.

Re: [Scikit-learn-general] Data format

2013-03-08 Thread Flavio Vinicius

Suppose you have

x = np.array([1, 2, 3, 4])

Then

x > 2 gives [False, False, True, True]

Using boolean indexing

x[x > 2] gives [3, 4]
--
Flavio



Re: [Scikit-learn-general] Data format

2013-03-08 Thread Mohamed Radhouane Aniba

Thank you guys, it makes more sense now.

I slightly changed the code to fit my data (I have 6 features).

I then got an error message saying:

  File "plot_rbf_parameters.py", line 109, in <module>
    Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
  File "/Library/Python/2.7/site-packages/scikit_learn-0.13.1-py2.7-macosx-10.8-intel.egg/sklearn/svm/base.py", line 365, in decision_function
    X = self._validate_for_predict(X)
  File "/Library/Python/2.7/site-packages/scikit_learn-0.13.1-py2.7-macosx-10.8-intel.egg/sklearn/svm/base.py", line 412, in _validate_for_predict
    (n_features, self.shape_fit_[1]))
ValueError: X.shape[1] = 2 should be equal to 6, the number of features at training time


around this line:

Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])

How can I make X.shape[1] equal to 6 instead of 2?

Sorry if that sounds like a newbie question.

Thanks

Rad
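
The traceback suggests that the classifier was fit on 6 columns while the plotting grid built from xx and yy has only 2. One way out is to fit the classifier used for plotting on two columns only, which is what the example appears to do with X_2d. Another, sketched below with NumPy only, is to pad the grid with fixed values for the remaining features. The random X, the 0 to 1 grid range and the choice of column means as fill values are all assumptions made for illustration.

```python
import numpy as np

n_features = 6                          # as in the error message
rng = np.random.RandomState(0)
X = rng.rand(20, n_features)            # hypothetical stand-in for the training data

# 2-D mesh over the first two features, as the plotting code builds it
xx, yy = np.meshgrid(np.linspace(0, 1, 5), np.linspace(0, 1, 5))
grid_2d = np.c_[xx.ravel(), yy.ravel()]  # shape (25, 2): what triggered the error

# Pad the other four columns with their means over the training data
fill = np.tile(X[:, 2:].mean(axis=0), (grid_2d.shape[0], 1))
grid_6d = np.hstack([grid_2d, fill])     # shape (25, 6): matches the fitted model

print(grid_2d.shape, grid_6d.shape)      # (25, 2) (25, 6)
```

With the padded grid, the resulting plot would show a slice of the decision function through the mean of the other four features, not the full 6-dimensional picture.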


Re: [Scikit-learn-general] Data format

2013-03-08 Thread Ronnie Ghose

X_2d = X[:, :6] if the data is formatted correctly. The indexing is rows by
columns, followed by slicing. The NumPy docs should help.

Re: [Scikit-learn-general] Data format

2013-03-08 Thread Mohamed Radhouane Aniba

Ronnie,

This is exactly what I did, and that is what leads to the error message saying 
X.shape[1] = 2 should be equal to 6, the number of features at training time

The training completed successfully, and the best parameters were printed successfully,

but then I think there is a bug when rendering the plots.

I might be missing something.

Rad


Re: [Scikit-learn-general] Data format

2013-03-08 Thread Ronnie Ghose

Could you by chance upload part of your data, if not all of it, or a
representation of it or the like?



Re: [Scikit-learn-general] Data format

2013-03-08 Thread Mohamed Radhouane Aniba

Sorry for the format, but this is what it looks like:


-1 1:0.0256992 2:0.89 3:16.2094 4:3.17376 5:1.03704 6:0.161745
-1 1:0.0382503 2:7.159 3:44.5586 4:65.4716 5:24.0289 6:0.168695
1 1:0.0908366 2:10.2772 3:8.25109 4:31.2472 5:47.3532 6:0.163662
-1 1:0.0158669 2:1.87153 3:8.5248 4:2.775 5:0.888333 6:0
1 1:0.0322297 2:7.76297 3:32.3831 4:32.0085 5:15.8588 6:0.176949
-1 1:0.026197 2:9.55476 3:16.128 4:64.2671 5:39.3876 6:0.161745
-1 1:0.00695965 2:0 3:0.89 4:0.88 5:1.23188 6:0
-1 1:0.00801151 2:1.21212 3:1.37105 4:0.925 5:0.846154 6:0
1 1:0.0166757 2:113.734 3:36.022 4:25.009 5:7.25888 6:2.12453
1 1:0.0338014 2:10.3112 3:67.5516 4:9.40092 5:4.40648 6:0.190324
1 1:0.122874 2:5.31028 3:12.2217 4:46.5857 5:47.8841 6:0.213615
1 1:0.0203669 2:42.8012 3:27.4512 4:48.2356 5:28.9404 6:0.609883
1 1:0.172107 2:77.7534 3:4.19294 4:61.5518 5:61.5732 6:0.191141
-1 1:0.0352764 2:22.364 3:5.66 4:62.521 5:68.556 6:0
-1 1:0.0026989 2:0 3:0.89 4:1 5:0.870968 6:0
-1 1:0.0116963 2:1.23077 3:1.58024 4:1.48475 5:1.38604 6:0
-1 1:0.00663107 2:0.859779 3:4.45632 4:1.04706 5:0 6:0
1 1:0.00396688 2:1.1296 3:13.006 4:4.16164 5:1.16139 6:0
1 1:0.0426954 2:6.19684 3:34.5593 4:40.9415 5:11.607 6:0.238037
-1 1:0.00768275 2:1.0659 3:14.1148 4:11.9666 5:3.9526 6:0
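
For reference, rows in this layout (label first, then index:value pairs) can be read with load_svmlight_file and turned into a dense array. The sketch below is only an illustration: it copies the first three rows of the sample above into an in-memory buffer instead of reading a file, and it passes n_features=6 on the assumption that the data really has six features.

```python
import io

from sklearn.datasets import load_svmlight_file

# First three rows of the sample above, kept inline so the sketch is self-contained.
sample = b"""-1 1:0.0256992 2:0.89 3:16.2094 4:3.17376 5:1.03704 6:0.161745
-1 1:0.0382503 2:7.159 3:44.5586 4:65.4716 5:24.0289 6:0.168695
1 1:0.0908366 2:10.2772 3:8.25109 4:31.2472 5:47.3532 6:0.163662
"""

# load_svmlight_file accepts a path or a binary file-like object.
# n_features=6 is an assumption about the data; it fixes the number of columns.
X_sparse, y = load_svmlight_file(io.BytesIO(sample), n_features=6)

# The loader returns a SciPy sparse matrix; toarray() gives a dense NumPy array,
# which is the form the website examples work with.
X = X_sparse.toarray()

print(X.shape)
print(y)
```

After this step X and y play the same role as iris.data and iris.target in the website examples.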


On Mar 8, 2013, at 3:24 PM, Ronnie Ghose ronnie.gh...@gmail.com wrote:

 could you by chance upload a part of your data if not all of it / a 
 representation or the like?
 
 
 On Fri, Mar 8, 2013 at 3:21 PM, Mohamed Radhouane Aniba arad...@gmail.com 
 wrote:
 Ronnie,
 
 This is exactly what I did and that's what shows in the error message saying 
 X.shape[1] = 2 should be equal to 6, the number of features at training time
 
 The training completed successfully, and the best parameters were 
 printed successfully,
 
 but then I think there is a bug when rendering the plots.
 
 I might be missing something.
 
 Rad
 
 On Mar 8, 2013, at 3:17 PM, Ronnie Ghose ronnie.gh...@gmail.com wrote:
 
 X_2d = X[:, :6] if the data is formatted correctly. It's rows by columns, 
 and then slicing. The numpy docs should help.
 
 On Mar 8, 2013 3:12 PM, Mohamed Radhouane Aniba arad...@gmail.com wrote:
 Thank you guys it makes more sense now.
 
 I slightly changed the code to fit my data (I have 6 features).
 
 I then got an error message saying:
 
   File "plot_rbf_parameters.py", line 109, in <module>
     Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
   File "/Library/Python/2.7/site-packages/scikit_learn-0.13.1-py2.7-macosx-10.8-intel.egg/sklearn/svm/base.py", line 365, in decision_function
     X = self._validate_for_predict(X)
   File "/Library/Python/2.7/site-packages/scikit_learn-0.13.1-py2.7-macosx-10.8-intel.egg/sklearn/svm/base.py", line 412, in _validate_for_predict
     (n_features, self.shape_fit_[1]))
 ValueError: X.shape[1] = 2 should be equal to 6, the number of features at training time
 
 
 around that line :
 
 Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
 
 How can I make X.shape[1] = 6 instead of 2?
 
 Sorry if that sounds like a newbie request.
 
 Thanks
 
 Rad
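
A note on the error above: the plotting code evaluates the classifier on a two-column grid built from xx and yy, so the classifier used for the plot has to be trained on two features as well. The sketch below illustrates this with random stand-in data (not the poster's data); the names clf_full and clf_2d and the gamma and C values are made up for the illustration.

```python
import numpy as np
from sklearn.svm import SVC

# Illustrative random data with 6 features; this is NOT the poster's dataset.
rng = np.random.RandomState(0)
X = rng.rand(40, 6)
y = np.where(X[:, 0] + X[:, 1] > 1.0, 1, -1)

# Two-feature view of the same samples, used only for the 2-D plot.
X_2d = X[:, :2]

# Hypothetical hyperparameters, chosen only so the sketch runs.
clf_full = SVC(kernel="rbf", gamma=1.0, C=1.0).fit(X, y)    # expects 6 columns
clf_2d = SVC(kernel="rbf", gamma=1.0, C=1.0).fit(X_2d, y)   # expects 2 columns

# The plotting grid always has exactly two columns, one per plot axis.
xx, yy = np.meshgrid(np.linspace(0, 1, 20), np.linspace(0, 1, 20))
grid = np.c_[xx.ravel(), yy.ravel()]

# Works: the grid width matches the number of features clf_2d was trained on.
Z = clf_2d.decision_function(grid).reshape(xx.shape)

# clf_full.decision_function(grid) would raise a ValueError, because
# clf_full was trained on 6 features and the grid only has 2.
print(grid.shape, Z.shape)
```

In other words, the grid cannot be widened to six columns in any meaningful way; a common approach is to run the parameter search on all six features and keep a separate two-feature classifier just for drawing the decision surface.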
 
 On Mar 8, 2013, at 2:54 PM, Flavio Vinicius flavio...@gmail.com wrote:
 
  Suppose you have
 
  x = np.array([1, 2, 3, 4])
 
  Then
 
  x > 2 = [False, False, True, True]
 
  Using boolean indexing
 
  x[x > 2] = [3, 4]
  --
  Flavio
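
The boolean indexing described in this message can be run as a short NumPy snippet. The variable names mask and selected are only illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4])

# Comparing an array with a scalar gives a boolean array of the same shape.
mask = x > 2

# Indexing with that boolean array keeps only the positions where it is True.
selected = x[mask]

print(mask)
print(selected)
```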
 
 
  On Fri, Mar 8, 2013 at 4:41 PM, Ronnie Ghose ronnie.gh...@gmail.com 
  wrote:
  That uses the Boolean indexing function of numpy arrays iirc
 
  On Mar 8, 2013 1:28 PM, Mohamed Radhouane Aniba arad...@gmail.com 
  wrote:
 
  Simply because I am new to both python and scikit (Coming from R world)
 
  The problem is that I tried using load_svmlight_file with one example
  in particular, the RBF parameters example:
  http://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html#example-svm-plot-rbf-parameters-py
 
  I got a lot of errors, so I thought it would be easier to use the
  native format, which preserves the structure of the variables used.
 
  What I tested in this script in particular is something like:
 
  #iris = load_iris()
  #X = iris.data
  #Y = iris.target
  X, Y = load_svmlight_file("toto")
  X = X.toarray()
 
  # dataset for decision function visualization
 
  X_2d = X[:, :2]
  X_2d = X_2d[Y > 0]
  Y_2d = Y[Y > 0]
  Y_2d -= 1
 
  What does the _2d section mean? It is not very clear.
 
  Rad
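
On the _2d question: in the website example these lines keep the first two feature columns (so the result can be drawn in a plane) and drop class 0 of iris so that a two-class problem remains. The sketch below uses small made-up arrays to show what the same lines do with iris-style labels and with -1/+1 labels like those in a libsvm file; the names Y_iris_like and Y_pm are illustrative.

```python
import numpy as np

# Hypothetical stand-in data: 5 samples, 6 features.
X = np.arange(30, dtype=float).reshape(5, 6)
Y_iris_like = np.array([0, 1, 2, 1, 0])   # three classes, as in iris
Y_pm = np.array([-1, 1, 1, -1, -1])       # two classes coded as -1 and +1

# Step 1: keep only the first two feature columns (hence "_2d").
X_2d = X[:, :2]

# Step 2 with iris-style labels: drop class 0, then shift 1, 2 down to 0, 1.
Y_2d_iris = Y_iris_like[Y_iris_like > 0] - 1

# Same filter with -1/+1 labels: only the +1 rows survive, so one class is left.
Y_2d_pm = Y_pm[Y_pm > 0] - 1

print(X_2d.shape)
print(np.unique(Y_2d_iris))
print(np.unique(Y_2d_pm))
```

With labels that are already binary, the Y > 0 filter and the Y_2d -= 1 shift are probably not wanted, since they leave a single class; keeping all rows and only slicing the columns may be closer to the intent.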
 
  On Mar 8, 2013, at 1:21 PM, Philipp Singer kill...@gmail.com wrote:
 
  Why do you want to convert libsvm to another structure?
 
  I don't quite get it.
 
  If you want to use examples: scikit learn has included datasets that can
  be directly loaded. I think this section should help:
  http://scikit-learn.org/stable/datasets/index.html
 
  On 08.03.2013 at 18:44, Mohamed Radhouane Aniba wrote:
  Hello !
 
  I am wondering if someone has developed a snippet or a script that
  converts libsvm format into a format directly usable by scikit,
  without the need to use load_svmlight_file.
 
  The reason is that I am trying to use the examples provided on the
  website, but all of them
