Re: [scikit-learn] Failed when make PDF document by "make latexpdf"

2016-12-19 Thread James Chang
Hi Loïc,

   thank you, finally I got the PDF file.

Thanks and best regards,
James

2016-12-20 15:29 GMT+08:00 Loïc Estève via scikit-learn <scikit-learn@python.org>:

> Hi,
>
> you can get the PDF documentation from the website, see attached
> screenshot.
>
> Cheers,
> Loïc
>
>
> On 12/20/2016 07:38 AM, James Chang wrote:
>
>> Hi,
>>
>>   Does anyone have an issue when executing "make latexpdf" to build the
>> PDF documentation?
>>
>>   Or can I directly download the latest PDF documentation for the current
>> stable version, scikit-learn v0.18.1, from somewhere?
>>
>> PS.
>> I ran the command under Mac OS X 10.12.1
>>
>> Thanks in advance and best regards,
>> James
>>
___
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn


[scikit-learn] Failed when make PDF document by "make latexpdf"

2016-12-19 Thread James Chang
Hi,

  Does anyone have an issue when executing "make latexpdf" to build the
PDF documentation?

  Or can I directly download the latest PDF documentation for the current
stable version, scikit-learn v0.18.1, from somewhere?

PS.
I ran the command under Mac OS X 10.12.1

Thanks in advance and best regards,
James


Re: [scikit-learn] combining arrays of features to train an MLP

2016-12-19 Thread Thomas Evangelidis
Thank you, these articles discuss ML applications of the types of
fingerprints I am working with! I will read them thoroughly to get some hints.

In the meantime I tried to eliminate some features using RandomizedLasso,
and the performance improved from R=0.067 using all 615 features to
R=0.524 using only the 15 top-ranked features. Naive question: does it make
sense to use RandomizedLasso to select good features for training an MLP? I
had the impression that RandomizedLasso uses multivariate linear regression
to fit the predicted values to the experimental ones and ranks the features
accordingly.

Another question: this dataset consists of 31 observations. The Pearson's R
values that I reported above were calculated using cross-validation. Could
someone claim that they are inaccurate because the number of features used
for training the MLP is much larger than the number of observations?
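
A linear, sparsity-based selector feeding a nonlinear MLP is a workable
pattern. RandomizedLasso was removed from later scikit-learn releases, so the
sketch below substitutes SelectFromModel with a plain Lasso as a stand-in;
the shapes mirror the thread (31 samples, 615 features, 15 kept), while the
alpha value and the MLP settings are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the thread's data: 31 observations, 615 features.
X, y = make_regression(n_samples=31, n_features=615, n_informative=15,
                       noise=0.1, random_state=0)

# Keep at most the 15 features ranked highest by an L1-penalized linear fit,
# then train the MLP only on that reduced matrix.
pipe = make_pipeline(
    SelectFromModel(Lasso(alpha=0.1), max_features=15),
    MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
)
pipe.fit(X, y)
print(pipe.predict(X[:2]).shape)  # one prediction per sample
```

Whether the linearly selected features are also the best ones for a nonlinear
model is an empirical question; cross-validating the whole pipeline (selector
inside, not fit on the full data beforehand) avoids selection bias.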


On 19 December 2016 at 23:42, Sebastian Raschka wrote:

> Oh, sorry, I just noticed that I was in the wrong thread — meant to answer
> a different Thomas :P.
>
> Regarding the fingerprints: scikit-learn’s estimators expect feature
> vectors as samples, so you can’t have a 3D array … e.g., think of image
> classification: there you also unroll the n_pixels times m_pixels array
> into 1D arrays.
>
> The low performance can have multiple causes. In case dimensionality is an
> issue, I’d maybe try stronger regularization first, or feature selection.
> If you are working with molecular structures, and you have enough of them,
> maybe also consider alternative feature representations, e.g., learning
> from the graphs directly:
>
> http://papers.nips.cc/paper/5954-convolutional-networks-
> on-graphs-for-learning-molecular-fingerprints.pdf
> http://pubs.acs.org/doi/abs/10.1021/ci400187y
>
> Best,
> Sebastian

Re: [scikit-learn] n_jobs for LogisticRegression

2016-12-19 Thread Sebastian Raschka
Thanks, Tom, that makes sense. Submitted a PR to fix that.

Best,
Sebastian

> On Dec 19, 2016, at 10:14 AM, Tom DLT  wrote:
> 
> Hi,
> 
> In LogisticRegression, n_jobs is only used for one-vs-rest parallelization.
> In LogisticRegressionCV, n_jobs is used for both one-vs-rest and 
> cross-validation parallelizations.
> 
> So in LogisticRegression with multi_class='multinomial', n_jobs should have 
> no impact.
> 
> The docstring should probably be updated as you mentioned. PR welcome :)
> 
> Best,
> Tom
> 
> 2016-12-19 6:13 GMT+01:00 Sebastian Raschka :
> Hi,
> 
> I just got confused about what exactly n_jobs does for LogisticRegression. I 
> always thought that it was used for one-vs-rest learning, fitting the models 
> for binary classification in parallel. However, it also seems to do something 
> in the multinomial case (at least according to the verbose option). In the 
> docstring it says
> 
> > n_jobs : int, optional
> > Number of CPU cores used during the cross-validation loop. If given
> > a value of -1, all cores are used.
> 
> and I saw a logistic_regression_path being defined in the code. I am 
> wondering, is this just a workaround for the LogisticRegressionCV, and should 
> the n_jobs docstring in LogisticRegression
> be described as "Number of CPU cores used for model fitting” instead of 
> “during cross-validation,” or am I getting this wrong?
> 
> Best,
> Sebastian


Re: [scikit-learn] combining arrays of features to train an MLP

2016-12-19 Thread Sebastian Raschka
Oh, sorry, I just noticed that I was in the wrong thread — meant to answer a
different Thomas :P.

Regarding the fingerprints: scikit-learn’s estimators expect feature vectors as
samples, so you can’t have a 3D array … e.g., think of image classification:
there you also unroll the n_pixels times m_pixels array into 1D arrays.

The low performance can have multiple causes. In case dimensionality is an
issue, I’d maybe try stronger regularization first, or feature selection.
If you are working with molecular structures, and you have enough of them,
maybe also consider alternative feature representations, e.g., learning from
the graphs directly:

http://papers.nips.cc/paper/5954-convolutional-networks-on-graphs-for-learning-molecular-fingerprints.pdf
http://pubs.acs.org/doi/abs/10.1021/ci400187y

Best,
Sebastian
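
Following up on the representation point: one way to acknowledge that two
fingerprint blocks carry different kinds of information is simply to
preprocess them differently before a single MLP, e.g. standardizing only the
real-valued block. A sketch with ColumnTransformer (random stand-in data; the
column split assumes the 2048-bit block comes first, and the hyperparameters
are illustrative):

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Random stand-in data: a 2048-bit block followed by a 60-float block.
X = np.hstack([rng.integers(0, 2, size=(31, 2048)).astype(float),
               rng.normal(size=(31, 60))])
y = rng.normal(size=31)

pre = ColumnTransformer([
    ("bits", "passthrough", slice(0, 2048)),          # leave the binary block as-is
    ("floats", StandardScaler(), slice(2048, 2108)),  # standardize the float block
])
model = make_pipeline(pre, MLPRegressor(hidden_layer_sizes=(64,),
                                        max_iter=500, random_state=0))
model.fit(X, y)
print(model.predict(X[:3]).shape)
```

The network still sees one flat vector per sample; only the scaling differs
between the two blocks.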



Re: [scikit-learn] combining arrays of features to train an MLP

2016-12-19 Thread Thomas Evangelidis
This means that both are feasible?

On 19 December 2016 at 18:17, Sebastian Raschka wrote:

> Thanks, Thomas, that makes sense! Will submit a PR then to update the
> docstring.
>
> Best,
> Sebastian

Re: [scikit-learn] combining arrays of features to train an MLP

2016-12-19 Thread Sebastian Raschka
Thanks, Thomas, that makes sense! Will submit a PR then to update the docstring.

Best,
Sebastian



[scikit-learn] combining arrays of features to train an MLP

2016-12-19 Thread Thomas Evangelidis
Greetings,

My dataset consists of objects which are characterised by their structural
features, which are encoded into a so-called "fingerprint" form. There are
several different types of fingerprints, each one encapsulating a different
type of information. I want to combine two specific types of fingerprints
to train an MLP regressor. The first fingerprint is a 2048-bit array of the
form:


FP1 = array([ 1.,  1.,  0., ...,  0.,  0.,  1.], dtype=float32)


The second is an array of 60 floats of the form:

FP2 = array([ 2.77494618,  0.98973243,  0.34638652,  2.88303715,  1.31473857,
       -0.56627112,  4.78847547,  2.29587913, -0.6786228 ,  4.63391109,
       ...
        0.,  0.,  5.89652792,  0.,  0.])


At first I tried to fuse them into a single 1D array of 2048+60 columns, but
the predictions of the MLP were worse than those of the 2 different MLP
models trained on each of the 2 fingerprint types individually. My question:
is there a more effective way to combine the 2 fingerprints in order to
indicate that they represent different types of information?
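
For reference, the usual fusion is column-wise concatenation, so that each
object stays a single 1D feature vector (random stand-in arrays below, with
the shapes from above):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 31
fp1 = rng.integers(0, 2, size=(n_samples, 2048)).astype(np.float32)  # bit fingerprint
fp2 = rng.normal(size=(n_samples, 60))                               # float fingerprint

# Column-wise concatenation keeps one flat feature vector per sample,
# giving the (n_samples, 2108) 2D matrix that sklearn estimators expect.
X = np.hstack([fp1, fp2])
print(X.shape)
```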

To this end, I tried to create a 2-row array (1st row with 2048 elements and
2nd row with 60 elements), but sklearn complained:

mlp.fit(x_train,y_train)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 618, in fit
    return self._fit(X, y, incremental=False)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 330, in _fit
    X, y = self._validate_input(X, y, incremental)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 1264, in _validate_input
    multi_output=True, y_numeric=True)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 521, in check_X_y
    ensure_min_features, warn_on_dtype, estimator)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 402, in check_array
    array = array.astype(np.float64)
ValueError: setting an array element with a sequence.


Then I tried to create, for each object of the dataset, a 2D array of size
2x2048, padding the second row with 1988 zeros so that both rows are of
equal size. However, sklearn complained again:


mlp.fit(x_train,y_train)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 618, in fit
    return self._fit(X, y, incremental=False)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 330, in _fit
    X, y = self._validate_input(X, y, incremental)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 1264, in _validate_input
    multi_output=True, y_numeric=True)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 521, in check_X_y
    ensure_min_features, warn_on_dtype, estimator)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 405, in check_array
    % (array.ndim, estimator_name))
ValueError: Found array with dim 3. Estimator expected <= 2.



In another case of fingerprints, let's name them FP3 and FP4, I observed
that the MLP regressor created using FP3 yields better results when trained
and evaluated on logarithmically transformed experimental values (the
values in the y_train and y_test 1D arrays), while the MLP regressor created
using FP4 yielded better results with the original experimental values. So
my second question is: when combining both FP3 and FP4 into a single array,
is there any way to indicate to the MLP that the features that correspond
to FP3 must reproduce the logarithmic transform of the experimental values,
while the features of FP4 reproduce the original untransformed experimental
values?


I would greatly appreciate any advice on any of my 2 queries.
Thomas
-- 
==
Thomas Evangelidis
Research Specialist
CEITEC - Central European Institute of Technology
Masaryk University
Kamenice 5/A35/1S081,
62500 Brno, Czech Republic

email: tev...@pharm.uoa.gr
       teva...@gmail.com

website: https://sites.google.com/site/thomasevangelidishomepage/


Re: [scikit-learn] n_jobs for LogisticRegression

2016-12-19 Thread Tom DLT
Hi,

In LogisticRegression, n_jobs is only used for one-vs-rest parallelization.
In LogisticRegressionCV, n_jobs is used for both one-vs-rest and
cross-validation parallelizations.

So in LogisticRegression with multi_class='multinomial', n_jobs should have
no impact.

The docstring should probably be updated as you mentioned. PR welcome :)

Best,
Tom
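
To make the one-vs-rest point concrete, here is a sketch using the explicit
OneVsRestClassifier wrapper, whose n_jobs parallelizes the independent binary
fits (toy data; in the multinomial case a single joint model is fit, so there
is nothing comparable to parallelize across classes):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Toy 3-class problem: one-vs-rest trains one binary model per class,
# and those independent fits are what n_jobs can run in parallel.
X, y = make_classification(n_samples=300, n_features=20, n_informative=6,
                           n_classes=3, random_state=0)
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000), n_jobs=-1).fit(X, y)
print(len(ovr.estimators_))  # one fitted binary classifier per class
```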
