This means that both are feasible?

On 19 December 2016 at 18:17, Sebastian Raschka <se.rasc...@gmail.com> wrote:
> Thanks, Thomas, that makes sense! Will submit a PR then to update the
> docstring.
>
> Best,
> Sebastian
>
> > On Dec 19, 2016, at 11:06 AM, Thomas Evangelidis <teva...@gmail.com> wrote:
> >
> > Greetings,
> >
> > My dataset consists of objects which are characterised by their structural
> > features, encoded into a so-called "fingerprint" form. There are several
> > different types of fingerprints, each one encapsulating a different type of
> > information. I want to combine two specific types of fingerprints to train
> > an MLP regressor. The first fingerprint is a 2048-bit array of the form:
> >
> > FP1 = array([ 1., 1., 0., ..., 0., 0., 1.], dtype=float32)
> >
> > The second is an array of 60 floats of the form:
> >
> > FP2 = array([ 2.77494618, 0.98973243, 0.34638652, 2.88303715, 1.31473857,
> >              -0.56627112, 4.78847547, 2.29587913, -0.6786228 , 4.63391109,
> >               ...
> >               0.        , 0.        , 5.89652792, 0.        , 0.        ])
> >
> > At first I tried to fuse them into a single 1D array of 2048+60 columns,
> > but the predictions of the MLP were worse than those of the two MLP models
> > trained on each of the two fingerprint types individually. My question: is
> > there a more effective way to combine the two fingerprints in order to
> > indicate that they represent different types of information?
> >
> > To this end, I tried to create a 2-row array (1st row 2048 elements, 2nd
> > row 60 elements), but sklearn complained:
> >
> > mlp.fit(x_train, y_train)
> >   File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 618, in fit
> >     return self._fit(X, y, incremental=False)
> >   File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 330, in _fit
> >     X, y = self._validate_input(X, y, incremental)
> >   File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 1264, in _validate_input
> >     multi_output=True, y_numeric=True)
> >   File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 521, in check_X_y
> >     ensure_min_features, warn_on_dtype, estimator)
> >   File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 402, in check_array
> >     array = array.astype(np.float64)
> > ValueError: setting an array element with a sequence.
> >
> > Then I tried to create for each object of the dataset a 2D array of size
> > 2x2048, by adding 1988 zeros to the second row so that both rows are of
> > equal size. However, sklearn complained again:
> >
> > mlp.fit(x_train, y_train)
> >   File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 618, in fit
> >     return self._fit(X, y, incremental=False)
> >   File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 330, in _fit
> >     X, y = self._validate_input(X, y, incremental)
> >   File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 1264, in _validate_input
> >     multi_output=True, y_numeric=True)
> >   File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 521, in check_X_y
> >     ensure_min_features, warn_on_dtype, estimator)
> >   File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 405, in check_array
> >     % (array.ndim, estimator_name))
> > ValueError: Found array with dim 3. Estimator expected <= 2.
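[Editor's note: a minimal sketch of the usual way to combine the two fingerprints. The names fp1_array, fp2_array and y are placeholders for the data described above, not anything from the thread. scikit-learn estimators expect X as a single 2D array of shape (n_samples, n_features), so each object becomes one flat row; the 60 real-valued features are standardised before being stacked next to the 0/1 bits, since MLPs are sensitive to feature scaling.]

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Placeholder data with the shapes described in the thread.
n_samples = 100
fp1_array = np.random.randint(0, 2, size=(n_samples, 2048)).astype(np.float64)  # bit fingerprints
fp2_array = np.random.randn(n_samples, 60)                                      # real-valued fingerprints
y = np.random.randn(n_samples)                                                  # experimental values

# Standardise the 60 real-valued columns so they are on a scale comparable
# to the 0/1 bits; otherwise they can dominate (or be swamped by) the bits.
fp2_scaled = StandardScaler().fit_transform(fp2_array)

# One flat row per object: column-wise concatenation keeps X two-dimensional,
# unlike the ragged rows ("setting an array element with a sequence") or the
# (n_samples, 2, 2048) stack ("Found array with dim 3") tried above.
X = np.hstack([fp1_array, fp2_scaled])   # shape (n_samples, 2108)

mlp = MLPRegressor(hidden_layer_sizes=(200,), max_iter=500, random_state=0)
mlp.fit(X, y)

[Whether this improves accuracy over the individual models is a separate question; the sketch only shows a valid input layout and scaling of the mixed feature types.]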
> > In another case of fingerprints, let's name them FP3 and FP4, I observed
> > that the MLP regressor created using FP3 yields better results when
> > trained and evaluated on logarithmically transformed experimental values
> > (the values in the y_train and y_test 1D arrays), while the MLP regressor
> > created using FP4 yielded better results with the original experimental
> > values. So my second question is: when combining both FP3 and FP4 into a
> > single array, is there any way to indicate to the MLP that the features
> > corresponding to FP3 must reproduce the logarithmic transform of the
> > experimental values, while the features of FP4 must reproduce the
> > original, untransformed experimental values?
> >
> > I would greatly appreciate any advice on either of my two queries.
> >
> > Thomas
> >
> > --
> > ======================================================================
> > Thomas Evangelidis
> > Research Specialist
> > CEITEC - Central European Institute of Technology
> > Masaryk University
> > Kamenice 5/A35/1S081,
> > 62500 Brno, Czech Republic
> >
> > email: tev...@pharm.uoa.gr
> >        teva...@gmail.com
> >
> > website: https://sites.google.com/site/thomasevangelidishomepage/
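[Editor's note: on the second question, a single MLPRegressor is fitted against one target vector, so one fit cannot use a log-transformed y for some features and the raw y for others. A minimal sketch of one possible workaround, assuming placeholder names fp3_array, fp4_array and y, and an arbitrary 50/50 blend weight that is purely illustrative: train one regressor per fingerprint/target transform and blend the back-transformed predictions.]

import numpy as np
from sklearn.neural_network import MLPRegressor

# Placeholder data; y is made strictly positive so the log transform is defined.
n_samples = 100
fp3_array = np.random.randn(n_samples, 2048)
fp4_array = np.random.randn(n_samples, 60)
y = np.abs(np.random.randn(n_samples)) + 1.0

# Model A: FP3 features fitted against log-transformed experimental values.
mlp_fp3 = MLPRegressor(hidden_layer_sizes=(100,), max_iter=500, random_state=0)
mlp_fp3.fit(fp3_array, np.log(y))

# Model B: FP4 features fitted against the untransformed experimental values.
mlp_fp4 = MLPRegressor(hidden_layer_sizes=(100,), max_iter=500, random_state=0)
mlp_fp4.fit(fp4_array, y)

# At prediction time, map model A's output back with exp() and blend the two
# estimates; the equal weights here are a placeholder, not a recommendation.
pred = 0.5 * np.exp(mlp_fp3.predict(fp3_array)) + 0.5 * mlp_fp4.predict(fp4_array)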