Thanks, Thomas, that makes sense! Will submit a PR then to update the docstring.
Best,
Sebastian

> On Dec 19, 2016, at 11:06 AM, Thomas Evangelidis <teva...@gmail.com> wrote:
>
> Greetings,
>
> My dataset consists of objects which are characterised by their structural
> features, which are encoded into a so-called "fingerprint" form. There are
> several different types of fingerprints, each one encapsulating a different
> type of information. I want to combine two specific types of fingerprints to
> train an MLP regressor. The first fingerprint consists of a 2048-bit array of
> the form:
>
> FP1 = array([ 1., 1., 0., ..., 0., 0., 1.], dtype=float32)
>
> The second is an array of 60 floats of the form:
>
> FP2 = array([ 2.77494618, 0.98973243, 0.34638652, 2.88303715, 1.31473857,
>        -0.56627112, 4.78847547, 2.29587913, -0.6786228 , 4.63391109,
>        ...
>        0. , 0. , 5.89652792, 0. , 0. ])
>
> At first I tried to fuse them into a single 1D array of 2048+60 columns, but
> the predictions of the MLP were worse than those of the 2 different MLP models
> trained on each of the 2 fingerprint types individually. My question: is there
> a more effective way to combine the 2 fingerprints in order to indicate that
> they represent different types of information?
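
A point that may be worth checking before changing the input layout: scikit-learn estimators take X as a 2-D array of shape (n_samples, n_features), so both fingerprints of one object have to sit side by side in a single row. MLPs are also sensitive to feature scaling, and the 0/1 bits of FP1 and the unbounded floats of FP2 are on different scales. Below is a minimal sketch, assuming randomly generated stand-ins for the real fingerprints (fp1, fp2, n_samples and the helper combine_fingerprints are hypothetical names), that standardizes the continuous block with statistics from the training rows only and then concatenates column-wise:

```python
import numpy as np


def combine_fingerprints(fp1, fp2, mean, std):
    """Stack a binary block and a standardized continuous block column-wise."""
    fp2_scaled = (fp2 - mean) / std
    return np.hstack([fp1, fp2_scaled])


rng = np.random.default_rng(0)
n_samples = 100  # placeholder dataset size

# Synthetic stand-ins for the real fingerprints (assumption, not real data).
fp1 = rng.integers(0, 2, size=(n_samples, 2048)).astype(np.float32)
fp2 = rng.normal(loc=2.0, scale=3.0, size=(n_samples, 60))

train, test = np.arange(0, 80), np.arange(80, n_samples)

# Column statistics come from the training rows only, so nothing
# about the test rows leaks into the scaling.
mean = fp2[train].mean(axis=0)
std = fp2[train].std(axis=0)
std[std == 0] = 1.0  # constant columns: avoid division by zero

x_train = combine_fingerprints(fp1[train], fp2[train], mean, std)
x_test = combine_fingerprints(fp1[test], fp2[test], mean, std)

print(x_train.shape, x_test.shape)
```

This only removes the scale mismatch between the two blocks; whether it closes the gap to the individual models is an empirical question.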
>
> To this end, I tried to create a 2-row array (1st row 2048 elements and 2nd
> row 60 elements) but sklearn complained:
>
>     mlp.fit(x_train,y_train)
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 618, in fit
>     return self._fit(X, y, incremental=False)
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 330, in _fit
>     X, y = self._validate_input(X, y, incremental)
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 1264, in _validate_input
>     multi_output=True, y_numeric=True)
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 521, in check_X_y
>     ensure_min_features, warn_on_dtype, estimator)
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 402, in check_array
>     array = array.astype(np.float64)
> ValueError: setting an array element with a sequence.
>
> Then I tried to create, for each object of the dataset, a 2D array of size
> 2x2048 by padding the second row with 1988 zeros so that both rows are of
> equal size.
> However, sklearn complained again:
>
>     mlp.fit(x_train,y_train)
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 618, in fit
>     return self._fit(X, y, incremental=False)
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 330, in _fit
>     X, y = self._validate_input(X, y, incremental)
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/neural_network/multilayer_perceptron.py", line 1264, in _validate_input
>     multi_output=True, y_numeric=True)
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 521, in check_X_y
>     ensure_min_features, warn_on_dtype, estimator)
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 405, in check_array
>     % (array.ndim, estimator_name))
> ValueError: Found array with dim 3. Estimator expected <= 2.
>
> In another case of fingerprints, let's name them FP3 and FP4, I observed that
> the MLP regressor created using FP3 yields better results when trained and
> evaluated using logarithmically transformed experimental values (the values
> in the y_train and y_test 1D arrays), while the MLP regressor created using
> FP4 yielded better results using the original experimental values. So my
> second question is: when combining both FP3 and FP4 into a single array, is
> there any way to designate to the MLP that the features that correspond to
> FP3 must reproduce the logarithmic transform of the experimental values while
> the features of FP4 reproduce the original untransformed experimental values?
>
> I would greatly appreciate any advice on either of my 2 queries.
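
On the second question: a single MLPRegressor is fitted against one target array shared by all of its input features, so a per-feature-block target is not something the estimator expresses. One possible workaround (a swapped-in technique, not a feature of the estimator) is to train one model per fingerprint on whichever target scale suits it, map the log-scale predictions back with exp, and average in the original units. Below is a minimal sketch on synthetic data; fp3, fp4, y and all hyperparameters are placeholders, and y is assumed to be strictly positive so that the log is defined:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_samples = 200

# Synthetic stand-ins for FP3, FP4 and the experimental values (assumption).
fp3 = rng.normal(size=(n_samples, 20))
fp4 = rng.normal(size=(n_samples, 30))
y = np.exp(0.3 * fp3[:, 0] + 0.2 * fp4[:, 0]) + 1.0  # strictly positive

train, test = np.arange(150), np.arange(150, n_samples)

# Model 1: FP3 features, trained on log-transformed targets.
mlp_log = MLPRegressor(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
mlp_log.fit(fp3[train], np.log(y[train]))

# Model 2: FP4 features, trained on the untransformed targets.
mlp_raw = MLPRegressor(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
mlp_raw.fit(fp4[train], y[train])

# Bring both predictions to the original scale before averaging.
pred_from_fp3 = np.exp(mlp_log.predict(fp3[test]))
pred_from_fp4 = mlp_raw.predict(fp4[test])
y_pred = 0.5 * (pred_from_fp3 + pred_from_fp4)

print(y_pred.shape)
```

The equal weights are arbitrary here; they could instead be chosen on a validation split. In newer scikit-learn releases, sklearn.compose.TransformedTargetRegressor can wrap the log/exp step for a single model.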
> Thomas
>
> --
> ======================================================================
> Thomas Evangelidis
> Research Specialist
> CEITEC - Central European Institute of Technology
> Masaryk University
> Kamenice 5/A35/1S081,
> 62500 Brno, Czech Republic
>
> email: tev...@pharm.uoa.gr
>        teva...@gmail.com
>
> website: https://sites.google.com/site/thomasevangelidishomepage/
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn