Thanks, Arthur and Michael! Following Michael's suggestion, I used Linear Regression to train on the dataset, which achieves what I want. But the CCA result deviates completely from the Linear Regression one.
Here's what I do:

    # training stage for Linear Regression
    regr = linear_model.LinearRegression()
    regr.fit(features, labels)

    # training stage for CCA
    cca = CCA(n_components=1, max_iter=500000000)
    cca.fit(features, labels)

Then the predict method is used to evaluate the final result, like the following:

    regr.predict(feature)
    cca.predict(feature)

But the results vary quite a lot between the two regression methods, and the Linear Regression predictions are much closer to the labels. For example:

    CCA           LR
    -472.411445   31.447136
    -164.174335   32.793054
    -108.513509   33.019758
    -143.083823   20.058607
       2.047881   35.981544
    -335.829902   30.801075
    -341.004312   30.299629
    -340.211106   26.244057
    -165.895824   32.845650

May I know what is going wrong here? Thanks a lot in advance.

2015-10-20 14:22 GMT+08:00 Michael Eickenberg <michael.eickenb...@gmail.com>:

> Also, using CCA on a 1-D Y is the same as linear regression, so you
> should probably just do that instead.
>
> On Tuesday, October 20, 2015, Arthur Mensch <arthur.men...@inria.fr>
> wrote:
>
>> Hi Dai,
>>
>> CCA finds the vectors u in X space and v in Y space that maximize the
>> correlation corr(u' X, v' Y), and continues finding such vectors under
>> the constraint that the (u_i)_i and (v_i)_i are orthogonal.
>>
>> As in your case dim Y = 1, you can only set n_components = 1: the
>> vector v will be [1], and u will be the linear combination that
>> maximizes corr(u' X, Y). I guess this should be in the doc.
>>
>> You cannot find more than one pair of vectors (u, v), as v is already
>> a basis of the Y space. The Y variability is entirely explained by
>> (u, v) alone, hence the warning.
>>
>> On 20 Oct 2015 06:52, "Dai Yan" <kanshu...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I hope to use CCA (Canonical Correlation Analysis) to fit a problem
>>> set of size (35000, 117) to its labels of size (35000, 1), where
>>> 35000 is the number of samples and 117 is the feature dimension per
>>> sample.
>>>
>>> Now I have the following two problems.
>>>
>>> 1) How do I choose an appropriate CCA n_components parameter to fit
>>> my samples?
>>>
>>> 2) When fitting with n_components = 1 or n_components = 2, the fit
>>> procedure quits with the following message:
>>>
>>> /usr/local/lib/python2.7/dist-packages/sklearn/cross_decomposition/pls_.py:277:
>>> UserWarning: Y residual constant at iteration 1
>>>   warnings.warn('Y residual constant at iteration %s' % k)
>>>
>>> And here I paste some of my code to show the CCA initialization
>>> parameters:
>>>
>>> cca = CCA(max_iter=500000, tol=1e-05)
>>> cca.fit(features, labels)  # features.shape = (35000, 117),
>>> labels.shape = (35000, 1)
>>>
>>> Could you give me some hints on this?
>>>
>>> Thanks,
>>>
>>> Yan
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general