Oh sorry, it is HAWQ 1.3.1. And the data engineer will upgrade to MADlib 1.8 tonight.
Thanks,
Esther

On Tue, Apr 5, 2016 at 9:26 AM, Frank McQuillan <[email protected]> wrote:

> Please clarify the platform - do you mean GPDB 4.2.0?
>
> Would you be able to upgrade to MADlib 1.8? Then you are using the latest
> software and we can see if you still have a problem.
>
> Frank
>
> On Tue, Apr 5, 2016 at 9:20 AM, Esther Vasiete <[email protected]>
> wrote:
>
>> I am using MADlib 1.7.1 on HAWQ 4.2.0.
>>
>> Thanks.
>>
>> On Mon, Apr 4, 2016 at 8:04 PM, Frank McQuillan <[email protected]>
>> wrote:
>>
>>> Thanks for the question, Esther. What version of MADlib are you using
>>> and what database platform and version are you running on?
>>>
>>> It seems to be a MADlib version lower than 1.8 since the error message
>>> you report is different in the 1.8 release. (There was a bug fix in 1.8 to
>>> allow user-specified column names in PCA.)
>>>
>>> Frank
>>>
>>> On Mon, Apr 4, 2016 at 4:27 PM, Esther Vasiete <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to use pca_train but I am running into this error:
>>>>
>>>> ERROR: plpy.SPIError: plpy.SPIError: plpy.SPIError: plpy.SPIError:
>>>> Function "madlib.__matrix_densify_sfunc(double
>>>> precision[],integer,integer,double precision)": invalid argument - col
>>>> should be in the range of [0, col_dim) (seg35 awsaiuirl1178:40003
>>>> pid=104068) (plpython.c:4648)
>>>> SQL state: XX000
>>>> Context: Traceback (most recent call last):
>>>>   PL/Python function "pca_train", line 23, in <module>
>>>>     return pca.pca(**globals())
>>>>   PL/Python function "pca_train", line 404, in pca
>>>> PL/Python function "pca_train"
>>>>
>>>> My input table has 15472 rows and two columns: a row_id and an array
>>>> with 853 features.
>>>> I am calling pca_train like this:
>>>>
>>>> DROP TABLE IF EXISTS ev.hci_subset_pca_output;
>>>> SELECT madlib.pca_train( 'ev.hci_subset_pca_input',
>>>>                          'ev.hci_subset_pca_output',
>>>>                          'row_id',
>>>>                          3);
>>>>
>>>> I unfortunately cannot share the data, but this is how it looks in
>>>> pgAdmin3. Note that pgAdmin3 won't show a feature_vector that is too
>>>> large, which is why it appears to be empty, but it isn't, as you can see
>>>> in the second screenshot.
>>>>
>>>> [image: Inline image 1]
>>>>
>>>> [image: Inline image 3]
>>>>
>>>> I am not sure why I am running into this error. Please advise.
>>>>
>>>> Update: I have renamed feature_vector to "row_vec" and "row_id" starts
>>>> with 1. Still getting the same error.
>>>>
>>>> Thanks,
>>>>
>>>> --
>>>> *Esther Vasiete*
>>>> *Data Scientist | Pivotal*
>>>> [email protected]
>>>>
>>>
>>
>> --
>> *Esther Vasiete*
>> *Data Scientist | Pivotal*
>> [email protected]
>

--
*Esther Vasiete*
*Data Scientist | Pivotal*
[email protected]
