Hi,

I am trying to use pca_train but I am running into this error:

ERROR: plpy.SPIError: plpy.SPIError: plpy.SPIError: plpy.SPIError: Function
"madlib.__matrix_densify_sfunc(double precision[],integer,integer,double
precision)": invalid argument - col should be in the range of [0, col_dim)
 (seg35 awsaiuirl1178:40003 pid=104068) (plpython.c:4648)
SQL state: XX000
Context: Traceback (most recent call last):
  PL/Python function "pca_train", line 23, in <module>
    return pca.pca(**globals())
  PL/Python function "pca_train", line 404, in pca
PL/Python function "pca_train"

My input table has 15472 rows and two columns: a row_id and an array with
853 features. I am calling pca_train like this:

DROP TABLE IF EXISTS ev.hci_subset_pca_output;
SELECT madlib.pca_train('ev.hci_subset_pca_input',
                        'ev.hci_subset_pca_output',
                        'row_id',
                        3);

Unfortunately I cannot share the data, but this is how it looks in pgAdmin3.
Note that pgAdmin3 won't display a feature_vector that is too large, which
is why the column appears to be empty. It isn't, as you can see in the
second screenshot.

[image: Inline image 1]

[image: Inline image 3]

I am not sure why I am running into this error. Please advise.

Update: I have renamed feature_vector to "row_vec", and "row_id" now starts
at 1. I am still getting the same error.
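
In case it helps to narrow this down: since the message complains about a
column index outside [0, col_dim), one check that might be relevant is
whether every array in the input table has the same bounds. The query below
is only a sketch of such a check, and a guess at the cause rather than a
confirmed diagnosis. It assumes the array column is named row_vec.

```sql
-- Sketch only: assumes the array column is named row_vec.
-- Groups rows by the bounds of their array. A single result row with
-- lower_bound = 1 and upper_bound = 853 would mean all arrays share one shape;
-- NULL bounds would indicate NULL or empty arrays.
SELECT array_lower(row_vec, 1) AS lower_bound,
       array_upper(row_vec, 1) AS upper_bound,
       count(*)                AS num_rows
FROM ev.hci_subset_pca_input
GROUP BY 1, 2
ORDER BY 3 DESC;
```

More than one result row, or any NULL bounds, would point to rows worth
inspecting before calling pca_train again.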

Thanks,

-- 
*Esther Vasiete *
*Data Scientist | Pivotal*
[email protected]
