Hi,
I am trying to use pca_train but I am running into this error:
ERROR: plpy.SPIError: plpy.SPIError: plpy.SPIError: plpy.SPIError: Function
"madlib.__matrix_densify_sfunc(double precision[],integer,integer,double
precision)": invalid argument - col should be in the range of [0, col_dim)
(seg35 awsaiuirl1178:40003 pid=104068) (plpython.c:4648)
SQL state: XX000
Context: Traceback (most recent call last):
PL/Python function "pca_train", line 23, in <module>
return pca.pca(**globals())
PL/Python function "pca_train", line 404, in pca
PL/Python function "pca_train"
My input table has 15472 rows and two columns; a row_id and an array with
853 features. I am calling pca_train like this:
DROP TABLE if exists ev.hci_subset_pca_output;
SELECT madlib.pca_train( 'ev.hci_subset_pca_input',
'ev.hci_subset_pca_output',
'row_id',
3);
Unfortunately I cannot share the data, but this is how it looks in pgAdmin3.
Note that pgAdmin3 won't display a feature_vector that is too large, which
is why the column appears to be empty. It isn't, as you can see in the
second screenshot.
[image: Inline image 1]
[image: Inline image 3]
I am not sure why I am getting this error. Please advise.
Update: I have renamed feature_vector to "row_vec", and "row_id" starts at
1. I am still getting the same error.
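In case it helps narrow this down, here is a sanity check that could be run
on the input table. It is only a sketch: it assumes the renamed columns
row_id and row_vec, and it assumes the "col should be in the range of
[0, col_dim)" message relates to the row IDs or the array lengths, which I
have not confirmed.

```sql
-- Sketch only: summarises row_id range and row_vec lengths in the input table.
-- Assumes columns named row_id and row_vec (per the update above).
SELECT count(*)                     AS n_rows,
       min(row_id)                  AS min_row_id,
       max(row_id)                  AS max_row_id,
       count(DISTINCT row_id)       AS distinct_row_ids,
       min(array_upper(row_vec, 1)) AS min_vec_len,
       max(array_upper(row_vec, 1)) AS max_vec_len,
       sum(CASE WHEN row_vec IS NULL THEN 1 ELSE 0 END) AS null_vecs
FROM ev.hci_subset_pca_input;
```

For the table described above I would expect n_rows, max_row_id and
distinct_row_ids to all be 15472, min_row_id to be 1, both vector lengths
to be 853, and null_vecs to be 0. Any other result would point to gaps or
duplicates in row_id, or to arrays of inconsistent length.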
Thanks,
--
*Esther Vasiete *
*Data Scientist | Pivotal*
[email protected]