Hello all,
When I try to use the GPM module in scikit-learn, it tells me that multiple rows
have similar values. Doesn't GPM work with duplicate rows?
Even after I remove the duplicates, GPM still gives me the same error.
Any suggestions on how this can be made to work?
I am using the following code for deduplication:
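import numpy as np

def unique_rows(X, y):
    # Sort rows lexicographically so that identical rows end up adjacent.
    order = np.lexsort(X.T)
    X = X[order]
    y = y[order]
    # A row is kept if it differs from the previous row in any column.
    diff = np.diff(X, axis=0)
    ui = np.ones(len(X), dtype=bool)
    ui[1:] = (diff != 0).any(axis=1)
    # Return the filtered arrays; the reassignments above are local to the function.
    return X[ui], y[ui]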
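For comparison, here is a minimal sketch of the same deduplication using
np.unique, with a check that no duplicate rows remain afterwards. It assumes
a NumPy version in which np.unique accepts the axis argument (1.13 or later).
The name dedupe_rows and the toy data are made up for illustration and are
not part of scikit-learn.

```python
import numpy as np

def dedupe_rows(X, y):
    """Keep the first occurrence of each distinct row of X, with its y."""
    X = np.asarray(X)
    y = np.asarray(y)
    # return_index gives the position of the first occurrence of each unique row.
    _, first = np.unique(X, axis=0, return_index=True)
    keep = np.sort(first)  # restore the original row order
    return X[keep], y[keep]

# Toy data (illustrative only): row 2 duplicates row 0.
X = np.array([[1.0, 2.0], [3.0, 4.0], [1.0, 2.0], [5.0, 6.0]])
y = np.array([10.0, 20.0, 30.0, 40.0])

# The result must be assigned back, otherwise X and y are unchanged.
X_u, y_u = dedupe_rows(X, y)

# Sanity check: the number of distinct rows equals the number of rows.
assert len(np.unique(X_u, axis=0)) == len(X_u)
```

Note that this only removes rows that are exactly equal; rows that differ by
a tiny amount are all kept.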
Thanks all!
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general