2014-07-29 10:22 GMT+02:00 Eustache DIEMERT <eusta...@diemert.fr>:
> So my question is : is there some utility or snippet to load a CSV into CSR
> that I overlooked ?

No, but it's not that hard to write [1].

>>> import array
>>> import csv
>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> data = array.array("f")
>>> indices = array.array("i")
>>> indptr = array.array("i", [0])
>>> # f is an open file object for the CSV file
>>> for i, row in enumerate(csv.reader(f), 1):
...     row = np.array(row, dtype=float)
...     n_features = len(row)
...     nonzero = np.where(row)[0]
...     data.extend(row[nonzero])
...     indices.extend(nonzero)
...     indptr.append(len(indices))  # running count of stored nonzeros
...
>>> X = csr_matrix((data, indices, indptr), dtype=float, shape=(i, n_features))

Instead of arrays, you can also use plain lists. Arrays take less space,
but they can be a tiny bit slower than lists.

[1] https://gist.github.com/larsmans/fe2a289818299dcb094a
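
For reference, the loop above can be wrapped in a function that uses plain lists instead of arrays. This is only a minimal sketch, assuming the CSV is purely numeric and has no header row. The name csv_to_csr is hypothetical (it is not a scikit-learn or SciPy utility), and the in-memory io.StringIO input stands in for a real file.

```python
import csv
import io

import numpy as np
from scipy.sparse import csr_matrix


def csv_to_csr(f):
    """Build a CSR matrix from an open, purely numeric CSV file (no header).

    Hypothetical helper for illustration; not part of any library.
    """
    data, indices, indptr = [], [], [0]
    n_features = 0
    for row in csv.reader(f):
        values = np.array(row, dtype=float)
        n_features = max(n_features, len(values))
        nonzero = np.flatnonzero(values)   # column indices of nonzero entries
        data.extend(values[nonzero])
        indices.extend(nonzero)
        indptr.append(len(indices))        # running count of stored nonzeros
    n_rows = len(indptr) - 1
    return csr_matrix((data, indices, indptr), dtype=float,
                      shape=(n_rows, n_features))


# Small in-memory example: three rows, two nonzero values in total.
X = csv_to_csr(io.StringIO("0,1.5,0\n2,0,0\n0,0,0\n"))
```

Only the nonzero values are ever kept in Python-level containers, so memory use scales with the number of nonzeros rather than with the full dense size.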