Hi Eustache,
This may take more time than necessary, but I load .csv files with
`read_csv` from the `pandas` library
(http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.parsers.read_csv.html).
This gives you a DataFrame, say DATAFRAME, which you can convert to a numpy
array with np.array(DATAFRAME).
I would like a faster way to do this, though; +1 for a utility that reads a
.csv file directly into a dense or sparse array.
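
For reference, here is a minimal sketch of that workflow. The in-memory CSV text and the names `csv_text`, `df` and `X` are made up for illustration; in practice you would pass a file path to `read_csv`.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = "a,b,c\n1,0,2\n0,0,3\n"

# read_csv accepts a path or any file-like object and returns a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Convert the DataFrame to a dense numpy array (the header row is not included).
X = np.array(df)
```

Note that this builds the full dense table in memory first, which is the part that gets expensive for large, mostly-zero data.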
Thanks
On 7/29/2014 11:22 AM, Eustache DIEMERT wrote:
Hi list,
I've got a large dataset in a CSV or VW [0] format that I want to load
into a sparse matrix (probably CSR).
I haven't found any utilities to do this out of the box.
It seems that `numpy.loadtxt` [1] doesn't take a matrix format.
On the other hand we have a utility for loading libsvm format into CSR
matrices [2].
So my question is: is there some utility or snippet to load a CSV
into CSR that I overlooked?
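
One possible workaround, sketched here under assumptions rather than taken from any existing utility: read the CSV in chunks with `pandas.read_csv(..., chunksize=...)`, convert each chunk to CSR, and stack the pieces with `scipy.sparse.vstack`. The helper name `csv_to_csr`, the chunk size and the demo data are hypothetical, and the sketch assumes every column is numeric.

```python
import io

import pandas as pd
from scipy import sparse


def csv_to_csr(source, chunksize=1000):
    """Hypothetical helper: load an all-numeric CSV into a CSR matrix chunk by chunk."""
    blocks = []
    # With chunksize set, read_csv yields DataFrames of at most `chunksize` rows.
    for chunk in pd.read_csv(source, chunksize=chunksize):
        # Only one chunk is ever held in dense form; zeros are dropped on conversion.
        blocks.append(sparse.csr_matrix(chunk.values.astype(float)))
    # Stack the per-chunk matrices row-wise into a single CSR matrix.
    return sparse.vstack(blocks, format="csr")


# Made-up data standing in for a real file path.
demo = io.StringIO("a,b,c\n1,0,2\n0,0,3\n0,4,0\n")
X = csv_to_csr(demo, chunksize=2)
```

This keeps peak memory bounded by the chunk size, but it still parses every zero in the text file, so it is not a substitute for a true sparse format.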
If not, I'm considering submitting a PR to add a utility to read CSV & VW
formats into sparse matrices. I'm including the VW format as it has some
interesting features: sparsity and keeping feature names (a kind of
advanced svmlight format).
What do you think?
[0] https://github.com/JohnLangford/vowpal_wabbit/wiki/Input-format
[1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html
[2] https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/datasets/svmlight_format.py#L253
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general