Do you store zero entries explicitly in your CSV format? CSV doesn't strike
me as the best choice for representing sparse data...
M.
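
To make the question concrete: the loader quoted below reads a dense CSV, where every zero is written out, and keeps only the nonzero entries. Here is a minimal sketch of that idea with the indptr correction Eustache suggests, using only the standard library. The function name csv_to_csr_arrays and the sample data are illustrative and not from the thread, and the sketch assumes every field parses as a float.

```python
import array
import csv
import io


def csv_to_csr_arrays(f):
    """Read dense CSV rows from file object f and return CSR components.

    Hypothetical helper for illustration; assumes every field is numeric.
    """
    data = array.array("d")            # nonzero values, row by row
    indices = array.array("i")         # column index of each nonzero value
    indptr = array.array("i", [0])     # row r spans data[indptr[r]:indptr[r + 1]]
    n_features = 0
    for row in csv.reader(f):
        values = [float(v) for v in row]
        n_features = max(n_features, len(values))
        for j, v in enumerate(values):
            if v != 0.0:
                data.append(v)
                indices.append(j)
        # indptr stores the running count of nonzeros, not the row number
        indptr.append(len(indices))
    shape = (len(indptr) - 1, n_features)
    return data, indices, indptr, shape


# Three rows, the middle one all zeros, so row number and nonzero count differ
sample = io.StringIO("0,1.5,0\n0,0,0\n2,0,3\n")
data, indices, indptr, shape = csv_to_csr_arrays(sample)
```

If SciPy is available, the result should be usable as scipy.sparse.csr_matrix((data, indices, indptr), shape=shape). Appending the row number instead would only give the right indptr when every row has exactly one nonzero entry.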
On Sun, Aug 31, 2014 at 5:21 PM, Eustache DIEMERT <eusta...@diemert.fr>
wrote:
> @Lars, shouldn't the last line of the for loop be
>
> indptr.append(indptr[-1]+len(nonzero))
>
> rather than
>
> indptr.append(i)
>
> ?
>
> FYI, here is the PR to include your snippet into the doc:
>
> https://github.com/scikit-learn/scikit-learn/pull/3610
>
> Eustache
>
>
> 2014-07-29 11:24 GMT+02:00 Lars Buitinck <larsm...@gmail.com>:
>
> 2014-07-29 10:22 GMT+02:00 Eustache DIEMERT <eusta...@diemert.fr>:
>> > So my question is: is there some utility or snippet to load a CSV into
>> > CSR that I overlooked?
>>
>> No, but it's not that hard to write [1].
>>
>>
>> >>> import array
>> >>> data = array.array("f")
>> >>> indices = array.array("i")
>> >>> indptr = array.array("i", [0])
>> >>> for i, row in enumerate(csv.reader(f), 1):
>> ...     row = np.array(map(float, row))
>> ...     n_features = len(row)
>> ...     nonzero = np.where(row)[0]
>> ...     data.extend(row[nonzero])
>> ...     indices.extend(nonzero)
>> ...     indptr.append(i)
>> ...
>> >>> X = csr_matrix((data, indices, indptr), dtype=float,
>> ...                shape=(i, n_features))
>>
>>
>> Instead of arrays, you can also use plain lists. Arrays take less
>> space, but they can be a tiny bit slower than lists.
>>
>>
>> [1] https://gist.github.com/larsmans/fe2a289818299dcb094a
>>
>>
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>