2015-02-17 11:25 GMT+01:00 abhishek <abhish...@gmail.com>:
> 4294901761:21  4294902016:18  4294967041:15  4294967296:54
>
> I am unable to understand why should it fail when maxint for python is
> 9223372036854775807.
>
> Is there any workaround available for this? Or is it just not possible to
> load at all?

Python's maxint is unrelated. The SVMlight loader uses 32-bit signed
integers internally so that it can work with older scipy.sparse
implementations. That limits feature indices to 2^31-1 = 2147483647
(about 2.1e9), and the indices in your file are larger than that.
Newer SciPy supports 64-bit indices for sparse matrices, but we haven't
changed any scikit-learn code to deal with that yet.
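
To make the limit concrete, here is a quick check in plain Python (no
scikit-learn involved) comparing the indices from your sample line with
the largest value a 32-bit signed integer can hold. The names INT32_MAX
and too_large are only for illustration:

```python
# Largest value representable in a 32-bit signed integer.
INT32_MAX = 2**31 - 1

# Feature indices copied from the quoted sample line.
indices = [4294901761, 4294902016, 4294967041, 4294967296]

# Every index above INT32_MAX cannot be stored in a 32-bit signed index.
too_large = [i for i in indices if i > INT32_MAX]
print(INT32_MAX, len(too_large))
```

All four indices are above the limit, so the failure has nothing to do
with Python's own integer range.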

As a workaround, you could use a FeatureHasher: write a custom loader
for the format (easy, but probably slower than the one we provide) and
map your huge indices into a more convenient range.
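
A minimal sketch of such a custom loader, in pure Python. Note that it
swaps in a plain dictionary instead of hashing: each distinct index gets
the next free column number, which is lossless but keeps the whole
mapping in memory. The function name load_svmlight_remapped and the
label "1" in the sample line are my assumptions (your quoted line does
not show the label), and the parser only handles the basic
"label index:value ..." layout:

```python
def load_svmlight_remapped(lines):
    """Parse SVMlight-style lines, remapping feature indices to 0..n-1."""
    index_map = {}  # original (possibly huge) index -> compact column id
    labels, rows = [], []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue  # skip blank or comment-only lines
        parts = line.split()
        labels.append(float(parts[0]))
        row = []
        for token in parts[1:]:
            if token.startswith("qid:"):
                continue  # query ids are ignored in this sketch
            idx, value = token.split(":", 1)
            # Assign the next free column the first time an index is seen.
            col = index_map.setdefault(int(idx), len(index_map))
            row.append((col, float(value)))
        rows.append(row)
    return labels, rows, index_map


# Hypothetical sample: the label "1" is assumed, the features are yours.
sample = ["1 4294901761:21 4294902016:18 4294967041:15 4294967296:54"]
labels, rows, index_map = load_svmlight_remapped(sample)
```

The (column, value) pairs in rows can then be turned into a
scipy.sparse matrix. If you would rather not keep the mapping around,
the same loop can yield (str(idx), value) pairs and hand them to a
FeatureHasher with input_type="pair" instead, at the cost of possible
hash collisions.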

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
