It seems that our svmlight reader doesn't support spaces between labels:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/datasets/_svmlight_format.pyx#L71
Could you report an issue on GitHub?
In the meantime, you can write a small Python script that deletes the
spaces between labels.
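
A minimal sketch of such a script, assuming each line starts with labels
written as "1, 2, 3" (a space after each comma) followed by the usual
index:value pairs, and that the first line may be a header such as "Data".
The names fix_label_spaces and clean_svmlight_file are made up for this
example, and the layout of the actual Kaggle file has not been checked here.

```python
import re

# A comma followed by whitespace, e.g. the ", " in "12, 34 5:1.0".
# Assumption: commas appear only in the label list, never in the features.
_COMMA_SPACE = re.compile(r",\s+")


def fix_label_spaces(line):
    """Return the line with any whitespace after a comma removed."""
    return _COMMA_SPACE.sub(",", line)


def clean_svmlight_file(src_path, dst_path, skip_header=False):
    """Copy src_path to dst_path, fixing the label list on every line.

    If skip_header is True, the first line of the input is dropped.
    """
    with open(src_path) as src, open(dst_path, "w") as dst:
        if skip_header:
            next(src, None)  # discard the header line, if there is one
        for line in src:
            dst.write(fix_label_spaces(line))
```

The cleaned file can then be passed to load_svmlight_file with
multilabel=True, for instance after calling
clean_svmlight_file("train.csv", "train_clean.txt", skip_header=True),
where both file names are placeholders.
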
Mathieu
On Fri, Feb 12, 2016 at 11:00 PM, Gunjan Dewan <dewangunjan6...@gmail.com>
wrote:
> Hi Mathieu,
>
> Thanks a lot for the help.
> But even after setting the multilabel option, it still raises a ValueError:
>
>
> File "_svmlight_format.pyx", line 67, in
> sklearn.datasets._svmlight_format._load_svmlight_file
> (sklearn\datasets\_svmlight_format.c:2055)
>
> ValueError: could not convert string to float:
>
>
>
> But this time it does not show any value after the error. It's blank.
> Any idea why this is happening?
>
>
> Gunjan
>
> On Fri, Feb 12, 2016 at 6:59 PM, Mathieu Blondel <math...@mblondel.org>
> wrote:
>
>> Hi Gunjan,
>>
>> Apparently the dataset is multi-label, so you need to use the
>> multilabel=True option.
>>
>>
>> http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_svmlight_file.html
>>
>> Mathieu
>>
>> On Fri, Feb 12, 2016 at 10:04 PM, Gunjan Dewan <dewangunjan6...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> I am using the following dataset from Kaggle (train.csv):
>>> https://www.kaggle.com/c/lshtc/data
>>>
>>> The dataset is in libSVM format.
>>>
>>> However, while trying to load it using load_svmlight_file, I get the
>>> following error:
>>>
>>> File "_svmlight_format.pyx", line 72, in
>>> sklearn.datasets._svmlight_format._load_svmlight_file
>>> (sklearn\datasets\_svmlight_format.c:2120)
>>>
>>> ValueError: could not convert string to float: b'Data'
>>>
>>> I then removed the header, but it still gives me the same ValueError.
>>> Can anyone please help me out with this?
>>>
>>> I also wanted to know if there is any other way to convert the libSVM
>>> format into 2 matrices.
>>>
>>> Note: I just started out with sklearn and machine learning.
>>>
>>> Thanks,
>>> Gunjan
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Site24x7 APM Insight: Get Deep Visibility into Application Performance
>>> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
>>> Monitor end-to-end web transactions and take corrective actions now
>>> Troubleshoot faster and improve end-user experience. Signup Now!
>>> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> Scikit-learn-general@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>
>>>
>>
>>
>