[ 
https://issues.apache.org/jira/browse/FLINK-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385437#comment-14385437
 ] 

ASF GitHub Bot commented on FLINK-1717:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/543

    [FLINK-1717] Adds support for libSVM/SVMLight files

    Adds support for reading libSVM/SVMLight files directly with Apache Flink.
    The contents of the file are returned as a ```DataSet``` of ```LabeledVector``` instances.
    
    This PR is based on #539.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink libsvm

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/543.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #543
    
----
commit 4c18940bf14f376cdb339d908324e5f2cd4593ad
Author: Till Rohrmann <[email protected]>
Date:   2015-03-25T14:27:58Z

    [FLINK-1718] [ml] Adds sparse matrix and sparse vector types

commit f3d021febf0e7796a1f250c2e693d7f9dcbc36e1
Author: Till Rohrmann <[email protected]>
Date:   2015-03-26T16:44:17Z

    [ml] Adds convenience functions for Breeze matrix/vector conversion
    
    [ml] Adds breeze to flink-dist LICENSE file
    
    [ml] Optimizes sanity checks in vector/matrix accessors
    
    [ml] Fixes scala check style error with missing whitespaces before and after +
    
    [ml] Fixes DenseMatrixTest

commit be8ca43b5f11c789b2acfe38127ed542cdea3cd3
Author: Till Rohrmann <[email protected]>
Date:   2015-03-28T17:31:02Z

    [FLINK-1717] [ml] Adds support to directly read libSVM and SVMLight files

----


> Add support to read libSVM and SVMLight input files
> ---------------------------------------------------
>
>                 Key: FLINK-1717
>                 URL: https://issues.apache.org/jira/browse/FLINK-1717
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>              Labels: ML
>
> In order to train SVMs, the machine learning library should be able to read 
> standard SVM input file formats. A widespread format is the one used by 
> libSVM and SVMLight, which is defined as follows:
> <line> .=. <target> <feature>:<value> <feature>:<value> ... <feature>:<value> # <info>
> <target> .=. +1 | -1 | 0 | <float> 
> <feature> .=. <integer> | "qid"
> <value> .=. <float>
> <info> .=. <string>
> Details can be found [here|http://svmlight.joachims.org/] and 
> [here|http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html#/Q03:_Data_preparation]
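
As an illustration of the grammar quoted above, the following Python sketch parses a single line into a sparse feature map. It is not the code from the pull request, which targets Flink's Scala machine learning library. The names Example and parse_line are hypothetical, and the handling of "qid" and of the trailing "# <info>" part is an assumption based only on the grammar.

```python
from typing import Dict, NamedTuple, Optional


class Example(NamedTuple):
    """One parsed line (hypothetical container, not a Flink type)."""
    target: float
    features: Dict[int, float]  # sparse mapping: feature index -> value
    qid: Optional[str]          # value of the special "qid" feature, if present
    info: Optional[str]         # text after '#', if present


def parse_line(line: str) -> Optional[Example]:
    """Parse '<target> <feature>:<value> ... # <info>'.

    Returns None for blank lines and for lines that contain only a comment.
    """
    # Everything after the first '#' is treated as the <info> string.
    body, sep, info = line.partition("#")
    tokens = body.split()
    if not tokens:
        return None

    # float() accepts "+1", "-1", "0" and arbitrary float literals.
    target = float(tokens[0])

    features: Dict[int, float] = {}
    qid: Optional[str] = None
    for token in tokens[1:]:
        name, _, value = token.partition(":")
        if name == "qid":
            qid = value
        else:
            # Raises ValueError on malformed pairs such as "3" or "a:b".
            features[int(name)] = float(value)

    return Example(target, features, qid, info.strip() if sep else None)
```

Only the non-zero features listed on a line are stored, which matches the sparse nature of the format. Malformed tokens raise ValueError rather than being skipped silently.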



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
