[
https://issues.apache.org/jira/browse/FLINK-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385437#comment-14385437
]
ASF GitHub Bot commented on FLINK-1717:
---------------------------------------
GitHub user tillrohrmann opened a pull request:
https://github.com/apache/flink/pull/543
[FLINK-1717] Adds support for libSVM/SVMLight files
Adds support for reading libSVM/SVMLight files directly with Apache Flink. The
contents of the file are returned as a ```DataSet``` of ```LabeledVectors```.
This PR is based on #539.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tillrohrmann/flink libsvm
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/543.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #543
----
commit 4c18940bf14f376cdb339d908324e5f2cd4593ad
Author: Till Rohrmann <[email protected]>
Date: 2015-03-25T14:27:58Z
[FLINK-1718] [ml] Adds sparse matrix and sparse vector types
commit f3d021febf0e7796a1f250c2e693d7f9dcbc36e1
Author: Till Rohrmann <[email protected]>
Date: 2015-03-26T16:44:17Z
[ml] Adds convenience functions for Breeze matrix/vector conversion
[ml] Adds breeze to flink-dist LICENSE file
[ml] Optimizes sanity checks in vector/matrix accessors
[ml] Fixes scala check style error with missing whitespaces before and
after +
[ml] Fixes DenseMatrixTest
commit be8ca43b5f11c789b2acfe38127ed542cdea3cd3
Author: Till Rohrmann <[email protected]>
Date: 2015-03-28T17:31:02Z
[FLINK-1717] [ml] Adds support to directly read libSVM and SVMLight files
----
> Add support to read libSVM and SVMLight input files
> ---------------------------------------------------
>
> Key: FLINK-1717
> URL: https://issues.apache.org/jira/browse/FLINK-1717
> Project: Flink
> Issue Type: New Feature
> Components: Machine Learning Library
> Reporter: Till Rohrmann
> Assignee: Till Rohrmann
> Labels: ML
>
> In order to train SVMs, the machine learning library should be able to read
> standard SVM input file formats. A widespread one is the format used by libSVM
> and SVMLight, which is defined as follows:
> <line> .=. <target> <feature>:<value> <feature>:<value> ... <feature>:<value> # <info>
> <target> .=. +1 | -1 | 0 | <float>
> <feature> .=. <integer> | "qid"
> <value> .=. <float>
> <info> .=. <string>
> Details can be found [here|http://svmlight.joachims.org/] and
> [here|http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html#/Q03:_Data_preparation]
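
For illustration, here is a minimal sketch of a parser for a single line of the grammar quoted above. It is written in Python rather than in the Scala used by the pull request, and it is not the Flink implementation. The function name `parse_libsvm_line` and the shape of the return value are hypothetical, and skipping the `qid` feature is an assumption made to keep the sketch short.

```python
from typing import Dict, Optional, Tuple

ParsedLine = Tuple[float, Dict[int, float], Optional[str]]


def parse_libsvm_line(line: str) -> Optional[ParsedLine]:
    """Parse one line of the libSVM/SVMLight grammar (hypothetical helper).

    Returns (target, {feature_index: value}, info), or None for a blank
    or comment-only line.
    """
    # Everything after the first '#' is the optional <info> string.
    body, sep, info = line.partition("#")
    tokens = body.split()
    if not tokens:
        return None

    # <target> is +1, -1, 0 or any float; float() accepts all of these.
    target = float(tokens[0])

    features: Dict[int, float] = {}
    for token in tokens[1:]:
        name, _, value = token.partition(":")
        if name == "qid":
            # Assumption: the SVMLight query id is ignored in this sketch.
            continue
        # Feature indices are kept exactly as written in the file.
        features[int(name)] = float(value)

    return target, features, (info.strip() if sep else None)
```

Calling `parse_libsvm_line("+1 1:0.5 3:2 # first example")` yields the target `1.0`, the sparse feature map `{1: 0.5, 3: 2.0}` and the info string `"first example"`. A sparse map is used because these files usually list only the non-zero features of each example.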