[ https://issues.apache.org/jira/browse/SPARK-19683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15876812#comment-15876812 ]
Sean Owen commented on SPARK-19683:
-----------------------------------

This is trivial for an application to implement. Unless it is a common format extension and one the core library can use, is there any particular reason for it to be in Spark vs. an app or external lib?

> Support for libsvm-based learning-to-rank format
> ------------------------------------------------
>
>                 Key: SPARK-19683
>                 URL: https://issues.apache.org/jira/browse/SPARK-19683
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML, MLlib
>    Affects Versions: 2.1.0
>            Reporter: Craig Macdonald
>            Priority: Minor
>
> I would like to use Spark for reading/processing Learning to Rank files. The
> standard format is an extension of libsvm:
> {code}
> 0 qid:1 1:2.9 2:9.4 # docid=clueweb09-00-01492
> {code}
> Under the MLlib API, a LabeledPoint would need an extension called
> QueryLabeledPoint.
> I would also like to investigate use through the DataFrame API, extending the
> libsvm source; however, many of the classes/methods used there are private
> (e.g. LibSVMOptions, DataType.sameType(), VectorUDT). So would an extension
> to handle LTR format be better inside Spark or outside?
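
For illustration, application-side parsing of the LTR line quoted in the issue might look like the sketch below. It is plain Python with no Spark dependency, and the names QueryLabeledRow and parse_ltr_line are hypothetical, not part of Spark or MLlib. It assumes the layout shown in the example line: a label, a qid:<id> token, index:value feature pairs, and an optional trailing comment after '#'.

```python
from typing import Dict, NamedTuple, Optional


class QueryLabeledRow(NamedTuple):
    """Hypothetical container for one LTR row; not a Spark class."""
    label: float
    qid: str
    features: Dict[int, float]
    comment: Optional[str]


def parse_ltr_line(line: str) -> QueryLabeledRow:
    """Parse '<label> qid:<id> <index>:<value> ... [# comment]'."""
    # Split off the optional trailing comment introduced by '#'.
    body, sep, comment = line.partition("#")
    tokens = body.split()
    if len(tokens) < 2 or not tokens[1].startswith("qid:"):
        raise ValueError(f"expected '<label> qid:<id> ...', got: {line!r}")

    label = float(tokens[0])
    qid = tokens[1][len("qid:"):]

    # Remaining tokens are sparse features in libsvm 'index:value' form.
    features: Dict[int, float] = {}
    for token in tokens[2:]:
        index, _, value = token.partition(":")
        features[int(index)] = float(value)

    return QueryLabeledRow(label, qid, features, comment.strip() if sep else None)


row = parse_ltr_line("0 qid:1 1:2.9 2:9.4 # docid=clueweb09-00-01492")
```

A function of this shape could be applied per line from application code, for example by mapping it over an RDD of text lines, without touching any of the private classes mentioned in the issue.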