[
https://issues.apache.org/jira/browse/SPARK-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14481342#comment-14481342
]
Peter Rudenko edited comment on SPARK-3702 at 4/6/15 4:06 PM:
--------------------------------------------------------------
For tree-based algorithms, I'm curious whether there would be a performance benefit
(assuming a reimplementation of the decision tree) from passing DataFrame columns
directly rather than a single column of vector type. E.g.:
{code}
class GBT extends Estimator with HasInputCols
val model = new GBT().setInputCols("col1", "col2", "col3", ...)
{code}
and then splitting the dataset using the DataFrame API.
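
To make the idea concrete, here is a rough sketch in plain Scala. It is not Spark code: `ColumnarSplitSketch`, `bestSplit`, and `sse` are hypothetical names, and in-memory arrays stand in for DataFrame columns. It only illustrates that, with one array per input column, a candidate split for a feature can be scored by scanning that column and the labels alone. Whether this gives any real speedup in Spark is exactly the open question.

```scala
// Hypothetical illustration only: plain Scala collections stand in for a DataFrame.
object ColumnarSplitSketch {
  // Row-oriented layout: one feature vector per row (what a single vector column holds).
  val rows: Seq[(Array[Double], Double)] = Seq(
    (Array(1.0, 10.0), 0.0),
    (Array(2.0, 20.0), 0.0),
    (Array(3.0, 30.0), 1.0),
    (Array(4.0, 40.0), 1.0)
  )

  // Column-oriented layout: one array per named input column.
  val columns: Map[String, Array[Double]] = Map(
    "col1" -> rows.map(_._1(0)).toArray,
    "col2" -> rows.map(_._1(1)).toArray
  )
  val labels: Array[Double] = rows.map(_._2).toArray

  // Sum of squared errors of the labels around their mean (0 for an empty side).
  def sse(ys: Array[Double]): Double =
    if (ys.isEmpty) 0.0
    else {
      val mean = ys.sum / ys.length
      ys.map(y => (y - mean) * (y - mean)).sum
    }

  // Score every "x <= t" split of a single column and return (threshold, total SSE)
  // for the best one. Assumes the column has at least two distinct values.
  def bestSplit(values: Array[Double], ys: Array[Double]): (Double, Double) = {
    // Drop the largest value: splitting there would leave the right side empty.
    val candidates = values.distinct.sorted.init
    candidates.map { t =>
      val (left, right) = values.zip(ys).partition(_._1 <= t)
      (t, sse(left.map(_._2)) + sse(right.map(_._2)))
    }.minBy(_._2)
  }
}
```

With the toy data above, `ColumnarSplitSketch.bestSplit(ColumnarSplitSketch.columns("col1"), ColumnarSplitSketch.labels)` picks the threshold 2.0, which separates the two label values.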
was (Author: prudenko):
For tree-based algorithms, I'm curious whether there would be a performance benefit
from passing DataFrame columns directly rather than a single column of vector
type. E.g.:
{code}
class GBT extends Estimator with HasInputCols
val model = new GBT().setInputCols("col1", "col2", "col3", ...)
{code}
> Standardize MLlib classes for learners, models
> ----------------------------------------------
>
> Key: SPARK-3702
> URL: https://issues.apache.org/jira/browse/SPARK-3702
> Project: Spark
> Issue Type: Sub-task
> Components: MLlib
> Reporter: Joseph K. Bradley
> Assignee: Joseph K. Bradley
> Priority: Blocker
>
> Summary: Create a class hierarchy for learning algorithms and the models
> those algorithms produce.
> This is a super-task of several sub-tasks (but JIRA does not allow subtasks
> of subtasks). See the "requires" links below for subtasks.
> Goals:
> * give intuitive structure to API, both for developers and for generated
> documentation
> * support meta-algorithms (e.g., boosting)
> * support generic functionality (e.g., evaluation)
> * reduce code duplication across classes
> [Design doc for class hierarchy |
> https://docs.google.com/document/d/1BH9el33kBX8JiDdgUJXdLW14CA2qhTCWIG46eXZVoJs]