[
https://issues.apache.org/jira/browse/SPARK-16319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15356718#comment-15356718
]
Sean Owen commented on SPARK-16319:
-----------------------------------
(Fix the title please?)
Here a "linear" pipeline is just a special case of a DAG, like A -> B -> C. In
general, a DAG's nodes might have multiple children or parents, so not all DAGs
are linear. That much seems clear; what would you like to clarify? Is it the
"topological" ordering? That just means that any dependent stage must come
after all of its predecessors.
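
To make the ordering rule concrete, here is a minimal sketch in plain Python. It is not Spark code; the function name and the stage names are hypothetical and only illustrate the idea. The `parents` mapping is an assumed way to write down which stages each stage depends on.

```python
def is_topological_order(stages, parents):
    """Return True if every stage comes after all of its predecessors.

    stages  -- list of stage names in the order they would be applied
    parents -- dict mapping each stage name to the stages it depends on
               (hypothetical representation, not a Spark structure)
    """
    position = {name: index for index, name in enumerate(stages)}
    for stage, predecessors in parents.items():
        for predecessor in predecessors:
            # A predecessor placed at or after its dependent breaks the rule.
            if position[predecessor] >= position[stage]:
                return False
    return True


# Linear pipeline: A -> B -> C (each node has at most one parent and one child).
linear = {"A": [], "B": ["A"], "C": ["B"]}

# Non-linear DAG: A feeds both B and C, and D depends on both of them.
diamond = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}

linear_ok = is_topological_order(["A", "B", "C"], linear)
diamond_ok = is_topological_order(["A", "C", "B", "D"], diamond)
diamond_bad = is_topological_order(["A", "B", "D", "C"], diamond)
```

In the diamond case both A, B, C, D and A, C, B, D are valid orders, while any order that puts D before B or C is not. A linear pipeline has exactly one valid order.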
> Pipeline / DAG
> --------------
>
> Key: SPARK-16319
> URL: https://issues.apache.org/jira/browse/SPARK-16319
> Project: Spark
> Issue Type: Documentation
> Components: ML
> Affects Versions: 2.0.0
> Reporter: Max Moroz
> Priority: Minor
>
> There's a
> [paragraph|http://spark.apache.org/docs/2.0.0-preview/ml-guide.html#details]
> about non-linear pipelines in the ML docs, but it's not clear how a DAG pipeline
> differs from a linear pipeline, and in fact, it seems that a "DAG Pipeline"
> results in behavior identical to that of a regular linear pipeline (the
> stages are simply applied in the order provided when the pipeline is
> created). In addition, no checks of input and output columns seem to occur
> when pipeline.fit() or pipeline.transform() is called.
> It would be better to clarify in the docs and/or remove that paragraph.
> I'd be happy to write it up, but I have no idea what the intention of this
> concept is at this point.
> [Additional reference on
> SO|http://stackoverflow.com/questions/37541668/non-linear-dag-ml-pipelines-in-apache-spark]