[jira] [Commented] (SPARK-16319) Non-linear (DAG) pipelines need better explanation

Max Moroz (JIRA) Sun, 17 Jul 2016 23:28:05 -0700

    [ https://issues.apache.org/jira/browse/SPARK-16319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15381788#comment-15381788 ]


Max Moroz commented on SPARK-16319:
-----------------------------------

[~srowen] I'd love to, but as best I understand it, every mention of DAG 
should be removed, since it seems to have no effect. As for your point that 
the pipeline checks the DAG property, I couldn't find anything like that in 
the code. The pipeline seems to be executed one stage after another, 
completely ignoring the information about input/output columns.
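
To make the question concrete, here is a minimal sketch in plain Python (not 
Spark code) of what a check on input/output columns could look like. The 
Stage class, the first_order_violation helper, and the column names are all 
hypothetical. The sketch only illustrates validating that stages are listed 
in topological order; it makes no claim about what Spark's Pipeline actually 
does.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Stage:
    """Hypothetical stand-in for a pipeline stage; not a Spark class."""
    name: str
    input_cols: List[str]
    output_col: str


def first_order_violation(
    stages: List[Stage], initial_cols: Set[str]
) -> Optional[Tuple[str, str]]:
    """Walk the stages in list order, as a linear pipeline would.

    Returns (stage name, missing column) for the first stage whose input
    column is not yet available, or None if the list order is a valid
    topological order of the column-dependency DAG.
    """
    available = set(initial_cols)
    for stage in stages:
        for col in stage.input_cols:
            if col not in available:
                return (stage.name, col)
        available.add(stage.output_col)
    return None


# Assumed example stages: two branches (text features and a raw column)
# that merge in the final stage, so the dependencies form a DAG.
tokenizer = Stage("tokenizer", ["text"], "words")
hashing_tf = Stage("hashingTF", ["words"], "tf")
assembler = Stage("assembler", ["tf", "age"], "features")

good_order = [tokenizer, hashing_tf, assembler]
bad_order = [hashing_tf, tokenizer, assembler]

print(first_order_violation(good_order, {"text", "age"}))  # None
print(first_order_violation(bad_order, {"text", "age"}))   # ('hashingTF', 'words')
```

With a check of this kind, the second ordering would be rejected before any 
stage runs, because hashingTF reads a column that tokenizer has not yet 
produced.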

I hope I'm wrong, so if anyone can correct me, please let me know.

> Non-linear (DAG) pipelines need better explanation
> --------------------------------------------------
>
>                 Key: SPARK-16319
>                 URL: https://issues.apache.org/jira/browse/SPARK-16319
>             Project: Spark
>          Issue Type: Documentation
>          Components: ML
>    Affects Versions: 2.0.0
>            Reporter: Max Moroz
>            Priority: Minor
>
> There's a 
> [paragraph|http://spark.apache.org/docs/2.0.0-preview/ml-guide.html#details] 
> about non-linear pipelines in the ML docs, but it's not clear how a DAG 
> pipeline differs from a linear pipeline. In fact, it seems that a "DAG 
> Pipeline" behaves identically to a regular linear pipeline (the stages are 
> simply applied in the order provided when the pipeline is created). In 
> addition, no checks of input and output columns seem to occur when 
> pipeline.fit() or pipeline.transform() is called.
> It would be better to clarify this in the docs and/or remove that paragraph.
> I'd be happy to write it up, but at this point I have no idea what the 
> intention behind this concept is.
> [Additional reference on 
> SO|http://stackoverflow.com/questions/37541668/non-linear-dag-ml-pipelines-in-apache-spark]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

