[GitHub] spark pull request: [SPARK-3530][MLLIB] pipeline and parameters wi...

shivaram Tue, 11 Nov 2014 10:21:13 -0800

Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/3099#issuecomment-62591916
  
    @mengxr @mateiz -- From reading the above thread, what I see is that 
developers write new transformers and estimators, while users are expected to 
use existing transformers (which are a part of Spark) in pipelines.
    
    If the above definition is correct, I'd be much more comfortable if 
`Transformer`, `Evaluator`, etc. were `private[ml]`. From the above comments I 
see that the developer API is not ready for use, and that the traits etc. will 
only be available in `org.apache.spark.ml`. Given that, I think it makes sense 
to make them private?
    
    That would still leave things like `TFIDF` or `Tokenizer` public, and 
users could write something like the `SimpleTextPipeline`. [That would be 
similar to MLlib, and it's fine, but personally I don't see the point of 
Pipelines where you can't write new transformers.]
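
To make the trade-off concrete, here is a minimal Scala sketch of what package-private visibility would mean in practice. It is not the actual Spark source: the package `example.ml`, the simplified `transform` signature, and the `userapp` package are assumptions made up for illustration; only the names `Transformer`, `Tokenizer` and `SimpleTextPipeline` come from the discussion above.

```scala
// Hypothetical sketch: `example.ml` stands in for `org.apache.spark.ml`,
// and the trait below is deliberately much simpler than any real API.
package example.ml {

  // Developer-facing trait, visible only inside the `ml` package.
  private[ml] trait Transformer {
    def transform(input: Seq[String]): Seq[String]
  }

  // A concrete transformer that remains public for end users.
  class Tokenizer extends Transformer {
    override def transform(input: Seq[String]): Seq[String] =
      input.flatMap(_.split("\\s+").toSeq)
  }
}

package userapp {
  import example.ml.Tokenizer

  object SimpleTextPipeline {
    def main(args: Array[String]): Unit = {
      // Using an existing public transformer compiles fine.
      val tokens = new Tokenizer().transform(Seq("pipelines and parameters"))
      println(tokens)

      // Writing a new one would not compile from this package, because
      // `Transformer` is not accessible outside `example.ml`:
      //   class MyStage extends example.ml.Transformer { ... }
    }
  }
}
```

Under these assumptions, users can still assemble pipelines from the public stages, but they cannot define stages of their own until the trait is opened up.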



