Github user shivaram commented on the pull request:
https://github.com/apache/spark/pull/3099#issuecomment-62591916
@mengxr @mateiz -- From reading the thread above, my understanding is that
developers write new transformers and estimators, while users are expected to
use the existing transformers (which are part of Spark) in pipelines.
If that understanding is correct, I'd be much more comfortable if
Transformer, Evaluator, etc. were `private[ml]`. From the above comments I
see that the developer API is not ready for use and that traits etc. will only
be available in `org.apache.spark.ml`. Given that, wouldn't it make sense to
make them private?
That would still leave things like `TFIDF` or `Tokenizer` public, and
users could write something like the `SimpleTextPipeline`. (That would be
similar to MLlib, and it's fine, but personally I don't see the point of
Pipelines where you can't write new transformers.)