Currently your only option is to write (or copy) your own implementations. Logging is definitely intended for internal use only, and it's best to use your own logging lib - Typesafe scala-logging is a common option that I've used.
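For a sense of what "write your own" looks like: a thin trait over JDK logging covers much of what Spark's private Logging trait provides, and scala-logging gives you the same shape with nicer ergonomics. A rough sketch (MyLogging and MyTransformer are illustrative names, not Spark API):

```scala
import java.util.logging.{Level, Logger}

// A minimal stand-in for Spark's private[spark] Logging trait,
// built on JDK logging so it has no extra dependency; in practice
// you might use Typesafe scala-logging's LazyLogging instead.
trait MyLogging {
  @transient private lazy val log: Logger =
    Logger.getLogger(getClass.getName)

  // By-name message so the string is only built if the level is enabled.
  protected def logInfo(msg: => String): Unit =
    if (log.isLoggable(Level.INFO)) log.info(msg)

  protected def logWarning(msg: => String): Unit =
    if (log.isLoggable(Level.WARNING)) log.warning(msg)
}

// Hypothetical transformer mixing in the trait.
class MyTransformer extends MyLogging {
  def run(n: Int): Int = {
    logInfo(s"processing $n items")
    n * 2
  }
}
```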
As for the VectorUDT, for now that is private. There are no plans to open it up as yet. It should not be too difficult to have your own UDT implementation. What type of extensions are you trying to do with the UDT?

Likewise the shared params are for now private. It is a bit annoying to have to re-create them, but most of them are pretty simple so it's not a huge overhead. Perhaps you can add your thoughts & comments to https://issues.apache.org/jira/browse/SPARK-19498 in terms of extending Spark ML.

Ultimately I support making it easier to extend, but we do have to balance that against exposing new public APIs and classes that impose backward-compat guarantees. Perhaps now is a good time to think about some of the common shared params, for example.

Thanks
Nick

On Wed, 22 Feb 2017 at 22:51 Shouheng Yi <sho...@microsoft.com.invalid> wrote:

Hi Spark developers,

Currently my team at Microsoft is extending Spark's machine learning functionality with new learners and transformers. We would like users to use these within Spark pipelines so that they can mix and match with existing Spark learners/transformers and have an overall native Spark experience. We cannot accomplish this from a non-"org.apache" namespace with the current implementation, and we don't want to release code inside the Apache namespace because that would be confusing and there could be naming-rights issues.

We need to extend several classes from Spark that happen to be marked "private[spark]". For example, one of our classes extends VectorUDT [0], which is declared as private[spark] class VectorUDT. This unfortunately puts us in a strange scenario that forces us to work under the namespace org.apache.spark.

To be specific, the private classes/traits we currently need in order to create new Spark learners & transformers are HasInputCol, VectorUDT and Logging. We will expand this list as we develop more. Is there a way to avoid this namespace issue? What do other people/companies do in this scenario?
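Re-creating a shared param such as HasInputCol mostly means mirroring the trait's shape in your own namespace. Below is a self-contained sketch: SimpleParams is a toy stand-in so the snippet runs without a Spark dependency; in real code your trait would extend org.apache.spark.ml.param.Params and use its Param type directly, and MyEstimator is just an illustrative name.

```scala
// Toy replacements for Spark's Param / Params, for illustration only.
case class Param[T](parent: AnyRef, name: String, doc: String)

trait SimpleParams {
  private val values = scala.collection.mutable.Map.empty[String, Any]
  protected def set[T](p: Param[T], v: T): this.type = { values(p.name) = v; this }
  protected def get[T](p: Param[T]): T = values(p.name).asInstanceOf[T]
}

// Mirrors the shape of Spark's private HasInputCol shared param.
trait HasInputCol extends SimpleParams {
  final val inputCol: Param[String] =
    Param(this, "inputCol", "input column name")

  def setInputCol(value: String): this.type = set(inputCol, value)
  final def getInputCol: String = get(inputCol)
}

// Your own learner/transformer mixes the trait in as usual.
class MyEstimator extends HasInputCol
```

Usage is then the familiar builder style: new MyEstimator().setInputCol("features").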
Thank you for your help!

[0]: https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/linalg/VectorUDT.scala

Best,
Shouheng