Re: Any plans to migrate Transformer API to Spark SQL (closer to DataFrames)?

2016-03-26 Thread Jacek Laskowski
Hi Joseph, Thanks for the response. I'm one who doesn't yet understand all the hype around (or need for) Machine Learning, and I'm looking at the ML space through Spark ML(lib) glasses. In the meantime I've got a few assignments (in a project with Spark and Scala) that have required quite extensive dataset

Re: Any plans to migrate Transformer API to Spark SQL (closer to DataFrames)?

2016-03-26 Thread Michał Zieliński
The Spark ML Pipelines API (not just Transformers, but Estimators and custom Pipeline classes as well) is definitely not machine-learning specific. We use it heavily in our development. We're building machine learning pipelines *BUT* many steps involve joining, schema manipulation,

Re: SPARK-13843 and future of streaming backends

2016-03-26 Thread Jacek Laskowski
Hi, Although I'm not a very experienced member of the ASF, I share your concerns. I hadn't looked at the issue from this point of view, but after reading the thread I think the PMC should've signed off on the migration of ASF-owned code to a non-ASF repo. At least a vote is required (and this

Re: SPARK-13843 Next steps

2016-03-26 Thread Sean Owen
Looks like this is done; docs have been moved, flume is back in, etc. For the moment Kafka streaming is still in the project and I know there's still discussion about how to manage multiple versions within the project. One other thing we need to finish up is stuff like the namespace of the code

Re: SPARK-13843 and future of streaming backends

2016-03-26 Thread Sean Owen
This has been resolved; see the JIRA and related PRs but also http://apache-spark-developers-list.1001551.n3.nabble.com/SPARK-13843-Next-steps-td16783.html This is not a scenario where a [VOTE] needs to take place, and code changes don't proceed through PMC votes. From the project perspective,

Creating Spark Extras project, was Re: SPARK-13843 and future of streaming backends

2016-03-26 Thread Luciano Resende
I believe some of this has been resolved in the context of some parties that had interest in one extra connector, but we still have a few removed, and as you mentioned, we still don't have a simple way or willingness to manage and be current on new packages like Kafka. And based on the fact that

Re: Creating Spark Extras project, was Re: SPARK-13843 and future of streaming backends

2016-03-26 Thread Luciano Resende
On Sat, Mar 26, 2016 at 10:20 AM, Jean-Baptiste Onofré wrote: > Hi Luciano, > > If we take the "pure" technical vision, there are pros and cons of having > spark-extra (or whatever name we give it) still as an Apache project: > > Pro: > - Governance & Quality Assurance: we

Re: Creating Spark Extras project, was Re: SPARK-13843 and future of streaming backends

2016-03-26 Thread Jean-Baptiste Onofré
Hi Luciano, I didn't mean Spark proper, but more something like you proposed. Regards JB On 03/26/2016 06:38 PM, Luciano Resende wrote: On Sat, Mar 26, 2016 at 10:20 AM, Jean-Baptiste Onofré wrote: Hi Luciano, If we take the "pure"

Re: SPARK-13843 and future of streaming backends

2016-03-26 Thread Mridul Muralidharan
On Saturday, March 26, 2016, Sean Owen wrote: > This has been resolved; see the JIRA and related PRs but also > > http://apache-spark-developers-list.1001551.n3.nabble.com/SPARK-13843-Next-steps-td16783.html > > This change happened subsequent to current thread (thanks