My $0.02 -- this isn't worthwhile.

Yes, there are ML-in-SQL tools; MADlib is one example. I think these are
holdovers from the days when SQL was someone's only interface to a data
warehouse, so invoking ML jobs required SQL-language support. There was no
programmatic alternative.

There's nothing particularly helpful about SQL as a language for expressing
ML workflows, versus simply writing the same operations in a high-level
programming language.

Spark is that programmatic paradigm, and offers a more general way to
express ETL, ML and SQL within their own appropriate DSLs. There's no need
to also shoehorn Spark ML into Spark SQL.
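
As a rough sketch of what I mean, in PySpark: SQL picks the rows, and the
spark.ml DSL fits the model, all in one program. The function name and its
arguments are made up for illustration; only the pyspark calls themselves
are real API.

```python
def train_from_sql(spark, query, feature_cols, label_col):
    """Select training data with Spark SQL, then fit a model with spark.ml.

    Hypothetical helper: `spark` is an existing SparkSession, `query` is any
    SQL string returning the numeric feature columns plus the label column.
    """
    # Imported inside the function so the sketch loads without a Spark install.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import VectorAssembler

    # SQL does what SQL is good at: selecting and shaping the rows.
    training = spark.sql(query)

    # The ML DSL does the rest: assemble a feature vector and fit a model.
    assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
    lr = LogisticRegression(featuresCol="features", labelCol=label_col)
    return Pipeline(stages=[assembler, lr]).fit(training)
```

The fitted model's transform() returns a DataFrame, so the predictions can
be registered with createOrReplaceTempView() and queried from SQL again if
that's what the analysts want.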

I also think there's a bit of false abstraction here. The appeal of
SQL-only access to these functions is that it sounds much simpler, and more
accessible to people who know only SQL and nothing about Python or the JVM.
In practice, using Spark means having some basic awareness of its
distributed execution environment. SQL-only analysts would struggle to be
effective with SQL-only access to Spark.

On Fri, Aug 31, 2018 at 5:05 AM Hemant Bhanawat <hemant9...@gmail.com>
wrote:

> We allow our users to interact with spark cluster using SQL queries only.
> That's easy for them. MLLib does not have SQL extensions and we cannot
> expose it to our users.
>
> SQL extensions can further accelerate MLLib's adoption. See
> https://cloud.google.com/bigquery/docs/bigqueryml-intro.
>
> Hemant
>
>
> On Thu, Aug 30, 2018 at 9:41 PM William Benton <wi...@redhat.com> wrote:
>
>> What are you interested in accomplishing?
>>
>> The spark.ml package has provided a machine learning API based on
>> DataFrames for quite some time.  If you are interested in mixing query
>> processing and machine learning, this is certainly the best place to start.
>>
>> See here:  https://spark.apache.org/docs/latest/ml-guide.html
>>
>>
>> best,
>> wb
>>
>>
>>
>> On Thu, Aug 30, 2018 at 1:45 AM Hemant Bhanawat <hemant9...@gmail.com>
>> wrote:
>>
>>> Is there a plan to support SQL extensions for mllib? Or is there an
>>> effort already underway?
>>>
>>> Any information is appreciated.
>>>
>>> Thanks in advance.
>>> Hemant
>>>
>>
