Re: Spark SQL Roadmap?

2014-12-13 Thread Matei Zaharia
Spark SQL is already available; the reason for the "alpha component" label is 
that we are still tweaking some of the APIs, so we have not yet guaranteed API 
stability for it. However, that is likely to happen soon (possibly 1.3). One of 
the major things added in Spark 1.2 was an external data sources API 
(https://github.com/apache/spark/pull/2475), so we wanted to get a bit of 
feedback on that to provide a stable API for those as well.

Matei

> On Dec 13, 2014, at 5:26 PM, Xiaoyong Zhu wrote:
> 
> Thanks Denny for your information!
> For #1, what I meant is the Spark SQL beta/official release date (as of today 
> it is still in the alpha phase). Though I can see it already has most basic 
> functionality, I don’t know when the next milestone (i.e., beta) will happen.
> For #2, thanks for the information! I read it and it’s really useful! My take 
> is that Hive on Spark is still Hive (and thus has all the metastore 
> information and Hive interfaces such as the REST APIs), while Spark SQL is 
> an extension of Spark that uses several interfaces (HiveContext, for example) 
> to support running Hive queries. Is this correct?
>  
> A follow-up question: does Spark SQL have REST APIs, like those WebHCat 
> exposes, that let users submit queries remotely instead of logging into a 
> cluster and running commands in the spark-sql command line? 
>  
> Xiaoyong
>  
> From: Denny Lee [mailto:denny.g@gmail.com] 
> Sent: Saturday, December 13, 2014 10:59 PM
> To: Xiaoyong Zhu; user@spark.apache.org
> Subject: Re: Spark SQL Roadmap?
>  
> Hi Xiaoyong,
> 
> SparkSQL has already been released and has been part of the Spark code-base 
> since Spark 1.0.  The latest stable release is Spark 1.1 (here's the Spark 
> SQL Programming Guide 
> <http://spark.apache.org/docs/1.1.0/sql-programming-guide.html>) and we're 
> currently voting on Spark 1.2.
>  
> Hive on Spark is an initiative by Cloudera to help folks who are already 
> using Hive run it on Spark instead of traditional MR.  For more information, 
> check out 
> http://blog.cloudera.com/blog/2014/07/apache-hive-on-apache-spark-motivations-and-design-principles/.
>  
> For anyone who is building new projects in Spark, IMHO I would suggest 
> jumping to SparkSQL first.
>  
> HTH!
> Denny
>  
>  
> On Sat Dec 13 2014 at 5:00:56 AM Xiaoyong Zhu <xiaoy...@microsoft.com> wrote:
> Dear Spark experts, I am very interested in Spark SQL availability in the 
> future. Could someone share information on the following questions?
> 1.   Is there an ETA for the Spark SQL release?
> 
> 2.   I heard there is a Hive on Spark program also – what’s the 
> difference between Spark SQL and Hive on Spark?
> 
>  
> Thanks!
> Xiaoyong



RE: Spark SQL Roadmap?

2014-12-13 Thread Xiaoyong Zhu
Thanks Denny for your information!
For #1, what I meant is the Spark SQL beta/official release date (as of today 
it is still in the alpha phase). Though I can see it already has most basic 
functionality, I don’t know when the next milestone (i.e., beta) will happen.
For #2, thanks for the information! I read it and it’s really useful! My take 
is that Hive on Spark is still Hive (and thus has all the metastore information 
and Hive interfaces such as the REST APIs), while Spark SQL is an extension of 
Spark that uses several interfaces (HiveContext, for example) to support 
running Hive queries. Is this correct?

A follow-up question: does Spark SQL have REST APIs, like those WebHCat 
exposes, that let users submit queries remotely instead of logging into a 
cluster and running commands in the spark-sql command line?

Xiaoyong

From: Denny Lee [mailto:denny.g@gmail.com]
Sent: Saturday, December 13, 2014 10:59 PM
To: Xiaoyong Zhu; user@spark.apache.org
Subject: Re: Spark SQL Roadmap?

Hi Xiaoyong,

SparkSQL has already been released and has been part of the Spark code-base 
since Spark 1.0.  The latest stable release is Spark 1.1 (here's the Spark SQL 
Programming Guide 
<http://spark.apache.org/docs/1.1.0/sql-programming-guide.html>) and we're 
currently voting on Spark 1.2.

Hive on Spark is an initiative by Cloudera to help folks who are already using 
Hive run it on Spark instead of traditional MR.  For more information, check 
out 
http://blog.cloudera.com/blog/2014/07/apache-hive-on-apache-spark-motivations-and-design-principles/.

For anyone who is building new projects in Spark, IMHO I would suggest jumping 
to SparkSQL first.

HTH!
Denny


On Sat Dec 13 2014 at 5:00:56 AM Xiaoyong Zhu <xiaoy...@microsoft.com> wrote:
Dear Spark experts, I am very interested in Spark SQL availability in the 
future. Could someone share information on the following questions?

1.   Is there an ETA for the Spark SQL release?

2.   I heard there is a Hive on Spark program also – what’s the difference 
between Spark SQL and Hive on Spark?

Thanks!
Xiaoyong


Re: Spark SQL Roadmap?

2014-12-13 Thread Denny Lee
Hi Xiaoyong,

SparkSQL has already been released and has been part of the Spark code-base
since Spark 1.0.  The latest stable release is Spark 1.1 (here's the Spark
SQL Programming Guide
<http://spark.apache.org/docs/1.1.0/sql-programming-guide.html>) and we're
currently voting on Spark 1.2.

Hive on Spark is an initiative by Cloudera to help folks who are already
using Hive run it on Spark instead of traditional MR.  For more information,
check out
http://blog.cloudera.com/blog/2014/07/apache-hive-on-apache-spark-motivations-and-design-principles/.

For anyone who is building new projects in Spark, IMHO I would suggest
jumping to SparkSQL first.

HTH!
Denny


On Sat Dec 13 2014 at 5:00:56 AM Xiaoyong Zhu wrote:

> Dear Spark experts, I am very interested in Spark SQL availability in
> the future. Could someone share information on the following questions?
>
> 1.   Is there an ETA for the Spark SQL release?
>
> 2.   I heard there is a Hive on Spark program also – what’s the
> difference between Spark SQL and Hive on Spark?
>
>
>
> Thanks!
>
> Xiaoyong
>


Spark SQL Roadmap?

2014-12-13 Thread Xiaoyong Zhu
Dear Spark experts, I am very interested in Spark SQL availability in the 
future. Could someone share information on the following questions?

1.   Is there an ETA for the Spark SQL release?

2.   I heard there is a Hive on Spark program also - what's the difference 
between Spark SQL and Hive on Spark?

Thanks!
Xiaoyong