Re: new Catalyst/SQL component merged into master

2014-03-25 Thread Evan Chan
Hi Michael, It's not publicly available right now, though we can probably chat about it offline. It's not a super novel concept or anything; in fact, I had proposed it a long time ago on the mailing lists. -Evan On Mon, Mar 24, 2014 at 1:34 PM, Michael Armbrust wrote: > Hi Evan, > > Index supp

Re: new Catalyst/SQL component merged into master

2014-03-24 Thread Michael Armbrust
Hi Evan, Index support is definitely something we would like to add, and it is possible that adding support for your custom indexing solution would not be too difficult. We already push predicates into Hive table scan operators when the predicates are over partition keys. You can see an example

Re: new Catalyst/SQL component merged into master

2014-03-24 Thread Usman Ghani
How does it compare against Shark, and what is the future of Shark with this new module in place? On Sun, Mar 23, 2014 at 11:49 PM, Evan Chan wrote: > Hi Michael, > > Congrats, this is really neat! > > What thoughts do you have regarding adding indexing support and > predicate pushdown to this

Re: new Catalyst/SQL component merged into master

2014-03-23 Thread Evan Chan
Hi Michael, Congrats, this is really neat! What thoughts do you have regarding adding indexing support and predicate pushdown to this SQL framework? Right now we have custom bitmap indexing to speed up queries, so we're really curious about the architectural direction. -Evan On Fri, Mar

Re: new Catalyst/SQL component merged into master

2014-03-21 Thread Michael Armbrust
> > It will be great if there are any examples or use cases to look at? > There are examples in the Spark documentation. Patrick posted an updated copy here so people can see them before 1.0 is released: http://people.apache.org/~pwendell/catalyst-docs/sql-programming-guide.html > Does this feat

Re: new Catalyst/SQL component merged into master

2014-03-21 Thread Debasish Das
Awesome news! It will be great if there are any examples or use cases to look at. We are looking into Shark and the Ooyala job server to provide in-memory SQL analytics and model serving/scoring features for dashboard apps... Does this feature have different use cases than Shark or more cleaner as hive depende

Re: new Catalyst/SQL component merged into master

2014-03-21 Thread Matei Zaharia
Congrats Michael and all for getting this so far. Spark SQL and Catalyst will make it much easier to use structured data in Spark, and open the door for some very cool extensions later. Matei On Mar 20, 2014, at 11:15 PM, Heiko Braun wrote: > Congrats! That's a really impressive and useful ad

Re: new Catalyst/SQL component merged into master

2014-03-20 Thread Heiko Braun
Congrats! That's a really impressive and useful addition to Spark. I just recently discovered a similar feature in pandas and really enjoyed using it. Regards, Heiko > On 21.03.2014 at 02:11, Reynold Xin wrote: > > Hi All, > > I'm excited to announce a new module in Spark (SPARK-1251). A

Re: new Catalyst/SQL component merged into master

2014-03-20 Thread Michael Armbrust
Hi Everyone, I'm very excited about merging this new feature into Spark! We have a lot of cool things in the pipeline, including: porting Shark's in-memory columnar format to Spark SQL, code generation for expression evaluation, and improved support for complex types in Parquet. I would love to h

new Catalyst/SQL component merged into master

2014-03-20 Thread Reynold Xin
Hi All, I'm excited to announce a new module in Spark (SPARK-1251). After an initial review we've merged this into Spark as an alpha component to be included in Spark 1.0. This new component adds some exciting features, including: - schema-aware RDD programming via an experimental DSL - native Parq