> It would be great if there were any examples or use cases to look at?

There are examples in the Spark documentation. Patrick posted an updated copy here so people can see them before 1.0 is released: http://people.apache.org/~pwendell/catalyst-docs/sql-programming-guide.html

> Does this feature have different use cases than Shark, or is it just
> cleaner now that the Hive dependency is gone?

Depending on how you use this, there can still be a dependency on Hive (by default there is none; see the above documentation for more details). However, the dependency is on a stock version of Hive instead of one modified by the AMPLab. Furthermore, Spark SQL has its own optimizer, instead of relying on the Hive optimizer. Long term, this is going to give us a lot more flexibility to optimize queries specifically for the Spark execution engine.

We are actively porting over the best parts of Shark (specifically the in-memory columnar representation). Shark still has some features that are missing in Spark SQL, including SharkServer (and years of testing). Once Spark SQL graduates from alpha status, it will likely become the new backend for Shark.