Hi, thanks for the answers. I have read the materials you pointed to, but I am rather looking for material on the internals, e.g. how the optimizer works, how a query is translated into RDD operations, etc. I am already quite familiar with the API. A good starting point for me was: Spark DataFrames: Simple and Fast Analysis of Structured Data <https://www.brighttalk.com/webcast/12891/166495?utm_campaign=child-community-webcasts-feed&utm_content=Big+Data+and+Data+Management&utm_source=brighttalk-portal&utm_medium=web&utm_term=>

2015-08-20 18:29 GMT+02:00 Dhaval Patel <dhaval1...@gmail.com>:

> Or if you're a Python lover then this is a good place -
> https://spark.apache.org/docs/1.4.1/api/python/pyspark.sql.html#
>
> On Thu, Aug 20, 2015 at 10:58 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> See also
>> http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.package
>>
>> Cheers
>>
>> On Thu, Aug 20, 2015 at 7:50 AM, Muhammad Atif <muhammadatif...@gmail.com> wrote:
>>
>>> Hi Dawid
>>>
>>> The best place to get started is the Spark SQL Guide from Apache
>>> http://spark.apache.org/docs/latest/sql-programming-guide.html
>>>
>>> Regards
>>> Muhammad
>>>
>>> On Thu, Aug 20, 2015 at 5:46 AM, Dawid Wysakowicz <wysakowicz.da...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I would like to dip into SparkSQL and get to know the architecture,
>>>> good practices, and some internals better. Could you recommend some
>>>> materials on this matter?
>>>>
>>>> Regards
>>>> Dawid