Re: Query data in subdirectories in Hive Partitions using Spark SQL

2017-02-18 Thread Jon Gregg
Spark has partition discovery if your data is laid out in a Parquet-friendly directory structure: http://spark.apache.org/docs/latest/sql-programming-guide.html#partition-discovery You can also use wildcards to read subdirectories (I'm using Spark 1.6 here) >> data2 = sqlContext.read.load("/my/data
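
A rough PySpark sketch of the two approaches Jon mentions, assuming a Spark 1.6-style SQLContext; the directory layout, glob pattern, and paths are illustrative placeholders, not taken from the thread:

    # Hedged sketch of both approaches with a Spark 1.6-era SQLContext.
    # The layout /my/data/year=.../month=... and the glob pattern are placeholders.
    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext(appName="partition-discovery-sketch")
    sqlContext = SQLContext(sc)

    # 1) Partition discovery: with a layout like
    #    /my/data/year=2017/month=02/part-00000.parquet
    #    loading the root path lets Spark infer year and month as partition columns.
    df = sqlContext.read.parquet("/my/data")
    df.printSchema()

    # 2) Wildcards: glob explicitly into subdirectories, one level per "*".
    data2 = sqlContext.read.load("/my/data/*/*", format="parquet")
    data2.show()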

Re: Query data in subdirectories in Hive Partitions using Spark SQL

2017-02-17 Thread Yan Facai
Hi, Abdelfatah, How do you read these files? spark.read.parquet or spark.sql? Could you show some code? On Wed, Feb 15, 2017 at 8:47 PM, Ahmed Kamal Abdelfatah <ahmed.abdelfa...@careem.com> wrote: > Hi folks, > > > > How can I force Spark SQL to recursively get data stored in Parquet format > f
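
For reference, the settings most often suggested for this kind of recursive read look like the sketch below. It is not confirmed by the replies above: the table name is hypothetical, and whether these Hadoop/Hive properties take effect depends on the Spark version and on how the table was defined.

    # Hedged sketch: commonly suggested settings for making Hive-table reads in
    # Spark SQL descend into subdirectories of a partition. Whether they apply
    # depends on the Spark version and table type; treat them as assumptions.
    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(appName="recursive-partition-read-sketch")
    sqlContext = HiveContext(sc)

    # Ask the Hadoop input format to recurse into nested directories.
    sqlContext.setConf("mapreduce.input.fileinputformat.input.dir.recursive", "true")
    sqlContext.setConf("mapred.input.dir.recursive", "true")

    # Hive-side switch with the same intent for Hive tables.
    sqlContext.sql("SET hive.mapred.supports.subdirectories=true")

    # "my_db.my_table" is a placeholder for the partitioned table in question.
    df = sqlContext.sql("SELECT * FROM my_db.my_table WHERE year = 2017")
    df.show()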