OK. What should the table be? If I have a bunch of Parquet files, do I just specify the directory as the table?
On Fri, Jan 1, 2016 at 11:32 PM, UMESH CHAUDHARY <umesh9...@gmail.com> wrote:

> Ok, so what's wrong in using:
>
> var df = HiveContext.sql("Select * from table where id = <userId>")
> // filtered data frame
> df.count
>
> On Sat, Jan 2, 2016 at 11:56 AM, SRK <swethakasire...@gmail.com> wrote:
>
>> Hi,
>>
>> How can I load partial data from HDFS using Spark SQL? Suppose I want to
>> load data based on a filter like
>>
>> "Select * from table where id = <userId>" using Spark SQL with DataFrames.
>> How can that be done? The idea is that I do not want to load the whole
>> dataset into memory when I run the SQL; I only want to load the rows that
>> match the filter.
>>
>> Thanks!
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-load-partial-data-from-HDFS-using-Spark-SQL-tp25855.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
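Putting the two points in the thread together, a minimal sketch in the Spark 1.x API used above. This is illustrative, not a tested answer: the HDFS path `hdfs:///data/events`, the column `id`, and the variable `userId` are hypothetical names, and running it requires a Spark cluster.

```scala
import org.apache.spark.sql.hive.HiveContext

// Assumes an existing SparkContext `sc` and some `userId` value.
val sqlContext = new HiveContext(sc)
val userId = 42

// Answering the Parquet question: point the reader at the directory;
// Spark treats all Parquet files under it as one table.
val events = sqlContext.read.parquet("hdfs:///data/events")

// DataFrames are lazy, and filters on Parquet are pushed down to the
// reader where possible, so Spark skips non-matching row groups rather
// than loading the whole dataset and filtering in memory.
val filtered = events.filter(events("id") === userId)
filtered.count()

// Equivalent SQL route, as in the reply above: register the DataFrame
// as a temporary table first, then filter with a WHERE clause.
events.registerTempTable("events")
val viaSql = sqlContext.sql(s"SELECT * FROM events WHERE id = $userId")
```

Note that nothing is read from HDFS until an action such as `count()` runs; the filter only limits what that action scans.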