OK. What should the table be? If I have a bunch of Parquet files, do I just specify the directory as the table?
On Fri, Jan 1, 2016 at 11:32 PM, UMESH CHAUDHARY <umesh9...@gmail.com> wrote:

> Ok, so what's wrong in using:
>
> var df = HiveContext.sql("Select * from table where id = <userId>")
> // filtered data frame
> df.count
>
> On Sat, Jan 2, 2016 at 11:56 AM, SRK <swethakasire...@gmail.com> wrote:
>
>> Hi,
>>
>> How can I load partial data from HDFS using Spark SQL? Suppose I want to
>> load data based on a filter like
>>
>> "Select * from table where id = <userId>" using Spark SQL with DataFrames.
>> How can that be done? The idea is that I do not want to load the whole
>> dataset into memory when I run the SQL; I only want to load the rows that
>> match the filter.
>>
>> Thanks!
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-load-partial-data-from-HDFS-using-Spark-SQL-tp25855.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
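Putting the two points in the thread together, a minimal sketch in the Spark 1.x API used above. This is illustrative, not a tested answer: the HDFS path `hdfs:///data/events`, the column `id`, and the variable `userId` are hypothetical names, and running it requires a Spark cluster.

```scala
import org.apache.spark.sql.hive.HiveContext

// Assumes an existing SparkContext `sc` and some `userId` value.
val sqlContext = new HiveContext(sc)
val userId = 42

// Answering the Parquet question: point the reader at the directory;
// Spark treats all Parquet files under it as one table.
val events = sqlContext.read.parquet("hdfs:///data/events")

// DataFrames are lazy, and filters on Parquet are pushed down to the
// reader where possible, so Spark skips non-matching row groups rather
// than loading the whole dataset and filtering in memory.
val filtered = events.filter(events("id") === userId)
filtered.count()

// Equivalent SQL route, as in the reply above: register the DataFrame
// as a temporary table first, then filter with a WHERE clause.
events.registerTempTable("events")
val viaSql = sqlContext.sql(s"SELECT * FROM events WHERE id = $userId")
```

Note that nothing is read from HDFS until an action such as `count()` runs; the filter only limits what that action scans.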