It does do a take. Run explain on the query to confirm that. Why do you think it's reading the whole table?
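If it helps, here is a minimal PySpark sketch of that check. The helper name `show_limit_plan` is made up for illustration, and it assumes you already have a HiveContext (your SqlCtx) to pass in. It prints the plans for the LIMIT query and then fetches the rows. In the physical plan, look for a Limit operator sitting above the Hive table scan; exact operator names vary between Spark versions, so treat that as a rough guide.

```python
def show_limit_plan(sql_ctx, table="my_db.my_tbl", n=5):
    """Print the query plans for a LIMIT query, then fetch the rows.

    sql_ctx is assumed to be an existing SQLContext/HiveContext.
    This helper is hypothetical; it is not part of Spark.
    """
    df = sql_ctx.sql("SELECT * FROM {} LIMIT {}".format(table, n))
    # extended=True prints the logical plans as well as the physical plan.
    df.explain(True)
    # take(n) returns at most n rows to the driver as a list.
    return df.take(n)


# Usage from a pyspark shell, assuming `sc` is the SparkContext:
#   from pyspark.sql import HiveContext
#   sqlCtx = HiveContext(sc)
#   rows = show_limit_plan(sqlCtx, "my_db.my_tbl", 5)
```

If the plan shows the limit but the job still looks slow, the stage and task counts in the Spark UI are a better indicator of how much of the table is actually being read.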
On Mon, Oct 5, 2015 at 1:53 PM, YaoPau <jonrgr...@gmail.com> wrote:
> I'm using SqlCtx connected to Hive in CDH 5.4.4. When I run "SELECT * FROM
> my_db.my_tbl LIMIT 5", it scans the entire table like Hive would instead of
> doing a .take(5) on it and returning results immediately.
>
> Is there a way to get Spark SQL to use .take(5) instead of the Hive logic of
> scanning the full table when running a SELECT?