Re: Spark SQL "SELECT ... LIMIT" scans the entire Hive table?

Michael Armbrust Mon, 05 Oct 2015 14:36:57 -0700

It does do a take.  Run explain to confirm that is the case.  Why do you
think it's reading the whole table?
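
To make the difference between the two behaviours concrete, here is a toy model in plain Python. It is only a sketch and not Spark's implementation: CountingPartition, full_scan_then_limit, take and make_table are hypothetical names invented for this illustration, and the "table" is just four in-memory ranges. It shows why a take-style evaluation can return after reading a handful of rows while a scan-then-limit reads everything. For a real query, the plan printed by explain remains the authoritative check.

```python
from itertools import islice


class CountingPartition:
    """Hypothetical stand-in for one table partition; counts rows read."""

    def __init__(self, rows):
        self.rows = rows
        self.rows_read = 0

    def __iter__(self):
        for row in self.rows:
            self.rows_read += 1
            yield row


def full_scan_then_limit(partitions, n):
    # Read every row of every partition, then keep only the first n.
    all_rows = [row for part in partitions for row in part]
    return all_rows[:n]


def take(partitions, n):
    # Read partitions one at a time and stop as soon as n rows are collected.
    result = []
    for part in partitions:
        remaining = n - len(result)
        if remaining <= 0:
            break
        result.extend(islice(part, remaining))
    return result


def make_table():
    # Four partitions of 1000 rows each (made-up sizes for the illustration).
    return [CountingPartition(range(p * 1000, (p + 1) * 1000)) for p in range(4)]


scanned = make_table()
full_scan_then_limit(scanned, 5)
scan_rows_read = sum(p.rows_read for p in scanned)

taken = make_table()
first_five = take(taken, 5)
take_rows_read = sum(p.rows_read for p in taken)

print("scan-then-limit rows read:", scan_rows_read)
print("take rows read:", take_rows_read)
```

Both functions return the same five rows; only the amount of data touched differs, which is the distinction the question is about.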


On Mon, Oct 5, 2015 at 1:53 PM, YaoPau <jonrgr...@gmail.com> wrote:

> I'm using SqlCtx connected to Hive in CDH 5.4.4.  When I run "SELECT * FROM
> my_db.my_tbl LIMIT 5", it scans the entire table like Hive would instead of
> doing a .take(5) on it and returning results immediately.
>
> Is there a way to get Spark SQL to use .take(5) instead of the Hive logic
> of scanning the full table when running a SELECT?
