What will happen if you LIMIT the result set to 100 rows only -- select <field> from <table> order by <field> LIMIT 100. Will that work?
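A rough Scala sketch of that LIMIT check, in the same spark-shell style as the original post below (MY_TABLE and MY_FIELD are placeholder names, not anything from the thread; in a Spark 2.x spark-shell the SparkSession is available as 'spark'):

    // Same query as in the original post, but capped at 100 rows so the
    // sorted result stays tiny.
    val field = "MY_FIELD"  // placeholder column name
    val top100 = spark.sql(
      "SELECT " + field + " FROM MY_TABLE ORDER BY " + field + " DESC LIMIT 100")
    top100.show(100, truncate = false)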
How about running the whole query WITHOUT order by? (There is a rough sketch of that check below the quoted message.)

HTH

Dr Mich Talebzadeh

LinkedIn
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On 30 September 2016 at 17:57, Babak Alipour <babak.alip...@gmail.com> wrote:

> Greetings everyone,
>
> I'm trying to read a single field of a Hive table stored as Parquet in
> Spark (~140GB for the entire table; this single field should be just a few
> GB) and look at the sorted output using the following:
>
>     sql("SELECT " + field + " FROM MY_TABLE ORDER BY " + field + " DESC")
>
> But this simple line of code gives:
>
>     Caused by: java.lang.IllegalArgumentException: Cannot allocate a page
>     with more than 17179869176 bytes
>
> The same error occurs for:
>
>     sql("SELECT " + field + " FROM MY_TABLE").sort(field)
>
> and:
>
>     sql("SELECT " + field + " FROM MY_TABLE").orderBy(field)
>
> I'm running this on a machine with more than 200GB of RAM, in local mode
> with spark.driver.memory set to 64g.
>
> I do not know why it cannot allocate a big enough page, or why it is
> trying to allocate such a big page in the first place.
>
> I hope someone with more knowledge of Spark can shed some light on this.
> Thank you!
>
> *Best regards,*
> *Babak Alipour,*
> *University of Florida*
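For reference, a minimal sketch of the no-ORDER-BY check against the setup described in the quoted message. It assumes a spark-shell launched in local mode with the driver memory mentioned above (e.g. spark-shell --master "local[*]" --driver-memory 64g) and built with Hive support so the table is visible; MY_TABLE and MY_FIELD are placeholder names:

    val field = "MY_FIELD"  // placeholder column name

    // 1) Plain projection, no ORDER BY: does scanning the single column
    //    on its own complete without the page-allocation error?
    val plain = spark.sql("SELECT " + field + " FROM MY_TABLE")
    println(plain.count())  // forces a full scan with no sort

    // 2) The sorted version from the original post, for comparison.
    val sorted = spark.sql(
      "SELECT " + field + " FROM MY_TABLE ORDER BY " + field + " DESC")
    sorted.show(20)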