Hi Imran,
It seems that you are not caching your underlying DataFrame. I would suggest
forcing a cache with tweets.cache() followed by tweets.count(). Let us know
if the problem persists.
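For example, in spark-shell (a minimal sketch, assuming tweets is the DataFrame you registered from the Solr datasource):

```scala
// cache() is lazy: it only marks the DataFrame for caching.
tweets.cache()

// count() forces a full scan, which materializes the cache so that
// subsequent queries read from memory instead of re-fetching from Solr.
tweets.count()
```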
Best,
Anastasios
On Wed, Jul 19, 2017 at 2:49 PM, Imran Rajjad wrote:
> Greetings,
>
> We are trying out Spark 2 + ThriftServer to join multiple
> collections from a Solr Cloud (6.4.x). I have followed this blog:
> https://lucidworks.com/2015/08/20/solr-spark-sql-datasource/
> I understand that initially Spark populates the temporary table with 18633014
> records and takes its du