I noticed that for most SQL queries (sqlContext.sql(query)) I ran on
Parquet tables, some results are returned faster after the first and
second run of the query. Is this variation normal, i.e. can two executions
of the same job take different times? Or are some intermediate results
being cached? If it is the latter, what exactly is being cached?
I need this for a scientific paper, so it would be great if the
explanation were precise. Thanks!
--
PhD Student - EIS Department - Bonn University, Germany.
Website <http://www.mohamednadjibmami.com>.
LinkedIn <http://fr.linkedin.com/in/mohamednadjibmami>.