Thanks Daniel,

I like your answer for #1. It makes sense.

However, I don't get why you say that there are always pending 
transformations. After you call an action, shouldn't you be "clean" of 
pending transformations?

> On Aug 3, 2017, at 5:53 AM, Daniel Darabos <daniel.dara...@lynxanalytics.com> 
> wrote:
> 
> 
> On Wed, Aug 2, 2017 at 2:16 PM, Jean Georges Perrin <j...@jgp.net 
> <mailto:j...@jgp.net>> wrote:
> > Hi Sparkians,
> >
> > I understand the lazy evaluation mechanism with transformations and actions.
> > My question is simpler: 1) are show() and/or printSchema() actions? I would
> > assume so...
>
> show() is an action (it prints data) but printSchema() is not an action.
> Spark can tell you the schema of the result without computing the result.
>
> > and optional question: 2) is there a way to know if there are transformations
> > "pending"?
>
> There are always transformations pending :). An RDD or DataFrame is a series 
> of pending transformations. If you say val df = spark.read.csv("foo.csv"), 
> that is a pending transformation. Even spark.emptyDataFrame is best 
> understood as a pending transformation: it does not do anything on the 
> cluster, but records locally what it will have to do on the cluster.
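
For reference, here is a minimal sketch of the points quoted above. It is only an illustration under assumptions: it uses a local SparkSession and a small in-memory Seq instead of foo.csv, and the object name, app name, and column names are made up. It shows printSchema() answering from the plan alone, show() running a job, and the DataFrame remaining a recorded plan that a second show() would execute again (unless it has been cached).

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example object; the names here are not from the thread.
object LazyEvalSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session, just for illustration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("lazy-eval-sketch")
      .getOrCreate()
    import spark.implicits._

    // Assumption: in-memory data stands in for foo.csv.
    // This only records a plan locally; no job runs on the cluster.
    val df = Seq((1, "a"), (2, "b"), (3, "c")).toDF("id", "label")

    // Transformation: extends the recorded plan, still no job.
    val filtered = df.filter($"id" > 1)

    // Not an action: the schema is derived from the plan.
    filtered.printSchema()

    // Prints the physical plan, i.e. the work that is still "pending".
    filtered.explain()

    // Action: runs a job and prints the rows.
    filtered.show()

    // The DataFrame is still the same plan; this runs it again.
    filtered.show()

    spark.stop()
  }
}
```

Seen this way, an action does not "consume" the transformations: `filtered` stays a description of work, and each action executes that description.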
