Do you want to see the code that whole stage codegen produces?

You can prepend a SQL statement with EXPLAIN CODEGEN ...

Or you can add the following import to your DataFrame/Dataset code:

import org.apache.spark.sql.execution.debug._

and call debugCodegen() on a DataFrame/Dataset, for example:

spark.range(0, 100).debugCodegen()

...

Found 1 WholeStageCodegen subtrees.

== Subtree 1 / 1 ==

*Range (0, 100, splits=8)


Generated code:

/* 001 */ public Object generate(Object[] references) {

/* 002 */   return new GeneratedIterator(references);

/* 003 */ }

/* 004 */

/* 005 */ final class GeneratedIterator extends org.apache.spark.sql.execution.BufferedRowIterator {

/* 006 */   private Object[] references;

/* 007 */   private org.apache.spark.sql.execution.metric.SQLMetric range_numOutputRows;

/* 008 */   private boolean range_initRange;

/* 009 */   private long range_partitionEnd;

...
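For reference, here is a minimal sketch of both approaches in a standalone application rather than in spark-shell. It assumes Spark 2.x running in local mode; the object name ShowCodegen, the app name, the temp view name "numbers" and the sample query are made up for illustration. The exact generated code will differ with your Spark version and query.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.debug._ // brings debugCodegen() into scope

object ShowCodegen {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode; in spark-shell a session named `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("show-codegen") // hypothetical app name
      .getOrCreate()

    // Option 1: prefix a SQL statement with EXPLAIN CODEGEN.
    // "numbers" is a hypothetical temp view registered only for this example.
    spark.range(0, 100).createOrReplaceTempView("numbers")
    spark.sql("EXPLAIN CODEGEN SELECT id + 1 AS next FROM numbers")
      .collect()
      .foreach(row => println(row.getString(0))) // the result is a single string column

    // Option 2: call debugCodegen() on a DataFrame/Dataset.
    // It prints the WholeStageCodegen subtrees and their generated Java source to stdout.
    spark.range(0, 100).debugCodegen()

    spark.stop()
  }
}
```

Both options print to standard output, so no log4j configuration is needed just to read the generated source.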

On Fri, Aug 5, 2016 at 9:55 AM, Maciej Bryński <mac...@brynski.pl> wrote:

> Hi,
> I have some operation on DataFrame / Dataset.
> How can I see source code for whole stage codegen ?
> Is there any API for this ? Or maybe I should configure log4j in specific
> way ?
>
> Regards,
> --
> Maciek Bryński
>
