Speeding up Catalyst engine

Maciej Bryński Mon, 24 Jul 2017 09:11:43 -0700

Hi Everyone,
I'm trying to speed up my Spark streaming application and I have following
problem.
I'm using a lot of joins in my app and full catalyst analysis is triggered
during every join.


I found 2 options to speed up.

1) spark.sql.selfJoinAutoResolveAmbiguity  option
But looking at code:
https://github.com/apache/spark/blob/8cd9cdf17a7a4ad6f2eecd7c4b388ca363c20982/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L918

Shouldn't lines 925-927 be before 920-922 ?

2) https://issues.apache.org/jira/browse/SPARK-20392

Is it safe to use it on top of 2.2.0 ?

Regards,
-- 
Maciek Bryński

Speeding up Catalyst engine

Reply via email to