Hi,
If you are already running Hive with Tez, the performance gain from moving
to Spark won't be obvious.
I'd recommend experimenting with Spark on something new until you have a
better understanding of it.
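
For a small trial, the two options from the original mail could look roughly
like the sketch below. It is only an illustration, not a tested setup: it
assumes Hive on Spark is actually available on the cluster, that pyspark is
installed, and that a Hive metastore is reachable. The helper names
(with_engine, daily_totals, build_session) and the table/column names (sales,
day, amount) are made up for the example.

```python
from typing import Any

# Values accepted by the Hive property hive.execution.engine.
VALID_ENGINES = {"mr", "tez", "spark"}


def with_engine(hql: str, engine: str = "spark") -> str:
    """Option 1: keep the HQL unchanged and only switch the execution engine."""
    if engine not in VALID_ENGINES:
        raise ValueError(f"unknown engine: {engine!r}")
    return f"SET hive.execution.engine={engine};\n{hql}"


def build_session(app_name: str = "hive-to-spark-trial") -> Any:
    """Create a SparkSession that can read Hive tables (needs pyspark)."""
    from pyspark.sql import SparkSession  # imported lazily on purpose

    return SparkSession.builder.appName(app_name).enableHiveSupport().getOrCreate()


def daily_totals(spark: Any, table: str = "sales"):
    """Option 2: query the Hive table through Spark SQL with explicit caching.

    The table and column names are hypothetical.
    """
    df = spark.sql(f"SELECT * FROM {table}")
    df.cache()  # mark the scan for in-memory reuse by later actions
    totals = df.groupBy("day").sum("amount")
    return df, totals
```

With option 1 you would prepend the SET line to an existing script, e.g.
with_engine(open("job.hql").read()). With option 2 you call
daily_totals(build_session()), and call df.unpersist() once the cached data is
no longer needed. Running the same query both ways on one real job is probably
the quickest way to see whether the difference matters for your workload.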

Manu Jacob <manu.ja...@sas.com> wrote on Tue, Oct 6, 2020 at 23:47:

> Hi All,
>
>
>
> Not sure whether I should ask this question on the Hive community or the
> Spark community.
>
>
>
> We have a set of Hive scripts that run on EMR (Tez engine). We would like
> to experiment by moving some of them onto Spark. We are planning to
> experiment with two options.
>
>
>    1. Use the current HQL-based code, with the engine set to Spark.
>    2. Write pure Spark code in Scala/Python using Spark SQL and Hive
>    integration.
>
>
>
> The first approach helps us transition to Spark quickly, but we are not
> sure if it is the best approach in terms of performance. We could not find
> any reasonable comparisons of these two approaches. It looks like writing
> pure Spark code gives us more control to add logic and also to control some
> of the performance features, for example caching and evicting data.
>
>
>
>
>
> Any advice on this is much appreciated.
>
>
>
>
>
> Thanks,
>
> -Manu
>
