Re: Hive 3 has big performance improvement from my test

Sungwoo Park Sun, 08 Jan 2023 18:07:38 -0800

Our claim is based on TPC-DS results reported in
https://www.datamonad.com/post/2022-04-01-spark-hive-performance-1.4/ for 1)
and https://www.datamonad.com/post/2021-08-18-spark-mr3/ for 2). Not sure
about what you mean by 'Hive ability on handling MR'. One may draw
different conclusions from the same experimental results and interpret our
TPC-DS results in a different way.


Sungwoo


On Sun, Jan 8, 2023 at 10:01 PM Mich Talebzadeh <[email protected]>
wrote:

> What bothers me is that you are making sweeping statements about Spark,
> and I quote: " ... the key weakness of Spark is 1) its poor
> performance when executing concurrent queries and 2) its poor resource
> utilization when executing multiple Spark applications concurrently"
> and conversely overstating Hive ability on handling MR.
> In fairness, anything published in a public forum is fair game for analysis
> or criticism. Then you are expected to back it up. I cannot see how anyone
> could object to the statement: if you make a claim, be prepared to prove
> it.
>
> I am open-minded on this, so please clarify the above statement.
>
> HTH
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Sun, 8 Jan 2023 at 05:21, Sungwoo Park <[email protected]> wrote:
>
>>
>>> [image: image.png]
>>>
>>> From your posting, the result is amazing. Glad to know Hive on MR3 has
>>> such nice performance.
>>>
>>
>> Hive on MR3 is similar to Hive-LLAP in performance, so we can interpret
>> the above result as Hive being much faster than SparkSQL. For executing
>> concurrent queries, the performance gap is even greater. In my (rather
>> biased) opinion, the key weakness of Spark is 1) its poor performance when
>> executing concurrent queries and 2) its poor resource utilization when
>> executing multiple Spark applications concurrently.
>>
>> We released Hive on MR3 1.6 a couple of weeks ago. Now we have backported
>> about 700 patches to Hive 3.1. If interested, please check it out:
>> https://www.datamonad.com/
>>
>> Sungwoo
>>
>
