Our claim is based on the TPC-DS results reported in https://www.datamonad.com/post/2022-04-01-spark-hive-performance-1.4/ for 1) and https://www.datamonad.com/post/2021-08-18-spark-mr3/ for 2). I am not sure what you mean by 'Hive ability on handling MR'. One may, of course, draw different conclusions from the same experimental results and interpret our TPC-DS results in a different way.

Sungwoo

On Sun, Jan 8, 2023 at 10:01 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> What bothers me is that you are making sweeping statements about Spark
> inability to handle quote " ... the key weakness of Spark is 1) its poor
> performance when executing concurrent queries and 2) its poor resource
> utilization when executing multiple Spark applications concurrently"
> and conversely overstating Hive ability on handling MR.
> In fairness anything published in a public forum is fair game for analysis
> or criticism. Then you are expected to back it up. I cannot see how anyone
> could object to the statement: if you make a claim, be prepared to prove
> it.
>
> I am open minded on this so please clarify the above statement
>
> HTH
>
> view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> https://en.everybodywiki.com/Mich_Talebzadeh
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On Sun, 8 Jan 2023 at 05:21, Sungwoo Park <glap...@gmail.com> wrote:
>
>>> [image: image.png]
>>>
>>> from your posting, the result is amazing. glad to know hive on mr3 has
>>> that nice performance.
>>
>> Hive on MR3 is similar to Hive-LLAP in performance, so we can interpret
>> the above result as Hive being much faster than SparkSQL. For executing
>> concurrent queries, the performance gap is even greater. In my (rather
>> biased) opinion, the key weakness of Spark is 1) its poor performance when
>> executing concurrent queries and 2) its poor resource utilization when
>> executing multiple Spark applications concurrently.
>>
>> We released Hive on MR3 1.6 a couple of weeks ago. Now we have backported
>> about 700 patches to Hive 3.1. If interested, please check it out:
>> https://www.datamonad.com/
>>
>> Sungwoo