Re: Article on the correctness of Hive on MR3, Presto, and Impala

2019-06-26 Thread Sungwoo Park
I think so. If you would like to scrutinize the results, sorting them and running a diff is probably the best way. If you would like to test the results quickly and can tolerate a bit of uncertainty, I guess comparing the number of rows would be sufficient because two different results are
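
The two checks mentioned in this message (a quick row-count comparison and a stricter sort-then-diff) could be sketched roughly as follows. This is only an illustration, not the method used in the article: the function names, the in-memory list-of-tuples representation, and the rendering of NULL as the text "NULL" are all assumptions made for the example.

```python
import difflib

def normalize(rows):
    """Sort rows on all columns so engine-specific output order does not matter."""
    # Assumption: render every value as text (None becomes "NULL") so that
    # mixed types and NULLs can be sorted and compared uniformly.
    return sorted(tuple("NULL" if v is None else str(v) for v in row) for row in rows)

def quick_check(rows_a, rows_b):
    """Fast but lossy check: compares only the number of rows."""
    return len(rows_a) == len(rows_b)

def full_diff(rows_a, rows_b, name_a="engine_a", name_b="engine_b"):
    """Strict check: sort both result sets, then diff them line by line."""
    lines_a = ["\t".join(r) for r in normalize(rows_a)]
    lines_b = ["\t".join(r) for r in normalize(rows_b)]
    # An empty list means the two result sets match after sorting.
    return list(difflib.unified_diff(lines_a, lines_b,
                                     fromfile=name_a, tofile=name_b, lineterm=""))
```

Note that the row-count check can pass while the diff still reports differences, and that values such as floating-point numbers may be formatted differently by different engines, so a real comparison would likely need extra normalization.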

Re: Article on the correctness of Hive on MR3, Presto, and Impala

2019-06-26 Thread Edward Capriolo
I like the approach of applying an arbitrary limit. Hive's q files tend to add an ordering to everything. Would it make sense to simply order by multiple columns in the result set and run a large diff on the output? On Wednesday, June 26, 2019, Sungwoo Park wrote: > I have published a new article

Fwd: Article on the correctness of Hive on MR3, Presto, and Impala

2019-06-26 Thread Sungwoo Park
I have published a new article on the correctness of Hive on MR3, Presto, and Impala: https://mr3.postech.ac.kr/blog/2019/06/26/correctness-hivemr3-presto-impala/ Hope you enjoy reading the article. --- Sungwoo