I remember a benchmark from several months ago (not public) that compared HAWQ with other SQL-on-Hadoop engines on TPC-DS; HAWQ was much faster. Different vendors may report different benchmark results, since each tunes the engines differently. Before HAWQ was open sourced, there was also a lot of discussion about how to improve the HAWQ executor, including vectorization, codegen, new hardware, etc.
@Michael, I also think it is a good time to discuss how to build a new HAWQ executor with various new optimizations. This could improve query performance considerably. I have started a JIRA on this topic (https://issues.apache.org/jira/browse/HAWQ-1450). I hope we can agree on a design and start working on it soon.

Thanks
Lei

On Wed, May 3, 2017 at 6:03 AM, Michael André Pearce <[email protected]> wrote:

> Indeed, the intent was much less about "mine is bigger than yours".
>
> It was more to ask whether the result is real and, if so, whether
> there are ideas or improvements to be learnt from the approaches
> Impala has taken that could be used in HAWQ.
>
> https://www.slideshare.net/mobile/cloudera/impala-performance-update
> http://www.sciencedirect.com/science/article/pii/S0164121216302400
>
> Likewise, are we benefitting at all from the upstream Greenplum sister
> project, as in codegen?
>
> Yes, we know it was Greenplum in the results, but HAWQ is its sister,
> so the results are indicative.
>
> Cheers
> Mike
>
> Sent from my iPad
>
> On 1 May 2017, at 23:27, Konstantin Boudnik <[email protected]> wrote:
>
> With my Apache hat on, I'd like to say that it is of little, if any at
> all, relevance to the Apache projects what companies like Cloudera say
> about their internal benchmarks.
>
> Apache projects do not compete between each other nor with any
> commercial products. While it is completely ok to say "official
> release of Apache Foo" was x percent faster than "official release of
> Apache Bar" somewhere in Apache Foo's blog or something, it is
> unacceptable for Apache Foo to get into a pissing contest with something
> forked from Apache Bar and sold by a commercial entity as a part of
> their offering (sometimes it is even impossible to say what exactly
> the entity in question is selling).
>
> In other words - let's not get into one of these "My Hadoop is bigger
> than yours" [1] moments again.
>
> But by all means - let's discuss the technicalities of bringing more
> efficient code generation code into the project, etc.
>
> [1] https://gigaom.com/2011/12/19/my-hadoop-is-bigger-than-yours/
>
> --
> With regards,
> Cos
>
> 2CAC 8312 4870 D885 8616 6115 220F 6980 1F27 E622
>
> Disclaimer: Opinions expressed in this email are those of the author,
> and do not necessarily represent the views of any company the author
> might be affiliated with at the moment of writing.
>
> On Mon, May 1, 2017 at 2:59 PM, Michael André Pearce
> <[email protected]> wrote:
>
> No doubt, if not already seen, Cloudera announced the following blog:
>
> http://blog.cloudera.com/blog/2017/04/apache-impala-leads-traditional-analytic-database/
>
> A clear shot across the bows of HAWQ.
>
> Also, how does HAWQ really compare? There are some old/dated HAWQ
> performance blogs; should they be updated?
>
> For the HAWQ community it would be good to know how long until HAWQ
> gets upstream Greenplum improvements like codegen.
>
> Likewise, what features or changes has Impala implemented to make it
> leapfrog Greenplum/HAWQ so much? Are any of the changes portable to HAWQ?
>
> Sent from my iPad
