Re: Impala vs Greenplum

Lei Chang Tue, 02 May 2017 18:17:27 -0700

I remember there was a benchmark several months ago (not public) comparing
HAWQ and other SQL-on-Hadoop engines on TPC-DS, and HAWQ was much faster.
Different vendors may get different benchmark results, since each engine
is tuned differently. There was also a lot of discussion, before HAWQ was
open sourced, about how to improve the HAWQ executor, including
vectorization, codegen, new hardware, etc.


@Michael, I also think it is a good time to discuss how to build a new HAWQ
executor with various new optimizations. This could improve query
performance a lot.

I have started a JIRA on this topic (
https://issues.apache.org/jira/browse/HAWQ-1450). Hope that we can have a
design and start working on this soon.

Thanks
Lei


On Wed, May 3, 2017 at 6:03 AM, Michael André Pearce <
[email protected]> wrote:

> Indeed, the intent was very much not a "mine is bigger than yours" contest.
>
> It was more to ask whether the result is accurate and, if so, whether
> there are ideas or improvements to be learnt from the approaches Impala
> has taken that could be used in HAWQ.
>
> https://www.slideshare.net/mobile/cloudera/impala-performance-update
> http://www.sciencedirect.com/science/article/pii/S0164121216302400
>
> Likewise, are we benefiting at all from the upstream Greenplum sister
> project, for example from its codegen work?
>
> Yes, we know it was Greenplum in the results, but HAWQ is its sister
> project, so the results are indicative.
>
> Cheers
> Mike
>
>
>
> Sent from my iPad
>
> On 1 May 2017, at 23:27, Konstantin Boudnik <[email protected]> wrote:
>
> With my Apache hat on, I'd like to say that it is of little, if any at
> all, relevance to the Apache projects what companies like Cloudera say
> about their internal benchmarks.
>
> Apache projects do not compete with each other, nor with any
> commercial products. While it is completely ok to say "official
> release of Apache Foo" was x percent faster than "official release of
> Apache Bar" somewhere in Apache Foo's blog or something, it is
> unacceptable for Apache Foo to get into pissing contest with something
> forked from Apache Bar and sold by a commercial entity as a part of
> their offering (sometimes it is even impossible to say what exactly
> the entity in question is selling).
>
> In other words - let's not get into one of these "My Hadoop is bigger
> than yours" [1] moments again.
>
> But by all means - let's discuss the technicalities of bringing more
> efficient code generation code into the project, etc.
>
> [1] https://gigaom.com/2011/12/19/my-hadoop-is-bigger-than-yours/
>
> --
> With regards,
>  Cos
>
> 2CAC 8312 4870 D885 8616  6115 220F 6980 1F27 E622
>
> Disclaimer: Opinions expressed in this email are those of the author,
> and do not necessarily represent the views of any company the author
> might be affiliated with at the moment of writing.
>
>
> On Mon, May 1, 2017 at 2:59 PM, Michael André Pearce
> <[email protected]> wrote:
>
> In case you have not already seen it, Cloudera published the following
> blog post:
>
> http://blog.cloudera.com/blog/2017/04/apache-impala-leads-traditional-analytic-database/
>
>
> A clear shot across the bows of hawq.
>
>
> Also, how does HAWQ really compare? There are some old, dated HAWQ
> performance blog posts. Should they be updated?
>
>
> For the HAWQ community, it would be good to know how long it will be until
> HAWQ gets upstream Greenplum improvements like codegen.
>
>
> Likewise, what features or changes has Impala implemented to make it
> leapfrog Greenplum/HAWQ by so much? Are any of the changes portable to HAWQ?
>
>
>
>
> Sent from my iPad
>
>
