Todd, since you bring it up in this thread: which CDH version do you expect DECIMAL support to make it into? I recently asked Icaro Vazquez about it, but there is still no news. We're hoping it makes it into 5.14; otherwise, according to the roadmap, there might not be another minor release, and we'd be waiting until summer for CDH 6.

And just in case we're forced to make do without DECIMAL initially, is the recommendation really to store as string and convert? I was thinking of storing as int/long and dividing by 10 or 1000 as needed in an Impala view over the Kudu table. Wouldn't a division be way more performant than a conversion from string, especially when aggregating over thousands of records in a report query?

thanks,
-Mauricio

On Fri, Jan 5, 2018 at 11:13 AM, Todd Lipcon <t...@cloudera.com> wrote:

> Oh, one other piece of feedback: maybe worth editing the title to say "vs
> Apache Parquet" instead of "vs Apache Impala" since in all cases you are
> using Impala as the query engine?
>
> -Todd
>
> On Fri, Jan 5, 2018 at 11:06 AM, Todd Lipcon <t...@cloudera.com> wrote:
>
>> Hey Boris,
>>
>> Thanks for publishing this. It's a great look at how an end user
>> evaluates Kudu. I appreciate that you cover both the pros and cons of the
>> technology, and glad to see that your conclusion leaves you excited about
>> Kudu :)
>>
>> One quick note is that I think you'll be even more pleased when you
>> upgrade to a later version (e.g. Kudu 1.5). We've improved performance in
>> several areas and also improved scalability compared to the version you're
>> testing. TIMESTAMP is also supported now, with DECIMAL soon to follow. It
>> might be worth noting this as an addendum to the blog post if you feel like
>> it.
>>
>> -Todd
>>
>> On Fri, Jan 5, 2018 at 10:51 AM, Boris Tyukin <bo...@boristyukin.com>
>> wrote:
>>
>>> Hi guys,
>>>
>>> we just finished testing Kudu, mostly comparing Kudu to Impala on
>>> HDFS/Parquet. I wanted to share my blog post and results. We used typical
>>> (and real) healthcare data for the test, not synthetic data, which I think
>>> makes it a bit more interesting.
>>>
>>> I welcome any feedback!
>>>
>>> http://boristyukin.com/benchmarking-apache-kudu-vs-apache-impala/
>>>
>>> We are really impressed with Kudu and I wanted to take an opportunity to
>>> thank Kudu developers for such an amazing and much-needed product.
>>>
>>> Boris
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>
> --
> Todd Lipcon
> Software Engineer, Cloudera

--
MAURICIO ARISTIZABAL
Architect - Business Intelligence + Data Science
mauri...@impactradius.com
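
For illustration only, the scaled-integer idea raised in the question above (store amounts as int/long and divide when reading) can be sketched in Python. This is not Impala SQL and not a benchmark; it only shows the arithmetic. The scale of 100 (two decimal places) and the helper names `to_scaled`, `sum_scaled`, and `sum_strings` are assumptions made for this sketch and do not come from the thread.

```python
from decimal import Decimal

# Assumption for this sketch: amounts carry two decimal places,
# so they are stored as integers scaled by 100.
SCALE = 100


def to_scaled(amount: str) -> int:
    """Convert a decimal string such as '12.34' to a scaled integer (1234).

    Digits beyond the chosen scale are truncated by int().
    """
    return int(Decimal(amount) * SCALE)


def sum_scaled(values: list[int]) -> Decimal:
    """Aggregate scaled integers, then divide once at the end."""
    return Decimal(sum(values)) / SCALE


def sum_strings(values: list[str]) -> Decimal:
    """Aggregate string-encoded amounts: every row is parsed before summing."""
    return sum((Decimal(v) for v in values), Decimal(0))


amounts = ["12.34", "0.66", "100.00"]
scaled = [to_scaled(a) for a in amounts]

# Both paths give the same total; the scaled path parses nothing at read time.
total_from_ints = sum_scaled(scaled)
total_from_strings = sum_strings(amounts)
```

The point of the sketch is that the integer path does one division after the aggregate, while the string path converts every row before it can add anything. Whether that difference matters in an Impala view over a Kudu table would have to be measured on real data.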