Hi Todd,

Thank you for your answer.
> *That said, it hasn't been a high priority relative to other performance
> areas we're exploring.*

Could you share the main performance areas you're exploring? I tried to
find them on the Kudu JIRA but couldn't.

One last question, about this part of your answer:

> *Regarding sharing memory, the first step which is already implemented is
> to share a common in-memory layout.*
> *...*
> *Currently, as mentioned above, the data (in the common format) is
> transferred from Kudu to Impala via a localhost TCP socket.*

So the two processes send/receive the same in-memory format over a TCP
socket, not via memcpy(). Am I right?

Also, I had thought Impala was simply a client program of Kudu built on the
Kudu client library. How can "sharing an in-memory layout" be achieved
through the Kudu client library? Could you explain the concept? We don't
actually use Impala for our service; we wrote a Java program to serve it
instead, and I'd like to know whether the same idea can be applied to our
client program.

Thanks.

Regards,
Jason

2017-06-30 3:53 GMT+09:00 Todd Lipcon <t...@cloudera.com>:

> Hey Jason,
>
> Answers inline below.
>
> On Thu, Jun 29, 2017 at 2:52 AM, Jason Heo <jason.heo....@gmail.com>
> wrote:
>
>> Hi,
>>
>> Q1.
>>
>> After reading Druid vs Kudu
>> <http://druid.io/docs/latest/comparisons/druid-vs-kudu.html>, I wondered
>> whether Druid has aggregation push down.
>>
>>> *Druid includes its own query layer that allows it to push down
>>> aggregations and computations directly to data nodes for faster query
>>> processing.*
>>
>> If I understand "aggregation push down" correctly, partial aggregation is
>> done on the data node side, so that only a small result set needs to be
>> transferred to the client, which could lead to a great performance gain.
>> (Am I right?)
>
> That's right.
>
>> So I wanted to know whether Apache Kudu has a plan for an aggregation
>> push down scan feature (or already has it).
>
> It currently doesn't have it, and there aren't any current plans to do so.
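(For anyone reading along: a minimal toy sketch of what "aggregation push down" means, assuming a hypothetical two-node setup rather than anything Kudu-specific. Each data node pre-aggregates its local rows, so only small partial results cross the network, and the client merges them.)

```python
# Hypothetical sketch of aggregation push-down (not Kudu's API).
# Each "data node" pre-aggregates its local shard; only the small
# partial results are transferred, and the client merges them.

def node_partial_sum(rows):
    """Runs on a data node: aggregate locally before transfer."""
    partial = {}
    for key, value in rows:
        partial[key] = partial.get(key, 0) + value
    return partial  # small dict instead of all matching rows

def client_merge(partials):
    """Runs on the client: merge per-node partial aggregates."""
    merged = {}
    for partial in partials:
        for key, value in partial.items():
            merged[key] = merged.get(key, 0) + value
    return merged

# Two nodes, each holding a shard of (group_key, value) rows.
node_a = [("us", 3), ("eu", 1), ("us", 2)]
node_b = [("eu", 4), ("us", 5)]

result = client_merge([node_partial_sum(node_a), node_partial_sum(node_b)])
print(result)  # {'us': 10, 'eu': 5}
```

Without push-down, all five rows would travel to the client; with it, only two small per-node dictionaries do.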
> Usually, we assume that Kudu tablet servers are collocated with either
> Impala daemons or Spark executors. So, it's less important to provide
> pushdown into the tablet server itself, since even without it, we are
> typically avoiding any network transfer from the TS into the execution
> environment which does the aggregation.
>
> The above would be less true if there were some way in which Kudu itself
> could perform the aggregation more efficiently based on knowledge of its
> underlying storage. For example, one could imagine doing a GROUP BY
> 'foo_col' more efficiently within Kudu if the column is dictionary-encoded,
> by aggregating on the code words rather than the resulting decoded strings,
> since the integer code words are fixed length and faster to hash, compare,
> etc.
>
> That said, it hasn't been a high priority relative to other performance
> areas we're exploring.

>> Q2.
>>
>> One thing that concerns me when using Impala+Kudu is that all matching
>> rows must be transferred from the Kudu tserver to the Impala process.
>> Usually Impala and the Kudu tserver run on the same node, so it would be
>> great if Impala could read Kudu tablets directly. Any plan for this kind
>> of feature?
>>
>> How-to: Use Impala and Kudu Together for Analytic Workloads
>> <https://blog.cloudera.com/blog/2016/04/how-to-use-impala-and-kudu-together-for-analytic-workloads/>
>> says:
>>
>>> *we intend to implement the Apache Arrow in-memory data format and to
>>> share memory between Kudu and Impala, which we expect will help with
>>> performance and resource usage.*
>>
>> What does "share memory between Kudu and Impala" mean? Is this already
>> implemented?
>
> Yes, currently all matching rows are transferred from the Kudu TS to the
> Impala daemon. Impala schedules scanners for locality, though, so this is a
> localhost-scoped connection which is quite fast.
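(To illustrate the dictionary-encoding point above: a hedged toy sketch, not Kudu code. The group-by hashes fixed-width integer code words per row, and the variable-length strings are decoded only once per distinct group at the end.)

```python
# Hypothetical sketch of grouping on dictionary code words (not Kudu's
# implementation). Hashing a small fixed-width int per row is cheaper
# than hashing a variable-length string per row; strings are decoded
# only once per distinct group.

def group_by_codes(codes, values, dictionary):
    counts = {}  # code word -> aggregated value
    for code, value in zip(codes, values):
        counts[code] = counts.get(code, 0) + value
    # Decode once per distinct group, not once per row.
    return {dictionary[code]: total for code, total in counts.items()}

dictionary = ["apple", "banana"]  # code word -> decoded string
codes      = [0, 1, 0, 0, 1]      # dictionary-encoded column
values     = [1, 2, 3, 4, 5]

print(group_by_codes(codes, values, dictionary))  # {'apple': 8, 'banana': 7}
```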
> To give you a sense of the speed, I just tested a localhost TCP connection
> using 'iperf' and measured ~6GB/sec on a single core. Although this is
> significantly slower than a within-process memcpy, it's still fast enough
> that it usually represents a small fraction of the overall CPU consumption
> of a query.
>
> Regarding sharing memory, the first step, which is already implemented, is
> to share a common in-memory layout. That is to say, the format in which
> Kudu returns rows over the wire to the client matches the format in which
> Impala expects its rows in memory. So, when Impala receives a block of rows
> in the scanner, it doesn't have to "parse" or "convert" them to some other
> format. It can simply interpret the data in place. This saves a lot of CPU.
>
> Using something like Arrow would be even more efficient than the current
> format since it is columnar rather than row-oriented. However, Impala
> currently does not use a columnar format for its operator pipeline, so we
> can't currently make use of Arrow to optimize the interchange.
>
> Currently, as mentioned above, the data (in the common format) is
> transferred from Kudu to Impala via a localhost TCP socket. A few years ago
> we had an intern who experimented with using a Unix domain socket and found
> some small speedup. He also experimented with setting up a shared memory
> region and found another small speedup over the domain socket. However,
> there was a lot of complexity involved in this code (particularly the
> shared memory approach) relative to the gain that we saw, so we didn't end
> up merging it before his internship ended :)
>
> -Todd
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
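(A small, hypothetical sketch of the "common in-memory layout" idea described above, using a made-up fixed-width row format rather than Kudu's actual wire format. If the sender serializes rows in exactly the layout the receiver uses internally, the receiver can read fields straight out of the received buffer instead of parsing and converting each row.)

```python
# Hypothetical sketch, NOT Kudu's wire format: a shared fixed-width
# row layout lets the receiver interpret the buffer in place.
import struct

ROW = struct.Struct("<qd")  # one row: int64 key, float64 value

def encode_rows(rows):
    """'Server' side: pack rows into the agreed fixed-width layout."""
    return b"".join(ROW.pack(k, v) for k, v in rows)

def read_row(buf, i):
    """'Client' side: read row i directly out of the buffer -- no
    per-row parsing or format conversion, just an offset computation."""
    return ROW.unpack_from(buf, i * ROW.size)

wire = encode_rows([(1, 0.5), (2, 1.25)])  # bytes as sent over the socket
view = memoryview(wire)                    # zero-copy view of the buffer
print(read_row(view, 1))                   # (2, 1.25)
```

The same principle should apply to any client, including a Java program using the Kudu client library: the saving comes from the client library handing back data it can use without a conversion step, not from anything Impala-specific.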