On Mon, May 21, 2018 at 4:37 PM, Quanlong Huang <huang_quanl...@126.com> wrote:
> Hi friends,
>
> We're trying to benchmark Impala+kudu to compare with other lambda
> architectures like Druid. So we hope we can install the latest release
> version of Impala (2.12.0) and kudu (1.7.0). However, when following the
> installation guide in https://kudu.apache.org/docs/installation.html, we
> can only install kudu-1.6.0-cdh5.14.2. Is it possible to install kudu-1.7
> without manual compilation?
>

That's right -- the installation guide there is just provided as a
convenience link to a vendor who provides some binary artifacts. The
Apache Kudu project itself only releases source artifacts at this point
in time. You'll need to compile manually if you want a binary artifact
for your particular operating system.

It looks like someone on github has made RPMs available here:
https://github.com/MartinWeindel/kudu-rpm . Perhaps this would work for
your system?

However, note that, per my email on the Impala list, impalad needs to
have a libkudu_client from the 'native-toolchain' project so that it is
built with the same toolchain as Impala. So, you'll want to use the kudu
client bundled with your Impala build and point it at the Kudu server
from your own build or the above RPM.

>
> Besides, I notice that Impala-2.5 is not compatible with kudu-1.6.0 since
> the CREATE TABLE syntax for kudu is not recognized. Here's the error:
>
> Query: create TABLE my_first_table
> (
>   id BIGINT,
>   name STRING,
>   PRIMARY KEY(id)
> )
> PARTITION BY HASH PARTITIONS 16
> STORED AS KUDU
> TBLPROPERTIES (
>   'kudu.master_addresses' = 'lascorehadoop-15d26'
> )
> ERROR: AnalysisException: Syntax error in line 5:
>   PRIMARY KEY(id)
>   ^
> Encountered: IDENTIFIER
> Expected: ARRAY, BIGINT, BINARY, BOOLEAN, CHAR, DATE, DATETIME, DECIMAL,
> REAL, FLOAT, INTEGER, MAP, SMALLINT, STRING, STRUCT, TIMESTAMP, TINYINT,
> VARCHAR
>
> CAUSED BY: Exception: Syntax error
>
>

Right, I don't recall whether Impala 2.5 supported Kudu at all.
If it did, it was a very early version, and the syntax has since
changed. For the purpose of benchmarks I would definitely recommend
using the latest versions available.

> My further questions are
>
>    - Is there a compatibility matrix for Impala and Kudu?
>

We don't maintain any such matrix as part of the Apache projects. Doing
so would require a lot of testing of multiple versions and it's enough
of a time commitment that I don't think anyone has put in the work
outside of commercial "downstream" vendors.

>
>    - Is Impala-2.12 compatible with Kudu-1.6.0 and Kudu-1.7.0?
>

Kudu itself has maintained wire compatibility, so you should be able to
point an Impala 2.12 cluster at either Kudu 1.6 or Kudu 1.7 clusters
with success. As above, you'll need to make sure you're using the
libkudu_client.so that's built with Impala's toolchain to avoid
ABI-related crashes, but that's not a compatibility issue so much as a
quirk of how Impala's build works.

-Todd