Re: Is anyone else hitting a problem with Sentry when starting the Impala mini cluster?
After investigating, the problem is indeed that Sentry updated the name of the property. I filed a JIRA: https://issues.apache.org/jira/browse/IMPALA-5686

I uploaded a code review that makes a change to our Sentry mini cluster XML template file and started a GVO: https://gerrit.cloudera.org/#/c/7469/

On Wed, Jul 19, 2017 at 4:36 PM, Matthew Jacobs wrote:
> +Bikram, who just got this in a gerrit-verify-dryrun job
>
> On Wed, Jul 19, 2017 at 4:32 PM, Taras Bobrovytsky wrote:
> > When I run ./testdata/bin/run-all.sh I get the following:
> > Error in Impala/testdata/bin/run-all.sh at line 58:
> > $IMPALA_HOME/testdata/bin/run-sentry-service.sh
> >
> > run-sentry-service.log shows the following:
> > 17/07/19 16:22:23 ERROR testutil.SentryServicePinger: Error issuing RPC to Sentry Service (attempt 4/30):
> > org.apache.impala.common.InternalException: Error creating Sentry Service client:
> > at org.apache.impala.util.SentryPolicyService$SentryServiceClient.createClient(SentryPolicyService.java:96)
> > at org.apache.impala.util.SentryPolicyService$SentryServiceClient.<init>(SentryPolicyService.java:67)
> > at org.apache.impala.util.SentryPolicyService.listAllRoles(SentryPolicyService.java:391)
> > at org.apache.impala.testutil.SentryServicePinger.main(SentryServicePinger.java:75)
> >
> > Caused by: sentry.org.apache.sentry.core.common.exception.MissingConfigurationException: Property 'sentry.service.client.server.rpc-addresses' is missing in configuration
> > at sentry.org.apache.sentry.core.common.transport.SentryPolicyClientTransportConfig.getSentryServerRpcAddress(SentryPolicyClientTransportConfig.java:73)
> > at sentry.org.apache.sentry.core.common.transport.SentryTransportPool.<init>(SentryTransportPool.java:103)
> > at org.apache.sentry.service.thrift.SentryServiceClientFactory.<init>(SentryServiceClientFactory.java:83)
> > at org.apache.sentry.service.thrift.SentryServiceClientFactory.create(SentryServiceClientFactory.java:65)
> > at org.apache.impala.util.SentryPolicyService$SentryServiceClient.createClient(SentryPolicyService.java:94)
> > ... 3 more
> >
> > In our fe/src/test/resources/sentry-site.xml.template we have sentry.service.client.server.rpc-address instead of sentry.service.client.server.rpc-address*es* as it says it in the exception. Could this be the problem?
Re: Problem running Impala built with dynamic linking
Didn't work, here's the output of rm CMakeCache.txt && cmake .

-- Setup toolchain link flags -Wl,-rpath,/home/bikram/dev/Impala/toolchain/gcc-4.9.2/lib64 -L/home/bikram/dev/Impala/toolchain/gcc-4.9.2/lib64
-- Build type is DEBUG
-- ENABLE_CODE_COVERAGE: false
-- Boost version: 1.57.0
-- Found the following Boost libraries:
--   thread
--   regex
--   filesystem
--   system
--   date_time
-- Boost include dir: /home/bikram/dev/Impala/toolchain/boost-1.57.0-p3/include
-- Boost libraries: /home/bikram/dev/Impala/toolchain/boost-1.57.0-p3/lib/libboost_thread.a/home/bikram/dev/Impala/toolchain/boost-1.57.0-p3/lib/libboost_regex.a/home/bikram/dev/Impala/toolchain/boost-1.57.0-p3/lib/libboost_filesystem.a/home/bikram/dev/Impala/toolchain/boost-1.57.0-p3/lib/libboost_system.a/home/bikram/dev/Impala/toolchain/boost-1.57.0-p3/lib/libboost_date_time.a
-- Found OpenSSL: /usr/lib/x86_64-linux-gnu/libssl.so;/usr/lib/x86_64-linux-gnu/libcrypto.so (found version "1.0.2g")
-- --> Adding thirdparty library openssl_ssl. <--
-- Header files: /usr/include
-- Added shared library dependency openssl_ssl: /usr/lib/x86_64-linux-gnu/libssl.so
-- --> Adding thirdparty library openssl_crypto. <--
-- Added shared library dependency openssl_crypto: /usr/lib/x86_64-linux-gnu/libcrypto.so
-- Bzip2: /home/bikram/dev/Impala/toolchain/bzip2-1.0.6-p2/include
-- --> Adding thirdparty library bzip2. <--
-- Header files: /home/bikram/dev/Impala/toolchain/bzip2-1.0.6-p2/include
-- Added static library dependency bzip2: /home/bikram/dev/Impala/toolchain/bzip2-1.0.6-p2/lib/libbz2.a
-- Zlib: /home/bikram/dev/Impala/toolchain/zlib-1.2.8/include
-- --> Adding thirdparty library zlib. <--
-- Header files: /home/bikram/dev/Impala/toolchain/zlib-1.2.8/include
-- Added static library dependency zlib: /home/bikram/dev/Impala/toolchain/zlib-1.2.8/lib/libz.a
-- --> Adding thirdparty library hdfs. <--
-- Header files: /home/bikram/dev/Impala/toolchain/cdh_components/hadoop-2.6.0-cdh5.13.0-SNAPSHOT/include
-- Added static library dependency hdfs: /home/bikram/dev/Impala/toolchain/cdh_components/hadoop-2.6.0-cdh5.13.0-SNAPSHOT/lib/native/libhdfs.a
-- --> Adding thirdparty library glog. <--
-- Header files: /home/bikram/dev/Impala/toolchain/glog-0.3.4-p2/include
-- Added static library dependency glog: /home/bikram/dev/Impala/toolchain/glog-0.3.4-p2/lib/libglog.a
-- --> Adding thirdparty library gflags. <--
-- Header files: /home/bikram/dev/Impala/toolchain/gflags-2.2.0-p1/include
-- Added static library dependency gflags: /home/bikram/dev/Impala/toolchain/gflags-2.2.0-p1/lib/libgflags.a
-- --> Adding thirdparty library pprof. <--
-- Header files: /home/bikram/dev/Impala/toolchain/gperftools-2.5/include
-- Added static library dependency pprof: /home/bikram/dev/Impala/toolchain/gperftools-2.5/lib/libprofiler.a
-- --> Adding thirdparty library gtest. <--
-- Header files: /home/bikram/dev/Impala/toolchain/gtest-1.6.0/include
-- Added static library dependency gtest: /home/bikram/dev/Impala/toolchain/gtest-1.6.0/lib/libgtest.a
-- LLVM llvm-config found at: /home/bikram/dev/Impala/toolchain/llvm-3.8.0-p1/bin/llvm-config
-- LLVM clang++ found at: /home/bikram/dev/Impala/toolchain/llvm-3.8.0-p1/bin/clang++
-- LLVM opt found at: /home/bikram/dev/Impala/toolchain/llvm-3.8.0-p1/bin/opt
-- LLVM_ROOT: /home/bikram/dev/Impala/toolchain/llvm-3.8.0-asserts-p1
-- LLVM llvm-config found at: /home/bikram/dev/Impala/toolchain/llvm-3.8.0-asserts-p1/bin/llvm-config
-- LLVM include dir: /home/bikram/dev/Impala/toolchain/llvm-3.8.0-asserts-p1/include
-- LLVM lib dir: /home/bikram/dev/Impala/toolchain/llvm-3.8.0-asserts-p1/lib
-- --> Adding thirdparty library cyrus_sasl. <--
-- Header files: /usr/include
-- Added shared library dependency cyrus_sasl: /usr/lib/x86_64-linux-gnu/libsasl2.so
-- --> Adding thirdparty library ldap. <--
-- Header files: /home/bikram/dev/Impala/toolchain/openldap-2.4.25/include
-- Added static library dependency ldap: /home/bikram/dev/Impala/toolchain/openldap-2.4.25/lib/libldap.a
-- --> Adding thirdparty library lber. <--
-- Added static library dependency lber: /home/bikram/dev/Impala/toolchain/openldap-2.4.25/lib/liblber.a
-- --> Adding thirdparty library thrift. <--
-- Header files: /home/bikram/dev/Impala/toolchain/thrift-0.9.0-p9/include
-- Added static library dependency thrift: /home/bikram/dev/Impala/toolchain/thrift-0.9.0-p9/lib/libthrift.a
-- Thrift version: Thrift version 0.9.0
-- Thrift contrib dir: /home/bikram/dev/Impala/toolchain/thrift-0.9.0-p9
-- Thrift compiler: /home/bikram/dev/Impala/toolchain/thrift-0.9.0-p9/bin/thrift
-- Found FLATBUFFERS: /home/bikram/dev/Impala/toolchain/flatbuffers-1.6.0/include
-- --> Adding thirdparty library flatbuffers. <--
-- Header files:
Re: Problem running Impala built with dynamic linking
Thanks, lemme try it real quick and I'll get back to you.

On Wed, Jul 19, 2017 at 5:04 PM, Henry Robinson wrote:
> Can you try:
>
> cd $IMPALA_HOME
> rm CMakeCache.txt
> cmake .
>
> If that doesn't work, can you send me the output of rm CMakeCache.txt && cmake . from IMPALA_HOME?
>
> Thanks,
> Henry
>
> On 19 July 2017 at 17:03, Henry Robinson wrote:
> > Sorry, I read too quickly - you've done that already! Let me take a look.
> >
> > On 19 July 2017 at 17:01, Henry Robinson wrote:
> >> Yep, you need to remove the downloaded version of gflags and replace it with a recent toolchain version. See my mail from yesterday for instructions: https://lists.apache.org/api/source.lua/a154f43ef76e3da1271efed2740d447bbaeca09509156c7b728f2dd0@%3Cdev.impala.apache.org%3E
> >>
> >> On 19 July 2017 at 16:56, Bikramjeet Vig wrote:
> >>> After fetching latest from asf-gerrit (that has the toolchain commit related to gflags) and doing a manual toolchain refresh, I am unable to run impala when I build with "make_debug" or "buildall -so", both statestore and catalogd show the following error:
> >>>
> >>> ERROR: something wrong with flag 'flagfile' in file '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'.
> >>> One possibility: file '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc' is being linked both statically and dynamically into this executable.
> >>>
> >>> I am only able to make it work if I go with static linking by building it with "buildall" without the "-so"
> >>>
> >>> Anyone facing the same issue?
> >
> > --
> > Henry Robinson
> > Software Engineer
> > Cloudera
> > 415-994-6679
Re: Problem running Impala built with dynamic linking
Can you try:

cd $IMPALA_HOME
rm CMakeCache.txt
cmake .

If that doesn't work, can you send me the output of rm CMakeCache.txt && cmake . from IMPALA_HOME?

Thanks,
Henry

On 19 July 2017 at 17:03, Henry Robinson wrote:
> Sorry, I read too quickly - you've done that already! Let me take a look.
>
> On 19 July 2017 at 17:01, Henry Robinson wrote:
>> Yep, you need to remove the downloaded version of gflags and replace it with a recent toolchain version. See my mail from yesterday for instructions: https://lists.apache.org/api/source.lua/a154f43ef76e3da1271efed2740d447bbaeca09509156c7b728f2dd0@%3Cdev.impala.apache.org%3E
>>
>> On 19 July 2017 at 16:56, Bikramjeet Vig wrote:
>>> After fetching latest from asf-gerrit (that has the toolchain commit related to gflags) and doing a manual toolchain refresh, I am unable to run impala when I build with "make_debug" or "buildall -so", both statestore and catalogd show the following error:
>>>
>>> ERROR: something wrong with flag 'flagfile' in file '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'.
>>> One possibility: file '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc' is being linked both statically and dynamically into this executable.
>>>
>>> I am only able to make it work if I go with static linking by building it with "buildall" without the "-so"
>>>
>>> Anyone facing the same issue?
>
> --
> Henry Robinson
> Software Engineer
> Cloudera
> 415-994-6679
Re: Problem running Impala built with dynamic linking
Sorry, I read too quickly - you've done that already! Let me take a look.

On 19 July 2017 at 17:01, Henry Robinson wrote:
> Yep, you need to remove the downloaded version of gflags and replace it with a recent toolchain version. See my mail from yesterday for instructions: https://lists.apache.org/api/source.lua/a154f43ef76e3da1271efed2740d447bbaeca09509156c7b728f2dd0@%3Cdev.impala.apache.org%3E
>
> On 19 July 2017 at 16:56, Bikramjeet Vig wrote:
>> After fetching latest from asf-gerrit (that has the toolchain commit related to gflags) and doing a manual toolchain refresh, I am unable to run impala when I build with "make_debug" or "buildall -so", both statestore and catalogd show the following error:
>>
>> ERROR: something wrong with flag 'flagfile' in file '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'.
>> One possibility: file '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc' is being linked both statically and dynamically into this executable.
>>
>> I am only able to make it work if I go with static linking by building it with "buildall" without the "-so"
>>
>> Anyone facing the same issue?

--
Henry Robinson
Software Engineer
Cloudera
415-994-6679
Re: Problem running Impala built with dynamic linking
Yep, you need to remove the downloaded version of gflags and replace it with a recent toolchain version. See my mail from yesterday for instructions: https://lists.apache.org/api/source.lua/a154f43ef76e3da1271efed2740d447bbaeca09509156c7b728f2dd0@%3Cdev.impala.apache.org%3E

On 19 July 2017 at 16:56, Bikramjeet Vig wrote:
> After fetching latest from asf-gerrit (that has the toolchain commit related to gflags) and doing a manual toolchain refresh, I am unable to run impala when I build with "make_debug" or "buildall -so", both statestore and catalogd show the following error:
>
> ERROR: something wrong with flag 'flagfile' in file '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'.
> One possibility: file '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc' is being linked both statically and dynamically into this executable.
>
> I am only able to make it work if I go with static linking by building it with "buildall" without the "-so"
>
> Anyone facing the same issue?
Problem running Impala built with dynamic linking
After fetching the latest from asf-gerrit (which has the toolchain commit related to gflags) and doing a manual toolchain refresh, I am unable to run Impala when I build with "make_debug" or "buildall -so". Both statestore and catalogd show the following error:

ERROR: something wrong with flag 'flagfile' in file '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'.
One possibility: file '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc' is being linked both statically and dynamically into this executable.

I am only able to make it work if I go with static linking, by building with "buildall" without the "-so".

Is anyone else facing the same issue?
Re: Is anyone else hitting a problem with Sentry when starting the Impala mini cluster?
+Bikram, who just got this in a gerrit-verify-dryrun job

On Wed, Jul 19, 2017 at 4:32 PM, Taras Bobrovytsky wrote:
> When I run ./testdata/bin/run-all.sh I get the following:
> Error in Impala/testdata/bin/run-all.sh at line 58:
> $IMPALA_HOME/testdata/bin/run-sentry-service.sh
>
> run-sentry-service.log shows the following:
> 17/07/19 16:22:23 ERROR testutil.SentryServicePinger: Error issuing RPC to Sentry Service (attempt 4/30):
> org.apache.impala.common.InternalException: Error creating Sentry Service client:
> at org.apache.impala.util.SentryPolicyService$SentryServiceClient.createClient(SentryPolicyService.java:96)
> at org.apache.impala.util.SentryPolicyService$SentryServiceClient.<init>(SentryPolicyService.java:67)
> at org.apache.impala.util.SentryPolicyService.listAllRoles(SentryPolicyService.java:391)
> at org.apache.impala.testutil.SentryServicePinger.main(SentryServicePinger.java:75)
>
> Caused by: sentry.org.apache.sentry.core.common.exception.MissingConfigurationException: Property 'sentry.service.client.server.rpc-addresses' is missing in configuration
> at sentry.org.apache.sentry.core.common.transport.SentryPolicyClientTransportConfig.getSentryServerRpcAddress(SentryPolicyClientTransportConfig.java:73)
> at sentry.org.apache.sentry.core.common.transport.SentryTransportPool.<init>(SentryTransportPool.java:103)
> at org.apache.sentry.service.thrift.SentryServiceClientFactory.<init>(SentryServiceClientFactory.java:83)
> at org.apache.sentry.service.thrift.SentryServiceClientFactory.create(SentryServiceClientFactory.java:65)
> at org.apache.impala.util.SentryPolicyService$SentryServiceClient.createClient(SentryPolicyService.java:94)
> ... 3 more
>
> In our fe/src/test/resources/sentry-site.xml.template we have sentry.service.client.server.rpc-address instead of sentry.service.client.server.rpc-address*es* as it says it in the exception. Could this be the problem?
Is anyone else hitting a problem with Sentry when starting the Impala mini cluster?
When I run ./testdata/bin/run-all.sh I get the following:

Error in Impala/testdata/bin/run-all.sh at line 58:
$IMPALA_HOME/testdata/bin/run-sentry-service.sh

run-sentry-service.log shows the following:

17/07/19 16:22:23 ERROR testutil.SentryServicePinger: Error issuing RPC to Sentry Service (attempt 4/30):
org.apache.impala.common.InternalException: Error creating Sentry Service client:
at org.apache.impala.util.SentryPolicyService$SentryServiceClient.createClient(SentryPolicyService.java:96)
at org.apache.impala.util.SentryPolicyService$SentryServiceClient.<init>(SentryPolicyService.java:67)
at org.apache.impala.util.SentryPolicyService.listAllRoles(SentryPolicyService.java:391)
at org.apache.impala.testutil.SentryServicePinger.main(SentryServicePinger.java:75)

Caused by: sentry.org.apache.sentry.core.common.exception.MissingConfigurationException: Property 'sentry.service.client.server.rpc-addresses' is missing in configuration
at sentry.org.apache.sentry.core.common.transport.SentryPolicyClientTransportConfig.getSentryServerRpcAddress(SentryPolicyClientTransportConfig.java:73)
at sentry.org.apache.sentry.core.common.transport.SentryTransportPool.<init>(SentryTransportPool.java:103)
at org.apache.sentry.service.thrift.SentryServiceClientFactory.<init>(SentryServiceClientFactory.java:83)
at org.apache.sentry.service.thrift.SentryServiceClientFactory.create(SentryServiceClientFactory.java:65)
at org.apache.impala.util.SentryPolicyService$SentryServiceClient.createClient(SentryPolicyService.java:94)
... 3 more

In our fe/src/test/resources/sentry-site.xml.template we have sentry.service.client.server.rpc-address instead of sentry.service.client.server.rpc-address*es*, which is the name the exception asks for. Could this be the problem?
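For anyone who wants to check their own checkout for this kind of mismatch, here is a rough Python sketch. It assumes the template is well-formed Hadoop-style XML (a top-level element containing property elements with name and value children). The function name find_property and the localhost:30911 value in the sample are hypothetical, used only for illustration; only the two property names come from the thread.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Property name taken from the MissingConfigurationException message.
EXPECTED = "sentry.service.client.server.rpc-addresses"


def find_property(config_xml: str, expected: str) -> Tuple[bool, List[str]]:
    """Return (is_present, near_misses) for `expected` in a Hadoop-style config.

    A near miss is a different property name that is a prefix of `expected`
    or that `expected` is a prefix of, e.g. a singular/plural rename.
    """
    root = ET.fromstring(config_xml)
    names = {(node.text or "").strip() for node in root.findall("./property/name")}
    near_misses = sorted(
        n for n in names
        if n and n != expected and (expected.startswith(n) or n.startswith(expected))
    )
    return expected in names, near_misses


# Hypothetical sample content; the value is a placeholder, not the real template.
OLD_TEMPLATE = """<configuration>
  <property>
    <name>sentry.service.client.server.rpc-address</name>
    <value>localhost:30911</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    present, near = find_property(OLD_TEMPLATE, EXPECTED)
    print("present:", present)
    print("near misses:", near)
```

To run it against a real file, read the file contents and pass them in place of OLD_TEMPLATE. A result of not present with one near miss would point at a renamed property rather than a missing one.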
Re: What is dictionary filter in Impala?
Hi,

The Parquet format supports various encodings that help compress columns of data with different characteristics. Dictionary encoding is useful if there are many repeats of the same value in the same column. E.g. if you have a string column with country names - you might have "Australia", "USA", "China" repeated many times.

If there are <= 40,000 distinct values a column can be encoded with a dictionary: at the start of the column there is a dictionary with all of the distinct values, then the data is represented as integers. E.g. if the dictionary was ["Australia", "USA", "China"], then "China" would be encoded as 2.

Dictionary filtering takes advantage of this to speed up scans. E.g. if I have a query like "select * from my_table where country = 'Iceland'", then we can check the dictionary for a Parquet row group before scanning the row group. If no entries in the dictionary match the condition, then we can skip the whole row group.

On Wed, Jul 19, 2017 at 3:22 AM, Wang Chunling wrote:
> Hi,
>
> I find there is dictionary filter in Impala when doing Parquet scan. The comment says the column is 100% dictionary encoded can be dictionary filtered. Can you explain what kind of columns can be dictionary encoded? And is there any example of dictionary filter? Thanks a lot.
>
> Chunling
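To make the idea above concrete, here is a small Python sketch of dictionary encoding and dictionary filtering. It is a toy model under simplifying assumptions, not Impala's or Parquet's actual implementation: each row group is just a (dictionary, codes) pair, every column is fully dictionary encoded, and the function names are made up for illustration.

```python
from typing import Callable, List, Sequence, Tuple

RowGroup = Tuple[List[str], List[int]]  # (dictionary, integer codes)


def dictionary_encode(values: Sequence[str]) -> RowGroup:
    """Encode values as indexes into a dictionary of distinct values."""
    dictionary: List[str] = []
    index = {}
    codes: List[int] = []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)  # next free code
            dictionary.append(v)
        codes.append(index[v])
    return dictionary, codes


def can_skip_row_group(dictionary: Sequence[str],
                       predicate: Callable[[str], bool]) -> bool:
    """A row group can be skipped if no dictionary entry satisfies the predicate."""
    return not any(predicate(v) for v in dictionary)


def scan(row_groups: Sequence[RowGroup],
         predicate: Callable[[str], bool]) -> Tuple[List[str], int]:
    """Return matching values and the number of row groups skipped."""
    matches: List[str] = []
    skipped = 0
    for dictionary, codes in row_groups:
        if can_skip_row_group(dictionary, predicate):
            skipped += 1  # only the dictionary was read, not the data
            continue
        matches.extend(dictionary[c] for c in codes if predicate(dictionary[c]))
    return matches, skipped


if __name__ == "__main__":
    rg1 = dictionary_encode(["Australia", "USA", "China", "USA", "China"])
    rg2 = dictionary_encode(["Iceland", "USA", "Iceland"])
    print(rg1)  # (['Australia', 'USA', 'China'], [0, 1, 2, 1, 2])
    print(scan([rg1, rg2], lambda country: country == "Iceland"))
    # (['Iceland', 'Iceland'], 1)
```

The first row group is skipped after looking only at its three dictionary entries, which is where the saving comes from when a row group holds many rows but few distinct values.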
Re: jenkins.impala.io switching to SSL
Hi All,

I completed the Jenkins reconfiguration that I announced last night. Jenkins can now be reached at https://jenkins.impala.io and all previous URLs redirect there permanently. From now on it will post https:// links in code reviews. Links posted in old code reviews should still work.

I found two jobs that were aborted by the restart and I kicked off new builds for those. If one of your jobs got killed, please make sure to restart it, too.

Unless we discover any issues with the new configuration there should be no more interruptions. Thank you for your patience.

Cheers, Lars

On Tue, Jul 18, 2017 at 1:43 PM, Lars Volker wrote:
> Hi All,
>
> Jenkins has been running with SSL for the past few days and I haven't received any complaints. If no-one objects, tomorrow morning (Wednesday, PST) I will configure http://jenkins.impala.io:8080/ to redirect to https://jenkins.impala.io. From that point on, Jenkins will also post links to its https endpoint in code reviews.
>
> Let me know if you have any questions or concerns.
>
> Cheers, Lars
>
> On Fri, Jul 14, 2017 at 10:55 PM, Lars Volker wrote:
>> Hi All,
>>
>> our Jenkins instance now has a proper SSL certificate and can be reached at https://jenkins.impala.io. The old redirect from http://j.i.o now points to the SSL endpoint instead of port 8080.
>>
>> If you run into any issues with the SSL setup, please let me know. As a workaround you can still access Jenkins directly at http://jenkins.impala.io:8080/. If no-one reports any issues in the next few days, I will eventually make that URL redirect to SSL, too, so all connections will be secured.
>>
>> Cheers, Lars
What is dictionary filter in Impala?
Hi,

I see that there is a dictionary filter in Impala when doing a Parquet scan. The comment says that a column that is 100% dictionary encoded can be dictionary filtered. Can you explain what kinds of columns can be dictionary encoded? And is there an example of a dictionary filter? Thanks a lot.

Chunling