Re: Is anyone else hitting a problem with Sentry when starting the Impala mini cluster?

2017-07-19 Thread Taras Bobrovytsky
After investigating, the problem is indeed that Sentry updated the name of
the property. I filed a JIRA:
https://issues.apache.org/jira/browse/IMPALA-5686

I uploaded a code review that makes a change to our Sentry mini cluster xml
template file and started a GVO: https://gerrit.cloudera.org/#/c/7469/


On Wed, Jul 19, 2017 at 4:36 PM, Matthew Jacobs  wrote:

> +Bikram, who just got this in a gerrit-verify-dryrun job
>
> On Wed, Jul 19, 2017 at 4:32 PM, Taras Bobrovytsky
>  wrote:
> > When I run ./testdata/bin/run-all.sh I get the following:
> > Error in Impala/testdata/bin/run-all.sh at line 58:
> > $IMPALA_HOME/testdata/bin/run-sentry-service.sh
> >
> > run-sentry-service.log shows the following:
> > 17/07/19 16:22:23 ERROR testutil.SentryServicePinger: Error issuing RPC to Sentry Service  (attempt 4/30):
> > org.apache.impala.common.InternalException: Error creating Sentry Service client:
> >   at org.apache.impala.util.SentryPolicyService$SentryServiceClient.createClient(SentryPolicyService.java:96)
> >   at org.apache.impala.util.SentryPolicyService$SentryServiceClient.<init>(SentryPolicyService.java:67)
> >   at org.apache.impala.util.SentryPolicyService.listAllRoles(SentryPolicyService.java:391)
> >   at org.apache.impala.testutil.SentryServicePinger.main(SentryServicePinger.java:75)
> >
> > Caused by: sentry.org.apache.sentry.core.common.exception.MissingConfigurationException:
> >   Property 'sentry.service.client.server.rpc-addresses' is missing in configuration
> >   at sentry.org.apache.sentry.core.common.transport.SentryPolicyClientTransportConfig.getSentryServerRpcAddress(SentryPolicyClientTransportConfig.java:73)
> >   at sentry.org.apache.sentry.core.common.transport.SentryTransportPool.<init>(SentryTransportPool.java:103)
> >   at org.apache.sentry.service.thrift.SentryServiceClientFactory.<init>(SentryServiceClientFactory.java:83)
> >   at org.apache.sentry.service.thrift.SentryServiceClientFactory.create(SentryServiceClientFactory.java:65)
> >   at org.apache.impala.util.SentryPolicyService$SentryServiceClient.createClient(SentryPolicyService.java:94)
> >   ... 3 more
> >
> > In our fe/src/test/resources/sentry-site.xml.template we have
> > sentry.service.client.server.rpc-address instead of
> > sentry.service.client.server.rpc-address*es* as it says it in the
> > exception. Could this be the problem?
>


Re: Problem running Impala built with dynamic linking

2017-07-19 Thread Bikramjeet Vig
Didn't work. Here's the output of rm CMakeCache.txt && cmake .


-- Setup toolchain link flags
-Wl,-rpath,/home/bikram/dev/Impala/toolchain/gcc-4.9.2/lib64
-L/home/bikram/dev/Impala/toolchain/gcc-4.9.2/lib64
-- Build type is DEBUG
-- ENABLE_CODE_COVERAGE: false
-- Boost version: 1.57.0
-- Found the following Boost libraries:
--   thread
--   regex
--   filesystem
--   system
--   date_time
-- Boost include dir:
/home/bikram/dev/Impala/toolchain/boost-1.57.0-p3/include
-- Boost libraries:
/home/bikram/dev/Impala/toolchain/boost-1.57.0-p3/lib/libboost_thread.a/home/bikram/dev/Impala/toolchain/boost-1.57.0-p3/lib/libboost_regex.a/home/bikram/dev/Impala/toolchain/boost-1.57.0-p3/lib/libboost_filesystem.a/home/bikram/dev/Impala/toolchain/boost-1.57.0-p3/lib/libboost_system.a/home/bikram/dev/Impala/toolchain/boost-1.57.0-p3/lib/libboost_date_time.a
-- Found OpenSSL:
/usr/lib/x86_64-linux-gnu/libssl.so;/usr/lib/x86_64-linux-gnu/libcrypto.so
(found version "1.0.2g")
-- --> Adding thirdparty library openssl_ssl. <--
-- Header files: /usr/include
-- Added shared library dependency openssl_ssl:
/usr/lib/x86_64-linux-gnu/libssl.so
-- --> Adding thirdparty library openssl_crypto. <--
-- Added shared library dependency openssl_crypto:
/usr/lib/x86_64-linux-gnu/libcrypto.so
-- Bzip2: /home/bikram/dev/Impala/toolchain/bzip2-1.0.6-p2/include
-- --> Adding thirdparty library bzip2. <--
-- Header files: /home/bikram/dev/Impala/toolchain/bzip2-1.0.6-p2/include
-- Added static library dependency bzip2:
/home/bikram/dev/Impala/toolchain/bzip2-1.0.6-p2/lib/libbz2.a
-- Zlib: /home/bikram/dev/Impala/toolchain/zlib-1.2.8/include
-- --> Adding thirdparty library zlib. <--
-- Header files: /home/bikram/dev/Impala/toolchain/zlib-1.2.8/include
-- Added static library dependency zlib:
/home/bikram/dev/Impala/toolchain/zlib-1.2.8/lib/libz.a
-- --> Adding thirdparty library hdfs. <--
-- Header files:
/home/bikram/dev/Impala/toolchain/cdh_components/hadoop-2.6.0-cdh5.13.0-SNAPSHOT/include
-- Added static library dependency hdfs:
/home/bikram/dev/Impala/toolchain/cdh_components/hadoop-2.6.0-cdh5.13.0-SNAPSHOT/lib/native/libhdfs.a
-- --> Adding thirdparty library glog. <--
-- Header files: /home/bikram/dev/Impala/toolchain/glog-0.3.4-p2/include
-- Added static library dependency glog:
/home/bikram/dev/Impala/toolchain/glog-0.3.4-p2/lib/libglog.a
-- --> Adding thirdparty library gflags. <--
-- Header files: /home/bikram/dev/Impala/toolchain/gflags-2.2.0-p1/include
-- Added static library dependency gflags:
/home/bikram/dev/Impala/toolchain/gflags-2.2.0-p1/lib/libgflags.a
-- --> Adding thirdparty library pprof. <--
-- Header files: /home/bikram/dev/Impala/toolchain/gperftools-2.5/include
-- Added static library dependency pprof:
/home/bikram/dev/Impala/toolchain/gperftools-2.5/lib/libprofiler.a
-- --> Adding thirdparty library gtest. <--
-- Header files: /home/bikram/dev/Impala/toolchain/gtest-1.6.0/include
-- Added static library dependency gtest:
/home/bikram/dev/Impala/toolchain/gtest-1.6.0/lib/libgtest.a
-- LLVM llvm-config found at:
/home/bikram/dev/Impala/toolchain/llvm-3.8.0-p1/bin/llvm-config
-- LLVM clang++ found at:
/home/bikram/dev/Impala/toolchain/llvm-3.8.0-p1/bin/clang++
-- LLVM opt found at:
/home/bikram/dev/Impala/toolchain/llvm-3.8.0-p1/bin/opt
-- LLVM_ROOT: /home/bikram/dev/Impala/toolchain/llvm-3.8.0-asserts-p1
-- LLVM llvm-config found at:
/home/bikram/dev/Impala/toolchain/llvm-3.8.0-asserts-p1/bin/llvm-config
-- LLVM include dir:
/home/bikram/dev/Impala/toolchain/llvm-3.8.0-asserts-p1/include
-- LLVM lib dir: /home/bikram/dev/Impala/toolchain/llvm-3.8.0-asserts-p1/lib
-- --> Adding thirdparty library cyrus_sasl. <--
-- Header files: /usr/include
-- Added shared library dependency cyrus_sasl:
/usr/lib/x86_64-linux-gnu/libsasl2.so
-- --> Adding thirdparty library ldap. <--
-- Header files: /home/bikram/dev/Impala/toolchain/openldap-2.4.25/include
-- Added static library dependency ldap:
/home/bikram/dev/Impala/toolchain/openldap-2.4.25/lib/libldap.a
-- --> Adding thirdparty library lber. <--
-- Added static library dependency lber:
/home/bikram/dev/Impala/toolchain/openldap-2.4.25/lib/liblber.a
-- --> Adding thirdparty library thrift. <--
-- Header files: /home/bikram/dev/Impala/toolchain/thrift-0.9.0-p9/include
-- Added static library dependency thrift:
/home/bikram/dev/Impala/toolchain/thrift-0.9.0-p9/lib/libthrift.a
-- Thrift version: Thrift version 0.9.0
-- Thrift contrib dir: /home/bikram/dev/Impala/toolchain/thrift-0.9.0-p9
-- Thrift compiler:
/home/bikram/dev/Impala/toolchain/thrift-0.9.0-p9/bin/thrift
-- Found FLATBUFFERS:
/home/bikram/dev/Impala/toolchain/flatbuffers-1.6.0/include
-- --> Adding thirdparty library flatbuffers. <--
-- Header files: 

Re: Problem running Impala built with dynamic linking

2017-07-19 Thread Bikramjeet Vig
Thanks, lemme try it real quick and I'll get back to you

On Wed, Jul 19, 2017 at 5:04 PM, Henry Robinson  wrote:

> Can you try:
>
> cd $IMPALA_HOME
> rm CMakeCache.txt
> cmake .
> 
>
> If that doesn't work, can you send me the output of rm CMakeCache.txt &&
> cmake . from IMPALA_HOME?
>
> Thanks,
> Henry
>
> On 19 July 2017 at 17:03, Henry Robinson  wrote:
>
> > Sorry, I read too quickly - you've done that already! Let me take a look.
> >
> > On 19 July 2017 at 17:01, Henry Robinson  wrote:
> >
> >> Yep, you need to remove the downloaded version of gflags and replace it
> >> with a recent toolchain version. See my mail from yesterday for
> >> instructions:
> >> https://lists.apache.org/api/source.lua/a154f43ef76e3da1271efed2740d447bbaeca09509156c7b728f2dd0@%3Cdev.impala.apache.org%3E
> >>
> >> On 19 July 2017 at 16:56, Bikramjeet Vig 
> >> wrote:
> >>
> >>> After fetching latest from asf-gerrit (that has the toolchain commit
> >>> related to gflags) and doing a manual toolchain refresh, I am unable to
> >>> run impala when I build with "make_debug" or "buildall -so", both
> >>> statestore and catalogd show the following error:
> >>>
> >>> ERROR: something wrong with flag 'flagfile' in file
> >>> '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'.
> >>> One possibility: file
> >>> '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'
> >>> is being linked both statically and dynamically into this executable.
> >>>
> >>> I am only able to make it work if I go with static linking by building
> >>> it with "buildall" without the "-so"
> >>>
> >>> Anyone facing the same issue?
> >>>
> >>
> >>
> >
> >
> > --
> > Henry Robinson
> > Software Engineer
> > Cloudera
> > 415-994-6679
> >
>


Re: Problem running Impala built with dynamic linking

2017-07-19 Thread Henry Robinson
Can you try:

cd $IMPALA_HOME
rm CMakeCache.txt
cmake .


If that doesn't work, can you send me the output of rm CMakeCache.txt &&
cmake . from IMPALA_HOME?
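
Before deleting the cache, it can help to see which gflags paths CMake has stored, since a stale entry would keep pointing at the old library. The following is a minimal sketch, not part of the Impala build: it assumes only the standard CMakeCache.txt layout (NAME:TYPE=VALUE lines, with comments starting with // or #). The entry name EXAMPLE_GFLAGS_LIB and the path in SAMPLE are hypothetical, made up for illustration.

```python
def cached_entries(cache_text, needle):
    """Return {name: value} for cache lines that mention `needle` (case-insensitive)."""
    entries = {}
    for line in cache_text.splitlines():
        line = line.strip()
        # Skip blank lines and the two comment styles used in CMakeCache.txt.
        if not line or line.startswith("#") or line.startswith("//"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        # The key has the form NAME:TYPE; keep only NAME.
        name = key.split(":", 1)[0]
        if needle.lower() in line.lower():
            entries[name] = value
    return entries


# Hypothetical cache contents, for illustration only.
SAMPLE = """\
// Hypothetical cache entries, for illustration only
EXAMPLE_GFLAGS_LIB:FILEPATH=/path/to/toolchain/gflags-2.2.0-p1/lib/libgflags.a
CMAKE_BUILD_TYPE:STRING=DEBUG
"""

print(cached_entries(SAMPLE, "gflags"))
```

To use it on a real tree, read the text of $IMPALA_HOME/CMakeCache.txt and pass it in place of SAMPLE.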

Thanks,
Henry

On 19 July 2017 at 17:03, Henry Robinson  wrote:

> Sorry, I read too quickly - you've done that already! Let me take a look.
>
> On 19 July 2017 at 17:01, Henry Robinson  wrote:
>
>> Yep, you need to remove the downloaded version of gflags and replace it
>> with a recent toolchain version. See my mail from yesterday for
>> instructions:
>> https://lists.apache.org/api/source.lua/a154f43ef76e3da1271efed2740d447bbaeca09509156c7b728f2dd0@%3Cdev.impala.apache.org%3E
>>
>> On 19 July 2017 at 16:56, Bikramjeet Vig 
>> wrote:
>>
>>> After fetching latest from asf-gerrit (that has the toolchain commit
>>> related to gflags) and doing a manual toolchain refresh, I am unable to
>>> run
>>> impala when I build with "make_debug" or "buildall -so", both statestore
>>> and catalogd show the following error:
>>>
>>> ERROR: something wrong with flag 'flagfile' in file
>>> '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'.
>>> One possibility: file
>>> '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'
>>> is being linked both statically and dynamically into this executable.
>>>
>>> I am only able to make it work if I go with static linking by building it
>>> with "buildall" without the "-so"
>>>
>>> Anyone facing the same issue?
>>>
>>
>>
>
>
> --
> Henry Robinson
> Software Engineer
> Cloudera
> 415-994-6679
>


Re: Problem running Impala built with dynamic linking

2017-07-19 Thread Henry Robinson
Sorry, I read too quickly - you've done that already! Let me take a look.

On 19 July 2017 at 17:01, Henry Robinson  wrote:

> Yep, you need to remove the downloaded version of gflags and replace it
> with a recent toolchain version. See my mail from yesterday for
> instructions:
> https://lists.apache.org/api/source.lua/a154f43ef76e3da1271efed2740d447bbaeca09509156c7b728f2dd0@%3Cdev.impala.apache.org%3E
>
> On 19 July 2017 at 16:56, Bikramjeet Vig 
> wrote:
>
>> After fetching latest from asf-gerrit (that has the toolchain commit
>> related to gflags) and doing a manual toolchain refresh, I am unable to
>> run
>> impala when I build with "make_debug" or "buildall -so", both statestore
>> and catalogd show the following error:
>>
>> ERROR: something wrong with flag 'flagfile' in file
>> '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'.
>> One possibility: file
>> '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'
>> is being linked both statically and dynamically into this executable.
>>
>> I am only able to make it work if I go with static linking by building it
>> with "buildall" without the "-so"
>>
>> Anyone facing the same issue?
>>
>
>


-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Re: Problem running Impala built with dynamic linking

2017-07-19 Thread Henry Robinson
Yep, you need to remove the downloaded version of gflags and replace it
with a recent toolchain version. See my mail from yesterday for
instructions:
https://lists.apache.org/api/source.lua/a154f43ef76e3da1271efed2740d447bbaeca09509156c7b728f2dd0@%3Cdev.impala.apache.org%3E

On 19 July 2017 at 16:56, Bikramjeet Vig 
wrote:

> After fetching latest from asf-gerrit (that has the toolchain commit
> related to gflags) and doing a manual toolchain refresh, I am unable to run
> impala when I build with "make_debug" or "buildall -so", both statestore
> and catalogd show the following error:
>
> ERROR: something wrong with flag 'flagfile' in file
> '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'.
> One possibility: file
> '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'
> is being linked both statically and dynamically into this executable.
>
> I am only able to make it work if I go with static linking by building it
> with "buildall" without the "-so"
>
> Anyone facing the same issue?
>


Problem running Impala built with dynamic linking

2017-07-19 Thread Bikramjeet Vig
After fetching latest from asf-gerrit (that has the toolchain commit
related to gflags) and doing a manual toolchain refresh, I am unable to run
Impala when I build with "make_debug" or "buildall -so". Both statestore
and catalogd show the following error:

ERROR: something wrong with flag 'flagfile' in file
'/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'.
One possibility: file
'/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'
is being linked both statically and dynamically into this executable.

I am only able to make it work if I go with static linking by building it
with "buildall" without the "-so"

Anyone facing the same issue?


Re: Is anyone else hitting a problem with Sentry when starting the Impala mini cluster?

2017-07-19 Thread Matthew Jacobs
+Bikram, who just got this in a gerrit-verify-dryrun job

On Wed, Jul 19, 2017 at 4:32 PM, Taras Bobrovytsky
 wrote:
> When I run ./testdata/bin/run-all.sh I get the following:
> Error in Impala/testdata/bin/run-all.sh at line 58:
> $IMPALA_HOME/testdata/bin/run-sentry-service.sh
>
> run-sentry-service.log shows the following:
> 17/07/19 16:22:23 ERROR testutil.SentryServicePinger: Error issuing RPC to Sentry Service  (attempt 4/30):
> org.apache.impala.common.InternalException: Error creating Sentry Service client:
>   at org.apache.impala.util.SentryPolicyService$SentryServiceClient.createClient(SentryPolicyService.java:96)
>   at org.apache.impala.util.SentryPolicyService$SentryServiceClient.<init>(SentryPolicyService.java:67)
>   at org.apache.impala.util.SentryPolicyService.listAllRoles(SentryPolicyService.java:391)
>   at org.apache.impala.testutil.SentryServicePinger.main(SentryServicePinger.java:75)
>
> Caused by: sentry.org.apache.sentry.core.common.exception.MissingConfigurationException:
>   Property 'sentry.service.client.server.rpc-addresses' is missing in configuration
>   at sentry.org.apache.sentry.core.common.transport.SentryPolicyClientTransportConfig.getSentryServerRpcAddress(SentryPolicyClientTransportConfig.java:73)
>   at sentry.org.apache.sentry.core.common.transport.SentryTransportPool.<init>(SentryTransportPool.java:103)
>   at org.apache.sentry.service.thrift.SentryServiceClientFactory.<init>(SentryServiceClientFactory.java:83)
>   at org.apache.sentry.service.thrift.SentryServiceClientFactory.create(SentryServiceClientFactory.java:65)
>   at org.apache.impala.util.SentryPolicyService$SentryServiceClient.createClient(SentryPolicyService.java:94)
>   ... 3 more
>
> In our fe/src/test/resources/sentry-site.xml.template we have
> sentry.service.client.server.rpc-address instead of
> sentry.service.client.server.rpc-address*es* as it says it in the
> exception. Could this be the problem?


Is anyone else hitting a problem with Sentry when starting the Impala mini cluster?

2017-07-19 Thread Taras Bobrovytsky
When I run ./testdata/bin/run-all.sh I get the following:
Error in Impala/testdata/bin/run-all.sh at line 58:
$IMPALA_HOME/testdata/bin/run-sentry-service.sh

run-sentry-service.log shows the following:
17/07/19 16:22:23 ERROR testutil.SentryServicePinger: Error issuing RPC to Sentry Service  (attempt 4/30):
org.apache.impala.common.InternalException: Error creating Sentry Service client:
  at org.apache.impala.util.SentryPolicyService$SentryServiceClient.createClient(SentryPolicyService.java:96)
  at org.apache.impala.util.SentryPolicyService$SentryServiceClient.<init>(SentryPolicyService.java:67)
  at org.apache.impala.util.SentryPolicyService.listAllRoles(SentryPolicyService.java:391)
  at org.apache.impala.testutil.SentryServicePinger.main(SentryServicePinger.java:75)

Caused by: sentry.org.apache.sentry.core.common.exception.MissingConfigurationException:
  Property 'sentry.service.client.server.rpc-addresses' is missing in configuration
  at sentry.org.apache.sentry.core.common.transport.SentryPolicyClientTransportConfig.getSentryServerRpcAddress(SentryPolicyClientTransportConfig.java:73)
  at sentry.org.apache.sentry.core.common.transport.SentryTransportPool.<init>(SentryTransportPool.java:103)
  at org.apache.sentry.service.thrift.SentryServiceClientFactory.<init>(SentryServiceClientFactory.java:83)
  at org.apache.sentry.service.thrift.SentryServiceClientFactory.create(SentryServiceClientFactory.java:65)
  at org.apache.impala.util.SentryPolicyService$SentryServiceClient.createClient(SentryPolicyService.java:94)
  ... 3 more

In our fe/src/test/resources/sentry-site.xml.template we have
sentry.service.client.server.rpc-address instead of
sentry.service.client.server.rpc-address*es* as it says it in the
exception. Could this be the problem?
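
One quick way to test this hypothesis is to check which of the two property names a config file actually defines. The following is a minimal sketch, not part of Impala or Sentry: it assumes the template follows the usual Hadoop-style layout (a configuration root with property/name/value elements), and the SAMPLE document and its localhost value are made up for illustration.

```python
import xml.etree.ElementTree as ET

OLD_NAME = "sentry.service.client.server.rpc-address"
NEW_NAME = "sentry.service.client.server.rpc-addresses"


def property_names(xml_text):
    """Collect the <name> of every <property> in a Hadoop-style config document."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name", default="").strip() for p in root.iter("property")}


def diagnose(xml_text):
    """Report whether the config uses the new name, only the old one, or neither."""
    names = property_names(xml_text)
    if NEW_NAME in names:
        return "ok"
    if OLD_NAME in names:
        return "renamed"
    return "missing"


# Hypothetical config contents, for illustration only.
SAMPLE = """<configuration>
  <property>
    <name>sentry.service.client.server.rpc-address</name>
    <value>localhost</value>
  </property>
</configuration>"""

print(diagnose(SAMPLE))  # renamed
```

A result of "renamed" means the file still uses only the old singular name, which would match the MissingConfigurationException above.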


Re: What is dictionary filter in Impala?

2017-07-19 Thread Tim Armstrong
Hi,
  The Parquet format supports various encodings that help compress columns
of data with different characteristics. Dictionary encoding is useful if
there are many repeats of the same value in the same column. E.g. if you
have a string column with country names - you might have "Australia",
"USA", "China" repeated many times. If there are <= 40,000 distinct values
a column can be encoded with a dictionary: at the start of the column there
is a dictionary with all of the distinct values, then the data is
represented as integers.

 E.g. if the dictionary was ["Australia", "USA", "China"], then "China"
would be encoded as 2.
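
The encoding step can be sketched in a few lines of Python. This is an illustration of the idea only, not Impala's or Parquet's actual implementation; the function name dictionary_encode is made up for the example, and the real format also bit-packs and run-length encodes the integer codes.

```python
def dictionary_encode(values):
    """Return (dictionary, codes), where each code is an index into dictionary."""
    dictionary = []   # distinct values, in order of first appearance
    index_of = {}     # value -> position in dictionary
    codes = []
    for v in values:
        if v not in index_of:
            index_of[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index_of[v])
    return dictionary, codes


column = ["Australia", "USA", "China", "USA", "China", "Australia"]
dictionary, codes = dictionary_encode(column)
print(dictionary)  # ['Australia', 'USA', 'China']
print(codes)       # [0, 1, 2, 1, 2, 0]
```

Decoding is just a lookup: dictionary[code] recovers the original value for each row.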

Dictionary filtering takes advantage of this to speed up scans. E.g. if I
have a query like "select * from my_table where country = 'Iceland'", then
we can check the dictionary for a Parquet row group before scanning the row
group. If no entries in the dictionary match the condition, then we can
skip the whole row group.
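
A minimal Python sketch of that check follows. It is an illustration only, not Impala's scanner code: each row group is modelled as a (dictionary, codes) pair, every column chunk is assumed to be fully dictionary encoded, and the names row_group_may_match and scan are made up for the example.

```python
def row_group_may_match(dictionary, predicate):
    """True if at least one dictionary entry satisfies the predicate."""
    return any(predicate(v) for v in dictionary)


def scan(row_groups, predicate):
    """Return (matching values, number of row groups skipped via the dictionary)."""
    matches = []
    skipped = 0
    for dictionary, codes in row_groups:
        if not row_group_may_match(dictionary, predicate):
            skipped += 1  # no entry can match, so no row in this group can match
            continue
        for code in codes:
            value = dictionary[code]
            if predicate(value):
                matches.append(value)
    return matches, skipped


row_groups = [
    (["Australia", "USA", "China"], [0, 1, 2, 1]),
    (["Iceland", "USA"], [0, 1, 0]),
]
matches, skipped = scan(row_groups, lambda country: country == "Iceland")
print(matches)  # ['Iceland', 'Iceland']
print(skipped)  # 1
```

The skip is only safe when every value in the row group goes through the dictionary. If some values were stored with another encoding, the dictionary would not list all values and the check could wrongly skip matching rows.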

On Wed, Jul 19, 2017 at 3:22 AM, Wang Chunling 
wrote:

> Hi,
>
> I see that Impala has a dictionary filter for Parquet scans. The comment
> says that a column that is 100% dictionary encoded can be dictionary
> filtered. Can you explain what kinds of columns can be dictionary encoded?
> And is there an example of dictionary filtering? Thanks a lot.
>
>
> Chunling


Re: jenkins.impala.io switching to SSL

2017-07-19 Thread Lars Volker
Hi All,

I completed the Jenkins reconfiguration that I announced last night.
Jenkins can now be reached at https://jenkins.impala.io and all previous
URLs redirect there permanently. From now on it will post https:// links in
code reviews. Links posted in old code reviews should still work. I found
two jobs that were aborted by the restart and I kicked off new builds for
those. If one of your jobs got killed, please make sure to restart it,
too.

Unless we discover any issues with the new configuration there should be no
more interruptions. Thank you for your patience.

Cheers, Lars

On Tue, Jul 18, 2017 at 1:43 PM, Lars Volker  wrote:

> Hi All,
>
> Jenkins has been running with SSL for the past few days and I haven't
> received any complaints. If no-one objects, tomorrow morning (Wednesday,
> PST) I will configure http://jenkins.impala.io:8080/ to redirect to
> https://jenkins.impala.io. From that point on, Jenkins will also post
> links to its https endpoint in code reviews.
>
> Let me know if you have any questions or concerns.
>
> Cheers, Lars
>
> On Fri, Jul 14, 2017 at 10:55 PM, Lars Volker  wrote:
>
>> Hi All,
>>
>> our Jenkins instance now has a proper SSL certificate and can be reached
>> at https://jenkins.impala.io. The old redirect from http://j.i.o now
>> points to the SSL endpoint instead of port 8080.
>>
>> If you run into any issues with the SSL setup, please let me know. As a
>> workaround you can still access Jenkins directly at
>> http://jenkins.impala.io:8080/. If no-one reports any issues in the next
>> few days, I will eventually make that URL redirect to SSL, too, so all
>> connections will be secured.
>>
>> Cheers, Lars
>>
>
>


What is dictionary filter in Impala?

2017-07-19 Thread Wang Chunling
Hi,

I see that Impala has a dictionary filter for Parquet scans. The comment
says that a column that is 100% dictionary encoded can be dictionary
filtered. Can you explain what kinds of columns can be dictionary encoded?
And is there an example of dictionary filtering? Thanks a lot.


Chunling