Thank you for helping me out :D

> Thanks for keeping this discussion going.
>
> On Sun, Sep 18, 2016 at 8:13 AM, Amos Bird <amosb...@gmail.com> wrote:
>
>>
>> > On Fri, Sep 16, 2016 at 9:06 PM, Amos Bird <amosb...@gmail.com> wrote:
>> >
>> >>
>> >> Hi there,
>> >>
>> >> I followed the wiki
>> >> https://cwiki.apache.org/confluence/display/IMPALA/How+to+load+and+run+Impala+tests
>> >> carefully but still have some problems in my local env.
>> >>
>> >> 1. I need to manually execute "hdfs dfs -mkdir /test-warehouse/emptytable"
>> >> to get rid of some fe test error.
>> >>
>> >>
>> > Ideally, you should not have to do this. Could you tell me what errors you
>> > encountered? Sounds like there may be a test or data loading bug we should
>> > fix.
>>
>> The error is:
>>
>> TestLoadData(com.cloudera.impala.analysis.AnalyzeStmtsTest) Time elapsed: 0.033 sec <<< FAILURE!
>> java.lang.AssertionError: got error:
>> INPATH location 'hdfs://localhost:20500/test-warehouse/emptytable' does not exist.
>> expected:
>> INPATH location 'hdfs://localhost:20500/test-warehouse/emptytable' contains no visible files.
>>     at org.junit.Assert.fail(Assert.java:88)
>>     at org.junit.Assert.assertTrue(Assert.java:41)
>>     at com.cloudera.impala.common.FrontendTestBase.AnalysisError(FrontendTestBase.java:312)
>>     at com.cloudera.impala.common.FrontendTestBase.AnalysisError(FrontendTestBase.java:292)
>>     at com.cloudera.impala.analysis.AnalyzeStmtsTest.TestLoadData(AnalyzeStmtsTest.java:2860)
>>
>>
> Do you have a table functional.emptytable? If yes, then what location is
> reported in "show create table"?

Query: show create table functional.emptytable
+-------------------------------------------------------------+
| result                                                      |
+-------------------------------------------------------------+
| CREATE EXTERNAL TABLE functional.emptytable (               |
|   field STRING                                              |
| )                                                           |
| PARTITIONED BY (                                            |
|   f2 INT                                                    |
| )                                                           |
| STORED AS TEXTFILE                                          |
| LOCATION 'hdfs://localhost:20500/test-warehouse/emptytable' |
| TBLPROPERTIES ('transient_lastDdlTime'='1464782625')        |
+-------------------------------------------------------------+
Fetched 1 row(s) in 5.51s
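
The location reported above can be checked against HDFS directly. Below is a minimal sketch, not part of the Impala tooling: `table_location` and `hdfs_dir_exists` are hypothetical helper names, and the second one assumes the `hdfs` CLI is on the PATH of a shell that has the Impala environment sourced.

```python
import re
import subprocess

# Trimmed copy of the SHOW CREATE TABLE output from this thread.
SAMPLE_DDL = """
CREATE EXTERNAL TABLE functional.emptytable (
  field STRING
)
PARTITIONED BY (
  f2 INT
)
STORED AS TEXTFILE
LOCATION 'hdfs://localhost:20500/test-warehouse/emptytable'
TBLPROPERTIES ('transient_lastDdlTime'='1464782625')
"""


def table_location(show_create_output):
    """Return the LOCATION URI found in SHOW CREATE TABLE output, or None."""
    match = re.search(r"LOCATION\s+'([^']+)'", show_create_output)
    return match.group(1) if match else None


def hdfs_dir_exists(uri):
    """Return True if `hdfs dfs -test -d <uri>` exits with status 0.

    Assumption: the `hdfs` command is available on PATH.
    """
    return subprocess.call(["hdfs", "dfs", "-test", "-d", uri]) == 0


# Usage on a machine with a running minicluster (not executed here):
#   uri = table_location(SAMPLE_DDL)
#   print(uri, hdfs_dir_exists(uri))
```

If the second helper returns False for the reported location, that would match the "does not exist" variant of the AnalyzeStmtsTest failure rather than the expected "contains no visible files".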

> Does the directory exist in HDFS?

No.

>
> You could try to manually reload the table and see if the directory is
> created:
> bin/load-data.py -f -w functional-query --table_names=emptytable --table_formats=text/none

After executing this command the directory appears.

>
>>>
>> >> 2. I have authz-policy.ini in HDFS, but I still get authorization errors.
>> >>
>> >> TestSelect[0](com.cloudera.impala.analysis.AuthorizationTest) Time elapsed: 0.333 sec <<< FAILURE!
>> >> java.lang.AssertionError: got error:
>> >> User 'amos' does not have privileges to execute 'SELECT' on: default.nodb
>> >> expected:
>> >> User 'amos' does not have privileges to execute 'SELECT' on: nodb.alltypes
>> >>     at org.junit.Assert.fail(Assert.java:88)
>> >>     at org.junit.Assert.assertTrue(Assert.java:41)
>> >>     at com.cloudera.impala.analysis.AuthorizationTest.AuthzError(AuthorizationTest.java:2220)
>> >>     at com.cloudera.impala.analysis.AuthorizationTest.AuthzError(AuthorizationTest.java:2203)
>> >>     at com.cloudera.impala.analysis.AuthorizationTest.AuthzError(AuthorizationTest.java:2197)
>> >>     at com.cloudera.impala.analysis.AuthorizationTest.TestSelect(AuthorizationTest.java:512)
>> >>
>> >> TestSelect[1](com.cloudera.impala.analysis.AuthorizationTest) Time elapsed: 0.324 sec <<< FAILURE!
>> >> java.lang.AssertionError: got error:
>> >> User 'amos' does not have privileges to execute 'SELECT' on: default.nodb
>> >> expected:
>> >> User 'amos' does not have privileges to execute 'SELECT' on: nodb.alltypes
>> >>     at org.junit.Assert.fail(Assert.java:88)
>> >>     at org.junit.Assert.assertTrue(Assert.java:41)
>> >>     at com.cloudera.impala.analysis.AuthorizationTest.AuthzError(AuthorizationTest.java:2220)
>> >>     at com.cloudera.impala.analysis.AuthorizationTest.AuthzError(AuthorizationTest.java:2203)
>> >>     at com.cloudera.impala.analysis.AuthorizationTest.AuthzError(AuthorizationTest.java:2197)
>> >>     at com.cloudera.impala.analysis.AuthorizationTest.TestSelect(AuthorizationTest.java:512)
>> >>
>> >>
>> >> Results :
>> >>
>> >> Failed tests:
>> >> AuthorizationTest.TestSelect:512->AuthzError:2197->AuthzError:2203->AuthzError:2220 got error:
>> >> User 'amos' does not have privileges to execute 'SELECT' on: default.nodb
>> >> expected:
>> >> User 'amos' does not have privileges to execute 'SELECT' on: nodb.alltypes
>> >> AuthorizationTest.TestSelect:512->AuthzError:2197->AuthzError:2203->AuthzError:2220 got error:
>> >> User 'amos' does not have privileges to execute 'SELECT' on: default.nodb
>> >> expected:
>> >> User 'amos' does not have privileges to execute 'SELECT' on: nodb.alltypes
>> >>
>> >>
>> >>
>> > Strange. In this test, we register two authorization requests, and it seems
>> > like those are not checked in the expected order. However, that should not
>> > be possible because we store them in a LinkedHashSet.
>> > Could you dig into this a little further to see if you can figure out why
>> > the order is wrong?
>> >
>> > This is where we register the authorization requests:
>> > https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/main/java/com/cloudera/impala/analysis/Analyzer.java#L544
>> >
>> > This is where we check the authorization requests:
>> > https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/main/java/com/cloudera/impala/analysis/AnalysisContext.java#L391
>> >
>> >
>>
>> I tried directly executing "select 1 from nodb.alltypes" in impala-shell, leading to this error:
>> ERROR: AnalysisException: Could not resolve table reference: 'nodb.alltypes'
>>
>> How can I reproduce the authorization tests in impala-shell so I can debug it?
>>
>>
> FYI, this is actually a known issue and may have something to do with the
> JRE version you are running: https://issues.cloudera.org/browse/IMPALA-3643
> As far as I can tell the bug should be "impossible" because we use a
> LinkedHashSet, but maybe certain JREs do not properly honor the guarantees.
>
> The AuthorizationTests in particular require a non-trivial setup, so I'd
> not recommend trying to debug via the Impala shell.
>
> I'd recommend debugging in one of these ways:
> - Run the test manually via "mvn test -Dtest=AuthorizationTest" from the FE
> directory. Attach debugger and break in TestSelect().
> - Run the JUnit test from an IDE such as Eclipse and then debug the test.
> I'm afraid there is no easy way to just run that single query in our
> current test setup. You will need to run the whole suite, but you can break
> TestSelect() or hack the code in various places to set useful breakpoints.
>
> Hope that helps.

I tried jdk1.8.0_102. Compilation works just fine, but AuthorizationTest fails with a NoClassDefFoundError:

Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.008 sec <<< FAILURE! - in com.cloudera.impala.analysis.AuthorizationTest
initializationError(com.cloudera.impala.analysis.AuthorizationTest) Time elapsed: 0.006 sec <<< ERROR!
java.lang.NoClassDefFoundError: Could not initialize class com.cloudera.impala.analysis.AuthorizationTest
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.runners.Parameterized.allParameters(Parameterized.java:280)
    at org.junit.runners.Parameterized.<init>(Parameterized.java:248)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.junit.internal.builders.AnnotatedBuilder.buildRunner(AnnotatedBuilder.java:104)
    at org.junit.internal.builders.AnnotatedBuilder.runnerForClass(AnnotatedBuilder.java:86)
    at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59)
    at org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:26)
    at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59)
    at org.junit.internal.requests.ClassRequest.getRunner(ClassRequest.java:33)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

>
>>
>> >>
>> >>
>> >> 3. For end-to-end tests, I encountered two kinds of errors
>> >>
>> >> a) connection refused.
>> >>
>> >> SET sync_ddl=False;
>> >> -- executing against localhost:21000
>> >> DROP DATABASE `test_drop_cleans_hdfs_dirs_fdfd4f8` CASCADE;
>> >>
>> >> ___________________ ERROR at setup of TestLoadData.test_load[exec_option: {'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | table_format: text/none] ___________________
>> >> [gw5] linux2 -- Python 2.6.6 /home/amos/incubator-impala/bin/../infra/python/env/bin/python
>> >> metadata/test_load.py:77: in setup_method
>> >>     "{0}/{1}/100101.txt".format(STAGING_PATH, i))
>> >> util/hdfs_util.py:122: in copy
>> >>     data = self.read_file(src)
>> >> ../infra/python/env/lib/python2.6/site-packages/pywebhdfs/webhdfs.py:183: in read_file
>> >>     response = requests.get(uri, allow_redirects=True)
>> >> ../infra/python/env/lib/python2.6/site-packages/requests/api.py:69: in get
>> >>     return request('get', url, params=params, **kwargs)
>> >> ../infra/python/env/lib/python2.6/site-packages/requests/api.py:50: in request
>> >>     response = session.request(method=method, url=url, **kwargs)
>> >> ../infra/python/env/lib/python2.6/site-packages/requests/sessions.py:465: in request
>> >>     resp = self.send(prep, **send_kwargs)
>> >> ../infra/python/env/lib/python2.6/site-packages/requests/sessions.py:594: in send
>> >>     history = [resp for resp in gen] if allow_redirects else []
>> >> ../infra/python/env/lib/python2.6/site-packages/requests/sessions.py:196: in resolve_redirects
>> >>     **adapter_kwargs
>> >> ../infra/python/env/lib/python2.6/site-packages/requests/sessions.py:573: in send
>> >>     r = adapter.send(request, **kwargs)
>> >> ../infra/python/env/lib/python2.6/site-packages/requests/adapters.py:415: in send
>> >>     raise ConnectionError(err, request=request)
>> >> E   ConnectionError: ('Connection aborted.', error(111, 'Connection refused'))
>> >>
>> >>
>> > The connection refused issue is very bizarre. One thing that I noticed is
>> > that your Python does not seem to match what we use (Python 2.7.3).
>> > Could you re-run infra/python/bootstrap_virtualenv.py and see if you get
>> > the expected version into infra/python/env/local/bin?
>> >
>> > Alternatively, maybe there's a problem with your /etc/hosts? You can try
>> > searching online for WebHdfs and /etc/hosts
>> >
>>
>> Well, I find this 'find_py26.py' file under deps. Is it normal?
>>
>
> Yes, that's normal. That file looks for Python 2.6 on your system but should
> not be relevant for running tests because we use the Python from our
> virtualenv and not the one on your system.
>
> What's your output when you run "impala-python --version"? You should get
> "Python 2.7.3".

[amos@t450s tests]$ impala-python --version
Python 2.6.6

> Also, what's the Python version on your system? Our virtualenv will use
> Python 2.6 if your system has a Python < 2.6.
> You could try to upgrade your system Python and then
> re-run infra/python/bootstrap_virtualenv.py

My system has python 2.6.6.

>
> Still, theoretically Python 2.6 in the virtual env should work. I think
> it's more likely you are having a connection problem due to a misconfigured
> /etc/hosts.
>
> Are you running the test from a shell that has bin/impala-config.sh and
> bin/set-classpath.sh sourced?

Yes.
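
To narrow down the "Connection refused" (errno 111) failure, it may help to probe each endpoint separately and see what the host name resolves to. The traceback passes through `resolve_redirects`, which suggests (though this is only a reading of the stack) that the refused connection could be the redirect target rather than the first WebHDFS request. Below is a minimal sketch using only the standard library; `probe` is a hypothetical helper, and the endpoints listed are the ones that appear in this thread.

```python
import socket


def probe(host, port, timeout=3.0):
    """Attempt a TCP connection and return (ok, detail).

    The detail string lists the addresses the host name resolves to, which
    makes a misconfigured /etc/hosts entry easier to spot.
    """
    try:
        infos = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return False, "cannot resolve %s: %s" % (host, exc)
    addresses = sorted(set(info[4][0] for info in infos))
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout) as exc:
        return False, "%s -> %s, connect to port %d failed: %s" % (
            host, addresses, port, exc)
    conn.close()
    return True, "%s -> %s, port %d reachable" % (host, addresses, port)


if __name__ == "__main__":
    # Ports taken from this thread: HDFS (20500), the suggested
    # --namenode_http_address (50070), and impalad (21000).
    for host, port in [("localhost", 20500), ("localhost", 50070),
                       ("localhost", 21000)]:
        print(probe(host, port)[1])
```

If a redirect URL can be captured from the failing request, its host and port would be worth adding to the list, since that is the address the client actually failed to reach.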

>
> To further debug this you could try to specify your namenode address when
> running the test to see whether it is somehow picking up a wrong address:
> cd tests
> ./run-tests.py metadata/test_load.py --namenode_http_address=localhost:50070
>
> And see if that works.

Unfortunately, no.

>
>
>
>> [amos@nobida143 incubator-impala]$ ls infra/python/deps/
>> download_requirements find_py26.py pip_download.py requirements.txt
>> [amos@nobida143 incubator-impala]$ cat infra/python/deps/download_requirements
>> #!/bin/bash
>>
>> # Licensed to the Apache Software Foundation (ASF) under one
>> # or more contributor license agreements. See the NOTICE file
>> # distributed with this work for additional information
>> # regarding copyright ownership. The ASF licenses this file
>> # to you under the Apache License, Version 2.0 (the
>> # "License"); you may not use this file except in compliance
>> # with the License. You may obtain a copy of the License at
>> #
>> # http://www.apache.org/licenses/LICENSE-2.0
>> #
>> # Unless required by applicable law or agreed to in writing,
>> # software distributed under the License is distributed on an
>> # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
>> # KIND, either express or implied. See the License for the
>> # specific language governing permissions and limitations
>> # under the License.
>>
>> set -euo pipefail
>>
>> DIR="$(dirname "$0")"
>>
>> pushd "$DIR"
>> PY26="$(./find_py26.py)"
>> # Directly download packages listed in requirements.txt, but don't install them.
>> "$PY26" pip_download.py
>> # For virtualenv, other scripts rely on the .tar.gz package (not a .whl package).
>> "$PY26" pip_download.py virtualenv 13.1.0
>> # kudu-python is downloaded separately because pip install attempts to execute a
>> # setup.py subcommand for kudu-python that can fail even if the download succeeds.
>> "$PY26" pip_download.py kudu-python 0.1.1
>> popd
>>
>>
>> >> b) stats do not match
>> >>
>> >> [gw4] linux2 -- Python 2.6.6 /home/amos/incubator-impala/bin/../infra/python/env/bin/python
>> >> metadata/test_metadata_query_statements.py:67: in test_show_stats
>> >>     self.run_test_case('QueryTest/show-stats', vector, "functional")
>> >> common/impala_test_suite.py:342: in run_test_case
>> >>     self.__verify_results_and_errors(vector, test_section, result, use_db)
>> >> common/impala_test_suite.py:234: in __verify_results_and_errors
>> >>     replace_filenames_with_placeholder)
>> >> common/test_result_verifier.py:398: in verify_raw_results
>> >>     VERIFIER_MAP[verifier](expected, actual)
>> >> common/test_result_verifier.py:231: in verify_query_result_is_equal
>> >>     assert expected_results == actual_results
>> >>
>> >> ...
>> >>
>> >> -- executing against localhost:21000
>> >> show column stats alltypes_clone;
>> >>
>> >> MainThread: Comparing QueryTestResults (expected vs actual):
>> >> 'bigint_col','BIGINT',10,-1,8,8 == 'bigint_col','BIGINT',10,-1,8,8
>> >> 'bool_col','BOOLEAN',2,-1,1,1 == 'bool_col','BOOLEAN',2,-1,1,1
>> >> 'date_string_col','STRING',736,-1,8,8 == 'date_string_col','STRING',736,-1,8,8
>> >> 'double_col','DOUBLE',-1,-1,8,8 == 'double_col','DOUBLE',-1,-1,8,8
>> >> 'float_col','FLOAT',10,-1,4,4 == 'float_col','FLOAT',10,-1,4,4
>> >> 'id','INT',7505,-1,4,4 == 'id','INT',7505,-1,4,4
>> >> 'int_col','INT',-1,-1,4,4 == 'int_col','INT',-1,-1,4,4
>> >> 'month','INT',12,0,4,4 == 'month','INT',12,0,4,4
>> >> 'smallint_col','SMALLINT',10,-1,2,2 == 'smallint_col','SMALLINT',10,-1,2,2
>> >> 'string_col','STRING',10,-1,-1,-1 == 'string_col','STRING',10,-1,-1,-1
>> >> 'timestamp_col','TIMESTAMP',7554,-1,16,16 != 'timestamp_col','TIMESTAMP',7552,-1,16,16
>> >> 'tinyint_col','TINYINT',10,-1,1,1 == 'tinyint_col','TINYINT',10,-1,1,1
>> >> 'year','INT',2,0,4,4 == 'year','INT',2,0,4,4
>> >>
>> >>
>> > Very strange.
>> > Can you do a compute stats on functional.alltypes and
>> > confirm that the NDV for timestamp_col is 7552 in your setup?
>>
>> Yes.
>>
>
> I'll need to ask around for help. I have no idea why this is happening.

Thanks :)

>
>>
>> >
>> >
>> >
>> >> I'm using CentOS 6.8 final. I have no idea what goes wrong. Any help is
>> >> much appreciated!
>> >
>> >
>> >
>> >
>> >>
>> >> Best regards,
>> >> Amos
>> >>
>> >>
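
For the column-stats mismatch in 3b), a small script can pull out exactly which statistic differs instead of scanning the comparison by eye. This is only a sketch: `diff_stats` is a hypothetical helper, the field names assume the usual `show column stats` header order, and the naive comma split assumes no value contains a comma.

```python
# Assumed column order of `show column stats` output.
HEADER = ["Column", "Type", "#Distinct Values", "#Nulls", "Max Size", "Avg Size"]

# Two lines copied from the "expected vs actual" comparison in this thread.
SAMPLE = [
    "'id','INT',7505,-1,4,4 == 'id','INT',7505,-1,4,4",
    "'timestamp_col','TIMESTAMP',7554,-1,16,16 != "
    "'timestamp_col','TIMESTAMP',7552,-1,16,16",
]


def diff_stats(lines):
    """Return (column, field, expected, actual) for each differing field."""
    diffs = []
    for line in lines:
        if " != " not in line:
            continue  # rows marked '==' matched, nothing to report
        expected, actual = line.split(" != ", 1)
        exp_fields = [f.strip() for f in expected.split(",")]
        act_fields = [f.strip() for f in actual.split(",")]
        column = exp_fields[0].strip("'")
        for name, exp, act in zip(HEADER, exp_fields, act_fields):
            if exp != act:
                diffs.append((column, name, exp, act))
    return diffs


if __name__ == "__main__":
    for column, name, exp, act in diff_stats(SAMPLE):
        print("%s: %s expected %s, got %s" % (column, name, exp, act))
```

On the sample above this reports a single difference, the distinct-value count for timestamp_col (7554 expected, 7552 actual), which is the same NDV that was confirmed by running compute stats on functional.alltypes.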