Hi Amos,

They should all succeed in principle, though the tests can be finicky. They pass reliably in our automation environment.
With the test_hdfs_caching.py ones, probably some part of the data loading failed, specifically this part that caches the tables: https://github.com/apache/incubator-impala/blob/master/testdata/bin/create-load-data.sh#L165 . You could try running those statements by hand.

The stats mismatch is more mysterious to me - maybe someone else has some ideas.

On Sun, Nov 6, 2016 at 6:27 PM, Amos Bird <[email protected]> wrote:

>
> Are impala's e2e tests supposed to be all successful? I still get these
> 7 errors:
>
> query_test/test_hdfs_caching.py::TestHdfsCaching::test_table_is_cached[exec_option: {'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | table_format: text/none] FAILED
> query_test/test_hdfs_caching.py::TestHdfsCaching::test_table_is_cached[exec_option: {'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | table_format: text/gzip/block] FAILED
> [gw2] FAILED metadata/test_compute_stats.py::TestComputeStats::test_compute_stats[exec_option: {'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | table_format: text/none]
> [gw2] FAILED metadata/test_compute_stats.py::TestComputeStats::test_compute_stats_incremental[exec_option: {'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | table_format: text/none]
> [gw6] FAILED metadata/test_ddl.py::TestDdlStatements::test_alter_set_column_stats[exec_option: {'batch_size': 0, 'num_nodes': 0, 'sync_ddl': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none-unique_database0]
> [gw6] FAILED metadata/test_ddl.py::TestDdlStatements::test_truncate_table[exec_option: {'batch_size': 0, 'num_nodes': 0, 'sync_ddl': 0, 'disable_codegen': False, 'abort_on_error': 1,
> 'exec_single_node_rows_threshold': 0} | table_format: text/none-unique_database0]
> [gw6] FAILED metadata/test_metadata_query_statements.py::TestMetadataQueryStatements::test_show_stats[exec_option: {'batch_size': 0, 'num_nodes': 0, 'sync_ddl': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none]
>
> and this stats mismatch looks exactly the same on my Centos machine,
> ---
> -- executing against localhost:21000
> show column stats alltypes_clone;
>
> MainThread: Comparing QueryTestResults (expected vs actual):
> 'bigint_col','BIGINT',10,-1,8,8 == 'bigint_col','BIGINT',10,-1,8,8
> 'bool_col','BOOLEAN',2,-1,1,1 == 'bool_col','BOOLEAN',2,-1,1,1
> 'date_string_col','STRING',736,-1,8,8 == 'date_string_col','STRING',736,-1,8,8
> 'double_col','DOUBLE',-1,-1,8,8 == 'double_col','DOUBLE',-1,-1,8,8
> 'float_col','FLOAT',10,-1,4,4 == 'float_col','FLOAT',10,-1,4,4
> 'id','INT',7505,-1,4,4 == 'id','INT',7505,-1,4,4
> 'int_col','INT',-1,-1,4,4 == 'int_col','INT',-1,-1,4,4
> 'month','INT',12,0,4,4 == 'month','INT',12,0,4,4
> 'smallint_col','SMALLINT',10,-1,2,2 == 'smallint_col','SMALLINT',10,-1,2,2
> 'string_col','STRING',10,-1,-1,-1 == 'string_col','STRING',10,-1,-1,-1
> 'timestamp_col','TIMESTAMP',7554,-1,16,16 != 'timestamp_col','TIMESTAMP',7552,-1,16,16
> 'tinyint_col','TINYINT',10,-1,1,1 == 'tinyint_col','TINYINT',10,-1,1,1
> 'year','INT',2,0,4,4 == 'year','INT',2,0,4,4
> ---
>
> Amos Bird writes:
>
> > Ah, re-login does the trick. Thanks for your help ;).
> >
> > However, the e2e test yells so many errors.
> >
> > 1) the name of the directory containing the error log is strange.
> > It literally looks like this:
> > tests/"${RESULTS_DIR}/TEST-impala-custom-cluster.log"
> >
> > 2) the commit I tested is 7fc31b534d4c5cb118c559e16556a6c1ae6ca7fc
> >
> > 3) when executing tests/run-tests.py, it gave:
> > -----
> > Traceback (most recent call last):
> >   File "./tests/run-tests.py", line 94, in <module>
> >     test_executor.run_tests(args)
> >   File "./tests/run-tests.py", line 63, in run_tests
> >     exit_code = pytest.main(args)
> >   File "/home/amos/impala/infra/python/env/local/lib/python2.7/site-packages/_pytest/config.py", line 32, in main
> >     config = _prepareconfig(args, plugins)
> >   File "/home/amos/impala/infra/python/env/local/lib/python2.7/site-packages/_pytest/config.py", line 78, in _prepareconfig
> >     args = shlex.split(args)
> >   File "/usr/lib/python2.7/shlex.py", line 279, in split
> >     return list(lex)
> >   File "/usr/lib/python2.7/shlex.py", line 269, in next
> >     token = self.get_token()
> >   File "/usr/lib/python2.7/shlex.py", line 96, in get_token
> >     raw = self.read_token()
> >   File "/usr/lib/python2.7/shlex.py", line 172, in read_token
> >     raise ValueError, "No closing quotation"
> > ValueError: No closing quotation
> > -----
> >
> > 4) when executing "MAX_PYTEST_FAILURES=12345678 ./bin/run-all-tests.sh",
> > be, fe tests pass. e2e tests fail a lot. Log files are attached.
> >
> > I'm referring to this https://cwiki.apache.org/confluence/display/IMPALA/How+to+load+and+run+Impala+tests
> >
> > regards,
> > Amos
> >
> >
> > Lars Volker writes:
> >
> >> Yes, this is already committed to the impala-setup repo and I used it
> >> yesterday on a fresh Ubuntu 14.04 machine with success.
> >>
> >> Amos, after running impala-setup you will need to re-login to make sure the
> >> changes made to the system limits are effective. You can check them by
> >> running "ulimit -n" in your shell.
> >>
> >> On Wed, Nov 2, 2016 at 5:48 AM, Jim Apple <[email protected]> wrote:
> >>
> >>> Isn't that already part of the script?
> >>>
> >>> https://github.com/awleblang/impala-setup/commit/56fa829c99e997585eb63fcd49cb65eb8357e679
> >>>
> >>> https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=blob;f=bin/bootstrap_development.sh;h=8c4f742ae058f8017858d2a749e8824be58bd410;hb=HEAD#l68
> >>>
> >>> On Tue, Nov 1, 2016 at 9:44 PM, Dimitris Tsirogiannis
> >>> <[email protected]> wrote:
> >>> > Hi Amos,
> >>> >
> >>> > You need to increase your limits (/etc/security/limits.conf) for max
> >>> > number of open files (nofile). Use a pretty big number (e.g. 500K) for
> >>> > both soft and hard.
> >>> >
> >>> > Hope that helps.
> >>> >
> >>> > Dimitris
> >>> >
> >>> > On Tue, Nov 1, 2016 at 8:57 PM, Amos Bird <[email protected]> wrote:
> >>> >>
> >>> >> Hi there,
> >>> >>
> >>> >> After days of efforts to make impala's local tests work on my Centos
> >>> >> machine, I finally gave up and turned to Ubuntu. I followed this simple
> >>> >> guide
> >>> >> https://cwiki.apache.org/confluence/display/IMPALA/Bootstrapping+an+Impala+Development+Environment+From+Scratch
> >>> >> on a freshly installed Ubuntu 14.04. Unfortunately there are still errors
> >>> >> in the data loading phase. Here is the error log,
> >>> >>
> >>> >> ---------------------------------------------------------------------------------------------
> >>> >> Loading Kudu TPCH (logging to /home/amos/impala/logs/data_loading/load-kudu-tpch.log)... FAILED
> >>> >> 'load-data tpch core kudu/none/none force' failed. Tail of log:
> >>> >> distribute by hash (c_custkey) into 9 buckets stored as kudu
> >>> >>
> >>> >> (load-tpch-core-impala-generated-kudu-none-none.sql):
> >>> >>
> >>> >>
> >>> >> Executing HBase Command: hbase shell load-tpch-core-hbase-generated.create
> >>> >> 16/11/02 01:07:58 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
> >>> >> SLF4J: Class path contains multiple SLF4J bindings.
> >>> >> SLF4J: Found binding in [jar:file:/home/amos/impala/toolchain/cdh_components/hbase-1.2.0-cdh5.10.0-SNAPSHOT/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> >>> >> SLF4J: Found binding in [jar:file:/home/amos/impala/toolchain/cdh_components/hadoop-2.6.0-cdh5.10.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> >>> >> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> >>> >> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> >>> >> Executing HBase Command: hbase shell post-load-tpch-core-hbase-generated.sql
> >>> >> 16/11/02 01:08:03 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
> >>> >> SLF4J: Class path contains multiple SLF4J bindings.
> >>> >> SLF4J: Found binding in [jar:file:/home/amos/impala/toolchain/cdh_components/hbase-1.2.0-cdh5.10.0-SNAPSHOT/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> >>> >> SLF4J: Found binding in [jar:file:/home/amos/impala/toolchain/cdh_components/hadoop-2.6.0-cdh5.10.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> >>> >> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> >>> >> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> >>> >> Invalidating Metadata
> >>> >> (load-tpch-core-impala-load-generated-kudu-none-none.sql):
> >>> >> INSERT INTO TABLE tpch_kudu.lineitem SELECT * FROM tpch.lineitem
> >>> >>
> >>> >> Data Loading from Impala failed with error: ImpalaBeeswaxException:
> >>> >> Query aborted:
> >>> >> Kudu error(s) reported, first error: Timed out: Failed to write batch of 2708 ops to tablet 84aa134fb6c24916aa16cf50f48ec557 after 329 attempt(s): Failed to write to server: (no server available): Write(tablet: 84aa134fb6c24916aa16cf50f48ec557, num_ops: 2708, num_attempts: 329) passed its deadline: Network error: recv error: Connection reset by peer (error 104)
> >>> >>
> >>> >>
> >>> >>
> >>> >> Kudu error(s) reported, first error: Timed out: Failed to write batch of 2708 ops to tablet 84aa134fb6c24916aa16cf50f48ec557 after 329 attempt(s): Failed to write to server: (no server available): Write(tablet: 84aa134fb6c24916aa16cf50f48ec557, num_ops: 2708, num_attempts: 329) passed its deadline: Network error: recv error: Connection reset by peer (error 104)
> >>> >> Error in Kudu table 'impala::tpch_kudu.lineitem': Timed out: Failed to write batch of 2708 ops to tablet 84aa134fb6c24916aa16cf50f48ec557 after 329 attempt(s): Failed to write to server: (no server available): Write(tablet: 84aa134fb6c24916aa16cf50f48ec557, num_ops: 2708, num_attempts: 329) passed its deadline: Network error: recv error: Connection reset by peer (error 104) (1 of 2708 similar)
> >>> >>
> >>> >> Traceback (most recent call last):
> >>> >>   File "/home/amos/impala/bin/load-data.py", line 158, in exec_impala_query_from_file
> >>> >>     result = impala_client.execute(query)
> >>> >>   File "/home/amos/impala/tests/beeswax/impala_beeswax.py", line 173, in execute
> >>> >>     handle = self.__execute_query(query_string.strip(), user=user)
> >>> >>   File "/home/amos/impala/tests/beeswax/impala_beeswax.py", line 339, in __execute_query
> >>> >>     self.wait_for_completion(handle)
> >>> >>   File "/home/amos/impala/tests/beeswax/impala_beeswax.py", line 359, in wait_for_completion
> >>> >>     raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> >>> >> ImpalaBeeswaxException: ImpalaBeeswaxException:
> >>> >> Query aborted:
> >>> >> Kudu error(s) reported, first error: Timed out: Failed to write batch of 2708 ops to tablet 84aa134fb6c24916aa16cf50f48ec557 after 329 attempt(s): Failed to write to server: (no server available): Write(tablet: 84aa134fb6c24916aa16cf50f48ec557, num_ops: 2708, num_attempts: 329) passed its deadline: Network error: recv error: Connection reset by peer (error 104)
> >>> >>
> >>> >>
> >>> >>
> >>> >> Kudu error(s) reported, first error: Timed out: Failed to write batch of 2708 ops to tablet 84aa134fb6c24916aa16cf50f48ec557 after 329 attempt(s): Failed to write to server: (no server available): Write(tablet: 84aa134fb6c24916aa16cf50f48ec557, num_ops: 2708, num_attempts: 329) passed its deadline: Network error: recv error: Connection reset by peer (error 104)
> >>> >> Error in Kudu table 'impala::tpch_kudu.lineitem': Timed out: Failed to write batch of 2708 ops to tablet 84aa134fb6c24916aa16cf50f48ec557 after 329 attempt(s): Failed to write to server: (no server available): Write(tablet: 84aa134fb6c24916aa16cf50f48ec557, num_ops: 2708, num_attempts: 329) passed its deadline: Network error: recv error: Connection reset by peer (error 104) (1 of 2708 similar)
> >>> >>
> >>> >> Error in /home/amos/impala/testdata/bin/create-load-data.sh at line 45: while [ -n "$*" ]
> >>> >> + cleanup
> >>> >> + rm -rf /tmp/tmp.hMzGwIcUo3
> >>> >> ---------------------------------------------------------------------------------------------
> >>> >>
> >>> >> This kinda blocks my patch's rebasing. Any help is much appreciated!
> >>> >>
> >>> >> regards,
> >>> >> Amos
> >>>
> >
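
The open-file limit discussed earlier in the thread (the "ulimit -n" check after re-login, and the nofile setting in /etc/security/limits.conf) can also be inspected from Python. What follows is only a minimal sketch, not part of the Impala tooling: the function names are made up for illustration, and the 500000 threshold is an assumption taken from the "e.g. 500K" suggestion above.

```python
import resource

# Assumed threshold for illustration only; the thread merely suggests
# "a pretty big number (e.g. 500K)" for both the soft and hard limit.
SUGGESTED_NOFILE = 500000


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def limit_is_sufficient(limit, wanted=SUGGESTED_NOFILE):
    """Return True if the limit is unlimited or at least `wanted`."""
    return limit == resource.RLIM_INFINITY or limit >= wanted


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft nofile limit:", soft)
    print("hard nofile limit:", hard)
    if not (limit_is_sufficient(soft) and limit_is_sufficient(hard)):
        print("Limits look low; raise nofile and log in again before loading data.")
```

Run it from the same shell that will start the data load, since limits are inherited per login session; the soft value should match what "ulimit -n" reports there.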
