I can reproduce this on Ubuntu on EC2:

https://issues.cloudera.org/browse/IMPALA-4433

On Sun, Nov 6, 2016 at 7:24 PM, Tim Armstrong <[email protected]> wrote:
> Hi Amos,
>  They should all succeed in principle. It can be finicky, though; they pass
> reliably in our automation environment.
>
> With the test_hdfs_caching.py ones, probably some part of the data loading
> failed, specifically this part that caches the tables:
> https://github.com/apache/incubator-impala/blob/master/testdata/bin/create-load-data.sh#L165
> . You could try running those statements by hand.
>
> The stats mismatch is more mysterious to me - maybe someone else has some
> ideas.
>
> On Sun, Nov 6, 2016 at 6:27 PM, Amos Bird <[email protected]> wrote:
>
>>
>> Are impala's e2e tests supposed to be all successful? I still get these
>> 7 errors:
>>
>> query_test/test_hdfs_caching.py::TestHdfsCaching::test_table_is_cached[exec_option: {'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | table_format: text/none] FAILED
>> query_test/test_hdfs_caching.py::TestHdfsCaching::test_table_is_cached[exec_option: {'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | table_format: text/gzip/block] FAILED
>> [gw2] FAILED metadata/test_compute_stats.py::TestComputeStats::test_compute_stats[exec_option: {'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | table_format: text/none]
>> [gw2] FAILED metadata/test_compute_stats.py::TestComputeStats::test_compute_stats_incremental[exec_option: {'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | table_format: text/none]
>> [gw6] FAILED metadata/test_ddl.py::TestDdlStatements::test_alter_set_column_stats[exec_option: {'batch_size': 0, 'num_nodes': 0, 'sync_ddl': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none-unique_database0]
>> [gw6] FAILED metadata/test_ddl.py::TestDdlStatements::test_truncate_table[exec_option: {'batch_size': 0, 'num_nodes': 0, 'sync_ddl': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none-unique_database0]
>> [gw6] FAILED metadata/test_metadata_query_statements.py::TestMetadataQueryStatements::test_show_stats[exec_option: {'batch_size': 0, 'num_nodes': 0, 'sync_ddl': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none]
>>
>> and this stats mismatch looks exactly the same on my CentOS machine,
>> ---
>> -- executing against localhost:21000
>> show column stats alltypes_clone;
>>
>> MainThread: Comparing QueryTestResults (expected vs actual):
>> 'bigint_col','BIGINT',10,-1,8,8 == 'bigint_col','BIGINT',10,-1,8,8
>> 'bool_col','BOOLEAN',2,-1,1,1 == 'bool_col','BOOLEAN',2,-1,1,1
>> 'date_string_col','STRING',736,-1,8,8 == 'date_string_col','STRING',736,-1,8,8
>> 'double_col','DOUBLE',-1,-1,8,8 == 'double_col','DOUBLE',-1,-1,8,8
>> 'float_col','FLOAT',10,-1,4,4 == 'float_col','FLOAT',10,-1,4,4
>> 'id','INT',7505,-1,4,4 == 'id','INT',7505,-1,4,4
>> 'int_col','INT',-1,-1,4,4 == 'int_col','INT',-1,-1,4,4
>> 'month','INT',12,0,4,4 == 'month','INT',12,0,4,4
>> 'smallint_col','SMALLINT',10,-1,2,2 == 'smallint_col','SMALLINT',10,-1,2,2
>> 'string_col','STRING',10,-1,-1,-1 == 'string_col','STRING',10,-1,-1,-1
>> 'timestamp_col','TIMESTAMP',7554,-1,16,16 != 'timestamp_col','TIMESTAMP',7552,-1,16,16
>> 'tinyint_col','TINYINT',10,-1,1,1 == 'tinyint_col','TINYINT',10,-1,1,1
>> 'year','INT',2,0,4,4 == 'year','INT',2,0,4,4
>> ---
>>
>> Amos Bird writes:
>>
>> > Ah, re-login does the trick. Thanks for your help ;).
>> >
>> > However, the e2e test yells so many errors.
>> >
>> > 1) the name of the directory containing the error log is strange. It
>> >  literally looks like this:
>> > tests/"${RESULTS_DIR}/TEST-impala-custom-cluster.log"
>> >
>> > 2) the commit I tested is 7fc31b534d4c5cb118c559e16556a6c1ae6ca7fc
>> >
>> > 3) when executing tests/run-tests.py, it gave:
>> > -----
>> > Traceback (most recent call last):
>> >   File "./tests/run-tests.py", line 94, in <module>
>> >     test_executor.run_tests(args)
>> >   File "./tests/run-tests.py", line 63, in run_tests
>> >     exit_code = pytest.main(args)
>> >   File "/home/amos/impala/infra/python/env/local/lib/python2.7/site-packages/_pytest/config.py", line 32, in main
>> >     config = _prepareconfig(args, plugins)
>> >   File "/home/amos/impala/infra/python/env/local/lib/python2.7/site-packages/_pytest/config.py", line 78, in _prepareconfig
>> >     args = shlex.split(args)
>> >   File "/usr/lib/python2.7/shlex.py", line 279, in split
>> >     return list(lex)
>> >   File "/usr/lib/python2.7/shlex.py", line 269, in next
>> >     token = self.get_token()
>> >   File "/usr/lib/python2.7/shlex.py", line 96, in get_token
>> >     raw = self.read_token()
>> >   File "/usr/lib/python2.7/shlex.py", line 172, in read_token
>> >     raise ValueError, "No closing quotation"
>> > ValueError: No closing quotation
>> > -----
>> >
>> > 4) when executing "MAX_PYTEST_FAILURES=12345678 ./bin/run-all-tests.sh",
>> > the be and fe tests pass, but many e2e tests fail. Log files are attached.
>> >
>> > I'm referring to this https://cwiki.apache.org/confluence/display/IMPALA/How+to+load+and+run+Impala+tests
>> >
>> > regards,
>> > Amos
>> >
>> >
>> > Lars Volker writes:
>> >
>> >> Yes, this is already committed to the impala-setup repo and I used it
>> >> yesterday on a fresh Ubuntu 14.04 machine with success.
>> >>
>> >> Amos, after running impala-setup you will need to re-login to make sure the
>> >> changes made to the system limits are effective. You can check them by
>> >> running "ulimit -n" in your shell.
>> >>
>> >> On Wed, Nov 2, 2016 at 5:48 AM, Jim Apple <[email protected]> wrote:
>> >>
>> >>> Isn't that already part of the script?
>> >>>
>> >>> https://github.com/awleblang/impala-setup/commit/56fa829c99e997585eb63fcd49cb65eb8357e679
>> >>>
>> >>> https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=blob;f=bin/bootstrap_development.sh;h=8c4f742ae058f8017858d2a749e8824be58bd410;hb=HEAD#l68
>> >>>
>> >>> On Tue, Nov 1, 2016 at 9:44 PM, Dimitris Tsirogiannis
>> >>> <[email protected]> wrote:
>> >>> > Hi Amos,
>> >>> >
>> >>> > You need to increase your limits (/etc/security/limits.conf) for max
>> >>> > number of open files (nofile). Use a pretty big number (e.g. 500K) for
>> >>> > both soft and hard.
>> >>> >
>> >>> > Hope that helps.
>> >>> >
>> >>> > Dimitris
>> >>> >
>> >>> > On Tue, Nov 1, 2016 at 8:57 PM, Amos Bird <[email protected]> wrote:
>> >>> >>
>> >>> >> Hi there,
>> >>> >>
>> >>> >> After days of effort to make Impala's local tests work on my CentOS
>> >>> >> machine, I finally gave up and turned to Ubuntu. I followed this simple
>> >>> >> guide
>> >>> >> https://cwiki.apache.org/confluence/display/IMPALA/Bootstrapping+an+Impala+Development+Environment+From+Scratch
>> >>> >> on a freshly installed Ubuntu 14.04. Unfortunately there are still errors
>> >>> >> in the data loading phase. Here is the error log:
>> >>> >>
>> >>> >> ---------------------------------------------------------------------------------------------
>> >>> >> Loading Kudu TPCH (logging to /home/amos/impala/logs/data_loading/load-kudu-tpch.log)... FAILED
>> >>> >> 'load-data tpch core kudu/none/none force' failed. Tail of log:
>> >>> >> distribute by hash (c_custkey) into 9 buckets stored as kudu
>> >>> >>
>> >>> >> (load-tpch-core-impala-generated-kudu-none-none.sql):
>> >>> >>
>> >>> >>
>> >>> >> Executing HBase Command: hbase shell load-tpch-core-hbase-generated.create
>> >>> >> 16/11/02 01:07:58 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
>> >>> >> SLF4J: Class path contains multiple SLF4J bindings.
>> >>> >> SLF4J: Found binding in [jar:file:/home/amos/impala/toolchain/cdh_components/hbase-1.2.0-cdh5.10.0-SNAPSHOT/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> >>> >> SLF4J: Found binding in [jar:file:/home/amos/impala/toolchain/cdh_components/hadoop-2.6.0-cdh5.10.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> >>> >> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>> >>> >> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> >>> >> Executing HBase Command: hbase shell post-load-tpch-core-hbase-generated.sql
>> >>> >> 16/11/02 01:08:03 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
>> >>> >> SLF4J: Class path contains multiple SLF4J bindings.
>> >>> >> SLF4J: Found binding in [jar:file:/home/amos/impala/toolchain/cdh_components/hbase-1.2.0-cdh5.10.0-SNAPSHOT/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> >>> >> SLF4J: Found binding in [jar:file:/home/amos/impala/toolchain/cdh_components/hadoop-2.6.0-cdh5.10.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> >>> >> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>> >>> >> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> >>> >> Invalidating Metadata
>> >>> >> (load-tpch-core-impala-load-generated-kudu-none-none.sql):
>> >>> >> INSERT INTO TABLE tpch_kudu.lineitem SELECT * FROM tpch.lineitem
>> >>> >>
>> >>> >> Data Loading from Impala failed with error: ImpalaBeeswaxException:
>> >>> >>  Query aborted:
>> >>> >> Kudu error(s) reported, first error: Timed out: Failed to write batch of 2708 ops to tablet 84aa134fb6c24916aa16cf50f48ec557 after 329 attempt(s): Failed to write to server: (no server available): Write(tablet: 84aa134fb6c24916aa16cf50f48ec557, num_ops: 2708, num_attempts: 329) passed its deadline: Network error: recv error: Connection reset by peer (error 104)
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> Kudu error(s) reported, first error: Timed out: Failed to write batch of 2708 ops to tablet 84aa134fb6c24916aa16cf50f48ec557 after 329 attempt(s): Failed to write to server: (no server available): Write(tablet: 84aa134fb6c24916aa16cf50f48ec557, num_ops: 2708, num_attempts: 329) passed its deadline: Network error: recv error: Connection reset by peer (error 104)
>> >>> >> Error in Kudu table 'impala::tpch_kudu.lineitem': Timed out: Failed to write batch of 2708 ops to tablet 84aa134fb6c24916aa16cf50f48ec557 after 329 attempt(s): Failed to write to server: (no server available): Write(tablet: 84aa134fb6c24916aa16cf50f48ec557, num_ops: 2708, num_attempts: 329) passed its deadline: Network error: recv error: Connection reset by peer (error 104) (1 of 2708 similar)
>> >>> >>
>> >>> >> Traceback (most recent call last):
>> >>> >>   File "/home/amos/impala/bin/load-data.py", line 158, in exec_impala_query_from_file
>> >>> >>     result = impala_client.execute(query)
>> >>> >>   File "/home/amos/impala/tests/beeswax/impala_beeswax.py", line 173, in execute
>> >>> >>     handle = self.__execute_query(query_string.strip(), user=user)
>> >>> >>   File "/home/amos/impala/tests/beeswax/impala_beeswax.py", line 339, in __execute_query
>> >>> >>     self.wait_for_completion(handle)
>> >>> >>   File "/home/amos/impala/tests/beeswax/impala_beeswax.py", line 359, in wait_for_completion
>> >>> >>     raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
>> >>> >> ImpalaBeeswaxException: ImpalaBeeswaxException:
>> >>> >>  Query aborted:
>> >>> >> Kudu error(s) reported, first error: Timed out: Failed to write batch of 2708 ops to tablet 84aa134fb6c24916aa16cf50f48ec557 after 329 attempt(s): Failed to write to server: (no server available): Write(tablet: 84aa134fb6c24916aa16cf50f48ec557, num_ops: 2708, num_attempts: 329) passed its deadline: Network error: recv error: Connection reset by peer (error 104)
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> Kudu error(s) reported, first error: Timed out: Failed to write batch of 2708 ops to tablet 84aa134fb6c24916aa16cf50f48ec557 after 329 attempt(s): Failed to write to server: (no server available): Write(tablet: 84aa134fb6c24916aa16cf50f48ec557, num_ops: 2708, num_attempts: 329) passed its deadline: Network error: recv error: Connection reset by peer (error 104)
>> >>> >> Error in Kudu table 'impala::tpch_kudu.lineitem': Timed out: Failed to write batch of 2708 ops to tablet 84aa134fb6c24916aa16cf50f48ec557 after 329 attempt(s): Failed to write to server: (no server available): Write(tablet: 84aa134fb6c24916aa16cf50f48ec557, num_ops: 2708, num_attempts: 329) passed its deadline: Network error: recv error: Connection reset by peer (error 104) (1 of 2708 similar)
>> >>> >>
>> >>> >> Error in /home/amos/impala/testdata/bin/create-load-data.sh at line 45: while [ -n "$*" ]
>> >>> >> + cleanup
>> >>> >> + rm -rf /tmp/tmp.hMzGwIcUo3
>> >>> >> ---------------------------------------------------------------------------------------------
>> >>> >>
>> >>> >> This kinda blocks my patch's rebasing. Any help is much appreciated!
>> >>> >>
>> >>> >> regards,
>> >>> >> Amos
>> >>>
>>
>>
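
For the open-files limit discussed earlier in the thread (the nofile setting in /etc/security/limits.conf, checked with "ulimit -n"), the same check can probably be done from Python as well. This is only a minimal sketch: the 500000 threshold is an assumption based on the "e.g. 500K" suggestion, and the helper names are made up for illustration.

```python
import resource

# Assumed threshold, taken from the "e.g. 500K" suggestion in the thread.
SUGGESTED_NOFILE = 500000


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def limit_ok(limit, wanted=SUGGESTED_NOFILE):
    """True if `limit` is unlimited or at least `wanted`."""
    return limit == resource.RLIM_INFINITY or limit >= wanted


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft nofile limit:", soft, "ok" if limit_ok(soft) else "too low")
    print("hard nofile limit:", hard, "ok" if limit_ok(hard) else "too low")
```

If the soft limit still shows the old value after editing limits.conf, that would be consistent with the advice above to log out and back in.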

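The "No closing quotation" error in the run-tests.py traceback quoted above is what shlex.split raises when its input contains an unbalanced quote. A small sketch of that behaviour follows. The argument strings are invented for illustration, since the thread does not show what was actually passed to pytest.main, and try_split is a hypothetical helper.

```python
import shlex


def try_split(arg_string):
    """Split a string like a POSIX shell; return (tokens, error_message)."""
    try:
        return shlex.split(arg_string), None
    except ValueError as exc:
        # shlex raises ValueError when a quote is opened but never closed.
        return None, str(exc)


if __name__ == "__main__":
    # Balanced quotes split cleanly into tokens.
    print(try_split('-x "some dir/test.py"'))
    # An opening quote without a matching close triggers the error.
    print(try_split('--junitxml="${RESULTS_DIR}/out.xml'))
```

Passing the arguments as a list rather than a single string avoids the shell-style splitting step altogether, which may be worth trying when a quoted path is involved.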