[jira] [Commented] (IMPALA-10308) Fail to load metadata for table: 'iceberg_partitioned' in a scanner test with ASAN build
[ https://issues.apache.org/jira/browse/IMPALA-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17225068#comment-17225068 ] WangSheng commented on IMPALA-10308: Hi [~sql_forever], if you want to verify a specific test, you need to create these test tables manually before executing it. Alternatively, you can run $IMPALA_HOME/bin/run-all-tests.sh to execute the whole Impala test suite; the Impala server will then create the test tables automatically, because all DDL statements in functional_schema_template.sql are executed before the tests run. For more details about Impala tests, see: https://cwiki.apache.org/confluence/display/IMPALA/How+to+load%2C+run%2C+and+create+new+Impala+tests > Fail to load metadata for table: 'iceberg_partitioned' in a scanner test with > ASAN build > > > Key: IMPALA-10308 > URL: https://issues.apache.org/jira/browse/IMPALA-10308 > Project: IMPALA > Issue Type: Bug >Reporter: Qifan Chen >Priority: Major > > The following error was seen when running the scanner test against the ASAN > build. 
> {code:java} > E ImpalaBeeswaxException: ImpalaBeeswaxException: > EINNER EXCEPTION: > EMESSAGE: AnalysisException: Failed to load metadata for table: > 'iceberg_partitioned' > E CAUSED BY: TableLoadingException: Error loading metadata for Iceberg > table hdfs://localhost:20500/test-warehouse/iceberg_test/iceberg_partitioned > E CAUSED BY: IllegalArgumentException: Can not create a Path from a null > string > TestIceberg.test_iceberg_query[protocol: beeswax | exec_option: > {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, > 'disable_codegen': True, 'abort_on_error': 1, 'debug_action': > 'HDFS_SCANNER_THREAD_CHECK_SOFT_MEM_LIMIT:FAIL@0.5', > 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] > [gw2] linux2 -- Python 2.7.16 > /home/qchen/Impala/bin/../infra/python/env-gcc7.5.0/bin/python > query_test/test_scanners.py:357: in test_iceberg_query > self.run_test_case('QueryTest/iceberg-query', vector) > common/impala_test_suite.py:662: in run_test_case > result = exec_fn(query, user=test_section.get('USER', '').strip() or None) > common/impala_test_suite.py:600: in __exec_in_impala > result = self.__execute_query(target_impalad_client, query, user=user) > common/impala_test_suite.py:920: in __execute_query > return impalad_client.execute(query, user=user) > common/impala_connection.py:205: in execute > return self.__beeswax_client.execute(sql_stmt, user=user) > beeswax/impala_beeswax.py:187: in execute > handle = self.__execute_query(query_string.strip(), user=user) > beeswax/impala_beeswax.py:363: in __execute_query > handle = self.execute_query_async(query_string, user=user) > beeswax/impala_beeswax.py:357: in execute_query_async > handle = self.__do_rpc(lambda: self.imp_service.query(query,)) > beeswax/impala_beeswax.py:520: in __do_rpc > {code} > To reproduce, apply the following steps. > {code:java} > 1. Build: ${IMPALA_HOME}/buildall.sh -skiptests -ninja -asan > 2. 
Run test: > cd ${IMPALA_HOME} > $tests/run-tests.py --exploration_strategy=exhaustive > tests/query_test/test_scanners.py > {code} > Branch info. > The master branch of https://github.com/apache/impala.git. The HEAD points > at 193c2e773fa9f6772e4a7c30ed3a4f75029863f1. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-9180) Remove legacy ImpalaInternalService
[ https://issues.apache.org/jira/browse/IMPALA-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17225053#comment-17225053 ] ASF subversion and git services commented on IMPALA-9180: - Commit 1af60a15605463ab4ba00d5326d130d0a3165821 in impala's branch refs/heads/master from wzhou-code [ https://gitbox.apache.org/repos/asf?p=impala.git;h=1af60a1 ] IMPALA-9180 (part 3): Remove legacy backend port The legacy Thrift based Impala internal service has been removed so the backend port 22000 can be freed up. This patch sets the flag be_port as a REMOVED_FLAG, and all infrastructure around it is cleaned up. StatestoreSubscriber::subscriber_id is set as hostname + krpc_port. Testing: - Passed the exhaustive test. Change-Id: Ic6909a8da449b4d25ee98037b3eb459af4850dc6 Reviewed-on: http://gerrit.cloudera.org:8080/16533 Reviewed-by: Thomas Tauber-Marshall Tested-by: Impala Public Jenkins > Remove legacy ImpalaInternalService > --- > > Key: IMPALA-9180 > URL: https://issues.apache.org/jira/browse/IMPALA-9180 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 3.4.0 >Reporter: Michael Ho >Assignee: Wenzhe Zhou >Priority: Minor > > Now that IMPALA-7984 is done, the legacy Thrift based Impala internal service > can now be removed. The port 22000 can also be freed up. In addition to code > change, the doc probably needs to be updated to reflect the fact that 22000 > is no longer in use.
[jira] [Commented] (IMPALA-9767) ASAN crash during coordinator runtime filter updates
[ https://issues.apache.org/jira/browse/IMPALA-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17225025#comment-17225025 ] Joe McDonnell commented on IMPALA-9767: --- [~fangyurao] My thought is that we may not need to clear the bloom_filter_directory_ there. If we are transitioning to a terminal state, then I think it will be freed by Coordinator::ReleaseExecResources(), which waits for publishing filters to complete. [https://github.com/apache/impala/blob/master/be/src/runtime/coordinator.cc#L744] [https://github.com/apache/impala/blob/master/be/src/runtime/coordinator.cc#L1259-L1260] I'm not very familiar with this code, so take it with a grain of salt. > ASAN crash during coordinator runtime filter updates > > > Key: IMPALA-9767 > URL: https://issues.apache.org/jira/browse/IMPALA-9767 > Project: IMPALA > Issue Type: Bug >Reporter: Sahil Takiar >Assignee: Fang-Yu Rao >Priority: Major > Labels: asan, broken-build, crash > Attachments: consoleFull_asan_939.txt > > > ASAN crash output: > {code:java} > Error Message: Address Sanitizer message detected in > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/ee_tests/impalad.ERROR > Standard Error: ==4808==ERROR: AddressSanitizer: heap-use-after-free on address > 0x7f6288cbe818 at pc 0x0199f6fe bp 0x7f63c1a8b270 sp 0x7f63c1a8aa20 > READ of size 1048576 at 0x7f6288cbe818 thread T73 (rpc reactor-552) > #0 0x199f6fd in read_iovec(void*, __sanitizer::__sanitizer_iovec*, > unsigned long, unsigned long) > /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:904 > #1 0x19a1f57 in read_msghdr(void*, __sanitizer::__sanitizer_msghdr*, > long) > /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2781 > #2 0x19a46c3 in __interceptor_sendmsg > 
/mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2796 > #3 0x372034d in kudu::Socket::Writev(iovec const*, int, long*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/net/socket.cc:447:3 > #4 0x331c095 in kudu::rpc::OutboundTransfer::SendBuffer(kudu::Socket&) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/transfer.cc:227:26 > #5 0x3324da1 in kudu::rpc::Connection::WriteHandler(ev::io&, int) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/connection.cc:802:31 > #6 0x52ca4e2 in ev_invoke_pending > (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x52ca4e2) > #7 0x32aeadc in kudu::rpc::ReactorThread::InvokePendingCb(ev_loop*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:196:3 > #8 0x52cdb03 in ev_run > (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x52cdb03) > #9 0x32aecd1 in kudu::rpc::ReactorThread::RunThread() > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:497:9 > #10 0x32c08db in boost::_bi::bind_t kudu::rpc::ReactorThread>, > boost::_bi::list1 > > >::operator()() > /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222:16 > #11 0x2148c26 in boost::function0::operator()() const > /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14 > #12 0x2144b29 in kudu::Thread::SuperviseThread(void*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/thread.cc:675:3 > #13 0x7f6c0bcf4e24 in start_thread (/lib64/libpthread.so.0+0x7e24) > #14 0x7f6c0885834c in __clone (/lib64/libc.so.6+0xf834c) > 0x7f6288cbe818 is located 24 bytes inside 
of 1052640-byte region > [0x7f6288cbe800,0x7f6288dbf7e0) > freed by thread T114 here: > #0 0x1a773e0 in operator delete(void*) > /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/asan_new_delete.cc:137 > #1 0x7f6c090faed3 in __gnu_cxx::new_allocator::deallocate(char*, > unsigned long) > /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:110 > #2 0x7f6c090faed3 in std::string::_Rep::_M_destroy(std::allocator > const&) > /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:449 > #3 0x7f6c090faed3 in std::string::_Rep::_M_dispose(std::allocator > const&) > /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.h:249 > #4 0x7f6c090faed3 in
[jira] [Commented] (IMPALA-10007) Impala development environment does not support Ubuntu 20.4
[ https://issues.apache.org/jira/browse/IMPALA-10007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224962#comment-17224962 ] Qifan Chen commented on IMPALA-10007: - Just changed the status. Sorry about it. > Impala development environment does not support Ubuntu 20.4 > --- > > Key: IMPALA-10007 > URL: https://issues.apache.org/jira/browse/IMPALA-10007 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Minor > Fix For: Impala 4.0 > > > The Impala development environment supports Ubuntu up to 18.04. When trying > the environment on Ubuntu 20.04, one can get the following errors. > > From ${IMPALA_HOME}/buildall.sh: > Exception: Could not find package label for OS version: ubuntu20.04.
[jira] [Comment Edited] (IMPALA-10007) Impala development environment does not support Ubuntu 20.4
[ https://issues.apache.org/jira/browse/IMPALA-10007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224962#comment-17224962 ] Qifan Chen edited comment on IMPALA-10007 at 11/2/20, 9:17 PM: --- Just changed the status to resolved. Sorry about it. was (Author: sql_forever): Just changed the status. Sorry about it. > Impala development environment does not support Ubuntu 20.4 > --- > > Key: IMPALA-10007 > URL: https://issues.apache.org/jira/browse/IMPALA-10007 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Minor > Fix For: Impala 4.0 > > > The Impala development environment supports Ubuntu up to 18.04. When trying > the environment on Ubuntu 20.04, one can get the following errors. > > From ${IMPALA_HOME}/buildall.sh: > Exception: Could not find package label for OS version: ubuntu20.04.
[jira] [Resolved] (IMPALA-10007) Impala development environment does not support Ubuntu 20.4
[ https://issues.apache.org/jira/browse/IMPALA-10007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen resolved IMPALA-10007. - Resolution: Fixed > Impala development environment does not support Ubuntu 20.4 > --- > > Key: IMPALA-10007 > URL: https://issues.apache.org/jira/browse/IMPALA-10007 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Minor > Fix For: Impala 4.0 > > > The Impala development environment supports Ubuntu up to 18.04. When trying > the environment on Ubuntu 20.04, one can get the following errors. > > From ${IMPALA_HOME}/buildall.sh: > Exception: Could not find package label for OS version: ubuntu20.04.
[jira] [Commented] (IMPALA-10007) Impala development environment does not support Ubuntu 20.4
[ https://issues.apache.org/jira/browse/IMPALA-10007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224910#comment-17224910 ] Tim Armstrong commented on IMPALA-10007: [~sql_forever] can we resolve this now? > Impala development environment does not support Ubuntu 20.4 > --- > > Key: IMPALA-10007 > URL: https://issues.apache.org/jira/browse/IMPALA-10007 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Reporter: Qifan Chen >Assignee: Qifan Chen >Priority: Minor > Fix For: Impala 4.0 > > > The Impala development environment supports Ubuntu up to 18.04. When trying > the environment on Ubuntu 20.04, one can get the following errors. > > From ${IMPALA_HOME}/buildall.sh: > Exception: Could not find package label for OS version: ubuntu20.04.
[jira] [Updated] (IMPALA-7572) Put remote filesystems in pre-merge testing
[ https://issues.apache.org/jira/browse/IMPALA-7572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-7572: -- Priority: Minor (was: Major) > Put remote filesystems in pre-merge testing > --- > > Key: IMPALA-7572 > URL: https://issues.apache.org/jira/browse/IMPALA-7572 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: Jim Apple >Priority: Minor > > https://gerrit.cloudera.org/#/c/11435/ revealed that a patch can pass > pre-merge testing but fail on S3 or HDFS with erasure coding. We should have > fake versions of filesystems like these (and ADLS) that run during pre-merge > testing in order to find these types of failures earlier.
[jira] [Resolved] (IMPALA-3637) Merge codegen constant replacement mechanisms
[ https://issues.apache.org/jira/browse/IMPALA-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong resolved IMPALA-3637. --- Resolution: Later > Merge codegen constant replacement mechanisms > - > > Key: IMPALA-3637 > URL: https://issues.apache.org/jira/browse/IMPALA-3637 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.6.0 >Reporter: Tim Armstrong >Priority: Minor > Labels: codegen > > We currently have two similar ways to replace constants in codegen'd code: > Expr::GetConstant() and LlvmCodeGen::ReplaceCallSitesWithBoolConst(). We > should merge them so that we have a single mechanism with the functionality > of both. > E.g. > A version that takes a map where the key is a symbol and the value is a > constant, or a vector of constants: > ReplaceCallSitesWithConstants(Function* map, *) > We could then avoid the expensive Expr::GetConstant() call on the interpreted > path.
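The idea of a single replacement pass driven by a symbol-to-constant map, as proposed in IMPALA-3637, can be pictured with a toy model. The sketch below is plain Python, not Impala or LLVM code: the instruction tuples, the function name replace_call_sites_with_constants, and the example symbols are all hypothetical and exist only for illustration.

```python
from typing import Dict, List, Tuple, Union

Constant = Union[bool, int]
# A toy "instruction": (opcode, argument). A call site is ("call", symbol).
Instruction = Tuple[str, object]


def replace_call_sites_with_constants(
    body: List[Instruction], constants: Dict[str, Constant]
) -> Tuple[List[Instruction], int]:
    """Return a copy of `body` in which every call to a symbol found in
    `constants` is replaced by a ("const", value) instruction, together
    with the number of call sites that were replaced."""
    new_body: List[Instruction] = []
    replaced = 0
    for op, arg in body:
        if op == "call" and arg in constants:
            new_body.append(("const", constants[arg]))
            replaced += 1
        else:
            # Leave unrelated instructions and unknown call targets as-is.
            new_body.append((op, arg))
    return new_body, replaced


if __name__ == "__main__":
    body = [("call", "is_nullable"), ("add", 1), ("call", "slot_size")]
    print(replace_call_sites_with_constants(
        body, {"is_nullable": False, "slot_size": 8}))
```

One map-driven pass like this handles boolean and non-boolean constants alike, which is the property the ticket asks for from a merged mechanism.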
[jira] [Comment Edited] (IMPALA-7876) COMPUTE STATS TABLESAMPLE is not updating number of estimated rows
[ https://issues.apache.org/jira/browse/IMPALA-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224873#comment-17224873 ] Abhishek Rawat edited comment on IMPALA-7876 at 11/2/20, 6:33 PM: -- The core issue here is that the child query computing num_rows (table stats) uses the ROUND function, which returns its result as a *DECIMAL* type. For example: {code:java} SELECT ROUND(COUNT(*) / 0.8935390115) FROM t1 TABLESAMPLE SYSTEM(10) REPEATABLE(1598511315168){code} When setting the table stats, the CatalogOpExecutor expects the data type to be *BIGINT*. [https://github.com/apache/impala/blob/master/be/src/exec/catalog-op-executor.cc#L243] [https://github.com/apache/impala/blob/master/be/src/exec/catalog-op-executor.cc#L255] This used to work because ROUND previously returned its result as type BIGINT. This behavior was later changed for the better in this [commit|https://github.com/apache/impala/commit/8fec1911e52e40aff4cc1de17265bd6803cb13f5] There are a couple of ways to fix this issue. I am leaning towards a fix that adds a *CAST as BIGINT* to the generated SQL for the child query, since num_rows should be a BIGINT. [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java#L548] Also, it is probably best to fix this in the child query's SQL rather than adding implicit casts elsewhere in the code. was (Author: arawat): The core issue here is that the child query computing the num_rows (table stats) uses ROUND function which returns the results as a *DECIMAL* type. Eg. below. {code:java} SELECT ROUND(COUNT(*) / 0.8935390115) FROM t1 TABLESAMPLE SYSTEM(10) REPEATABLE(1598511315168){code} The CatalogOpExecutor when setting the table stats expects the data type to be *BIGINT*. 
[https://github.com/apache/impala/blob/master/be/src/exec/catalog-op-executor.cc#L243] [https://github.com/apache/impala/blob/master/be/src/exec/catalog-op-executor.cc#L255] This used to work in the past because ROUND used to return results as type BIGINT. This behavior was later changed for the better in this [commit|http://https//github.com/apache/impala/commit/8fec1911e52e40aff4cc1de17265bd6803cb13f5] There are couple of ways to fix this issue. I am leaning towards a fix which will add a *CAST as BIGINT* in the generated SQL for the child query, since num_rows should be a BIGINT. [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java#L548] Also, probably best to fix this in the child query's sql, rather than adding implicit casts else where in the code. > COMPUTE STATS TABLESAMPLE is not updating number of estimated rows > -- > > Key: IMPALA-7876 > URL: https://issues.apache.org/jira/browse/IMPALA-7876 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Andre Araujo >Assignee: Abhishek Rawat >Priority: Critical > > Running the command below seems to have no impact on the #rows stats. > {code} > [host:21000] default> COMPUTE STATS wide TABLESAMPLE SYSTEM(5); > Query: COMPUTE STATS wide TABLESAMPLE SYSTEM(100) > +---+ > | summary | > +---+ > | Updated 1 partition(s) and 103 column(s). | > +---+ > WARNINGS: Ignoring TABLESAMPLE because the effective sampling rate is 100%. 
> The minimum sample size is COMPUTE_STATS_MIN_SAMPLE_SIZE=1.00GB and the table > size 20.35GB > Fetched 1 row(s) in 43.67s > [host:21000] default> show table stats wide; > Query: show table stats wide > +---+--++-+--+---+-+---+-+ > | #Rows | Extrap #Rows | #Files | Size| Bytes Cached | Cache Replication > | Format | Incremental stats | Location| > +---+--++-+--+---+-+---+-+ > | 0 | -1 | 84 | 20.35GB | NOT CACHED | NOT CACHED > | PARQUET | false | hdfs://ns1/user/hive/warehouse/wide | > +---+--++-+--+---+-+---+-+ > Fetched 1 row(s) in 0.01s > {code}
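The fix direction described in the comment, wrapping the ROUND expression in a cast so that the child query yields a BIGINT, can be sketched as follows. This is a Python illustration only: the real query is generated in ComputeStatsStmt.java, and the helper name build_row_count_query and its parameters are hypothetical. The sampling rate is passed as a string so the literal appears in the SQL exactly as given.

```python
def build_row_count_query(table: str, sample_percent: int, seed: int,
                          effective_rate: str) -> str:
    """Build a row-count estimation query whose result type is BIGINT.

    ROUND(...) alone yields a DECIMAL, so the expression is wrapped in
    CAST(... AS BIGINT) to match what the stats consumer expects.
    """
    return (
        "SELECT CAST(ROUND(COUNT(*) / {rate}) AS BIGINT) "
        "FROM {table} TABLESAMPLE SYSTEM({pct}) REPEATABLE({seed})"
    ).format(rate=effective_rate, table=table, pct=sample_percent, seed=seed)


if __name__ == "__main__":
    print(build_row_count_query("t1", 10, 1598511315168, "0.8935390115"))
```

Putting the cast in the generated SQL keeps the type contract visible in one place, which matches the preference stated in the comment over adding implicit casts elsewhere.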
[jira] [Comment Edited] (IMPALA-7876) COMPUTE STATS TABLESAMPLE is not updating number of estimated rows
[ https://issues.apache.org/jira/browse/IMPALA-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224873#comment-17224873 ] Abhishek Rawat edited comment on IMPALA-7876 at 11/2/20, 6:29 PM: -- The core issue here is that the child query computing num_rows (table stats) uses the ROUND function, which returns its result as a *DECIMAL* type. For example: {code:java} SELECT ROUND(COUNT(*) / 0.8935390115) FROM t1 TABLESAMPLE SYSTEM(10) REPEATABLE(1598511315168){code} When setting the table stats, the CatalogOpExecutor expects the data type to be *BIGINT*. [https://github.com/apache/impala/blob/master/be/src/exec/catalog-op-executor.cc#L243] [https://github.com/apache/impala/blob/master/be/src/exec/catalog-op-executor.cc#L255] This used to work because ROUND previously returned its result as type BIGINT. This behavior was later changed for the better in this [commit|http://https//github.com/apache/impala/commit/8fec1911e52e40aff4cc1de17265bd6803cb13f5] There are a couple of ways to fix this issue. I am leaning towards a fix that adds a *CAST as BIGINT* to the generated SQL for the child query, since num_rows should be a BIGINT. [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java#L548] Also, it is probably best to fix this in the child query's SQL rather than adding implicit casts elsewhere in the code. was (Author: arawat): The core issue here is that the child query computing the num_rows (table stats) uses ROUND function which returns the results as a *DECIMAL* type. Eg. below. {code:java} SELECT ROUND(COUNT(*) / 0.8935390115) FROM t1 TABLESAMPLE SYSTEM(10) REPEATABLE(1598511315168){code} The CatalogOpExecutor when setting the table stats expects the data type to be *BIGINT*. 
[https://github.com/apache/impala/blob/master/be/src/exec/catalog-op-executor.cc#L243] [https://github.com/apache/impala/blob/master/be/src/exec/catalog-op-executor.cc#L255] This used to work in the past because ROUND used to return results as type BIGINT. This behavior was later changed for the better in this [commit|http://mpala-6230%2C%20impala-6468:%20Fix%20the%20output%20type%20of%20round()%20and%20related%20fns/]. There are couple of ways to fix this issue. I am leaning towards a fix which will add a *CAST as BIGINT* in the generated SQL for the child query, since num_rows should be a BIGINT. [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java#L548] Also, probably best to fix this in the child query's sql, rather than adding implicit casts else where in the code. > COMPUTE STATS TABLESAMPLE is not updating number of estimated rows > -- > > Key: IMPALA-7876 > URL: https://issues.apache.org/jira/browse/IMPALA-7876 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Andre Araujo >Assignee: Abhishek Rawat >Priority: Critical > > Running the command below seems to have no impact on the #rows stats. > {code} > [host:21000] default> COMPUTE STATS wide TABLESAMPLE SYSTEM(5); > Query: COMPUTE STATS wide TABLESAMPLE SYSTEM(100) > +---+ > | summary | > +---+ > | Updated 1 partition(s) and 103 column(s). | > +---+ > WARNINGS: Ignoring TABLESAMPLE because the effective sampling rate is 100%. 
> The minimum sample size is COMPUTE_STATS_MIN_SAMPLE_SIZE=1.00GB and the table > size 20.35GB > Fetched 1 row(s) in 43.67s > [host:21000] default> show table stats wide; > Query: show table stats wide > +---+--++-+--+---+-+---+-+ > | #Rows | Extrap #Rows | #Files | Size| Bytes Cached | Cache Replication > | Format | Incremental stats | Location| > +---+--++-+--+---+-+---+-+ > | 0 | -1 | 84 | 20.35GB | NOT CACHED | NOT CACHED > | PARQUET | false | hdfs://ns1/user/hive/warehouse/wide | > +---+--++-+--+---+-+---+-+ > Fetched 1 row(s) in 0.01s > {code}
[jira] [Comment Edited] (IMPALA-7876) COMPUTE STATS TABLESAMPLE is not updating number of estimated rows
[ https://issues.apache.org/jira/browse/IMPALA-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224873#comment-17224873 ] Abhishek Rawat edited comment on IMPALA-7876 at 11/2/20, 6:28 PM: -- The core issue here is that the child query computing num_rows (table stats) uses the ROUND function, which returns its result as a *DECIMAL* type. For example: {code:java} SELECT ROUND(COUNT(*) / 0.8935390115) FROM t1 TABLESAMPLE SYSTEM(10) REPEATABLE(1598511315168){code} When setting the table stats, the CatalogOpExecutor expects the data type to be *BIGINT*. [https://github.com/apache/impala/blob/master/be/src/exec/catalog-op-executor.cc#L243] [https://github.com/apache/impala/blob/master/be/src/exec/catalog-op-executor.cc#L255] This used to work because ROUND previously returned its result as type BIGINT. This behavior was later changed for the better in this [commit|http://mpala-6230%2C%20impala-6468:%20Fix%20the%20output%20type%20of%20round()%20and%20related%20fns/]. There are a couple of ways to fix this issue. I am leaning towards a fix that adds a *CAST as BIGINT* to the generated SQL for the child query, since num_rows should be a BIGINT. [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java#L548] Also, it is probably best to fix this in the child query's SQL rather than adding implicit casts elsewhere in the code. was (Author: arawat): The core issue here is that the child query computing the num_rows (table stats) uses ROUND function which returns the results as a *DECIMAL* type. Eg. below. SELECT ROUND(COUNT(*) / 0.8935390115) FROM t1 TABLESAMPLE SYSTEM(10) REPEATABLE(1598511315168) The CatalogOpExecutor when setting the table stats expects the data type to be *BIGINT*. 
[https://github.com/apache/impala/blob/master/be/src/exec/catalog-op-executor.cc#L243] [https://github.com/apache/impala/blob/master/be/src/exec/catalog-op-executor.cc#L255] This used to work in the past because ROUND used to return results as type BIGINT. This behavior was later changed for the better in this [commit|http://mpala-6230%2C%20impala-6468:%20Fix%20the%20output%20type%20of%20round()%20and%20related%20fns/]. There are couple of ways to fix this issue. I am leaning towards a fix which will add a *CAST as BIGINT* in the generated SQL for the child query, since num_rows should be a BIGINT. [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java#L548] Also, probably best to fix this in the child query's sql, rather than adding implicit casts else where in the code. > COMPUTE STATS TABLESAMPLE is not updating number of estimated rows > -- > > Key: IMPALA-7876 > URL: https://issues.apache.org/jira/browse/IMPALA-7876 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Andre Araujo >Assignee: Abhishek Rawat >Priority: Critical > > Running the command below seems to have no impact on the #rows stats. > {code} > [host:21000] default> COMPUTE STATS wide TABLESAMPLE SYSTEM(5); > Query: COMPUTE STATS wide TABLESAMPLE SYSTEM(100) > +---+ > | summary | > +---+ > | Updated 1 partition(s) and 103 column(s). | > +---+ > WARNINGS: Ignoring TABLESAMPLE because the effective sampling rate is 100%. 
> The minimum sample size is COMPUTE_STATS_MIN_SAMPLE_SIZE=1.00GB and the table > size 20.35GB > Fetched 1 row(s) in 43.67s > [host:21000] default> show table stats wide; > Query: show table stats wide > +---+--++-+--+---+-+---+-+ > | #Rows | Extrap #Rows | #Files | Size| Bytes Cached | Cache Replication > | Format | Incremental stats | Location| > +---+--++-+--+---+-+---+-+ > | 0 | -1 | 84 | 20.35GB | NOT CACHED | NOT CACHED > | PARQUET | false | hdfs://ns1/user/hive/warehouse/wide | > +---+--++-+--+---+-+---+-+ > Fetched 1 row(s) in 0.01s > {code}
[jira] [Commented] (IMPALA-7876) COMPUTE STATS TABLESAMPLE is not updating number of estimated rows
[ https://issues.apache.org/jira/browse/IMPALA-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224873#comment-17224873 ] Abhishek Rawat commented on IMPALA-7876: The core issue here is that the child query computing num_rows (table stats) uses the ROUND function, which returns its result as a *DECIMAL* type. For example: SELECT ROUND(COUNT(*) / 0.8935390115) FROM t1 TABLESAMPLE SYSTEM(10) REPEATABLE(1598511315168) When setting the table stats, the CatalogOpExecutor expects the data type to be *BIGINT*. [https://github.com/apache/impala/blob/master/be/src/exec/catalog-op-executor.cc#L243] [https://github.com/apache/impala/blob/master/be/src/exec/catalog-op-executor.cc#L255] This used to work because ROUND previously returned its result as type BIGINT. This behavior was later changed for the better in this [commit|http://mpala-6230%2C%20impala-6468:%20Fix%20the%20output%20type%20of%20round()%20and%20related%20fns/]. There are a couple of ways to fix this issue. I am leaning towards a fix that adds a *CAST as BIGINT* to the generated SQL for the child query, since num_rows should be a BIGINT. [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java#L548] Also, it is probably best to fix this in the child query's SQL rather than adding implicit casts elsewhere in the code. > COMPUTE STATS TABLESAMPLE is not updating number of estimated rows > -- > > Key: IMPALA-7876 > URL: https://issues.apache.org/jira/browse/IMPALA-7876 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Andre Araujo >Assignee: Abhishek Rawat >Priority: Critical > > Running the command below seems to have no impact on the #rows stats. > {code} > [host:21000] default> COMPUTE STATS wide TABLESAMPLE SYSTEM(5); > Query: COMPUTE STATS wide TABLESAMPLE SYSTEM(100) > +---+ > | summary | > +---+ > | Updated 1 partition(s) and 103 column(s). 
| > +---+ > WARNINGS: Ignoring TABLESAMPLE because the effective sampling rate is 100%. > The minimum sample size is COMPUTE_STATS_MIN_SAMPLE_SIZE=1.00GB and the table > size 20.35GB > Fetched 1 row(s) in 43.67s > [host:21000] default> show table stats wide; > Query: show table stats wide > +---+--++-+--+---+-+---+-+ > | #Rows | Extrap #Rows | #Files | Size| Bytes Cached | Cache Replication > | Format | Incremental stats | Location| > +---+--++-+--+---+-+---+-+ > | 0 | -1 | 84 | 20.35GB | NOT CACHED | NOT CACHED > | PARQUET | false | hdfs://ns1/user/hive/warehouse/wide | > +---+--++-+--+---+-+---+-+ > Fetched 1 row(s) in 0.01s > {code}
[jira] [Work started] (IMPALA-7876) COMPUTE STATS TABLESAMPLE is not updating number of estimated rows
[ https://issues.apache.org/jira/browse/IMPALA-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-7876 started by Abhishek Rawat.
[jira] [Commented] (IMPALA-10306) [DOC] Extend FROM_UNIXTIME() doc with Timezone offset behaviour.
[ https://issues.apache.org/jira/browse/IMPALA-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224841#comment-17224841 ]

shajini thayasingh commented on IMPALA-10306:

[~gaborkaszab] I assigned this ticket to myself and took care of the changes you had requested.

> [DOC] Extend FROM_UNIXTIME() doc with Timezone offset behaviour.
>
> Key: IMPALA-10306
> URL: https://issues.apache.org/jira/browse/IMPALA-10306
> Project: IMPALA
> Issue Type: Bug
> Components: Docs
> Reporter: Gabor Kaszab
> Assignee: shajini thayasingh
> Priority: Major
>
> FROM_UNIXTIME() accepts a format parameter that is a string that represents
> how this function should format its output timestamp. This format parameter
> can contain a timezone offset, however even if we provide a TZ offset in the
> format parameter it won't be included in the result.
> The reason is that Impala stores Timestamp without timezone in UTC and has no
> information of the timezone offset.
> I think it would be nice to clarify this in the docs so that the users won't
> expect to get specific timezone offsets from this function as a result.
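The behaviour described in IMPALA-10306 has a close analogue in Python's standard library, which may help make the point concrete. The sketch below is only an analogy and makes no claim about Impala's implementation or its exact output: a timestamp value that carries no zone information has nothing to print for an offset directive. The function names and the format string are illustrative assumptions.

```python
from datetime import datetime, timezone

# Format string containing a UTC-offset directive (%z).
FORMAT_WITH_OFFSET = "%Y-%m-%d %H:%M:%S%z"


def format_naive_utc(unix_seconds):
    """Format a UTC wall-clock time that has no zone attached."""
    naive = datetime.fromtimestamp(unix_seconds, tz=timezone.utc).replace(tzinfo=None)
    # %z expands to an empty string for a naive datetime.
    return naive.strftime(FORMAT_WITH_OFFSET)


def format_aware_utc(unix_seconds):
    """Format the same instant, but keep the zone information."""
    aware = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return aware.strftime(FORMAT_WITH_OFFSET)


if __name__ == "__main__":
    print(format_naive_utc(0))  # no offset in the output
    print(format_aware_utc(0))  # offset is rendered
```

The naive value loses the offset even though the format string asks for it, which is the same expectation gap the ticket asks the docs to spell out.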
[jira] [Assigned] (IMPALA-10306) [DOC] Extend FROM_UNIXTIME() doc with Timezone offset behaviour.
[ https://issues.apache.org/jira/browse/IMPALA-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shajini thayasingh reassigned IMPALA-10306: --- Assignee: shajini thayasingh
[jira] [Commented] (IMPALA-9767) ASAN crash during coordinator runtime filter updates
[ https://issues.apache.org/jira/browse/IMPALA-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224806#comment-17224806 ] Fang-Yu Rao commented on IMPALA-9767: - Thanks [~joemcdonnell]! I think the scenario you described is possible! I will take a much closer look at this loop and try to see if we need to use a lock to prevent a state change in this loop. Will get back to you if I have any questions. > ASAN crash during coordinator runtime filter updates > > > Key: IMPALA-9767 > URL: https://issues.apache.org/jira/browse/IMPALA-9767 > Project: IMPALA > Issue Type: Bug >Reporter: Sahil Takiar >Assignee: Fang-Yu Rao >Priority: Major > Labels: asan, broken-build, crash > Attachments: consoleFull_asan_939.txt > > > ASAN crash output: > {code:java} > Error MessageAddress Sanitizer message detected in > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/ee_tests/impalad.ERRORStandard > Error==4808==ERROR: AddressSanitizer: heap-use-after-free on address > 0x7f6288cbe818 at pc 0x0199f6fe bp 0x7f63c1a8b270 sp 0x7f63c1a8aa20 > READ of size 1048576 at 0x7f6288cbe818 thread T73 (rpc reactor-552) > #0 0x199f6fd in read_iovec(void*, __sanitizer::__sanitizer_iovec*, > unsigned long, unsigned long) > /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:904 > #1 0x19a1f57 in read_msghdr(void*, __sanitizer::__sanitizer_msghdr*, > long) > /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2781 > #2 0x19a46c3 in __interceptor_sendmsg > /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2796 > #3 0x372034d in kudu::Socket::Writev(iovec const*, int, long*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/net/socket.cc:447:3 > #4 0x331c095 in kudu::rpc::OutboundTransfer::SendBuffer(kudu::Socket&) > 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/transfer.cc:227:26 > #5 0x3324da1 in kudu::rpc::Connection::WriteHandler(ev::io&, int) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/connection.cc:802:31 > #6 0x52ca4e2 in ev_invoke_pending > (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x52ca4e2) > #7 0x32aeadc in kudu::rpc::ReactorThread::InvokePendingCb(ev_loop*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:196:3 > #8 0x52cdb03 in ev_run > (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x52cdb03) > #9 0x32aecd1 in kudu::rpc::ReactorThread::RunThread() > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:497:9 > #10 0x32c08db in boost::_bi::bind_t kudu::rpc::ReactorThread>, > boost::_bi::list1 > > >::operator()() > /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222:16 > #11 0x2148c26 in boost::function0::operator()() const > /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14 > #12 0x2144b29 in kudu::Thread::SuperviseThread(void*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/thread.cc:675:3 > #13 0x7f6c0bcf4e24 in start_thread (/lib64/libpthread.so.0+0x7e24) > #14 0x7f6c0885834c in __clone (/lib64/libc.so.6+0xf834c) > 0x7f6288cbe818 is located 24 bytes inside of 1052640-byte region > [0x7f6288cbe800,0x7f6288dbf7e0) > freed by thread T114 here: > #0 0x1a773e0 in operator delete(void*) > /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/asan_new_delete.cc:137 > #1 0x7f6c090faed3 in __gnu_cxx::new_allocator::deallocate(char*, > unsigned long) > 
/mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:110 > #2 0x7f6c090faed3 in std::string::_Rep::_M_destroy(std::allocator > const&) > /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:449 > #3 0x7f6c090faed3 in std::string::_Rep::_M_dispose(std::allocator > const&) > /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.h:249 > #4 0x7f6c090faed3 in std::string::reserve(unsigned long) > /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:511 > #5 0x2781865 in > impala::ClientRequestState::UpdateFilter(impala::UpdateFilterParamsPB const&, > kudu::rpc::RpcContext*)
[jira] [Commented] (IMPALA-9879) ASAN use-after-free with KRPC thread and Coordinator::FilterState::ApplyUpdate()
[ https://issues.apache.org/jira/browse/IMPALA-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224796#comment-17224796 ] Fang-Yu Rao commented on IMPALA-9879: - Thanks [~joemcdonnell] for the detailed analysis here and at IMPALA-9767! I will read and try to understand your analysis this week and will get back to you if I have any other ideas, since I think I also need some time to refresh my memory of how our runtime filter aggregation and distribution work. :) > ASAN use-after-free with KRPC thread and > Coordinator::FilterState::ApplyUpdate() > - > > Key: IMPALA-9879 > URL: https://issues.apache.org/jira/browse/IMPALA-9879 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 4.0 >Reporter: Joe McDonnell >Assignee: Fang-Yu Rao >Priority: Critical > Labels: broken-build > > An ASAN core run failed with the following Impalad crash: > > {noformat} > ==4348==ERROR: AddressSanitizer: heap-use-after-free on address > 0x7fc144423800 at pc 0x01a50071 bp 0x7fc26d7daa40 sp 0x7fc26d7da1f0 > READ of size 1048576 at 0x7fc144423800 thread T81 (rpc reactor-464) > #0 0x1a50070 in read_iovec(void*, __sanitizer::__sanitizer_iovec*, > unsigned long, unsigned long) > /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:904 > #1 0x1a666d1 in read_msghdr(void*, __sanitizer::__sanitizer_msghdr*, > long) > /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2781 > #2 0x1a68fb3 in __interceptor_sendmsg > /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2796 > #3 0x38074dc in kudu::Socket::Writev(iovec const*, int, long*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/net/socket.cc:447:3 > #4 0x3411fa5 in kudu::rpc::OutboundTransfer::SendBuffer(kudu::Socket&) > 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/transfer.cc:227:26 > #5 0x341aa60 in kudu::rpc::Connection::WriteHandler(ev::io&, int) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/connection.cc:802:31 > #6 0x55ef342 in ev_invoke_pending > (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x55ef342) > #7 0x33a4d8c in kudu::rpc::ReactorThread::InvokePendingCb(ev_loop*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:196:3 > #8 0x55f29ef in ev_run > (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x55f29ef) > #9 0x33a4f81 in kudu::rpc::ReactorThread::RunThread() > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:497:9 > #10 0x33b66bb in boost::_bi::bind_t kudu::rpc::ReactorThread>, > boost::_bi::list1 > > >::operator()() > /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222:16 > #11 0x21ba196 in boost::function0::operator()() const > /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14 > #12 0x21b6089 in kudu::Thread::SuperviseThread(void*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/thread.cc:675:3 > #13 0x7fcabb86be24 in start_thread (/lib64/libpthread.so.0+0x7e24) > #14 0x7fcab833f34c in __clone (/lib64/libc.so.6+0xf834c) > 0x7fc144423800 is located 0 bytes inside of 1048577-byte region > [0x7fc144423800,0x7fc144523801) > freed by thread T108 here: > #0 0x1ad6050 in operator delete(void*) > /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/asan_new_delete.cc:137 > #1 0x7fcab8c425a9 in __gnu_cxx::new_allocator::deallocate(char*, > unsigned long) > 
/mnt/source/gcc/build-7.5.0/x86_64-pc-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:125 > #2 0x7fcab8c425a9 in std::allocator_traits > >::deallocate(std::allocator&, char*, unsigned long) > /mnt/source/gcc/build-7.5.0/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/alloc_traits.h:462 > #3 0x7fcab8c425a9 in std::__cxx11::basic_string std::char_traits, std::allocator >::_M_destroy(unsigned long) > /mnt/source/gcc/build-7.5.0/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:226 > #4 0x7fcab8c425a9 in std::__cxx11::basic_string std::char_traits, std::allocator >::reserve(unsigned long) >
[jira] [Commented] (IMPALA-10308) Fail to load metadata for table: 'iceberg_partitioned' in a scanner test with ASAN build
[ https://issues.apache.org/jira/browse/IMPALA-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224793#comment-17224793 ]

Qifan Chen commented on IMPALA-10308:

Hi [~skyyws],

Sorry, I have not tried copying the test files to HDFS, and thanks a lot for trying it out.

Since the error was seen when running test_scanners.py, I wonder whether the queries executed prior to the DDL in question have an impact.

> Fail to load metadata for table: 'iceberg_partitioned' in a scanner test with
> ASAN build
>
> Key: IMPALA-10308
> URL: https://issues.apache.org/jira/browse/IMPALA-10308
> Project: IMPALA
> Issue Type: Bug
> Reporter: Qifan Chen
> Priority: Major
>
> The following error was seen when running the scanner test against the ASAN
> build.
> {code:java}
> E ImpalaBeeswaxException: ImpalaBeeswaxException:
> E INNER EXCEPTION:
> E MESSAGE: AnalysisException: Failed to load metadata for table:
> 'iceberg_partitioned'
> E CAUSED BY: TableLoadingException: Error loading metadata for Iceberg
> table hdfs://localhost:20500/test-warehouse/iceberg_test/iceberg_partitioned
> E CAUSED BY: IllegalArgumentException: Can not create a Path from a null
> string
> TestIceberg.test_iceberg_query[protocol: beeswax | exec_option:
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0,
> 'disable_codegen': True, 'abort_on_error': 1, 'debug_action':
> 'HDFS_SCANNER_THREAD_CHECK_SOFT_MEM_LIMIT:FAIL@0.5',
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]
> [gw2] linux2 -- Python 2.7.16
> /home/qchen/Impala/bin/../infra/python/env-gcc7.5.0/bin/python
> query_test/test_scanners.py:357: in test_iceberg_query
>     self.run_test_case('QueryTest/iceberg-query', vector)
> common/impala_test_suite.py:662: in run_test_case
>     result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
> common/impala_test_suite.py:600: in __exec_in_impala
>     result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:920: in __execute_query
>     return impalad_client.execute(query, user=user)
> common/impala_connection.py:205: in execute
>     return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:187: in execute
>     handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:363: in __execute_query
>     handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:357: in execute_query_async
>     handle = self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:520: in __do_rpc
> {code}
> To reproduce, apply the following steps.
> {code:java}
> 1. Build: ${IMPALA_HOME}/buildall.sh -skiptests -ninja -asan
> 2. Run test:
> cd ${IMPALA_HOME}
> tests/run-tests.py --exploration_strategy=exhaustive
> tests/query_test/test_scanners.py
> {code}
> Branch info.
> The master branch with https://github.com/apache/impala.git. The HEAD points
> at 193c2e773fa9f6772e4a7c30ed3a4f75029863f1.
[jira] [Commented] (IMPALA-10308) Fail to load metadata for table: 'iceberg_partitioned' in a scanner test with ASAN build
[ https://issues.apache.org/jira/browse/IMPALA-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224611#comment-17224611 ]

WangSheng commented on IMPALA-10308:

Hi [~sql_forever], thanks for reporting this bug. It seems that this test failed when loading iceberg_partitioned. Have you ever put the test files into HDFS manually, like this?
{code:java}
// testdata/datasets/functional/functional_schema_template.sql
`hadoop fs -mkdir -p /test-warehouse/iceberg_test && \
hadoop fs -put -f ${IMPALA_HOME}/testdata/data/iceberg_test/iceberg_partitioned /test-warehouse/iceberg_test/
{code}
I've already rebuilt the code in my own environment with ninja and ASAN, and I can create the external Iceberg table and query it normally, like this:
{code:java}
CREATE EXTERNAL TABLE functional_parquet.iceberg_partitioned (
  id INT,
  user STRING,
  action STRING,
  event_time TIMESTAMP
)
PARTITION BY SPEC (
  event_time HOUR,
  action IDENTITY
)
STORED AS ICEBERG
LOCATION 'hdfs://localhost:20500/test-warehouse/iceberg_test/iceberg_partitioned'
TBLPROPERTIES ('iceberg.catalog'='hadoop.tables', 'iceberg.file_format'='parquet');

select count(1) from functional_parquet.iceberg_partitioned;
{code}

> Fail to load metadata for table: 'iceberg_partitioned' in a scanner test with
> ASAN build
>
> Key: IMPALA-10308
> URL: https://issues.apache.org/jira/browse/IMPALA-10308
> Project: IMPALA
> Issue Type: Bug
> Reporter: Qifan Chen
> Priority: Major
>
> The following error was seen when running the scanner test against the ASAN
> build.
> {code:java}
> E ImpalaBeeswaxException: ImpalaBeeswaxException:
> E INNER EXCEPTION:
> E MESSAGE: AnalysisException: Failed to load metadata for table:
> 'iceberg_partitioned'
> E CAUSED BY: TableLoadingException: Error loading metadata for Iceberg
> table hdfs://localhost:20500/test-warehouse/iceberg_test/iceberg_partitioned
> E CAUSED BY: IllegalArgumentException: Can not create a Path from a null
> string
> TestIceberg.test_iceberg_query[protocol: beeswax | exec_option:
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0,
> 'disable_codegen': True, 'abort_on_error': 1, 'debug_action':
> 'HDFS_SCANNER_THREAD_CHECK_SOFT_MEM_LIMIT:FAIL@0.5',
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]
> [gw2] linux2 -- Python 2.7.16
> /home/qchen/Impala/bin/../infra/python/env-gcc7.5.0/bin/python
> query_test/test_scanners.py:357: in test_iceberg_query
>     self.run_test_case('QueryTest/iceberg-query', vector)
> common/impala_test_suite.py:662: in run_test_case
>     result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
> common/impala_test_suite.py:600: in __exec_in_impala
>     result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:920: in __execute_query
>     return impalad_client.execute(query, user=user)
> common/impala_connection.py:205: in execute
>     return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:187: in execute
>     handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:363: in __execute_query
>     handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:357: in execute_query_async
>     handle = self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:520: in __do_rpc
> {code}
> To reproduce, apply the following