[jira] [Created] (IMPALA-8930) Impala Doc: Document object ownership with Ranger authorization provider
Alex Rodoni created IMPALA-8930:
---
Summary: Impala Doc: Document object ownership with Ranger authorization provider
Key: IMPALA-8930
URL: https://issues.apache.org/jira/browse/IMPALA-8930
Project: IMPALA
Issue Type: Sub-task
Components: Docs
Reporter: Alex Rodoni
Assignee: Alex Rodoni

--
This message was sent by Atlassian Jira (v8.3.2#803003)
-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8929) Impala Doc: Document the query option to only set the mem limit on executors
Alex Rodoni created IMPALA-8929:
---
Summary: Impala Doc: Document the query option to only set the mem limit on executors
Key: IMPALA-8929
URL: https://issues.apache.org/jira/browse/IMPALA-8929
Project: IMPALA
Issue Type: Sub-task
Components: Docs
Reporter: Alex Rodoni
Assignee: Alex Rodoni
[jira] [Created] (IMPALA-8928) Add query option to only set the mem limit on executors
Bikramjeet Vig created IMPALA-8928:
--
Summary: Add query option to only set the mem limit on executors
Key: IMPALA-8928
URL: https://issues.apache.org/jira/browse/IMPALA-8928
Project: IMPALA
Issue Type: Improvement
Affects Versions: Product Backlog
Reporter: Bikramjeet Vig
Assignee: Bikramjeet Vig
[jira] [Created] (IMPALA-8927) Improve HTTP auth error message
Thomas Tauber-Marshall created IMPALA-8927:
--
Summary: Improve HTTP auth error message
Key: IMPALA-8927
URL: https://issues.apache.org/jira/browse/IMPALA-8927
Project: IMPALA
Issue Type: Improvement
Reporter: Thomas Tauber-Marshall
Assignee: Thomas Tauber-Marshall

Currently, when a connection fails to authenticate to the HS2 HTTP server, we log an error message that just says "HTTP auth failed."

It should be possible to include more information in this message to make it clearer why authentication failed. For example, this error is logged when SPNEGO auth is proceeding successfully but is not yet complete, which can confuse users.
[jira] [Updated] (IMPALA-8902) TestResultSpooling.test_spilling is flaky
[ https://issues.apache.org/jira/browse/IMPALA-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sahil Takiar updated IMPALA-8902:
-
Summary: TestResultSpooling.test_spilling is flaky (was: TestResultSpooling,test_spilling is flaky)

> TestResultSpooling.test_spilling is flaky
> -
>
> Key: IMPALA-8902
> URL: https://issues.apache.org/jira/browse/IMPALA-8902
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 3.4.0
> Reporter: Attila Jeges
> Assignee: Sahil Takiar
> Priority: Critical
> Fix For: Impala 3.4.0
>
> Error:
> {code:java}
> 17:45:10 FAIL query_test/test_result_spooling.py::TestResultSpooling::()::test_spilling[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]
> 17:45:10 === FAILURES ===
> 17:45:10 TestResultSpooling.test_spilling[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]
> 17:45:10 [gw1] linux2 -- Python 2.7.5 /data/jenkins/workspace/impala-cdpd-master-core-asan/repos/Impala/bin/../infra/python/env/bin/python
> 17:45:10 query_test/test_result_spooling.py:104: in test_spilling
> 17:45:10 .format(query, timeout))
> 17:45:10 E Timeout: Query select * from functional.alltypes order by id limit 1500 did not spill spooled results within the timeout 10
> 17:45:10 - Captured stderr call -
> 17:45:10 SET client_identifier=query_test/test_result_spooling.py::TestResultSpooling::()::test_spilling[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table_f;
> 17:45:10 SET min_spillable_buffer_size=8192;
> 17:45:10 SET batch_size=0;
> 17:45:10 SET num_nodes=0;
> 17:45:10 SET disable_codegen_rows_threshold=0;
> 17:45:10 SET disable_codegen=False;
> 17:45:10 SET abort_on_error=1;
> 17:45:10 SET default_spillable_buffer_size=8192;
> 17:45:10 SET max_result_spooling_mem=32768;
> 17:45:10 SET exec_single_node_rows_threshold=0;
> 17:45:10 -- executing against localhost:21000
> 17:45:10
> 17:45:10 select * from functional.alltypes order by id limit 1500;
> {code}
[jira] [Created] (IMPALA-8926) TestResultSpooling::_test_full_queue is flaky
Sahil Takiar created IMPALA-8926:

Summary: TestResultSpooling::_test_full_queue is flaky
Key: IMPALA-8926
URL: https://issues.apache.org/jira/browse/IMPALA-8926
Project: IMPALA
Issue Type: Bug
Components: Backend
Affects Versions: Impala 3.4.0
Reporter: Sahil Takiar
Assignee: Sahil Takiar

This has happened a few times. The error message is:
{code:java}
query_test/test_result_spooling.py:116: in test_full_queue_large_fetch
self._test_full_queue(vector, query, fetch_size=num_rows)
query_test/test_result_spooling.py:148: in _test_full_queue
assert re.search(send_wait_time_regex, self.client.get_runtime_profile(handle)) \
E assert None is not None
E+ where None = ('RowBatchSendWaitTime: [1-9]', 'Query (id=e948cdd2bbde9430:082830be):\n DEBUG MODE WARNING: Query profile created while running a DEBUG buil...: 0.000ns\n - WriteIoBytes: 0\n - WriteIoOps: 0 (0)\n - WriteIoWaitTime: 0.000ns\n')
E+where = re.search
E +and 'Query (id=e948cdd2bbde9430:082830be):\n DEBUG MODE WARNING: Query profile created while running a DEBUG buil...: 0.000ns\n - WriteIoBytes: 0\n - WriteIoOps: 0 (0)\n - WriteIoWaitTime: 0.000ns\n' = >()
E+ where > = .get_runtime_profile
E+where = .client
{code}
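As an illustration of what the failing assertion checks, here is a minimal Python sketch. The pattern string is the one shown in the assertion output; the two profile fragments are hypothetical samples written for this example, not taken from a real query profile. The check only passes when the printed counter value starts with a digit from 1 to 9, so a profile that reports a zero wait time makes `re.search` return `None`.

```python
import re

# Pattern quoted in the assertion output above.
SEND_WAIT_TIME_REGEX = "RowBatchSendWaitTime: [1-9]"


def has_nonzero_send_wait(profile_text):
    """Return True if the counter's printed value starts with a digit 1-9."""
    return re.search(SEND_WAIT_TIME_REGEX, profile_text) is not None


# Hypothetical profile fragments, made up for this illustration only.
zero_wait = "   - RowBatchSendWaitTime: 0.000ns\n   - WriteIoBytes: 0\n"
some_wait = "   - RowBatchSendWaitTime: 12.000ms\n   - WriteIoBytes: 0\n"

print(has_nonzero_send_wait(zero_wait))  # False
print(has_nonzero_send_wait(some_wait))  # True
```

If the sender never actually blocks on a full queue in a given run, the counter would stay at zero and the test would fail this way, which is one plausible source of flakiness.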
[jira] [Updated] (IMPALA-8925) Consider replacing ClientRequestState ResultCache with result spooling
[ https://issues.apache.org/jira/browse/IMPALA-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sahil Takiar updated IMPALA-8925:
-
Component/s: Clients

> Consider replacing ClientRequestState ResultCache with result spooling
> --
>
> Key: IMPALA-8925
> URL: https://issues.apache.org/jira/browse/IMPALA-8925
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend, Clients
> Reporter: Sahil Takiar
> Priority: Major
>
> The {{ClientRequestState}} maintains an internal results cache (which is really just a {{QueryResultSet}}) in order to provide support for the {{TFetchOrientation.FETCH_FIRST}} fetch orientation (used by Hue - see [https://github.com/apache/impala/commit/6b769d011d2016a73483f63b311e108d17d9a083]).
> The cache itself has some limitations:
> * It caches all results in a {{QueryResultSet}} with limited admission control integration
> * It has a max size; if the size is exceeded, the cache is emptied
> * It cannot spill to disk
> Result spooling could potentially replace the query result cache and provide a few benefits. It should be able to fit more rows since it can spill to disk. The memory is better tracked as well, since it integrates with both admitted and reserved memory. Hue currently sets the max result set fetch size to [https://github.com/cloudera/hue/blob/master/apps/impala/src/impala/impala_flags.py#L61]. It would be good to check how well that value works for Hue users so we can decide if replacing the current result cache with result spooling makes sense.
> This would require some changes to result spooling as well. Currently it discards rows whenever it reads them from the underlying {{BufferedTupleStream}}. It would need the ability to reset the read cursor, which would require some changes to the {{PlanRootSink}} interface as well.
[jira] [Created] (IMPALA-8925) Consider replacing ClientRequestState ResultCache with result spooling
Sahil Takiar created IMPALA-8925:

Summary: Consider replacing ClientRequestState ResultCache with result spooling
Key: IMPALA-8925
URL: https://issues.apache.org/jira/browse/IMPALA-8925
Project: IMPALA
Issue Type: Improvement
Components: Backend
Reporter: Sahil Takiar

The {{ClientRequestState}} maintains an internal results cache (which is really just a {{QueryResultSet}}) in order to provide support for the {{TFetchOrientation.FETCH_FIRST}} fetch orientation (used by Hue - see [https://github.com/apache/impala/commit/6b769d011d2016a73483f63b311e108d17d9a083]).

The cache itself has some limitations:
* It caches all results in a {{QueryResultSet}} with limited admission control integration
* It has a max size; if the size is exceeded, the cache is emptied
* It cannot spill to disk

Result spooling could potentially replace the query result cache and provide a few benefits. It should be able to fit more rows since it can spill to disk. The memory is better tracked as well, since it integrates with both admitted and reserved memory. Hue currently sets the max result set fetch size to [https://github.com/cloudera/hue/blob/master/apps/impala/src/impala/impala_flags.py#L61]. It would be good to check how well that value works for Hue users so we can decide if replacing the current result cache with result spooling makes sense.

This would require some changes to result spooling as well. Currently it discards rows whenever it reads them from the underlying {{BufferedTupleStream}}. It would need the ability to reset the read cursor, which would require some changes to the {{PlanRootSink}} interface as well.
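To make the "reset the read cursor" idea concrete, here is a rough Python sketch. It is a toy in-memory model only: the class name `ResettableRowSpool` and its methods are hypothetical, and nothing here reflects the real {{BufferedTupleStream}} or {{PlanRootSink}} interfaces. The point it shows is the behavioral difference the issue describes: rows are kept after being read, so a FETCH_FIRST-style request can rewind instead of relying on a separate cache.

```python
class ResettableRowSpool:
    """Toy stand-in for a spooled result stream (hypothetical, in memory only)."""

    def __init__(self):
        self._rows = []      # all rows produced by the query so far
        self._read_pos = 0   # index of the next row to hand to the client

    def add_rows(self, rows):
        """Append rows produced by the query."""
        self._rows.extend(rows)

    def fetch(self, max_rows):
        """Return up to max_rows rows and advance the cursor; rows are not discarded."""
        batch = self._rows[self._read_pos:self._read_pos + max_rows]
        self._read_pos += len(batch)
        return batch

    def reset_read_cursor(self):
        """Rewind so that the next fetch starts again from the first row."""
        self._read_pos = 0


spool = ResettableRowSpool()
spool.add_rows([1, 2, 3])
print(spool.fetch(2))      # [1, 2]
spool.reset_read_cursor()  # what a FETCH_FIRST request would trigger in this model
print(spool.fetch(2))      # [1, 2]
```

A real implementation would also have to decide how long already-read rows stay pinned or spilled, which this sketch does not attempt to model.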
[jira] [Created] (IMPALA-8924) DCHECK(!closed_) in SpillableRowBatchQueue::IsEmpty
Sahil Takiar created IMPALA-8924:

Summary: DCHECK(!closed_) in SpillableRowBatchQueue::IsEmpty
Key: IMPALA-8924
URL: https://issues.apache.org/jira/browse/IMPALA-8924
Project: IMPALA
Issue Type: Sub-task
Components: Backend
Affects Versions: Impala 3.4.0
Reporter: Sahil Takiar
Assignee: Sahil Takiar

When running exhaustive tests with result spooling enabled, there are several impalad crashes with the following stack:
{code:java}
#0 0x7f5e797541f7 in raise () from /lib64/libc.so.6
#1 0x7f5e797558e8 in abort () from /lib64/libc.so.6
#2 0x04cc5834 in google::DumpStackTraceAndExit() ()
#3 0x04cbc28d in google::LogMessage::Fail() ()
#4 0x04cbdb32 in google::LogMessage::SendToLog() ()
#5 0x04cbbc67 in google::LogMessage::Flush() ()
#6 0x04cbf22e in google::LogMessageFatal::~LogMessageFatal() ()
#7 0x029a16cd in impala::SpillableRowBatchQueue::IsEmpty (this=0x13d504e0) at /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/spillable-row-batch-queue.cc:128
#8 0x025f5610 in impala::BufferedPlanRootSink::IsQueueEmpty (this=0x13943000) at /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/buffered-plan-root-sink.h:147
#9 0x025f4e81 in impala::BufferedPlanRootSink::GetNext (this=0x13943000, state=0x13d2a1c0, results=0x173c8520, num_results=-1, eos=0xd30cde1) at /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/buffered-plan-root-sink.cc:158
#10 0x0294ef4d in impala::Coordinator::GetNext (this=0xe4ed180, results=0x173c8520, max_rows=-1, eos=0xd30cde1) at /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/coordinator.cc:683
#11 0x02251043 in impala::ClientRequestState::FetchRowsInternal (this=0xd30c800, max_rows=-1, fetched_rows=0x173c8520) at /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/service/client-request-state.cc:959
#12 0x022503e7 in impala::ClientRequestState::FetchRows (this=0xd30c800, max_rows=-1, fetched_rows=0x173c8520) at /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/service/client-request-state.cc:851
#13 0x0226a36d in impala::ImpalaServer::FetchInternal (this=0x12d14800, request_state=0xd30c800, start_over=false, fetch_size=-1, query_results=0x7f5daf861138) at /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/service/impala-beeswax-server.cc:582
#14 0x02264970 in impala::ImpalaServer::fetch (this=0x12d14800, query_results=..., query_handle=..., start_over=false, fetch_size=-1) at /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/service/impala-beeswax-server.cc:188
#15 0x027caf09 in beeswax::BeeswaxServiceProcessor::process_fetch (this=0x12d6fc20, seqid=0, iprot=0x119f5780, oprot=0x119f56c0, callContext=0xdf92060) at /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/generated-sources/gen-cpp/BeeswaxService.cpp:3398
#16 0x027c94e6 in beeswax::BeeswaxServiceProcessor::dispatchCall (this=0x12d6fc20, iprot=0x119f5780, oprot=0x119f56c0, fname=..., seqid=0, callContext=0xdf92060) at /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/generated-sources/gen-cpp/BeeswaxService.cpp:3200
#17 0x02796f13 in impala::ImpalaServiceProcessor::dispatchCall (this=0x12d6fc20, iprot=0x119f5780, oprot=0x119f56c0, fname=..., seqid=0, callContext=0xdf92060) at /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/generated-sources/gen-cpp/ImpalaService.cpp:1824
#18 0x01b3cee4 in apache::thrift::TDispatchProcessor::process (this=0x12d6fc20, in=..., out=..., connectionContext=0xdf92060) at /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/thrift-0.9.3-p7/include/thrift/TDispatchProcessor.h:121
#19 0x01f9bf28 in apache::thrift::server::TAcceptQueueServer::Task::run (this=0xdf92000) at /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/rpc/TAcceptQueueServer.cpp:84
#20 0x01f9166d in impala::ThriftThread::RunRunnable (this=0x116ddfc0, runnable=..., promise=0x7f5db0862e90) at /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/rpc/thrift-thread.cc:74
#21 0x01f92d93 in boost::_mfi::mf2, impala::Promise*>::operator() (this=0x121e7800, p=0x116ddfc0, a1=..., a2=0x7f5db0862e90) at /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/mem_fn_template.hpp:280
#22 0x01f92c29 in boost::_bi::list3, boost::_bi::value >, boost::_bi::value*> >::operator(), impala::Promise*>, boost::_bi::list0> (this=0x121e7810, f=..., a=...) at
[jira] [Commented] (IMPALA-8508) Use Python 3 from toolchain for impala-python
[ https://issues.apache.org/jira/browse/IMPALA-8508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924412#comment-16924412 ]

Tim Armstrong commented on IMPALA-8508:
---
Here's a commit that adds it to the toolchain - https://gerrit.cloudera.org/#/c/14161/

> Use Python 3 from toolchain for impala-python
> -
>
> Key: IMPALA-8508
> URL: https://issues.apache.org/jira/browse/IMPALA-8508
> Project: IMPALA
> Issue Type: Improvement
> Components: Infrastructure
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
> Attachments: 0001-WIP-IMPALA-8508-download-Python-2.7-from-toolchain-i.patch
>
> We should standardise on a single python version to use for tests and other infrastructure. Python 2.7 is going EOL soon.
> I started adding it to the toolchain - https://gerrit.cloudera.org/#/c/14161/
[jira] [Resolved] (IMPALA-8922) When startup independent impalad daemon that trys to open transport for localhost:24000
[ https://issues.apache.org/jira/browse/IMPALA-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-8922.
---
Resolution: Won't Fix

The packaging and startup scripts are from Cloudera, not Apache Impala. We generally recommend only deploying one Impala daemon per host anyway. You can deploy multiple daemons by manually configuring ports and it works, but resource management may not behave exactly as expected without additional tuning.

> When startup independent impalad daemon that trys to open transport for localhost:24000
> ---
>
> Key: IMPALA-8922
> URL: https://issues.apache.org/jira/browse/IMPALA-8922
> Project: IMPALA
> Issue Type: Bug
> Reporter: shaozhipeng
> Priority: Major
>
> I installed impala-server-3.2.0+cdh6.3.0-1279813.el7.x86_64 on a new server node. (The other impala-server, impala-state, and impala-catalog are installed on another server node, slave3, and are running healthily.)
>
> On the new server, the impalad daemon tries to open a transport to localhost:24000 at startup.
>
> The catalog and state hosts are configured in the file /etc/default/impala.
>
> Output of ps -ef|grep impala:
> /usr/lib/impala/sbin/impalad -log_dir=/sumpay/cdh-impala/logs -catalog_service_host=slave3 -state_store_port=24000 -use_statestore -state_store_host=slave3 -be_port=22000 -kudu_master_hosts=slave1:7051,slave2:7051,slave3:7051
[jira] [Created] (IMPALA-8923) Don't need synchronized in HBaseTable.getEstimatedRowStats
Quanlong Huang created IMPALA-8923:
--
Summary: Don't need synchronized in HBaseTable.getEstimatedRowStats
Key: IMPALA-8923
URL: https://issues.apache.org/jira/browse/IMPALA-8923
Project: IMPALA
Issue Type: Improvement
Components: Frontend
Affects Versions: Impala 3.2.0, Impala 3.1.0, Impala 2.12.0, Impala 3.0, Impala 2.11.0, Impala 2.10.0, Impala 2.9.0, Impala 2.7.1, Impala 2.8.0, Impala 2.7.0, Impala 3.3.0
Reporter: Quanlong Huang
Assignee: Quanlong Huang

HBaseTable.getEstimatedRowStats() estimates the number of rows and the row size by sampling the HBase table in the target key range. It requires HBase RPCs, so it can be slow.

Currently, HBaseTable.getEstimatedRowStats() is marked as synchronized. The purpose was to protect the HTable (old HBase API) object in legacy code (before commit [cf9d248|https://github.com/apache/impala/commit/cf9d2485dd4e6544f6f1f407e2ad0b43eba31874]). However, after commit [cf9d248|https://github.com/apache/impala/commit/cf9d2485dd4e6544f6f1f407e2ad0b43eba31874], we create an org.apache.hadoop.hbase.client.Table object for each task (see the comments and usages of FeHBaseTable.Util.getHBaseTable()). So we no longer need the "synchronized" marker in HBaseTable.getEstimatedRowStats().

Keeping the "synchronized" marker is also harmful. In a high-QPS workload, queries on the same table wait to enter this method and spend a lot of time waiting (if this method is comparably slow). This can be reproduced by manually adding latency (e.g. 100ms) in FeHBaseTable.Util.getEstimatedRowStats() and running concurrent queries on the same HBase table. In my experiment, removing "synchronized" gave a 40% improvement in 95th-percentile query time.
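The contention effect described above can be reproduced in miniature. The sketch below is a Python analogue, not Impala's Java code: a `threading.Lock` stands in for the "synchronized" marker, `time.sleep` stands in for the HBase sampling RPCs, and the class names, the latency constant, and the returned numbers are all hypothetical. It only illustrates that callers sharing one lock finish one after another, while independent callers overlap.

```python
import threading
import time

SIMULATED_RPC_LATENCY_S = 0.05  # hypothetical stand-in for the HBase sampling RPCs


class SerializedStats:
    """Mimics a synchronized method: one caller at a time per table object."""

    def __init__(self):
        self._lock = threading.Lock()

    def get_estimated_row_stats(self):
        with self._lock:
            time.sleep(SIMULATED_RPC_LATENCY_S)
            return (1000, 64)  # made-up (row count, row size) pair


class ConcurrentStats:
    """No shared lock: each caller does its own slow work independently."""

    def get_estimated_row_stats(self):
        time.sleep(SIMULATED_RPC_LATENCY_S)
        return (1000, 64)  # made-up (row count, row size) pair


def time_concurrent_calls(table, num_queries=4):
    """Run num_queries callers at once and return the wall-clock time taken."""
    threads = [threading.Thread(target=table.get_estimated_row_stats)
               for _ in range(num_queries)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start


if __name__ == "__main__":
    print("serialized: %.3fs" % time_concurrent_calls(SerializedStats()))
    print("concurrent: %.3fs" % time_concurrent_calls(ConcurrentStats()))
```

With four callers, the serialized variant should take roughly four times the simulated latency and the unlocked variant roughly one, which mirrors why planning time grows with concurrency on the same table.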
[jira] [Assigned] (IMPALA-8498) Write column index for floating types when NaN is not present
[ https://issues.apache.org/jira/browse/IMPALA-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltán Borók-Nagy reassigned IMPALA-8498:
-
Assignee: Norbert Luksa (was: Zoltán Borók-Nagy)

> Write column index for floating types when NaN is not present
> -
>
> Key: IMPALA-8498
> URL: https://issues.apache.org/jira/browse/IMPALA-8498
> Project: IMPALA
> Issue Type: Bug
> Reporter: Zoltán Borók-Nagy
> Assignee: Norbert Luksa
> Priority: Major
> Labels: ramp-up
>
> IMPALA-7304 disabled column index writing for floating point columns until PARQUET-1222 is resolved.
> PARQUET-1222 is responsible for defining a total order for floating values, but the problematic values are only the NaNs. Therefore we can write the column index if NaNs are not present in the data. Parquet-MR also does this, following the principles in [https://github.com/apache/parquet-format/blob/75eb7a7b84e6e62bfb09668b6d8d40b12597456e/src/main/thrift/parquet.thrift#L827-L834]
>
> Impala should follow this behavior, and also when storing zeroes, it should store -0.0 as minimum and +0.0 as maximum.
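The rule in the quoted description can be sketched as follows. This is a Python illustration under stated assumptions, not Impala's writer code: the function name `page_min_max` is hypothetical, a return value of `None` is used here to mean "do not write min/max for this page", and treating an empty page the same way is an assumption of this sketch. It shows the two points from the issue: skip the statistics when any NaN is present, and widen a zero bound to -0.0 for the minimum and +0.0 for the maximum.

```python
import math


def page_min_max(values):
    """Return (min, max) for a page of floats, or None if the bounds should be skipped.

    None is returned when the page is empty or contains a NaN (assumption of
    this sketch). A zero minimum is stored as -0.0 and a zero maximum as +0.0,
    so readers that order the two zeroes differently still see a safe range.
    """
    if not values or any(math.isnan(v) for v in values):
        return None
    lo, hi = min(values), max(values)
    if lo == 0.0:   # true for both +0.0 and -0.0
        lo = -0.0
    if hi == 0.0:   # true for both +0.0 and -0.0
        hi = 0.0
    return lo, hi


print(page_min_max([0.0, 1.5]))           # (-0.0, 1.5)
print(page_min_max([-2.0, -0.0]))         # (-2.0, 0.0)
print(page_min_max([1.0, float("nan")]))  # None
```

The sign of a zero bound is checked with `math.copysign` in the assertions below, because `-0.0 == 0.0` is true under ordinary comparison.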
[jira] [Assigned] (IMPALA-8498) Write column index for floating types when NaN is not present
[ https://issues.apache.org/jira/browse/IMPALA-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltán Borók-Nagy reassigned IMPALA-8498: - Assignee: Zoltán Borók-Nagy Write column index for floating types when NaN is not present Key: IMPALA-8498 URL: https://issues.apache.org/jira/browse/IMPALA-8498 Project: IMPALA Issue Type: Bug Reporter: Zoltán Borók-Nagy Assignee: Zoltán Borók-Nagy Priority: Major Labels: ramp-up IMPALA-7304 disabled column index writing for floating point columns until PARQUET-1222 is resolved. PARQUET-1222 is responsible for defining a total order for floating values, but the problematic values are only the NaNs. Therefore we can write the column index if NaNs are not present in the data. Parquet-MR also does this, following the principles in [https://github.com/apache/parquet-format/blob/75eb7a7b84e6e62bfb09668b6d8d40b12597456e/src/main/thrift/parquet.thrift#L827-L834] Impala should follow this behavior, and also when storing zeroes, it should store -0.0 as minimum and +0.0 as maximum. -- This message was sent by Atlassian Jira (v8.3.2#803003)
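The rule proposed in IMPALA-8498 can be sketched in a few lines of Java. This is an illustration only, not Impala or Parquet-MR code: the class FloatColumnIndexSketch and its methods are hypothetical, a page is modelled as a plain double[] array, and returning an empty result for an empty page is an assumption of this sketch. It shows the two points from the issue: no min/max bounds when a NaN is present, and zero bounds normalized to -0.0 for the minimum and +0.0 for the maximum.

```java
import java.util.Optional;

// Hypothetical sketch, not Impala or Parquet-MR code.
final class FloatColumnIndexSketch {
  // Returns {min, max} for one page, or empty when no bounds should be written.
  static Optional<double[]> pageBounds(double[] values) {
    if (values.length == 0) return Optional.empty();  // assumption: nothing to index
    double min = Double.POSITIVE_INFINITY;
    double max = Double.NEGATIVE_INFINITY;
    for (double v : values) {
      if (Double.isNaN(v)) return Optional.empty();   // NaN present: skip the index
      if (v < min) min = v;
      if (v > max) max = v;
    }
    // -0.0 == 0.0 compares true, so the sign of a zero bound found by the
    // loop is arbitrary. Normalize it so the bounds cover both zeroes.
    if (min == 0.0) min = -0.0;
    if (max == 0.0) max = 0.0;
    return Optional.of(new double[] {min, max});
  }

  // True only for negative zero, which == cannot distinguish from +0.0.
  static boolean isNegativeZero(double d) {
    return Double.doubleToRawLongBits(d) == Double.doubleToRawLongBits(-0.0);
  }

  public static void main(String[] args) {
    double[][] pages = {
        {0.0, 1.5},          // zero as minimum
        {-2.0, -0.0},        // zero as maximum
        {1.0, Double.NaN},   // NaN present
    };
    for (double[] page : pages) {
      System.out.println(pageBounds(page)
          .map(b -> b[0] + " .. " + b[1])
          .orElse("no column index bounds"));
    }
  }
}
```

Widening a zero bound this way means a reader that compares with either sign of zero still sees a range covering both -0.0 and +0.0 values in the page.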
[jira] [Created] (IMPALA-8922) When startup independent impalad daemon that trys to open transport for localhost:24000
shaozhipeng created IMPALA-8922: --- Summary: When startup independent impalad daemon that trys to open transport for localhost:24000 Key: IMPALA-8922 URL: https://issues.apache.org/jira/browse/IMPALA-8922 Project: IMPALA Issue Type: Bug Reporter: shaozhipeng I installed impala-server-3.2.0+cdh6.3.0-1279813.el7.x86_64 on a new server node. (The other impala-server, impala-state, and impala-catalog services are installed on another server node, slave3, and are running healthy.) On the new server, the impalad daemon tries to open a transport to localhost:24000 at startup. The catalog and statestore hosts were configured in the file /etc/default/impala. Output of ps -ef|grep impala: /usr/lib/impala/sbin/impalad -log_dir=/sumpay/cdh-impala/logs -catalog_service_host=slave3 -state_store_port=24000 -use_statestore -state_store_host=slave3 -be_port=22000 -kudu_master_hosts=slave1:7051,slave2:7051,slave3:7051 -- This message was sent by Atlassian Jira (v8.3.2#803003)