[jira] [Assigned] (IMPALA-5802) COMPUTE STATS uses MT_DOP=4 by default
[ https://issues.apache.org/jira/browse/IMPALA-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong reassigned IMPALA-5802:
-------------------------------------
    Assignee: Tim Armstrong

> COMPUTE STATS uses MT_DOP=4 by default
> --------------------------------------
>
>                 Key: IMPALA-5802
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5802
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend, Frontend
>    Affects Versions: Impala 2.9.0
>            Reporter: Alexander Behm
>            Assignee: Tim Armstrong
>            Priority: Major
>              Labels: compute-stats
>
> Now that IMPALA-3905 has been completely addressed we should run COMPUTE STATS with MT_DOP=4 by default, regardless of file format. The motivation is consistency and speeding up COMPUTE STATS in most cases.
> This task is a continuation of IMPALA-4572.

--
This message was sent by Atlassian Jira (v8.3.2#803003)

To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-8879) Upgrade bootstrap.js
[ https://issues.apache.org/jira/browse/IMPALA-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-8879.
-----------------------------------
    Fix Version/s: Impala 3.4.0
       Resolution: Fixed

> Upgrade bootstrap.js
> --------------------
>
>                 Key: IMPALA-8879
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8879
>             Project: IMPALA
>          Issue Type: Task
>          Components: Frontend
>    Affects Versions: Impala 3.3.0
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>             Fix For: Impala 3.4.0
>
> The version we're using is quite out-of-date.
[jira] [Commented] (IMPALA-8879) Upgrade bootstrap.js
[ https://issues.apache.org/jira/browse/IMPALA-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921166#comment-16921166 ]

ASF subversion and git services commented on IMPALA-8879:
---------------------------------------------------------

Commit b14848151b5e47ab0ca0f4e8a11add0d6e6b7b64 in impala's branch refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b148481 ]

IMPALA-8879: upgrade bootstrap for debug page to 4.3.1

Also upgrade DataTables to 1.10.18 at the same time. They were obtained from https://datatables.net/download/ and https://getbootstrap.com/docs/4.3/getting-started/download/.

Included the version number in the css and js filenames to avoid potential issues with stale versions being cached (I got tripped up by this).

I had to do some additional work to get the UI to look right after the upgrade:
* added additional style classes to HTML elements for nav and breadcrumb styles - these are required in bootstrap 4.
* added styling for plan visualisation graph for it to appear similar to the old graph.
* Added styling for to get box around text.

Testing:
Manually clicked through the web UI to see if anything looked wrong or didn't function correctly.

Change-Id: Ib58f407574f590825d208424a8c0fd101b0a19a7
Reviewed-on: http://gerrit.cloudera.org:8080/14119
Reviewed-by: Tim Armstrong
Reviewed-by: Quanlong Huang
Tested-by: Impala Public Jenkins

> Upgrade bootstrap.js
> --------------------
>
>                 Key: IMPALA-8879
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8879
>             Project: IMPALA
>          Issue Type: Task
>          Components: Frontend
>    Affects Versions: Impala 3.3.0
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>
> The version we're using is quite out-of-date.
[jira] [Work started] (IMPALA-8907) TestResultSpooling.test_slow_query is flaky
[ https://issues.apache.org/jira/browse/IMPALA-8907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on IMPALA-8907 started by Sahil Takiar.
--------------------------------------------

> TestResultSpooling.test_slow_query is flaky
> -------------------------------------------
>
>                 Key: IMPALA-8907
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8907
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>
> Recently failed in an ubuntu-16.04-dockerised-tests job:
> [https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/1102/testReport/junit/query_test.test_result_spooling/TestResultSpooling/test_slow_query_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/]
> Error Message:
> {code:java}
> query_test/test_result_spooling.py:172: in test_slow_query
>     assert re.search(get_wait_time_regex, self.client.get_runtime_profile(handle)) \
> E   assert None is not None
> E    +  where None = <function search at 0x7f0da4115c08>('RowBatchGetWaitTime: [1-9]', 'Query (id=7f47e1d6a1a1c804:492214eb):\n DEBUG MODE WARNING: Query profile created while running a DEBUG buil... - OptimizationTime: 331.998ms\n - PeakMemoryUsage: 1.09 MB (1144320)\n - PrepareTime: 31.999ms\n')
> E    +    where <function search at 0x7f0da4115c08> = re.search
> E    +    and   'Query (id=7f47e1d6a1a1c804:492214eb):\n DEBUG MODE WARNING: Query profile created while running a DEBUG buil... - OptimizationTime: 331.998ms\n - PeakMemoryUsage: 1.09 MB (1144320)\n - PrepareTime: 31.999ms\n' = <bound method BeeswaxConnection.get_runtime_profile of <... at 0x7f0d94afa7d0>>(<... at 0x7f0d94afffd0>)
> E    +      where <bound method BeeswaxConnection.get_runtime_profile of <... at 0x7f0d94afa7d0>> = <... at 0x7f0d94afa7d0>.get_runtime_profile
> E    +        where <... at 0x7f0d94afa7d0> = <... at 0x7f0d94af3d50>.client
> {code}
> Stacktrace:
> {code:java}
> query_test/test_result_spooling.py:172: in test_slow_query
>     assert re.search(get_wait_time_regex, self.client.get_runtime_profile(handle)) \
> E   assert None is not None
> E    +  where None = <function search at 0x7f0da4115c08>('RowBatchGetWaitTime: [1-9]', 'Query (id=7f47e1d6a1a1c804:492214eb):\n DEBUG MODE WARNING: Query profile created while running a DEBUG buil... - OptimizationTime: 331.998ms\n - PeakMemoryUsage: 1.09 MB (1144320)\n - PrepareTime: 31.999ms\n')
> E    +    where <function search at 0x7f0da4115c08> = re.search
> E    +    and   'Query (id=7f47e1d6a1a1c804:492214eb):\n DEBUG MODE WARNING: Query profile created while running a DEBUG buil... - OptimizationTime: 331.998ms\n - PeakMemoryUsage: 1.09 MB (1144320)\n - PrepareTime: 31.999ms\n' = <bound method BeeswaxConnection.get_runtime_profile of <... at 0x7f0d94afa7d0>>(<... at 0x7f0d94afffd0>)
> E    +      where <bound method BeeswaxConnection.get_runtime_profile of <... at 0x7f0d94afa7d0>> = <... at 0x7f0d94afa7d0>.get_runtime_profile
> E    +        where <... at 0x7f0d94afa7d0> = <... at 0x7f0d94af3d50>.client
> {code}
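
The failing assertion in IMPALA-8907 searches the runtime profile for the pattern 'RowBatchGetWaitTime: [1-9]'. The following is a minimal sketch of that check in isolation, assuming nothing about the real profile beyond the pattern quoted in the trace. The helper name and the three profile fragments are hypothetical and were made up for illustration; the truncated profile in the report does not show which case actually occurred.

```python
import re

# Pattern copied from the trace above (the value passed as get_wait_time_regex).
GET_WAIT_TIME_REGEX = "RowBatchGetWaitTime: [1-9]"


def has_nonzero_get_wait_time(profile_text):
    """Return True if the counter is present and its printed value starts with 1-9.

    Hypothetical helper, not part of the Impala test suite.
    """
    return re.search(GET_WAIT_TIME_REGEX, profile_text) is not None


# Hypothetical profile fragments, invented for this sketch.
slow_profile = "   - RowBatchGetWaitTime: 1s203ms\n"
zero_profile = "   - RowBatchGetWaitTime: 0.000ns\n"
missing_profile = "   - PrepareTime: 31.999ms\n"

print(has_nonzero_get_wait_time(slow_profile))
print(has_nonzero_get_wait_time(zero_profile))
print(has_nonzero_get_wait_time(missing_profile))
```

Under this pattern the check fails both when the counter is absent and when its printed value begins with 0, so either condition would produce the "assert None is not None" failure shown above.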
[jira] [Commented] (IMPALA-7351) Add memory estimates for plan nodes and sinks with missing estimates
[ https://issues.apache.org/jira/browse/IMPALA-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921073#comment-16921073 ]

Sahil Takiar commented on IMPALA-7351:
--------------------------------------

[~bikramjeet.vig] memory estimates for IMPALA-4268 were added in IMPALA-8818 - [https://github.com/apache/impala/blob/b7dfc18c59e831fa265d14ef4f7d26e33120b67f/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java#L68]

When result spooling is disabled, the reservation is still set to {{ResourceProfile.noReservation(0)}} since no rows are actually buffered. Do we think that there should be a reservation for {{PlanRootSink}} when result spooling is disabled as well?

> Add memory estimates for plan nodes and sinks with missing estimates
> --------------------------------------------------------------------
>
>                 Key: IMPALA-7351
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7351
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Frontend
>            Reporter: Tim Armstrong
>            Assignee: Bikramjeet Vig
>            Priority: Major
>              Labels: admission-control, resource-management
>
> Many plan nodes and sinks, e.g. KuduScanNode, KuduTableSink, ExchangeNode, etc are missing memory estimates entirely.
> We should add a basic estimate for all these cases based on experiments and data from real workloads. In some cases 0 may be the right estimate (e.g. for streaming nodes like SelectNode that just pass through data) but we should remove TODOs and document the reasoning in those cases.
[jira] [Commented] (IMPALA-558) HS2::FetchResults sets hasMoreRows in many cases where no more rows are to be returned
[ https://issues.apache.org/jira/browse/IMPALA-558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921071#comment-16921071 ]

Sahil Takiar commented on IMPALA-558:
-------------------------------------

The frequency of this issue should be significantly reduced when result spooling is enabled. The call to {{BufferedPlanRootSink::Send}} no longer blocks waiting for a corresponding call to {{GetNext}}, so {{FlushFinal}} is called immediately after sending the last batch. However, even with result spooling, the issue could still occur because {{Send}} releases the lock and then {{FlushFinal}} re-acquires it before setting the sender state. So it is possible that the client calls {{GetNext}} before {{FlushFinal}} can set the state to EOS.

This could be fixed by refactoring the {{PlanRootSink}} interface so that {{Send}} takes an {{eos}} flag. This would allow {{Send}} to know whether the batch being sent is the last one, and it could then set the {{sender_state_}} flag. However, I'm not sure it's worth the effort.

> HS2::FetchResults sets hasMoreRows in many cases where no more rows are to be returned
> --------------------------------------------------------------------------------------
>
>                 Key: IMPALA-558
>                 URL: https://issues.apache.org/jira/browse/IMPALA-558
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Clients
>    Affects Versions: Impala 1.1
>            Reporter: Henry Robinson
>            Priority: Minor
>              Labels: query-lifecycle
>
> The first call to {{FetchResults}} always sets {{hasMoreRows}} even when 0 rows should be returned. The next call correctly sets {{hasMoreRows == False}}. The upshot is there's always an extra round-trip, although correctness isn't affected.
> {code}
>     execute_statement_req = TCLIService.TExecuteStatementReq()
>     execute_statement_req.sessionHandle = resp.sessionHandle
>     execute_statement_req.statement = "SELECT COUNT(*) FROM functional.alltypes WHERE 1 = 2"
>     execute_statement_resp = self.hs2_client.ExecuteStatement(execute_statement_req)
>
>     fetch_results_req = TCLIService.TFetchResultsReq()
>     fetch_results_req.operationHandle = execute_statement_resp.operationHandle
>     fetch_results_req.maxRows = 100
>     fetch_results_resp = self.hs2_client.FetchResults(fetch_results_req)
>
>     assert not fetch_results_resp.hasMoreRows  # Fails
> {code}
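
The window described in the IMPALA-558 comment, and the proposed {{eos}} flag on {{Send}}, can be illustrated with a toy model. This is only a sketch under stated assumptions: {{ToySink}} is a hypothetical Python stand-in, not Impala's {{PlanRootSink}} or {{BufferedPlanRootSink}}, and the method names merely mirror the ones mentioned in the comment. The interleaving is driven by hand in a single thread so the outcome is deterministic.

```python
import threading


class ToySink:
    """Hypothetical stand-in for a buffered root sink (not Impala code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._rows = []
        self._eos = False  # plays the role of the sender state

    def send(self, batch, eos=False):
        # With eos=True, end-of-stream is recorded under the same lock
        # acquisition that buffers the final batch.
        with self._lock:
            self._rows.extend(batch)
            if eos:
                self._eos = True

    def flush_final(self):
        # Separate lock acquisition: a reader can run between send() and this.
        with self._lock:
            self._eos = True

    def get_next(self, max_rows):
        """Return (rows, has_more_rows) for one fetch."""
        with self._lock:
            rows = self._rows[:max_rows]
            del self._rows[:max_rows]
            has_more_rows = bool(self._rows) or not self._eos
            return rows, has_more_rows


# Two-step protocol: the reader slips in between send() and flush_final().
two_step = ToySink()
two_step.send([1, 2])
first = two_step.get_next(100)    # rows returned, but has_more_rows is True
two_step.flush_final()
second = two_step.get_next(100)   # extra round trip just to learn about EOS

# Proposed shape: the last send() carries the eos flag.
one_step = ToySink()
one_step.send([1, 2], eos=True)
only = one_step.get_next(100)     # rows and has_more_rows == False together

print(first, second, only)
```

In the two-step case the client needs a second fetch to learn that the stream has ended, which is the extra round trip the issue describes. Carrying the flag on the last send removes the window in this model, at the cost of an interface change.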
[jira] [Commented] (IMPALA-1618) Impala server should always try to fulfill requested fetch size
[ https://issues.apache.org/jira/browse/IMPALA-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921060#comment-16921060 ]

Sahil Takiar commented on IMPALA-1618:
--------------------------------------

I ran the test script in the JIRA description as well and confirmed that when result spooling is enabled it always returns the requested number of rows.

> Impala server should always try to fulfill requested fetch size
> ---------------------------------------------------------------
>
>                 Key: IMPALA-1618
>                 URL: https://issues.apache.org/jira/browse/IMPALA-1618
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>    Affects Versions: Impala 2.0.1
>            Reporter: casey
>            Priority: Minor
>              Labels: usability
>             Fix For: Impala 3.4.0
>
> The thrift fetch request specifies the number of rows that it would like but the Impala server may return fewer even though more results are available.
> For example, using the default row_batch size of 1024, if the client requests 1023 rows, the first response contains 1023 rows but the second response contains only 1 row. This is because the server internally uses row_batch (1024), returns the requested count (1023) and caches the remaining row, then the next time around only uses the cache.
> In general the end user should set both the row batch size and the thrift request size. In practice the query writer setting row_batch and the driver/programmer setting fetch size may often be different people.
> There is one case that works fine now though - setting the batch size to less than the thrift req size. In this case the thrift response is always the same as batch size.
> Code example:
> {noformat}
> dev@localhost:~/impyla$ git diff
> diff --git a/impala/_rpc/hiveserver2.py b/impala/_rpc/hiveserver2.py
> index 6139002..31fdab7 100644
> --- a/impala/_rpc/hiveserver2.py
> +++ b/impala/_rpc/hiveserver2.py
> @@ -265,6 +265,7 @@ def fetch_results(service, operation_handle, hs2_protocol_version, schema=None,
>      req = TFetchResultsReq(operationHandle=operation_handle,
>                             orientation=orientation,
>                             maxRows=max_rows)
> +    print("req: " + str(max_rows))
>      resp = service.FetchResults(req)
>      err_if_rpc_not_ok(resp)
>
> @@ -273,6 +274,7 @@ def fetch_results(service, operation_handle, hs2_protocol_version, schema=None,
>               for (i, col) in enumerate(resp.results.columns)]
>      num_cols = len(tcols)
>      num_rows = len(tcols[0].values)
> +    print("rec: " + str(num_rows))
>      rows = []
>      for i in xrange(num_rows):
>          row = []
> dev@localhost:~/impyla$ cat test.py
> from impala.dbapi import connect
> conn = connect()
> cur = conn.cursor()
> cur.set_arraysize(1024)
> cur.execute("set batch_size=1025")
> cur.execute("select * from tpch.lineitem")
> while True:
>     rows = cur.fetchmany()
>     if not rows:
>         break
> cur.close()
> conn.close()
> dev@localhost:~/impyla$ python test.py | head
> Failed to import pandas
> req: 1024
> rec: 1024
> req: 1024
> rec: 1
> req: 1024
> rec: 1024
> req: 1024
> rec: 1
> req: 1024
> {noformat}
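
The alternating 1024 / 1 pattern in the IMPALA-1618 output can be reproduced with a small self-contained model of the behaviour described in the issue: the server pulls one row batch when its cache is empty, and otherwise answers only from the cache. This is a sketch of that description, not Impala's server code. {{ToyFetcher}}, {{make_batches}}, {{fetch_sizes}} and the {{fill_request}} switch are hypothetical names introduced here; {{fill_request=True}} models a server that keeps pulling batches until the request is satisfied or the stream ends.

```python
from collections import deque


def make_batches(total_rows, batch_size):
    """Yield lists of row ids in chunks of at most batch_size."""
    for start in range(0, total_rows, batch_size):
        yield list(range(start, min(start + batch_size, total_rows)))


class ToyFetcher:
    """Hypothetical model of the fetch path described in the issue."""

    def __init__(self, batches, fill_request):
        self._batches = iter(batches)
        self._cache = deque()
        self._fill_request = fill_request

    def _pull_one(self):
        # Move one producer batch into the cache; False when the stream is done.
        batch = next(self._batches, None)
        if batch is None:
            return False
        self._cache.extend(batch)
        return True

    def fetch(self, max_rows):
        # Described behaviour: only pull a new batch when nothing is cached.
        if not self._cache:
            self._pull_one()
        # Alternative: keep pulling until the request can be met in full.
        if self._fill_request:
            while len(self._cache) < max_rows and self._pull_one():
                pass
        count = min(max_rows, len(self._cache))
        return [self._cache.popleft() for _ in range(count)]


def fetch_sizes(total_rows, batch_size, max_rows, fill_request):
    """Return the number of rows delivered by each fetch until the stream ends."""
    fetcher = ToyFetcher(make_batches(total_rows, batch_size), fill_request)
    sizes = []
    while True:
        rows = fetcher.fetch(max_rows)
        if not rows:
            return sizes
        sizes.append(len(rows))


# Same settings as the test script: batch_size=1025, fetch size 1024.
print(fetch_sizes(4100, 1025, 1024, fill_request=False))
print(fetch_sizes(4100, 1025, 1024, fill_request=True))
# Batch size smaller than the request: every response equals the batch size.
print(fetch_sizes(1536, 512, 1024, fill_request=False))
```

In this model the cache-only path doubles the number of round trips for the same 4100 rows, while the fill-to-request path returns full responses until the final partial one.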
[jira] [Resolved] (IMPALA-7235) Allow the Statestore to shut down cleanly
[ https://issues.apache.org/jira/browse/IMPALA-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-7235.
-----------------------------------
    Resolution: Won't Fix

I think this is difficult because the thrift interfaces are messy, and it isn't really that important - I don't think we want to complicate test code to solve this.

> Allow the Statestore to shut down cleanly
> -----------------------------------------
>
>                 Key: IMPALA-7235
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7235
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Distributed Exec
>    Affects Versions: Impala 2.0
>            Reporter: Sailesh Mukil
>            Priority: Major
>              Labels: cleanup, statestore
>
> The Statestore class was written with the assumption that it will live for the entire lifetime of the cluster and never have to be shut down. This is true today, however, as a result of this, we have to have all our Statestore objects leak in the BE tests.
> Adding a clean shut down mechanism shouldn't be too hard, so let's do that.
[jira] [Commented] (IMPALA-7235) Allow the Statestore to shut down cleanly
[ https://issues.apache.org/jira/browse/IMPALA-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920984#comment-16920984 ]

Tim Armstrong commented on IMPALA-7235:
---------------------------------------

[~rishjain] I looked at this a while back and I think it is actually complicated and not much fun (tagging it with newbie was very optimistic). I'm going to close it since I don't think it's worth doing.

> Allow the Statestore to shut down cleanly
> -----------------------------------------
>
>                 Key: IMPALA-7235
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7235
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Distributed Exec
>    Affects Versions: Impala 2.0
>            Reporter: Sailesh Mukil
>            Priority: Major
>              Labels: cleanup, statestore
>
> The Statestore class was written with the assumption that it will live for the entire lifetime of the cluster and never have to be shut down. This is true today, however, as a result of this, we have to have all our Statestore objects leak in the BE tests.
> Adding a clean shut down mechanism shouldn't be too hard, so let's do that.
[jira] [Updated] (IMPALA-7235) Allow the Statestore to shut down cleanly
[ https://issues.apache.org/jira/browse/IMPALA-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong updated IMPALA-7235:
----------------------------------
    Labels: cleanup statestore  (was: cleanup newbie statestore)

> Allow the Statestore to shut down cleanly
> -----------------------------------------
>
>                 Key: IMPALA-7235
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7235
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Distributed Exec
>    Affects Versions: Impala 2.0
>            Reporter: Sailesh Mukil
>            Priority: Major
>              Labels: cleanup, statestore
>
> The Statestore class was written with the assumption that it will live for the entire lifetime of the cluster and never have to be shut down. This is true today, however, as a result of this, we have to have all our Statestore objects leak in the BE tests.
> Adding a clean shut down mechanism shouldn't be too hard, so let's do that.
[jira] [Resolved] (IMPALA-8851) Drop table if exists throws authorization exception when table does not exist
[ https://issues.apache.org/jira/browse/IMPALA-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Csaba Ringhofer resolved IMPALA-8851.
-------------------------------------
    Fix Version/s: Impala 3.4.0
       Resolution: Implemented

> Drop table if exists throws authorization exception when table does not exist
> -----------------------------------------------------------------------------
>
>                 Key: IMPALA-8851
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8851
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>            Priority: Major
>             Fix For: Impala 3.4.0
>
> When authorization is enabled, a {{drop table if exists}} on a non-existing table throws an authorization exception. In such a case, if the user has the required permissions to query the tables in the database, this is unnecessary and the SQL should succeed, saying the table does not exist, instead of erroring out.