[jira] [Assigned] (IMPALA-5802) COMPUTE STATS uses MT_DOP=4 by default

2019-09-02 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-5802:
-

Assignee: Tim Armstrong

> COMPUTE STATS uses MT_DOP=4 by default
> --
>
> Key: IMPALA-5802
> URL: https://issues.apache.org/jira/browse/IMPALA-5802
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend, Frontend
>Affects Versions: Impala 2.9.0
>Reporter: Alexander Behm
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: compute-stats
>
> Now that IMPALA-3905 has been completely addressed we should run COMPUTE 
> STATS with MT_DOP=4 by default, regardless of file format. The motivation is 
> consistency and speeding up COMPUTE STATS in most cases.
> This task is a continuation of IMPALA-4572.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-8879) Upgrade bootstrap.js

2019-09-02 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-8879.
---
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Upgrade bootstrap.js
> 
>
> Key: IMPALA-8879
> URL: https://issues.apache.org/jira/browse/IMPALA-8879
> Project: IMPALA
>  Issue Type: Task
>  Components: Frontend
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> The version we're using is quite out-of-date.






[jira] [Commented] (IMPALA-8879) Upgrade bootstrap.js

2019-09-02 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921166#comment-16921166
 ] 

ASF subversion and git services commented on IMPALA-8879:
-

Commit b14848151b5e47ab0ca0f4e8a11add0d6e6b7b64 in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b148481 ]

IMPALA-8879: upgrade bootstrap for debug page to 4.3.1

Also upgrade DataTables to 1.10.18 at the same time.

They were obtained from https://datatables.net/download/
and https://getbootstrap.com/docs/4.3/getting-started/download/.

Included the version number in the css and js filenames to
avoid potential issues with stale versions being cached
(I got tripped up by this).
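
The cache-busting idea above can be sketched as follows. This is only an illustration, assuming the version is inserted before the first extension; the helper name `versioned_asset_name` and the resulting file names are hypothetical and are not taken from the commit.

```python
def versioned_asset_name(filename: str, version: str) -> str:
    """Insert a version string before the first extension of an asset name.

    Hypothetical helper: a new version yields a new file name, so a browser
    cannot serve a stale cached copy under the old name.
    """
    # "bootstrap.min.css" -> stem "bootstrap", extensions "min.css"
    stem, dot, extensions = filename.partition(".")
    return f"{stem}-{version}{dot}{extensions}"


print(versioned_asset_name("bootstrap.min.css", "4.3.1"))
print(versioned_asset_name("datatables.min.js", "1.10.18"))
```

Because the version is part of the name, the HTML templates have to reference the new file names after every upgrade.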

I had to do some additional work to get the UI to look right
after the upgrade:
* added additional style classes to HTML elements for nav and
  breadcrumb styles - these are required in bootstrap 4.
* added styling for plan visualisation graph for it to
  appear similar to the old graph.
* Added styling for  to get box around text.

Testing:
Manually clicked through the web UI to see if anything looked
wrong or didn't function correctly.

Change-Id: Ib58f407574f590825d208424a8c0fd101b0a19a7
Reviewed-on: http://gerrit.cloudera.org:8080/14119
Reviewed-by: Tim Armstrong 
Reviewed-by: Quanlong Huang 
Tested-by: Impala Public Jenkins 


> Upgrade bootstrap.js
> 
>
> Key: IMPALA-8879
> URL: https://issues.apache.org/jira/browse/IMPALA-8879
> Project: IMPALA
>  Issue Type: Task
>  Components: Frontend
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>
> The version we're using is quite out-of-date.






[jira] [Work started] (IMPALA-8907) TestResultSpooling.test_slow_query is flaky

2019-09-02 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8907 started by Sahil Takiar.

> TestResultSpooling.test_slow_query is flaky
> ---
>
> Key: IMPALA-8907
> URL: https://issues.apache.org/jira/browse/IMPALA-8907
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> Recently failed in an ubuntu-16.04-dockerised-tests job: 
> [https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/1102/testReport/junit/query_test.test_result_spooling/TestResultSpooling/test_slow_query_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/]
> Error Message:
> {code:java}
> query_test/test_result_spooling.py:172: in test_slow_query assert 
> re.search(get_wait_time_regex, self.client.get_runtime_profile(handle)) \ E   
> assert None is not None E+  where None =  0x7f0da4115c08>('RowBatchGetWaitTime: [1-9]', 'Query 
> (id=7f47e1d6a1a1c804:492214eb):\n  DEBUG MODE WARNING: Query profile 
> created while running a DEBUG buil...  - OptimizationTime: 331.998ms\n
>- PeakMemoryUsage: 1.09 MB (1144320)\n   - PrepareTime: 
> 31.999ms\n') E+where  = re.search 
> E+and   'Query (id=7f47e1d6a1a1c804:492214eb):\n  DEBUG MODE 
> WARNING: Query profile created while running a DEBUG buil...  - 
> OptimizationTime: 331.998ms\n   - PeakMemoryUsage: 1.09 MB 
> (1144320)\n   - PrepareTime: 31.999ms\n' =  BeeswaxConnection.get_runtime_profile of 
>  0x7f0d94afa7d0>>( 0x7f0d94afffd0>) E+  where  BeeswaxConnection.get_runtime_profile of 
> > 
> =  0x7f0d94afa7d0>.get_runtime_profile E+where 
>  = 
> .client 
> {code}
> Stacktrace:
> {code:java}
> query_test/test_result_spooling.py:172: in test_slow_query
> assert re.search(get_wait_time_regex, 
> self.client.get_runtime_profile(handle)) \
> E   assert None is not None
> E+  where None =  0x7f0da4115c08>('RowBatchGetWaitTime: [1-9]', 'Query 
> (id=7f47e1d6a1a1c804:492214eb):\n  DEBUG MODE WARNING: Query profile 
> created while running a DEBUG buil...  - OptimizationTime: 331.998ms\n
>- PeakMemoryUsage: 1.09 MB (1144320)\n   - PrepareTime: 
> 31.999ms\n')
> E+where  = re.search
> E+and   'Query (id=7f47e1d6a1a1c804:492214eb):\n  DEBUG MODE 
> WARNING: Query profile created while running a DEBUG buil...  - 
> OptimizationTime: 331.998ms\n   - PeakMemoryUsage: 1.09 MB 
> (1144320)\n   - PrepareTime: 31.999ms\n' =  BeeswaxConnection.get_runtime_profile of 
>  0x7f0d94afa7d0>>( 0x7f0d94afffd0>)
> E+  where  > 
> =  0x7f0d94afa7d0>.get_runtime_profile
> E+where  at 0x7f0d94afa7d0> =  0x7f0d94af3d50>.client {code}






[jira] [Commented] (IMPALA-7351) Add memory estimates for plan nodes and sinks with missing estimates

2019-09-02 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921073#comment-16921073
 ] 

Sahil Takiar commented on IMPALA-7351:
--

[~bikramjeet.vig] memory estimates for IMPALA-4268 were added in IMPALA-8818 - 
[https://github.com/apache/impala/blob/b7dfc18c59e831fa265d14ef4f7d26e33120b67f/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java#L68]

When result spooling is disabled, the reservation is still set to 
{{ResourceProfile.noReservation(0)}} since no rows are actually buffered. Do we 
think that there should be a reservation for {{PlanRootSink}} when result 
spooling is disabled as well?

> Add memory estimates for plan nodes and sinks with missing estimates
> 
>
> Key: IMPALA-7351
> URL: https://issues.apache.org/jira/browse/IMPALA-7351
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: admission-control, resource-management
>
> Many plan nodes and sinks, e.g. KuduScanNode, KuduTableSink, ExchangeNode, 
> etc., are missing memory estimates entirely. 
> We should add a basic estimate for all these cases based on experiments and 
> data from real workloads. In some cases 0 may be the right estimate (e.g. for 
> streaming nodes like SelectNode that just pass through data) but we should 
> remove TODOs and document the reasoning in those cases.






[jira] [Commented] (IMPALA-558) HS2::FetchResults sets hasMoreRows in many cases where no more rows are to be returned

2019-09-02 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921071#comment-16921071
 ] 

Sahil Takiar commented on IMPALA-558:
-

The frequency of this issue should be significantly reduced when result 
spooling is enabled. The call to {{BufferedPlanRootSink::Send}} no longer 
blocks waiting for a corresponding call to {{GetNext}}, so {{FlushFinal}} is 
called immediately after sending the last batch.

However, even with result spooling, the issue could still occur because 
{{Send}} releases the lock and then {{FlushFinal}} re-acquires it before 
setting the sender state. So it is possible that the client calls {{GetNext}} 
before {{FlushFinal}} can set the state to EOS.

This could be fixed by re-factoring the {{PlanRootSink}} interface so that 
{{Send}} takes in an {{eos}} flag. This would allow {{Send}} to know if the 
batch being sent is the last one. It could then set the {{sender_state_}} 
flag. However, I'm not sure it's worth the effort.
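
A minimal sketch of the proposed interface change, written in Python rather than Impala's C++ and using a toy class that is purely hypothetical (`SpoolingSinkSketch` is not the real `BufferedPlanRootSink`). The point it illustrates is that when `send` receives an `eos` flag, it can queue the last batch and record end-of-stream under one lock acquisition, so a reader never sees the final batch without also seeing EOS.

```python
import threading
from collections import deque
from typing import Deque, List, Tuple


class SpoolingSinkSketch:
    """Toy stand-in for a buffered plan-root sink; not Impala's actual class."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._batches: Deque[List[int]] = deque()
        self._sender_eos = False

    def send(self, batch: List[int], eos: bool = False) -> None:
        # Queue the batch and record EOS while holding the lock once, which
        # removes the window between "last batch queued" and "state set".
        with self._lock:
            self._batches.append(batch)
            if eos:
                self._sender_eos = True

    def get_next(self) -> Tuple[List[int], bool]:
        """Return (rows, has_more_rows) for one fetch; non-blocking here."""
        with self._lock:
            rows = self._batches.popleft() if self._batches else []
            has_more = bool(self._batches) or not self._sender_eos
            return rows, has_more


sink = SpoolingSinkSketch()
sink.send([1, 2, 3])
sink.send([4, 5], eos=True)
print(sink.get_next())  # ([1, 2, 3], True)
print(sink.get_next())  # ([4, 5], False)
```

With a separate `FlushFinal`-style call, the second fetch could run before the state is set and would report more rows even though none remain.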

> HS2::FetchResults sets hasMoreRows in many cases where no more rows are to be 
> returned
> --
>
> Key: IMPALA-558
> URL: https://issues.apache.org/jira/browse/IMPALA-558
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Clients
>Affects Versions: Impala 1.1
>Reporter: Henry Robinson
>Priority: Minor
>  Labels: query-lifecycle
>
> The first call to {{FetchResults}} always sets {{hasMoreRows}} even when 0 
> rows should be returned. The next call correctly sets {{hasMoreRows == 
> False}}. The upshot is there's always an extra round-trip, although 
> correctness isn't affected.
> {code}
> execute_statement_req = TCLIService.TExecuteStatementReq()
> execute_statement_req.sessionHandle = resp.sessionHandle
> execute_statement_req.statement = "SELECT COUNT(*) FROM functional.alltypes WHERE 1 = 2"
> execute_statement_resp = self.hs2_client.ExecuteStatement(execute_statement_req)
> 
> fetch_results_req = TCLIService.TFetchResultsReq()
> fetch_results_req.operationHandle = execute_statement_resp.operationHandle
> fetch_results_req.maxRows = 100
> fetch_results_resp = self.hs2_client.FetchResults(fetch_results_req)
> 
> assert not fetch_results_resp.hasMoreRows # Fails
> {code}






[jira] [Commented] (IMPALA-1618) Impala server should always try to fulfill requested fetch size

2019-09-02 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921060#comment-16921060
 ] 

Sahil Takiar commented on IMPALA-1618:
--

I ran the test script in the JIRA description as well and confirmed that when 
result spooling is enabled it always returns the requested number of rows.

> Impala server should always try to fulfill requested fetch size
> ---
>
> Key: IMPALA-1618
> URL: https://issues.apache.org/jira/browse/IMPALA-1618
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 2.0.1
>Reporter: casey
>Priority: Minor
>  Labels: usability
> Fix For: Impala 3.4.0
>
>
> The thrift fetch request specifies the number of rows that it would like but 
> the Impala server may return fewer even though more results are available.
> For example, using the default row_batch size of 1024, if the client requests 
> 1023 rows, the first response contains 1023 rows but the second response 
> contains only 1 row. This is because the server internally uses row_batch 
> (1024), returns the requested count (1023) and caches the remaining row, then 
> the next time around only uses the cache.
> In general the end user should set both the row batch size and the thrift 
> request size. In practice the query writer setting row_batch and the 
> driver/programmer setting fetch size may often be different people.
> There is one case that works fine now though - setting the batch size to less 
> than the thrift req size. In this case the thrift response is always the same 
> as batch size.
> Code example:
> {noformat}
> dev@localhost:~/impyla$ git diff
> diff --git a/impala/_rpc/hiveserver2.py b/impala/_rpc/hiveserver2.py
> index 6139002..31fdab7 100644
> --- a/impala/_rpc/hiveserver2.py
> +++ b/impala/_rpc/hiveserver2.py
> @@ -265,6 +265,7 @@ def fetch_results(service, operation_handle, hs2_protocol_version, schema=None,
>      req = TFetchResultsReq(operationHandle=operation_handle,
>                             orientation=orientation,
>                             maxRows=max_rows)
> +    print("req: " + str(max_rows))
>      resp = service.FetchResults(req)
>      err_if_rpc_not_ok(resp)
>  
> @@ -273,6 +274,7 @@ def fetch_results(service, operation_handle, hs2_protocol_version, schema=None,
>               for (i, col) in enumerate(resp.results.columns)]
>      num_cols = len(tcols)
>      num_rows = len(tcols[0].values)
> +    print("rec: " + str(num_rows))
>      rows = []
>      for i in xrange(num_rows):
>          row = []
> dev@localhost:~/impyla$ cat test.py 
> from impala.dbapi import connect
> conn = connect()
> cur = conn.cursor()
> cur.set_arraysize(1024)
> cur.execute("set batch_size=1025")
> cur.execute("select * from tpch.lineitem")
> while True:
>     rows = cur.fetchmany()
>     if not rows:
>         break
> cur.close()
> conn.close()
> dev@localhost:~/impyla$ python test.py | head
> Failed to import pandas
> req: 1024
> rec: 1024
> req: 1024
> rec: 1
> req: 1024
> rec: 1024
> req: 1024
> rec: 1
> req: 1024
> {noformat}
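
The behaviour in the log above can be reproduced with a small standalone model. This is a sketch under stated assumptions, not Impala's server code: `CachedFetcher` models the reported behaviour (leftover rows are returned on their own before a new batch is pulled), while `FillingFetcher` models a server that keeps pulling batches until the requested size is met. All class and function names are hypothetical.

```python
from typing import Iterator, List, Optional


def row_batches(total_rows: int, batch_size: int) -> Iterator[List[int]]:
    """Yield row ids in batches, standing in for the query's row-batch stream."""
    for start in range(0, total_rows, batch_size):
        yield list(range(start, min(start + batch_size, total_rows)))


class CachedFetcher:
    """Reported behaviour: serve cached leftovers alone, then pull one batch."""

    def __init__(self, batches: Iterator[List[int]]) -> None:
        self._batches = batches
        self._cache: List[int] = []

    def fetch(self, max_rows: int) -> List[int]:
        if not self._cache:
            self._cache = next(self._batches, [])
        rows, self._cache = self._cache[:max_rows], self._cache[max_rows:]
        return rows


class FillingFetcher(CachedFetcher):
    """Desired behaviour: pull batches until max_rows are buffered or EOS."""

    def fetch(self, max_rows: int) -> List[int]:
        while len(self._cache) < max_rows:
            batch: Optional[List[int]] = next(self._batches, None)
            if batch is None:
                break
            self._cache.extend(batch)
        rows, self._cache = self._cache[:max_rows], self._cache[max_rows:]
        return rows


def fetch_sizes(fetcher: CachedFetcher, max_rows: int, calls: int) -> List[int]:
    """Return the number of rows received by each of `calls` fetches."""
    return [len(fetcher.fetch(max_rows)) for _ in range(calls)]


# batch_size=1025 and a fetch size of 1024, as in test.py above.
print(fetch_sizes(CachedFetcher(row_batches(4100, 1025)), 1024, 4))
print(fetch_sizes(FillingFetcher(row_batches(4100, 1025)), 1024, 4))
```

The first line prints `[1024, 1, 1024, 1]`, matching the `rec:` values in the log; the second prints `[1024, 1024, 1024, 1024]`.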






[jira] [Resolved] (IMPALA-7235) Allow the Statestore to shut down cleanly

2019-09-02 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-7235.
---
Resolution: Won't Fix

I think this is difficult because the thrift interfaces are messy, and it isn't 
really that important - I don't think we want to complicate test code to solve 
this.

> Allow the Statestore to shut down cleanly
> -
>
> Key: IMPALA-7235
> URL: https://issues.apache.org/jira/browse/IMPALA-7235
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 2.0
>Reporter: Sailesh Mukil
>Priority: Major
>  Labels: cleanup, statestore
>
> The Statestore class was written with the assumption that it will live for 
> the entire lifetime of the cluster and never have to be shut down. This is 
> true today; however, as a result, we have to let all our Statestore objects 
> leak in the BE tests.
> Adding a clean shut down mechanism shouldn't be too hard, so let's do that.






[jira] [Commented] (IMPALA-7235) Allow the Statestore to shut down cleanly

2019-09-02 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920984#comment-16920984
 ] 

Tim Armstrong commented on IMPALA-7235:
---

[~rishjain] I actually looked at this a while back and I think it is actually 
complicated and not much fun (tagging it with newbie was very optimistic). I'm 
going to close it since I don't think it's worth doing it.

> Allow the Statestore to shut down cleanly
> -
>
> Key: IMPALA-7235
> URL: https://issues.apache.org/jira/browse/IMPALA-7235
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 2.0
>Reporter: Sailesh Mukil
>Priority: Major
>  Labels: cleanup, statestore
>
> The Statestore class was written with the assumption that it will live for 
> the entire lifetime of the cluster and never have to be shut down. This is 
> true today; however, as a result, we have to let all our Statestore objects 
> leak in the BE tests.
> Adding a clean shut down mechanism shouldn't be too hard, so let's do that.






[jira] [Updated] (IMPALA-7235) Allow the Statestore to shut down cleanly

2019-09-02 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7235:
--
Labels: cleanup statestore  (was: cleanup newbie statestore)

> Allow the Statestore to shut down cleanly
> -
>
> Key: IMPALA-7235
> URL: https://issues.apache.org/jira/browse/IMPALA-7235
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 2.0
>Reporter: Sailesh Mukil
>Priority: Major
>  Labels: cleanup, statestore
>
> The Statestore class was written with the assumption that it will live for 
> the entire lifetime of the cluster and never have to be shut down. This is 
> true today; however, as a result, we have to let all our Statestore objects 
> leak in the BE tests.
> Adding a clean shut down mechanism shouldn't be too hard, so let's do that.






[jira] [Resolved] (IMPALA-8851) Drop table if exists throws authorization exception when table does not exist

2019-09-02 Thread Csaba Ringhofer (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer resolved IMPALA-8851.
-
Fix Version/s: Impala 3.4.0
   Resolution: Implemented

> Drop table if exists throws authorization exception when table does not exist
> -
>
> Key: IMPALA-8851
> URL: https://issues.apache.org/jira/browse/IMPALA-8851
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> When authorization is enabled, a {{drop table if exists }} on a 
> non-existing table throws an authorization exception. If the user has the 
> required permissions to query the tables in the database, this is 
> unnecessary: the SQL should succeed and report that the table does not 
> exist instead of erroring out.
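
The requested behaviour can be sketched as a small decision function. This is an illustration only: the privilege model (a set of `(db, table)` pairs with `"*"` meaning database-wide access), the `catalog` dictionary, and every name below are assumptions, not Impala's authorization code.

```python
from typing import Dict, Set, Tuple


class AuthorizationError(Exception):
    """Raised when the user lacks the privilege needed for the statement."""


def drop_table_if_exists(
    catalog: Dict[str, Set[str]],
    privileges: Set[Tuple[str, str]],
    db: str,
    table: str,
) -> str:
    """Model DROP TABLE IF EXISTS with the behaviour requested in the issue."""
    has_db_access = (db, "*") in privileges
    if table not in catalog.get(db, set()):
        if has_db_access:
            # The user may already list tables in this database, so reporting
            # that the table is absent reveals nothing new.
            return f"Table {db}.{table} does not exist."
        raise AuthorizationError(f"No privilege on {db}.{table}")
    if not (has_db_access or (db, table) in privileges):
        raise AuthorizationError(f"No privilege on {db}.{table}")
    catalog[db].remove(table)
    return f"Table {db}.{table} has been dropped."


catalog = {"sales": {"orders"}}
print(drop_table_if_exists(catalog, {("sales", "*")}, "sales", "missing"))
print(drop_table_if_exists(catalog, {("sales", "*")}, "sales", "orders"))
```

A user without database-wide access still gets an authorization error for a missing table, so the statement does not leak whether the table exists.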


