[jira] [Updated] (IMPALA-7961) Concurrent catalog heavy workloads can cause queries with SYNC_DDL to fail fast

2018-12-11 Thread bharath v (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bharath v updated IMPALA-7961:
--
Description: 
When the catalog server is under heavy load with concurrent updates to objects, 
queries with SYNC_DDL can fail with the following message.

*User facing error message:*
{noformat}
ERROR: CatalogException: Couldn't retrieve the catalog topic version for the 
SYNC_DDL operation after 3 attempts.The operation has been successfully 
executed but its effects may have not been broadcast to all the coordinators.
{noformat}
*Exception from the catalog server log:*
{noformat}
I1031 00:00:49.168761 1127039 CatalogServiceCatalog.java:1903] Operation using 
SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic 
version (msec): 1088
I1031 00:00:49.168824 1125528 CatalogServiceCatalog.java:1903] Operation using 
SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic 
version (msec): 12625
I1031 00:00:49.168851 1131986 jni-util.cc:230] 
org.apache.impala.catalog.CatalogException: Couldn't retrieve the catalog topic 
version for the SYNC_DDL operation after 3 attempts.The operation has been 
successfully executed but its effects may have not been broadcast to all the 
coordinators.
at 
org.apache.impala.catalog.CatalogServiceCatalog.waitForSyncDdlVersion(CatalogServiceCatalog.java:1891)
at 
org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:336)
at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:146)

{noformat}
*What this means*

The Catalog operation is actually successful (the change has been committed to 
HMS and the Catalog server cache), but the Catalog server noticed that it is 
taking longer than expected to broadcast the changes (for whatever reason) 
and, instead of waiting indefinitely, it fails fast. The coordinators are 
expected to eventually sync up in the background.

*Problem*
 - This violates the contract of the SYNC_DDL query option, since the query 
returns early.
 - This is a behavioral regression from the pre-IMPALA-5058 behavior, where 
queries would wait forever for SYNC_DDL-based changes to propagate.

*Notes*
 - The usual suspect is heavily concurrent catalog operations with 
long-running DDLs.
 - Introduced by IMPALA-5058.
 - My understanding is that this also applies to Catalog V2 (LocalCatalog 
mode), since we still rely on the CatalogServer for DDL orchestration and 
hence it takes this code path.

Please refer to the JIRA comment for a technical explanation of why this is 
happening.
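
For illustration, a minimal sketch of the fail-fast logic described above. 
This is a paraphrase, not the actual 
CatalogServiceCatalog.waitForSyncDdlVersion() implementation; the helper 
methods and the attempt limit are assumptions based on the error message and 
log lines quoted earlier.

{code:java}
// Hypothetical simplification of the SYNC_DDL wait described above; not the
// real CatalogServiceCatalog code. The two abstract helpers are stand-ins.
abstract class SyncDdlWaiter {
  static class CatalogException extends Exception {
    CatalogException(String msg) { super(msg); }
  }

  // Returns the catalog topic version covering the operation, or null if it
  // cannot be identified yet.
  abstract Long findCoveringTopicVersion();

  // Blocks until the next catalog topic update is assembled.
  abstract void waitForNextTopicUpdate();

  long waitForSyncDdlVersion(int maxAttempts) throws CatalogException {
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
      Long version = findCoveringTopicVersion();
      if (version != null) return version;  // change is in a broadcast topic
      waitForNextTopicUpdate();
    }
    // The DDL already committed to HMS and the catalog cache; only the
    // broadcast confirmation is late, so fail fast rather than block forever.
    throw new CatalogException("Couldn't retrieve the catalog topic version for"
        + " the SYNC_DDL operation after " + maxAttempts + " attempts.");
  }
}
{code}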

  was:
When the catalog server is under heavy load with concurrent updates to objects, 
queries with SYNC_DDL can fail with the following message.

*User facing error message:*

{noformat}
ERROR: CatalogException: Couldn't retrieve the catalog topic version for the 
SYNC_DDL operation after 3 attempts.The operation has been successfully 
executed but its effects may have not been broadcast to all the coordinators.
{noformat}

*Exception from the catalog server log:*

{noformat}
I1031 00:00:49.168761 1127039 CatalogServiceCatalog.java:1903] Operation using 
SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic 
version (msec): 1088
I1031 00:00:49.168824 1125528 CatalogServiceCatalog.java:1903] Operation using 
SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic 
version (msec): 12625
I1031 00:00:49.168851 1131986 jni-util.cc:230] 
org.apache.impala.catalog.CatalogException: Couldn't retrieve the catalog topic 
version for the SYNC_DDL operation after 3 attempts.The operation has been 
successfully executed but its effects may have not been broadcast to all the 
coordinators.
at 
org.apache.impala.catalog.CatalogServiceCatalog.waitForSyncDdlVersion(CatalogServiceCatalog.java:1891)
at 
org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:336)
at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:146)

{noformat}

*What this means*

This means that the Catalog operation is actually successful (the change has 
been committed to HMS and the Catalog server cache), but the Catalog server 
noticed that it is taking longer than expected to broadcast the changes (for 
whatever reason) and, instead of waiting indefinitely, it fails fast. The 
coordinators are expected to eventually sync up in the background.

*Problem*

- This violates the contract of the SYNC_DDL query option, since the query 
returns early.
- This is a behavioral regression from the pre-IMPALA-5058 behavior, where 
queries would wait forever for SYNC_DDL-based changes to propagate.

*Notes*

- The usual suspect is heavily concurrent catalog operations with long-running 
DDLs.
- Introduced by IMPALA-5058.
- My understanding is that this also applies to Catalog V2 (LocalCatalog 
mode), since we still rely on the CatalogServer for DDL orchestration and 
hence it takes this code path.

[jira] [Updated] (IMPALA-7961) Concurrent catalog heavy workloads can cause queries with SYNC_DDL to fail fast

2018-12-11 Thread bharath v (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bharath v updated IMPALA-7961:
--
Affects Version/s: Impala 2.12.0
   Impala 3.1.0

> Concurrent catalog heavy workloads can cause queries with SYNC_DDL to fail 
> fast
> ---
>
> Key: IMPALA-7961
> URL: https://issues.apache.org/jira/browse/IMPALA-7961
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.12.0, Impala 3.1.0
>Reporter: bharath v
>Priority: Critical
>
> When the catalog server is under heavy load with concurrent updates to 
> objects, queries with SYNC_DDL can fail with the following message.
> *User facing error message:*
> {noformat}
> ERROR: CatalogException: Couldn't retrieve the catalog topic version for the 
> SYNC_DDL operation after 3 attempts.The operation has been successfully 
> executed but its effects may have not been broadcast to all the coordinators.
> {noformat}
> *Exception from the catalog server log:*
> {noformat}
> I1031 00:00:49.168761 1127039 CatalogServiceCatalog.java:1903] Operation 
> using SYNC_DDL is waiting for catalog topic version: 236535. Time to identify 
> topic version (msec): 1088
> I1031 00:00:49.168824 1125528 CatalogServiceCatalog.java:1903] Operation 
> using SYNC_DDL is waiting for catalog topic version: 236535. Time to identify 
> topic version (msec): 12625
> I1031 00:00:49.168851 1131986 jni-util.cc:230] 
> org.apache.impala.catalog.CatalogException: Couldn't retrieve the catalog 
> topic version for the SYNC_DDL operation after 3 attempts.The operation has 
> been successfully executed but its effects may have not been broadcast to all 
> the coordinators.
> at 
> org.apache.impala.catalog.CatalogServiceCatalog.waitForSyncDdlVersion(CatalogServiceCatalog.java:1891)
> at 
> org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:336)
> at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:146)
> 
> {noformat}
> *What this means*
> This means that the Catalog operation is actually successful (the change has 
> been committed to HMS and the Catalog server cache), but the Catalog server 
> noticed that it is taking longer than expected to broadcast the changes (for 
> whatever reason) and, instead of waiting indefinitely, it fails fast. The 
> coordinators are expected to eventually sync up in the background.
> *Problem*
> - This violates the contract of the SYNC_DDL query option, since the query 
> returns early.
> - This is a behavioral regression from the pre-IMPALA-5058 behavior, where 
> queries would wait forever for SYNC_DDL-based changes to propagate.
> *Notes*
> - The usual suspect is heavily concurrent catalog operations with 
> long-running DDLs.
> - Introduced by IMPALA-5058.
> - My understanding is that this also applies to Catalog V2 (LocalCatalog 
> mode), since we still rely on the CatalogServer for DDL orchestration and 
> hence it takes this code path.
> Please refer to the JIRA comment for a technical explanation of why this is 
> happening.






[jira] [Commented] (IMPALA-7954) Support automatic invalidates using metastore notification events

2018-12-11 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718366#comment-16718366
 ] 

Vihang Karajgaonkar commented on IMPALA-7954:
-

Adding a Google Doc where interested folks can comment and add suggestions.

> Support automatic invalidates using metastore notification events
> -
>
> Key: IMPALA-7954
> URL: https://issues.apache.org/jira/browse/IMPALA-7954
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: Automatic_invalidate_DesignDoc_v1.pdf
>
>
> Currently, there are multiple ways in Impala to invalidate or refresh the 
> table metadata stored in the Catalog. Objects in the Catalog can be 
> invalidated either by a usage-based approach (invalidate_tables_timeout_s) or 
> under GC pressure (invalidate_tables_on_memory_pressure), as added in 
> IMPALA-7448. However, most users issue invalidate commands when they want to 
> sync to the latest information from HDFS or HMS. Unfortunately, when data is 
> modified or new data is added outside Impala (e.g. by Hive or a different 
> Impala cluster), users don't have a clear idea of whether they need to issue 
> an invalidate or not. To be on the safe side, users keep issuing invalidate 
> commands more often than necessary, which causes both performance and 
> stability issues.
> The Hive Metastore provides a simple API to get incremental updates to the 
> metadata stored in its database. Each API that performs an add/alter/drop 
> operation in the metastore generates event(s), which can be fetched using the 
> {{get_next_notification}} API. Each event has a unique and increasing 
> event_id. The current notification event id can be fetched using the 
> {{get_current_notificationEventId}} API.
> This JIRA proposes to make use of such events from the metastore to 
> proactively invalidate or refresh information in catalogd. When configured, 
> catalogd could poll for such events and take action (like add/drop/refresh a 
> partition, or add/drop/invalidate tables and databases) based on the events. 
> This way we can automatically refresh the catalogd state using events, which 
> would greatly help use cases where users want to see the latest information 
> (within a configurable time delay) without flooding the system with 
> invalidate requests.
> I will be attaching a design doc to this JIRA and creating subtasks for the 
> work. Feel free to comment on the JIRA or make suggestions to improve the 
> design.
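
As a rough sketch of the proposed polling loop, using the two HMS client calls 
named above (the batch size, poll interval, and event handling here are 
illustrative assumptions, not part of the actual proposal):

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.NotificationEvent;

public class HmsEventPoller {
  public static void main(String[] args) throws Exception {
    HiveMetaStoreClient client = new HiveMetaStoreClient(new HiveConf());
    // Start from the current high-water mark; a real implementation would
    // persist the last processed event id across restarts.
    long lastEventId = client.getCurrentNotificationEventId().getEventId();
    while (true) {
      for (NotificationEvent event :
          client.getNextNotification(lastEventId, 1000, null).getEvents()) {
        // A real catalogd would map event types to actions here, e.g.
        // ALTER_TABLE -> refresh, DROP_TABLE -> remove from the cache.
        System.out.printf("event %d: %s on %s.%s%n", event.getEventId(),
            event.getEventType(), event.getDbName(), event.getTableName());
        lastEventId = event.getEventId();
      }
      Thread.sleep(5000);  // polling interval (configurable in the proposal)
    }
  }
}
{code}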






[jira] [Assigned] (IMPALA-5832) Simplify the backend's memory ownership and transfer model

2018-12-11 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-5832:
-

Assignee: (was: Tim Armstrong)

> Simplify the backend's memory ownership and transfer model
> --
>
> Key: IMPALA-5832
> URL: https://issues.apache.org/jira/browse/IMPALA-5832
> Project: IMPALA
>  Issue Type: Epic
>  Components: Backend
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: resource-management
>







[jira] [Resolved] (IMPALA-1048) Data sinks do not show up in the exec summary

2018-12-11 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-1048.
---
   Resolution: Fixed
Fix Version/s: Impala 3.2.0

> Data sinks do not show up in the exec summary
> -
>
> Key: IMPALA-1048
> URL: https://issues.apache.org/jira/browse/IMPALA-1048
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.8.0
>Reporter: Nong Li
>Assignee: Tim Armstrong
>Priority: Minor
>  Labels: observability, ramp-up, supportability
> Fix For: Impala 3.2.0
>
>
> Exec summary only contains nodes so the data sink is not shown. This is good 
> for data stream senders since the time is not double counted in the sender 
> and receiver but bad for inserts.






[jira] [Assigned] (IMPALA-6460) More flexible memory-based admission control policies

2018-12-11 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-6460:
-

Assignee: (was: Tim Armstrong)

> More flexible memory-based admission control policies
> -
>
> Key: IMPALA-6460
> URL: https://issues.apache.org/jira/browse/IMPALA-6460
> Project: IMPALA
>  Issue Type: Epic
>  Components: Distributed Exec
>Affects Versions: Impala 2.11.0
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: admission-control
>
> Currently there are only two ways to decide how much memory to reserve for 
> each query in memory-based admission control:
> * Using the memory estimates, which often make bad decisions (e.g. huge 
> overestimates) and don't have any enforcement on the backend
> * Using a static pool or user-set mem_limit, which is very difficult to set 
> to a reasonable value for all queries.
> The memory reservation work will allow us to come up with more powerful and 
> flexible policies.






[jira] [Assigned] (IMPALA-4179) Remove RowBatch::MarkNeedsDeepCopy() memory management API

2018-12-11 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-4179:
-

Assignee: (was: Tim Armstrong)

> Remove RowBatch::MarkNeedsDeepCopy() memory management API
> --
>
> Key: IMPALA-4179
> URL: https://issues.apache.org/jira/browse/IMPALA-4179
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.8.0
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: resource-management
>
> We should remove this API as a simplification of the memory transfer model.
> It was used in various places to manage the lifetime of BufferHandles, but 
> those can be replaced with attaching the BufferHandle + MarkFlushResources().
> It is also used to work around the fact that some memory is never returned 
> from ExecNodes and is freed in Close(). Part of the solution is probably to 
> add a way to attach all resources to a RowBatch before Close() and during 
> Reset().






[jira] [Resolved] (IMPALA-7906) Crash in JVM PSPromotionManager::copy_to_survivor_space

2018-12-11 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-7906.
---
Resolution: Cannot Reproduce

> Crash in JVM PSPromotionManager::copy_to_survivor_space
> ---
>
> Key: IMPALA-7906
> URL: https://issues.apache.org/jira/browse/IMPALA-7906
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: broken-build, crash
> Attachments: hs_err_pid6290.log
>
>
> {noformat}
> #0  0x7f44ca5261f7 in raise () from /lib64/libc.so.6
> #1  0x7f44ca5278e8 in abort () from /lib64/libc.so.6
> #2  0x7f44cd726185 in os::abort(bool) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #3  0x7f44cd8c8593 in VMError::report_and_die() () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #4  0x7f44cd8c8a7e in crash_handler(int, siginfo*, void*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #5  0x7f44cd724f72 in os::Linux::chained_handler(int, siginfo*, void*) () 
> from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #6  0x7f44cd72b5f6 in JVM_handle_linux_signal () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #7  0x7f44cd721be3 in signalHandler(int, siginfo*, void*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #8  
> #9  0x7f44cd713e95 in oopDesc::print_on(outputStream*) const () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #10 0x7f44cd72afdb in os::print_register_info(outputStream*, void*) () 
> from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #11 0x7f44cd8c6c13 in VMError::report(outputStream*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #12 0x7f44cd8c818a in VMError::report_and_die() () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #13 0x7f44cd72b68f in JVM_handle_linux_signal () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #14 0x7f44cd721be3 in signalHandler(int, siginfo*, void*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #15 
> #16 0x7f44cd78f562 in oopDesc* 
> PSPromotionManager::copy_to_survivor_space(oopDesc*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #17 0x7f44cd7924a5 in PSRootsClosure::do_oop(oopDesc**) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #18 0x7f44cd716a96 in InterpreterOopMap::iterate_oop(OffsetClosure*) 
> const () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #19 0x7f44cd38f789 in frame::oops_interpreted_do(OopClosure*, 
> CLDClosure*, RegisterMap const*, bool) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #20 0x7f44cd86eaa1 in JavaThread::oops_do(OopClosure*, CLDClosure*, 
> CodeBlobClosure*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #21 0x7f44cd79270f in ThreadRootsTask::do_it(GCTaskManager*, unsigned 
> int) () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #22 0x7f44cd3d7ecf in GCTaskThread::run() () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #23 0x7f44cd727338 in java_start(Thread*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #24 0x7f44ca8bbe25 in start_thread () from /lib64/libpthread.so.0
> #25 0x7f44ca5e934d in clone () from /lib64/libc.so.6
> {noformat}
> These are the tests running at the time
> {noformat}
> 06:53:04 [gw1] PASSED 
> query_test/test_mem_usage_scaling.py::TestQueryMemLimitScaling::test_mem_usage_scaling[mem_limit:
>  -1 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
> 0} | table_format: parquet/none] 
> 06:53:07 
> query_test/test_mem_usage_scaling.py::TestQueryMemLimitScaling::test_mem_usage_scaling[mem_limit:
>  400m | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
> 0} | table_format: parquet/none] 
> 06:53:07 [gw5] PASSED 
> query_test/test_analytic_tpcds.py::TestAnalyticTpcds::test_analytic_functions_tpcds[batch_size:
>  1 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
> 0} | table_format: parquet/none] 
> 06:53:08 
> 

[jira] [Created] (IMPALA-7962) Figure out how to handle localtime in docker minicluster containers

2018-12-11 Thread Tim Armstrong (JIRA)
Tim Armstrong created IMPALA-7962:
-

 Summary: Figure out how to handle localtime in docker minicluster 
containers
 Key: IMPALA-7962
 URL: https://issues.apache.org/jira/browse/IMPALA-7962
 Project: IMPALA
  Issue Type: Sub-task
Reporter: Tim Armstrong
Assignee: Tim Armstrong


The timezone from the host is not inherited by the container - it is instead 
mapped to /usr/share/zoneinfo/Etc/UTC.

We should figure out if we need to fix this.






[jira] [Updated] (IMPALA-7960) wrong results when comparing timestamp casted to varchar of smaller length to a string literal in a binary predicate

2018-12-11 Thread Dinesh Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Garg updated IMPALA-7960:

Priority: Blocker  (was: Critical)

> wrong results when comparing timestamp casted to varchar of smaller length to 
> a string literal in a binary predicate
> 
>
> Key: IMPALA-7960
> URL: https://issues.apache.org/jira/browse/IMPALA-7960
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 2.12.0
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Blocker
>  Labels: correctness
>
> Expression rewriting seems to identify this as a redundant cast and removes 
> it.
> Steps to re-create:
> {noformat}
> select * from (select cast('2018-12-11 09:59:37' as timestamp) as ts) tbl 
> where cast(ts as varchar(10)) = '2018-12-11';
> {noformat}
> output:
> {noformat}
> Fetched 0 row(s) 
> {noformat}
> Now disable expression re-writes.
> {noformat}
> set ENABLE_EXPR_REWRITES=false;
> select * from (select cast('2018-12-11 09:59:37' as timestamp) as ts) tbl 
> where cast(ts as varchar(10)) = '2018-12-11';
> {noformat}
> output:
> {noformat}
> +---------------------+
> | ts                  |
> +---------------------+
> | 2018-12-11 09:59:37 |
> +---------------------+
> {noformat}






[jira] [Updated] (IMPALA-7961) Concurrent catalog heavy workloads can cause queries with SYNC_DDL to fail fast

2018-12-11 Thread bharath v (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bharath v updated IMPALA-7961:
--
Description: 
When the catalog server is under heavy load with concurrent updates to objects, 
queries with SYNC_DDL can fail with the following message.

*User facing error message:*

{noformat}
ERROR: CatalogException: Couldn't retrieve the catalog topic version for the 
SYNC_DDL operation after 3 attempts.The operation has been successfully 
executed but its effects may have not been broadcast to all the coordinators.
{noformat}

*Exception from the catalog server log:*

{noformat}
I1031 00:00:49.168761 1127039 CatalogServiceCatalog.java:1903] Operation using 
SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic 
version (msec): 1088
I1031 00:00:49.168824 1125528 CatalogServiceCatalog.java:1903] Operation using 
SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic 
version (msec): 12625
I1031 00:00:49.168851 1131986 jni-util.cc:230] 
org.apache.impala.catalog.CatalogException: Couldn't retrieve the catalog topic 
version for the SYNC_DDL operation after 3 attempts.The operation has been 
successfully executed but its effects may have not been broadcast to all the 
coordinators.
at 
org.apache.impala.catalog.CatalogServiceCatalog.waitForSyncDdlVersion(CatalogServiceCatalog.java:1891)
at 
org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:336)
at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:146)

{noformat}

*What this means*

This means that the Catalog operation is actually successful (the change has 
been committed to HMS and the Catalog server cache), but the Catalog server 
noticed that it is taking longer than expected to broadcast the changes (for 
whatever reason) and, instead of waiting indefinitely, it fails fast. The 
coordinators are expected to eventually sync up in the background.

*Problem*

- This violates the contract of the SYNC_DDL query option, since the query 
returns early.
- This is a behavioral regression from the pre-IMPALA-5058 behavior, where 
queries would wait forever for SYNC_DDL-based changes to propagate.

*Notes*

- The usual suspect is heavily concurrent catalog operations with long-running 
DDLs.
- Introduced by IMPALA-5058.
- My understanding is that this also applies to Catalog V2 (LocalCatalog 
mode), since we still rely on the CatalogServer for DDL orchestration and 
hence it takes this code path.

Please refer to the JIRA comment for a technical explanation of why this is 
happening.

  was:
When the catalog server is under heavy load with concurrent updates to objects, 
queries with SYNC_DDL can fail with the following message.

*User facing error message:*

{noformat}
ERROR: CatalogException: Couldn't retrieve the catalog topic version for the 
SYNC_DDL operation after 3 attempts.The operation has been successfully 
executed but its effects may have not been broadcast to all the coordinators.
{noformat}

*Exception from the catalog server log:*

{noformat}
I1031 00:00:49.168761 1127039 CatalogServiceCatalog.java:1903] Operation using 
SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic 
version (msec): 1088
I1031 00:00:49.168824 1125528 CatalogServiceCatalog.java:1903] Operation using 
SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic 
version (msec): 12625
I1031 00:00:49.168851 1131986 jni-util.cc:230] 
org.apache.impala.catalog.CatalogException: Couldn't retrieve the catalog topic 
version for the SYNC_DDL operation after 3 attempts.The operation has been 
successfully executed but its effects may have not been broadcast to all the 
coordinators.
at 
org.apache.impala.catalog.CatalogServiceCatalog.waitForSyncDdlVersion(CatalogServiceCatalog.java:1891)
at 
org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:336)
at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:146)

{noformat}

*What this means*

This means that the Catalog operation is actually successful (the change has 
been committed to HMS and the Catalog server cache), but the Catalog server 
noticed that it is taking longer than expected to broadcast the changes (for 
whatever reason) and, instead of waiting indefinitely, it fails fast. The 
coordinators are expected to eventually sync up in the background.

*Problem*

- This violates the contract of the SYNC_DDL query option, since the query 
returns early.
- This is a behavioral regression from the pre-IMPALA-5058 behavior, where 
queries would wait forever for SYNC_DDL-based changes to propagate.

*Notes*

- The usual suspect is heavily concurrent catalog operations with long-running 
DDLs.
- Introduced by IMPALA-5058.
- Also applies to Catalog V2 (LocalCatalog mode), since we still rely on the 
CatalogServer for DDL orchestration and hence it takes this code path.

[jira] [Updated] (IMPALA-7961) Concurrent catalog heavy workloads can cause queries with SYNC_DDL to fail fast

2018-12-11 Thread bharath v (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bharath v updated IMPALA-7961:
--
Description: 
When the catalog server is under heavy load with concurrent updates to objects, 
queries with SYNC_DDL can fail with the following message.

*User facing error message:*

{noformat}
ERROR: CatalogException: Couldn't retrieve the catalog topic version for the 
SYNC_DDL operation after 3 attempts.The operation has been successfully 
executed but its effects may have not been broadcast to all the coordinators.
{noformat}

*Exception from the catalog server log:*

{noformat}
I1031 00:00:49.168761 1127039 CatalogServiceCatalog.java:1903] Operation using 
SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic 
version (msec): 1088
I1031 00:00:49.168824 1125528 CatalogServiceCatalog.java:1903] Operation using 
SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic 
version (msec): 12625
I1031 00:00:49.168851 1131986 jni-util.cc:230] 
org.apache.impala.catalog.CatalogException: Couldn't retrieve the catalog topic 
version for the SYNC_DDL operation after 3 attempts.The operation has been 
successfully executed but its effects may have not been broadcast to all the 
coordinators.
at 
org.apache.impala.catalog.CatalogServiceCatalog.waitForSyncDdlVersion(CatalogServiceCatalog.java:1891)
at 
org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:336)
at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:146)

{noformat}

*What this means*

This means that the Catalog operation is actually successful (the change has 
been committed to HMS and the Catalog server cache), but the Catalog server 
noticed that it is taking longer than expected to broadcast the changes (for 
whatever reason) and, instead of waiting indefinitely, it fails fast. The 
coordinators are expected to eventually sync up in the background.

*Problem*

- This violates the contract of the SYNC_DDL query option, since the query 
returns early.
- This is a behavioral regression from the pre-IMPALA-5058 behavior, where 
queries would wait forever for SYNC_DDL-based changes to propagate.

*Notes*

- The usual suspect is heavily concurrent catalog operations with long-running 
DDLs.
- Introduced by IMPALA-5058.
- Also applies to Catalog V2 (LocalCatalog mode), since we still rely on the 
CatalogServer for DDL orchestration and hence it takes this code path.

Please refer to the JIRA comment for a technical explanation of why this is 
happening.

  was:
When the catalog server is under heavy load with concurrent updates to objects, 
queries with SYNC_DDL can fail with the following message.

*User facing error message:*

{noformat}
ERROR: CatalogException: Couldn't retrieve the catalog topic version for the 
SYNC_DDL operation after 3 attempts.The operation has been successfully 
executed but its effects may have not been broadcast to all the coordinators.
{noformat}

*Exception from the catalog server log:*

{noformat}
I1031 00:00:49.168761 1127039 CatalogServiceCatalog.java:1903] Operation using 
SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic 
version (msec): 1088
I1031 00:00:49.168824 1125528 CatalogServiceCatalog.java:1903] Operation using 
SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic 
version (msec): 12625
I1031 00:00:49.168851 1131986 jni-util.cc:230] 
org.apache.impala.catalog.CatalogException: Couldn't retrieve the catalog topic 
version for the SYNC_DDL operation after 3 attempts.The operation has been 
successfully executed but its effects may have not been broadcast to all the 
coordinators.
at 
org.apache.impala.catalog.CatalogServiceCatalog.waitForSyncDdlVersion(CatalogServiceCatalog.java:1891)
at 
org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:336)
at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:146)

{noformat}

*What this means*

This means that the Catalog operation is actually successful (the change has 
been committed to HMS and the Catalog server cache), but the Catalog server 
noticed that it is taking longer than expected to broadcast the changes (for 
whatever reason) and, instead of waiting indefinitely, it fails fast. The 
coordinators are expected to eventually sync up in the background.

*Problem*

- This violates the contract of the SYNC_DDL query option, since the query 
returns early.
- This is a behavioral regression from the pre-IMPALA-5058 behavior, where 
queries would wait forever for SYNC_DDL-based changes to propagate.

*Notes*

- The usual suspect is heavily concurrent catalog operations with long-running 
DDLs.
- Introduced by IMPALA-5058.

Please refer to the JIRA comment for a technical explanation of why this is 
happening.


> Concurrent catalog heavy workloads can cause queries with SYNC_DDL to fail 
> fast

[jira] [Resolved] (IMPALA-7509) Create table after drop can lead to table not found exception

2018-12-11 Thread bharath v (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bharath v resolved IMPALA-7509.
---
   Resolution: Fixed
 Assignee: Vuk Ercegovac  (was: Todd Lipcon)
Fix Version/s: Impala 3.1.0

> Create table after drop can lead to table not found exception
> -
>
> Key: IMPALA-7509
> URL: https://issues.apache.org/jira/browse/IMPALA-7509
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Rituparna Agrawal
>Assignee: Vuk Ercegovac
>Priority: Critical
> Fix For: Impala 3.1.0
>
> Attachments: failure_snippet.txt
>
>
> There are two impalads: one running in the old mode and one in the 
> fetch-from-catalogd mode.
> Now create a table and drop it in the impalad running in the old mode. 
> Following this, create the same table from the new-mode coordinator. It 
> sometimes throws a table-not-found exception.
> Test Create and Drop of a table in V1
>   Running  1  iterations
> Going for :  CREATE TABLE PARTS ( PART_ID DOUBLE, CREATE_TIME DOUBLE, 
> LAST_ACCESS_TIME DOUBLE, PART_NAME STRING, SD_ID DOUBLE, TBL_ID DOUBLE) 
> STORED AS PARQUETFILE;
> Going for :  DROP TABLE PARTS;
> Test Create and Drop of a table in V2
>   Running  1  iterations
> Going for :  CREATE TABLE PARTS ( PART_ID DOUBLE, CREATE_TIME DOUBLE, 
> LAST_ACCESS_TIME DOUBLE, PART_NAME STRING, SD_ID DOUBLE, TBL_ID DOUBLE) 
> STORED AS PARQUETFILE;
> Traceback (most recent call last):
>   File "testing.py", line 21, in execute_query
> cursor.execute(query)
>   File 
> "/Users/parna/workspace/virtual_envs/metadata/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 302, in execute
> configuration=configuration)
>   File 
> "/Users/parna/workspace/virtual_envs/metadata/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 343, in execute_async
> self._execute_async(op)
>   File 
> "/Users/parna/workspace/virtual_envs/metadata/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 362, in _execute_async
> operation_fn()
>   File 
> "/Users/parna/workspace/virtual_envs/metadata/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 340, in op
> async=True)
>   File 
> "/Users/parna/workspace/virtual_envs/metadata/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 1027, in execute
> return self._operation('ExecuteStatement', req)
>   File 
> "/Users/parna/workspace/virtual_envs/metadata/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 957, in _operation
> resp = self._rpc(kind, request)
>   File 
> "/Users/parna/workspace/virtual_envs/metadata/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 925, in _rpc
> err_if_rpc_not_ok(response)
>   File 
> "/Users/parna/workspace/virtual_envs/metadata/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 704, in err_if_rpc_not_ok
> raise HiveServer2Error(resp.status.errorMessage)
> HiveServer2Error: LocalCatalogException: Could not load table parnatest.parts 
> from metastore
> CAUSED BY: TException: 
> TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, 
> error_msgs:[CatalogException: Table not found: parts]))
> Going for :  DROP TABLE PARTS;
> Traceback (most recent call last):
>   File "testing.py", line 21, in execute_query
> cursor.execute(query)
>   File 
> "/Users/parna/workspace/virtual_envs/metadata/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 302, in execute
> configuration=configuration)
>   File 
> "/Users/parna/workspace/virtual_envs/metadata/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 343, in execute_async
> self._execute_async(op)
>   File 
> "/Users/parna/workspace/virtual_envs/metadata/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 362, in _execute_async
> operation_fn()
>   File 
> "/Users/parna/workspace/virtual_envs/metadata/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 340, in op
> async=True)
>   File 
> "/Users/parna/workspace/virtual_envs/metadata/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 1027, in execute
> return self._operation('ExecuteStatement', req)
>   File 
> "/Users/parna/workspace/virtual_envs/metadata/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 957, in _operation
> resp = self._rpc(kind, request)
>   File 
> "/Users/parna/workspace/virtual_envs/metadata/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 925, in _rpc
> err_if_rpc_not_ok(response)
>   File 
> "/Users/parna/workspace/virtual_envs/metadata/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 704, in err_if_rpc_not_ok
> raise HiveServer2Error(resp.status.errorMessage)
> HiveServer2Error: LocalCatalogException: Could not load table parnatest.parts 
> from metastore
> CAUSED BY: TException: 
> TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, 
> error_msgs:[CatalogException: Table not found: parts]))

[jira] [Commented] (IMPALA-7802) Implement support for closing idle sessions

2018-12-11 Thread Zoram Thanga (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718239#comment-16718239
 ] 

Zoram Thanga commented on IMPALA-7802:
--

Thanks for the comment, [~tarmstrong]. You brought up a good point regarding 
what the client experience has to be when the session has expired. I think we 
will have to maintain the current behavior, but without tying up FE service 
threads. That is, expired sessions should not consume any resources on the 
Impala server besides the session state.

I am thinking about moving expired sessions to a separate list (a death row) 
where they remain until explicitly cancelled and/or closed by the client. 
Perhaps we can have a single thread handle cancellation and closing of expired 
sessions.
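
A minimal sketch of that idea, purely illustrative (the Session interface and 
the cancellation hook are hypothetical stand-ins, not Impala's actual session 
classes):

{code:java}
import java.util.concurrent.ConcurrentLinkedQueue;

// Illustrative "death row" for expired sessions: expiration only enqueues the
// session, freeing the FE service thread; one background thread cancels the
// session's queries, while the session state stays around until the client
// explicitly closes it.
class ExpiredSessionReaper {
  interface Session { void cancelRunningQueries(); }  // hypothetical stand-in

  private final ConcurrentLinkedQueue<Session> deathRow =
      new ConcurrentLinkedQueue<>();

  // Called when the idle timeout fires.
  void onSessionExpired(Session s) { deathRow.add(s); }

  void startReaper() {
    Thread reaper = new Thread(() -> {
      while (!Thread.currentThread().isInterrupted()) {
        Session s = deathRow.poll();
        if (s == null) {
          try { Thread.sleep(1000); } catch (InterruptedException e) { return; }
        } else {
          s.cancelRunningQueries();  // release resources held by the session
        }
      }
    }, "expired-session-reaper");
    reaper.setDaemon(true);
    reaper.start();
  }
}
{code}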

> Implement support for closing idle sessions
> ---
>
> Key: IMPALA-7802
> URL: https://issues.apache.org/jira/browse/IMPALA-7802
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Clients
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Michael Ho
>Assignee: Zoram Thanga
>Priority: Critical
>  Labels: supportability
>
> Currently, the query option {{idle_session_timeout}} specifies a timeout in 
> seconds after which all running queries of that idle session will be 
> cancelled and no new queries can be issued to it. However, the idle session 
> will remain open and it needs to be closed explicitly. Please see the 
> [documentation|https://www.cloudera.com/documentation/enterprise/latest/topics/impala_idle_session_timeout.html]
>  for details.
> This behavior may be undesirable as each session still consumes an Impala 
> frontend service thread. The number of frontend service threads is bound by 
> the flag {{fe_service_threads}}. So, in a multi-tenant environment, an Impala 
> server can have a lot of idle sessions that still count against the quota of 
> {{fe_service_threads}}. If the number of sessions established 
> reaches {{fe_service_threads}}, all new session creations will block until 
> some of the existing sessions exit. There may be no time bound on when these 
> zombie idle sessions will be closed and it's at the mercy of the client 
> implementation to close them. In some sense, leaving many idle sessions open 
> is a way to launch a denial of service attack on Impala.
> To fix this situation, we should have an option to forcefully close a session 
> when it's considered idle so it won't unnecessarily consume the limited 
> number of frontend service threads. cc'ing [~zoram]






[jira] [Updated] (IMPALA-7961) Concurrent catalog heavy workloads can cause queries with SYNC_DDL to fail fast

2018-12-11 Thread bharath v (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bharath v updated IMPALA-7961:
--
Priority: Critical  (was: Major)

> Concurrent catalog heavy workloads can cause queries with SYNC_DDL to fail 
> fast
> ---
>
> Key: IMPALA-7961
> URL: https://issues.apache.org/jira/browse/IMPALA-7961
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: bharath v
>Priority: Critical
>
> When the catalog server is under heavy load with concurrent updates to 
> objects, queries with SYNC_DDL can fail with the following message.
> *User facing error message:*
> {noformat}
> ERROR: CatalogException: Couldn't retrieve the catalog topic version for the 
> SYNC_DDL operation after 3 attempts.The operation has been successfully 
> executed but its effects may have not been broadcast to all the coordinators.
> {noformat}
> *Exception from the catalog server log:*
> {noformat}
> I1031 00:00:49.168761 1127039 CatalogServiceCatalog.java:1903] Operation 
> using SYNC_DDL is waiting for catalog topic version: 236535. Time to identify 
> topic version (msec): 1088
> I1031 00:00:49.168824 1125528 CatalogServiceCatalog.java:1903] Operation 
> using SYNC_DDL is waiting for catalog topic version: 236535. Time to identify 
> topic version (msec): 12625
> I1031 00:00:49.168851 1131986 jni-util.cc:230] 
> org.apache.impala.catalog.CatalogException: Couldn't retrieve the catalog 
> topic version for the SYNC_DDL operation after 3 attempts.The operation has 
> been successfully executed but its effects may have not been broadcast to all 
> the coordinators.
> at 
> org.apache.impala.catalog.CatalogServiceCatalog.waitForSyncDdlVersion(CatalogServiceCatalog.java:1891)
> at 
> org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:336)
> at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:146)
> 
> {noformat}
> *What this means*
> This means that the Catalog operation is actually successful (the change has 
> been committed to HMS and the Catalog server cache), but the Catalog server 
> noticed that it is taking longer than expected to broadcast the changes (for 
> whatever reason) and, instead of waiting indefinitely, it fails fast. The 
> coordinators are expected to eventually sync up in the background.
> *Problem*
> - This violates the contract of the SYNC_DDL query option, since the query 
> returns early.
> - This is a behavioral regression from the pre-IMPALA-5058 behavior, where 
> queries would wait forever for SYNC_DDL-based changes to propagate.
> *Notes*
> - The usual suspect is heavily concurrent catalog operations with 
> long-running DDLs.
> - Introduced by IMPALA-5058.
> Please refer to the JIRA comment for a technical explanation of why this is 
> happening.






[jira] [Created] (IMPALA-7961) Concurrent catalog heavy workloads can cause queries with SYNC_DDL to fail fast

2018-12-11 Thread bharath v (JIRA)
bharath v created IMPALA-7961:
-

 Summary: Concurrent catalog heavy workloads can cause queries with 
SYNC_DDL to fail fast
 Key: IMPALA-7961
 URL: https://issues.apache.org/jira/browse/IMPALA-7961
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Reporter: bharath v


When the catalog server is under heavy load with concurrent updates to objects, 
queries with SYNC_DDL can fail with the following message.

*User facing error message:*

{noformat}
ERROR: CatalogException: Couldn't retrieve the catalog topic version for the 
SYNC_DDL operation after 3 attempts.The operation has been successfully 
executed but its effects may have not been broadcast to all the coordinators.
{noformat}

*Exception from the catalog server log:*

{noformat}
I1031 00:00:49.168761 1127039 CatalogServiceCatalog.java:1903] Operation using 
SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic 
version (msec): 1088
I1031 00:00:49.168824 1125528 CatalogServiceCatalog.java:1903] Operation using 
SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic 
version (msec): 12625
I1031 00:00:49.168851 1131986 jni-util.cc:230] 
org.apache.impala.catalog.CatalogException: Couldn't retrieve the catalog topic 
version for the SYNC_DDL operation after 3 attempts.The operation has been 
successfully executed but its effects may have not been broadcast to all the 
coordinators.
at 
org.apache.impala.catalog.CatalogServiceCatalog.waitForSyncDdlVersion(CatalogServiceCatalog.java:1891)
at 
org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:336)
at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:146)

{noformat}

*What this means*

This means that the Catalog operation is actually successful (the change has 
been committed to HMS and the Catalog server cache), but the Catalog server 
noticed that it is taking longer than expected to broadcast the changes (for 
whatever reason) and, instead of waiting indefinitely, it fails fast. The 
coordinators are expected to eventually sync up in the background.

*Problem*

- This violates the contract of the SYNC_DDL query option, since the query 
returns early.
- This is a behavioral regression from the pre-IMPALA-5058 behavior, where 
queries would wait forever for SYNC_DDL-based changes to propagate.

*Notes*

- The usual suspect is heavily concurrent catalog operations with long-running 
DDLs.
- Introduced by IMPALA-5058.

Please refer to the JIRA comment for a technical explanation of why this is 
happening.






[jira] [Commented] (IMPALA-7954) Support automatic invalidates using metastore notification events

2018-12-11 Thread Lars Volker (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718226#comment-16718226
 ] 

Lars Volker commented on IMPALA-7954:
-

Thank you for attaching the document, [~vihangk1]. It is indeed common to 
create a Google doc and allow comments from everyone.

> Support automatic invalidates using metastore notification events
> -
>
> Key: IMPALA-7954
> URL: https://issues.apache.org/jira/browse/IMPALA-7954
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: Automatic_invalidate_DesignDoc_v1.pdf
>
>
> Currently, there are multiple ways in Impala to invalidate or refresh the 
> table metadata stored in the Catalog. Objects in the Catalog can be 
> invalidated either by a usage-based approach (invalidate_tables_timeout_s) or 
> under GC pressure (invalidate_tables_on_memory_pressure), as added in 
> IMPALA-7448. However, most users issue invalidate commands when they want to 
> sync to the latest information from HDFS or HMS. Unfortunately, when data is 
> modified or new data is added outside Impala (e.g. by Hive or a different 
> Impala cluster), users don't have a clear idea of whether they need to issue 
> an invalidate or not. To be on the safe side, users keep issuing invalidate 
> commands more often than necessary, which causes both performance and 
> stability issues.
> The Hive Metastore provides a simple API to get incremental updates to the 
> metadata stored in its database. Each API that performs an add/alter/drop 
> operation in the metastore generates event(s), which can be fetched using the 
> {{get_next_notification}} API. Each event has a unique and increasing 
> event_id. The current notification event id can be fetched using the 
> {{get_current_notificationEventId}} API.
> This JIRA proposes to make use of such events from the metastore to 
> proactively invalidate or refresh information in catalogd. When configured, 
> catalogd could poll for such events and take action (like add/drop/refresh a 
> partition, or add/drop/invalidate tables and databases) based on the events. 
> This way we can automatically refresh the catalogd state using events, which 
> would greatly help use cases where users want to see the latest information 
> (within a configurable time delay) without flooding the system with 
> invalidate requests.
> I will be attaching a design doc to this JIRA and creating subtasks for the 
> work. Feel free to comment on the JIRA or make suggestions to improve the 
> design.






[jira] [Updated] (IMPALA-7960) wrong results when comparing timestamp casted to varchar of smaller length to a string literal in a binary predicate

2018-12-11 Thread Jim Apple (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Apple updated IMPALA-7960:
--
Labels: correctness  (was: )

> wrong results when comparing timestamp casted to varchar of smaller length to 
> a string literal in a binary predicate
> 
>
> Key: IMPALA-7960
> URL: https://issues.apache.org/jira/browse/IMPALA-7960
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 2.12.0
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Critical
>  Labels: correctness
>
> Expression rewriting seems to identify this as a redundant cast and removes 
> it.
> Steps to re-create:
> {noformat}
> select * from (select cast('2018-12-11 09:59:37' as timestamp) as ts) tbl 
> where cast(ts as varchar(10)) = '2018-12-11';
> {noformat}
> output:
> {noformat}
> Fetched 0 row(s) 
> {noformat}
> Now disable expression re-writes.
> {noformat}
> set ENABLE_EXPR_REWRITES=false;
> select * from (select cast('2018-12-11 09:59:37' as timestamp) as ts) tbl 
> where cast(ts as varchar(10)) = '2018-12-11';
> {noformat}
> output:
> {noformat}
> +---------------------+
> | ts                  |
> +---------------------+
> | 2018-12-11 09:59:37 |
> +---------------------+
> {noformat}






[jira] [Created] (IMPALA-7960) wrong results when comparing timestamp casted to varchar of smaller length to a string literal in a binary predicate

2018-12-11 Thread Bikramjeet Vig (JIRA)
Bikramjeet Vig created IMPALA-7960:
--

 Summary: wrong results when comparing timestamp casted to varchar 
of smaller length to a string literal in a binary predicate
 Key: IMPALA-7960
 URL: https://issues.apache.org/jira/browse/IMPALA-7960
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 2.12.0
Reporter: Bikramjeet Vig
Assignee: Bikramjeet Vig


Expression rewriting seems to identify this as a redundant cast and removes it.

Steps to re-create:
{noformat}
select * from (select cast('2018-12-11 09:59:37' as timestamp) as ts) tbl where 
cast(ts as varchar(10)) = '2018-12-11';
{noformat}
output:
{noformat}
Fetched 0 row(s) 
{noformat}

Now disable expression re-writes.
{noformat}
set ENABLE_EXPR_REWRITES=false;
select * from (select cast('2018-12-11 09:59:37' as timestamp) as ts) tbl where 
cast(ts as varchar(10)) = '2018-12-11';
{noformat}

output:
{noformat}
+---------------------+
| ts                  |
+---------------------+
| 2018-12-11 09:59:37 |
+---------------------+
{noformat}
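
The cast is not actually redundant, because CAST(... AS VARCHAR(10)) truncates 
the formatted timestamp before the comparison. A plain-Java analogy of the two 
predicates (illustrative only, not Impala code):

{code:java}
public class CastTruncationDemo {
  public static void main(String[] args) {
    String ts = "2018-12-11 09:59:37";       // the formatted timestamp value
    String truncated = ts.substring(0, 10);  // ~ CAST(ts AS VARCHAR(10))
    // With the cast: the truncated value matches the literal.
    System.out.println(truncated.equals("2018-12-11"));  // true
    // With the cast rewritten away: the full value no longer matches.
    System.out.println(ts.equals("2018-12-11"));         // false
  }
}
{code}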






[jira] [Work started] (IMPALA-7728) Impala Doc: Add the Changing Privileges section in Impala Sentry doc

2018-12-11 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7728 started by Alex Rodoni.
---
> Impala Doc: Add the Changing Privileges section in Impala Sentry doc
> 
>
> Key: IMPALA-7728
> URL: https://issues.apache.org/jira/browse/IMPALA-7728
> Project: IMPALA
>  Issue Type: Task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
> Fix For: Impala 3.1.0
>
>







[jira] [Updated] (IMPALA-7728) Impala Doc: Add the Changing Privileges section in Impala Sentry doc

2018-12-11 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-7728:

Description: https://gerrit.cloudera.org/#/c/12071/

> Impala Doc: Add the Changing Privileges section in Impala Sentry doc
> 
>
> Key: IMPALA-7728
> URL: https://issues.apache.org/jira/browse/IMPALA-7728
> Project: IMPALA
>  Issue Type: Task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> https://gerrit.cloudera.org/#/c/12071/






[jira] [Resolved] (IMPALA-5821) Distinguish numeric types and show implicit cast in EXTENDED explain plans

2018-12-11 Thread Andrew Sherman (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman resolved IMPALA-5821.

   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Distinguish numeric types and show implicit cast in EXTENDED explain plans
> --
>
> Key: IMPALA-5821
> URL: https://issues.apache.org/jira/browse/IMPALA-5821
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 2.8.0
>Reporter: Matthew Jacobs
>Assignee: Andrew Sherman
>Priority: Minor
>  Labels: supportability, usability
> Fix For: Impala 3.1.0
>
>
> In this plan, it wasn't clear that the constant in the predicate was being 
> evaluated as a DOUBLE. The left-hand side then required an implicit cast, and 
> the predicate couldn't be pushed to Kudu:
> {code}
> [localhost:21000] > explain select * from functional_kudu.alltypestiny where 
> bigint_col < 1000 / 100;
> Query: explain select * from functional_kudu.alltypestiny where bigint_col < 
> 1000 / 100
> +---------------------------------------------+
> | Explain String                              |
> +---------------------------------------------+
> | Per-Host Resource Reservation: Memory=0B    |
> | Per-Host Resource Estimates: Memory=10.00MB |
> | Codegen disabled by planner                 |
> |                                             |
> | PLAN-ROOT SINK                              |
> | |                                           |
> | 00:SCAN KUDU [functional_kudu.alltypestiny] |
> |    predicates: bigint_col < 10              |
> +---------------------------------------------+
> {code}
> We should make this clearer by printing the expression in a way that shows 
> it's being interpreted as a DOUBLE, e.g. by wrapping it in a cast.






[jira] [Resolved] (IMPALA-7744) In PlannerTest print the name of a .test file if it diffs

2018-12-11 Thread Andrew Sherman (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman resolved IMPALA-7744.

Resolution: Won't Fix

> In PlannerTest print the name of a .test file if it diffs
> -
>
> Key: IMPALA-7744
> URL: https://issues.apache.org/jira/browse/IMPALA-7744
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>Priority: Minor
>
> PlannerTest runs queries from .test files and writes a .test file containing 
> the output under $IMPALA_FE_TEST_LOGS_DIR. It is sometimes hard to tell 
> which .test file is causing problems. If the test output causes a test 
> failure due to a diff, print the name of the culprit .test file.
> Separated out from IMPALA-5821.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-6658) Parquet RLE encoding can waste space with small repeated runs

2018-12-11 Thread Andrew Sherman (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman resolved IMPALA-6658.

   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Parquet RLE encoding can waste space with small repeated runs
> -
>
> Key: IMPALA-6658
> URL: https://issues.apache.org/jira/browse/IMPALA-6658
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Csaba Ringhofer
>Assignee: Andrew Sherman
>Priority: Minor
>  Labels: parquet, ramp-up
> Fix For: Impala 3.1.0
>
>
> Currently RleEncoder creates repeated runs from 8 repeated values, which can 
> be less space efficient than bit-packing if the bit width is 1 or 2. In the 
> worst case, the whole data page can be ~2X larger if the bit width is 1, and 
> ~1.25X larger if the bit width is 2, compared to bit-packing.
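> An illustrative worst case for bit width 1 (sketch arithmetic, assuming the 
> current 8-value grouping and the per-run length byte described below):
> {noformat}
> RLE run of 8 identical values: 1 length byte + 1 value byte = 2 bytes
> bit-packing those 8 values:    8 values x 1 bit = 1 byte of payload
> {noformat}
> A page consisting entirely of such runs is therefore ~2X larger with RLE.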
> A comment in rle_encoding.h gives different numbers, but it probably does not 
> account for the overhead of splitting long runs into smaller ones (every run 
> adds +1 byte for its length): 
> [https://github.com/apache/impala/blob/8079cd9d2a87051f81a41910b74fab15e35f36ea/be/src/util/rle-encoding.h#L62]
> Note that if the data page is compressed, this size difference probably 
> disappears, but the larger uncompressed buffer size can still affect 
> performance.
> Parquet RLE encoding is described here: 
> [https://github.com/apache/parquet-format/blob/master/Encodings.md#run-length-encoding-bit-packing-hybrid-rle-3]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-7801) Remove toSql() from ParseNode interface

2018-12-11 Thread Andrew Sherman (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman resolved IMPALA-7801.

Resolution: Won't Fix

> Remove toSql() from ParseNode interface
> ---
>
> Key: IMPALA-7801
> URL: https://issues.apache.org/jira/browse/IMPALA-7801
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>Priority: Minor
>
> In IMPALA-5821 the method "toSql(ToSqlOptions)" was added to ParseNode, to 
> allow options to be passed when generating SQL from a parse tree. Now that 
> this method is available, remove the old "toSql()" method and have all 
> callers call the new method instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Reopened] (IMPALA-7728) Impala Doc: Add the Changing Privileges section in Impala Sentry doc

2018-12-11 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni reopened IMPALA-7728:
-

> Impala Doc: Add the Changing Privileges section in Impala Sentry doc
> 
>
> Key: IMPALA-7728
> URL: https://issues.apache.org/jira/browse/IMPALA-7728
> Project: IMPALA
>  Issue Type: Task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
> Fix For: Impala 3.1.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Closed] (IMPALA-7728) Impala Doc: Add the Changing Privileges section in Impala Sentry doc

2018-12-11 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni closed IMPALA-7728.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Impala Doc: Add the Changing Privileges section in Impala Sentry doc
> 
>
> Key: IMPALA-7728
> URL: https://issues.apache.org/jira/browse/IMPALA-7728
> Project: IMPALA
>  Issue Type: Task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
> Fix For: Impala 3.1.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7959) Trace and visualize concurrent Catalog operations

2018-12-11 Thread bharath v (JIRA)
bharath v created IMPALA-7959:
-

 Summary: Trace and visualize concurrent Catalog operations
 Key: IMPALA-7959
 URL: https://issues.apache.org/jira/browse/IMPALA-7959
 Project: IMPALA
  Issue Type: Improvement
  Components: Catalog
Reporter: bharath v


The idea here is to leverage Chromium tracing (chrome://tracing) for visualizing 
Catalog events [1]. We could periodically dump the trace events in the JSON 
format [2], which can then be plotted with the visualizer tool [3].

This gives us a timeline view of concurrent catalog operations, which is useful 
for debugging contention and hangs in concurrent workloads.

This idea was borrowed from the Apache Kudu project [4].

[1] https://github.com/catapult-project/catapult/tree/master/tracing
[2] 
https://docs.google.com/document/d/1CvAClvFfyA5R-PhYUmn5OOQtYMH4h6I0nSsKchNAySU/preview
[3] 
https://www.chromium.org/developers/how-tos/trace-event-profiling-tool/using-frameviewer
[4] https://kudu.apache.org/docs/troubleshooting.html#kudu_tracing
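
For illustration, a single catalog operation rendered in the trace-event JSON 
format [2] could look like this (the operation name, ids and args here are 
hypothetical):
{noformat}
{"traceEvents": [
  {"name": "loadTableMetadata", "cat": "catalog", "ph": "X",
   "ts": 1544486400000000, "dur": 2500000, "pid": 1, "tid": 42,
   "args": {"table": "functional.alltypes"}}
]}
{noformat}
"ph": "X" marks a complete event with a start timestamp and a duration in 
microseconds, which is all the timeline view needs.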



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Closed] (IMPALA-2691) Missing examples of using a UDAF with a different intermediate type

2018-12-11 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni closed IMPALA-2691.
---
   Resolution: Fixed
Fix Version/s: impala 2.3

> Missing examples of using a UDAF with a different intermediate type
> ---
>
> Key: IMPALA-2691
> URL: https://issues.apache.org/jira/browse/IMPALA-2691
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Docs
>Affects Versions: Impala 2.2.4
>Reporter: Chris Channing
>Assignee: Alex Rodoni
>Priority: Minor
>  Labels: usability
> Fix For: impala 2.3
>
> Attachments: udaf-sample.tar.gz
>
>
> The [current 
> documentation|http://www.cloudera.com/content/www/en-us/documentation/enterprise/5-4-x/topics/impala_create_function.html]
>  for UDAFs has a reference to 'INTERMEDIATE' but does not show how it should be 
> used during function registration. Since [~tarmstrong] added the facility in the 
> FE ([IMPALA-1829|https://issues.cloudera.org/browse/IMPALA-1829]) to allow 
> function registrations with an intermediate type in the 2.3 release, we should 
> provide examples of what this looks like, in particular where a fixed 
> buffer type is required.
> To facilitate this, I have attached an example UDAF which shows how a fixed 
> buffer intermediate type can be used.
> DDL:
> {code}
> CREATE AGGREGATE FUNCTION myavgfn(double) RETURNS double INTERMEDIATE 
> char(16) LOCATION '/foo/libudasample.so' INIT_FN='AvgInit' 
> UPDATE_FN='AvgUpdate' MERGE_FN='AvgMerge' FINALIZE_FN='AvgFinalize';
> {code}
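> Once registered, the aggregate is invoked like any built-in, e.g. {{SELECT 
> myavgfn(c1) FROM t}} (table and column names here are placeholders).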
> Code/Build scripts: refer to the attachment.
> To build the UDAF, simply execute:
> {code}
> cmake .
> make
> {code}
> The output will be generated in build/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7958) Instrument catalog lock objects to dump diagnostics

2018-12-11 Thread bharath v (JIRA)
bharath v created IMPALA-7958:
-

 Summary: Instrument catalog lock objects to dump diagnostics
 Key: IMPALA-7958
 URL: https://issues.apache.org/jira/browse/IMPALA-7958
 Project: IMPALA
  Issue Type: Improvement
  Components: Catalog
Affects Versions: Impala 3.1.0
Reporter: bharath v


If a particular thread holds a given lock for longer than X ms 
(configurable), we should dump the following information to the logs:

- The thread stack of the thread holding the lock
- The thread stacks of the threads blocked on this lock

We initially need to do this for the re-entrant table lock protecting 
concurrent table updates. We could also do it for catalogVersionLock_, but we 
don't anticipate much contention around it with the v2 architecture.

The goal is to make it easier to root-cause Catalog hangs, especially in the 
context of concurrent table updates.
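
A minimal Java sketch of the kind of instrumentation meant here (the class and 
logging are hypothetical; catalogd would wrap its own locks and use its normal 
logger):
{code}
import java.util.concurrent.locks.ReentrantLock;

// Logs the holder's and waiters' stacks when the lock was held past a threshold.
public class InstrumentedLock extends ReentrantLock {
  private final long thresholdMs;
  private volatile long acquiredAtMs;

  public InstrumentedLock(long thresholdMs) { this.thresholdMs = thresholdMs; }

  @Override
  public void lock() {
    super.lock();
    if (getHoldCount() == 1) acquiredAtMs = System.currentTimeMillis();
  }

  @Override
  public void unlock() {
    if (getHoldCount() == 1) {
      long heldMs = System.currentTimeMillis() - acquiredAtMs;
      if (heldMs > thresholdMs) dumpDiagnostics(heldMs);
    }
    super.unlock();
  }

  private void dumpDiagnostics(long heldMs) {
    StringBuilder sb = new StringBuilder(
        "Lock held " + heldMs + "ms by " + Thread.currentThread().getName() + "\n");
    for (StackTraceElement e : Thread.currentThread().getStackTrace()) {
      sb.append("  at ").append(e).append('\n');
    }
    // getQueuedThreads() is protected in ReentrantLock, hence the subclass.
    for (Thread t : getQueuedThreads()) {
      sb.append("blocked: ").append(t.getName()).append('\n');
      for (StackTraceElement e : t.getStackTrace()) sb.append("  at ").append(e).append('\n');
    }
    System.err.println(sb);
  }
}
{code}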



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7450) catalogd should use thread names to make jstack more readable

2018-12-11 Thread bharath v (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bharath v updated IMPALA-7450:
--
Labels: supportability  (was: )

> catalogd should use thread names to make jstack more readable
> -
>
> Key: IMPALA-7450
> URL: https://issues.apache.org/jira/browse/IMPALA-7450
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>  Labels: supportability
>
> Currently when long refresh or DDL operations are being processed, it's hard 
> to understand what's going on when looking at a jstack. We should have such 
> potentially-long-running operations temporarily modify the current thread's 
> name to indicate what action is being taken so we can debug more easily.
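> A minimal sketch of the pattern (the helper and operation names are 
> hypothetical, not the actual Impala code):
> {code}
> public final class ThreadNames {
>   // Temporarily rename the current thread so long operations show up in jstack.
>   public static <T> T withName(String suffix, java.util.function.Supplier<T> op) {
>     Thread t = Thread.currentThread();
>     String oldName = t.getName();
>     t.setName(oldName + " [" + suffix + "]");
>     try {
>       return op.get();
>     } finally {
>       t.setName(oldName);  // restore: catalogd worker threads are reused
>     }
>   }
> }
> // e.g. ThreadNames.withName("refreshing db.tbl", () -> refreshTable());
> {code}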



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-7939) Impala shell not displaying results for a CTE query.

2018-12-11 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya resolved IMPALA-7939.
--
   Resolution: Fixed
Fix Version/s: Impala 3.2.0

> Impala shell not displaying results for a CTE query.
> 
>
> Key: IMPALA-7939
> URL: https://issues.apache.org/jira/browse/IMPALA-7939
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.1.0
>Reporter: Anuj Phadke
>Assignee: Fredy Wijaya
>Priority: Critical
> Fix For: Impala 3.2.0
>
>
> {noformat}
> 1.
> [localhost:21000] > CREATE TABLE test (cp_master_id STRING, data_status 
> STRING) stored as parquet;
>  Query: CREATE TABLE test (cp_master_id STRING, data_status STRING) stored as 
> parquet
>  Fetched 0 row(s) in 0.03s
> 2. 
> [localhost:21000] > insert into test (cp_master_id, data_status) values 
> ('111','NEWREC');
>  Query: insert into test (cp_master_id, data_status) values 
> ('111','NEWREC')
>  Query submitted at: 2018-12-06 14:02:21 (Coordinator: 
> [http://anuj-OptiPlex-9020:25000|http://anuj-optiplex-9020:25000/])
>  Query progress can be monitored at: 
> [http://anuj-OptiPlex-9020:25000/query_plan?query_id=14182abf0a7e4bb:8788cdab|http://anuj-optiplex-9020:25000/query_plan?query_id=14182abf0a7e4bb:8788cdab]
>  Modified 1 row(s) in 4.14s
> ***
> 3.
>  [localhost:21000] > WITH tbl_incoming
>  > AS
>  > (
>  > SELECT cp_master_id, data_status FROM test WHERE (data_status = 'NEWREC' 
> OR data_status='UPDATE')
>  > )
>  > select * from test;
>  Query: WITH tbl_incoming
>  AS
>  (
>  SELECT cp_master_id, data_status FROM test WHERE (data_status = 'NEWREC' OR 
> data_status='UPDATE')
>  )
>  select * from test
>  Modified 0 row(s) in 0.12s
>  [localhost:21000] >
>  
> 4.
> [localhost:21000] > SELECT cp_master_id, data_status FROM test WHERE 
> (data_status = 'NEWREC' OR data_status='UPDATE');
>  Query: SELECT cp_master_id, data_status FROM test WHERE (data_status = 
> 'NEWREC' OR data_status='UPDATE')
>  Query submitted at: 2018-12-06 14:05:48 (Coordinator: 
> [http://anuj-OptiPlex-9020:25000|http://anuj-optiplex-9020:25000/])
>  Query progress can be monitored at: 
> [http://anuj-OptiPlex-9020:25000/query_plan?query_id=4b49d50ec0b973c1:ce474a15|http://anuj-optiplex-9020:25000/query_plan?query_id=4b49d50ec0b973c1:ce474a15]
>  +--------------+-------------+
>  | cp_master_id | data_status |
>  +--------------+-------------+
>  | 111          | NEWREC      |
>  +--------------+-------------+
>  Fetched 1 row(s) in 0.12s
> {noformat}
>  
> I think the bug is in the regex here:
> https://github.com/apache/impala/blob/master/shell/impala_shell.py#L1157
> It matches the literal "UPDATE" inside the WITH clause:
> https://github.com/apache/impala/blob/master/shell/impala_shell.py#L139
> As a result the whole statement is treated as a DML, so the shell prints 
> "Modified 0 row(s)" instead of fetching the result set.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-4212) Explain plan should show the expression evaluated in all exec node

2018-12-11 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-4212:
--
Description: 
*Problem Statement*
The explain plan does not show the expressions evaluated at HDFS table sinks and 
UNIONs, nor the final result expressions evaluated at the coordinator. These 
expressions can be very complex. If they are hidden inside a view (e.g. insert 
into dst select * from view), it's very hard to tell from the query profile or 
the plan that a slow HDFS table sink is due to complex expressions.

*Proposed Solution*
The explain plan should show the expressions evaluated at exec nodes and data 
sinks, as well as the final result exprs. Then it would be quite obvious that a 
slow sink or union is due to these expressions. 

  was:
*Problem Statement*
The explain plan does not show the expressions evaluated at HDFS table sinks and 
UNIONs, nor the final result expressions evaluated at the coordinator. These 
expressions can be very complex. If they are hidden inside a view (e.g. insert 
into dst select * from view), it's very hard to tell from the query profile or 
the plan that a slow HDFS table sink is due to complex expressions.

*Proposed Solution*
1. The explain plan should show the expressions evaluated at exec nodes and 
data sinks, as well as the final result exprs. Then it would be quite obvious 
that a slow sink or union is due to these expressions. 
2. (Optional) In the query profile, report the time spent evaluating the 
expressions.


> Explain plan should show the expression evaluated in all exec node
> --
>
> Key: IMPALA-4212
> URL: https://issues.apache.org/jira/browse/IMPALA-4212
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 2.7.0
>Reporter: Alan Choi
>Assignee: Henry Robinson
>Priority: Major
>  Labels: supportability
>
> *Problem Statement*
> The explain plan does not show the expressions evaluated at HDFS table sinks 
> and UNIONs, nor the final result expressions evaluated at the coordinator. 
> These expressions can be very complex. If they are hidden inside a view (e.g. 
> insert into dst select * from view), it's very hard to tell from the query 
> profile or the plan that a slow HDFS table sink is due to complex expressions.
> *Proposed Solution*
> The explain plan should show the expressions evaluated at exec nodes and data 
> sinks, as well as the final result exprs. Then it would be quite obvious that 
> a slow sink or union is due to these expressions. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7957) UNION ALL query returns incorrect results

2018-12-11 Thread Luis E Martinez-Poblete (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luis E Martinez-Poblete updated IMPALA-7957:

Description: 
Synopsis:
=
UNION ALL query returns incorrect results

Problem:

Customer reported a UNION ALL query returning incorrect results. The UNION ALL 
query has 2 legs, but Impala is only returning information from one leg.

Issue can be reproduced in the latest version of Impala. Below is the 
reproduction case:

{noformat}
create table mytest_t (c1 timestamp, c2 timestamp, c3 int, c4 int);
insert into mytest_t values (now(), ADDDATE (now(),1), 1,1);
insert into mytest_t values (now(), ADDDATE (now(),1), 2,2);
insert into mytest_t values (now(), ADDDATE (now(),1), 3,3);

SELECT t.c1
FROM
 (SELECT c1, c2
 FROM mytest_t) t
LEFT JOIN
 (SELECT c1, c2
 FROM mytest_t
 WHERE c2 = c1) t2 ON (t.c2 = t2.c2)
UNION ALL
VALUES (NULL)
{noformat}

The above query produces the following execution plan:

{noformat}
+-------------------------------------------------------------------------------------+
| Explain String |
+-------------------------------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=34.02MB Threads=5 |
| Per-Host Resource Estimates: Memory=2.06GB |
| WARNING: The following tables are missing relevant table and/or column statistics. |
| default.mytest_t |
| |
| PLAN-ROOT SINK |
| | |
| 06:EXCHANGE [UNPARTITIONED] |
| | |
| 00:UNION |
| |  constant-operands=1 |
| | |
| 04:SELECT |
| |  predicates: default.mytest_t.c1 = default.mytest_t.c2 |
| | |
| 03:HASH JOIN [LEFT OUTER JOIN, BROADCAST] |
| |  hash predicates: c2 = c2 |
| | |
| |--05:EXCHANGE [BROADCAST] |
| |  | |
| |  02:SCAN HDFS [default.mytest_t] |
| |     partitions=1/1 files=3 size=192B |
| |     predicates: c2 = c1 |
| | |
| 01:SCAN HDFS [default.mytest_t] |
|    partitions=1/1 files=3 size=192B |
+-------------------------------------------------------------------------------------+
{noformat}

The issue is in operator 4:

{noformat}
| 04:SELECT |
| | predicates: default.mytest_t.c1 = default.mytest_t.c2 |
{noformat}

It's definitely a bug in predicate placement: the c1 = c2 predicate shouldn't 
be evaluated outside the right branch of the LEFT JOIN. Applied above the outer 
join, it filters out every row coming from the left leg (no row in the test 
data has c1 = c2, and the NULLs produced for non-matching rows fail the 
predicate too), so only the VALUES (NULL) leg of the UNION ALL is returned.

Thanks,
Luis Martinez.

  was:
Synopsis:
=
UNION ALL query returns incorrect results

Problem:

Customer reported a UNION ALL query returning incorrect results. The UNION ALL 
query has 2 legs, but Impala is only returning information from one leg.

Issue can be reproduced in the latest version of Impala. Below is the 
reproduction case:

create table mytest_t (c1 timestamp, c2 timestamp, c3 int, c4 int);
insert into mytest_t values (now(), ADDDATE (now(),1), 1,1);
insert into mytest_t values (now(), ADDDATE (now(),1), 2,2);
insert into mytest_t values (now(), ADDDATE (now(),1), 3,3);

SELECT t.c1
FROM
 (SELECT c1, c2
 FROM mytest_t) t
LEFT JOIN
 (SELECT c1, c2
 FROM mytest_t
 WHERE c2 = c1) t2 ON (t.c2 = t2.c2)
UNION ALL
VALUES (NULL)


The above query produces the following execution plan:

++
| Explain String |
++
| Max 

[jira] [Created] (IMPALA-7957) UNION ALL query returns incorrect results

2018-12-11 Thread Luis E Martinez-Poblete (JIRA)
Luis E Martinez-Poblete created IMPALA-7957:
---

 Summary: UNION ALL query returns incorrect results
 Key: IMPALA-7957
 URL: https://issues.apache.org/jira/browse/IMPALA-7957
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 2.12.0
Reporter: Luis E Martinez-Poblete


Synopsis:
=
UNION ALL query returns incorrect results

Problem:

Customer reported a UNION ALL query returning incorrect results. The UNION ALL 
query has 2 legs, but Impala is only returning information from one leg.

Issue can be reproduced in the latest version of Impala. Below is the 
reproduction case:

create table mytest_t (c1 timestamp, c2 timestamp, c3 int, c4 int);
insert into mytest_t values (now(), ADDDATE (now(),1), 1,1);
insert into mytest_t values (now(), ADDDATE (now(),1), 2,2);
insert into mytest_t values (now(), ADDDATE (now(),1), 3,3);

SELECT t.c1
FROM
 (SELECT c1, c2
 FROM mytest_t) t
LEFT JOIN
 (SELECT c1, c2
 FROM mytest_t
 WHERE c2 = c1) t2 ON (t.c2 = t2.c2)
UNION ALL
VALUES (NULL)


The above query produces the following execution plan:

+-------------------------------------------------------------------------------------+
| Explain String |
+-------------------------------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=34.02MB Threads=5 |
| Per-Host Resource Estimates: Memory=2.06GB |
| WARNING: The following tables are missing relevant table and/or column statistics. |
| default.mytest_t |
| |
| PLAN-ROOT SINK |
| | |
| 06:EXCHANGE [UNPARTITIONED] |
| | |
| 00:UNION |
| |  constant-operands=1 |
| | |
| 04:SELECT |
| |  predicates: default.mytest_t.c1 = default.mytest_t.c2 |
| | |
| 03:HASH JOIN [LEFT OUTER JOIN, BROADCAST] |
| |  hash predicates: c2 = c2 |
| | |
| |--05:EXCHANGE [BROADCAST] |
| |  | |
| |  02:SCAN HDFS [default.mytest_t] |
| |     partitions=1/1 files=3 size=192B |
| |     predicates: c2 = c1 |
| | |
| 01:SCAN HDFS [default.mytest_t] |
|    partitions=1/1 files=3 size=192B |
+-------------------------------------------------------------------------------------+

The issue is in operator 4:

| 04:SELECT |
| | predicates: default.mytest_t.c1 = default.mytest_t.c2 |

It's definitely a bug in predicate placement: the c1 = c2 predicate 
shouldn't be evaluated outside the right branch of the LEFT JOIN.

Thanks,
Luis Martinez.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7467) Port ExecQueryFInstances() to KRPC

2018-12-11 Thread Michael Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Ho updated IMPALA-7467:
---
Target Version: Product Backlog  (was: Impala 3.2.0)

> Port ExecQueryFInstances() to KRPC
> --
>
> Key: IMPALA-7467
> URL: https://issues.apache.org/jira/browse/IMPALA-7467
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 3.1.0
>Reporter: Michael Ho
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>
> Port ExecQueryFInstances() from Thrift to KRPC.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-3380) Add TCP timeouts to all RPCs that don't block

2018-12-11 Thread Michael Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Ho updated IMPALA-3380:
---
Target Version: Product Backlog, Impala 2.13.0  (was: Impala 2.13.0)

> Add TCP timeouts to all RPCs that don't block
> -
>
> Key: IMPALA-3380
> URL: https://issues.apache.org/jira/browse/IMPALA-3380
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 2.5.0
>Reporter: Henry Robinson
>Assignee: Michael Ho
>Priority: Minor
>  Labels: observability, supportability
>
> Most RPCs should not take an unbounded amount of time to complete (the 
> exception is {{TransmitData()}}, but that may also change). To handle hang 
> failures on the remote machine, we should add timeouts to every RPC (so, 
> really, every RPC client), and handle the timeout failure.
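> At the transport level this is just a socket timeout; a sketch with a bare 
> Thrift client (illustrative only, Impala's RPC wrappers differ):
> {code}
> import org.apache.thrift.transport.TSocket;
> import org.apache.thrift.transport.TTransportException;
>
> public final class TimeoutExample {
>   public static TSocket openWithTimeout() throws TTransportException {
>     // Connect/read/write calls that stall longer than 30s fail with
>     // TTransportException, which the caller must handle as the timeout failure.
>     TSocket socket = new TSocket("impalad-host", 22000, /* timeoutMs */ 30000);
>     socket.open();
>     return socket;
>   }
> }
> {code}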



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-5119) Don't make RPCs from Coordinator::UpdateBackendExecStatus()

2018-12-11 Thread Michael Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Ho updated IMPALA-5119:
---
Target Version: Product Backlog

> Don't make RPCs from Coordinator::UpdateBackendExecStatus()
> ---
>
> Key: IMPALA-5119
> URL: https://issues.apache.org/jira/browse/IMPALA-5119
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 2.9.0
>Reporter: Henry Robinson
>Assignee: Dan Hecht
>Priority: Major
>
> If a backend reports a bad status, {{UpdateFragmentExecStatus()}} will call 
> {{UpdateStatus()}}, which takes {{Coordinator::lock_}} and then calls 
> {{Cancel()}}. That method issues one RPC per fragment instance.
> In KRPC, doing so much work from {{UpdateFragmentExecStatus()}} - which is an 
> RPC handler - is a bad idea, even if the RPCs are issued asynchronously. 
> There's still some serialization cost.
> It's also a bad idea to do all this work while holding {{lock_}}. We should 
> address both of these to ensure scalability of the cancellation path.
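> A generic sketch of the direction (illustrative Java, not the actual C++ 
> coordinator code): record the failure in the handler, return immediately, and 
> fan out cancellation on an executor without holding the lock:
> {code}
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
>
> class CoordinatorSketch {
>   private final ExecutorService cancelPool = Executors.newSingleThreadExecutor();
>
>   // RPC handler: do the minimum; never issue RPCs or hold locks here.
>   void updateBackendExecStatus(boolean ok) {
>     if (!ok) cancelPool.submit(this::cancelAllFragments);
>   }
>
>   void cancelAllFragments() { /* one cancel RPC per fragment instance */ }
> }
> {code}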



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-6788) Query fragments can spend lots of time starting up then fail right after "starting" all backends

2018-12-11 Thread Michael Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Ho updated IMPALA-6788:
---
Target Version: Product Backlog

> Query fragments can spend lots of time starting up then fail right after 
> "starting" all backends
> 
>
> Key: IMPALA-6788
> URL: https://issues.apache.org/jira/browse/IMPALA-6788
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 2.12.0
>Reporter: Mostafa Mokhtar
>Assignee: Dan Hecht
>Priority: Major
>  Labels: krpc, rpc
> Attachments: connect_thread_busy_queries_failing.txt, 
> impalad.va1007.foo.com.impala.log.INFO.20180401-200453.1800807.zip
>
>
> Logs from a large cluster show that query startup can take a long time; once 
> startup completes, the query is cancelled because one of the intermediate RPCs 
> failed.
> It is not clear what the right fix is, since fragments are started 
> asynchronously; possibly a timeout?
> {code}
> I0401 21:25:30.776803 1830900 coordinator.cc:99] Exec() 
> query_id=334cc7dd9758c36c:ec38aeb4 stmt=with customer_total_return as
> I0401 21:25:30.813993 1830900 coordinator.cc:357] starting execution on 644 
> backends for query_id=334cc7dd9758c36c:ec38aeb4
> I0401 21:29:58.406466 1830900 coordinator.cc:370] started execution on 644 
> backends for query_id=334cc7dd9758c36c:ec38aeb4
> I0401 21:29:58.412132 1830900 coordinator.cc:896] Cancel() 
> query_id=334cc7dd9758c36c:ec38aeb4
> I0401 21:29:59.188817 1830900 coordinator.cc:906] CancelBackends() 
> query_id=334cc7dd9758c36c:ec38aeb4, tried to cancel 643 backends
> I0401 21:29:59.189177 1830900 coordinator.cc:1092] Release admission control 
> resources for query_id=334cc7dd9758c36c:ec38aeb4
> {code}
> {code}
> I0401 21:23:48.218379 1830386 coordinator.cc:99] Exec() 
> query_id=e44d553b04d47cfb:28f06bb8 stmt=with customer_total_return as
> I0401 21:23:48.270226 1830386 coordinator.cc:357] starting execution on 640 
> backends for query_id=e44d553b04d47cfb:28f06bb8
> I0401 21:29:58.402195 1830386 coordinator.cc:370] started execution on 640 
> backends for query_id=e44d553b04d47cfb:28f06bb8
> I0401 21:29:58.403818 1830386 coordinator.cc:896] Cancel() 
> query_id=e44d553b04d47cfb:28f06bb8
> I0401 21:29:59.255903 1830386 coordinator.cc:906] CancelBackends() 
> query_id=e44d553b04d47cfb:28f06bb8, tried to cancel 639 backends
> I0401 21:29:59.256251 1830386 coordinator.cc:1092] Release admission control 
> resources for query_id=e44d553b04d47cfb:28f06bb8
> {code}
> Checked the coordinator and threads appear to be spending lots of time 
> waiting on exec_complete_barrier_
> {code}
> #0  0x7fd928c816d5 in pthread_cond_wait@@GLIBC_2.3.2 () from 
> /lib64/libpthread.so.0
> #1  0x01222944 in impala::Promise::Get() ()
> #2  0x01220d7b in impala::Coordinator::StartBackendExec() ()
> #3  0x01221c87 in impala::Coordinator::Exec() ()
> #4  0x00c3a925 in 
> impala::ClientRequestState::ExecQueryOrDmlRequest(impala::TQueryExecRequest 
> const&) ()
> #5  0x00c41f7e in 
> impala::ClientRequestState::Exec(impala::TExecRequest*) ()
> #6  0x00bff597 in 
> impala::ImpalaServer::ExecuteInternal(impala::TQueryCtx const&, 
> std::shared_ptr, bool*, 
> std::shared_ptr*) ()
> #7  0x00c061d9 in impala::ImpalaServer::Execute(impala::TQueryCtx*, 
> std::shared_ptr, 
> std::shared_ptr*) ()
> #8  0x00c561c5 in impala::ImpalaServer::query(beeswax::QueryHandle&, 
> beeswax::Query const&) ()
> /StartBackendExec
> #11 0x00d60c9a in boost::detail::thread_data void (*)(std::string const&, std::string const&, boost::function, 
> impala::ThreadDebugInfo const*, impala::Promise*), 
> boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> > > >::run() ()
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7956) Use Impala SQL parser in Impala shell

2018-12-11 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717692#comment-16717692
 ] 

Tim Armstrong commented on IMPALA-7956:
---

Two options I mentioned on the CR:
* Get rid of the is_dml logic and just display whatever result set the server 
sends back (this is what other clients do).
* Get the statement type from Impala after the query is planned.

> Use Impala SQL parser in Impala shell
> -
>
> Key: IMPALA-7956
> URL: https://issues.apache.org/jira/browse/IMPALA-7956
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Clients
>Affects Versions: Impala 3.2.0
>Reporter: Fredy Wijaya
>Priority: Major
>
> Impala shell uses a regular expression instead of a real SQL parser to 
> determine whether a WITH clause wraps a DML statement: 
> https://github.com/apache/impala/blob/ecf12bec42e11262b88dc0993e375fe4d8acaafb/shell/impala_shell.py#L1157.
>  We need to investigate using the Impala SQL parser instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-7945) test_hdfs_timeout.py fails on Centos6/python2.6

2018-12-11 Thread Joe McDonnell (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-7945.
---
   Resolution: Fixed
Fix Version/s: Impala 3.2.0

> test_hdfs_timeout.py fails on Centos6/python2.6
> ---
>
> Key: IMPALA-7945
> URL: https://issues.apache.org/jira/browse/IMPALA-7945
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.2.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 3.2.0
>
>
> custom_cluster/test_hdfs_timeout.py uses subprocess.check_output, which does 
> not exist in Python 2.6. This causes the test to fail with:
> {noformat}
> custom_cluster/test_hdfs_timeout.py:24: in 
> from subprocess import check_call, check_output
> E   ImportError: cannot import name check_output{noformat}
> This was introduced in my recent code change for IMPALA-7738.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7956) Use Impala SQL parser in Impala shell

2018-12-11 Thread Fredy Wijaya (JIRA)
Fredy Wijaya created IMPALA-7956:


 Summary: Use Impala SQL parser in Impala shell
 Key: IMPALA-7956
 URL: https://issues.apache.org/jira/browse/IMPALA-7956
 Project: IMPALA
  Issue Type: Improvement
  Components: Clients
Affects Versions: Impala 3.2.0
Reporter: Fredy Wijaya


Impala shell uses a regular expression instead of a real SQL parser to determine 
whether a WITH clause wraps a DML statement: 
https://github.com/apache/impala/blob/ecf12bec42e11262b88dc0993e375fe4d8acaafb/shell/impala_shell.py#L1157.
We need to investigate using the Impala SQL parser instead.
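
A minimal demonstration of the false positive (the pattern is hypothetical, a 
Java rendering of the same kind of check; the shell's actual regex lives in 
impala_shell.py):
{code}
import java.util.regex.Pattern;

public class DmlSniffDemo {
  public static void main(String[] args) {
    // Naive token-based "is this DML?" check, similar in spirit to the shell's regex.
    Pattern dml = Pattern.compile("\\b(insert|update|delete|upsert)\\b",
        Pattern.CASE_INSENSITIVE);
    String q = "WITH t AS (SELECT * FROM tbl WHERE status = 'UPDATE') SELECT * FROM t";
    // Prints true: the string literal 'UPDATE' triggers the match, so a plain
    // SELECT is misclassified as DML and its result set is never displayed.
    System.out.println(dml.matcher(q).find());
  }
}
{code}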



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7824) Running INVALIDATE METADATA with authorization enabled can cause a hang if Sentry is unavailable

2018-12-11 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya updated IMPALA-7824:
-
Description: 
When authorization is enabled and Sentry is unavailable, calling INVALIDATE 
METADATA will hang.

Steps to reproduce:
1. Start Impala with authorization
2. Kill Sentry
3. Run INVALIDATE METADATA

  was:
Steps to reproduce:
1. Start Impala with authorization
2. Kill Sentry
3. Run INVALIDATE METADATA


> Running INVALIDATE METADATA with authorization enabled can cause a hang if 
> Sentry is unavailable
> 
>
> Key: IMPALA-7824
> URL: https://issues.apache.org/jira/browse/IMPALA-7824
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Critical
> Fix For: Impala 3.1.0
>
>
> When authorization is enabled and Sentry is unavailable, calling INVALIDATE 
> METADATA will hang.
> Steps to reproduce:
> 1. Start Impala with authorization
> 2. Kill Sentry
> 3. Run INVALIDATE METADATA



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org