[jira] [Work started] (IMPALA-7031) Debug page "Cancel" action actually unregisters query

2019-03-28 Thread Alice Fan (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7031 started by Alice Fan.
-
> Debug page "Cancel" action actually unregisters query
> -
>
> Key: IMPALA-7031
> URL: https://issues.apache.org/jira/browse/IMPALA-7031
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Distributed Exec
>Affects Versions: Impala 3.0
>Reporter: Adriano
>Assignee: Alice Fan
>Priority: Major
>  Labels: query-lifecycle
> Attachments: Screen Shot 2018-07-20 at 10.19.42.png
>
>
> In big clusters with many jdbc/odbc users, scripts that automatically cancel 
> queries (e.g. long-running queries) are often implemented in order to save 
> resources (the scripts typically use the Impala WebUI).
> Typical Scenario:
>  # A jdbc/odbc client submits a query
>  # The Coordinator starts the query execution
>  # The query is cancelled from the Coordinator WebUI
>  # The jdbc/odbc client asks the Coordinator for the query status 
> (GetOperationStatus)
>  # The Coordinator answers "unknown query ID" (as the query was cancelled)
>  # From the client's perspective the query failed with "unknown query ID"
> Currently, if a running query is cancelled from the impalad WebUI, the client 
> will just receive an 'unknown query ID' error on the next 
> fetch/getOperationStatus attempt. It would be good to be able to explicitly 
> call out this case.
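The kind of cancellation script mentioned in the description can be sketched against the coordinator's debug WebUI; host, port, and query ID below are placeholders, and the exact endpoint path should be checked against your Impala version:

```shell
# Sketch of an automated cancel against the coordinator debug WebUI.
# COORD and QUERY_ID are placeholders; the coordinator debug server
# listens on port 25000 by default.
COORD="http://coordinator-host:25000"
QUERY_ID="1a2b3c4d5e6f7890:abcdef0123456789"
URL="${COORD}/cancel_query?query_id=${QUERY_ID}"
curl -s "$URL" || echo "cancel request failed"
```

After this call, the behavior described above kicks in: the client's next GetOperationStatus sees "unknown query ID" rather than an explicit cancelled state.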



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-8319) ThreadPoolTest.SynchronousThreadPoolTest failure + SIGSEGV on Centos 6

2019-03-28 Thread Joe McDonnell (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell reassigned IMPALA-8319:
-

Assignee: Joe McDonnell

> ThreadPoolTest.SynchronousThreadPoolTest failure + SIGSEGV on Centos 6
> --
>
> Key: IMPALA-8319
> URL: https://issues.apache.org/jira/browse/IMPALA-8319
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Critical
>  Labels: broken-build
>
> One run of Centos 6 tests saw this failure for 
> ThreadPoolTest.SynchronousThreadPoolTest:
> {noformat}
> [==] Running 2 tests from 1 test case.
> [--] Global test environment set-up.
> [--] 2 tests from ThreadPoolTest
> [ RUN ] ThreadPoolTest.BasicTest
> 19/03/17 16:26:21 INFO util.JvmPauseMonitor: Starting JVM pause monitor
> [ OK ] ThreadPoolTest.BasicTest (38 ms)
> [ RUN ] ThreadPoolTest.SynchronousThreadPoolTest
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-centos6/repos/Impala/be/src/util/thread-pool-test.cc:132:
>  Failure
> Value of: TErrorCode::THREAD_POOL_TASK_TIMED_OUT
> Actual: 124
> Expected: queued_task_status.code()
> Which is: 123
> [ FAILED ] ThreadPoolTest.SynchronousThreadPoolTest (109 ms)
> [--] 2 tests from ThreadPoolTest (148 ms total){noformat}
> This also produced a minidump indicating a SIGSEGV. It has the following 
> stack:
> {noformat}
> C [unifiedbetests+0x4846c93] 
> tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*,
>  unsigned long, int)+0x133
> C [unifiedbetests+0x4846d2c] 
> tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned 
> long)+0x1c
> C [unifiedbetests+0x499e5a0] operator delete[](void*)+0x3c0
> C [unifiedbetests+0x19b7232] 
> __gnu_cxx::new_allocator<std::string>::deallocate(std::string*, unsigned 
> long)+0x20
> C [unifiedbetests+0x19a9bb3] 
> std::allocator_traits<std::allocator<std::string> 
> >::deallocate(std::allocator<std::string>&, std::string*, unsigned long)+0x2b
> C [unifiedbetests+0x199d5ce] std::_Vector_base<std::string, 
> std::allocator<std::string> >::_M_deallocate(std::string*, unsigned long)+0x32
> C [unifiedbetests+0x1996037] std::_Vector_base<std::string, 
> std::allocator<std::string> >::~_Vector_base()+0x41
> C [unifiedbetests+0x1990791] std::vector<std::string, 
> std::allocator<std::string> >::~vector()+0x41
> C [unifiedbetests+0x28264de] impala::TMetricDef::~TMetricDef()+0x4e
> C [unifiedbetests+0x2549370] std::pair<std::string const, 
> impala::TMetricDef>::~pair()+0x1c
> C [unifiedbetests+0x254939a] void 
> __gnu_cxx::new_allocator<std::pair<std::string const, impala::TMetricDef> 
> >::destroy<std::pair<std::string const, impala::TMetricDef> 
> >(std::pair<std::string const, impala::TMetricDef>*)+0x1c
> C [unifiedbetests+0x25474a3] 
> std::enable_if<std::allocator_traits<std::allocator<std::pair<std::string 
> const, impala::TMetricDef> > >::__destroy_helper<std::pair<std::string 
> const, impala::TMetricDef> >::type::value, void>::type 
> std::allocator_traits<std::allocator<std::pair<std::string const, 
> impala::TMetricDef> > >::_S_destroy<std::pair<std::string const, 
> impala::TMetricDef> >(std::allocator<std::pair<std::string const, 
> impala::TMetricDef> >&, std::pair<std::string const, 
> impala::TMetricDef>*)+0x23
> C [unifiedbetests+0x2544d87] void 
> std::allocator_traits<std::allocator<std::pair<std::string const, 
> impala::TMetricDef> > >::destroy<std::pair<std::string const, 
> impala::TMetricDef> >(std::allocator<std::pair<std::string const, 
> impala::TMetricDef> >&, std::pair<std::string const, 
> impala::TMetricDef>*)+0x23
> C [unifiedbetests+0x2540841] std::_Rb_tree<std::string, 
> std::pair<std::string const, impala::TMetricDef>, 
> std::_Select1st<std::pair<std::string const, impala::TMetricDef> >, 
> std::less<std::string>, std::allocator<std::pair<std::string const, 
> impala::TMetricDef> > 
> >::_M_destroy_node(std::_Rb_tree_node<std::pair<std::string const, 
> impala::TMetricDef> >*)+0x37
> C [unifiedbetests+0x253ab1f] std::_Rb_tree<std::string, 
> std::pair<std::string const, impala::TMetricDef>, 
> std::_Select1st<std::pair<std::string const, impala::TMetricDef> >, 
> std::less<std::string>, std::allocator<std::pair<std::string const, 
> impala::TMetricDef> > 
> >::_M_erase(std::_Rb_tree_node<std::pair<std::string const, 
> impala::TMetricDef> >*)+0x53
> C [unifiedbetests+0x253aafc] std::_Rb_tree<std::string, 
> std::pair<std::string const, impala::TMetricDef>, 
> std::_Select1st<std::pair<std::string const, impala::TMetricDef> >, 
> std::less<std::string>, std::allocator<std::pair<std::string const, 
> impala::TMetricDef> > 
> >::_M_erase(std::_Rb_tree_node<std::pair<std::string const, 
> impala::TMetricDef> >*)+0x30
> C [unifiedbetests+0x253aafc] std::_Rb_tree<std::string, 
> std::pair<std::string const, impala::TMetricDef>, 
> std::_Select1st<std::pair<std::string const, impala::TMetricDef> >, 
> std::less<std::string>, std::allocator<std::pair<std::string const, 
> impala::TMetricDef> > 
> >::_M_erase(std::_Rb_tree_node<std::pair<std::string const, 
> impala::TMetricDef> >*)+0x30
> C [unifiedbetests+0x2825e56] std::_Rb_tree<std::string, 
> std::pair<std::string const, impala::TMetricDef>, 
> std::_Select1st<std::pair<std::string const, impala::TMetricDef> >, 
> std::less<std::string>, std::allocator<std::pair<std::string const, 
> impala::TMetricDef> > >::~_Rb_tree()+0x2a
> C [unifiedbetests+0x2824962] std::map<std::string, impala::TMetricDef, 
> std::less<std::string>, std::allocator<std::pair<std::string const, 
> impala::TMetricDef> > >::~map()+0x18
> C [unifiedbetests+0x282648e] 
> impala::MetricDefsConstants::~MetricDefsConstants()+0x18{noformat}
> This stack doesn't make much sense, because thread-pool-test doesn't use 
> MetricDefs. When I tested locally on Ubuntu 16.04, MetricDefs is not 
> instantiated. This is using the unified backend executable.




[jira] [Work stopped] (IMPALA-3816) Codegen perf-critical loops in Sorter

2019-03-28 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-3816 stopped by Tim Armstrong.
-
> Codegen perf-critical loops in Sorter
> -
>
> Key: IMPALA-3816
> URL: https://issues.apache.org/jira/browse/IMPALA-3816
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.7.0
>Reporter: Tim Armstrong
>Assignee: Tianyi Wang
>Priority: Minor
>  Labels: codegen
> Attachments: percentile query profile.txt, tpch_30.txt
>
>
> In the sorter, we codegen the comparator function but call it indirectly via 
> a function pointer. We should consider codegening the perf-critical loops so 
> that we can make the comparator function call direct and inlinable. Inlining 
> the comparison will be very beneficial if it is trivial, e.g. order by a 
> numeric column: I expect sorts on simple keys will get noticeably faster.
> We should also be able to get rid of FreeLocalAllocations() calls for most 
> comparators, although I'm not sure what the best way to approach that is.
> The Partition() loop is the most perf-critical, followed by InsertionSort().
> We also don't do this yet for the TopN node, see IMPALA-3815.
> Mostafa's analysis:
> While evaluating Sort performance I noticed that the codegened compare 
> function is not inlined which results in large overhead per row. 
> Expected speedup is 10-15%
> {code}
>   /// Returns a negative value if lhs is less than rhs, a positive value if 
> lhs is
>   /// greater than rhs, or 0 if they are equal. All exprs 
> (ordering_exprs_lhs_ and
>   /// ordering_exprs_rhs_) must have been prepared and opened before calling 
> this,
>   /// i.e. 'sort_key_exprs' in the constructor must have been opened.
>   int ALWAYS_INLINE Compare(const TupleRow* lhs, const TupleRow* rhs) const {
> return codegend_compare_fn_ == NULL ?
> CompareInterpreted(lhs, rhs) :
> (*codegend_compare_fn_)(ordering_expr_evals_lhs_.data(),
> ordering_expr_evals_rhs_.data(), lhs, rhs);
>   } 
> {code}
> From Perf
> {code}
>        │ bool Sorter::TupleSorter::Less(const TupleRow* lhs, const TupleRow* rhs) {
>   7.43 │   push   %rbp
>   3.23 │   mov    %rsp,%rbp
>   9.44 │   push   %r12
>   2.69 │   push   %rbx
>   3.89 │   mov    %rsi,%r12
>   2.98 │   mov    %rdi,%rbx
>   6.06 │   sub    $0x10,%rsp
>        │   --num_comparisons_till_free_;
>        │   DCHECK_GE(num_comparisons_till_free_, 0);
>        │   if (UNLIKELY(num_comparisons_till_free_ == 0)) {

[jira] [Commented] (IMPALA-3816) Codegen perf-critical loops in Sorter

2019-03-28 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804464#comment-16804464
 ] 

Tim Armstrong commented on IMPALA-3816:
---

I hacked on the follow-on patch a bit: https://gerrit.cloudera.org/#/c/12828/

> Codegen perf-critical loops in Sorter
> -
>
> Key: IMPALA-3816
> URL: https://issues.apache.org/jira/browse/IMPALA-3816
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.7.0
>Reporter: Tim Armstrong
>Assignee: Tianyi Wang
>Priority: Minor
>  Labels: codegen
> Attachments: percentile query profile.txt, tpch_30.txt
>
>
> In the sorter, we codegen the comparator function but call it indirectly via 
> a function pointer. We should consider codegening the perf-critical loops so 
> that we can make the comparator function call direct and inlinable. Inlining 
> the comparison will be very beneficial if it is trivial, e.g. order by a 
> numeric column: I expect sorts on simple keys will get noticeably faster.
> We should also be able to get rid of FreeLocalAllocations() calls for most 
> comparators, although I'm not sure what the best way to approach that is.
> The Partition() loop is the most perf-critical, followed by InsertionSort().
> We also don't do this yet for the TopN node, see IMPALA-3815.
> Mostafa's analysis:
> While evaluating Sort performance I noticed that the codegened compare 
> function is not inlined which results in large overhead per row. 
> Expected speedup is 10-15%
> {code}
>   /// Returns a negative value if lhs is less than rhs, a positive value if 
> lhs is
>   /// greater than rhs, or 0 if they are equal. All exprs 
> (ordering_exprs_lhs_ and
>   /// ordering_exprs_rhs_) must have been prepared and opened before calling 
> this,
>   /// i.e. 'sort_key_exprs' in the constructor must have been opened.
>   int ALWAYS_INLINE Compare(const TupleRow* lhs, const TupleRow* rhs) const {
> return codegend_compare_fn_ == NULL ?
> CompareInterpreted(lhs, rhs) :
> (*codegend_compare_fn_)(ordering_expr_evals_lhs_.data(),
> ordering_expr_evals_rhs_.data(), lhs, rhs);
>   } 
> {code}
> From Perf
> {code}
>        │ bool Sorter::TupleSorter::Less(const TupleRow* lhs, const TupleRow* rhs) {
>   7.43 │   push   %rbp
>   3.23 │   mov    %rsp,%rbp
>   9.44 │   push   %r12
>   2.69 │   push   %rbx
>   3.89 │   mov    %rsi,%r12
>   2.98 │   mov    %rdi,%rbx
>   6.06 │   sub    $0x10,%rsp
>        │   --num_comparisons_till_free_;
>        │   DCHECK_GE(num_comparisons_till_free_, 0);
>        │   if (UNLIKELY(num_comparisons_till_free_ == 0))

[jira] [Work stopped] (IMPALA-8309) Use a more human-readable flag to switch to a different authorization provider

2019-03-28 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8309 stopped by radford nguyen.
--
> Use a more human-readable flag to switch to a different authorization provider
> --
>
> Key: IMPALA-8309
> URL: https://issues.apache.org/jira/browse/IMPALA-8309
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Fredy Wijaya
>Assignee: radford nguyen
>Priority: Minor
>
> We currently use the authorization_factory_class flag to switch to a different 
> authorization provider, which is useful for any third party providing an 
> implementation of an authorization provider. Since Sentry and Ranger are 
> officially supported by Impala, we should have a flag, i.e. 
> authorization_provider=[sentry|ranger], to easily switch between officially 
> supported authorization providers.
> At the time of this writing, the existing {{authorization_factory_class}} 
> flag is being retained but its default value removed.  If present, it will 
> take precedence over the {{authorization_provider}} flag being added.
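As a sketch of the intended usage (flag values taken from the description; the factory class name below is purely hypothetical):

```shell
# New, human-readable flag for the officially supported providers:
impalad --authorization_provider=ranger

# The existing flag is retained and, if set, takes precedence
# (org.example.MyAuthorizationFactory is a made-up class name):
impalad --authorization_factory_class=org.example.MyAuthorizationFactory
```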






[jira] [Updated] (IMPALA-8309) Use a more human-readable flag to switch to a different authorization provider

2019-03-28 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radford nguyen updated IMPALA-8309:
---
Description: 
We currently use the authorization_factory_class flag to switch to a different 
authorization provider, which is useful for any third party providing an 
implementation of an authorization provider. Since Sentry and Ranger are 
officially supported by Impala, we should have a flag, i.e. 
authorization_provider=[sentry|ranger], to easily switch between officially 
supported authorization providers.

At the time of this writing, the existing {{authorization_factory_class}} flag 
is being retained but its default value removed.  If present, it will take 
precedence over the {{authorization_provider}} flag being added.

  was:We currently use authorization_factory_class flag to switch to a 
different authorization provider, which is useful for any third party to 
provide an implementation of authorization provider. Since, Sentry and Ranger 
are officially supported by Impala, we should have a flag, i.e. 
authorization_provider=[sentry|ranger] to easily switch between officially 
supported authorization providers.


> Use a more human-readable flag to switch to a different authorization provider
> --
>
> Key: IMPALA-8309
> URL: https://issues.apache.org/jira/browse/IMPALA-8309
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Fredy Wijaya
>Assignee: radford nguyen
>Priority: Minor
>
> We currently use the authorization_factory_class flag to switch to a different 
> authorization provider, which is useful for any third party providing an 
> implementation of an authorization provider. Since Sentry and Ranger are 
> officially supported by Impala, we should have a flag, i.e. 
> authorization_provider=[sentry|ranger], to easily switch between officially 
> supported authorization providers.
> At the time of this writing, the existing {{authorization_factory_class}} 
> flag is being retained but its default value removed.  If present, it will 
> take precedence over the {{authorization_provider}} flag being added.






[jira] [Created] (IMPALA-8372) Impala Doc: Consistent uses of hyphens with global flags

2019-03-28 Thread Alex Rodoni (JIRA)
Alex Rodoni created IMPALA-8372:
---

 Summary: Impala Doc: Consistent uses of hyphens with global flags
 Key: IMPALA-8372
 URL: https://issues.apache.org/jira/browse/IMPALA-8372
 Project: IMPALA
  Issue Type: Bug
  Components: Docs
Reporter: Alex Rodoni
Assignee: Alex Rodoni


Standardize on using two non-breaking hyphens for global flags. 






[jira] [Work started] (IMPALA-8224) Impala Doc: Update the Web UI doc with missing contents

2019-03-28 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8224 started by Alex Rodoni.
---
> Impala Doc: Update the Web UI doc with missing contents
> ---
>
> Key: IMPALA-8224
> URL: https://issues.apache.org/jira/browse/IMPALA-8224
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>







[jira] [Assigned] (IMPALA-8360) SynchronousThreadPoolTest ASSERT_TRUE(*no_sleep_destroyed) failed

2019-03-28 Thread Joe McDonnell (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell reassigned IMPALA-8360:
-

Assignee: Joe McDonnell

> SynchronousThreadPoolTest ASSERT_TRUE(*no_sleep_destroyed) failed
> -
>
> Key: IMPALA-8360
> URL: https://issues.apache.org/jira/browse/IMPALA-8360
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Sahil Takiar
>Assignee: Joe McDonnell
>Priority: Major
>  Labels: broken-build
>
> Jenkins output:
> {code}
> Error Message
> Value of: *no_sleep_destroyed   Actual: false Expected: true
> Stacktrace
> /data/jenkins/workspace/impala-cdh6.x-core-data-load/repos/Impala/be/src/util/thread-pool-test.cc:112
> Value of: *no_sleep_destroyed
>   Actual: false
> Expected: true
> {code}
> Seen once on Centos 7 during a core-data-load.






[jira] [Commented] (IMPALA-8360) SynchronousThreadPoolTest ASSERT_TRUE(*no_sleep_destroyed) failed

2019-03-28 Thread Joe McDonnell (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804296#comment-16804296
 ] 

Joe McDonnell commented on IMPALA-8360:
---

I think this is a race condition in the test. I'm trying out a fix that loops 
and sleeps for a bit if no_sleep_destroyed is false. 

> SynchronousThreadPoolTest ASSERT_TRUE(*no_sleep_destroyed) failed
> -
>
> Key: IMPALA-8360
> URL: https://issues.apache.org/jira/browse/IMPALA-8360
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Sahil Takiar
>Assignee: Joe McDonnell
>Priority: Major
>  Labels: broken-build
>
> Jenkins output:
> {code}
> Error Message
> Value of: *no_sleep_destroyed   Actual: false Expected: true
> Stacktrace
> /data/jenkins/workspace/impala-cdh6.x-core-data-load/repos/Impala/be/src/util/thread-pool-test.cc:112
> Value of: *no_sleep_destroyed
>   Actual: false
> Expected: true
> {code}
> Seen once on Centos 7 during a core-data-load.






[jira] [Updated] (IMPALA-2990) Coordinator should timeout and cancel queries with unresponsive / stuck executors

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-2990:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Coordinator should timeout and cancel queries with unresponsive / stuck 
> executors
> -
>
> Key: IMPALA-2990
> URL: https://issues.apache.org/jira/browse/IMPALA-2990
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 2.3.0
>Reporter: Sailesh Mukil
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>  Labels: hang, observability, supportability
>
> The coordinator currently waits indefinitely if it does not hear back from a 
> backend. This could cause a query to hang indefinitely in case of a network 
> error, etc.
> We should add logic for determining when a backend is unresponsive and kill 
> the query. The logic should mostly revolve around Coordinator::Wait() and 
> Coordinator::UpdateFragmentExecStatus() based on whether it receives periodic 
> updates from a backend (via FragmentExecState::ReportStatusCb()).






[jira] [Updated] (IMPALA-6046) test_partition_metadata_compatibility error: Hive query failing (HADOOP-13809)

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-6046:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> test_partition_metadata_compatibility error: Hive query failing (HADOOP-13809)
> --
>
> Key: IMPALA-6046
> URL: https://issues.apache.org/jira/browse/IMPALA-6046
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.11.0, Impala 2.12.0
>Reporter: Bikramjeet Vig
>Priority: Major
>  Labels: flaky
>
> For the test 
> metadata/test_partition_metadata.py::TestPartitionMetadata::test_partition_metadata_compatibility,
>  a query to Hive using beeline/HS2 is failing.
> From Hive logs:
> {noformat}
> 2017-10-11 17:59:13,631 ERROR transport.TSaslTransport 
> (TSaslTransport.java:open(315)) - SASL negotiation failure
> javax.security.sasl.SaslException: Invalid message format [Caused by 
> java.lang.IllegalStateException: zip file closed]
>   at 
> org.apache.hive.service.auth.PlainSaslServer.evaluateResponse(PlainSaslServer.java:107)
>   at 
> org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539)
>   at 
> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283)
>   at 
> org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
>   at 
> org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: zip file closed
>   at java.util.zip.ZipFile.ensureOpen(ZipFile.java:634)
>   at java.util.zip.ZipFile.getEntry(ZipFile.java:305)
>   at java.util.jar.JarFile.getEntry(JarFile.java:227)
>   at sun.net.www.protocol.jar.URLJarFile.getEntry(URLJarFile.java:128)
>   at 
> sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:132)
>   at 
> sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:150)
>   at java.net.URLClassLoader.getResourceAsStream(URLClassLoader.java:233)
>   at javax.xml.parsers.SecuritySupport$4.run(SecuritySupport.java:94)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at 
> javax.xml.parsers.SecuritySupport.getResourceAsStream(SecuritySupport.java:87)
>   at 
> javax.xml.parsers.FactoryFinder.findJarServiceProvider(FactoryFinder.java:283)
>   at javax.xml.parsers.FactoryFinder.find(FactoryFinder.java:255)
>   at 
> javax.xml.parsers.DocumentBuilderFactory.newInstance(DocumentBuilderFactory.java:121)
>   at 
> org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2606)
>   at 
> org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2583)
>   at 
> org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2489)
>   at org.apache.hadoop.conf.Configuration.set(Configuration.java:1174)
>   at org.apache.hadoop.conf.Configuration.set(Configuration.java:1146)
>   at org.apache.hadoop.mapred.JobConf.setJar(JobConf.java:525)
>   at org.apache.hadoop.mapred.JobConf.setJarByClass(JobConf.java:543)
>   at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:437)
>   at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:2803)
>   at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:2761)
>   at 
> org.apache.hive.service.auth.AuthenticationProviderFactory.getAuthenticationProvider(AuthenticationProviderFactory.java:61)
>   at 
> org.apache.hive.service.auth.PlainSaslHelper$PlainServerCallbackHandler.handle(PlainSaslHelper.java:104)
>   at 
> org.apache.hive.service.auth.PlainSaslServer.evaluateResponse(PlainSaslServer.java:102)
>   ... 8 more
> 2017-10-11 17:59:13,633 INFO  session.SessionState 
> (SessionState.java:dropPathAndUnregisterDeleteOnExit(785)) - Deleted 
> directory: /tmp/hive/jenkins/72505700-e690-4355-bdd2-55db2188a976 on fs with 
> scheme hdfs
> 2017-10-11 17:59:13,635 ERROR server.TThreadPoolServer 
> (TThreadPoolServer.java:run(297)) - Error occurred during processing of 
> message.
> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: 
> Invalid message format
>   at 
> org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
>   at 
> 

[jira] [Updated] (IMPALA-6294) Concurrent hung with lots of spilling make slow progress due to blocking in DataStreamRecvr and DataStreamSender

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-6294:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Concurrent hung with lots of spilling make slow progress due to blocking in 
> DataStreamRecvr and DataStreamSender
> 
>
> Key: IMPALA-6294
> URL: https://issues.apache.org/jira/browse/IMPALA-6294
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Mostafa Mokhtar
>Assignee: Michael Ho
>Priority: Critical
> Attachments: IMPALA-6285 TPCDS Q3 slow broadcast, 
> slow_broadcast_q3_reciever.txt, slow_broadcast_q3_sender.txt
>
>
> While running a highly concurrent spilling workload on a large cluster, 
> queries start running slower; even lightweight queries that are not spilling 
> are affected by this slowdown. 
> {code}
>   EXCHANGE_NODE (id=9):(Total: 3m1s, non-child: 3m1s, % non-child: 
> 100.00%)
>  - ConvertRowBatchTime: 999.990us
>  - PeakMemoryUsage: 0
>  - RowsReturned: 108.00K (108001)
>  - RowsReturnedRate: 593.00 /sec
> DataStreamReceiver:
>   BytesReceived(4s000ms): 254.47 KB, 338.82 KB, 338.82 KB, 852.43 
> KB, 1.32 MB, 1.33 MB, 1.50 MB, 2.53 MB, 2.99 MB, 3.00 MB, 3.00 MB, 3.00 MB, 
> 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.16 MB, 3.49 MB, 3.80 
> MB, 4.15 MB, 4.55 MB, 4.84 MB, 4.99 MB, 5.07 MB, 5.41 MB, 5.75 MB, 5.92 MB, 
> 6.00 MB, 6.00 MB, 6.00 MB, 6.07 MB, 6.28 MB, 6.33 MB, 6.43 MB, 6.67 MB, 6.91 
> MB, 7.29 MB, 8.03 MB, 9.12 MB, 9.68 MB, 9.90 MB, 9.97 MB, 10.44 MB, 11.25 MB
>- BytesReceived: 11.73 MB (12301692)
>- DeserializeRowBatchTimer: 957.990ms
>- FirstBatchArrivalWaitTime: 0.000ns
>- PeakMemoryUsage: 644.44 KB (659904)
>- SendersBlockedTimer: 0.000ns
>- SendersBlockedTotalTimer(*): 0.000ns
> {code}
> {code}
> DataStreamSender (dst_id=9):(Total: 1s819ms, non-child: 1s819ms, % 
> non-child: 100.00%)
>- BytesSent: 234.64 MB (246033840)
>- NetworkThroughput(*): 139.58 MB/sec
>- OverallThroughput: 128.92 MB/sec
>- PeakMemoryUsage: 33.12 KB (33920)
>- RowsReturned: 108.00K (108001)
>- SerializeBatchTime: 133.998ms
>- TransmitDataRPCTime: 1s680ms
>- UncompressedRowBatchSize: 446.42 MB (468102200)
> {code}
> Timeouts seen in IMPALA-6285 are caused by this issue
> {code}
> I1206 12:44:14.925405 25274 status.cc:58] RPC recv timed out: Client 
> foo-17.domain.com:22000 timed-out during recv call.
> @   0x957a6a  impala::Status::Status()
> @  0x11dd5fe  
> impala::DataStreamSender::Channel::DoTransmitDataRpc()
> @  0x11ddcd4  
> impala::DataStreamSender::Channel::TransmitDataHelper()
> @  0x11de080  impala::DataStreamSender::Channel::TransmitData()
> @  0x11e1004  impala::ThreadPool<>::WorkerThread()
> @   0xd10063  impala::Thread::SuperviseThread()
> @   0xd107a4  boost::detail::thread_data<>::run()
> @  0x128997a  (unknown)
> @ 0x7f68c5bc7e25  start_thread
> @ 0x7f68c58f534d  __clone
> {code}
> A similar behavior was also observed with KRPC enabled IMPALA-6048






[jira] [Updated] (IMPALA-4268) Rework coordinator buffering to buffer more data

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-4268:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Rework coordinator buffering to buffer more data
> 
>
> Key: IMPALA-4268
> URL: https://issues.apache.org/jira/browse/IMPALA-4268
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.8.0
>Reporter: Henry Robinson
>Priority: Major
>  Labels: query-lifecycle, resource-management
> Attachments: rows-produced-histogram.png
>
>
> {{PlanRootSink}} runs the producer (the coordinator fragment execution 
> thread) in a separate thread from the consumer (i.e. the thread 
> handling the fetch RPC), which calls {{GetNext()}} to retrieve the rows. The 
> implementation was simplified by handing off a single batch at a time from 
> the producer to the consumer.
> This decision causes some problems:
> * Many context switches for the sender. Adding buffering would allow the 
> sender to append to the buffer and continue progress without a context switch.
> * Query execution can't release resources until the client has fetched the 
> final batch, because the coordinator fragment thread is still running and 
> potentially producing backpressure all the way down the plan tree.
> * The consumer can't fulfil fetch requests greater than Impala's internal 
> BATCH_SIZE, because it is only given one batch at a time.
> The tricky part is managing the mismatch between the size of the row batches 
> processed in {{Send()}} and the size of the fetch result asked for by the 
> client without impacting performance too badly. The sender materializes 
> output rows in a {{QueryResultSet}} that is owned by the coordinator. That is 
> not, currently, a splittable object - instead it contains the actual RPC 
> response struct that will hit the wire when the RPC completes. The 
> asynchronous sender does not know the batch size, because it can in theory 
> change on every fetch call (although most reasonable clients will not 
> randomly change the fetch size).






[jira] [Updated] (IMPALA-5121) avg() on timestamp col is wrong with -use_local_tz_for_unix_timestamp_conversions

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-5121:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> avg() on timestamp col is wrong with 
> -use_local_tz_for_unix_timestamp_conversions
> -
>
> Key: IMPALA-5121
> URL: https://issues.apache.org/jira/browse/IMPALA-5121
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.5.0, Impala 2.2.10, Impala 2.3.4
>Reporter: Matthew Jacobs
>Priority: Critical
>  Labels: timestamp
>
> The flag '-use_local_tz_for_unix_timestamp_conversions' was added for 
> IMPALA-97. Enabling it results in timestamps sometimes being converted into 
> localtime, but unfortunately it is not well defined when/where this 
> conversion happens.
> I've noticed that its use seems to break the avg() aggregate function on 
> timestamp types (despite avg() being an odd function on timestamps, it 
> should still work).
> Impala by default, i.e. not enabling this flag:
> {code}
> [localhost:21000] > select timestamp_col from functional.alltypestiny;
> Query: select timestamp_col from functional.alltypestiny
> Query submitted at: 2017-03-27 18:50:57 (Coordinator: 
> http://mj-desktop.ca.cloudera.com:25000)
> Query progress can be monitored at: 
> http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=8242bb6012948f06:143961ed
> +-+
> | timestamp_col   |
> +-+
> | 2009-01-01 00:00:00 |
> | 2009-01-01 00:01:00 |
> | 2009-02-01 00:00:00 |
> | 2009-02-01 00:01:00 |
> | 2009-03-01 00:00:00 |
> | 2009-03-01 00:01:00 |
> | 2009-04-01 00:00:00 |
> | 2009-04-01 00:01:00 |
> +-+
> Fetched 8 row(s) in 0.02s
> [localhost:21000] > select avg(timestamp_col) from functional.alltypestiny;
> Query: select avg(timestamp_col) from functional.alltypestiny
> Query submitted at: 2017-03-27 18:50:59 (Coordinator: 
> http://mj-desktop.ca.cloudera.com:25000)
> Query progress can be monitored at: 
> http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=534f6ab59b201b5e:40e2a86d
> +-+
> | avg(timestamp_col)  |
> +-+
> | 2009-02-14 23:45:30 |
> +-+
> {code}
> Then enabling the flag results in the same timestamps returned when scanning, 
> but evaluating them in avg() results in them being converted:
> {code}
> [localhost:21000] > select timestamp_col from functional.alltypestiny;
> Query: select timestamp_col from functional.alltypestiny
> Query submitted at: 2017-03-27 18:51:17 (Coordinator: 
> http://mj-desktop.ca.cloudera.com:25000)
> Query progress can be monitored at: 
> http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=ac4ab8fd8caf4be9:ebb0834d
> +-+
> | timestamp_col   |
> +-+
> | 2009-01-01 00:00:00 |
> | 2009-01-01 00:01:00 |
> | 2009-02-01 00:00:00 |
> | 2009-02-01 00:01:00 |
> | 2009-03-01 00:00:00 |
> | 2009-03-01 00:01:00 |
> | 2009-04-01 00:00:00 |
> | 2009-04-01 00:01:00 |
> +-+
> Fetched 8 row(s) in 0.30s
> [localhost:21000] > select avg(timestamp_col) from functional.alltypestiny;
> Query: select avg(timestamp_col) from functional.alltypestiny
> Query submitted at: 2017-03-27 18:51:25 (Coordinator: 
> http://mj-desktop.ca.cloudera.com:25000)
> Query progress can be monitored at: 
> http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=9e4e2c16896090f7:8922c4f2
> +-+
> | avg(timestamp_col)  |
> +-+
> | 2009-02-15 00:00:30 |
> +-+
> Fetched 1 row(s) in 0.12s
> {code}
> This behavior seems inconsistent, and I'm pretty sure it is not intentional. 
> There are two misleading functions on TimestampValue that will do this 
> conversion when the flag is set: ToUnixTime() and ToSubsecondUnixTime(). 
> avg() seems to have started using ToSubsecondUnixTime() after IMPALA-2914.
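The shift can be reproduced outside Impala. The sketch below models the suspected mechanism with hypothetical fixed offsets standing in for US/Pacific (8 hours before the 2009-03-08 DST change, 7 hours after): averaging epoch values that were converted with per-row local offsets moves the mean by the mean of those offsets.

```python
from datetime import datetime, timedelta

# Eight rows, matching the shape of functional.alltypestiny above.
ts = [datetime(2009, m, 1, 0, s) for m in (1, 2, 3, 4) for s in (0, 1)]

EPOCH = datetime(1970, 1, 1)

def to_epoch_utc(t):
    return (t - EPOCH).total_seconds()

def to_epoch_local(t):
    # Hypothetical per-row offsets: 8h before April, 7h from April on
    # (standing in for US/Pacific around the 2009-03-08 DST change).
    offset_h = 7 if t.month >= 4 else 8
    return to_epoch_utc(t) + offset_h * 3600

def avg_ts(rows, to_epoch):
    mean = sum(to_epoch(t) for t in rows) / len(rows)
    return EPOCH + timedelta(seconds=mean)

utc_avg = avg_ts(ts, to_epoch_utc)      # plain arithmetic mean
local_avg = avg_ts(ts, to_epoch_local)  # shifted by the mean offset
```

Because two of the eight rows get a different offset, the two averages disagree; with a single uniform offset on every row the round trip would cancel out, which is why the inconsistency only shows up when the rows straddle a DST boundary.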






[jira] [Updated] (IMPALA-6590) Disable expr rewrites and codegen for VALUES() statements

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-6590:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Disable expr rewrites and codegen for VALUES() statements
> -
>
> Key: IMPALA-6590
> URL: https://issues.apache.org/jira/browse/IMPALA-6590
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, Impala 2.11.0
>Reporter: Alexander Behm
>Priority: Major
>  Labels: perf, planner, ramp-up, regression
>
> The analysis of statements with big VALUES clauses like INSERT INTO <table> 
> VALUES is slow due to expression rewrites like constant folding. The 
> performance of such statements has regressed since the introduction of expr 
> rewrites and constant folding in IMPALA-1788.
> We should skip expr rewrites for VALUES altogether since it mostly provides 
> no benefit but can have a large overhead due to evaluation of expressions in 
> the backend (constant folding). These expressions are ultimately evaluated 
> and materialized in the backend anyway, so there's no point in folding them 
> during analysis.
> Similarly, there is no point in doing codegen for these exprs in the backend 
> union node.
> *Workaround*
> {code}
> SET ENABLE_EXPR_REWRITES=FALSE;
> SET DISABLE_CODEGEN=TRUE;
> {code}






[jira] [Updated] (IMPALA-6048) Queries make very slow progress and report WaitForRPC() stuck for too long

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-6048:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Queries make very slow progress and report  WaitForRPC() stuck for too long
> ---
>
> Key: IMPALA-6048
> URL: https://issues.apache.org/jira/browse/IMPALA-6048
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend, Distributed Exec
>Affects Versions: Impala 2.11.0
>Reporter: Mostafa Mokhtar
>Assignee: Michael Ho
>Priority: Critical
> Attachments: Archive 2.zip
>
>
> When running 32 concurrent queries from TPC-DS, a couple of instances of 
> Q78 took 9 hours to finish and appeared to be hung.
> On an idle cluster the query finished in under 5 minutes; profiles attached. 
> While the query ran long, fragments reported 16+ hours of network 
> send/receive time.
> The logs show many messages like the one below, including instances where a 
> node waited too long for an RPC from itself.
> {code}
> W1012 00:47:57.633549 117475 krpc-data-stream-sender.cc:360] XXX: 
> WaitForRPC() stuck for too long address=10.17.234.37:29000 
> fragment_instace_id_=1e48ef897e797131:2f05789b05eb dest_node_id_=24 
> sender_id_=81
> {code}






[jira] [Updated] (IMPALA-6194) Ensure all fragment instances notice cancellation

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-6194:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Ensure all fragment instances notice cancellation
> -
>
> Key: IMPALA-6194
> URL: https://issues.apache.org/jira/browse/IMPALA-6194
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Lars Volker
>Priority: Critical
>  Labels: observability, supportability
>
> Currently queries can get stuck in an uncancellable state, e.g. when blocking 
> on function calls or condition variables without periodically checking for 
> cancellation. We should eliminate all those calls and make sure we don't 
> re-introduce such issues. One option would be a watchdog to check that each 
> fragment instance regularly calls RETURN_IF_CANCEL.
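The watchdog idea might look roughly like the sketch below: each fragment instance stamps a timestamp whenever it passes a RETURN_IF_CANCEL-style check, and a monitor reports instances that have gone silent for too long (all names are hypothetical; the real implementation would be C++ inside the backend).

```python
import time

class CancellationWatchdog:
    """Tracks when each fragment instance last checked for cancellation
    and flags instances that have not checked within a threshold."""

    def __init__(self, threshold_s=1.0, clock=time.monotonic):
        self._last_check = {}
        self._threshold = threshold_s
        self._clock = clock  # injectable for testing

    def checked_cancellation(self, instance_id):
        # Called from the RETURN_IF_CANCEL-style macro on the hot path.
        self._last_check[instance_id] = self._clock()

    def stuck_instances(self):
        # Called periodically by the watchdog thread.
        now = self._clock()
        return [i for i, t in self._last_check.items()
                if now - t > self._threshold]
```

An instance blocked in an uninterruptible call simply stops stamping, so it shows up in `stuck_instances()` without any cooperation from the blocked thread.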






[jira] [Updated] (IMPALA-6692) When partition exchange is followed by sort each sort node becomes a synchronization point across the cluster

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-6692:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> When partition exchange is followed by sort each sort node becomes a 
> synchronization point across the cluster
> -
>
> Key: IMPALA-6692
> URL: https://issues.apache.org/jira/browse/IMPALA-6692
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend, Distributed Exec
>Affects Versions: Impala 2.10.0
>Reporter: Mostafa Mokhtar
>Priority: Critical
>  Labels: perf, resource-management
> Attachments: Kudu table insert without KRPC no sort.txt, Kudu table 
> insert without KRPC.txt, kudu_partial_sort_insert_vd1129.foo.com_2.txt, 
> profile-spilling.txt
>
>
> Issue described in this JIRA applies to 
> * Analytical functions
> * Writes to Partitioned Parquet tables
> * Writes to Kudu tables
> When inserting into a Kudu table from Impala the plan is something like HDFS 
> SCAN -> Partition Exchange -> Partial Sort -> Kudu Insert.
> The query initially makes good progress then significantly slows down and 
> very few nodes make progress.
> While the insert is running the query goes through different phases 
> * Phase 1
> ** Scan is reading data fast, sending data through to exchange 
> ** Partial Sort keeps accumulating batches
> ** Network and CPU is busy, life appears to be OK
> * Phase 2
> ** One of the Sort operators reaches its memory limit and stops calling 
> ExchangeNode::GetNext for a while
> ** This creates back pressure against the DataStreamSenders
> ** The Partial Sort doesn't call GetNext until it has finished sorting GBs of 
> data (Partial sort memory is unbounded as of 03/16/2018)
> ** All exchange operators in the cluster eventually get blocked on that Sort 
> operator and can no longer make progress
> ** After a while the Sort is able to accept more batches which temporarily 
> unblocks execution across the cluster
> ** Another sort operator reaches its memory limit and this loop repeats itself
> Below are stacks from one of the blocked hosts
> _Sort node waiting on data from exchange node as it didn't start sorting 
> since the memory limit for the sort wasn't reached_
> {code}
> Thread 90 (Thread 0x7f8d7d233700 (LWP 21625)):
> #0  0x003a6f00b68c in pthread_cond_wait@@GLIBC_2.3.2 () from 
> /lib64/libpthread.so.0
> #1  0x7fab1422174c in 
> std::condition_variable::wait(std::unique_lock&) () from 
> /opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.205/lib/impala/lib/libstdc++.so.6
> #2  0x00b4d5aa in void 
> std::_V2::condition_variable_any::wait 
> >(boost::unique_lock&) ()
> #3  0x00b4ab6a in 
> impala::KrpcDataStreamRecvr::SenderQueue::GetBatch(impala::RowBatch**) ()
> #4  0x00b4b0c8 in 
> impala::KrpcDataStreamRecvr::GetBatch(impala::RowBatch**) ()
> #5  0x00dca7c5 in 
> impala::ExchangeNode::FillInputRowBatch(impala::RuntimeState*) ()
> #6  0x00dcacae in 
> impala::ExchangeNode::GetNext(impala::RuntimeState*, impala::RowBatch*, 
> bool*) ()
> #7  0x01032ac3 in 
> impala::PartialSortNode::GetNext(impala::RuntimeState*, impala::RowBatch*, 
> bool*) ()
> #8  0x00ba9c92 in impala::FragmentInstanceState::ExecInternal() ()
> #9  0x00bac7df in impala::FragmentInstanceState::Exec() ()
> #10 0x00b9ab1a in 
> impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) ()
> #11 0x00d5da9f in 
> impala::Thread::SuperviseThread(std::basic_string std::char_traits, std::allocator > const&, 
> std::basic_string, std::allocator > 
> const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) ()
> #12 0x00d5e29a in boost::detail::thread_data void (*)(std::basic_string, std::allocator 
> > const&, std::basic_string, 
> std::allocator > const&, boost::function, 
> impala::ThreadDebugInfo const*, impala::Promise*), 
> boost::_bi::list5 std::char_traits, std::allocator > >, 
> boost::_bi::value, 
> std::allocator > >, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> > > >::run() ()
> #13 0x012d70ba in thread_proxy ()
> #14 0x003a6f007aa1 in start_thread () from /lib64/libpthread.so.0
> #15 0x003a6ece893d in clone () from /lib64/libc.so.6
> {code}
> _DataStreamSender blocked due to back pressure from the DataStreamRecvr on 
> the node which has a Sort that is spilling_
> {code}
> Thread 89 (Thread 0x7fa8f6a15700 (LWP 21626)):
> #0  0x003a6f00ba5e in pthread_cond_timedwait@@GLIBC_2.3.2 () from 
> /lib64/libpthread.so.0
> #1  0x01237e77 in 
> impala::KrpcDataStreamSender::Channel::WaitForRpc(std::unique_lock*)
>  ()
> #2  0x01238b8d in 
> 

[jira] [Updated] (IMPALA-6861) Avoid spurious OpenSSL warning printed by KRPC

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-6861:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Avoid spurious OpenSSL warning printed by KRPC
> --
>
> Key: IMPALA-6861
> URL: https://issues.apache.org/jira/browse/IMPALA-6861
> Project: IMPALA
>  Issue Type: Task
>  Components: Distributed Exec
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Sailesh Mukil
>Assignee: Michael Ho
>Priority: Major
>
> This warning has no effect; we should opt for an initialization code path 
> that does not print this error message.
> {code:java}
> W0416 02:04:25.040552 19359 openssl_util.cc:107] It appears that OpenSSL has 
> been previously initialized by code outside of Kudu. Please use 
> kudu::client::DisableOpenSSLInitialization() to avoid potential crashes due 
> to conflicting initialization.
> {code}






[jira] [Updated] (IMPALA-6874) Add more tests for mixed-format tables, including parquet

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-6874:
-
Target Version: Impala 2.13.0, Impala 3.3.0  (was: Impala 2.13.0, Impala 
3.2.0)

> Add more tests for mixed-format tables, including parquet
> -
>
> Key: IMPALA-6874
> URL: https://issues.apache.org/jira/browse/IMPALA-6874
> Project: IMPALA
>  Issue Type: Test
>  Components: Infrastructure
>Affects Versions: Impala 2.12.0
>Reporter: Tim Armstrong
>Priority: Major
>
> We only have a single very basic table with mixed formats that I can see. It 
> would be good to have a larger table that includes parquet, so we can 
> exercise some of the more interesting memory reservation code paths. I think 
> the requirements are:
> * Files of at least multiple MBs in size
> * Includes Parquet and at least one row-based format
> * Ideally a mix of different file and partition sizes
> We want queries that exercise "interesting" code paths:
> * Many columns returned: select * from table.
> * Single column returned
> * No columns returned - e.g. select count(*)
> * Selective scan with predicates






[jira] [Updated] (IMPALA-6783) Rethink the end-to-end queuing at KrpcDataStreamReceiver

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-6783:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Rethink the end-to-end queuing at KrpcDataStreamReceiver
> 
>
> Key: IMPALA-6783
> URL: https://issues.apache.org/jira/browse/IMPALA-6783
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 2.12.0
>Reporter: Michael Ho
>Priority: Critical
>
> Follow-up from IMPALA-6116. We currently bound the memory usage of the 
> service queue and force an RPC to retry if the memory usage exceeds the 
> configured limit. The deserialization of row batches happens in the context 
> of service threads. The deserialized row batches are stored in a queue in 
> the receiver, whose memory consumption is bounded by 
> FLAGS_exchg_node_buffer_size_bytes. When that limit is exceeded, incoming 
> row batches go into a deferred RPC queue, which is drained by 
> deserialization threads. This makes it hard to size the service queues, as 
> their capacity may need to grow with the number of nodes in the cluster.
> We may need to reconsider the role of service queue: it could just be a 
> transition queue before KrpcDataStreamMgr routes the incoming row batches to 
> the appropriate receivers. The actual queuing may happen in the receiver. The 
> deserialization should always happen in the context of deserialization 
> threads so the service threads will just be responsible for routing the RPC 
> requests. This allows us to keep a rather small service queue. Incoming 
> serialized row batches will always sit in a queue to be drained by 
> deserialization threads. We may still need to keep a certain number of 
> deserialized row batches around ready to be consumed. In this way, we can 
> account for the memory consumption and size the queue based on number of 
> senders and memory budget of a query.
> One hurdle is that we need to overcome the undesirable cross-thread 
> allocation pattern as rpc_context is allocated from service threads but freed 
> by the deserialization thread.
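The proposed split might look like the following sketch: service threads do O(1) routing of still-serialized batches into per-receiver queues, and deserialization is deferred to dedicated threads. Names like `StreamMgr.route` are hypothetical, and JSON stands in for the real row-batch serialization.

```python
from collections import defaultdict, deque
import json

class StreamMgr:
    """Service threads only route; deserialization threads drain the
    per-receiver queues, so the shared service queue can stay small."""

    def __init__(self):
        self._per_receiver = defaultdict(deque)  # serialized payloads

    def route(self, receiver_id, payload):
        # Service-thread work: constant-time routing, no deserialization.
        self._per_receiver[receiver_id].append(payload)

    def deserialize_one(self, receiver_id):
        # Deserialization-thread work, chargeable to the query's memory.
        q = self._per_receiver[receiver_id]
        return json.loads(q.popleft()) if q else None
```

Because the serialized payloads sit in per-receiver queues, their memory can be accounted against the owning query and each queue sized by its sender count, rather than one shared service queue whose capacity must scale with the cluster.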






[jira] [Updated] (IMPALA-7027) Multiple Cast to Varchar with different limit fails with "AnalysisException: null CAUSED BY: IllegalArgumentException: "

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7027:
-
Target Version: Impala 2.13.0, Impala 3.3.0  (was: Impala 2.13.0, Impala 
3.2.0)

> Multiple Cast to Varchar with different limit fails with "AnalysisException: 
> null CAUSED BY: IllegalArgumentException: "
> 
>
> Key: IMPALA-7027
> URL: https://issues.apache.org/jira/browse/IMPALA-7027
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, Impala 
> 2.11.0, Impala 3.0, Impala 2.12.0
>Reporter: Meenakshi
>Priority: Major
>  Labels: planner, regression
>
> If an Impala query with DISTINCT contains multiple casts of '' to VARCHAR, 
> as below, the query breaks when a later cast's VARCHAR limit is lower than 
> an earlier cast's.
>  
> Query 1> Fails with " AnalysisException: null CAUSED BY: 
> IllegalArgumentException: targetType=VARCHAR(100) type=VARCHAR(101)"
> SELECT DISTINCT CAST('' as VARCHAR(101)) as CL_COMMENTS,CAST('' as 
> VARCHAR(100))  as CL_USER_ID FROM tablename limit 1
> Whereas the query below succeeds:
> Query 2> Success
>  SELECT DISTINCT CAST('' as VARCHAR(100)) as CL_COMMENTS,CAST('' as 
> VARCHAR(101))  as CL_USER_ID FROM  tablename limit 1
> *Workaround*
> SET ENABLE_EXPR_REWRITES=false;






[jira] [Updated] (IMPALA-7070) Failed test: query_test.test_nested_types.TestParquetArrayEncodings.test_thrift_array_of_arrays on S3

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7070:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Failed test: 
> query_test.test_nested_types.TestParquetArrayEncodings.test_thrift_array_of_arrays
>  on S3
> -
>
> Key: IMPALA-7070
> URL: https://issues.apache.org/jira/browse/IMPALA-7070
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Dimitris Tsirogiannis
>Priority: Critical
>  Labels: broken-build, flaky, s3, test-failure
>
>  
> {code:java}
> Error Message
> query_test/test_nested_types.py:406: in test_thrift_array_of_arrays "col1 
> array>") query_test/test_nested_types.py:579: in 
> _create_test_table check_call(["hadoop", "fs", "-put", local_path, 
> location], shell=False) /usr/lib64/python2.6/subprocess.py:505: in check_call 
> raise CalledProcessError(retcode, cmd) E   CalledProcessError: Command 
> '['hadoop', 'fs', '-put', 
> '/data/jenkins/workspace/impala-asf-2.x-core-s3/repos/Impala/testdata/parquet_nested_types_encodings/bad-thrift.parquet',
>  
> 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays']'
>  returned non-zero exit status 1
> Stacktrace
> query_test/test_nested_types.py:406: in test_thrift_array_of_arrays
> "col1 array>")
> query_test/test_nested_types.py:579: in _create_test_table
> check_call(["hadoop", "fs", "-put", local_path, location], shell=False)
> /usr/lib64/python2.6/subprocess.py:505: in check_call
> raise CalledProcessError(retcode, cmd)
> E   CalledProcessError: Command '['hadoop', 'fs', '-put', 
> '/data/jenkins/workspace/impala-asf-2.x-core-s3/repos/Impala/testdata/parquet_nested_types_encodings/bad-thrift.parquet',
>  
> 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays']'
>  returned non-zero exit status 1
> Standard Error
> SET sync_ddl=False;
> -- executing against localhost:21000
> DROP DATABASE IF EXISTS `test_thrift_array_of_arrays_11da5fde` CASCADE;
> SET sync_ddl=False;
> -- executing against localhost:21000
> CREATE DATABASE `test_thrift_array_of_arrays_11da5fde`;
> MainThread: Created database "test_thrift_array_of_arrays_11da5fde" for test 
> ID 
> "query_test/test_nested_types.py::TestParquetArrayEncodings::()::test_thrift_array_of_arrays[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]"
> -- executing against localhost:21000
> create table test_thrift_array_of_arrays_11da5fde.ThriftArrayOfArrays (col1 
> array>) stored as parquet location 
> 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays';
> 18/05/20 18:31:03 WARN impl.MetricsConfig: Cannot locate configuration: tried 
> hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
> 18/05/20 18:31:03 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 
> 10 second(s).
> 18/05/20 18:31:03 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> started
> 18/05/20 18:31:06 INFO Configuration.deprecation: 
> fs.s3a.server-side-encryption-key is deprecated. Instead, use 
> fs.s3a.server-side-encryption.key
> put: rename 
> `s3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays/bad-thrift.parquet._COPYING_'
>  to 
> `s3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays/bad-thrift.parquet':
>  Input/output error
> 18/05/20 18:31:08 INFO impl.MetricsSystemImpl: Stopping s3a-file-system 
> metrics system...
> 18/05/20 18:31:08 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> stopped.
> 18/05/20 18:31:08 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> shutdown complete.{code}






[jira] [Updated] (IMPALA-6876) Entries in CatalogUsageMonitor are not cleared after invalidation

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-6876:
-
Target Version: Impala 2.13.0, Impala 3.3.0  (was: Impala 2.13.0, Impala 
3.2.0)

> Entries in CatalogUsageMonitor are not cleared after invalidation
> -
>
> Key: IMPALA-6876
> URL: https://issues.apache.org/jira/browse/IMPALA-6876
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Dimitris Tsirogiannis
>Priority: Major
>  Labels: memory-leak
>
> The CatalogUsageMonitor in the catalog maintains a small cache of references 
> to tables that: a) are accessed frequently in the catalog and b) have the 
> highest memory requirements. These entries are not cleared upon server or 
> table invalidation, thus preventing the GC from collecting the memory of 
> these tables. We should make sure that the CatalogUsageMonitor does not 
> maintain entries of tables that have been invalidated or deleted. 
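One possible shape of a fix is to hold the usage entries through weak references, so that invalidating or deleting a table leaves it collectable. This is a Python sketch of the idea, not the actual Java catalog code.

```python
import weakref

class Table:
    def __init__(self, name):
        self.name = name

class CatalogUsageMonitor:
    """Tracks frequently used / high-memory tables without keeping them
    alive: entries are weak references, so invalidation (dropping the
    catalog's strong reference) lets the GC reclaim the table."""

    def __init__(self):
        self._top_tables = {}

    def record_usage(self, table):
        self._top_tables[table.name] = weakref.ref(table)

    def live_entries(self):
        # Entries whose referent was collected report as dead.
        return [n for n, r in self._top_tables.items() if r() is not None]
```

Java offers the same pattern via `WeakReference`; alternatively the monitor could be explicitly purged on INVALIDATE/DROP, at the cost of touching every invalidation path.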






[jira] [Updated] (IMPALA-6701) stress test compute stats binary search can't find a start point

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-6701:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> stress test compute stats binary search can't find a start point
> 
>
> Key: IMPALA-6701
> URL: https://issues.apache.org/jira/browse/IMPALA-6701
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, Impala 
> 2.11.0, Impala 3.0, Impala 2.12.0
>Reporter: Michael Brown
>Priority: Critical
>
> The stress test recently took 9 hours to binary-search the mem_limit for 
> compute stats statements.
> The stress test cannot find a start point for mem_limit for compute stats 
> statements, because explain is not supported.
> {noformat}
> [localhost:21000] > explain compute stats tpch.lineitem;
> Query: explain compute stats tpch.lineitem
> ERROR: AnalysisException: Syntax error in line 1:
> explain compute stats tpch.lineitem
> ^
> Encountered: COMPUTE
> Expected: CREATE, DELETE, INSERT, SELECT, UPDATE, UPSERT, VALUES, WITH
> CAUSED BY: Exception: Syntax error
> [localhost:21000] >
> {noformat}
> The stress test has behaved this way ever since it gained compute stats 
> support:
> {noformat}
> 1370 def estimate_query_mem_mb_usage(query, query_runner):
> 1371   """Runs an explain plan then extracts and returns the estimated memory 
> needed to run
> 1372   the query.
> 1373   """
> 1374   with query_runner.impalad_conn.cursor() as cursor:
> 1375 LOG.debug("Using %s database", query.db_name)
> 1376 if query.db_name:
> 1377   cursor.execute('USE ' + query.db_name)
> 1378 if query.query_type == QueryType.COMPUTE_STATS:
> 1379   # Running "explain" on compute stats is not supported by Impala.
> 1380   return
> {noformat}
> This means the stress test starts with the full process memory limit of the 
> impalad.
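Once a starting point exists, the search itself is a standard binary search over mem_limit, assuming success is monotonic in memory. This is a sketch with a hypothetical `runs_ok` callback, not the actual concurrent_select.py code: when EXPLAIN is unsupported (as for COMPUTE STATS), the search falls back to the impalad-wide limit as its upper bound.

```python
def find_min_mem_mb(runs_ok, full_limit_mb, explain_estimate_mb=None):
    """Smallest mem_limit (MB) at which runs_ok(mem_mb) succeeds,
    assuming runs_ok is monotonic in memory."""
    # Upper bound: the explain estimate if available, else the full
    # process limit (the COMPUTE STATS fallback described above).
    hi = explain_estimate_mb or full_limit_mb
    while hi < full_limit_mb and not runs_ok(hi):
        hi = min(hi * 2, full_limit_mb)   # estimate was too low: grow it
    lo = 1
    while lo < hi:                        # invariant: runs_ok(hi) holds
        mid = (lo + hi) // 2
        if runs_ok(mid):
            hi = mid
        else:
            lo = mid + 1
    return hi
```

Starting from the full limit makes each probe succeed trivially, so the 9-hour cost comes from how many real query runs the search needs, not from the bound itself; a cheap estimate for COMPUTE STATS (e.g. derived from table stats) would shrink the search range.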
> {noformat}
> 2018-03-17 08:00:38,684 12313 MainThread 
> INFO:concurrent_select[1164]:Collecting runtime info for query 
> compute_stats_call_center_mt_dop_1: 
> COMPUTE STATS call_center
> 2018-03-17 08:00:38,925 12313 MainThread DEBUG:concurrent_select[1375]:Using 
> tpcds_300_decimal_parquet database
> 2018-03-17 08:00:38,925 12313 MainThread DEBUG:db_connection[203]:IMPALA: USE 
> tpcds_300_decimal_parquet
> 2018-03-17 08:00:39,007 12313 MainThread INFO:hiveserver2[265]:Closing active 
> operation
> 2018-03-17 08:00:39,123 12313 MainThread INFO:concurrent_select[1247]:Finding 
> a starting point for binary search
> 2018-03-17 08:00:39,148 12313 MainThread DEBUG:concurrent_select[866]:Using 
> tpcds_300_decimal_parquet database
> 2018-03-17 08:00:39,148 12313 MainThread DEBUG:db_connection[203]:IMPALA: USE 
> tpcds_300_decimal_parquet
> 2018-03-17 08:00:39,206 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET 
> MT_DOP=1
> 2018-03-17 08:00:39,333 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET 
> ABORT_ON_ERROR=1
> 2018-03-17 08:00:39,416 12313 MainThread DEBUG:concurrent_select[878]:Setting 
> mem limit to 77308 MB
> 2018-03-17 08:00:39,416 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET 
> MEM_LIMIT=77308M
> 2018-03-17 08:00:39,503 12313 MainThread DEBUG:concurrent_select[882]:Running 
> query with 77308 MB mem limit at vc0718.halxg.cloudera.com with timeout secs 
> 9223372036854775807:
> COMPUTE STATS call_center
> 2018-03-17 08:00:39,741 12313 MainThread DEBUG:concurrent_select[890]:Query 
> id is 3b4213033bf2359c:d44b29c5
> 2018-03-17 08:00:41,084 12313 MainThread INFO:hiveserver2[265]:Closing active 
> operation
> 2018-03-17 08:00:41,202 12313 MainThread 
> DEBUG:concurrent_select[1209]:Spilled: False
> 2018-03-17 08:00:41,202 12313 MainThread INFO:concurrent_select[1267]:Finding 
> minimum memory required to avoid spilling
> 2018-03-17 08:00:41,227 12313 MainThread DEBUG:concurrent_select[866]:Using 
> tpcds_300_decimal_parquet database
> 2018-03-17 08:00:41,227 12313 MainThread DEBUG:db_connection[203]:IMPALA: USE 
> tpcds_300_decimal_parquet
> 2018-03-17 08:00:41,286 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET 
> MT_DOP=1
> 2018-03-17 08:00:41,367 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET 
> ABORT_ON_ERROR=1
> 2018-03-17 08:00:41,449 12313 MainThread DEBUG:concurrent_select[878]:Setting 
> mem limit to 38654 MB
> 2018-03-17 08:00:41,449 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET 
> MEM_LIMIT=38654M
> 2018-03-17 08:00:41,530 12313 MainThread DEBUG:concurrent_select[882]:Running 
> query with 38654 MB mem limit at vc0718.halxg.cloudera.com with timeout secs 
> 9223372036854775807:
> COMPUTE STATS call_center
> 2018-03-17 08:00:41,589 12313 MainThread DEBUG:concurrent_select[890]:Query 
> id is 74db40c3f221cf3:d67997c
> 2018-03-17 08:00:42,184 12313 MainThread 

[jira] [Updated] (IMPALA-6890) split-hbase.sh: Can't get master address from ZooKeeper; znode data == null

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-6890:
-
Target Version: Impala 2.13.0, Impala 3.3.0  (was: Impala 2.13.0, Impala 
3.2.0)

> split-hbase.sh: Can't get master address from ZooKeeper; znode data == null
> ---
>
> Key: IMPALA-6890
> URL: https://issues.apache.org/jira/browse/IMPALA-6890
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.12.0
>Reporter: Vuk Ercegovac
>Assignee: Joe McDonnell
>Priority: Critical
>
> {noformat}
> 20:57:13 FAILED (Took: 7 min 58 sec)
> 20:57:13 
> '/data/jenkins/workspace/impala-cdh5-2.12.0_5.15.0-exhaustive-thrift/repos/Impala/testdata/bin/split-hbase.sh'
>  failed. Tail of log:
> 20:57:13 Wed Apr 18 20:49:43 PDT 2018, 
> RpcRetryingCaller{globalStartTime=1524109783051, pause=100, retries=31}, 
> org.apache.hadoop.hbase.MasterNotRunningException: java.io.IOException: Can't 
> get master address from ZooKeeper; znode data == null
> 20:57:13 Wed Apr 18 20:49:43 PDT 2018, 
> RpcRetryingCaller{globalStartTime=1524109783051, pause=100, retries=31}, 
> org.apache.hadoop.hbase.MasterNotRunningException: java.io.IOException: Can't 
> get master address from ZooKeeper; znode data == null
> 20:57:13 Wed Apr 18 20:49:44 PDT 2018, 
> RpcRetryingCaller{globalStartTime=1524109783051, pause=100, retries=31}, 
> org.apache.hadoop.hbase.MasterNotRunningException: java.io.IOException: Can't 
> get master address from ZooKeeper; znode data == null
> ...
> 20:57:13 Wed Apr 18 20:57:13 PDT 2018, 
> RpcRetryingCaller{globalStartTime=1524109783051, pause=100, retries=31}, 
> org.apache.hadoop.hbase.MasterNotRunningException: java.io.IOException: Can't 
> get master address from ZooKeeper; znode data == null
> 20:57:13 
> 20:57:13  at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:157)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4329)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4321)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.HBaseAdmin.getClusterStatus(HBaseAdmin.java:2952)
> 20:57:13  at 
> org.apache.impala.datagenerator.HBaseTestDataRegionAssigment.(HBaseTestDataRegionAssigment.java:74)
> 20:57:13  at 
> org.apache.impala.datagenerator.HBaseTestDataRegionAssigment.main(HBaseTestDataRegionAssigment.java:310)
> 20:57:13 Caused by: org.apache.hadoop.hbase.MasterNotRunningException: 
> java.io.IOException: Can't get master address from ZooKeeper; znode data == 
> null
> 20:57:13  at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1698)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1718)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1875)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134)
> 20:57:13  ... 5 more
> 20:57:13 Caused by: java.io.IOException: Can't get master address from 
> ZooKeeper; znode data == null
> 20:57:13  at 
> org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:154)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1648)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1689)
> 20:57:13  ... 9 more
> 20:57:13 Error in 
> /data/jenkins/workspace/impala-cdh5-2.12.0_5.15.0-exhaustive-thrift/repos/Impala/testdata/bin/split-hbase.sh
>  at line 41: "$JAVA" ${JAVA_KERBEROS_MAGIC} \
> 20:57:13 Error in 
> /data/jenkins/workspace/impala-cdh5-2.12.0_5.15.0-exhaustive-thrift/repos/Impala/bin/run-all-tests.sh
>  at line 48: # Run End-to-end Tests{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7245) test_kudu.py::test_kudu_insert() fails with "Error adding columns to Kudu table tbl_with_defaults"

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7245:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> test_kudu.py::test_kudu_insert() fails with "Error adding columns to Kudu 
> table tbl_with_defaults"
> --
>
> Key: IMPALA-7245
> URL: https://issues.apache.org/jira/browse/IMPALA-7245
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Tianyi Wang
>Priority: Critical
>  Labels: broken-build, flaky
>
> query_test.test_kudu.TestKuduOperations.test_kudu_insert hit the following 
> error:
> {noformat}
> query_test/test_kudu.py:93: in test_kudu_insert
> self.run_test_case('QueryTest/kudu_insert', vector, 
> use_db=unique_database)
> common/impala_test_suite.py:405: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:620: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:343: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:339: in execute_query_async
> return self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:483: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION: 
> EMESSAGE: ImpalaRuntimeException: Error adding columns to Kudu table 
> tbl_with_defaults
> E   CAUSED BY: NonRecoverableException: The column already exists: j{noformat}
> This is very similar to what we saw in IMPALA-6107, which we couldn't 
> reproduce.
> It seems to be related to our interaction with the Kudu client. The statement 
> starts executing at 23:05:11.17. From impalad.INFO:
> {noformat}
> I0629 23:05:11.170009 20602 impala-beeswax-server.cc:54] query(): query=alter 
> table tbl_with_defaults add columns (j int null, k int not null default 
> 1){noformat}
> This results in actions in catalogd, but catalogd encounters a timeout in the 
> AsyncKuduClient while processing this and then immediately hits the error:
> {noformat}
> I0629 23:05:11.212054 28109 AsyncKuduClient.java:1756] Invalidating location 
> master-127.0.0.1:7051(127.0.0.1:7051) for tablet Kudu Master: [peer 
> master-127.0.0.1:7051(127.0.0.1:7051)] encountered a read timeout; closing 
> the channel
> I0629 23:05:11.288261 11518 jni-util.cc:230] 
> org.apache.impala.common.ImpalaRuntimeException: Error adding columns to Kudu 
> table tbl_with_defaults
> at 
> org.apache.impala.service.KuduCatalogOpExecutor.alterKuduTable(KuduCatalogOpExecutor.java:499)
> at 
> org.apache.impala.service.KuduCatalogOpExecutor.addColumn(KuduCatalogOpExecutor.java:412)
> at 
> org.apache.impala.service.CatalogOpExecutor.alterKuduTable(CatalogOpExecutor.java:600)
> at 
> org.apache.impala.service.CatalogOpExecutor.alterTable(CatalogOpExecutor.java:420)
> at 
> org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:270)
> at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:146)
> Caused by: org.apache.kudu.client.NonRecoverableException: The column already 
> exists: j
> at 
> org.apache.kudu.client.KuduException.transformException(KuduException.java:110)
> at 
> org.apache.kudu.client.KuduClient.joinAndHandleException(KuduClient.java:351)
> at org.apache.kudu.client.KuduClient.alterTable(KuduClient.java:141)
> at 
> org.apache.impala.service.KuduCatalogOpExecutor.alterKuduTable(KuduCatalogOpExecutor.java:494)
> ... 5 more
> {noformat}
> The Kudu master sees the alter table request, then the connection is torn 
> down, and then a duplicate alter table request arrives:
> {noformat}
> I0629 23:05:11.197782 16157 catalog_manager.cc:2086] Servicing AlterTable 
> request from {username='jenkins'} at 127.0.0.1:34332:
> table { table_name: "impala::test_kudu_insert_ca9324f5.tbl_with_defaults" } 
> alter_schema_steps { type: ADD_COLUMN add_column { schema { name: "j" type: 
> INT32 is_key: false is_nullable: true cfile_block_size: 0 } } } 
> alter_schema_steps { type: ADD_COLUMN add_column { schema { name: "k" type: 
> INT32 is_key: false is_nullable: false read_default_value: "\020\'\000\000" 
> cfile_block_size: 0 } } }
> W0629 23:05:11.218549 15925 connection.cc:420] Connection torn down before 
> Call kudu.master.MasterService.AlterTable from 127.0.0.1:34332 (request call 
> id 24) could 

[jira] [Updated] (IMPALA-7282) Sentry privilege disappears after a catalog refresh

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7282:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Sentry privilege disappears after a catalog refresh
> ---
>
> Key: IMPALA-7282
> URL: https://issues.apache.org/jira/browse/IMPALA-7282
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog, Security
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Priority: Critical
>  Labels: security
>
> {noformat}
> [localhost:21000] default> grant select on database functional to role 
> foo_role;
> Query: grant select on database functional to role foo_role
> +---------------------------------+
> | summary                         |
> +---------------------------------+
> | Privilege(s) have been granted. |
> +---------------------------------+
> Fetched 1 row(s) in 0.05s
> [localhost:21000] default> grant all on database functional to role foo_role;
> Query: grant all on database functional to role foo_role
> +---------------------------------+
> | summary                         |
> +---------------------------------+
> | Privilege(s) have been granted. |
> +---------------------------------+
> Fetched 1 row(s) in 0.03s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +----------+------------+-------+--------+-----+-----------+--------------+-------------+
> | scope    | database   | table | column | uri | privilege | grant_option | create_time |
> +----------+------------+-------+--------+-----+-----------+--------------+-------------+
> | database | functional |       |        |     | select    | false        | NULL        |
> | database | functional |       |        |     | all       | false        | NULL        |
> +----------+------------+-------+--------+-----+-----------+--------------+-------------+
> Fetched 2 row(s) in 0.02s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +----------+------------+-------+--------+-----+-----------+--------------+-------------------------------+
> | scope    | database   | table | column | uri | privilege | grant_option | create_time                   |
> +----------+------------+-------+--------+-----+-----------+--------------+-------------------------------+
> | database | functional |       |        |     | all       | false        | Wed, Jul 11 2018 15:38:41.113 |
> +----------+------------+-------+--------+-----+-----------+--------------+-------------------------------+
> Fetched 1 row(s) in 0.01s
> {noformat}






[jira] [Updated] (IMPALA-7117) Lower debug level for HDFS S3 connector back to INFO

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7117:
-
Target Version: Impala 2.13.0, Impala 3.3.0  (was: Impala 2.13.0, Impala 
3.2.0)

> Lower debug level for HDFS S3 connector back to INFO
> 
>
> Key: IMPALA-7117
> URL: https://issues.apache.org/jira/browse/IMPALA-7117
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Lars Volker
>Assignee: Lars Volker
>Priority: Blocker
>  Labels: s3
>
> This change will increase the log level for the HDFS S3 connector to DEBUG to 
> help with IMPALA-6910 and IMPALA-7070. Before the next release we need to 
> lower it again.
> https://gerrit.cloudera.org/#/c/10596/
> I'm making this a P1 to remind us that we must do this before cutting a 
> release.
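For reference, lowering the level back would likely be a one-line change in the log4j configuration. This is a sketch under the assumption that DEBUG was enabled via the standard S3A connector logger name, `org.apache.hadoop.fs.s3a`; the actual file and logger touched by the Gerrit change above may differ:

```properties
# Revert the HDFS S3A connector logger from DEBUG back to the default INFO
log4j.logger.org.apache.hadoop.fs.s3a=INFO
```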






[jira] [Updated] (IMPALA-7194) Impala crashed at FragmentInstanceState::Open()

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7194:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Impala crashed at FragmentInstanceState::Open()
> ---
>
> Key: IMPALA-7194
> URL: https://issues.apache.org/jira/browse/IMPALA-7194
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Xiaomin Zhang
>Assignee: Zoltán Borók-Nagy
>Priority: Critical
>  Labels: crash
> Attachments: f2eef59a-e25a-4dcf-a5b83da0-ab63e8e1.dmp, 
> f2eef59a-e25a-4dcf-a5b83da0-ab63e8e1.dmp.resolve
>
>
>  Below is the crash stack resolved from the minidump:
>  {code}
> Thread 3473 (crashed)
> 0 0x7fb0222e9000
> 1 impalad!impala::FragmentInstanceState::Open() [fragment-instance-state.cc : 
> 255 + 0x11]
> 2 impalad!impala::FragmentInstanceState::Exec() [fragment-instance-state.cc : 
> 80 + 0xb]
> 3 impalad!impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) 
> [query-state.cc : 382 + 0x10]
> 4 impalad!impala::Thread::SuperviseThread(std::string const&, std::string 
> const&, boost::function, impala::Promise*) 
> [function_template.hpp : 767 + 0x7]
> 5 impalad!boost::detail::thread_data (std::string const&, std::string const&, boost::function, 
> impala::Promise), boost::_bi::list4, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value> > > >::run() [bind.hpp : 457 + 0x6]
> 6 impalad!thread_proxy + 0xda
> {code}
> We do not have further details about this crash, and we have only seen it 
> happen once.






[jira] [Updated] (IMPALA-7083) AnalysisException for GROUP BY and ORDER BY expressions that are folded to constants from 2.9 onwards

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7083:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> AnalysisException for GROUP BY and ORDER BY expressions that are folded to 
> constants from 2.9 onwards
> -
>
> Key: IMPALA-7083
> URL: https://issues.apache.org/jira/browse/IMPALA-7083
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.9.0
>Reporter: Eric Lin
>Assignee: Paul Rogers
>Priority: Critical
>  Labels: regression
>
> To reproduce, please run the Impala query below:
> {code}
> DROP TABLE IF EXISTS test;
> CREATE TABLE test (a int);
> SELECT   ( 
> CASE 
>WHEN (1 =1) 
>THEN 1
>ELSE a
> end) AS b
> FROM  test 
> GROUP BY 1 
> ORDER BY ( 
> CASE 
>WHEN (1 =1) 
>THEN 1
>ELSE a
> end);
> {code}
> It fails with the error below:
> {code}
> ERROR: AnalysisException: ORDER BY expression not produced by aggregation 
> output (missing from GROUP BY clause?): (CASE WHEN TRUE THEN 1 ELSE a END)
> {code}
> However, if I replace the column reference "a" with a constant value, it works:
> {code}
> SELECT   ( 
> CASE 
>WHEN (1 =1) 
>THEN 1
>ELSE 2
> end) AS b
> FROM  test 
> GROUP BY 1 
> ORDER BY ( 
> CASE 
>WHEN (1 =1) 
>THEN 1
>ELSE 2
> end);
> {code}
> This issue occurs in CDH 5.12.x (Impala 2.9) but not in CDH 5.11.x 
> (Impala 2.8).
> We know that it can be worked around by rewriting the query as below:
> {code}
> SELECT   ( 
> CASE 
>WHEN (1 =1) 
>THEN 1
>ELSE a
> end) AS b
> FROM  test 
> GROUP BY 1 
> ORDER BY 1;
> {code}






[jira] [Updated] (IMPALA-7154) Error making 'dropDatabase' RPC to Hive Metastore

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7154:
-
Target Version: Impala 2.13.0, Impala 3.3.0  (was: Impala 2.13.0, Impala 
3.2.0)

> Error making 'dropDatabase' RPC to Hive Metastore
> -
>
> Key: IMPALA-7154
> URL: https://issues.apache.org/jira/browse/IMPALA-7154
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.13.0
>Reporter: Tim Armstrong
>Assignee: Vuk Ercegovac
>Priority: Critical
>  Labels: broken-build, flaky
> Attachments: TEST-impala-parallel.log.gz, 
> TEST-impala-parallel.xml.gz, 
> catalogd.ec2-m2-4xlarge-centos-6-4-0f46.vpc.cloudera.com.jenkins.log.INFO.20180608-024815.32143.gz,
>  hive.log.gz
>
>
> {noformat}
> conftest.py:293: in cleanup
> {'sync_ddl': sync_ddl})
> common/impala_test_suite.py:528: in wrapper
> return function(*args, **kwargs)
> common/impala_test_suite.py:535: in execute_query_expect_success
> result = cls.__execute_query(impalad_client, query, query_options)
> common/impala_test_suite.py:620: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:339: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:335: in execute_query_async
> return self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:460: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION: 
> EMESSAGE: ImpalaRuntimeException: Error making 'dropDatabase' RPC to Hive 
> Metastore: 
> E   CAUSED BY: NoSuchObjectException: test_resolution_by_name_56b45511
> {noformat}
> The backtrace in the catalogd log is:
> {noformat}
> I0608 05:49:26.111824 24195 jni-util.cc:230] 
> org.apache.impala.common.ImpalaRuntimeException: Error making 'dropDatabase' 
> RPC to Hive Metastore: 
> at 
> org.apache.impala.service.CatalogOpExecutor.dropDatabase(CatalogOpExecutor.java:1309)
> at 
> org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:300)
> at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:146)
> Caused by: NoSuchObjectException(message:test_resolution_by_name_56b45511)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_database_result$get_database_resultStandardScheme.read(ThriftHiveMetastore.java:16387)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_database_result$get_database_resultStandardScheme.read(ThriftHiveMetastore.java:16364)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_database_result.read(ThriftHiveMetastore.java:16295)
> at 
> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_database(ThriftHiveMetastore.java:702)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_database(ThriftHiveMetastore.java:689)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:1232)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabase(HiveMetaStoreClient.java:791)
> at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:101)
> at com.sun.proxy.$Proxy5.dropDatabase(Unknown Source)
> at 
> org.apache.impala.service.CatalogOpExecutor.dropDatabase(CatalogOpExecutor.java:1305)
> ... 2 more
> {noformat}






[jira] [Updated] (IMPALA-7326) test_kudu_partition_ddl failed with exception message: "Table already exists"

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7326:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> test_kudu_partition_ddl failed with exception message: "Table already exists"
> -
>
> Key: IMPALA-7326
> URL: https://issues.apache.org/jira/browse/IMPALA-7326
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0, Impala 3.2.0
>Reporter: Michael Ho
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>  Labels: broken-build, flaky, kudu
>
> cc'ing [~twm378]. Does this look like a known issue? I'm putting it in the 
> Catalog category for now, but please feel free to update the component as you 
> see fit.
> {noformat}
> query_test/test_kudu.py:96: in test_kudu_partition_ddl
> self.run_test_case('QueryTest/kudu_partition_ddl', vector, 
> use_db=unique_database)
> common/impala_test_suite.py:397: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:612: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:339: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:335: in execute_query_async
> return self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:460: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION: 
> EMESSAGE: ImpalaRuntimeException: Error creating Kudu table 
> 'impala::test_kudu_partition_ddl_7e04e8f9.simple_hash_range'
> E   CAUSED BY: NonRecoverableException: Table 
> impala::test_kudu_partition_ddl_7e04e8f9.simple_hash_range already exists 
> with id 3e81a4ceff27471cad9fcb3bc0b977c3
> {noformat}






[jira] [Updated] (IMPALA-7371) TestInsertQueries.test_insert fails on S3 with 0 rows returned

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7371:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> TestInsertQueries.test_insert fails on S3 with 0 rows returned
> --
>
> Key: IMPALA-7371
> URL: https://issues.apache.org/jira/browse/IMPALA-7371
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.0
>Reporter: David Knupp
>Assignee: bharath v
>Priority: Critical
> Attachments: catalogd_excerpt.INFO, impalad_excerpt.INFO, 
> profile.txt, profile_excerpt.log
>
>
> Stacktrace
> {noformat}
> query_test/test_insert.py:118: in test_insert
> multiple_impalad=vector.get_value('exec_option')['sync_ddl'] == 1)
> /data/jenkins/workspace/impala-cdh6.0.x-core-s3/repos/Impala/tests/common/impala_test_suite.py:426:
>  in run_test_case
> self.__verify_results_and_errors(vector, test_section, result, use_db)
> /data/jenkins/workspace/impala-cdh6.0.x-core-s3/repos/Impala/tests/common/impala_test_suite.py:299:
>  in __verify_results_and_errors
> replace_filenames_with_placeholder)
> /data/jenkins/workspace/impala-cdh6.0.x-core-s3/repos/Impala/tests/common/test_result_verifier.py:434:
>  in verify_raw_results
> VERIFIER_MAP[verifier](expected, actual)
> /data/jenkins/workspace/impala-cdh6.0.x-core-s3/repos/Impala/tests/common/test_result_verifier.py:261:
>  in verify_query_result_is_equal
> assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E 75,false,0,0,0,0,0,0,'04/01/09','0' != None
> E 76,true,1,1,1,10,1.10023841858,10.1,'04/01/09','1' != None
> E 77,false,2,2,2,20,2.20047683716,20.2,'04/01/09','2' != None
> E 78,true,3,3,3,30,3.29952316284,30.3,'04/01/09','3' != None
> E 79,false,4,4,4,40,4.40095367432,40.4,'04/01/09','4' != None
> E 80,true,5,5,5,50,5.5,50.5,'04/01/09','5' != None
> E 81,false,6,6,6,60,6.59904632568,60.6,'04/01/09','6' != None
> E 82,true,7,7,7,70,7.69809265137,70.7,'04/01/09','7' != None
> E 83,false,8,8,8,80,8.80190734863,80.8,'04/01/09','8' != None
> E 84,true,9,9,9,90,9.89618530273,90.91,'04/01/09','9' != 
> None
> E 85,false,0,0,0,0,0,0,'04/02/09','0' != None
> E 86,true,1,1,1,10,1.10023841858,10.1,'04/02/09','1' != None
> E 87,false,2,2,2,20,2.20047683716,20.2,'04/02/09','2' != None
> E 88,true,3,3,3,30,3.29952316284,30.3,'04/02/09','3' != None
> E 89,false,4,4,4,40,4.40095367432,40.4,'04/02/09','4' != None
> E 90,true,5,5,5,50,5.5,50.5,'04/02/09','5' != None
> E 91,false,6,6,6,60,6.59904632568,60.6,'04/02/09','6' != None
> E 92,true,7,7,7,70,7.69809265137,70.7,'04/02/09','7' != None
> E 93,false,8,8,8,80,8.80190734863,80.8,'04/02/09','8' != None
> E 94,true,9,9,9,90,9.89618530273,90.91,'04/02/09','9' != 
> None
> E 95,false,0,0,0,0,0,0,'04/03/09','0' != None
> E 96,true,1,1,1,10,1.10023841858,10.1,'04/03/09','1' != None
> E 97,false,2,2,2,20,2.20047683716,20.2,'04/03/09','2' != None
> E 98,true,3,3,3,30,3.29952316284,30.3,'04/03/09','3' != None
> E 99,false,4,4,4,40,4.40095367432,40.4,'04/03/09','4' != None
> E Number of rows returned (expected vs actual): 25 != 0
> {noformat}






[jira] [Updated] (IMPALA-7355) Ensure planner test coverage for memory estimates of all PlanNodes and DataSinks

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7355:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Ensure planner test coverage for memory estimates of all PlanNodes and 
> DataSinks
> 
>
> Key: IMPALA-7355
> URL: https://issues.apache.org/jira/browse/IMPALA-7355
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: resource-management
>
> After we've added missing estimates (IMPALA-7351), we should audit the 
> planner tests and make sure that we cover resource requirements of all the 
> operators.






[jira] [Updated] (IMPALA-7350) More accurate memory estimates for admission

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7350:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> More accurate memory estimates for admission
> 
>
> Key: IMPALA-7350
> URL: https://issues.apache.org/jira/browse/IMPALA-7350
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: admission-control, resource-management
>
> For IMPALA-7349, we will be relying more on memory estimates. This is an 
> umbrella JIRA to track improvements to memory estimates in cases where the 
> current estimates are far off and result in over- or under-admission; 
> over-admission is probably the more significant concern.






[jira] [Updated] (IMPALA-7312) Non-blocking mode for Fetch() RPC

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7312:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Non-blocking mode for Fetch() RPC
> -
>
> Key: IMPALA-7312
> URL: https://issues.apache.org/jira/browse/IMPALA-7312
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Clients
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: resource-management
>
> Currently Fetch() can block for an arbitrary amount of time until a batch of 
> rows is produced. It might be helpful to have a mode where it returns quickly 
> when there is no data available, so that threads and RPC slots are not tied 
> up.
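As a hypothetical sketch of what client code could do with such a mode — assuming an illustrative `fetch()` callable that returns immediately with `(rows, eos)` rather than blocking (this is not Impala's actual Beeswax/HS2 API) — a poll loop could drain batches without pinning a server thread:

```python
import time

def drain_results(fetch, poll_interval_s=0.1):
    """Poll a hypothetical non-blocking fetch() until the result set is
    exhausted. fetch() is assumed to return (rows, eos) immediately:
    rows is an empty list when no batch is ready yet, and eos is True
    once the query has produced all of its rows."""
    while True:
        rows, eos = fetch()              # returns at once, never blocks
        if rows:
            yield rows
        if eos:
            return
        if not rows:
            # Back off on the client side; no server thread or RPC slot
            # stays tied up while we wait for the next batch.
            time.sleep(poll_interval_s)

# Minimal fake fetch() to exercise the loop: a ready batch, an empty
# poll, then the final batch with end-of-stream set.
_batches = iter([([1, 2], False), ([], False), ([3], True)])

def fake_fetch():
    return next(_batches)

collected = list(drain_results(fake_fetch, poll_interval_s=0))
```

The point of the non-blocking variant is the sleep branch: the client waits locally between polls instead of occupying a coordinator thread for the duration of a slow batch.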






[jira] [Updated] (IMPALA-7523) Planner Test failing with "Failed to assign regions to servers after 60000 millis."

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7523:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Planner Test failing with "Failed to assign regions to servers after 60000 
> millis."
> ---
>
> Key: IMPALA-7523
> URL: https://issues.apache.org/jira/browse/IMPALA-7523
> Project: IMPALA
>  Issue Type: Task
>  Components: Frontend
>Reporter: Philip Zeyliger
>Priority: Critical
>  Labels: broken-build, flaky
>
> I've seen 
> {{org.apache.impala.planner.PlannerTest.org.apache.impala.planner.PlannerTest}}
>  fail with the following trace:
> {code}
> java.lang.IllegalStateException: Failed to assign regions to servers after 
> 60000 millis.
>   at 
> org.apache.impala.datagenerator.HBaseTestDataRegionAssignment.performAssignment(HBaseTestDataRegionAssignment.java:153)
>   at 
> org.apache.impala.planner.PlannerTestBase.setUp(PlannerTestBase.java:120)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> {code}
> I think we've seen it before as indicated in IMPALA-7061.






[jira] [Updated] (IMPALA-7665) Bringing up stopped statestore causes queries to fail

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7665:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Bringing up stopped statestore causes queries to fail
> -
>
> Key: IMPALA-7665
> URL: https://issues.apache.org/jira/browse/IMPALA-7665
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Critical
>  Labels: query-lifecycle, statestore
>
> I can reproduce this by running a long-running query then cycling the 
> statestore:
> {noformat}
> tarmstrong@tarmstrong-box:~/Impala/incubator-impala$ impala-shell.sh -q 
> "select distinct * from tpch10_parquet.lineitem"
> Starting Impala Shell without Kerberos authentication
> Connected to localhost:21000
> Server version: impalad version 3.1.0-SNAPSHOT DEBUG (build 
> c486fb9ea4330e1008fa9b7ceaa60492e43ee120)
> Query: select distinct * from tpch10_parquet.lineitem
> Query submitted at: 2018-10-04 17:06:48 (Coordinator: 
> http://tarmstrong-box:25000)
> {noformat}
> If I kill the statestore, the query runs fine, but if I start up the 
> statestore again, it fails.
> {noformat}
> # In one terminal, start up the statestore
> $ 
> /home/tarmstrong/Impala/incubator-impala/be/build/latest/statestore/statestored
>  -log_filename=statestored 
> -log_dir=/home/tarmstrong/Impala/incubator-impala/logs/cluster -v=1 
> -logbufsecs=5 -max_log_files=10
> # The running query then fails
> WARNINGS: Failed due to unreachable impalad(s): tarmstrong-box:22001, 
> tarmstrong-box:22002
> {noformat}
> Note that I've seen different subsets of impalads reported as failed, e.g. 
> "Failed due to unreachable impalad(s): tarmstrong-box:22001"






[jira] [Updated] (IMPALA-7672) Play nice with load balancers when shutting down coordinator

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7672:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Play nice with load balancers when shutting down coordinator
> 
>
> Key: IMPALA-7672
> URL: https://issues.apache.org/jira/browse/IMPALA-7672
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: resource-management
>
> This is a placeholder to figure out what we need to do to get load balancers 
> like HAProxy and F5 to cleanly switch to alternative coordinators when we do 
> a graceful shutdown. E.g. do we need to stop accepting new TCP connections?






[jira] [Updated] (IMPALA-7404) query_test.test_delimited_text.TestDelimitedText.test_delimited_text_newlines fails to return any rows

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7404:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> query_test.test_delimited_text.TestDelimitedText.test_delimited_text_newlines 
> fails to return any rows
> --
>
> Key: IMPALA-7404
> URL: https://issues.apache.org/jira/browse/IMPALA-7404
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Vuk Ercegovac
>Assignee: Csaba Ringhofer
>Priority: Blocker
>  Labels: broken-build, flaky-test, s3
>
> {noformat}
> query_test/test_delimited_text.py:65: in test_delimited_text_newlines
> assert len(result.data) == 2
> E   assert 0 == 2
> E+  where 0 = len([])
> E+where [] =  at 0x63977d0>.data{noformat}
> Expected results from this query after first inserting:
> {noformat}
> insert into test_delimited_text_newlines_ff243aaa.nl_queries values 
> ("the\n","\nquick\nbrown","fox\n"), ("\njumped","over the lazy\n","\ndog");
> select * from test_delimited_text_newlines_ff243aaa.nl_queries;
> {noformat}






[jira] [Updated] (IMPALA-7604) In AggregationNode.computeStats, handle cardinality overflow better

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7604:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> In AggregationNode.computeStats, handle cardinality overflow better
> ---
>
> Key: IMPALA-7604
> URL: https://issues.apache.org/jira/browse/IMPALA-7604
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.12.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>
> Consider the cardinality overflow logic in 
> [{{AggregationNode.computeStats()}}|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/planner/AggregationNode.java].
>  Current code:
> {noformat}
> // if we ended up with an overflow, the estimate is certain to be wrong
> if (cardinality_ < 0) cardinality_ = -1;
> {noformat}
> This code has a number of issues.
> * The check is done after looping over all conjuncts. It could be that, as a 
> result, the number overflowed twice. The check should be done after each 
> multiplication.
> * Since we know that the number overflowed, a better estimate of the total 
> count is {{Long.MAX_VALUE}}.
> * The code later checks for the -1 value and, if found, uses the cardinality 
> of the first child. This is a worse estimate than using the max value, since 
> the first child might have a low cardinality (it could be the later children 
> that caused the overflow).
> * If we really do expect overflow, then we are dealing with very large 
> numbers. Being accurate to the row is not needed. Better to use a {{double}} 
> which can handle the large values.
> Since overflow probably seldom occurs, this is not an urgent issue. Though, 
> if overflow does occur, the query is huge, and having at least some estimate 
> of the hugeness is better than none. Also, seems that this code probably 
> evolved; this newbie is looking at it fresh and seeing that the accumulated 
> fixes could be tidied up.
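The per-multiplication check suggested above can be sketched as a saturating multiply: clamp to {{Long.MAX_VALUE}} the moment an overflow would occur, rather than testing the sign once at the end. This is an illustrative helper under that assumption, not Impala's actual AggregationNode code, and the class and method names are hypothetical:

```java
// Hypothetical sketch: saturate the running cardinality estimate after each
// multiplication so a double overflow can never wrap back to a positive value.
public class CardinalityMath {
  // Returns a * b, clamped to Long.MAX_VALUE on overflow.
  // Assumes both inputs are non-negative, as cardinality estimates are.
  public static long saturatingMultiply(long a, long b) {
    if (a == 0 || b == 0) return 0;
    if (a > Long.MAX_VALUE / b) return Long.MAX_VALUE;  // would overflow
    return a * b;
  }
}
```

Applied inside the conjunct loop, the estimate can only ever degrade to "huge", which is closer to the truth than falling back to the first child's cardinality.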






[jira] [Updated] (IMPALA-7471) Impala crashes or returns incorrect results when querying parquet nested types

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7471:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Impala crashes or returns incorrect results when querying parquet nested types
> --
>
> Key: IMPALA-7471
> URL: https://issues.apache.org/jira/browse/IMPALA-7471
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Csaba Ringhofer
>Priority: Critical
>  Labels: correctness, crash, parquet
> Attachments: test_users_131786401297925138_0.parquet
>
>
> From 
> http://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Impala-bug-with-nested-arrays-of-structures-where-some-of/m-p/78507/highlight/false#M4779
> {quote}We found a case where Impala returns incorrect values from a simple 
> query. Our data contains a nested array of structures, and those structures 
> contain other structures.
> We generated minimal sample data allowing to reproduce the issue.
>  
> SQL to create a table:
> {quote}
> {code}
> CREATE TABLE plat_test.test_users (
>   id INT,
>   name STRING,   
>   devices ARRAY<
> STRUCT<
>   id:STRING,
>   device_info:STRUCT<
> model:STRING
>   >
> >
>   >
> )
> STORED AS PARQUET
> {code}
> {quote}
> Please put attached parquet file to the location of the table and refresh the 
> table.
> In sample data we have 2 users, one with 2 devices, second one with 3. Some 
> of the devices.device_info.model fields are NULL.
>  
> When I issue a query:
> {quote}
> {code}
> SELECT u.name, d.device_info.model as model
> FROM test_users u,
> u.devices d;
> {code}
>  {quote}
> I'm expecting to get 5 records in the results, but getting only one (screenshot: 1.png).
> If I change query to:
>  {quote}
> {code}
> SELECT u.name, d.device_info.model as model
> FROM test_users u
> LEFT OUTER JOIN u.devices d;
>  {code}
> {quote}
> I'm getting two records in the results, but still not as it should be.
> We found some workaround to this problem. If we add to the result columns 
> device.id we will get all records from parquet file:
> {quote}
> {code}
> SELECT u.name, d.id, d.device_info.model as model
> FROM test_users u
> , u.devices d
>  {code}
> {quote}
> And the result is shown in 3.png.
>  
> But we can't rely on this workaround, because we don't need device.id in all 
> queries; Impala optimizes it away, and as a result we get unpredictable 
> results.
>  
> I tested Hive query on this table and it returns expected results:
> {quote}
> {code}
> SELECT u.name, d.device_info.model
> FROM test_users u
> lateral view outer inline (u.devices) d;
>  {code}
> {quote}
> The results are shown in 4.png.
> Please advise if it's a problem in the Impala engine or we made some mistake 
> in our query.
>  
> Best regards,
> Come2Play team.
> {quote}






[jira] [Updated] (IMPALA-7482) Deadlock with unknown lock holder in JVM in java.security.Provider.getService()

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7482:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Deadlock with unknown lock holder in JVM in 
> java.security.Provider.getService()
> ---
>
> Key: IMPALA-7482
> URL: https://issues.apache.org/jira/browse/IMPALA-7482
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Tim Armstrong
>Assignee: Joe McDonnell
>Priority: Critical
>  Labels: hang
> Attachments: vb1220-jstack2.out
>
>
> We've seen several instances of these mystery deadlocks in impalad's embedded 
> JVM. The signature is a deadlock stemming from sun.security.provider.Sun 
> being locked by an unknown owner.
> {noformat}
> Found one Java-level deadlock:
> =
> "Thread-24":
>   waiting to lock monitor 0x12364688 (object 0x8027ef30, a 
> sun.security.provider.Sun),
>   which is held by UNKNOWN_owner_addr=0x14120800
> {noformat}
> If this happens in HDFS, it causes HDFS I/O to hang and queries to get stuck. 
> If it happens in the Kudu client it also causes hangs.






[jira] [Updated] (IMPALA-7670) Drop table with a concurrent refresh throws ConcurrentModificationException

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7670:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Drop table with a concurrent refresh throws ConcurrentModificationException
> ---
>
> Key: IMPALA-7670
> URL: https://issues.apache.org/jira/browse/IMPALA-7670
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: bharath v
>Assignee: Tianyi Wang
>Priority: Critical
>
> * This bug was found on a V2 Catalog and probably also applies to V1.
> Saw this in the Catalog server.
> {noformat}
> I1004 16:38:55.236702 85380 jni-util.cc:308] 
> java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextNode(HashMap.java:1442)
> at java.util.HashMap$ValueIterator.next(HashMap.java:1471)
> at org.apache.impala.catalog.FeFsTable$Utils.getPartitionFromThriftPartitionSpec(FeFsTable.java:407)
> at org.apache.impala.catalog.HdfsTable.getPartitionFromThriftPartitionSpec(HdfsTable.java:694)
> at org.apache.impala.catalog.Catalog.getHdfsPartition(Catalog.java:407)
> at org.apache.impala.catalog.Catalog.getHdfsPartition(Catalog.java:386)
> at org.apache.impala.service.CatalogOpExecutor.bulkAlterPartitions(CatalogOpExecutor.java:3193)
> at org.apache.impala.service.CatalogOpExecutor.dropTableStats(CatalogOpExecutor.java:1255)
> at org.apache.impala.service.CatalogOpExecutor.dropStats(CatalogOpExecutor.java:1148)
> at org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:301)
> at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:157)
> {noformat}
> Still need to dig into it, but seems like something is off with locking 
> somewhere.






[jira] [Updated] (IMPALA-7784) Partition pruning handles escaped strings incorrectly

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7784:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Partition pruning handles escaped strings incorrectly
> -
>
> Key: IMPALA-7784
> URL: https://issues.apache.org/jira/browse/IMPALA-7784
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Csaba Ringhofer
>Assignee: bharath v
>Priority: Critical
>  Labels: correctness
>
> Repro:
> {code}
> create table tpart (i int) partitioned by (p string)
> insert into tpart partition (p="\"") values (1);
> select  * from tpart where p = "\"";
> Result:
> Fetched 0 row(s)
> select  * from tpart where p = '"';
> Result:
> 1,
> {code}
> Hive returns the row for both queries.
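One way to see why both predicates should match: after unescaping, the literals {{"\""}} and {{'"'}} denote the same one-character value, so partition pruning needs to compare normalized values, not raw literal text. A minimal sketch of that normalization follows; the helper name and the set of escapes handled are assumptions for illustration, not Impala's actual parser code:

```java
// Hypothetical helper: unescape the body of a quoted SQL string literal so
// that "\"" and '"' compare equal against the stored partition value.
// Only backslash escapes of the following character are handled here.
public class LiteralUnescape {
  public static String unescape(String body) {
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < body.length(); i++) {
      char c = body.charAt(i);
      if (c == '\\' && i + 1 < body.length()) {
        out.append(body.charAt(++i));  // keep the escaped character itself
      } else {
        out.append(c);
      }
    }
    return out.toString();
  }
}
```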






[jira] [Updated] (IMPALA-7782) discrepancy in results with a subquery containing an agg that produces an empty set

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7782:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> discrepancy in results with a subquery containing an agg that produces an 
> empty set
> ---
>
> Key: IMPALA-7782
> URL: https://issues.apache.org/jira/browse/IMPALA-7782
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.12.0, Impala 3.1.0
>Reporter: Michael Brown
>Assignee: Paul Rogers
>Priority: Major
>  Labels: correctness, query_generator
>
> A discrepancy exists between Impala and Postgres when a subquery contains an 
> agg and results in an empty set, yet the WHERE clause looking at the subquery 
> should produce a "True" condition.
> Example queries include:
> {noformat}
> USE functional;
> SELECT id
> FROM alltypestiny
> WHERE -1 NOT IN (SELECT COUNT(id) FROM alltypestiny HAVING false);
> SELECT id
> FROM alltypestiny
> WHERE NULL NOT IN (SELECT COUNT(id) FROM alltypestiny HAVING false);
> SELECT id
> FROM alltypestiny
> WHERE (SELECT COUNT(id) FROM alltypestiny HAVING false) IS NULL;
> {noformat}
> These queries do not produce any rows in Impala. In Postgres, the queries 
> produce all 8 rows for the functional.alltypestiny id column.
> Thinking maybe there were Impala and Postgres differences with {{NOT IN}} 
> behavior, I also tried this:
> {noformat}
> USE functional;
> SELECT id
> FROM alltypestiny
> WHERE -1 NOT IN (SELECT 1 FROM alltypestiny WHERE bool_col IS NULL);
> {noformat}
> This subquery also produces an empty set just like the subquery in the 
> problematic queries at the top, but unlike those queries, this full query 
> returns the same results in Impala and Postgres (all 8 rows for the 
> functional.alltypestiny id column).
> For anyone interested in this bug, you can migrate data into postgres in a 
> dev environment using
> {noformat}
> tests/comparison/data_generator.py --use-postgresql --migrate-table-names 
> alltypestiny --db-name functional migrate
> {noformat}
> This is in 2.12 at least, so it's not a 3.1 regression.






[jira] [Updated] (IMPALA-7825) Upgrade Thrift version to 0.11.0

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7825:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Upgrade Thrift version to 0.11.0
> 
>
> Key: IMPALA-7825
> URL: https://issues.apache.org/jira/browse/IMPALA-7825
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: Lars Volker
>Assignee: Sahil Takiar
>Priority: Major
>  Labels: performance
> Attachments: thrift-0.11-upgrade.patch
>
>
> Thrift has added performance improvements to its Python deserialization code. 
> We should upgrade to 0.11.0 to make use of those.






[jira] [Updated] (IMPALA-7732) Check / Implement resource limits documented in IMPALA-5605

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7732:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Check / Implement resource limits documented in IMPALA-5605
> ---
>
> Key: IMPALA-7732
> URL: https://issues.apache.org/jira/browse/IMPALA-7732
> Project: IMPALA
>  Issue Type: Task
>  Components: Backend
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Michael Ho
>Priority: Critical
>
> IMPALA-5605 documents a list of recommended bumps to system resource limits 
> which may be necessary when running Impala at scale. We may consider checking 
> those limits at startup with {{getrlimit()}} and potentially setting them 
> with {{setrlimit()}} if possible. At the minimum, it may be helpful to log a 
> warning message if a limit is below a certain threshold.






[jira] [Updated] (IMPALA-7733) TestInsertParquetQueries.test_insert_parquet is flaky in S3 due to rename

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7733:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> TestInsertParquetQueries.test_insert_parquet is flaky in S3 due to rename
> -
>
> Key: IMPALA-7733
> URL: https://issues.apache.org/jira/browse/IMPALA-7733
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Vuk Ercegovac
>Assignee: Tianyi Wang
>Priority: Blocker
>  Labels: broken-build, flaky
>
> I see two examples in the past two months or so where this test fails due to 
> a rename error on S3. The test's stacktrace looks like this:
> {noformat}
> query_test/test_insert_parquet.py:112: in test_insert_parquet
> self.run_test_case('insert_parquet', vector, unique_database, 
> multiple_impalad=True)
> common/impala_test_suite.py:408: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:625: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:176: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:350: in __execute_query
> self.wait_for_finished(handle)
> beeswax/impala_beeswax.py:371: in wait_for_finished
> raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EQuery aborted:Error(s) moving partition files. First error (of 1) was: 
> Hdfs op (RENAME 
> s3a:///test_insert_parquet_968f37fe.db/orders_insert_table/_impala_insert_staging/4e45cd68bcddd451_3c7156ed/.4e45cd68bcddd451-3c7156ed0002_803672621_dir/4e45cd68bcddd451-3c7156ed0002_448261088_data.0.parq
>  TO 
> s3a:///test-warehouse/test_insert_parquet_968f37fe.db/orders_insert_table/4e45cd68bcddd451-3c7156ed0002_448261088_data.0.parq)
>  failed, error was: 
> s3a:///test-warehouse/test_insert_parquet_968f37fe.db/orders_insert_table/_impala_insert_staging/4e45cd68bcddd451_3c7156ed/.4e45cd68bcddd451-3c7156ed0002_803672621_dir/4e45cd68bcddd451-3c7156ed0002_448261088_data.0.parq
> E   Error(5): Input/output error{noformat}
> Since we know this happens once in a while, some ideas to deflake it:
>  * retry
>  * check for this specific issue... if we think it's platform flakiness, then 
> we should skip it.






[jira] [Updated] (IMPALA-7833) Audit and fix other string builtins for long string handling

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7833:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Audit and fix other string builtins for long string handling
> 
>
> Key: IMPALA-7833
> URL: https://issues.apache.org/jira/browse/IMPALA-7833
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0, Impala 3.0, Impala 3.1.0
>Reporter: Tim Armstrong
>Priority: Critical
>  Labels: crash, ramp-up
>
> Following on from IMPALA-7822, there are some other string builtins that seem 
> to follow the same pattern of having a string size overflow an int passed 
> into the StringVal constructor. I think in some cases we get lucky and it 
> works out, but in others it seems possible to crash given the right input 
> values. 
> Here are some examples of cases where we can hit such bugs:
> {noformat}
> select lpad('foo', 17179869184 , ' ');
> select rpad('foo', 17179869184 , ' ');
> select space(17179869184 );
> {noformat}
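The common failure mode described here is a user-supplied 64-bit length being narrowed into the 32-bit size the string constructor expects. A guard checked before allocation would reject such inputs; the class and method names below are illustrative assumptions, not Impala's actual StringVal API:

```java
// Hypothetical sketch: validate a requested pad/space length before it is
// narrowed to the 32-bit int an allocator or StringVal-like type expects.
public class StringLenCheck {
  // Returns true only if the 64-bit length can be narrowed to int
  // without wrapping to a negative or truncated value.
  public static boolean fitsInInt(long requestedLen) {
    return requestedLen >= 0 && requestedLen <= Integer.MAX_VALUE;
  }
}
```

A builtin like {{lpad('foo', 17179869184, ' ')}} would then raise an error instead of constructing a value from a wrapped length.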






[jira] [Updated] (IMPALA-7802) Implement support for closing idle sessions

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7802:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Implement support for closing idle sessions
> ---
>
> Key: IMPALA-7802
> URL: https://issues.apache.org/jira/browse/IMPALA-7802
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Clients
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Michael Ho
>Priority: Critical
>  Labels: supportability
>
> Currently, the query option {{idle_session_timeout}} specifies a timeout in 
> seconds after which all running queries of that idle session will be 
> cancelled and no new queries can be issued to it. However, the idle session 
> will remain open and it needs to be closed explicitly. Please see the 
> [documentation|https://www.cloudera.com/documentation/enterprise/latest/topics/impala_idle_session_timeout.html]
>  for details.
> This behavior may be undesirable as each session still consumes an Impala 
> frontend service thread. The number of frontend service threads is bound by 
> the flag {{fe_service_threads}}. So, in a multi-tenant environment, an Impala 
> server can have a lot of idle sessions but they still consume against the 
> quota of {{fe_service_threads}}. If the number of sessions established 
> reaches {{fe_service_threads}}, all new session creations will block until 
> some of the existing sessions exit. There may be no time bound on when these 
> zombie idle sessions will be closed and it's at the mercy of the client 
> implementation to close them. In some sense, leaving many idle sessions open 
> is a way to launch a denial of service attack on Impala.
> To fix this situation, we should have an option to forcefully close a session 
> when it's considered idle so it won't unnecessarily consume the limited 
> number of frontend service threads. cc'ing [~zoram]
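The proposed fix amounts to tracking each session's last activity and forcibly closing any session that has been idle past the timeout, freeing its service thread. A minimal sketch under that assumption follows; the names and structure are hypothetical, not Impala's implementation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: record last-activity timestamps per session and
// report which sessions have exceeded the idle timeout and should be closed.
public class IdleSessionReaper {
  static final long IDLE_TIMEOUT_MS = 60_000;  // illustrative default
  final Map<String, Long> lastActiveMs = new ConcurrentHashMap<>();

  // Called whenever a session issues a request.
  public void touch(String sessionId, long nowMs) {
    lastActiveMs.put(sessionId, nowMs);
  }

  // Returns the sessions that should be force-closed at time nowMs.
  public List<String> expired(long nowMs) {
    List<String> out = new ArrayList<>();
    for (Map.Entry<String, Long> e : lastActiveMs.entrySet()) {
      if (nowMs - e.getValue() > IDLE_TIMEOUT_MS) out.add(e.getKey());
    }
    return out;
  }
}
```

A periodic maintenance thread would call {{expired()}} and close each returned session, bounding how long a zombie session can hold a frontend service thread.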






[jira] [Updated] (IMPALA-7910) COMPUTE STATS does an unnecessary REFRESH after writing to the Metastore

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7910:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> COMPUTE STATS does an unnecessary REFRESH after writing to the Metastore
> 
>
> Key: IMPALA-7910
> URL: https://issues.apache.org/jira/browse/IMPALA-7910
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.9.0, Impala 2.11.0, Impala 2.12.0
>Reporter: Michael Brown
>Assignee: Anurag Mantripragada
>Priority: Critical
>
> COMPUTE STATS and possibly other DDL operations unnecessarily do the 
> equivalent of a REFRESH after writing to the Hive Metastore. This unnecessary 
> operation can be very expensive, so should be avoided.
> The behavior can be confirmed from the catalogd logs:
> {code}
> compute stats functional_parquet.alltypes;
> +---+
> | summary   |
> +---+
> | Updated 24 partition(s) and 11 column(s). |
> +---+
> Relevant catalogd.INFO snippet
> I0413 14:40:24.210749 27295 HdfsTable.java:1263] Incrementally loading table 
> metadata for: functional_parquet.alltypes
> I0413 14:40:24.242122 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=1: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.244634 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=10: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.247174 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=11: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.249713 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=12: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.252288 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=2: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.254629 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=3: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.256991 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=4: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.259464 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=5: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.262197 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=6: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.264463 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=7: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.266736 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=8: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.269210 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=9: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.271800 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=1: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.274348 27295 HdfsTable.java:555] Refreshed file metadata for 

[jira] [Updated] (IMPALA-7862) Conversion of timestamps after validation can move them out of range in Parquet scans

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7862:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Conversion of timestamps after validation can move them out of range in 
> Parquet scans
> -
>
> Key: IMPALA-7862
> URL: https://issues.apache.org/jira/browse/IMPALA-7862
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Tim Armstrong
>Assignee: Csaba Ringhofer
>Priority: Major
>  Labels: parquet
>
> On https://gerrit.cloudera.org/#/c/8319/ Csaba observed that the sequencing 
> of conversion and validation could result in invalid timestamps.






[jira] [Updated] (IMPALA-7969) Always admit trivial queries immediately

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7969:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Always admit trivial queries immediately
> 
>
> Key: IMPALA-7969
> URL: https://issues.apache.org/jira/browse/IMPALA-7969
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: admission-control
>
> Here are two common query types that consume minimal resources:
> * {{select ... from ... limit 0}}, which is used by some clients to determine 
> column types
> * {{select <constant expr>, <constant expr>, ...}}, which just evaluates some 
> constant expressions on the coordinator
> Currently these queries get queued if there are existing queued queries or 
> the limit on the number of running queries is exceeded, which is inconvenient 
> for use cases where latency is important. I think the planner should identify 
> trivial queries and the admission controller should admit them immediately.
> Here's an initial thought on the definition of a trivial query:
> * Must have PLAN ROOT SINK as the root
> * Can contain UNION and EMPTYSET nodes only
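The two criteria above are easy to check mechanically over a plan tree. A minimal sketch in Python (illustrative only — Impala's planner is Java/C++, and the node names here are assumptions, not real classes):

```python
# Minimal sketch of trivial-query detection over a plan tree.
# Node kinds and structure are illustrative, not Impala's real classes.

class PlanNode:
    def __init__(self, kind, children=()):
        self.kind = kind            # e.g. "PLAN_ROOT_SINK", "UNION", "EMPTYSET"
        self.children = list(children)

def is_trivial(root):
    """A query is 'trivial' if the root is a PLAN_ROOT_SINK and the rest
    of the tree contains only UNION and EMPTYSET nodes."""
    if root.kind != "PLAN_ROOT_SINK":
        return False
    stack = list(root.children)
    while stack:
        node = stack.pop()
        if node.kind not in ("UNION", "EMPTYSET"):
            return False
        stack.extend(node.children)
    return True

# select 1, 2, 3 -> a PLAN_ROOT_SINK over a constant UNION: trivial.
trivial = PlanNode("PLAN_ROOT_SINK", [PlanNode("UNION")])
# A real scan underneath makes the query non-trivial.
scan = PlanNode("PLAN_ROOT_SINK", [PlanNode("UNION", [PlanNode("SCAN_HDFS")])])
print(is_trivial(trivial))  # True
print(is_trivial(scan))     # False
```

Admission control would consult such a check before queueing, bypassing the queue for the trivial case.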






[jira] [Updated] (IMPALA-7957) UNION ALL query returns incorrect results

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7957:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> UNION ALL query returns incorrect results
> -
>
> Key: IMPALA-7957
> URL: https://issues.apache.org/jira/browse/IMPALA-7957
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.12.0
>Reporter: Luis E Martinez-Poblete
>Assignee: Paul Rogers
>Priority: Blocker
>  Labels: correctness
>
> Synopsis:
> =
> UNION ALL query returns incorrect results
> Problem:
> 
> Customer reported a UNION ALL query returning incorrect results. The UNION 
> ALL query has two legs, but Impala is only returning information from one leg.
> The issue can be reproduced in the latest version of Impala. Below is the 
> reproduction case:
> {noformat}
> create table mytest_t (c1 timestamp, c2 timestamp, c3 int, c4 int);
> insert into mytest_t values (now(), ADDDATE (now(),1), 1,1);
> insert into mytest_t values (now(), ADDDATE (now(),1), 2,2);
> insert into mytest_t values (now(), ADDDATE (now(),1), 3,3);
> SELECT t.c1
> FROM
>  (SELECT c1, c2
>  FROM mytest_t) t
> LEFT JOIN
>  (SELECT c1, c2
>  FROM mytest_t
>  WHERE c2 = c1) t2 ON (t.c2 = t2.c2)
> UNION ALL
> VALUES (NULL)
> {noformat}
> The above query produces the following execution plan:
> {noformat}
> Max Per-Host Resource Reservation: Memory=34.02MB Threads=5
> Per-Host Resource Estimates: Memory=2.06GB
> WARNING: The following tables are missing relevant table and/or column statistics.
> default.mytest_t
> 
> PLAN-ROOT SINK
> |
> 06:EXCHANGE [UNPARTITIONED]
> |
> 00:UNION
> |  constant-operands=1
> |
> 04:SELECT
> |  predicates: default.mytest_t.c1 = default.mytest_t.c2
> |
> 03:HASH JOIN [LEFT OUTER JOIN, BROADCAST]
> |  hash predicates: c2 = c2
> |
> |--05:EXCHANGE [BROADCAST]
> |  |
> |  02:SCAN HDFS [default.mytest_t]
> |     partitions=1/1 files=3 size=192B
> |     predicates: c2 = c1
> |
> 01:SCAN HDFS [default.mytest_t]
>    partitions=1/1 files=3 size=192B
> {noformat}
> The issue is in operator 4:
> {noformat}
> 04:SELECT
> |  predicates: default.mytest_t.c1 = default.mytest_t.c2
> {noformat}
> It's definitely a bug with predicate placement - that c1 = c2 predicate 
> shouldn't be evaluated outside the right branch of the LEFT JOIN.
> Thanks,
> Luis Martinez.
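The misplacement can be reproduced in miniature: a predicate that belongs to the right branch of a LEFT JOIN must filter the right input before the join, not the joined rows afterwards. A toy Python sketch (made-up rows, not Impala's executor):

```python
# Toy rows (c1, c2) mirroring the repro table; not real Impala execution.
rows = [(1, 2), (3, 3), (5, 6)]

def left_join(left, right):
    """LEFT OUTER JOIN on t.c2 = t2.c2 over (c1, c2) tuples."""
    out = []
    for l in left:
        matches = [r for r in right if l[1] == r[1]]
        if matches:
            out.extend((l, r) for r in matches)
        else:
            out.append((l, None))     # unmatched left row survives with NULL
    return out

# Correct plan: WHERE c2 = c1 filters the RIGHT input before the join,
# so unmatched left rows still survive with NULLs.
right_filtered = [r for r in rows if r[1] == r[0]]
correct = left_join(rows, right_filtered)

# Buggy plan (a SELECT node above the join): join first, then apply
# c1 = c2 to the joined rows, wrongly dropping unmatched left rows.
joined = left_join(rows, rows)
buggy = [(l, r) for (l, r) in joined if r is not None and r[0] == r[1]]

print(len(correct))  # 3  (all left rows preserved)
print(len(buggy))    # 1  (left rows incorrectly dropped)
```

The left leg of the UNION ALL should therefore produce one row per left-table row, which is what the misplaced predicate destroys.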






[jira] [Updated] (IMPALA-7982) Add network I/O throughput to query profiles

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7982:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Add network I/O throughput to query profiles
> 
>
> Key: IMPALA-7982
> URL: https://issues.apache.org/jira/browse/IMPALA-7982
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Lars Volker
>Assignee: Lars Volker
>Priority: Major
>  Labels: observability, supportability
>
> IMPALA-7694 added a framework to collect system resource usage during query 
> execution and aggregate it at the coordinator. We should add network I/O 
> throughput there, too.






[jira] [Updated] (IMPALA-7981) Add system disk I/O throughput to query profiles

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-7981:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Add system disk I/O throughput to query profiles
> 
>
> Key: IMPALA-7981
> URL: https://issues.apache.org/jira/browse/IMPALA-7981
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Lars Volker
>Assignee: Lars Volker
>Priority: Major
>  Labels: observability, supportability
>
> IMPALA-7694 added a framework to collect system resource usage during query 
> execution and aggregate it at the coordinator. We should add disk I/O 
> throughput there, too.






[jira] [Updated] (IMPALA-8126) Move per-host resource utilization counters to per-host profiles in coordinator

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-8126:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Move per-host resource utilization counters to per-host profiles in 
> coordinator
> ---
>
> Key: IMPALA-8126
> URL: https://issues.apache.org/jira/browse/IMPALA-8126
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Lars Volker
>Priority: Major
>  Labels: observability, profile, supportability
>
> Once IMPALA-7694 gets in we should move the per-host resource utilization 
> counters in {{Coordinator::ComputeQuerySummary()}} to the host profiles in 
> {{Coordinator::BackendState}}.
> [~tarmstrong] pointed out 
> [here|https://gerrit.cloudera.org/#/c/12069/13/be/src/runtime/coordinator.cc@789]:
> {quote}
> We could also simplify BackendState::ComputeResourceUtilization() to just use 
> the per-backend counters instead of iterating over fragments.
> I think there may be some compatibility concerns about removing these - 
> existence of the counters isn't contractual but we don't want to break useful 
> tools if avoidable.
> For example, I confirmed that Cloudera Manager actually does parse the 
> existing strings (which is a little sad, but understandable given the lack of 
> other counters).
> {quote}






[jira] [Updated] (IMPALA-8224) Impala Doc: Update the Web UI doc with missing contents

2019-03-28 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-8224:
-
Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Impala Doc: Update the Web UI doc with missing contents
> ---
>
> Key: IMPALA-8224
> URL: https://issues.apache.org/jira/browse/IMPALA-8224
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>







[jira] [Assigned] (IMPALA-4475) Compress ExecPlanFragment before shipping it to worker nodes to reduce network traffic

2019-03-28 Thread Michael Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Ho reassigned IMPALA-4475:
--

Assignee: (was: Vuk Ercegovac)

> Compress ExecPlanFragment before shipping it to worker nodes to reduce 
> network traffic
> --
>
> Key: IMPALA-4475
> URL: https://issues.apache.org/jira/browse/IMPALA-4475
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Distributed Exec
>Affects Versions: Impala 2.6.0
>Reporter: Mostafa Mokhtar
>Priority: Major
>  Labels: ramp-up, scalability
> Attachments: count_store_returns.txt.zip, 
> slow_query_start_250K_partitions_134nodes.txt
>
>
> Sending the ExecPlanFragment to remote nodes dominates the query startup time 
> on clusters larger than 100 nodes; the size of the ExecPlanFragment grows with 
> the number of tables, blocks, and partitions in the table. 
> On large clusters this limits query throughput.
> From TPC-DS Q11 on a 1K-node cluster:
> {code}
> Query Timeline: 5m6s
>- Query submitted: 75.256us (75.256us)
>- Planning finished: 1s580ms (1s580ms)
>- Submit for admission: 2s376ms (795.652ms)
>- Completed admission: 2s377ms (1.512ms)
>- Ready to start 15993 fragment instances: 2s458ms (80.378ms)
>- First dynamic filter received: 2m35s (2m33s)
>- All 15993 fragment instances started: 2m35s (40.934ms)
>- Rows available: 4m53s (2m17s)
>- First row fetched: 4m53s (176.254ms)
>- Unregister query: 4m58s (4s828ms)
>  - ComputeScanRangeAssignmentTimer: 600.086ms
> {code}






[jira] [Commented] (IMPALA-4475) Compress ExecPlanFragment before shipping it to worker nodes to reduce network traffic

2019-03-28 Thread Michael Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804252#comment-16804252
 ] 

Michael Ho commented on IMPALA-4475:


FWIW, IMPALA-7467 is the JIRA for converting ExecQueryFInstance RPC to KRPC.

> Compress ExecPlanFragment before shipping it to worker nodes to reduce 
> network traffic
> --
>
> Key: IMPALA-4475
> URL: https://issues.apache.org/jira/browse/IMPALA-4475
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Distributed Exec
>Affects Versions: Impala 2.6.0
>Reporter: Mostafa Mokhtar
>Assignee: Vuk Ercegovac
>Priority: Major
>  Labels: ramp-up, scalability
> Attachments: count_store_returns.txt.zip, 
> slow_query_start_250K_partitions_134nodes.txt
>
>
> Sending the ExecPlanFragment to remote nodes dominates the query startup time 
> on clusters larger than 100 nodes; the size of the ExecPlanFragment grows with 
> the number of tables, blocks, and partitions in the table. 
> On large clusters this limits query throughput.
> From TPC-DS Q11 on a 1K-node cluster:
> {code}
> Query Timeline: 5m6s
>- Query submitted: 75.256us (75.256us)
>- Planning finished: 1s580ms (1s580ms)
>- Submit for admission: 2s376ms (795.652ms)
>- Completed admission: 2s377ms (1.512ms)
>- Ready to start 15993 fragment instances: 2s458ms (80.378ms)
>- First dynamic filter received: 2m35s (2m33s)
>- All 15993 fragment instances started: 2m35s (40.934ms)
>- Rows available: 4m53s (2m17s)
>- First row fetched: 4m53s (176.254ms)
>- Unregister query: 4m58s (4s828ms)
>  - ComputeScanRangeAssignmentTimer: 600.086ms
> {code}






[jira] [Created] (IMPALA-8371) Unified backend tests need to return appropriate return code

2019-03-28 Thread Joe McDonnell (JIRA)
Joe McDonnell created IMPALA-8371:
-

 Summary: Unified backend tests need to return appropriate return 
code
 Key: IMPALA-8371
 URL: https://issues.apache.org/jira/browse/IMPALA-8371
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.3.0
Reporter: Joe McDonnell
Assignee: Joe McDonnell


The scripts generated by bin/gen-backend-test-script.sh need to return the 
return code from the call to the unified backend executable. The JUnitXML 
contains a failure, which Jenkins and other tools can process, but the return 
code must match up for scripts to be able to loop the test, etc.
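The fix amounts to propagating the child process's exit status instead of dropping it. A minimal sketch of what such a wrapper needs to do (illustrative Python; the real wrapper is a shell script generated by bin/gen-backend-test-script.sh):

```python
import subprocess
import sys

def run_unified_test(binary, gtest_filter, junitxml_path):
    """Run one shard of the unified backend test binary and return its
    exit code. Writing JUnitXML alone is not enough: Jenkins can parse
    the XML, but scripts that loop the test need the process status."""
    proc = subprocess.run(
        [binary,
         f"--gtest_filter={gtest_filter}",
         f"--gtest_output=xml:{junitxml_path}"])
    return proc.returncode  # the generated wrapper must exit with this

# Demo with a stand-in child process; the real wrapper would call
# run_unified_test(...) and end with sys.exit(rc).
rc = subprocess.run([sys.executable, "-c", "raise SystemExit(3)"]).returncode
print(rc)  # 3 -- the value the wrapper must propagate, not swallow
```

In shell terms this is the classic bug of letting a post-processing step clobber `$?` before the script exits.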






[jira] [Commented] (IMPALA-7184) Support Kudu's READ_YOUR_WRITES scan mode

2019-03-28 Thread Thomas Tauber-Marshall (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804227#comment-16804227
 ] 

Thomas Tauber-Marshall commented on IMPALA-7184:


[~andrew.wong] I played around with this and it doesn't work, at least not as 
Impala expects it to work:
- I create a Kudu table, insert some stuff into it, scan it back at 
READ_YOUR_WRITES. Everything works as expected.
- I wait longer than 'tablet_history_max_age_sec' and attempt to scan it again 
at READ_YOUR_WRITES (using either the same KuduClient or a new one). This 
results in an error of the form 'Snapshot timestamp is earlier than the ancient 
history mark...'

I can avoid the error if I interact with the table in some other way (e.g. 
performing a scan at READ_LATEST or doing an ALTER) and then scan it at 
READ_YOUR_WRITES in less than 'tablet_history_max_age_sec'.

I can avoid the error if I call SetLatestObservedTimestamp() with something 
that is more recent than 'tablet_history_max_age_sec', but currently Impala 
only calls SetLatestObservedTimestamp() if you're in a session in which there 
was a previous DML operation (in which case it's set to the return value of 
GetLatestObservedTimestamp() just after the DML has been flushed for the last 
time). Is the expectation that Impala should always be calling 
SetLatestObservedTimestamp() before scans?

> Support Kudu's READ_YOUR_WRITES scan mode
> -
>
> Key: IMPALA-7184
> URL: https://issues.apache.org/jira/browse/IMPALA-7184
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>  Labels: kudu
>
> Kudu recently added a new scan mode called READ_YOUR_WRITES which provides 
> better consistency guarantees than READ_LATEST or READ_AT_SNAPSHOT, the 
> options currently supported by Impala.
> Unfortunately, READ_YOUR_WRITES is currently affected by a bug that makes it 
> unusable by Impala (KUDU-2233). Once this is fixed, we should add support for 
> it, and consider either setting it as the default, or at least using it in 
> tests, see the discussion in https://gerrit.cloudera.org/#/c/10503/






[jira] [Commented] (IMPALA-4475) Compress ExecPlanFragment before shipping it to worker nodes to reduce network traffic

2019-03-28 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804219#comment-16804219
 ] 

Todd Lipcon commented on IMPALA-4475:
-

Perhaps at this point it would be better to move the ExecQueryFInstances RPC to 
KRPC, and then implement optional compression in general for KRPC? Or, if we 
end up using a sidecar to encapsulate the serialized Thrift plan (because 
converting it to protobuf is a ton of work), we can easily compress just the 
sidecar.
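Either way, the payoff is easy to estimate: serialized plan metadata is highly repetitive and compresses well. A hedged sketch using Python's zlib as a stand-in for whatever codec the RPC layer would actually use (the blob contents are made up):

```python
import zlib

# Stand-in for a serialized ExecPlanFragment: highly repetitive
# table/partition/block descriptors, which compress very well.
plan_blob = b"".join(
    b"partition=year=2009/month=%d;file=f%d.parq;blocks=3;" % (m, m)
    for m in range(1, 10001))

compressed = zlib.compress(plan_blob, 1)  # fast level, suited to RPC paths
ratio = len(plan_blob) / len(compressed)
print(f"{len(plan_blob)} -> {len(compressed)} bytes ({ratio:.0f}x)")

# Receiver side: decompress before handing the blob to the deserializer.
assert zlib.decompress(compressed) == plan_blob
```

A sidecar carrying such a blob would be compressed once at the coordinator and fanned out to all executors, so the CPU cost is paid once per query, not per node.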

> Compress ExecPlanFragment before shipping it to worker nodes to reduce 
> network traffic
> --
>
> Key: IMPALA-4475
> URL: https://issues.apache.org/jira/browse/IMPALA-4475
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Distributed Exec
>Affects Versions: Impala 2.6.0
>Reporter: Mostafa Mokhtar
>Assignee: Vuk Ercegovac
>Priority: Major
>  Labels: ramp-up, scalability
> Attachments: count_store_returns.txt.zip, 
> slow_query_start_250K_partitions_134nodes.txt
>
>
> Sending the ExecPlanFragment to remote nodes dominates the query startup time 
> on clusters larger than 100 nodes; the size of the ExecPlanFragment grows with 
> the number of tables, blocks, and partitions in the table. 
> On large clusters this limits query throughput.
> From TPC-DS Q11 on a 1K-node cluster:
> {code}
> Query Timeline: 5m6s
>- Query submitted: 75.256us (75.256us)
>- Planning finished: 1s580ms (1s580ms)
>- Submit for admission: 2s376ms (795.652ms)
>- Completed admission: 2s377ms (1.512ms)
>- Ready to start 15993 fragment instances: 2s458ms (80.378ms)
>- First dynamic filter received: 2m35s (2m33s)
>- All 15993 fragment instances started: 2m35s (40.934ms)
>- Rows available: 4m53s (2m17s)
>- First row fetched: 4m53s (176.254ms)
>- Unregister query: 4m58s (4s828ms)
>  - ComputeScanRangeAssignmentTimer: 600.086ms
> {code}






[jira] [Commented] (IMPALA-8367) from_unixtime Bad date/time conversion format: u on NULL value

2019-03-28 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804192#comment-16804192
 ] 

Tim Armstrong commented on IMPALA-8367:
---

This might be a good starter bugfix. This logic is in 
TimestampFunctions::UnixAndFromUnixPrepare() and 
TimestampFunctions::ToTimestamp(FunctionContext* context, const StringVal& 
date, const StringVal& fmt). 

I agree that it would make sense to defer failing the query until we encounter 
a non-NULL timestamp value.
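The deferral can be sketched as: short-circuit NULL inputs to NULL before the format string is ever validated (illustrative Python, not the backend's actual TimestampFunctions code; the format handling here is a stand-in):

```python
from datetime import datetime, timezone

def from_unixtime(unixtime, fmt):
    """Sketch of deferred validation: NULL in, NULL out, without ever
    touching the (possibly unsupported) format string."""
    if unixtime is None:
        return None                      # defer: no format check for NULL rows
    if "u" in fmt:                       # stand-in for real format validation
        raise ValueError(f"Bad date/time conversion format: {fmt}")
    # Translating Java-style patterns is omitted in this sketch.
    return datetime.fromtimestamp(unixtime, tz=timezone.utc).strftime("%Y-%m-%d")

print(from_unixtime(None, "u"))          # None, instead of an error
print(from_unixtime(0, "yyyy-MM-dd"))    # 1970-01-01
```

In the real fix, the eager validation in the Prepare function would move to (or be repeated lazily in) the per-row path, guarded by the NULL check.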

> from_unixtime Bad date/time conversion format: u on NULL value
> --
>
> Key: IMPALA-8367
> URL: https://issues.apache.org/jira/browse/IMPALA-8367
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0
> Environment: impalad version 2.11.0-cdh5.14.2 RELEASE (build 
> ed85dce709da9557aeb28be89e8044947708876c) Built on Tue Mar 27 13:39:48 PDT 
> 2018
>Reporter: Sergio Leoni
>Priority: Minor
>  Labels: ramp-up
>
> The function
> {code:sql}
>  from_unixtime(bigint unixtime[, string format]) {code}
> outputs an error if the value of unixtime is NULL and the format is 'u'.
>  
> This doesn't work:
> {code:sql}
> SELECT FROM_UNIXTIME(NULL, 'u')
> {code}
> {noformat}
> Bad date/time conversion format: u{noformat}
>  
> This works:
> {code:sql}
> SELECT FROM_UNIXTIME(NULL, 'yyyy-MM-dd')
> {code}
> {noformat}
> |from_unixtime(null, 'yyyy-mm-dd')|
> |---------------------------------|
> | NULL                            |
> |---------------------------------|{noformat}
>  
> I haven't checked all the possible combinations.
> Other software like Hive handles this correctly.
>  






[jira] [Updated] (IMPALA-8367) from_unixtime Bad date/time conversion format: u on NULL value

2019-03-28 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-8367:
--
Labels: newbie ramp-up  (was: ramp-up)

> from_unixtime Bad date/time conversion format: u on NULL value
> --
>
> Key: IMPALA-8367
> URL: https://issues.apache.org/jira/browse/IMPALA-8367
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0
> Environment: impalad version 2.11.0-cdh5.14.2 RELEASE (build 
> ed85dce709da9557aeb28be89e8044947708876c) Built on Tue Mar 27 13:39:48 PDT 
> 2018
>Reporter: Sergio Leoni
>Priority: Minor
>  Labels: newbie, ramp-up
>
> The function
> {code:sql}
>  from_unixtime(bigint unixtime[, string format]) {code}
> outputs an error if the value of unixtime is NULL and the format is 'u'.
>  
> This doesn't work:
> {code:sql}
> SELECT FROM_UNIXTIME(NULL, 'u')
> {code}
> {noformat}
> Bad date/time conversion format: u{noformat}
>  
> This works:
> {code:sql}
> SELECT FROM_UNIXTIME(NULL, 'yyyy-MM-dd')
> {code}
> {noformat}
> |from_unixtime(null, 'yyyy-mm-dd')|
> |---------------------------------|
> | NULL                            |
> |---------------------------------|{noformat}
>  
> I haven't checked all the possible combinations.
> Other software like Hive handles this correctly.
>  






[jira] [Updated] (IMPALA-8367) from_unixtime Bad date/time conversion format: u on NULL value

2019-03-28 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-8367:
--
Labels: ramp-up  (was: )

> from_unixtime Bad date/time conversion format: u on NULL value
> --
>
> Key: IMPALA-8367
> URL: https://issues.apache.org/jira/browse/IMPALA-8367
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0
> Environment: impalad version 2.11.0-cdh5.14.2 RELEASE (build 
> ed85dce709da9557aeb28be89e8044947708876c) Built on Tue Mar 27 13:39:48 PDT 
> 2018
>Reporter: Sergio Leoni
>Priority: Minor
>  Labels: ramp-up
>
> The function
> {code:sql}
>  from_unixtime(bigint unixtime[, string format]) {code}
> outputs an error if the value of unixtime is NULL and the format is 'u'.
>  
> This doesn't work:
> {code:sql}
> SELECT FROM_UNIXTIME(NULL, 'u')
> {code}
> {noformat}
> Bad date/time conversion format: u{noformat}
>  
> This works:
> {code:sql}
> SELECT FROM_UNIXTIME(NULL, 'yyyy-MM-dd')
> {code}
> {noformat}
> |from_unixtime(null, 'yyyy-mm-dd')|
> |---------------------------------|
> | NULL                            |
> |---------------------------------|{noformat}
>  
> I haven't checked all the possible combinations.
> Other software like Hive handles this correctly.
>  






[jira] [Created] (IMPALA-8370) Impala Doc: Impala works with Hive 3

2019-03-28 Thread Alex Rodoni (JIRA)
Alex Rodoni created IMPALA-8370:
---

 Summary: Impala Doc: Impala works with Hive 3
 Key: IMPALA-8370
 URL: https://issues.apache.org/jira/browse/IMPALA-8370
 Project: IMPALA
  Issue Type: Sub-task
  Components: Docs
Reporter: Alex Rodoni
Assignee: Alex Rodoni









[jira] [Created] (IMPALA-8369) Impala should be able to interoperate with Hive 3.1.0

2019-03-28 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created IMPALA-8369:
---

 Summary: Impala should be able to interoperate with Hive 3.1.0
 Key: IMPALA-8369
 URL: https://issues.apache.org/jira/browse/IMPALA-8369
 Project: IMPALA
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


Currently, Impala only works with Hive 2.1.1. Since Hive 3.1.0 has been 
released for a while, it would be good to add support for Hive 3.1.0 (HMS 
3.1.0). This patch will focus on the ability to connect to HMS 3.1.0 and run 
existing tests. It will not focus on adding support for newer features like 
ACID in Hive 3.1.0, which can be taken up as a separate JIRA.

It would be good to make changes to the Impala source code such that it can 
work with both Hive 2.1.0 and Hive 3.1.0 without the need to create a separate 
branch. However, this should be an aspirational goal. If we hit a blocker we 
should investigate alternative approaches.






[jira] [Resolved] (IMPALA-8345) Add option to set up minicluster to use Hive 3

2019-03-28 Thread Vihang Karajgaonkar (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar resolved IMPALA-8345.
-
Resolution: Fixed

Thanks for the review [~fredyw] [~asherman]

> Add option to set up minicluster to use Hive 3
> --
>
> Key: IMPALA-8345
> URL: https://issues.apache.org/jira/browse/IMPALA-8345
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.2.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> Hive 3.1.0 has been released and has been used in production for a while. It 
> would be a nice improvement for Impala to have the ability to use the Hive 
> 3.1.0 Metastore so that we can potentially use newer features (e.g. ACID).
> As a first step, in order to make sure Impala can run against a 3.1 
> Metastore, we should enable our test infrastructure to use Hive 3 instead of 
> CDH Hive 2.1.1. This can be implemented as an optional configuration flag 
> which, when set (either via environment variable or command arg), sets up Hive 
> 3.1.0 binaries in the mini-cluster.






[jira] [Updated] (IMPALA-7918) Remove support for authorization policy file

2019-03-28 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya updated IMPALA-7918:
-
Fix Version/s: (was: Impala 3.3.0)

> Remove support for authorization policy file
> 
>
> Key: IMPALA-7918
> URL: https://issues.apache.org/jira/browse/IMPALA-7918
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog, Frontend
>Affects Versions: Impala 3.2.0
>Reporter: Fredy Wijaya
>Assignee: Austin Nobis
>Priority: Major
>
> Support for authorization policy file has been deprecated in Impala and it 
> does not work with object ownership. Furthermore, authorization policy file 
> is very specific to Sentry. Supporting authorization policy will make it 
> difficult to create a generic authorization framework in Impala. Hence, the 
> task will involve removing support for authorization policy file.






[jira] [Commented] (IMPALA-7918) Remove support for authorization policy file

2019-03-28 Thread Fredy Wijaya (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804177#comment-16804177
 ] 

Fredy Wijaya commented on IMPALA-7918:
--

[~arodoni_cloudera] We still haven't decided whether this will go to 3.3.0 and 
4.0. I'm going to remove it from the Fix version/s for now.

> Remove support for authorization policy file
> 
>
> Key: IMPALA-7918
> URL: https://issues.apache.org/jira/browse/IMPALA-7918
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog, Frontend
>Affects Versions: Impala 3.2.0
>Reporter: Fredy Wijaya
>Assignee: Austin Nobis
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> Support for authorization policy file has been deprecated in Impala and it 
> does not work with object ownership. Furthermore, authorization policy file 
> is very specific to Sentry. Supporting authorization policy will make it 
> difficult to create a generic authorization framework in Impala. Hence, the 
> task will involve removing support for authorization policy file.






[jira] [Assigned] (IMPALA-889) Add trim() function matching ANSI SQL definition

2019-03-28 Thread Zoram Thanga (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoram Thanga reassigned IMPALA-889:
---

Assignee: (was: Zoram Thanga)

> Add trim() function matching ANSI SQL definition
> 
>
> Key: IMPALA-889
> URL: https://issues.apache.org/jira/browse/IMPALA-889
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 1.2.4
>Reporter: Jonathan Seidman
>Priority: Minor
>  Labels: ansi-sql, built-in-function, ramp-up
>
> Add support for an ISO-SQL compliant trim() function, i.e. trim([leading | 
> trailing | both] [characters] from string). Lack of this impacts users of BI 
> tools migrating existing SQL from other systems.
> Reference: 
> https://my.vertica.com/docs/8.1.x/HTML/index.htm#Authoring/SQLReferenceManual/Functions/String/TRIM.htm
> Part of the ANSI definition
> {noformat}
> <trim function> ::=
>   TRIM <left paren> <trim operands> <right paren>
> <trim operands> ::=
>   [ [ <trim specification> ] [ <trim character> ] FROM ] <trim source>
> <trim source> ::=
>   <character value expression>
> <trim specification> ::=
>     LEADING
>   | TRAILING
>   | BOTH
> {noformat}
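For reference, the grammar above is small enough to implement directly. A sketch in Python (not Impala's backend, where this would be a C++ built-in; note that ANSI specifies a single trim character, while Python's strip family accepts a set — for one character the behavior coincides):

```python
def ansi_trim(source, chars=" ", spec="BOTH"):
    """trim([LEADING | TRAILING | BOTH] [characters] FROM string), per
    the ANSI definition above; the trim character defaults to a space."""
    spec = spec.upper()
    if spec in ("LEADING", "BOTH"):
        source = source.lstrip(chars)   # strip from the front
    if spec in ("TRAILING", "BOTH"):
        source = source.rstrip(chars)   # strip from the back
    return source

print(ansi_trim("xxhelloxx", "x", "LEADING"))   # helloxx
print(ansi_trim("xxhelloxx", "x", "TRAILING"))  # xxhello
print(ansi_trim("  hello  "))                   # hello
```

A SQL front end would map `TRIM(LEADING 'x' FROM col)` onto a call of this shape.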






[jira] [Assigned] (IMPALA-6665) Tag CatalogOp logs with query IDs

2019-03-28 Thread Zoram Thanga (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoram Thanga reassigned IMPALA-6665:


Assignee: (was: Zoram Thanga)

> Tag CatalogOp logs with query IDs
> -
>
> Key: IMPALA-6665
> URL: https://issues.apache.org/jira/browse/IMPALA-6665
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 2.12.0
>Reporter: bharath v
>Priority: Major
>  Labels: supportability
>
> Similar to IMPALA-6664. The idea is to improve catalog server logging by 
> adding query-ID to each of the Catalog server log statements. This helps map 
> Catalog errors to specific queries, which is currently not possible. 
> Raising a separate jira for the Catalog server since fixing it could be a 
> little trickier than for the other components: we don't have the query hash 
> readily available in the Catalog context. We need to augment the Catalog RPCs 
> with this data. 
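The tagging idea maps onto a standard logging pattern: thread the query ID through a per-operation logging context. A sketch with Python's stdlib logging (the real change would be in the Java catalog server; the query ID shown is hypothetical):

```python
import logging

# Format string expects a query_id field on every record.
logging.basicConfig(format="%(levelname)s %(query_id)s %(message)s")
log = logging.getLogger("catalog")

def catalog_logger(query_id):
    """Return a logger that injects the originating query ID into every
    statement, mirroring 'tag each Catalog log line with the query ID'."""
    return logging.LoggerAdapter(log, {"query_id": query_id})

# Hypothetical query ID, as it would arrive via an augmented Catalog RPC.
qlog = catalog_logger("c4d7a8b912345678:1a2b3c4d00000000")
qlog.warning("ALTER TABLE failed: table not found")
# Log line now carries the query ID, so it can be mapped back to a query.
```

The key design point is the same as in the JIRA: the ID must travel inside the RPC so the adapter-equivalent can be constructed server-side.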






[jira] [Assigned] (IMPALA-5922) Scanners should include file and offset information in errors

2019-03-28 Thread Zoram Thanga (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoram Thanga reassigned IMPALA-5922:


Assignee: (was: Zoram Thanga)

> Scanners should include file and offset information in errors
> -
>
> Key: IMPALA-5922
> URL: https://issues.apache.org/jira/browse/IMPALA-5922
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Lars Volker
>Priority: Major
>
> Currently we have to print the location of a parse error in 
> {{BaseSequenceScanner::GetNextInternal()}}:
> {code}
> state_->LogError(ErrorMsg(TErrorCode::SEQUENCE_SCANNER_PARSE_ERROR, 
> stream_->filename(), stream_->file_offset(),
> (stream_->eof() ? "(EOF)" : "")));  
> {code}
> Instead, the scanners should include this information when constructing the 
> error, which will allow us to simplify the error handling in the base class.






[jira] [Comment Edited] (IMPALA-7918) Remove support for authorization policy file

2019-03-28 Thread Alex Rodoni (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803174#comment-16803174
 ] 

Alex Rodoni edited comment on IMPALA-7918 at 3/28/19 6:22 PM:
--

[~anobis] [~fredyw] Is it safe to remove the policy file support in docs in 3.3?


was (Author: arodoni_cloudera):
[~anobis] [~fredyw] Is it safe to remove the policy file support in docs in 3.3?

> Remove support for authorization policy file
> 
>
> Key: IMPALA-7918
> URL: https://issues.apache.org/jira/browse/IMPALA-7918
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog, Frontend
>Affects Versions: Impala 3.2.0
>Reporter: Fredy Wijaya
>Assignee: Austin Nobis
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> Support for authorization policy file has been deprecated in Impala and it 
> does not work with object ownership. Furthermore, authorization policy file 
> is very specific to Sentry. Supporting authorization policy will make it 
> difficult to create a generic authorization framework in Impala. Hence, the 
> task will involve removing support for authorization policy file.






[jira] [Assigned] (IMPALA-7674) Impala should compress older log files

2019-03-28 Thread Zoram Thanga (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoram Thanga reassigned IMPALA-7674:


Assignee: (was: Zoram Thanga)

> Impala should compress older log files
> --
>
> Key: IMPALA-7674
> URL: https://issues.apache.org/jira/browse/IMPALA-7674
> Project: IMPALA
>  Issue Type: Improvement
>Affects Versions: Impala 3.0, Impala 2.12.0, Impala 3.1.0
>Reporter: Zoram Thanga
>Priority: Major
>  Labels: supportability
>
> By default, Impala keeps ten log files of each severity level (INFO, WARN, 
> ERROR), and the size limit of each is set to 200MB or so. Cleanup (old 
> file deletion) is controlled by the FLAGS_max_log_files parameter. 
> On busy clusters we've found that log deletion can throw away debug 
> information too quickly, often making troubleshooting harder than it needs to 
> be.
> We can compress the log files to:
> # Reduce the disk space consumption by 10x or more.
> # Keep more log files around for the same disk space budget.
> # Have 10x or more historical diagnostics data available.






[jira] [Assigned] (IMPALA-8306) Debug WebUI's Sessions page verbiage clarification

2019-03-28 Thread Zoram Thanga (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoram Thanga reassigned IMPALA-8306:


Assignee: (was: Zoram Thanga)

> Debug WebUI's Sessions page verbiage clarification
> --
>
> Key: IMPALA-8306
> URL: https://issues.apache.org/jira/browse/IMPALA-8306
> Project: IMPALA
>  Issue Type: Improvement
>Affects Versions: Impala 2.12.0, Impala 3.1.0
>Reporter: Vincent Tran
>Priority: Minor
> Attachments: sessions.png
>
>
> Currently, the Debug WebUI's Sessions page captures both active sessions and 
> expired sessions. At the top of the page there is a message along the lines of:
> {noformat}
> There are {{num_sessions}} sessions, of which {{num_active}} are active. 
> Sessions may be closed either when they are idle for some time (see Idle 
> Timeout
> below), or if they are deliberately closed, otherwise they are called active.
> {noformat}
> This text is ambiguous. If all non-active sessions are expired 
> sessions, the page should say so explicitly. And since an active 
> session becomes an expired session when it breaches the Session Idle Timeout, 
> the second sentence is also somewhat misleading: the user has to "deliberately 
> close" both active sessions and expired sessions to close them.






[jira] [Work started] (IMPALA-8368) Create database/table with Ranger throws UnsupportedOperationException

2019-03-28 Thread Austin Nobis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8368 started by Austin Nobis.

> Create database/table with Ranger throws UnsupportedOperationException
> --
>
> Key: IMPALA-8368
> URL: https://issues.apache.org/jira/browse/IMPALA-8368
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Austin Nobis
>Assignee: Austin Nobis
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> When executing *create database ;* in Impala with Ranger enabled, an 
> *UnsupportedOperationException* will be thrown.






[jira] [Updated] (IMPALA-8368) Create database/table with Ranger throws UnsupportedOperationException

2019-03-28 Thread Austin Nobis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Nobis updated IMPALA-8368:
-
Description: When executing *create database ;* in Impala with 
Ranger enabled, an *UnsupportedOperationException* will be thrown.  (was: When 
executing *create database ;* in Impala with Ranger enabled, a 
*UnsupportedOperationException* will be thrown.)

> Create database/table with Ranger throws UnsupportedOperationException
> --
>
> Key: IMPALA-8368
> URL: https://issues.apache.org/jira/browse/IMPALA-8368
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Austin Nobis
>Assignee: Austin Nobis
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> When executing *create database ;* in Impala with Ranger enabled, 
> an *UnsupportedOperationException* will be thrown.






[jira] [Resolved] (IMPALA-8225) Implement GRANT/REVOKE privilege to USER

2019-03-28 Thread Austin Nobis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Nobis resolved IMPALA-8225.
--
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Implement GRANT/REVOKE privilege to USER
> 
>
> Key: IMPALA-8225
> URL: https://issues.apache.org/jira/browse/IMPALA-8225
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog, Frontend
>Reporter: Fredy Wijaya
>Assignee: Austin Nobis
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> Ranger supports granting/revoking a privilege to a user directly. Only an 
> admin should be able to do a grant/revoke.
> Syntax:
> {noformat}
> GRANT  ON  TO USER 
> REVOKE  ON  FROM USER 
> {noformat}






[jira] [Updated] (IMPALA-8368) Create database/table with Ranger throws UnsupportedOperationException

2019-03-28 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya updated IMPALA-8368:
-
Issue Type: Sub-task  (was: Bug)
Parent: IMPALA-7916

> Create database/table with Ranger throws UnsupportedOperationException
> --
>
> Key: IMPALA-8368
> URL: https://issues.apache.org/jira/browse/IMPALA-8368
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Austin Nobis
>Assignee: Austin Nobis
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> When executing *create database ;* in Impala with Ranger enabled, an 
> *UnsupportedOperationException* will be thrown.






[jira] [Updated] (IMPALA-8368) Create database/table with Ranger throws UnsupportedOperationException

2019-03-28 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya updated IMPALA-8368:
-
Summary: Create database/table with Ranger throws 
UnsupportedOperationException  (was: Create database with Ranger throws 
UnsupportedOperationException)

> Create database/table with Ranger throws UnsupportedOperationException
> --
>
> Key: IMPALA-8368
> URL: https://issues.apache.org/jira/browse/IMPALA-8368
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Austin Nobis
>Assignee: Austin Nobis
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> When executing *create database ;* in Impala with Ranger enabled, an 
> *UnsupportedOperationException* will be thrown.






[jira] [Commented] (IMPALA-6263) Assert hit during service restart Mutex.cpp:130: apache::thrift::concurrency::Mutex::impl::~impl(): Assertion `ret == 0' failed

2019-03-28 Thread Andrew Sherman (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804137#comment-16804137
 ] 

Andrew Sherman commented on IMPALA-6263:


The crash in thrift-server-test looks similar to IMPALA-7930

> Assert hit during service restart Mutex.cpp:130: 
> apache::thrift::concurrency::Mutex::impl::~impl(): Assertion `ret == 0' failed
> ---
>
> Key: IMPALA-6263
> URL: https://issues.apache.org/jira/browse/IMPALA-6263
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Reporter: Mostafa Mokhtar
>Assignee: Andrew Sherman
>Priority: Critical
>  Labels: broken-build
> Attachments: 061ff302-918f-4a2a-000f0b96-29841f85.dmp
>
>
> On a large secure cluster, when the Impala service is restarted, core files 
> are generated.
> Found in impalad.ERR: 
> impalad: src/thrift/concurrency/Mutex.cpp:130: 
> apache::thrift::concurrency::Mutex::impl::~impl(): Assertion `ret == 0' 
> failed.
> Wrote minidump to 
> /var/log/impala-minidumps/impalad/061ff302-918f-4a2a-000f0b96-29841f85.dmp
> Mini dump is based off
> {code}
>  Server version: impalad version 2.11.0-SNAPSHOT RELEASE (build 
> b9ccd44599f43776bce7838014cd99e4c76ddb9a)
> {code}






[jira] [Commented] (IMPALA-6326) segfault during impyla HiveServer2Cursor.cancel_operation() over SSL

2019-03-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804083#comment-16804083
 ] 

ASF subversion and git services commented on IMPALA-6326:
-

Commit ca21c0cf0908048dbeec38b0a6874064958e0cce in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ca21c0c ]

IMPALA-6326: part 2: remove fetch thread in stress test

Ensure the cursor is only accessed from a single thread. This means reworking
the code so that we check the time limit between fetch calls.

Use EXEC_TIME_LIMIT_S as an alternative to the previous multi-threaded
cancellation logic - it allows queries to be cancelled even when the
client is blocked or slow. This is implemented with the concept of
a CancelMechanism that determines *how* a query should be cancelled.
Query timeouts (where we want to cancel queries that run longer
than expected) are implemented using both cancel mechanisms, in
case the client is stuck in fetch or similar. Expected cancellations
are implemented with a random mechanism so that both code paths get
covered.

Testing:
Ran a cluster stress test.

Ran a couple of single-node stress tests with TPC-H and random queries.

Change-Id: If9afd74e1408823a9e5c0f2628ec9f8aafdcec69
Reviewed-on: http://gerrit.cloudera.org:8080/12681
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> segfault during impyla HiveServer2Cursor.cancel_operation() over SSL
> 
>
> Key: IMPALA-6326
> URL: https://issues.apache.org/jira/browse/IMPALA-6326
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 2.10.0, Impala 2.11.0
>Reporter: Matthew Mulder
>Assignee: Tim Armstrong
>Priority: Major
> Attachments: test_fork_crash.py
>
>
> During a stress test on a secure cluster one of the clients crashed in 
> HiveServer2Cursor.cancel_operation().
> The stress test debug log shows{code}2017-12-13 16:50:52,624 21607 Query 
> Consumer DEBUG:concurrent_select[579]:Requesting memory reservation
> 2017-12-13 16:50:52,624 21607 Query Consumer 
> DEBUG:concurrent_select[245]:Reserved 102 MB; 1455 MB available; 95180 MB 
> overcommitted
> 2017-12-13 16:50:52,625 21607 Query Consumer 
> DEBUG:concurrent_select[581]:Received memory reservation
> 2017-12-13 16:50:52,658 21607 Query Consumer 
> DEBUG:concurrent_select[865]:Using tpcds_300_decimal_parquet database
> 2017-12-13 16:50:52,658 21607 Query Consumer DEBUG:db_connection[203]:IMPALA: 
> USE tpcds_300_decimal_parquet
> 2017-12-13 16:50:52,825 21607 Query Consumer DEBUG:db_connection[203]:IMPALA: 
> SET ABORT_ON_ERROR=1
> 2017-12-13 16:50:53,060 21607 Query Consumer 
> DEBUG:concurrent_select[877]:Setting mem limit to 102 MB
> 2017-12-13 16:50:53,060 21607 Query Consumer DEBUG:db_connection[203]:IMPALA: 
> SET MEM_LIMIT=102M
> 2017-12-13 16:50:53,370 21607 Query Consumer 
> DEBUG:concurrent_select[881]:Running query with 102 MB mem limit at 
> vc0704.test with timeout secs 52:
> select
>   dt.d_year,
>   item.i_category_id,
>   item.i_category,
>   sum(ss_ext_sales_price)
> from
>   date_dim dt,
>   store_sales,
>   item
> where
>   dt.d_date_sk = store_sales.ss_sold_date_sk
>   and store_sales.ss_item_sk = item.i_item_sk
>   and item.i_manager_id = 1
>   and dt.d_moy = 11
>   and dt.d_year = 2000
> group by
>   dt.d_year,
>   item.i_category_id,
>   item.i_category
> order by
>   sum(ss_ext_sales_price) desc,
>   dt.d_year,
>   item.i_category_id,
>   item.i_category
> limit 100;
> 2017-12-13 16:51:08,491 21607 Query Consumer 
> DEBUG:concurrent_select[889]:Query id is b6425b84aa45f633:9ce7cad9
> 2017-12-13 16:51:15,337 21607 Query Consumer 
> DEBUG:concurrent_select[900]:Waiting for query to execute
> 2017-12-13 16:51:22,316 21607 Query Consumer 
> DEBUG:concurrent_select[900]:Waiting for query to execute
> 2017-12-13 16:51:27,266 21607 Fetch Results b6425b84aa45f633:9ce7cad9 
> DEBUG:concurrent_select[1009]:Fetching result for query with id 
> b6425b84aa45f633:9ce7cad9
> 2017-12-13 16:51:44,625 21607 Query Consumer 
> DEBUG:concurrent_select[940]:Attempting cancellation of query with id 
> b6425b84aa45f633:9ce7cad9
> 2017-12-13 16:51:44,627 21607 Query Consumer INFO:hiveserver2[259]:Canceling 
> active operation{code}The impalad log shows{code}I1213 16:50:54.287511 136399 
> admission-controller.cc:510] Schedule for 
> id=b6425b84aa45f633:9ce7cad9 in pool_name=root.systest 
> cluster_mem_needed=816.00 MB PoolConfig: max_requests=-1 max_queued=200 
> max_mem=-1.00 B
> I1213 16:50:54.289767 136399 admission-controller.cc:515] Stats: 
> agg_num_running=184, agg_num_queued=0, agg_mem_reserved=1529.63 GB,  
> local_host(local_mem_admitted=132.02 GB, num_admitted_running=21, 
> 

[jira] [Resolved] (IMPALA-6326) segfault during impyla HiveServer2Cursor.cancel_operation() over SSL

2019-03-28 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-6326.
---
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> segfault during impyla HiveServer2Cursor.cancel_operation() over SSL
> 
>
> Key: IMPALA-6326
> URL: https://issues.apache.org/jira/browse/IMPALA-6326
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 2.10.0, Impala 2.11.0
>Reporter: Matthew Mulder
>Assignee: Tim Armstrong
>Priority: Major
> Fix For: Impala 3.3.0
>
> Attachments: test_fork_crash.py
>
>
> During a stress test on a secure cluster one of the clients crashed in 
> HiveServer2Cursor.cancel_operation().
> The stress test debug log shows{code}2017-12-13 16:50:52,624 21607 Query 
> Consumer DEBUG:concurrent_select[579]:Requesting memory reservation
> 2017-12-13 16:50:52,624 21607 Query Consumer 
> DEBUG:concurrent_select[245]:Reserved 102 MB; 1455 MB available; 95180 MB 
> overcommitted
> 2017-12-13 16:50:52,625 21607 Query Consumer 
> DEBUG:concurrent_select[581]:Received memory reservation
> 2017-12-13 16:50:52,658 21607 Query Consumer 
> DEBUG:concurrent_select[865]:Using tpcds_300_decimal_parquet database
> 2017-12-13 16:50:52,658 21607 Query Consumer DEBUG:db_connection[203]:IMPALA: 
> USE tpcds_300_decimal_parquet
> 2017-12-13 16:50:52,825 21607 Query Consumer DEBUG:db_connection[203]:IMPALA: 
> SET ABORT_ON_ERROR=1
> 2017-12-13 16:50:53,060 21607 Query Consumer 
> DEBUG:concurrent_select[877]:Setting mem limit to 102 MB
> 2017-12-13 16:50:53,060 21607 Query Consumer DEBUG:db_connection[203]:IMPALA: 
> SET MEM_LIMIT=102M
> 2017-12-13 16:50:53,370 21607 Query Consumer 
> DEBUG:concurrent_select[881]:Running query with 102 MB mem limit at 
> vc0704.test with timeout secs 52:
> select
>   dt.d_year,
>   item.i_category_id,
>   item.i_category,
>   sum(ss_ext_sales_price)
> from
>   date_dim dt,
>   store_sales,
>   item
> where
>   dt.d_date_sk = store_sales.ss_sold_date_sk
>   and store_sales.ss_item_sk = item.i_item_sk
>   and item.i_manager_id = 1
>   and dt.d_moy = 11
>   and dt.d_year = 2000
> group by
>   dt.d_year,
>   item.i_category_id,
>   item.i_category
> order by
>   sum(ss_ext_sales_price) desc,
>   dt.d_year,
>   item.i_category_id,
>   item.i_category
> limit 100;
> 2017-12-13 16:51:08,491 21607 Query Consumer 
> DEBUG:concurrent_select[889]:Query id is b6425b84aa45f633:9ce7cad9
> 2017-12-13 16:51:15,337 21607 Query Consumer 
> DEBUG:concurrent_select[900]:Waiting for query to execute
> 2017-12-13 16:51:22,316 21607 Query Consumer 
> DEBUG:concurrent_select[900]:Waiting for query to execute
> 2017-12-13 16:51:27,266 21607 Fetch Results b6425b84aa45f633:9ce7cad9 
> DEBUG:concurrent_select[1009]:Fetching result for query with id 
> b6425b84aa45f633:9ce7cad9
> 2017-12-13 16:51:44,625 21607 Query Consumer 
> DEBUG:concurrent_select[940]:Attempting cancellation of query with id 
> b6425b84aa45f633:9ce7cad9
> 2017-12-13 16:51:44,627 21607 Query Consumer INFO:hiveserver2[259]:Canceling 
> active operation{code}The impalad log shows{code}I1213 16:50:54.287511 136399 
> admission-controller.cc:510] Schedule for 
> id=b6425b84aa45f633:9ce7cad9 in pool_name=root.systest 
> cluster_mem_needed=816.00 MB PoolConfig: max_requests=-1 max_queued=200 
> max_mem=-1.00 B
> I1213 16:50:54.289767 136399 admission-controller.cc:515] Stats: 
> agg_num_running=184, agg_num_queued=0, agg_mem_reserved=1529.63 GB,  
> local_host(local_mem_admitted=132.02 GB, num_admitted_running=21, 
> num_queued=0, backend_mem_reserved=194.58 GB)
> I1213 16:50:54.291550 136399 admission-controller.cc:531] Admitted query 
> id=b6425b84aa45f633:9ce7cad9
> I1213 16:50:54.296922 136399 coordinator.cc:99] Exec() 
> query_id=b6425b84aa45f633:9ce7cad9 stmt=/* Mem: 102 MB. Coordinator: 
> vc0704.test. */
> select
>   dt.d_year,
>   item.i_category_id,
>   item.i_category,
>   sum(ss_ext_sales_price)
> from
>   date_dim dt,
>   store_sales,
>   item
> where
>   dt.d_date_sk = store_sales.ss_sold_date_sk
>   and store_sales.ss_item_sk = item.i_item_sk
>   and item.i_manager_id = 1
>   and dt.d_moy = 11
>   and dt.d_year = 2000
> group by
>   dt.d_year,
>   item.i_category_id,
>   item.i_category
> order by
>   sum(ss_ext_sales_price) desc,
>   dt.d_year,
>   item.i_category_id,
>   item.i_category
> limit 100;
> I1213 16:50:59.263310 136399 query-state.cc:151] Using query memory limit 
> from query options: 102.00 MB
> I1213 16:50:59.267033 136399 mem-tracker.cc:189] Using query memory limit: 
> 102.00 MB
> I1213 16:50:59.272271 136399 coordinator.cc:357] starting execution on 8 
> backends for query b6425b84aa45f633:9ce7cad9
> I1213 

[jira] [Work started] (IMPALA-7369) Implement DATE builtin functions

2019-03-28 Thread Attila Jeges (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7369 started by Attila Jeges.

> Implement DATE builtin functions
> 
>
> Key: IMPALA-7369
> URL: https://issues.apache.org/jira/browse/IMPALA-7369
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Attila Jeges
>Assignee: Attila Jeges
>Priority: Major
>
> - Built-in functions supported in Hive should be implemented in Impala as 
> well.
> - Already implemented TIMESTAMP built-in functions that work on the date part 
> of timestamps should be implemented for DATE types too.






[jira] [Updated] (IMPALA-8367) from_unixtime Bad date/time conversion format: u on NULL value

2019-03-28 Thread Sergio Leoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Leoni updated IMPALA-8367:
-
Description: 
The function
{code:sql}
 from_unixtime(bigint unixtime[, string format]) {code}
outputs an error if the value of unixtime is NULL and the format is 'u'.

 

This doesn't work:
{code:sql}
SELECT FROM_UNIXTIME(NULL, 'u')
{code}
{noformat}
Bad date/time conversion format: u{noformat}
 

This works:
{code:sql}
SELECT FROM_UNIXTIME(NULL, '-MM-dd')
{code}
{noformat}
|from_unixtime(null, '-mm-dd')|
|-|
| NULL|
|-|{noformat}
 

I haven't checked all the possible combinations.
Other software like Hive handles this correctly.

 

  was:
The function
{code:sql}
 from_unixtime(bigint unixtime[, string format]) {code}
output error if the value of unixtime is NULL and format is 'u'.

 

This doesn't work:
{code:sql}
SELECT FROM_UNIXTIME(NULL, 'u')
{code}
{noformat}
Bad date/time conversion format: u{noformat}
 

This works:
{code:sql}
SELECT FROM_UNIXTIME(NULL, '-MM-dd')
{code}
{noformat}
|from_unixtime(null, '-mm-dd')|
|-|
| NULL|
|-|{noformat}
 

I haven't check all the possible combinations.

 


> from_unixtime Bad date/time conversion format: u on NULL value
> --
>
> Key: IMPALA-8367
> URL: https://issues.apache.org/jira/browse/IMPALA-8367
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0
> Environment: impalad version 2.11.0-cdh5.14.2 RELEASE (build 
> ed85dce709da9557aeb28be89e8044947708876c) Built on Tue Mar 27 13:39:48 PDT 
> 2018
>Reporter: Sergio Leoni
>Priority: Minor
>
> The function
> {code:sql}
>  from_unixtime(bigint unixtime[, string format]) {code}
> outputs an error if the value of unixtime is NULL and the format is 'u'.
>  
> This doesn't work:
> {code:sql}
> SELECT FROM_UNIXTIME(NULL, 'u')
> {code}
> {noformat}
> Bad date/time conversion format: u{noformat}
>  
> This works:
> {code:sql}
> SELECT FROM_UNIXTIME(NULL, '-MM-dd')
> {code}
> {noformat}
> |from_unixtime(null, '-mm-dd')|
> |-|
> | NULL|
> |-|{noformat}
>  
> I haven't checked all the possible combinations.
> Other software like Hive handles this correctly.
>  






[jira] [Created] (IMPALA-8367) from_unixtime Bad date/time conversion format: u on NULL value

2019-03-28 Thread Sergio Leoni (JIRA)
Sergio Leoni created IMPALA-8367:


 Summary: from_unixtime Bad date/time conversion format: u on NULL 
value
 Key: IMPALA-8367
 URL: https://issues.apache.org/jira/browse/IMPALA-8367
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.11.0
 Environment: impalad version 2.11.0-cdh5.14.2 RELEASE (build 
ed85dce709da9557aeb28be89e8044947708876c) Built on Tue Mar 27 13:39:48 PDT 2018

Reporter: Sergio Leoni


The function
{code:sql}
 from_unixtime(bigint unixtime[, string format]) {code}
outputs an error if the value of unixtime is NULL and the format is 'u'.

 

This doesn't work:
{code:sql}
SELECT FROM_UNIXTIME(NULL, 'u')
{code}
{noformat}
Bad date/time conversion format: u{noformat}
 

This works:
{code:sql}
SELECT FROM_UNIXTIME(NULL, '-MM-dd')
{code}
{noformat}
|from_unixtime(null, '-mm-dd')|
|-|
| NULL|
|-|{noformat}
 

I haven't checked all the possible combinations.

 






[jira] [Commented] (IMPALA-8108) Impala query returns TIMESTAMP values in different types

2019-03-28 Thread Balazs Jeszenszky (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803682#comment-16803682
 ] 

Balazs Jeszenszky commented on IMPALA-8108:
---

Not sure if this is a good idea - if someone requires a certain format, it's 
best to use a specific format string, and I wouldn't expect every timestamp to 
have a bunch of trailing zeroes. [~grahn] thoughts?

> Impala query returns TIMESTAMP values in different types
> 
>
> Key: IMPALA-8108
> URL: https://issues.apache.org/jira/browse/IMPALA-8108
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>
> When a timestamp has a .000 or .00 or .0 suffix (when the fractional part is 
> all zeros), the timestamp is displayed with no fraction of a second. For example:
> {code:java}
> select cast(ts as timestamp) from 
>  (values 
>  ('2019-01-11 10:40:18' as ts),
>  ('2019-01-11 10:40:19.0'),
>  ('2019-01-11 10:40:19.00'), 
>  ('2019-01-11 10:40:19.000'),
>  ('2019-01-11 10:40:19.'),
>  ('2019-01-11 10:40:19.0'),
>  ('2019-01-11 10:40:19.00'),
>  ('2019-01-11 10:40:19.000'),
>  ('2019-01-11 10:40:19.'),
>  ('2019-01-11 10:40:19.0'),
>  ('2019-01-11 10:40:19.1')
>  ) t;{code}
> The output is:
> {code:java}
> +---+
> |cast(ts as timestamp)|
> +---+
> |2019-01-11 10:40:18|
> |2019-01-11 10:40:19|
> |2019-01-11 10:40:19|
> |2019-01-11 10:40:19|
> |2019-01-11 10:40:19|
> |2019-01-11 10:40:19|
> |2019-01-11 10:40:19|
> |2019-01-11 10:40:19|
> |2019-01-11 10:40:19|
> |2019-01-11 10:40:19|
> |2019-01-11 10:40:19.1|
> +---+
> {code}
> As we can see, values of the same column are returned in two different types. 
> The inconsistency breaks some downstream use cases. 
> The reason is that Impala uses the function 
> boost::posix_time::to_simple_string(time_duration) to convert a timestamp to a 
> string, and to_simple_string() removes fractional seconds if they are all 
> zeros. Perhaps we can append ".0" if the length of the string is 8 
> (HH:MM:SS).
> For now we can work around it by using function from_timestamp(ts, 
> '-mm-dd hh:mm.ss.s') to unify the output (convert to string), or 
> using function millisecond(ts) to get fractional seconds.
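> 
> A minimal illustrative sketch of the millisecond() workaround, reusing the 
> values() construct from the example above (the column alias and literal are 
> illustrative, not from the original report):
> {code:sql}
> -- recover the fractional part explicitly, even when it is all zeros
> select cast(ts as timestamp), millisecond(cast(ts as timestamp))
> from (values ('2019-01-11 10:40:19.000' as ts)) t;
> {code}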






[jira] [Commented] (IMPALA-8225) Implement GRANT/REVOKE privilege to USER

2019-03-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803641#comment-16803641
 ] 

ASF subversion and git services commented on IMPALA-8225:
-

Commit 5578ccca154712b45bc472252e132e389a75d6c2 in impala's branch 
refs/heads/master from Austin Nobis
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5578ccc ]

IMPALA-8225: Add Ranger support for grant/revoke privilege to/from user

This patch adds support for GRANT privilege statements to USER and
REVOKE privilege statements from USER. The RangerAuthorizationManager
class has been created and will throw UnsupportedOperationException when
an unimplemented method is called. The grammar has been updated to
support FROM USER and TO USER for GRANT/REVOKE statements. Previously,
privileges could be granted to a ROLE via GRANT/REVOKE statements even
when the ROLE keyword was omitted, i.e.:

GRANT  ON  TO 

This is still the case for ROLE based authorization to preserve backward
compatibility, but Ranger will throw an exception when a GRANT/REVOKE
statement excludes the USER keyword. The syntax for the new statement is:

GRANT  ON  TO USER 
REVOKE  ON  FROM USER 
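
For example (a hypothetical instance of the syntax above; the privilege,
table, and user names are illustrative only):

GRANT SELECT ON TABLE functional.alltypes TO USER test_user
REVOKE SELECT ON TABLE functional.alltypes FROM USER test_user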

Sentry does not support grant/revoke to/from user.

Testing:
- An additional end to end test, test_ranger.py, was added. A single test
  was added that grants and revokes for a user and asserts permissions on
  a table. The test uses sleep statements to work with Ranger's polling
  interval for policy changes. More end to end tests will be added in the
  future when the refresh authorization statement works properly with
  Ranger.
- AuthorizationStmtTest has been refactored to use the new
  RangerCatalogdAuthorizationManager grant/revoke methods for better
  test coverage.
- Ran all FE tests
- Ran all E2E authorization tests

Change-Id: I6ee97bf41546d63385026c0e2b19545565402462
Reviewed-on: http://gerrit.cloudera.org:8080/12769
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Implement GRANT/REVOKE privilege to USER
> 
>
> Key: IMPALA-8225
> URL: https://issues.apache.org/jira/browse/IMPALA-8225
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog, Frontend
>Reporter: Fredy Wijaya
>Assignee: Austin Nobis
>Priority: Major
>
> Ranger supports granting/revoking a privilege to a user directly. Only an 
> admin should be able to do a grant/revoke.
> Syntax:
> {noformat}
> GRANT  ON  TO USER 
> REVOKE  ON  FROM USER 
> {noformat}






[jira] [Commented] (IMPALA-8345) Add option to set up minicluster to use Hive 3

2019-03-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803642#comment-16803642
 ] 

ASF subversion and git services commented on IMPALA-8345:
-

Commit 6b77c61d9460f372edd3e98fa28754e2235f4888 in impala's branch 
refs/heads/master from Vihang Karajgaonkar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=6b77c61 ]

IMPALA-8345 : Add option to set up minicluster to use Hive 3

As a first step to integrate Impala with Hive 3.1.0 this patch modifies
the minicluster scripts to optionally use Hive 3.1.0 instead of
CDH Hive 2.1.1.

In order to make sure that existing setups don't break, this is
enabled via an environment variable override in bin/impala-config.sh.
When the environment variable USE_CDP_HIVE is set to true, the
bootstrap_toolchain script downloads the Hive 3.1.0 tarballs and extracts
them in the toolchain directory. These binaries are used to start the Hive
services (HiveServer2 and metastore). The default is still CDH Hive 2.1.1.

Also, since Hive 3.1.0 uses an upgraded metastore schema, this patch
uses a different database name so that it is easy to switch between an
environment that uses the Hive 2.1.1 metastore and one that uses the
Hive 3.1.0 metastore.

In order to start a minicluster which uses Hive 3.1.0, users should
follow the steps below:

1. Make sure that the minicluster, if running, is stopped
before you run the following commands.
2. Open a new terminal and run the following commands.
> export USE_CDP_HIVE=true
> source bin/impala-config.sh
> bin/bootstrap_toolchain.py
  The above command downloads the Hive 3.1.0 tarballs and extracts them
in the toolchain/cdp_components-${CDP_BUILD_NUMBER} directory. This is a
no-op if CDP_BUILD_NUMBER has not changed and the cdp_components
were already downloaded by a previous invocation of the script.

> source bin/create-test-configuration.sh -create-metastore
   The "-create-metastore" argument should be provided only the first
time, so that a new metastore db is created and the Hive 3.1.0 schema is
initialized. For all subsequent invocations, the "-create-metastore"
argument can be skipped. We should still source this script, since the
hive-site.xml of Hive 3.1.0 is different from that of Hive 2.1.1 and
needs to be regenerated.

> testdata/bin/run-all.sh
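Collected together, the steps above could be wrapped in a small script. The
sketch below is a dry run that only prints the commands it would execute
(the run helper is a stand-in, and the paths assume an Impala dev checkout):

```shell
#!/bin/bash
# Dry-run sketch of the Hive 3 minicluster bring-up described above.
# Set FIRST_RUN=true to include the one-time -create-metastore step.
set -u
FIRST_RUN=${FIRST_RUN:-false}

run() { echo "+ $*"; }   # stand-in: print each command instead of executing it

run export USE_CDP_HIVE=true
run source bin/impala-config.sh
run bin/bootstrap_toolchain.py
if [ "${FIRST_RUN}" = true ]; then
  run source bin/create-test-configuration.sh -create-metastore
else
  run source bin/create-test-configuration.sh
fi
run testdata/bin/run-all.sh
```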

Note that the testing was performed locally by downloading the Hive 3.1
binaries into
toolchain/cdp_components-976603/apache-hive-3.1.0.6.0.99.0-9-bin. Once
the binaries are available in the S3 bucket, the bootstrap_toolchain
script should do this for you automatically.

Testing Done:
1. Made sure that the cluster comes up with Hive 3.1 when the steps
above are performed.
2. Made sure that the existing scripts work as they do currently when
the override is not provided.
3. Made sure that the Impala cluster comes up and connects to HMS 3.1.0.
(Note that Impala still uses the Hive 2.1.1 client; upgrading the client
libraries in Impala will be done as a separate change.)

Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
Reviewed-on: http://gerrit.cloudera.org:8080/12846
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add option to set up minicluster to use Hive 3
> --
>
> Key: IMPALA-8345
> URL: https://issues.apache.org/jira/browse/IMPALA-8345
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.2.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> Hive 3.1.0 has been released and has been used in production for a while. It 
> would be a nice improvement for Impala to be able to use the Hive 3.1.0 
> Metastore so that we can potentially use newer features (e.g. ACID).
> As a first step, in order to make sure Impala can run against a 3.1 
> Metastore, we should enable our test infrastructure to use Hive 3 instead of 
> CDH Hive 2.1.1. This can be implemented as an optional configuration flag 
> which, when set (either via an environment variable or a command-line 
> argument), sets up the Hive 3.1.0 binaries in the mini-cluster.


