[jira] [Commented] (IMPALA-7638) Lower default timeout for connection setup

2018-10-05 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639961#comment-16639961
 ] 

Sailesh Mukil commented on IMPALA-7638:
---

Sorry, just saw this.
[~kwho] If it's the first time the client is connecting to that specific Impala 
server, *and* the KDC is under heavy load at that time, the negotiation could 
take quite a while, since the client doesn't yet have a ticket to talk to that 
server. But I agree that as more RPCs move over to KRPC, the number of 
requests to the KDC reduces, and this value can be brought down.

If we're looking for the right number, I think the best way would be to 
determine it empirically on a large cluster running the latest Impala, given 
that more RPCs have moved over to KRPC.

> Lower default timeout for connection setup
> --
>
> Key: IMPALA-7638
> URL: https://issues.apache.org/jira/browse/IMPALA-7638
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Lars Volker
>Priority: Major
> Fix For: Impala 2.11.0
>
>
> IMPALA-5394 added the sasl_connect_tcp_timeout_ms flag with a default timeout 
> of 5 minutes. This seems too long, as broken clients will prevent new clients 
> from establishing connections for that duration. In addition to increasing the 
> acceptor thread pool size (IMPALA-7565) we should lower this timeout 
> considerably, e.g. to 5 seconds.
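For illustration, this is what overriding the timeout at startup would look 
like until the default changes (the flag name is the one added by IMPALA-5394; 
5000 ms is the value suggested above, not a validated recommendation):

{noformat}
# Hypothetical impalad startup override, assuming the flag keeps its name:
impalad --sasl_connect_tcp_timeout_ms=5000 ...
{noformat}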



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (IMPALA-7072) Kudu's kinit does not support auth_to_local rules with Heimdal kerberos

2018-08-10 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576340#comment-16576340
 ] 

Sailesh Mukil commented on IMPALA-7072:
---

[~arodoni_cloudera] Yes, that's right; it should be documented as a limitation.

> Kudu's kinit does not support auth_to_local rules with Heimdal kerberos
> ---
>
> Key: IMPALA-7072
> URL: https://issues.apache.org/jira/browse/IMPALA-7072
> Project: IMPALA
>  Issue Type: Bug
>  Components: Security
>Affects Versions: Impala 2.12.0
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Critical
>
> On deployments that use Heimdal kerberos configured with 'auth_to_local' 
> rules set, and with the Impala startup flag 'use_kudu_kinit'=true, the 
> auth_to_local rules will not be respected, as they are not supported by 
> Kudu's kinit.
> The implication of this is that from Impala 2.12.0 onwards, clusters with the 
> above configuration will not be able to use KRPC with kerberos enabled.
> A workaround is to get rid of the auth_to_local rules for such deployments.
> We need a good long-term solution to fix this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (IMPALA-4978) Impala should set the kerberos principal to the FQDN

2018-08-09 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-4978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16575612#comment-16575612
 ] 

Sailesh Mukil commented on IMPALA-4978:
---

https://github.com/apache/impala/blob/3e17705ecaba0b6ab9ae929e6c7c409e0b6aea1d/be/src/rpc/authentication.cc#L787-L788

We already do this now, since we get the principal from the Kudu security code, 
which already tries to get the FQDN. However, we should do the same here:
https://github.com/apache/impala/blob/3e17705ecaba0b6ab9ae929e6c7c409e0b6aea1d/be/src/rpc/authentication.cc#L814

And also make sure that our process wide hostname flag (FLAGS_hostname) has the 
same value:
https://github.com/apache/impala/blob/7f9a74ffcaf1818f1f3c9d427557acca21a627da/be/src/common/init.cc#L191
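For reference, a minimal sketch of the lookup order described above 
(getaddrinfo() with AI_CANONNAME first, gethostname() as the fallback); the 
function name and error handling are illustrative, not the actual Impala code:

{code}
#include <sys/socket.h>
#include <netdb.h>
#include <unistd.h>
#include <cstring>
#include <string>

// Best-effort FQDN lookup: ask the resolver for the canonical name of the
// local hostname, falling back to the bare gethostname() result on failure.
std::string GetFqdnOrHostname() {
  char hostname[256];
  if (gethostname(hostname, sizeof(hostname)) != 0) return "";
  addrinfo hints;
  memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_UNSPEC;
  hints.ai_flags = AI_CANONNAME;  // Ask for the canonical (fully qualified) name.
  addrinfo* result = nullptr;
  std::string fqdn = hostname;    // Fallback: the unqualified hostname.
  if (getaddrinfo(hostname, nullptr, &hints, &result) == 0 && result != nullptr) {
    if (result->ai_canonname != nullptr) fqdn = result->ai_canonname;
    freeaddrinfo(result);
  }
  return fqdn;
}
{code}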

> Impala should set the kerberos principal to the FQDN
> 
>
> Key: IMPALA-4978
> URL: https://issues.apache.org/jira/browse/IMPALA-4978
> Project: IMPALA
>  Issue Type: Bug
>  Components: Security
>Affects Versions: Impala 2.3.0
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Major
>  Labels: security
>
> Impala calls gethostname() to get the local system's name, which is used as 
> part of the kerberos principal. This usually works fine under most settings; 
> however, it is not guaranteed to return the FQDN of the host under certain 
> settings (e.g. possibly while using a DNS GSLB).
> Impala should attempt to get the FQDN first, which can be obtained by using 
> getaddrinfo(), and fall back to gethostname() otherwise. This is the behavior 
> of Hadoop, which we should try to match as closely as possible:
> https://github.com/apache/hadoop/blob/master/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SecurityUtil.java#L169



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (IMPALA-7072) Kudu's kinit does not support auth_to_local rules with Heimdal kerberos

2018-08-09 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16575597#comment-16575597
 ] 

Sailesh Mukil commented on IMPALA-7072:
---

It turns out that Impala never supported the kerberos 'auth_to_local' rules, 
i.e. the setting that one would put in the krb5.conf file, so this is not a 
supported configuration.

We do, however, support Hadoop 'auth_to_local' rules, which are different.

CC: [~arodoni_cloudera] It would be good to document the above.
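For the documentation, a hedged example of the krb5.conf-level setting that is 
*not* supported (MIT-style rule syntax; realm and rule values are illustrative):

{code}
# krb5.conf [realms] section: illustrative example of the unsupported setting.
[realms]
  EXAMPLE.COM = {
    kdc = kdc.example.com
    auth_to_local = RULE:[1:$1@$0](.*@EXAMPLE.COM)s/@.*//
    auth_to_local = DEFAULT
  }
{code}

The Hadoop-level equivalent lives in core-site.xml under 
hadoop.security.auth_to_local, which is the variant Impala does support.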

> Kudu's kinit does not support auth_to_local rules with Heimdal kerberos
> ---
>
> Key: IMPALA-7072
> URL: https://issues.apache.org/jira/browse/IMPALA-7072
> Project: IMPALA
>  Issue Type: Bug
>  Components: Security
>Affects Versions: Impala 2.12.0
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Critical
>
> On deployments that use Heimdal kerberos configured with 'auth_to_local' 
> rules set, and with the Impala startup flag 'use_kudu_kinit'=true, the 
> auth_to_local rules will not be respected, as they are not supported by 
> Kudu's kinit.
> The implication of this is that from Impala 2.12.0 onwards, clusters with the 
> above configuration will not be able to use KRPC with kerberos enabled.
> A workaround is to get rid of the auth_to_local rules for such deployments.
> We need a good long-term solution to fix this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Assigned] (IMPALA-6859) De-templatize RpcMgrTestBase

2018-08-09 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil reassigned IMPALA-6859:
-

Assignee: Michael Ho  (was: Sailesh Mukil)

> De-templatize RpcMgrTestBase
> 
>
> Key: IMPALA-6859
> URL: https://issues.apache.org/jira/browse/IMPALA-6859
> Project: IMPALA
>  Issue Type: Task
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Sailesh Mukil
>Assignee: Michael Ho
>Priority: Major
>  Labels: security, test
>
> Now that we've gotten rid of the old way of Kinit-ing (IMPALA-5893), we can 
> detemplatize RpcMgrTestBase, since there's only one option to run the 
> kerberos tests with.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (IMPALA-7241) progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 0)

2018-08-08 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573930#comment-16573930
 ] 

Sailesh Mukil commented on IMPALA-7241:
---

[~tarmstrong] Sorry for the delayed response. Yes, it looks like there's an 
existing bug there. [~kwho] found it and is fixing it as part of 
IMPALA-7213.
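For context, a sketch of the invariant behind the crash (simplified; in the 
real code this is a DCHECK in progress-updater.cc at the line shown in the 
summary): progress deltas must never be negative, so a negative value means an 
update was double-counted or applied out of order upstream.

{code}
#include <cassert>
#include <cstdint>

// Sketch: Update(delta) assumes callers only report forward progress, so a
// negative delta (as in the -3 above) indicates a bookkeeping bug upstream.
void Update(int64_t delta, int64_t* num_complete) {
  assert(delta >= 0 && "Check failed: delta >= 0");
  *num_complete += delta;
}
{code}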

> progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 0)
> ---
>
> Key: IMPALA-7241
> URL: https://issues.apache.org/jira/browse/IMPALA-7241
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Michael Brown
>Assignee: Michael Ho
>Priority: Blocker
>  Labels: crash, stress
>
> During a stress test with 8 nodes running an insecure debug build based off 
> master, an impalad hit a DCHECK. The concurrency level at the time was 
> between 150-180 queries.
> The DCHECK was {{progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 
> 0)}}.
> The stack was:
> {noformat}
> #0  0x7f6a73e811f7 in raise () from /lib64/libc.so.6
> #1  0x7f6a73e828e8 in abort () from /lib64/libc.so.6
> #2  0x04300e34 in google::DumpStackTraceAndExit() ()
> #3  0x042f78ad in google::LogMessage::Fail() ()
> #4  0x042f9152 in google::LogMessage::SendToLog() ()
> #5  0x042f7287 in google::LogMessage::Flush() ()
> #6  0x042fa84e in google::LogMessageFatal::~LogMessageFatal() ()
> #7  0x01f7fedd in impala::ProgressUpdater::Update(long) ()
> #8  0x0313912b in 
> impala::Coordinator::BackendState::InstanceStats::Update(impala::TFragmentInstanceExecStatus
>  const&, impala::Coordinator::ExecSummary*, impala::ProgressUpdater*) ()
> #9  0x031369e6 in 
> impala::Coordinator::BackendState::ApplyExecStatusReport(impala::TReportExecStatusParams
>  const&, impala::Coordinator::ExecSummary*, impala::ProgressUpdater*) ()
> #10 0x031250b4 in 
> impala::Coordinator::UpdateBackendExecStatus(impala::TReportExecStatusParams 
> const&) ()
> #11 0x01e86395 in 
> impala::ClientRequestState::UpdateBackendExecStatus(impala::TReportExecStatusParams
>  const&) ()
> #12 0x01e27594 in 
> impala::ImpalaServer::ReportExecStatus(impala::TReportExecStatusResult&, 
> impala::TReportExecStatusParams const&) ()
> #13 0x01ebb8a0 in 
> impala::ImpalaInternalService::ReportExecStatus(impala::TReportExecStatusResult&,
>  impala::TReportExecStatusParams const&) ()
> #14 0x02fa6f62 in 
> impala::ImpalaInternalServiceProcessor::process_ReportExecStatus(int, 
> apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, 
> void*) ()
> #15 0x02fa6540 in 
> impala::ImpalaInternalServiceProcessor::dispatchCall(apache::thrift::protocol::TProtocol*,
>  apache::thrift::protocol::TProtocol*, std::string const&, int, void*) ()
> #16 0x018892b0 in 
> apache::thrift::TDispatchProcessor::process(boost::shared_ptr,
>  boost::shared_ptr, void*) ()
> #17 0x01c80d1b in 
> apache::thrift::server::TAcceptQueueServer::Task::run() ()
> #18 0x01c78ffb in 
> impala::ThriftThread::RunRunnable(boost::shared_ptr,
>  impala::Promise*) ()
> #19-#25 boost::bind / boost::function / impala::Thread::SuperviseThread 
> thread-startup frames (template parameters lost in formatting; trace 
> truncated here)
> {noformat}

[jira] [Comment Edited] (IMPALA-7241) progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 0)

2018-08-08 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573930#comment-16573930
 ] 

Sailesh Mukil edited comment on IMPALA-7241 at 8/8/18 9:42 PM:
---

[~tarmstrong] Sorry for the delayed response. Yes, it looks like there's an 
existing bug there. [~kwho] found it and is fixing it as part of IMPALA-7213.


was (Author: sailesh):
[~tarmstrong] Sorry for the delayed response. Yes, it looks like there's an 
existing bug there. [~kwho] found it and is fixing it as part of 
IMPALA-7213

> progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 0)
> ---
>
> Key: IMPALA-7241
> URL: https://issues.apache.org/jira/browse/IMPALA-7241
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Michael Brown
>Assignee: Michael Ho
>Priority: Blocker
>  Labels: crash, stress
>
> During a stress test with 8 nodes running an insecure debug build based off 
> master, an impalad hit a DCHECK. The concurrency level at the time was 
> between 150-180 queries.
> The DCHECK was {{progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 
> 0)}}.
> The stack was:
> {noformat}
> #0  0x7f6a73e811f7 in raise () from /lib64/libc.so.6
> #1  0x7f6a73e828e8 in abort () from /lib64/libc.so.6
> #2  0x04300e34 in google::DumpStackTraceAndExit() ()
> #3  0x042f78ad in google::LogMessage::Fail() ()
> #4  0x042f9152 in google::LogMessage::SendToLog() ()
> #5  0x042f7287 in google::LogMessage::Flush() ()
> #6  0x042fa84e in google::LogMessageFatal::~LogMessageFatal() ()
> #7  0x01f7fedd in impala::ProgressUpdater::Update(long) ()
> #8  0x0313912b in 
> impala::Coordinator::BackendState::InstanceStats::Update(impala::TFragmentInstanceExecStatus
>  const&, impala::Coordinator::ExecSummary*, impala::ProgressUpdater*) ()
> #9  0x031369e6 in 
> impala::Coordinator::BackendState::ApplyExecStatusReport(impala::TReportExecStatusParams
>  const&, impala::Coordinator::ExecSummary*, impala::ProgressUpdater*) ()
> #10 0x031250b4 in 
> impala::Coordinator::UpdateBackendExecStatus(impala::TReportExecStatusParams 
> const&) ()
> #11 0x01e86395 in 
> impala::ClientRequestState::UpdateBackendExecStatus(impala::TReportExecStatusParams
>  const&) ()
> #12 0x01e27594 in 
> impala::ImpalaServer::ReportExecStatus(impala::TReportExecStatusResult&, 
> impala::TReportExecStatusParams const&) ()
> #13 0x01ebb8a0 in 
> impala::ImpalaInternalService::ReportExecStatus(impala::TReportExecStatusResult&,
>  impala::TReportExecStatusParams const&) ()
> #14 0x02fa6f62 in 
> impala::ImpalaInternalServiceProcessor::process_ReportExecStatus(int, 
> apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, 
> void*) ()
> #15 0x02fa6540 in 
> impala::ImpalaInternalServiceProcessor::dispatchCall(apache::thrift::protocol::TProtocol*,
>  apache::thrift::protocol::TProtocol*, std::string const&, int, void*) ()
> #16 0x018892b0 in 
> apache::thrift::TDispatchProcessor::process(boost::shared_ptr,
>  boost::shared_ptr, void*) ()
> #17 0x01c80d1b in 
> apache::thrift::server::TAcceptQueueServer::Task::run() ()
> #18 0x01c78ffb in 
> impala::ThriftThread::RunRunnable(boost::shared_ptr,
>  impala::Promise*) ()
> #19-#25 boost::bind / boost::function / impala::Thread::SuperviseThread 
> thread-startup frames (template parameters lost in formatting; trace 
> truncated here)
> {noformat}

[jira] [Resolved] (IMPALA-7163) Implement a state machine for the QueryState class

2018-08-08 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-7163.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Implement a state machine for the QueryState class
> --
>
> Key: IMPALA-7163
> URL: https://issues.apache.org/jira/browse/IMPALA-7163
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> We've recently been improving our query lifecycle by adding explicit state 
> transitions so it's easier to reason about what should happen at a given 
> stage in the lifetime of a query or a fragment instance.
> On the coordinator side, the coordinator's view of the query: 
> https://github.com/apache/impala/commit/6ca87e46736a1e591ed7d7d5fee05b4b4d2fbb50
> On the fragment instance state side, a FIS's view of its own execution:
> https://github.com/apache/impala/blob/e12ee485cf4c77203b144c053ee167509cc39374/be/src/runtime/fragment-instance-state.h#L182-L203
> We don't have something like this for the QueryState class, which maintains 
> query-wide state per executor. Adding it should make the lifecycle of a query 
> from an executor's point of view much easier to reason about.
> Additional info: This was identified as part of work for 
> IMPALA-2990/IMPALA-4063, and is a precursor to it.
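As an illustration of the kind of state machine this adds, a simplified sketch 
(the state names are illustrative, not the committed enum in the QueryState 
code):

{code}
// Sketch of a per-executor query lifecycle. Transitions only ever move
// "forward", which is what makes the lifecycle easy to reason about.
enum class BackendExecState {
  PREPARING,   // Fragment instances are being set up on this executor.
  EXECUTING,   // All instances prepared; execution in progress.
  FINISHED,    // All instances completed successfully.
  CANCELLED,   // A Cancel() was received from the coordinator.
  ERROR        // Some instance hit an error.
};

// Terminal states admit no further transitions.
bool IsTerminalState(BackendExecState s) {
  return s == BackendExecState::FINISHED || s == BackendExecState::CANCELLED ||
         s == BackendExecState::ERROR;
}
{code}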



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IMPALA-7376) Impala hits a DCHECK if a fragment instance fails to initialize the filter bank

2018-08-08 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-7376.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Impala hits a DCHECK if a fragment instance fails to initialize the filter 
> bank
> ---
>
> Key: IMPALA-7376
> URL: https://issues.apache.org/jira/browse/IMPALA-7376
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Critical
>  Labels: crash
> Fix For: Impala 3.1.0
>
>
> While Prepare()-ing a fragment instance, if we fail to initialize the runtime 
> filter bank, we will exit FIS::Prepare() without acquiring a thread token 
> (AcquireThreadToken()):
> https://github.com/apache/impala/blob/316b17ac55adb3d1deeb1289b4045688269b201d/be/src/runtime/fragment-instance-state.cc#L135-L139
> FIS::Finalize() is *always* called, regardless of whether the fragment 
> instance succeeded or failed, and it tries to ReleaseThreadToken() even 
> though the token might never have been acquired:
> https://github.com/apache/impala/blob/316b17ac55adb3d1deeb1289b4045688269b201d/be/src/runtime/fragment-instance-state.cc#L464
> This causes a DCHECK to be hit.
> This was found while I was adding global debug actions (IMPALA-7046) to the 
> FIS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (IMPALA-7221) While reading from object store S3/ADLS at +500MB/sec TypeArrayKlass::allocate_common becomes a CPU bottleneck

2018-08-07 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil reassigned IMPALA-7221:
-

Assignee: (was: Sailesh Mukil)

> While reading from object store S3/ADLS at +500MB/sec 
> TypeArrayKlass::allocate_common becomes a CPU bottleneck
> --
>
> Key: IMPALA-7221
> URL: https://issues.apache.org/jira/browse/IMPALA-7221
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.8.0
>Reporter: Mostafa Mokhtar
>Priority: Major
> Attachments: s3_alloc_expensive_1_js.txt, s3_alloc_expensive_2_ps.txt
>
>
> From Perf
> {code}
> Samples: 1M of event 'cpu-clock', Event count (approx.): 32005850
>   Children   Self  Command  Shared Object      Symbol
> -  16.46%   0.04%  impalad  impalad            [.] hdfsRead
>    - 16.45% hdfsRead
>       - 9.71% jni_NewByteArray
>            9.63% TypeArrayKlass::allocate_common
>         6.57% __memmove_ssse3_back
> +   9.72%   0.03%  impalad  libjvm.so          [.] jni_NewByteArray
> +   9.67%   8.79%  impalad  libjvm.so          [.] TypeArrayKlass::allocate_common
> +   8.82%   0.00%  impalad  [unknown]          [.] 
> +   7.67%   0.04%  impalad  [kernel.kallsyms]  [k] system_call_fastpath
> +   7.19%   7.02%  impalad  impalad            [.] impala::ScalarColumnReader<...
> +   7.18%   6.55%  impalad  libc-2.17.so       [.] __memmove_ssse3_back
> +   6.32%   0.00%  impalad  [unknown]          [.] 0x001a9458
> +   6.07%   0.00%  impalad  [kernel.kallsyms]  [k] do_softirq
> +   6.07%   0.00%  impalad  [kernel.kallsyms]  [k] call_softirq
> +   6.05%   0.24%  impalad  [kernel.kallsyms]  [k] __do_softirq
> +   5.98%   0.00%  impalad  [kernel.kallsyms]  [k] xen_hvm_callback_vector
> +   5.98%   0.00%  impalad  [kernel.kallsyms]  [k] xen_evtchn_do_upcall
> +   5.98%   0.00%  impalad  [kernel.kallsyms]  [k] irq_exit
> +   5.81%   0.03%  impalad  [kernel.kallsyms]  [k] net_rx_action
> {code}
> {code}
> #0  0x7ffa3d78d69b in TypeArrayKlass::allocate_common(int, bool, Thread*) 
> () from /usr/java/jdk1.8.0_121/jre/lib/amd64/server/libjvm.so
> #1  0x7ffa3d3e22d2 in jni_NewByteArray () from 
> /usr/java/jdk1.8.0_121/jre/lib/amd64/server/libjvm.so
> #2  0x020ec13c in hdfsRead ()
> #3  0x01100948 in impala::io::ScanRange::Read(unsigned char*, long, 
> long*, bool*) ()
> #4  0x010fa294 in 
> impala::io::DiskIoMgr::ReadRange(impala::io::DiskIoMgr::DiskQueue*, 
> impala::io::RequestContext*, impala::io::ScanRange*) ()
> #5  0x010fa3f4 in 
> impala::io::DiskIoMgr::WorkLoop(impala::io::DiskIoMgr::DiskQueue*) ()
> #6  0x00d15193 in impala::Thread::SuperviseThread(std::string const&, 
> std::string const&, boost::function, impala::Promise*) ()
> #7  0x00d158d4 in boost::detail::thread_data<...>::run() ()
> #8  0x012919aa in thread_proxy ()
> #9  0x7ffa3b6a6e25 in start_thread () from /lib64/libpthread.so.0
> #10 0x7ffa3b3d0bad in clone () from /lib64/libc.so.6
> {code}
> There is also log4j contention in the JVM due to writing error messages to 
> impalad.ERROR like this:
> {code}
> readDirect: FSDataInputStream#read error:
> UnsupportedOperationException: Byte-buffer read unsupported by input 
> streamjava.lang.UnsupportedOperationException: Byte-buffer read unsupported 
> by input stream
> at 
> org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:150)
> readDirect: FSDataInputStream#read error:
> UnsupportedOperationException: Byte-buffer read unsupported by input 
> streamjava.lang.UnsupportedOperationException: Byte-buffer read unsupported 
> by input stream
> at 
> 

[jira] [Commented] (IMPALA-7378) test_strict_mode failed on an ASAN build: expected "Error converting column: 5 to DOUBLE"

2018-07-31 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564645#comment-16564645
 ] 

Sailesh Mukil commented on IMPALA-7378:
---

[~poojanilangekar] When you say the coordinator completed the query, do you 
mean that it completed it successfully? If it did, then what I mention below 
_might_ be happening.

Certain queries can complete before all the fragment instances have finished 
running. This is because the coordinator might have received enough row(s) to 
send back to the client. Meanwhile, some fragment instances may continue 
running since they're unaware that the query has completed. So, the coordinator 
will send a Cancel() RPC which will stop those fragment instances eventually.
Example queries where this happens all the time are LIMIT queries.

It may be the case that the coordinator finished the query successfully from 
its point of view, but then a fragment instance unaware of the query completion 
hit IMPALA-7335 (or some error).

So, to summarize, even though the query may have completed successfully, you 
may have hit an error, and the completion of the query may have nothing to do 
with that.

> test_strict_mode failed on an ASAN build: expected "Error converting column: 
> 5 to DOUBLE"
> -
>
> Key: IMPALA-7378
> URL: https://issues.apache.org/jira/browse/IMPALA-7378
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: David Knupp
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: broken-build
>
> *Stacktrace*
> {noformat}
> query_test/test_queries.py:159: in test_strict_mode
> self.run_test_case('QueryTest/strict-mode-abort', vector)
> common/impala_test_suite.py:420: in run_test_case
> assert False, "Expected exception: %s" % expected_str
> E   AssertionError: Expected exception: Error converting column: 5 to DOUBLE
> {noformat}
> *Standard Error*
> {noformat}
> -- executing against localhost:21000
> use functional;
> SET strict_mode=1;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=0;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> select * from overflow;
> -- executing against localhost:21000
> use functional;
> SET strict_mode=1;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> select tinyint_col from overflow;
> -- executing against localhost:21000
> select smallint_col from overflow;
> -- executing against localhost:21000
> select int_col from overflow;
> -- executing against localhost:21000
> select bigint_col from overflow;
> -- executing against localhost:21000
> select float_col from overflow;
> -- executing against localhost:21000
> select double_col from overflow;
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (IMPALA-7376) Impala hits a DCHECK if a fragment instance fails to initialize the filter bank

2018-07-31 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564386#comment-16564386
 ] 

Sailesh Mukil commented on IMPALA-7376:
---

CC: [~kwho]

> Impala hits a DCHECK if a fragment instance fails to initialize the filter 
> bank
> ---
>
> Key: IMPALA-7376
> URL: https://issues.apache.org/jira/browse/IMPALA-7376
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Critical
>  Labels: crash
>
> While Prepare()-ing a fragment instance, if we fail to initialize the runtime 
> filter bank, we will exit FIS::Prepare() without acquiring a thread token 
> (AcquireThreadToken()):
> https://github.com/apache/impala/blob/316b17ac55adb3d1deeb1289b4045688269b201d/be/src/runtime/fragment-instance-state.cc#L135-L139
> FIS::Finalize() is *always* called, regardless of whether the fragment 
> instance succeeded or failed, and it tries to ReleaseThreadToken() even 
> though the token might never have been acquired:
> https://github.com/apache/impala/blob/316b17ac55adb3d1deeb1289b4045688269b201d/be/src/runtime/fragment-instance-state.cc#L464
> This causes a DCHECK to be hit.
> This was found while I was adding global debug actions (IMPALA-7046) to the 
> FIS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Created] (IMPALA-7376) Impala hits a DCHECK if a fragment instance fails to initialize the filter bank

2018-07-31 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-7376:
-

 Summary: Impala hits a DCHECK if a fragment instance fails to 
initialize the filter bank
 Key: IMPALA-7376
 URL: https://issues.apache.org/jira/browse/IMPALA-7376
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Reporter: Sailesh Mukil
Assignee: Sailesh Mukil


While Prepare()-ing a fragment instance, if we fail to initialize the runtime 
filter bank, we will exit FIS::Prepare() without acquiring a thread token 
(AcquireThreadToken()):

https://github.com/apache/impala/blob/316b17ac55adb3d1deeb1289b4045688269b201d/be/src/runtime/fragment-instance-state.cc#L135-L139

FIS::Finalize() is *always* called, regardless of whether the fragment instance 
succeeded or failed, and it tries to ReleaseThreadToken() even though the token 
might never have been acquired:
https://github.com/apache/impala/blob/316b17ac55adb3d1deeb1289b4045688269b201d/be/src/runtime/fragment-instance-state.cc#L464

This causes a DCHECK to be hit.

This was found while I was adding global debug actions (IMPALA-7046) to the FIS.
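One way to picture a fix (a sketch under the assumption that token acquisition 
is tracked explicitly; not the actual patch):

{code}
// Sketch: remember whether the thread token was acquired so Finalize() can
// release it conditionally instead of unconditionally.
class FragmentInstanceStateSketch {
 public:
  bool Prepare() {
    if (!InitFilterBank()) return false;  // Fails before any token is held.
    AcquireThreadToken();
    thread_token_acquired_ = true;
    return true;
  }

  void Finalize() {
    // Finalize() always runs; only release what was actually acquired.
    if (thread_token_acquired_) ReleaseThreadToken();
  }

 private:
  bool InitFilterBank() { return false; }  // Stand-in: simulate the failure.
  void AcquireThreadToken() {}
  void ReleaseThreadToken() {}
  bool thread_token_acquired_ = false;
};
{code}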




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)





[jira] [Commented] (IMPALA-6644) Add last heartbeat timestamp into Statestore metric

2018-07-30 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16562331#comment-16562331
 ] 

Sailesh Mukil commented on IMPALA-6644:
---

[~tarmstrong] Sure, we can change the default; I mentioned 60 seconds since 
that was what was suggested above. It makes sense that that would be too noisy, 
since it's once per subscriber. However, do we need the flag?

> Add last heartbeat timestamp into Statestore metric
> ---
>
> Key: IMPALA-6644
> URL: https://issues.apache.org/jira/browse/IMPALA-6644
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Mala Chikka Kempanna
>Assignee: Pooja Nilangekar
>Priority: Minor
>  Labels: ramp-up, supportability
>
> In the latest and previous versions, the statestore in its default logging 
> reports only when it fails to send a heartbeat to a host.
> There is no way to confirm that the statestore is indeed continuing to 
> heartbeat under all passing conditions, except by turning on debug logs, 
> which becomes too noisy. But at the same time it's important to know the 
> statestore is indeed heartbeating.
> The suggestion here is to add a metric to the statestore metrics page and 
> also print the same in the log once a minute (or at a configurable 
> frequency), reporting the last heartbeat timestamp and, optionally, the last 
> heartbeat host.
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (IMPALA-6644) Add last heartbeat timestamp into Statestore metric

2018-07-30 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16562297#comment-16562297
 ] 

Sailesh Mukil commented on IMPALA-6644:
---

[~mkempanna] From a user's point of view, is there a strong need to have the log 
frequency configurable, instead of just keeping it as a constant (60 seconds)?

We're trying to avoid adding too many flags to the codebase, so we want to make 
sure we absolutely need this one before it's added.

CC: [~poojanilangekar]
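A sketch of the constant-frequency variant under discussion (illustrative; the 
real statestore would update its metric on every heartbeat and use Impala's 
own logging):

{code}
#include <chrono>
#include <iostream>
#include <string>

using Clock = std::chrono::steady_clock;
constexpr std::chrono::seconds kLogInterval{60};  // The proposed constant.

// On each successful heartbeat: always record it (the metric), but only log
// once per interval so the log stays quiet without needing a new flag.
void OnHeartbeatSuccess(const std::string& subscriber_id,
                        Clock::time_point* last_log_time) {
  Clock::time_point now = Clock::now();
  if (now - *last_log_time >= kLogInterval) {
    std::cout << "Heartbeats healthy; last subscriber: " << subscriber_id << "\n";
    *last_log_time = now;
  }
}
{code}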

> Add last heartbeat timestamp into Statestore metric
> ---
>
> Key: IMPALA-6644
> URL: https://issues.apache.org/jira/browse/IMPALA-6644
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Mala Chikka Kempanna
>Assignee: Pooja Nilangekar
>Priority: Minor
>  Labels: ramp-up, supportability
>
> In the latest and previous versions, the statestore in its default logging 
> reports only when it fails to send a heartbeat to a host.
> There is no way to confirm that the statestore is indeed continuing to 
> heartbeat under all passing conditions, except by turning on debug logs, 
> which becomes too noisy. But at the same time it's important to know the 
> statestore is indeed heartbeating.
> The suggestion here is to add a metric to the statestore metrics page and 
> also print the same in the log once a minute (or at a configurable 
> frequency), reporting the last heartbeat timestamp and, optionally, the last 
> heartbeat host.
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Resolved] (IMPALA-7302) Build fails on Centos6

2018-07-19 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-7302.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Build fails on Centos6
> --
>
> Key: IMPALA-7302
> URL: https://issues.apache.org/jira/browse/IMPALA-7302
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Reporter: Michael Ho
>Assignee: Sailesh Mukil
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 3.1.0
>
>
> Due to a recent change in IMPALA-7006, the build started failing on Centos6.
> {noformat}
> 13:33:16 
> /data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/be/src/kudu/util/net/socket.cc:
>  In member function ‘kudu::Status kudu::Socket::SetReusePort(bool)’:
> 13:33:16 
> /data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/be/src/kudu/util/net/socket.cc:249:50:
>  error: ‘SO_REUSEPORT’ was not declared in this scope
> 13:33:16RETURN_NOT_OK_PREPEND(SetSockOpt(SOL_SOCKET, SO_REUSEPORT, 
> int_flag),
> 13:33:16   ^
> 13:33:16 make[2]: *** 
> [be/src/kudu/util/CMakeFiles/kudu_util.dir/net/socket.cc.o] Error 1
> 13:33:16 make[2]: *** Waiting for unfinished jobs
> {noformat}
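The usual fix for this class of breakage (a sketch of the conventional guard, 
not necessarily the patch that landed) is to hide the option behind a 
preprocessor check, since Centos6-era headers don't define SO_REUSEPORT:

{code}
#include <sys/socket.h>

// Sketch: fail gracefully on old platforms instead of referencing an
// undeclared symbol at compile time.
bool SetReusePort(int fd, bool enable) {
#ifdef SO_REUSEPORT
  int flag = enable ? 1 : 0;
  return setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &flag, sizeof(flag)) == 0;
#else
  (void)fd; (void)enable;
  return false;  // SO_REUSEPORT not available on this platform.
#endif
}
{code}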



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)





[jira] [Created] (IMPALA-7299) Impala fails to work with the krb5 config 'rdns=false' in Impala 2.12.0/3.0

2018-07-13 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-7299:
-

 Summary: Impala fails to work with the krb5 config 'rdns=false' in 
Impala 2.12.0/3.0
 Key: IMPALA-7299
 URL: https://issues.apache.org/jira/browse/IMPALA-7299
 Project: IMPALA
  Issue Type: Task
  Components: Docs
Affects Versions: Impala 2.12.0, Impala 3.0
Reporter: Sailesh Mukil
Assignee: Alex Rodoni


Since we switched to using KRPC in Impala 2.12.0 and Impala 3.0, we found 
a bug that prevents kerberized communication if the 'rdns' flag in 
krb5.conf is changed from its default of 'rdns=true' to 'rdns=false'.

The current workaround is to set it back to its default of 'true'. Keep in mind 
that the 'dns_canonicalize_hostname' flag should also be 'true'; otherwise, the 
'rdns' flag will be ignored.

More details on the flag can be found here:
http://web.mit.edu/kerberos/krb5-1.12/doc/admin/conf_files/krb5_conf.html

A fix for this issue is tracked by IMPALA-7298.
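For anyone applying the workaround, the relevant krb5.conf settings look like 
this (only the two flags mentioned above; the rest of the file is unchanged):

{noformat}
[libdefaults]
  # Workaround: keep both at their defaults so KRPC kerberos works.
  rdns = true
  dns_canonicalize_hostname = true
{noformat}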



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)





[jira] [Resolved] (IMPALA-4784) Remove InProcessStatestore

2018-07-03 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-4784.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Remove InProcessStatestore
> --
>
> Key: IMPALA-4784
> URL: https://issues.apache.org/jira/browse/IMPALA-4784
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 2.9.0
>Reporter: Henry Robinson
>Assignee: Sailesh Mukil
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> The {{InProcessStatestore}} class is only used by the {{statestore-test}} and 
> is likely to be obsoleted by the KRPC work. Since {{statestore-test}} doesn't 
> provide much test coverage at all, let's remove {{InProcessStatestore}} to 
> avoid the need to keep it up-to-date.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)





[jira] [Created] (IMPALA-7235) Allow the Statestore to shut down cleanly

2018-07-02 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-7235:
-

 Summary: Allow the Statestore to shut down cleanly
 Key: IMPALA-7235
 URL: https://issues.apache.org/jira/browse/IMPALA-7235
 Project: IMPALA
  Issue Type: Improvement
  Components: Distributed Exec
Affects Versions: Impala 2.0
Reporter: Sailesh Mukil


The Statestore class was written with the assumption that it will live for the 
entire lifetime of the cluster and never have to be shut down. This is true 
today; however, as a result, we have to let all our Statestore objects leak in 
the BE tests.

Adding a clean shutdown mechanism shouldn't be too hard, so let's do that.
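A minimal sketch of the kind of shutdown hook this implies (illustrative names; 
the real Statestore would also need to stop its heartbeat and topic-update 
threads):

{code}
#include <atomic>
#include <chrono>
#include <thread>

// Sketch: a service loop that can be asked to stop, so BE tests can tear the
// object down instead of leaking it.
class StatestoreSketch {
 public:
  void Start() { worker_ = std::thread([this] { MainLoop(); }); }

  void Shutdown() {
    exit_flag_.store(true);
    if (worker_.joinable()) worker_.join();  // Wait for the loop to exit.
  }

 private:
  void MainLoop() {
    while (!exit_flag_.load()) {
      // ... process heartbeats / topic updates ...
      std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
  }

  std::atomic<bool> exit_flag_{false};
  std::thread worker_;
};
{code}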



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IMPALA-7215) Implement a templatized CountingBarrier

2018-06-28 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-7215.
---
Resolution: Fixed

> Implement a templatized CountingBarrier
> ---
>
> Key: IMPALA-7215
> URL: https://issues.apache.org/jira/browse/IMPALA-7215
> Project: IMPALA
>  Issue Type: Improvement
>Affects Versions: Impala 3.0
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> Currently, our CountingBarrier util only notifies a 'bool' value and uses an 
> underlying Promise.
> We're seeing cases in the code where we might want to be notified through a 
> different kind of Promise (other than bool). We can templatize this utility 
> so that the CountingBarrier can be reused for these new use cases as well.
> This was identified while working on IMPALA-7163.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Created] (IMPALA-7215) Implement a templatized CountingBarrier

2018-06-26 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-7215:
-

 Summary: Implement a templatized CountingBarrier
 Key: IMPALA-7215
 URL: https://issues.apache.org/jira/browse/IMPALA-7215
 Project: IMPALA
  Issue Type: Improvement
Affects Versions: Impala 3.0
Reporter: Sailesh Mukil
Assignee: Sailesh Mukil
 Fix For: Impala 3.1.0


Currently, our CountingBarrier util only notifies a 'bool' value and uses an 
underlying Promise.

We're seeing cases in the code where we might want to be notified through a 
different kind of Promise (other than bool). We can templatize this utility so 
that the CountingBarrier can be reused for these new use cases as well.

This was identified while working on IMPALA-7163.
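A sketch of what the templatization could look like (simplified with 
std::promise so it stands alone; Impala's own Promise type would take its 
place):

{code}
#include <atomic>
#include <future>

// Sketch: fire a promise of any type T once Notify() has been called
// 'count' times, generalizing the bool-only barrier.
template <typename T>
class CountingBarrier {
 public:
  explicit CountingBarrier(int count) : count_(count) {}

  // The value passed with the final notification is what waiters receive.
  void Notify(const T& value) {
    if (--count_ == 0) promise_.set_value(value);
  }

  T Wait() { return promise_.get_future().get(); }

 private:
  std::atomic<int> count_;
  std::promise<T> promise_;
};
{code}

With this, CountingBarrier<bool> preserves today's behavior, and a 
status-carrying type would cover the new use cases.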



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IMPALA-7163) Implement a state machine for the QueryState class

2018-06-25 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil updated IMPALA-7163:
--
Issue Type: Sub-task  (was: Improvement)
Parent: IMPALA-5865

> Implement a state machine for the QueryState class
> --
>
> Key: IMPALA-7163
> URL: https://issues.apache.org/jira/browse/IMPALA-7163
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Major
>
> We've recently been improving our query lifecycle by adding explicit state 
> transitions so it's easier to reason about what should happen at a given 
> stage in the lifetime of a query or a fragment instance.
> On the coordinator side, the coordinator's view of the query: 
> https://github.com/apache/impala/commit/6ca87e46736a1e591ed7d7d5fee05b4b4d2fbb50
> On the fragment instance state side, a FIS's view of its own execution:
> https://github.com/apache/impala/blob/e12ee485cf4c77203b144c053ee167509cc39374/be/src/runtime/fragment-instance-state.h#L182-L203
> We don't have something like this for the QueryState class, which maintains 
> query-wide state per executor. Adding it should make the lifecycle of a query 
> from an executor's point of view much easier to reason about.
> Additional info: This was identified as part of work for 
> IMPALA-2990/IMPALA-4063, and is a precursor to it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (IMPALA-6013) Remove InProcessImpalaServer

2018-06-22 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520762#comment-16520762
 ] 

Sailesh Mukil commented on IMPALA-6013:
---

[~tarmstrong] Yes, that sounds about right.

> Remove InProcessImpalaServer
> 
>
> Key: IMPALA-6013
> URL: https://issues.apache.org/jira/browse/IMPALA-6013
> Project: IMPALA
>  Issue Type: Task
>  Components: Backend
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Major
>  Labels: refactor
>
> Now that we've refactored CreateImpalaServer() such that it can be used in 
> tests (IMPALA-4786), we should get rid of InProcessImpalaServer, which was 
> only historically there because the ImpalaServer couldn't be used in tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (IMPALA-6085) Make the setup and teardown of the security code idempotent

2018-06-21 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16519822#comment-16519822
 ] 

Sailesh Mukil commented on IMPALA-6085:
---

Interesting test cases this could enable:
* Add cases to rpc-mgr-kerberized-test that use different configurations of 
kerberos, such as running with auth_to_local rules.
* Add cases that run a matrix of tests such as {Kerberos enabled, SSL enabled} 
(we run only this today), {Kerberos enabled, SSL disabled}, and {Kerberos 
disabled, SSL enabled}.

Will add more as I think of them.

> Make the setup and teardown of the security code idempotent
> ---
>
> Key: IMPALA-6085
> URL: https://issues.apache.org/jira/browse/IMPALA-6085
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Security
>Affects Versions: Impala 2.10.0
>Reporter: Sailesh Mukil
>Priority: Major
>  Labels: infrastructure, security, test
>
> Our security code assumes that it will only be called once in the lifetime of 
> a process. This is true; however, for tests, we would like to set it up and 
> tear it down multiple times, to exercise different configurations and test it 
> within the same backend test process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Resolved] (IMPALA-5335) Review consistency of ADLS python client used for Impala testing

2018-06-19 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-5335.
---
   Resolution: Fixed
Fix Version/s: Impala 2.12.0

The ADLS client has been fixed upstream so this is not an issue anymore. See 
HADOOP-14450.

> Review consistency of ADLS python client used for Impala testing
> 
>
> Key: IMPALA-5335
> URL: https://issues.apache.org/jira/browse/IMPALA-5335
> Project: IMPALA
>  Issue Type: Task
>  Components: Infrastructure
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Major
>  Labels: infrastructure
> Fix For: Impala 2.12.0
>
>
> The ADLS Python client seems to have consistency issues even though ADLS 
> claims to be strongly consistent.
> Some of our tests are skipped because of this issue, with the tag 
> SkipIfADLS.slow_client.
> The documentation for the Python client doesn't seem to state or address this 
> as a known issue. It is, however, a pre-release client.
> This JIRA is meant to track this issue on the Impala side, and close it once 
> it's addressed by ADLS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-4970) Record identity of largest latency ExecQueryFInstances() RPC per query

2018-06-19 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-4970.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Record identity of largest latency ExecQueryFInstances() RPC per query
> --
>
> Key: IMPALA-4970
> URL: https://issues.apache.org/jira/browse/IMPALA-4970
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 2.9.0
>Reporter: Henry Robinson
>Assignee: Rahul Shivu Mahadev
>Priority: Major
>  Labels: newbie, ramp-up
> Fix For: Impala 3.1.0
>
>
> Although we retain the histogram of fragment instance startup latencies, we 
> don't record the identity of the most expensive instance, or the host it runs 
> on. This would be helpful in diagnosing slow query start-up times.
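
The gist of the requested change, sketched below (hypothetical names; the real 
bookkeeping would live beside the coordinator's existing C++ latency 
histogram): keep a running argmax of (latency, instance, host) as each 
ExecQueryFInstances() RPC completes.

{code:python}
# Sketch only: track the identity of the slowest ExecQueryFInstances() RPC
# alongside the existing startup-latency histogram. Names are hypothetical.
class StartupLatencyTracker:
    def __init__(self):
        self.latencies_ms = []          # stand-in for the histogram
        self.max_latency_ms = -1.0
        self.slowest_instance = None    # (finstance_id, host)

    def record(self, finstance_id, host, latency_ms):
        self.latencies_ms.append(latency_ms)
        if latency_ms > self.max_latency_ms:
            self.max_latency_ms = latency_ms
            self.slowest_instance = (finstance_id, host)

tracker = StartupLatencyTracker()
tracker.record("f1:0", "host-a", 12.5)
tracker.record("f1:1", "host-b", 340.0)
print(tracker.slowest_instance, tracker.max_latency_ms)
{code}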



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (IMPALA-3825) Distribute runtime filter aggregation across cluster

2018-06-19 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil reassigned IMPALA-3825:
-

Assignee: Rahul Shivu Mahadev  (was: Sailesh Mukil)

> Distribute runtime filter aggregation across cluster
> 
>
> Key: IMPALA-3825
> URL: https://issues.apache.org/jira/browse/IMPALA-3825
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 2.6.0
>Reporter: Henry Robinson
>Assignee: Rahul Shivu Mahadev
>Priority: Major
>  Labels: runtime-filters
>
> Runtime filters can be tens of MB or more, and incasting all filters from all 
> shuffle joins to the coordinator can put a lot of memory pressure on that 
> node. To alleviate this we should consider spreading out the aggregation 
> operation across the cluster, so that a different node aggregates each 
> runtime filter.
> This still restricts aggregation to #runtime-filters nodes, which will 
> usually be less than the cluster size. If we want to smooth that out further 
> we could use tree-based aggregation, but let's measure the benefits of simply 
> distributing the aggregation work first.
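
To make the first step concrete, a minimal sketch under the assumption that 
aggregator placement only needs to be deterministic and roughly balanced (the 
function and host names are illustrative, not Impala's scheduler API):

{code:python}
# Sketch: assign each runtime filter a distinct aggregator node so the
# coordinator no longer incasts every filter. Deterministic hashing keeps
# all producers of a given filter agreeing on the same aggregator.
import hashlib

def aggregator_for_filter(filter_id, hosts):
    h = int(hashlib.md5(str(filter_id).encode()).hexdigest(), 16)
    return hosts[h % len(hosts)]

hosts = ["node1", "node2", "node3"]
for fid in range(5):
    print(fid, "->", aggregator_for_filter(fid, hosts))
{code}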



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-7157) Avoid unnecessarily pretty printing profiles per fragment instance

2018-06-18 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-7157.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Avoid unnecessarily pretty printing profiles per fragment instance
> --
>
> Key: IMPALA-7157
> URL: https://issues.apache.org/jira/browse/IMPALA-7157
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Minor
>  Labels: logs
> Fix For: Impala 3.1.0
>
>
> In SendReport(), if VLOG_FILE_IS_ON is 'true' (which is not the most verbose 
> logging level, but is higher than default), we pretty print the profile for 
> every fragment instance, which is a very expensive operation, as serializing 
> the profile is non-trivial (look at RuntimeProfile::PrettyPrint()), and 
> printing large amounts of information to the logs isn't cheap either. 
> Lastly, it is very noisy.
> This seems unnecessary since this will not benefit us, as all the profiles 
> are merged at the coordinator side. We could argue that this might be 
> necessary when an executor fails to send the profile to the coordinator, but 
> that signifies a network issue which will not be reflected in the profile of 
> any fragment instance.
> This will help reduce noise in the logs when the log level is bumped up to 
> find other real issues that VLOG_FILE can help with.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7182) Impala does not allow the use of insecure clusters with public IPs by default

2018-06-18 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-7182:
-

 Summary: Impala does not allow the use of insecure clusters with 
public IPs by default
 Key: IMPALA-7182
 URL: https://issues.apache.org/jira/browse/IMPALA-7182
 Project: IMPALA
  Issue Type: Documentation
  Components: Security
Affects Versions: Impala 2.12.0
Reporter: Sailesh Mukil
Assignee: Alex Rodoni


We made Impala more secure by using KRPC, which by default doesn't allow the 
use of insecure clusters (no auth or no encryption) with public IPs.

The workaround is to add the subnet to the 'trusted_subnets' flag:
https://github.com/apache/impala/blob/master/be/src/kudu/rpc/server_negotiation.cc#L70-L80

Although we don't expect users to run production environments with this 
configuration, we need to document this as it's a slight behavioral change from 
Impala 2.11 to Impala 2.12.
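
For reference, the semantics of the check are roughly as sketched below 
(illustrative Python, not the actual KRPC negotiation code): an 
unauthenticated, unencrypted peer is admitted only if its address falls inside 
a trusted subnet.

{code:python}
# Sketch of the trusted-subnets semantics: an insecure (unauthenticated,
# unencrypted) connection is only allowed when the peer's address lies in a
# trusted subnet. Illustrative only; not the KRPC implementation.
import ipaddress

def peer_is_trusted(peer_ip, trusted_subnets):
    addr = ipaddress.ip_address(peer_ip)
    return any(addr in ipaddress.ip_network(s) for s in trusted_subnets)

trusted = ["127.0.0.0/8", "10.0.0.0/8", "192.168.0.0/16"]
print(peer_is_trusted("10.1.2.3", trusted))    # True: private subnet
print(peer_is_trusted("54.23.9.1", trusted))   # False: public IP rejected
{code}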



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7163) Implement a state machine for the QueryState class

2018-06-12 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510103#comment-16510103
 ] 

Sailesh Mukil commented on IMPALA-7163:
---

FYI [~kwho] [~dhecht]

> Implement a state machine for the QueryState class
> --
>
> Key: IMPALA-7163
> URL: https://issues.apache.org/jira/browse/IMPALA-7163
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Major
>
> We've recently been improving our query lifecycle by adding explicit state 
> transitions so it's easier to reason about what should happen at a given 
> stage in the lifetime of a query or a fragment instance.
> On the coordinator side, the coordinator's view of the query: 
> https://github.com/apache/impala/commit/6ca87e46736a1e591ed7d7d5fee05b4b4d2fbb50
> On the fragment instance state side, a FIS's view of its own execution:
> https://github.com/apache/impala/blob/e12ee485cf4c77203b144c053ee167509cc39374/be/src/runtime/fragment-instance-state.h#L182-L203
> We don't have something like this for the QueryState class, which maintains 
> query-wide state per executor. Adding it should make the lifecycle of a query 
> from an executor's point of view much easier to reason about.
> Additional info: This was identified as part of work for 
> IMPALA-2990/IMPALA-4063, and is a precursor to it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7163) Implement a state machine for the QueryState class

2018-06-12 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-7163:
-

 Summary: Implement a state machine for the QueryState class
 Key: IMPALA-7163
 URL: https://issues.apache.org/jira/browse/IMPALA-7163
 Project: IMPALA
  Issue Type: Improvement
Reporter: Sailesh Mukil
Assignee: Sailesh Mukil


We've recently been improving our query lifecycle by adding explicit state 
transitions so it's easier to reason about what should happen at a given stage 
in the lifetime of a query or a fragment instance.

On the coordinator side, the coordinator's view of the query: 
https://github.com/apache/impala/commit/6ca87e46736a1e591ed7d7d5fee05b4b4d2fbb50

On the fragment instance state side, a FIS's view of its own execution:
https://github.com/apache/impala/blob/e12ee485cf4c77203b144c053ee167509cc39374/be/src/runtime/fragment-instance-state.h#L182-L203

We don't have something like this for the QueryState class, which maintains 
query-wide state per executor. Adding it should make the lifecycle of a query 
from an executor's point of view much easier to reason about.

Additional info: This was identified as part of work for IMPALA-2990, and is a 
precursor to it.
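
To make the proposal concrete, a minimal sketch of such a state machine (the 
state names are placeholders, not the eventual QueryState states): all legal 
transitions live in one table, and every transition is validated against it.

{code:python}
# Sketch: explicit state machine with a single table of legal transitions,
# so illegal lifecycle moves fail loudly. State names are hypothetical.
class QueryLifecycle:
    TRANSITIONS = {
        "PREPARING": {"EXECUTING", "ERROR"},
        "EXECUTING": {"FINISHED", "CANCELLED", "ERROR"},
        "FINISHED": set(),
        "CANCELLED": set(),
        "ERROR": set(),
    }

    def __init__(self):
        self.state = "PREPARING"

    def transition(self, new_state):
        if new_state not in self.TRANSITIONS[self.state]:
            raise RuntimeError(
                "illegal transition %s -> %s" % (self.state, new_state))
        self.state = new_state

q = QueryLifecycle()
q.transition("EXECUTING")
q.transition("FINISHED")
{code}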



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IMPALA-7163) Implement a state machine for the QueryState class

2018-06-12 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil updated IMPALA-7163:
--
Description: 
We've recently been improving our query lifecycle by adding explicit state 
transitions so it's easier to reason about what should happen at a given stage 
in the lifetime of a query or a fragment instance.

On the coordinator side, the coordinator's view of the query: 
https://github.com/apache/impala/commit/6ca87e46736a1e591ed7d7d5fee05b4b4d2fbb50

On the fragment instance state side, a FIS's view of its own execution:
https://github.com/apache/impala/blob/e12ee485cf4c77203b144c053ee167509cc39374/be/src/runtime/fragment-instance-state.h#L182-L203

We don't have something like this for the QueryState class, which maintains 
query-wide state per executor. Adding it should make the lifecycle of a query 
from an executor's point of view much easier to reason about.

Additional info: This was identified as part of work for 
IMPALA-2990/IMPALA-4063, and is a precursor to it.

  was:
We've recently been improving our query lifecycle by adding explicit state 
transitions so it's easier to reason about what should happen at a given stage 
in the lifetime of a query or a fragment instance.

On the coordinator side, the coordinator's view of the query: 
https://github.com/apache/impala/commit/6ca87e46736a1e591ed7d7d5fee05b4b4d2fbb50

On the fragment instance state side, a FIS's view of its own execution:
https://github.com/apache/impala/blob/e12ee485cf4c77203b144c053ee167509cc39374/be/src/runtime/fragment-instance-state.h#L182-L203

We don't have something like this for the QueryState class, which maintains 
query-wide state per executor. Adding it should make the lifecycle of a query 
from an executor's point of view much easier to reason about.

Additional info: This was identified as part of work for IMPALA-2990, and is a 
precursor to it.


> Implement a state machine for the QueryState class
> --
>
> Key: IMPALA-7163
> URL: https://issues.apache.org/jira/browse/IMPALA-7163
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Major
>
> We've recently been improving our query lifecycle by adding explicit state 
> transitions so it's easier to reason about what should happen at a given 
> stage in the lifetime of a query or a fragment instance.
> On the coordinator side, the coordinator's view of the query: 
> https://github.com/apache/impala/commit/6ca87e46736a1e591ed7d7d5fee05b4b4d2fbb50
> On the fragment instance state side, a FIS's view of its own execution:
> https://github.com/apache/impala/blob/e12ee485cf4c77203b144c053ee167509cc39374/be/src/runtime/fragment-instance-state.h#L182-L203
> We don't have something like this for the QueryState class, which maintains 
> query-wide state per executor. Adding it should make the lifecycle of a query 
> from an executor's point of view much easier to reason about.
> Additional info: This was identified as part of work for 
> IMPALA-2990/IMPALA-4063, and is a precursor to it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7157) Avoid unnecessarily pretty printing profiles per fragment instance

2018-06-08 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-7157:
-

 Summary: Avoid unnecessarily pretty printing profiles per fragment 
instance
 Key: IMPALA-7157
 URL: https://issues.apache.org/jira/browse/IMPALA-7157
 Project: IMPALA
  Issue Type: Improvement
  Components: Distributed Exec
Reporter: Sailesh Mukil
Assignee: Sailesh Mukil


In SendReport(), if VLOG_FILE_IS_ON is 'true' (which is not the most verbose 
logging level, but is higher than default), we pretty print the profile for 
every fragment instance, which is a very expensive operation, as serializing 
the profile is non-trivial (look at RuntimeProfile::PrettyPrint()), and 
printing large amounts of information to the logs isn't cheap either. Lastly, 
it is very noisy.

This seems unnecessary since this will not benefit us, as all the profiles are 
merged at the coordinator side. We could argue that this might be necessary 
when an executor fails to send the profile to the coordinator, but that 
signifies a network issue which will not be reflected in the profile of any 
fragment instance.

This will help reduce noise in the logs when the log level is bumped up to find 
other real issues that VLOG_FILE can help with.
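
As an aside, the general pattern in Python terms (a sketch of the cost 
concern, not the actual C++ change, which simply removes the per-instance 
print; logging here stands in for glog's VLOG, and Profile is a stub):

{code:python}
# Generic sketch: pretty-printing a profile per fragment instance is
# expensive even when the log line is ultimately discarded. Deferring the
# formatting pushes the cost to actual emission.
import logging

log = logging.getLogger("fragment")

class Profile:                           # stub standing in for RuntimeProfile
    def pretty_print(self):
        return "...large serialized profile..."

class LazyPretty:
    def __init__(self, profile):
        self.profile = profile
    def __str__(self):                   # only runs if the record is emitted
        return self.profile.pretty_print()

def send_report(profile):
    # %s formatting is lazy: pretty_print() runs only if DEBUG is enabled.
    log.debug("profile:\n%s", LazyPretty(profile))

send_report(Profile())
{code}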



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7113) ASAN heap-buffer-overflow in impala::HdfsRCFileScanner::GetCurrentKeyBuffer()

2018-06-01 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498606#comment-16498606
 ] 

Sailesh Mukil commented on IMPALA-7113:
---

Assigning this to [~rahul.mahadev]. Please reach out to [~pranay_singh] if you 
have any concerns specific to IMPALA-3833.
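
For context on the trace quoted below: the overflow is a read past the end of 
the buffer in variable-length integer decoding. A generic bounds-checked 
decoder looks like this (LEB128-style sketch, not the exact Hadoop VLong 
format that GetVLong() parses):

{code:python}
# Sketch: varint decode with explicit bounds checks, as a generic
# illustration of the missing check (not Impala's actual wire format).
def get_varint(buf, offset):
    result, shift = 0, 0
    while True:
        if offset >= len(buf):           # the check ASAN says was missing
            raise ValueError("varint runs past end of buffer")
        byte = buf[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7

print(get_varint(bytes([0x96, 0x01]), 0))    # (150, 2)
{code}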

> ASAN heap-buffer-overflow in impala::HdfsRCFileScanner::GetCurrentKeyBuffer()
> -
>
> Key: IMPALA-7113
> URL: https://issues.apache.org/jira/browse/IMPALA-7113
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Lars Volker
>Assignee: Rahul Shivu Mahadev
>Priority: Blocker
>  Labels: asan, broken-build
>
> [~pranay_singh] - I'm assigning this to you since you changed this code last 
> in IMPALA-3833.
> {noformat}
> ==31616==ERROR: AddressSanitizer: heap-buffer-overflow on address 
> 0x619002c94827 at pc 0x02293cf2 bp 0x7f653d570eb0 sp 0x7f653d570ea8
> READ of size 1 at 0x619002c94827 thread T125815
> #0 0x2293cf1 in impala::ReadWriteUtil::GetVLong(unsigned char*, long, 
> long*, int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/read-write-util.h:200:31
> #1 0x2292114 in impala::ReadWriteUtil::GetVInt(unsigned char*, int*, int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/read-write-util.h:184:13
> #2 0x228e5c6 in impala::HdfsRCFileScanner::GetCurrentKeyBuffer(int, bool, 
> unsigned char**, int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-rcfile-scanner.cc:379:20
> #3 0x228ce07 in impala::HdfsRCFileScanner::ReadKeyBuffers() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-rcfile-scanner.cc:354:41
> #4 0x228b8a0 in impala::HdfsRCFileScanner::StartRowGroup() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-rcfile-scanner.cc:259:41
> #5 0x228f006 in 
> impala::HdfsRCFileScanner::ProcessRange(impala::RowBatch*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-rcfile-scanner.cc:531:41
> #6 0x3039cef in 
> impala::BaseSequenceScanner::GetNextInternal(impala::RowBatch*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/base-sequence-scanner.cc:181:19
> #7 0x225c891 in impala::HdfsScanner::ProcessSplit() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-scanner.cc:134:21
> #8 0x221ad33 in 
> impala::HdfsScanNode::ProcessSplit(std::vector std::allocator > const&, impala::MemPool*, 
> impala::io::ScanRange*, long*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-scan-node.cc:453:21
> #9 0x2219e50 in impala::HdfsScanNode::ScannerThread(bool, long) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-scan-node.cc:360:16
> #10 0x1c4ffb6 in boost::function0::operator()() const 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:766:14
> #11 0x211216e in impala::Thread::SuperviseThread(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/util/thread.cc:356:3
> #12 0x211d3f8 in void boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> >::operator() const&, std::string const&, boost::function, impala::ThreadDebugInfo 
> const*, impala::Promise*), boost::_bi::list0>(boost::_bi::type, 
> void (*&)(std::string const&, std::string const&, boost::function, 
> impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list0&, 
> int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525:9
> #13 0x211d24b in boost::_bi::bind_t std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> > >::operator()() 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind_template.hpp:20:16
> #14 0x377bf79 in thread_proxy 
> (/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x377bf79)
> #15 0x32d4a07850 in start_thread (/lib64/libpthread.so.0+0x32d4a07850)
> #16 0x32d46e894c in clone (/lib64/libc.so.6+0x32d46e894c)
> 0x619002c94827 is located 89 bytes to the left of 991-byte region 
> [0x619002c94880,0x619002c94c5f)
> allocated by thread T125815 

[jira] [Assigned] (IMPALA-7113) ASAN heap-buffer-overflow in impala::HdfsRCFileScanner::GetCurrentKeyBuffer()

2018-06-01 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil reassigned IMPALA-7113:
-

Assignee: Rahul Shivu Mahadev  (was: Pranay Singh)

> ASAN heap-buffer-overflow in impala::HdfsRCFileScanner::GetCurrentKeyBuffer()
> -
>
> Key: IMPALA-7113
> URL: https://issues.apache.org/jira/browse/IMPALA-7113
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Lars Volker
>Assignee: Rahul Shivu Mahadev
>Priority: Blocker
>  Labels: asan, broken-build
>
> [~pranay_singh] - I'm assigning this to you since you changed this code last 
> in IMPALA-3833.
> {noformat}
> ==31616==ERROR: AddressSanitizer: heap-buffer-overflow on address 
> 0x619002c94827 at pc 0x02293cf2 bp 0x7f653d570eb0 sp 0x7f653d570ea8
> READ of size 1 at 0x619002c94827 thread T125815
> #0 0x2293cf1 in impala::ReadWriteUtil::GetVLong(unsigned char*, long, 
> long*, int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/read-write-util.h:200:31
> #1 0x2292114 in impala::ReadWriteUtil::GetVInt(unsigned char*, int*, int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/read-write-util.h:184:13
> #2 0x228e5c6 in impala::HdfsRCFileScanner::GetCurrentKeyBuffer(int, bool, 
> unsigned char**, int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-rcfile-scanner.cc:379:20
> #3 0x228ce07 in impala::HdfsRCFileScanner::ReadKeyBuffers() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-rcfile-scanner.cc:354:41
> #4 0x228b8a0 in impala::HdfsRCFileScanner::StartRowGroup() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-rcfile-scanner.cc:259:41
> #5 0x228f006 in 
> impala::HdfsRCFileScanner::ProcessRange(impala::RowBatch*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-rcfile-scanner.cc:531:41
> #6 0x3039cef in 
> impala::BaseSequenceScanner::GetNextInternal(impala::RowBatch*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/base-sequence-scanner.cc:181:19
> #7 0x225c891 in impala::HdfsScanner::ProcessSplit() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-scanner.cc:134:21
> #8 0x221ad33 in 
> impala::HdfsScanNode::ProcessSplit(std::vector std::allocator > const&, impala::MemPool*, 
> impala::io::ScanRange*, long*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-scan-node.cc:453:21
> #9 0x2219e50 in impala::HdfsScanNode::ScannerThread(bool, long) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-scan-node.cc:360:16
> #10 0x1c4ffb6 in boost::function0::operator()() const 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:766:14
> #11 0x211216e in impala::Thread::SuperviseThread(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/util/thread.cc:356:3
> #12 0x211d3f8 in void boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> >::operator() const&, std::string const&, boost::function, impala::ThreadDebugInfo 
> const*, impala::Promise*), boost::_bi::list0>(boost::_bi::type, 
> void (*&)(std::string const&, std::string const&, boost::function, 
> impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list0&, 
> int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525:9
> #13 0x211d24b in boost::_bi::bind_t std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> > >::operator()() 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind_template.hpp:20:16
> #14 0x377bf79 in thread_proxy 
> (/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x377bf79)
> #15 0x32d4a07850 in start_thread (/lib64/libpthread.so.0+0x32d4a07850)
> #16 0x32d46e894c in clone (/lib64/libc.so.6+0x32d46e894c)
> 0x619002c94827 is located 89 bytes to the left of 991-byte region 
> [0x619002c94880,0x619002c94c5f)
> allocated by thread T125815 here:
> #0 0x1654e88 in operator new(unsigned long) 
> 

[jira] [Updated] (IMPALA-7072) Kudu's kinit does not support auth_to_local rules with Heimdal kerberos

2018-05-31 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil updated IMPALA-7072:
--
Summary: Kudu's kinit does not support auth_to_local rules with Heimdal 
kerberos  (was: Kudu's kinit does not support auth_to_config rules with Heimdal 
kerberos)

> Kudu's kinit does not support auth_to_local rules with Heimdal kerberos
> ---
>
> Key: IMPALA-7072
> URL: https://issues.apache.org/jira/browse/IMPALA-7072
> Project: IMPALA
>  Issue Type: Bug
>  Components: Security
>Affects Versions: Impala 2.12.0
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Critical
>
> On deployments that use Heimdal kerberos configured with 'auth_to_local' 
> rules set, and with the Impala startup flag 'use_kudu_kinit'=true, the 
> auth_to_local rules will not be respected as it's not supported with Kudu's 
> kinit.
> The implication of this is that from Impala 2.12.0 onwards, clusters with the 
> above configuration will not be able to use KRPC with kerberos enabled.
> A workaround is to get rid of the auth_to_local rules for such deployments.
> We need to have a good long term solution to fix this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-6990) TestClientSsl.test_tls_v12 failing due to Python SSL error

2018-05-31 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-6990.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0
   Impala 2.13.0

> TestClientSsl.test_tls_v12 failing due to Python SSL error
> --
>
> Key: IMPALA-6990
> URL: https://issues.apache.org/jira/browse/IMPALA-6990
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.0
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Blocker
>  Labels: broken-build, flaky
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> We've seen quite a few jobs fail with the following error:
> *_ssl.c:504: EOF occurred in violation of protocol*
> {code:java}
> custom_cluster/test_client_ssl.py:128: in test_tls_v12
> self._validate_positive_cases("%s/server-cert.pem" % self.CERT_DIR)
> custom_cluster/test_client_ssl.py:181: in _validate_positive_cases
> result = run_impala_shell_cmd(shell_options)
> shell/util.py:97: in run_impala_shell_cmd
> result.stderr)
> E   AssertionError: Cmd --ssl -q 'select 1 + 2' was expected to succeed: 
> Starting Impala Shell without Kerberos authentication
> E   SSL is enabled. Impala server certificates will NOT be verified (set 
> --ca_cert to change)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 3th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 4th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 5th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:216:
>  DeprecationWarning: validate is deprecated. Use cert_reqs=ssl.CERT_NONE 
> instead
> E DeprecationWarning)
> E   No handlers could be found for logger "thrift.transport.TSSLSocket"
> E   Error connecting: TTransportException, Could not connect to 
> localhost:21000: [Errno 8] _ssl.c:504: EOF occurred in violation of protocol
> E   Not connected to Impala, could not execute queries.
> {code}
> We need to investigate why this is happening and fix it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IMPALA-6990) TestClientSsl.test_tls_v12 failing due to Python SSL error

2018-05-30 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495416#comment-16495416
 ] 

Sailesh Mukil commented on IMPALA-6990:
---

[~philip] I missed a detail: this test never ran on RHEL6, because all our 
RHEL6 machines have OpenSSL 1.0.0, which doesn't support TLSv1.2, causing the 
test to be skipped.

On RHEL7, this used to work before the Thrift upgrade because the old Thrift 
C++ library (0.9.0) was somehow accepting TLSv1 connections even though we 
explicitly set TLSv1.2 on the server. I'm unable to figure out why that was 
happening, and it looks like a bug, but I'll keep looking. It could be a bug in 
the Python 'ssl' library, the Thrift 0.9.0 Python library, the Thrift 0.9.0 
C++ library, or even OpenSSL.

In Thrift 0.9.3, we explicitly select TLSv1.2 if that's what the user 
specified, which fixes the above-mentioned bug. Our test caught this, since 
the client side doesn't support TLSv1.2 unless it has Python 2.7.9 or newer.

As for a weaker test, we already run test_ssl(), which doesn't enforce any 
ciphers or TLS versions, allowing the client and server to negotiate a 
protocol that they're both aware of.

> TestClientSsl.test_tls_v12 failing due to Python SSL error
> --
>
> Key: IMPALA-6990
> URL: https://issues.apache.org/jira/browse/IMPALA-6990
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.0
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Blocker
>  Labels: broken-build, flaky
>
> We've seen quite a few jobs fail with the following error:
> *_ssl.c:504: EOF occurred in violation of protocol*
> {code:java}
> custom_cluster/test_client_ssl.py:128: in test_tls_v12
> self._validate_positive_cases("%s/server-cert.pem" % self.CERT_DIR)
> custom_cluster/test_client_ssl.py:181: in _validate_positive_cases
> result = run_impala_shell_cmd(shell_options)
> shell/util.py:97: in run_impala_shell_cmd
> result.stderr)
> E   AssertionError: Cmd --ssl -q 'select 1 + 2' was expected to succeed: 
> Starting Impala Shell without Kerberos authentication
> E   SSL is enabled. Impala server certificates will NOT be verified (set 
> --ca_cert to change)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 3th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 4th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 5th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:216:
>  DeprecationWarning: validate is deprecated. Use cert_reqs=ssl.CERT_NONE 
> instead
> E DeprecationWarning)
> E   No handlers could be found for logger "thrift.transport.TSSLSocket"
> E   Error connecting: TTransportException, Could not connect to 
> localhost:21000: [Errno 8] _ssl.c:504: EOF occurred in violation of protocol
> E   Not connected to Impala, could not execute queries.
> {code}
> We need to investigate why this is happening and fix it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6990) TestClientSsl.test_tls_v12 failing due to Python SSL error

2018-05-29 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494124#comment-16494124
 ] 

Sailesh Mukil commented on IMPALA-6990:
---

Spent some more time looking at this and found that 'requests' wasn't the 
culprit.

When we upgraded to thrift-0.9.3, the TSSLSocket.py logic changed quite a bit. 
Our RHEL7 machines come equipped with Python 2.7.5. Looking at these comments, 
that means that we'll be unable to create an 'SSLContext' but able to explicitly 
specify ciphers:
https://github.com/apache/thrift/blob/master/lib/py/src/transport/TSSLSocket.py#L37-L41

{code:java}
# SSLContext is not available for Python < 2.7.9
_has_ssl_context = sys.hexversion >= 0x020709F0

# ciphers argument is not available for Python < 2.7.0
_has_ciphers = sys.hexversion >= 0x020700F0
{code}

If we cannot create a 'SSLContext', then we cannot use TLSv1.2 and have to use 
TLSv1:
https://github.com/apache/thrift/blob/master/lib/py/src/transport/TSSLSocket.py#L48-L49
{code:java}
# For python >= 2.7.9, use latest TLS that both client and server
# supports.
# SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
# For python < 2.7.9, use TLS 1.0 since TLSv1_X nor OP_NO_SSLvX is
# unavailable.
_default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else \
ssl.PROTOCOL_TLSv1
{code}

Our custom cluster test forces the server to use TLSv1.2 and also forces a 
specific cipher:
https://github.com/apache/impala/blob/master/tests/custom_cluster/test_client_ssl.py#L118-L119

So this combination of configurations causes a failure on RHEL7 because we only 
allow a specific cipher that works with TLSv1.2, but the client cannot use 
TLSv1.2 due to the Python version, as mentioned above.

On systems older than RHEL7, the machines come equipped with Python 2.6.6, 
which does not force the use of specific ciphers, so we get away without a 
failure.

To fix this, we either need to change the Python version on RHEL7 to be >= 
Python 2.7.9, or relax the 'test_client_ssl' restriction to run TLSv1.

The second option is the quickest, although not ideal, but it should at least 
unblock our builds while we upgrade the AMIs for RHEL7.
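
A quick way to check which side of these cutoffs a given machine falls on, 
mirroring the quoted Thrift checks:

{code:python}
# Mirror of the quoted Thrift capability checks: run under the system Python
# to see whether SSLContext (needed for TLSv1.2) and ciphers are usable.
import sys

has_ssl_context = sys.hexversion >= 0x020709F0   # Python >= 2.7.9
has_ciphers = sys.hexversion >= 0x020700F0       # Python >= 2.7.0
print("SSLContext/TLSv1.2:", has_ssl_context, "ciphers:", has_ciphers)
{code}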

> TestClientSsl.test_tls_v12 failing due to Python SSL error
> --
>
> Key: IMPALA-6990
> URL: https://issues.apache.org/jira/browse/IMPALA-6990
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.0
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Blocker
>  Labels: broken-build, flaky
>
> We've seen quite a few jobs fail with the following error:
> *_ssl.c:504: EOF occurred in violation of protocol*
> {code:java}
> custom_cluster/test_client_ssl.py:128: in test_tls_v12
> self._validate_positive_cases("%s/server-cert.pem" % self.CERT_DIR)
> custom_cluster/test_client_ssl.py:181: in _validate_positive_cases
> result = run_impala_shell_cmd(shell_options)
> shell/util.py:97: in run_impala_shell_cmd
> result.stderr)
> E   AssertionError: Cmd --ssl -q 'select 1 + 2' was expected to succeed: 
> Starting Impala Shell without Kerberos authentication
> E   SSL is enabled. Impala server certificates will NOT be verified (set 
> --ca_cert to change)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 3th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 4th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 5th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:216:
>  DeprecationWarning: validate is deprecated. Use cert_reqs=ssl.CERT_NONE 
> instead
> E DeprecationWarning)
> E   No handlers could be found for logger "thrift.transport.TSSLSocket"
> E   Error connecting: TTransportException, Could not connect to 
> localhost:21000: [Errno 8] _ssl.c:504: EOF occurred in violation of protocol
> E   Not connected to Impala, could not execute queries.
> {code}
> We need to investigate why this is happening and fix it.



--
This message was sent by 

[jira] [Commented] (IMPALA-6990) TestClientSsl.test_tls_v12 failing due to Python SSL error

2018-05-25 Thread Sailesh Mukil (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491132#comment-16491132
 ] 

Sailesh Mukil commented on IMPALA-6990:
---

Thanks to [~philip] for pointing out that this might be related to the thrift 
upgrade:
https://github.com/apache/impala/commit/0b6be850ca640f17aba3212f96993ca7f77fc0a7

The timelines match, and it seems that it might be an issue with the 'requests' 
package. We force TLSv1.2 in test_client_ssl; however, from this thread, it 
seems that 'requests' needs to be used in a different way to speak TLSv1.2.

I ran the test with TLSv1 on RHEL7 to confirm, and it does fix the issue; 
however, that's not what we're looking for.

Since it's not how _we_ use 'requests', but rather how the Python Thrift 
library might be using it, I need to look at it a little more to confirm a 
valid fix that will let us run on RHEL7 with TLSv1.2.

> TestClientSsl.test_tls_v12 failing due to Python SSL error
> --
>
> Key: IMPALA-6990
> URL: https://issues.apache.org/jira/browse/IMPALA-6990
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.0
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Blocker
>  Labels: broken-build, flaky
>
> We've seen quite a few jobs fail with the following error:
> *_ssl.c:504: EOF occurred in violation of protocol*
> {code:java}
> custom_cluster/test_client_ssl.py:128: in test_tls_v12
> self._validate_positive_cases("%s/server-cert.pem" % self.CERT_DIR)
> custom_cluster/test_client_ssl.py:181: in _validate_positive_cases
> result = run_impala_shell_cmd(shell_options)
> shell/util.py:97: in run_impala_shell_cmd
> result.stderr)
> E   AssertionError: Cmd --ssl -q 'select 1 + 2' was expected to succeed: 
> Starting Impala Shell without Kerberos authentication
> E   SSL is enabled. Impala server certificates will NOT be verified (set 
> --ca_cert to change)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 3th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 4th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 5th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:216:
>  DeprecationWarning: validate is deprecated. Use cert_reqs=ssl.CERT_NONE 
> instead
> E DeprecationWarning)
> E   No handlers could be found for logger "thrift.transport.TSSLSocket"
> E   Error connecting: TTransportException, Could not connect to 
> localhost:21000: [Errno 8] _ssl.c:504: EOF occurred in violation of protocol
> E   Not connected to Impala, could not execute queries.
> {code}
> We need to investigate why this is happening and fix it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7072) Kudu's kinit does not support auth_to_config rules with Heimdal kerberos

2018-05-24 Thread Sailesh Mukil (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489863#comment-16489863
 ] 

Sailesh Mukil commented on IMPALA-7072:
---

CC: [~kwho]

> Kudu's kinit does not support auth_to_config rules with Heimdal kerberos
> 
>
> Key: IMPALA-7072
> URL: https://issues.apache.org/jira/browse/IMPALA-7072
> Project: IMPALA
>  Issue Type: Bug
>  Components: Security
>Affects Versions: Impala 2.12.0
>Reporter: Sailesh Mukil
>Priority: Critical
>
> On deployments that use Heimdal kerberos configured with 'auth_to_local' 
> rules set, and with the Impala startup flag 'use_kudu_kinit'=true, the 
> auth_to_local rules will not be respected as it's not supported with Kudu's 
> kinit.
> The implication of this is that from Impala 2.12.0 onwards, clusters with the 
> above configuration will not be able to use KRPC with kerberos enabled.
> A workaround is to get rid of the auth_to_local rules for such deployments.
> We need to have a good long term solution to fix this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7072) Kudu's kinit does not support auth_to_config rules with Heimdal kerberos

2018-05-24 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-7072:
-

 Summary: Kudu's kinit does not support auth_to_config rules with 
Heimdal kerberos
 Key: IMPALA-7072
 URL: https://issues.apache.org/jira/browse/IMPALA-7072
 Project: IMPALA
  Issue Type: Bug
  Components: Security
Affects Versions: Impala 2.12.0
Reporter: Sailesh Mukil


On deployments that use Heimdal Kerberos configured with 'auth_to_local' rules 
set, and with the Impala startup flag 'use_kudu_kinit'=true, the auth_to_local 
rules will not be respected, as they are not supported by Kudu's kinit.

The implication of this is that from Impala 2.12.0 onwards, clusters with the 
above configuration will not be able to use KRPC with kerberos enabled.

A workaround is to get rid of the auth_to_local rules for such deployments.

We need a good long-term solution to fix this.
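
For background, auth_to_local rules map Kerberos principals to local 
usernames. A heavily simplified sketch of the idea (real rules are 
regex-driven and realm-aware; this only illustrates the common default of 
taking the first component of the principal):

{code:python}
# Very simplified sketch of auth_to_local-style principal mapping: real rules
# are regex-driven; this just maps primary/instance@REALM -> primary.
def default_auth_to_local(principal):
    primary = principal.split("@", 1)[0]     # strip @REALM
    return primary.split("/", 1)[0]          # strip /instance

print(default_auth_to_local("impala/host1.example.com@EXAMPLE.COM"))
{code}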



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-6813) Hedged reads metrics broken when scanning non-HDFS based table

2018-05-24 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-6813.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0
   Impala 2.13.0

> Hedged reads metrics broken when scanning non-HDFS based table
> --
>
> Key: IMPALA-6813
> URL: https://issues.apache.org/jira/browse/IMPALA-6813
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.12.0
>Reporter: Mostafa Mokhtar
>Assignee: Sailesh Mukil
>Priority: Blocker
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> When preads are enabled, ADLS scans can fail while updating the hedged reads 
> metrics:
> {code}
> (gdb) bt
> #0  0x003346c32625 in raise () from /lib64/libc.so.6
> #1  0x003346c33e05 in abort () from /lib64/libc.so.6
> #2  0x7f185be140b5 in os::abort(bool) ()
>from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so
> #3  0x7f185bfb6443 in VMError::report_and_die() ()
>from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so
> #4  0x7f185be195bf in JVM_handle_linux_signal ()
>from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so
> #5  0x7f185be0fb03 in signalHandler(int, siginfo*, void*) ()
>from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so
> #6  
> #7  0x7f185bbc1a7b in jni_invoke_nonstatic(JNIEnv_*, JavaValue*, 
> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) ()
>from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so
> #8  0x7f185bbc7e81 in jni_CallObjectMethodV ()
>from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so
> #9  0x0212e2b7 in invokeMethod ()
> #10 0x02131297 in hdfsGetHedgedReadMetrics ()
> #11 0x011601c0 in impala::io::ScanRange::Close() ()
> #12 0x01158a95 in 
> impala::io::DiskIoMgr::HandleReadFinished(impala::io::DiskIoMgr::DiskQueue*, 
> impala::io::RequestContext*, std::unique_ptr std::default_delete >) ()
> #13 0x01158e1c in 
> impala::io::DiskIoMgr::ReadRange(impala::io::DiskIoMgr::DiskQueue*, 
> impala:---Type  to continue, or q  to quit---
> :io::RequestContext*, impala::io::ScanRange*) ()
> #14 0x01159052 in 
> impala::io::DiskIoMgr::WorkLoop(impala::io::DiskIoMgr::DiskQueue*) ()
> #15 0x00d5fcaf in 
> impala::Thread::SuperviseThread(std::basic_string std::char_traits, std::allocator > const&, 
> std::basic_string 
> const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) ()
> #16 0x00d604aa in boost::detail::thread_data void (*)(std::basic_string > const&, std::basic_string std::allocator > const&, boost::function, 
> impala::ThreadDebugInfo const*, impala::Promise*), 
> boost::_bi::list5 std::char_traits, std::allocator > >, 
> boost::_bi::value std::allocator > >, boost::_bi::value, 
> boost::_bi::value, 
> boost::_bi::value > > >::run() ()
> #17 0x012d6dfa in ?? ()
> #18 0x003347007aa1 in start_thread () from /lib64/libpthread.so.0
> #19 0x003346ce893d in clone () from /lib64/libc.so.6
> {code}
> {code}
> CREATE TABLE adls.lineitem (
>   l_orderkey BIGINT,
>   l_partkey BIGINT,
>   l_suppkey BIGINT,
>   l_linenumber BIGINT,
>   l_quantity DOUBLE,
>   l_extendedprice DOUBLE,
>   l_discount DOUBLE,
>   l_tax DOUBLE,
>   l_returnflag STRING,
>   l_linestatus STRING,
>   l_commitdate STRING,
>   l_receiptdate STRING,
>   l_shipinstruct STRING,
>   l_shipmode STRING,
>   l_comment STRING,
>   l_shipdate STRING
> )
> STORED AS PARQUET
> LOCATION 'adl://foo.azuredatalakestore.net/adls-test.db/lineitem'
> {code}
> select * from adls.lineitem limit 10;
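
The trace above suggests ScanRange::Close() fetched the HDFS hedged-read 
metrics unconditionally; a sketch of the kind of guard the fix implies 
(hypothetical names, and it assumes the fix is simply to skip the HDFS-only 
metrics call for non-HDFS handles):

{code:python}
# Sketch: only query hedged-read metrics when the underlying filesystem is
# HDFS; ADLS/S3 handles have no hedged-read state to report. Names are
# hypothetical, not Impala's C++ API.
class FakeAdlsHandle:
    scheme = "adl"
    def get_hedged_read_metrics(self):
        raise RuntimeError("no hedged reads outside HDFS")

def close_scan_range(fs_handle, metrics):
    if fs_handle.scheme == "hdfs":           # guard: metrics are HDFS-only
        metrics.update(fs_handle.get_hedged_read_metrics())
    # ... release buffers, close the file handle, etc.

close_scan_range(FakeAdlsHandle(), {})   # safe: guard skips the HDFS-only call
{code}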



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IMPALA-6998) test_bloom_wait_time fails due to late arrival of filters on Isilon

2018-05-24 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-6998.
---
   Resolution: Fixed
Fix Version/s: Impala 2.13.0

> test_bloom_wait_time fails due to late arrival of filters on Isilon
> ---
>
> Key: IMPALA-6998
> URL: https://issues.apache.org/jira/browse/IMPALA-6998
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.13.0
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Critical
>  Labels: broken-build
> Fix For: Impala 2.13.0
>
>
> This is likely a flaky issue and was seen on an instance of an Isilon run:
> {code:java}
> Error Message
> query_test/test_runtime_filters.py:92: in test_bloom_wait_time assert 
> duration < 60, \ E   AssertionError: Query took too long (118.044356108s, 
> possibly waiting for missing filters?) E   assert 118.04435610771179 < 60
> Stacktrace
> query_test/test_runtime_filters.py:92: in test_bloom_wait_time
> assert duration < 60, \
> E   AssertionError: Query took too long (118.044356108s, possibly waiting for 
> missing filters?)
> E   assert 118.04435610771179 < 60
> Standard Error
> -- executing against localhost:21000
> use functional_parquet;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> SET RUNTIME_FILTER_WAIT_TIME_MS=60;
> -- executing against localhost:21000
> SET RUNTIME_FILTER_MODE=GLOBAL;
> -- executing against localhost:21000
> SET RUNTIME_FILTER_MAX_SIZE=64K;
> -- executing against localhost:21000
> with l as (select * from tpch.lineitem UNION ALL select * from tpch.lineitem)
> select STRAIGHT_JOIN count(*) from (select * from tpch.lineitem a LIMIT 1) a
> join (select * from l LIMIT 50) b on a.l_orderkey = -b.l_orderkey;
> -- executing against localhost:21000
> SET RUNTIME_FILTER_WAIT_TIME_MS="0";
> -- executing against localhost:21000
> SET RUNTIME_FILTER_MODE="GLOBAL";
> -- executing against localhost:21000
> SET RUNTIME_FILTER_MAX_SIZE="16777216";
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IMPALA-4970) Record identity of largest latency ExecQueryFInstances() RPC per query

2018-05-21 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil updated IMPALA-4970:
--
Labels: newbie ramp-up  (was: newbie)

> Record identity of largest latency ExecQueryFInstances() RPC per query
> --
>
> Key: IMPALA-4970
> URL: https://issues.apache.org/jira/browse/IMPALA-4970
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 2.9.0
>Reporter: Henry Robinson
>Assignee: Rahul Shivu Mahadev
>Priority: Major
>  Labels: newbie, ramp-up
>
> Although we retain the histogram of fragment instance startup latencies, we 
> don't record the identity of the most expensive instance, or the host it runs 
> on. This would be helpful in diagnosing slow query start-up times.
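
Tracking the maximum alongside the histogram is cheap: one compare under a lock, plus the instance and host labels, per RPC. A minimal sketch of what such a tracker could look like, with all names hypothetical rather than Impala's actual code:

{code}
#include <cstdint>
#include <mutex>
#include <string>

// Sketch only: remembers the slowest ExecQueryFInstances() RPC for a query.
class SlowestRpcTracker {
 public:
  // Called once per completed RPC with its target host and latency.
  void Observe(const std::string& host, const std::string& instance_id,
               int64_t latency_ms) {
    std::lock_guard<std::mutex> l(lock_);
    if (latency_ms > max_latency_ms_) {
      max_latency_ms_ = latency_ms;
      slowest_host_ = host;
      slowest_instance_id_ = instance_id;
    }
  }

  // Rendered into the query profile once startup completes.
  std::string ToString() const {
    std::lock_guard<std::mutex> l(lock_);
    if (max_latency_ms_ < 0) return "no RPCs observed";
    return slowest_instance_id_ + " on " + slowest_host_ + " took " +
           std::to_string(max_latency_ms_) + "ms";
  }

 private:
  mutable std::mutex lock_;
  int64_t max_latency_ms_ = -1;
  std::string slowest_host_;
  std::string slowest_instance_id_;
};
{code}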



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-4970) Record identity of largest latency ExecQueryFInstances() RPC per query

2018-05-21 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil reassigned IMPALA-4970:
-

Assignee: Rahul Shivu Mahadev

> Record identity of largest latency ExecQueryFInstances() RPC per query
> --
>
> Key: IMPALA-4970
> URL: https://issues.apache.org/jira/browse/IMPALA-4970
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 2.9.0
>Reporter: Henry Robinson
>Assignee: Rahul Shivu Mahadev
>Priority: Major
>  Labels: newbie
>
> Although we retain the histogram of fragment instance startup latencies, we 
> don't record the identity of the most expensive instance, or the host it runs 
> on. This would be helpful in diagnosing slow query start-up times.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7005) The 3.0 changelog page should mention the previous version

2018-05-14 Thread Sailesh Mukil (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474698#comment-16474698
 ] 

Sailesh Mukil commented on IMPALA-7005:
---

[~arodoni_cloudera] 
https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release

Step 15 in the above link should have what you need.

> The 3.0 changelog page should mention the previous version
> --
>
> Key: IMPALA-7005
> URL: https://issues.apache.org/jira/browse/IMPALA-7005
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Affects Versions: Impala 3.0
>Reporter: Lars Volker
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: website
>
> It's not clear to me from looking at the [3.0 
> changelog|http://impala.apache.org/docs/changelog-3.0.html] what base version 
> it compares to. Can we add a sentence to point out that these changes are in 
> comparison to 2.11 (are they?)?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7008) TestSpillingDebugActionDimensions.test_spilling test setup fails

2018-05-10 Thread Sailesh Mukil (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16470686#comment-16470686
 ] 

Sailesh Mukil commented on IMPALA-7008:
---

Thanks, Tim. [~dknupp], as part of IMPALA-6600, could you also see if the 
information from this JIRA helps you resolve both issues?

> TestSpillingDebugActionDimensions.test_spilling test setup fails
> 
>
> Key: IMPALA-7008
> URL: https://issues.apache.org/jira/browse/IMPALA-7008
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0, Impala 2.13.0
>Reporter: Sailesh Mukil
>Priority: Blocker
>  Labels: broken-build
>
> We've seen multiple instances of this test failing with the following error:
> {code:java}
> Error Message
> test setup failure
> Stacktrace
> Slave 'gw0' crashed while running 
> "query_test/test_spilling.py::TestSpillingDebugActionDimensions::()::test_spilling[exec_option:
>  {'debug_action': None, 'default_spillable_buffer_size': '256k'} | 
> table_format: parquet/none]"
> {code}
> We need to investigate why this is happening.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-6998) test_bloom_wait_time fails due to late arrival of filters on Isilon

2018-05-10 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil updated IMPALA-6998:
--
Summary: test_bloom_wait_time fails due to late arrival of filters on 
Isilon  (was: test_bloom_wait_time fails due to late arrival of filters)

> test_bloom_wait_time fails due to late arrival of filters on Isilon
> ---
>
> Key: IMPALA-6998
> URL: https://issues.apache.org/jira/browse/IMPALA-6998
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.13.0
>Reporter: Sailesh Mukil
>Priority: Critical
>  Labels: broken-build
>
> This is likely a flaky issue and was seen on an instance of an Isilon run:
> {code:java}
> Error Message
> query_test/test_runtime_filters.py:92: in test_bloom_wait_time assert 
> duration < 60, \ E   AssertionError: Query took too long (118.044356108s, 
> possibly waiting for missing filters?) E   assert 118.04435610771179 < 60
> Stacktrace
> query_test/test_runtime_filters.py:92: in test_bloom_wait_time
> assert duration < 60, \
> E   AssertionError: Query took too long (118.044356108s, possibly waiting for 
> missing filters?)
> E   assert 118.04435610771179 < 60
> Standard Error
> -- executing against localhost:21000
> use functional_parquet;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> SET RUNTIME_FILTER_WAIT_TIME_MS=60;
> -- executing against localhost:21000
> SET RUNTIME_FILTER_MODE=GLOBAL;
> -- executing against localhost:21000
> SET RUNTIME_FILTER_MAX_SIZE=64K;
> -- executing against localhost:21000
> with l as (select * from tpch.lineitem UNION ALL select * from tpch.lineitem)
> select STRAIGHT_JOIN count(*) from (select * from tpch.lineitem a LIMIT 1) a
> join (select * from l LIMIT 50) b on a.l_orderkey = -b.l_orderkey;
> -- executing against localhost:21000
> SET RUNTIME_FILTER_WAIT_TIME_MS="0";
> -- executing against localhost:21000
> SET RUNTIME_FILTER_MODE="GLOBAL";
> -- executing against localhost:21000
> SET RUNTIME_FILTER_MAX_SIZE="16777216";
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7008) TestSpillingDebugActionDimensions.test_spilling test setup fails

2018-05-10 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil updated IMPALA-7008:
--
Labels: broken-build  (was: )

> TestSpillingDebugActionDimensions.test_spilling test setup fails
> 
>
> Key: IMPALA-7008
> URL: https://issues.apache.org/jira/browse/IMPALA-7008
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0, Impala 2.13.0
>Reporter: Sailesh Mukil
>Priority: Blocker
>  Labels: broken-build
>
> We've seen multiple instances of this test failing with the following error:
> {code:java}
> Error Message
> test setup failure
> Stacktrace
> Slave 'gw0' crashed while running 
> "query_test/test_spilling.py::TestSpillingDebugActionDimensions::()::test_spilling[exec_option:
>  {'debug_action': None, 'default_spillable_buffer_size': '256k'} | 
> table_format: parquet/none]"
> {code}
> We need to investigate why this is happening.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7009) test_drop_table_with_purge fails on Isilon

2018-05-10 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-7009:
-

 Summary: test_drop_table_with_purge fails on Isilon
 Key: IMPALA-7009
 URL: https://issues.apache.org/jira/browse/IMPALA-7009
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.0, Impala 2.13.0
Reporter: Sailesh Mukil


We've seen multiple failures of test_drop_table_with_purge
{code:java}

metadata.test_ddl.TestDdlStatements.test_drop_table_with_purge (from pytest)

Failing for the past 1 build (Since Failed#22 )
Took 18 sec.

Error Message
metadata/test_ddl.py:72: in test_drop_table_with_purge assert not 
self.filesystem_client.exists(\ E   assert not True E+  where True = >('user/jenkins/.Trash/Current/test-warehouse/test_drop_table_with_purge_58c75c18.db/t2')
 E+where > = 
.exists E
+  where  = .filesystem_client 
E+and   
'user/jenkins/.Trash/Current/test-warehouse/test_drop_table_with_purge_58c75c18.db/t2'
 = ('jenkins', 
'test_drop_table_with_purge_58c75c18') E+  where  = 
'user/{0}/.Trash/Current/test-warehouse/{1}.db/t2'.format E+  and   
'jenkins' = () E+where  = getpass.getuser
Stacktrace
metadata/test_ddl.py:72: in test_drop_table_with_purge
assert not self.filesystem_client.exists(\
E   assert not True
E+  where True = >('user/jenkins/.Trash/Current/test-warehouse/test_drop_table_with_purge_58c75c18.db/t2')
E+where > = 
.exists
E+  where  = .filesystem_client
E+and   
'user/jenkins/.Trash/Current/test-warehouse/test_drop_table_with_purge_58c75c18.db/t2'
 = ('jenkins', 
'test_drop_table_with_purge_58c75c18')
E+  where  = 
'user/{0}/.Trash/Current/test-warehouse/{1}.db/t2'.format
E+  and   'jenkins' = ()
E+where  = getpass.getuser

Standard Error
-- connecting to: localhost:21000
SET sync_ddl=False;
-- executing against localhost:21000
DROP DATABASE IF EXISTS `test_drop_table_with_purge_58c75c18` CASCADE;

SET sync_ddl=False;
-- executing against localhost:21000
CREATE DATABASE `test_drop_table_with_purge_58c75c18`;

MainThread: Created database "test_drop_table_with_purge_58c75c18" for test ID 
"metadata/test_ddl.py::TestDdlStatements::()::test_drop_table_with_purge"
-- executing against localhost:21000
create table test_drop_table_with_purge_58c75c18.t1(i int);

-- executing against localhost:21000
create table test_drop_table_with_purge_58c75c18.t2(i int);

MainThread: Starting new HTTP connection (1): 10.17.95.12
MainThread: Starting new HTTP connection (1): 10.17.95.12
MainThread: Starting new HTTP connection (1): 10.17.95.12
MainThread: Starting new HTTP connection (1): 10.17.95.12
-- executing against localhost:21000
drop table test_drop_table_with_purge_58c75c18.t1;

MainThread: Starting new HTTP connection (1): 10.17.95.12
MainThread: Starting new HTTP connection (1): 10.17.95.12
MainThread: Starting new HTTP connection (1): 10.17.95.12
MainThread: Starting new HTTP connection (1): 10.17.95.12
-- executing against localhost:21000
drop table test_drop_table_with_purge_58c75c18.t2 purge;

MainThread: Starting new HTTP connection (1): 10.17.95.12
MainThread: Starting new HTTP connection (1): 10.17.95.12
MainThread: Starting new HTTP connection (1): 10.17.95.12
MainThread: Starting new HTTP connection (1): 10.17.95.12
{code}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7008) TestSpillingDebugActionDimensions.test_spilling test setup fails

2018-05-10 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-7008:
-

 Summary: TestSpillingDebugActionDimensions.test_spilling test 
setup fails
 Key: IMPALA-7008
 URL: https://issues.apache.org/jira/browse/IMPALA-7008
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.0, Impala 2.13.0
Reporter: Sailesh Mukil


We've seen multiple instances of this test failing with the following error:


{code:java}

Error Message
test setup failure

Stacktrace
Slave 'gw0' crashed while running 
"query_test/test_spilling.py::TestSpillingDebugActionDimensions::()::test_spilling[exec_option:
 {'debug_action': None, 'default_spillable_buffer_size': '256k'} | 
table_format: parquet/none]"
{code}

We need to investigate why this is happening.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-6975) TestRuntimeRowFilters.test_row_filters failing with Memory limit exceeded

2018-05-08 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-6975.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> TestRuntimeRowFilters.test_row_filters failing with Memory limit exceeded
> -
>
> Key: IMPALA-6975
> URL: https://issues.apache.org/jira/browse/IMPALA-6975
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Critical
>  Labels: build-failure
> Fix For: Impala 3.1.0
>
>
> {code:java}
> Error Message
> query_test/test_runtime_filters.py:171: in test_row_filters 
> test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS' : str(WAIT_TIME_MS)}) 
> common/impala_test_suite.py:405: in run_test_case result = 
> self.__execute_query(target_impalad_client, query, user=user) 
> common/impala_test_suite.py:620: in __execute_query return 
> impalad_client.execute(query, user=user) common/impala_connection.py:160: in 
> execute return self.__beeswax_client.execute(sql_stmt, user=user) 
> beeswax/impala_beeswax.py:173: in execute handle = 
> self.__execute_query(query_string.strip(), user=user) 
> beeswax/impala_beeswax.py:341: in __execute_query 
> self.wait_for_completion(handle) beeswax/impala_beeswax.py:361: in 
> wait_for_completion raise ImpalaBeeswaxException("Query aborted:" + 
> error_log, None) E   ImpalaBeeswaxException: ImpalaBeeswaxException: E
> Query aborted:Memory limit exceeded: ParquetColumnReader::ReadDataPage() 
> failed to allocate 65533 bytes for decompressed data. E   HDFS_SCAN_NODE 
> (id=0) could not allocate 64.00 KB without exceeding limit. E   Error 
> occurred on backend 
> impala-boost-static-burst-slave-el7-1aa3.vpc.cloudera.com:22001 by fragment 
> f9423526590ed30b:732f1db10001 E   Memory left in process limit: 11.14 GB 
> E   Memory left in query limit: 22.16 KB E   
> Query(f9423526590ed30b:732f1db1): Limit=200.00 MB Reservation=160.00 
> MB ReservationLimit=160.00 MB OtherMemory=39.98 MB Total=199.98 MB 
> Peak=199.98 MB E Fragment f9423526590ed30b:732f1db10007: 
> Reservation=134.00 MB OtherMemory=30.22 MB Total=164.22 MB Peak=164.22 MB E   
> Runtime Filter Bank: Reservation=2.00 MB ReservationLimit=2.00 MB 
> OtherMemory=0 Total=2.00 MB Peak=2.00 MB E   AGGREGATION_NODE (id=3): 
> Total=4.00 KB Peak=4.00 KB E Exprs: Total=4.00 KB Peak=4.00 KB E  
>  HASH_JOIN_NODE (id=2): Reservation=132.00 MB OtherMemory=198.25 KB 
> Total=132.19 MB Peak=132.19 MB E Exprs: Total=25.12 KB Peak=25.12 KB 
> E Hash Join Builder (join_node_id=2): Total=157.12 KB Peak=157.12 KB 
> E   Hash Join Builder (join_node_id=2) Exprs: Total=149.12 KB 
> Peak=149.12 KB E   EXCHANGE_NODE (id=4): Reservation=14.91 MB 
> OtherMemory=67.76 KB Total=14.97 MB Peak=14.97 MB E KrpcDeferredRpcs: 
> Total=67.76 KB Peak=67.76 KB E   EXCHANGE_NODE (id=5): Reservation=15.03 
> MB OtherMemory=0 Total=15.03 MB Peak=15.03 MB E KrpcDeferredRpcs: 
> Total=0 Peak=45.12 KB E   KrpcDataStreamSender (dst_id=6): Total=16.00 KB 
> Peak=16.00 KB E Fragment f9423526590ed30b:732f1db10001: 
> Reservation=26.00 MB OtherMemory=9.75 MB Total=35.75 MB Peak=35.75 MB E   
> Runtime Filter Bank: Reservation=2.00 MB ReservationLimit=2.00 MB 
> OtherMemory=0 Total=2.00 MB Peak=2.00 MB E   HDFS_SCAN_NODE (id=0): 
> Reservation=24.00 MB OtherMemory=9.30 MB Total=33.30 MB Peak=33.30 MB E   
>   Exprs: Total=260.00 KB Peak=260.00 KB E   KrpcDataStreamSender 
> (dst_id=4): Total=426.57 KB Peak=458.57 KB E KrpcDataStreamSender 
> (dst_id=4) Exprs: Total=256.00 KB Peak=256.00 KB E Fragment 
> f9423526590ed30b:732f1db10004: Reservation=0 OtherMemory=0 Total=0 
> Peak=29.59 MB E   HDFS_SCAN_NODE (id=1): Reservation=0 OtherMemory=0 
> Total=0 Peak=29.32 MB E   KrpcDataStreamSender (dst_id=5): Total=0 
> Peak=266.57 KB EE   Memory limit exceeded: 
> ParquetColumnReader::ReadDataPage() failed to allocate 65533 bytes for 
> decompressed data. E   HDFS_SCAN_NODE (id=0) could not allocate 64.00 KB 
> without exceeding limit. E   Error occurred on backend 
> impala-boost-static-burst-slave-el7-1aa3.vpc.cloudera.com:22001 by fragment 
> f9423526590ed30b:732f1db10001 E   Memory left in process limit: 11.14 GB 
> E   Memory left in query limit: 22.16 KB E   
> Query(f9423526590ed30b:732f1db1): Limit=200.00 MB Reservation=160.00 
> MB ReservationLimit=160.00 MB OtherMemory=39.98 MB Total=199.98 MB 
> Peak=199.98 MB E Fragment f9423526590ed30b:732f1db10007: 
> Reservation=134.00 MB OtherMemory=30.22 MB Total=164.22 MB Peak=164.22 MB E   


[jira] [Assigned] (IMPALA-6975) TestRuntimeRowFilters.test_row_filters failing with Memory limit exceeded

2018-05-08 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil reassigned IMPALA-6975:
-

Assignee: Sailesh Mukil

> TestRuntimeRowFilters.test_row_filters failing with Memory limit exceeded
> -
>
> Key: IMPALA-6975
> URL: https://issues.apache.org/jira/browse/IMPALA-6975
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Critical
>  Labels: build-failure
>
> {code:java}
> Error Message
> query_test/test_runtime_filters.py:171: in test_row_filters 
> test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS' : str(WAIT_TIME_MS)}) 
> common/impala_test_suite.py:405: in run_test_case result = 
> self.__execute_query(target_impalad_client, query, user=user) 
> common/impala_test_suite.py:620: in __execute_query return 
> impalad_client.execute(query, user=user) common/impala_connection.py:160: in 
> execute return self.__beeswax_client.execute(sql_stmt, user=user) 
> beeswax/impala_beeswax.py:173: in execute handle = 
> self.__execute_query(query_string.strip(), user=user) 
> beeswax/impala_beeswax.py:341: in __execute_query 
> self.wait_for_completion(handle) beeswax/impala_beeswax.py:361: in 
> wait_for_completion raise ImpalaBeeswaxException("Query aborted:" + 
> error_log, None) E   ImpalaBeeswaxException: ImpalaBeeswaxException: E
> Query aborted:Memory limit exceeded: ParquetColumnReader::ReadDataPage() 
> failed to allocate 65533 bytes for decompressed data. E   HDFS_SCAN_NODE 
> (id=0) could not allocate 64.00 KB without exceeding limit. E   Error 
> occurred on backend 
> impala-boost-static-burst-slave-el7-1aa3.vpc.cloudera.com:22001 by fragment 
> f9423526590ed30b:732f1db10001 E   Memory left in process limit: 11.14 GB 
> E   Memory left in query limit: 22.16 KB E   
> Query(f9423526590ed30b:732f1db1): Limit=200.00 MB Reservation=160.00 
> MB ReservationLimit=160.00 MB OtherMemory=39.98 MB Total=199.98 MB 
> Peak=199.98 MB E Fragment f9423526590ed30b:732f1db10007: 
> Reservation=134.00 MB OtherMemory=30.22 MB Total=164.22 MB Peak=164.22 MB E   
> Runtime Filter Bank: Reservation=2.00 MB ReservationLimit=2.00 MB 
> OtherMemory=0 Total=2.00 MB Peak=2.00 MB E   AGGREGATION_NODE (id=3): 
> Total=4.00 KB Peak=4.00 KB E Exprs: Total=4.00 KB Peak=4.00 KB E  
>  HASH_JOIN_NODE (id=2): Reservation=132.00 MB OtherMemory=198.25 KB 
> Total=132.19 MB Peak=132.19 MB E Exprs: Total=25.12 KB Peak=25.12 KB 
> E Hash Join Builder (join_node_id=2): Total=157.12 KB Peak=157.12 KB 
> E   Hash Join Builder (join_node_id=2) Exprs: Total=149.12 KB 
> Peak=149.12 KB E   EXCHANGE_NODE (id=4): Reservation=14.91 MB 
> OtherMemory=67.76 KB Total=14.97 MB Peak=14.97 MB E KrpcDeferredRpcs: 
> Total=67.76 KB Peak=67.76 KB E   EXCHANGE_NODE (id=5): Reservation=15.03 
> MB OtherMemory=0 Total=15.03 MB Peak=15.03 MB E KrpcDeferredRpcs: 
> Total=0 Peak=45.12 KB E   KrpcDataStreamSender (dst_id=6): Total=16.00 KB 
> Peak=16.00 KB E Fragment f9423526590ed30b:732f1db10001: 
> Reservation=26.00 MB OtherMemory=9.75 MB Total=35.75 MB Peak=35.75 MB E   
> Runtime Filter Bank: Reservation=2.00 MB ReservationLimit=2.00 MB 
> OtherMemory=0 Total=2.00 MB Peak=2.00 MB E   HDFS_SCAN_NODE (id=0): 
> Reservation=24.00 MB OtherMemory=9.30 MB Total=33.30 MB Peak=33.30 MB E   
>   Exprs: Total=260.00 KB Peak=260.00 KB E   KrpcDataStreamSender 
> (dst_id=4): Total=426.57 KB Peak=458.57 KB E KrpcDataStreamSender 
> (dst_id=4) Exprs: Total=256.00 KB Peak=256.00 KB E Fragment 
> f9423526590ed30b:732f1db10004: Reservation=0 OtherMemory=0 Total=0 
> Peak=29.59 MB E   HDFS_SCAN_NODE (id=1): Reservation=0 OtherMemory=0 
> Total=0 Peak=29.32 MB E   KrpcDataStreamSender (dst_id=5): Total=0 
> Peak=266.57 KB EE   Memory limit exceeded: 
> ParquetColumnReader::ReadDataPage() failed to allocate 65533 bytes for 
> decompressed data. E   HDFS_SCAN_NODE (id=0) could not allocate 64.00 KB 
> without exceeding limit. E   Error occurred on backend 
> impala-boost-static-burst-slave-el7-1aa3.vpc.cloudera.com:22001 by fragment 
> f9423526590ed30b:732f1db10001 E   Memory left in process limit: 11.14 GB 
> E   Memory left in query limit: 22.16 KB E   
> Query(f9423526590ed30b:732f1db1): Limit=200.00 MB Reservation=160.00 
> MB ReservationLimit=160.00 MB OtherMemory=39.98 MB Total=199.98 MB 
> Peak=199.98 MB E Fragment f9423526590ed30b:732f1db10007: 
> Reservation=134.00 MB OtherMemory=30.22 MB Total=164.22 MB Peak=164.22 MB E   
> Runtime Filter Bank: Reservation=2.00 MB 

[jira] [Updated] (IMPALA-6998) test_bloom_wait_time fails due to late arrival of filters

2018-05-08 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil updated IMPALA-6998:
--
Description: 
This is likely a flaky issue and was seen on an instance of an Isilon run:

{code:java}
Error Message
query_test/test_runtime_filters.py:92: in test_bloom_wait_time assert 
duration < 60, \ E   AssertionError: Query took too long (118.044356108s, 
possibly waiting for missing filters?) E   assert 118.04435610771179 < 60
Stacktrace
query_test/test_runtime_filters.py:92: in test_bloom_wait_time
assert duration < 60, \
E   AssertionError: Query took too long (118.044356108s, possibly waiting for 
missing filters?)
E   assert 118.04435610771179 < 60
Standard Error
-- executing against localhost:21000
use functional_parquet;

SET batch_size=0;
SET num_nodes=0;
SET disable_codegen_rows_threshold=0;
SET disable_codegen=False;
SET abort_on_error=1;
SET exec_single_node_rows_threshold=0;
-- executing against localhost:21000

SET RUNTIME_FILTER_WAIT_TIME_MS=60;

-- executing against localhost:21000

SET RUNTIME_FILTER_MODE=GLOBAL;

-- executing against localhost:21000

SET RUNTIME_FILTER_MAX_SIZE=64K;

-- executing against localhost:21000

with l as (select * from tpch.lineitem UNION ALL select * from tpch.lineitem)
select STRAIGHT_JOIN count(*) from (select * from tpch.lineitem a LIMIT 1) a
join (select * from l LIMIT 50) b on a.l_orderkey = -b.l_orderkey;

-- executing against localhost:21000
SET RUNTIME_FILTER_WAIT_TIME_MS="0";

-- executing against localhost:21000
SET RUNTIME_FILTER_MODE="GLOBAL";

-- executing against localhost:21000
SET RUNTIME_FILTER_MAX_SIZE="16777216";
{code}


  was:
This is likely a flaky issue:

{code:java}
Error Message
query_test/test_runtime_filters.py:92: in test_bloom_wait_time assert 
duration < 60, \ E   AssertionError: Query took too long (118.044356108s, 
possibly waiting for missing filters?) E   assert 118.04435610771179 < 60
Stacktrace
query_test/test_runtime_filters.py:92: in test_bloom_wait_time
assert duration < 60, \
E   AssertionError: Query took too long (118.044356108s, possibly waiting for 
missing filters?)
E   assert 118.04435610771179 < 60
Standard Error
-- executing against localhost:21000
use functional_parquet;

SET batch_size=0;
SET num_nodes=0;
SET disable_codegen_rows_threshold=0;
SET disable_codegen=False;
SET abort_on_error=1;
SET exec_single_node_rows_threshold=0;
-- executing against localhost:21000

SET RUNTIME_FILTER_WAIT_TIME_MS=60;

-- executing against localhost:21000

SET RUNTIME_FILTER_MODE=GLOBAL;

-- executing against localhost:21000

SET RUNTIME_FILTER_MAX_SIZE=64K;

-- executing against localhost:21000

with l as (select * from tpch.lineitem UNION ALL select * from tpch.lineitem)
select STRAIGHT_JOIN count(*) from (select * from tpch.lineitem a LIMIT 1) a
join (select * from l LIMIT 50) b on a.l_orderkey = -b.l_orderkey;

-- executing against localhost:21000
SET RUNTIME_FILTER_WAIT_TIME_MS="0";

-- executing against localhost:21000
SET RUNTIME_FILTER_MODE="GLOBAL";

-- executing against localhost:21000
SET RUNTIME_FILTER_MAX_SIZE="16777216";
{code}



> test_bloom_wait_time fails due to late arrival of filters
> -
>
> Key: IMPALA-6998
> URL: https://issues.apache.org/jira/browse/IMPALA-6998
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.13.0
>Reporter: Sailesh Mukil
>Priority: Critical
>  Labels: broken-build
>
> This is likely a flaky issue and was seen on an instance of an Isilon run:
> {code:java}
> Error Message
> query_test/test_runtime_filters.py:92: in test_bloom_wait_time assert 
> duration < 60, \ E   AssertionError: Query took too long (118.044356108s, 
> possibly waiting for missing filters?) E   assert 118.04435610771179 < 60
> Stacktrace
> query_test/test_runtime_filters.py:92: in test_bloom_wait_time
> assert duration < 60, \
> E   AssertionError: Query took too long (118.044356108s, possibly waiting for 
> missing filters?)
> E   assert 118.04435610771179 < 60
> Standard Error
> -- executing against localhost:21000
> use functional_parquet;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> SET RUNTIME_FILTER_WAIT_TIME_MS=60;
> -- executing against localhost:21000
> SET RUNTIME_FILTER_MODE=GLOBAL;
> -- executing against localhost:21000
> SET RUNTIME_FILTER_MAX_SIZE=64K;
> -- executing against localhost:21000
> with l as (select * from tpch.lineitem UNION ALL select * from tpch.lineitem)
> select STRAIGHT_JOIN count(*) from (select * from tpch.lineitem a LIMIT 1) a
> join (select * from l LIMIT 50) 

[jira] [Created] (IMPALA-6998) test_bloom_wait_time fails due to late arrival of filters

2018-05-08 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-6998:
-

 Summary: test_bloom_wait_time fails due to late arrival of filters
 Key: IMPALA-6998
 URL: https://issues.apache.org/jira/browse/IMPALA-6998
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 2.13.0
Reporter: Sailesh Mukil


This is likely a flaky issue:

{code:java}
Error Message
query_test/test_runtime_filters.py:92: in test_bloom_wait_time assert 
duration < 60, \ E   AssertionError: Query took too long (118.044356108s, 
possibly waiting for missing filters?) E   assert 118.04435610771179 < 60
Stacktrace
query_test/test_runtime_filters.py:92: in test_bloom_wait_time
assert duration < 60, \
E   AssertionError: Query took too long (118.044356108s, possibly waiting for 
missing filters?)
E   assert 118.04435610771179 < 60
Standard Error
-- executing against localhost:21000
use functional_parquet;

SET batch_size=0;
SET num_nodes=0;
SET disable_codegen_rows_threshold=0;
SET disable_codegen=False;
SET abort_on_error=1;
SET exec_single_node_rows_threshold=0;
-- executing against localhost:21000

SET RUNTIME_FILTER_WAIT_TIME_MS=60;

-- executing against localhost:21000

SET RUNTIME_FILTER_MODE=GLOBAL;

-- executing against localhost:21000

SET RUNTIME_FILTER_MAX_SIZE=64K;

-- executing against localhost:21000

with l as (select * from tpch.lineitem UNION ALL select * from tpch.lineitem)
select STRAIGHT_JOIN count(*) from (select * from tpch.lineitem a LIMIT 1) a
join (select * from l LIMIT 50) b on a.l_orderkey = -b.l_orderkey;

-- executing against localhost:21000
SET RUNTIME_FILTER_WAIT_TIME_MS="0";

-- executing against localhost:21000
SET RUNTIME_FILTER_MODE="GLOBAL";

-- executing against localhost:21000
SET RUNTIME_FILTER_MAX_SIZE="16777216";
{code}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-6908) IsConnResetTException() should include ECONNRESET

2018-05-08 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-6908.
---
   Resolution: Fixed
Fix Version/s: Impala 2.13.0

> IsConnResetTException() should include ECONNRESET
> -
>
> Key: IMPALA-6908
> URL: https://issues.apache.org/jira/browse/IMPALA-6908
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0
>Reporter: Michael Ho
>Assignee: Sailesh Mukil
>Priority: Major
> Fix For: Impala 2.13.0
>
>
> {{IsConnReset()}} aims to check if the given exception is due to a stale 
> connection. Apparently, it's missing the case in which the error code is 
> ECONNRESET.
> {noformat}
> bool IsConnResetTException(const TTransportException& e) {
>   // Strings taken from TTransport::readAll(). This happens iff 
> TSocket::read() returns 0.
>   // As readAll() is reading non-zero length payload, this can only mean 
> recv() called
>   // by read() returns 0. According to man page of recv(), this implies a 
> stream socket
>   // peer has performed an orderly shutdown.
>   return (e.getType() == TTransportException::END_OF_FILE &&
>  strstr(e.what(), "No more data to read.") != nullptr) ||
>  (e.getType() == TTransportException::INTERNAL_ERROR &&
>  strstr(e.what(), "SSL_read: Connection reset by peer") != 
> nullptr);
> }
> {noformat}
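
One way the check could be extended, as a sketch only: the exact TTransportException type Thrift raises for ECONNRESET differs across Thrift versions, so this matches on the message text rather than the type, and is not necessarily the exact fix that shipped.

{noformat}
#include <cstring>
#include <thrift/transport/TTransportException.h>

using apache::thrift::transport::TTransportException;

// Sketch only: the two existing checks, plus a match on "ECONNRESET"
// anywhere in the exception text.
bool IsConnResetTException(const TTransportException& e) {
  return (e.getType() == TTransportException::END_OF_FILE &&
          strstr(e.what(), "No more data to read.") != nullptr) ||
         (e.getType() == TTransportException::INTERNAL_ERROR &&
          strstr(e.what(), "SSL_read: Connection reset by peer") != nullptr) ||
         strstr(e.what(), "ECONNRESET") != nullptr;
}
{noformat}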



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-6990) TestClientSsl.test_tls_v12 failing due to Python SSL error

2018-05-08 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil updated IMPALA-6990:
--
Priority: Blocker  (was: Major)

> TestClientSsl.test_tls_v12 failing due to Python SSL error
> --
>
> Key: IMPALA-6990
> URL: https://issues.apache.org/jira/browse/IMPALA-6990
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.0
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Blocker
>
> We've seen quite a few jobs fail with the following error:
> *_ssl.c:504: EOF occurred in violation of protocol*
> {code:java}
> custom_cluster/test_client_ssl.py:128: in test_tls_v12
> self._validate_positive_cases("%s/server-cert.pem" % self.CERT_DIR)
> custom_cluster/test_client_ssl.py:181: in _validate_positive_cases
> result = run_impala_shell_cmd(shell_options)
> shell/util.py:97: in run_impala_shell_cmd
> result.stderr)
> E   AssertionError: Cmd --ssl -q 'select 1 + 2' was expected to succeed: 
> Starting Impala Shell without Kerberos authentication
> E   SSL is enabled. Impala server certificates will NOT be verified (set 
> --ca_cert to change)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 3th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 4th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 5th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:216:
>  DeprecationWarning: validate is deprecated. Use cert_reqs=ssl.CERT_NONE 
> instead
> E DeprecationWarning)
> E   No handlers could be found for logger "thrift.transport.TSSLSocket"
> E   Error connecting: TTransportException, Could not connect to 
> localhost:21000: [Errno 8] _ssl.c:504: EOF occurred in violation of protocol
> E   Not connected to Impala, could not execute queries.
> {code}
> We need to investigate why this is happening and fix it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-6990) TestClientSsl.test_tls_v12 failing due to Python SSL error

2018-05-08 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-6990:
-

 Summary: TestClientSsl.test_tls_v12 failing due to Python SSL error
 Key: IMPALA-6990
 URL: https://issues.apache.org/jira/browse/IMPALA-6990
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 3.0
Reporter: Sailesh Mukil
Assignee: Sailesh Mukil


We've seen quite a few jobs fail with the following error:
*_ssl.c:504: EOF occurred in violation of protocol*

{code:java}
custom_cluster/test_client_ssl.py:128: in test_tls_v12
self._validate_positive_cases("%s/server-cert.pem" % self.CERT_DIR)
custom_cluster/test_client_ssl.py:181: in _validate_positive_cases
result = run_impala_shell_cmd(shell_options)
shell/util.py:97: in run_impala_shell_cmd
result.stderr)
E   AssertionError: Cmd --ssl -q 'select 1 + 2' was expected to succeed: 
Starting Impala Shell without Kerberos authentication
E   SSL is enabled. Impala server certificates will NOT be verified (set 
--ca_cert to change)
E   
/data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
 DeprecationWarning: 3th positional argument is deprecated. Use keyward 
argument insteand.
E DeprecationWarning)
E   
/data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
 DeprecationWarning: 4th positional argument is deprecated. Use keyward 
argument insteand.
E DeprecationWarning)
E   
/data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
 DeprecationWarning: 5th positional argument is deprecated. Use keyward 
argument insteand.
E DeprecationWarning)
E   
/data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:216:
 DeprecationWarning: validate is deprecated. Use cert_reqs=ssl.CERT_NONE instead
E DeprecationWarning)
E   No handlers could be found for logger "thrift.transport.TSSLSocket"
E   Error connecting: TTransportException, Could not connect to 
localhost:21000: [Errno 8] _ssl.c:504: EOF occurred in violation of protocol
E   Not connected to Impala, could not execute queries.
{code}

We need to investigate why this is happening and fix it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6975) TestRuntimeRowFilters.test_row_filters failing with Memory limit exceeded

2018-05-04 Thread Sailesh Mukil (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464273#comment-16464273
 ] 

Sailesh Mukil commented on IMPALA-6975:
---

Ok thanks [~tarmstrong]. Let me try running a private test with that mem limit 
and see if it fixes the issue.

> TestRuntimeRowFilters.test_row_filters failing with Memory limit exceeded
> -
>
> Key: IMPALA-6975
> URL: https://issues.apache.org/jira/browse/IMPALA-6975
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0
>Reporter: Sailesh Mukil
>Priority: Critical
>  Labels: build-failure
>
> {code:java}
> Error Message
> query_test/test_runtime_filters.py:171: in test_row_filters 
> test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS' : str(WAIT_TIME_MS)}) 
> common/impala_test_suite.py:405: in run_test_case result = 
> self.__execute_query(target_impalad_client, query, user=user) 
> common/impala_test_suite.py:620: in __execute_query return 
> impalad_client.execute(query, user=user) common/impala_connection.py:160: in 
> execute return self.__beeswax_client.execute(sql_stmt, user=user) 
> beeswax/impala_beeswax.py:173: in execute handle = 
> self.__execute_query(query_string.strip(), user=user) 
> beeswax/impala_beeswax.py:341: in __execute_query 
> self.wait_for_completion(handle) beeswax/impala_beeswax.py:361: in 
> wait_for_completion raise ImpalaBeeswaxException("Query aborted:" + 
> error_log, None) E   ImpalaBeeswaxException: ImpalaBeeswaxException: E
> Query aborted:Memory limit exceeded: ParquetColumnReader::ReadDataPage() 
> failed to allocate 65533 bytes for decompressed data. E   HDFS_SCAN_NODE 
> (id=0) could not allocate 64.00 KB without exceeding limit. E   Error 
> occurred on backend 
> impala-boost-static-burst-slave-el7-1aa3.vpc.cloudera.com:22001 by fragment 
> f9423526590ed30b:732f1db10001 E   Memory left in process limit: 11.14 GB 
> E   Memory left in query limit: 22.16 KB E   
> Query(f9423526590ed30b:732f1db1): Limit=200.00 MB Reservation=160.00 
> MB ReservationLimit=160.00 MB OtherMemory=39.98 MB Total=199.98 MB 
> Peak=199.98 MB E Fragment f9423526590ed30b:732f1db10007: 
> Reservation=134.00 MB OtherMemory=30.22 MB Total=164.22 MB Peak=164.22 MB E   
> Runtime Filter Bank: Reservation=2.00 MB ReservationLimit=2.00 MB 
> OtherMemory=0 Total=2.00 MB Peak=2.00 MB E   AGGREGATION_NODE (id=3): 
> Total=4.00 KB Peak=4.00 KB E Exprs: Total=4.00 KB Peak=4.00 KB E  
>  HASH_JOIN_NODE (id=2): Reservation=132.00 MB OtherMemory=198.25 KB 
> Total=132.19 MB Peak=132.19 MB E Exprs: Total=25.12 KB Peak=25.12 KB 
> E Hash Join Builder (join_node_id=2): Total=157.12 KB Peak=157.12 KB 
> E   Hash Join Builder (join_node_id=2) Exprs: Total=149.12 KB 
> Peak=149.12 KB E   EXCHANGE_NODE (id=4): Reservation=14.91 MB 
> OtherMemory=67.76 KB Total=14.97 MB Peak=14.97 MB E KrpcDeferredRpcs: 
> Total=67.76 KB Peak=67.76 KB E   EXCHANGE_NODE (id=5): Reservation=15.03 
> MB OtherMemory=0 Total=15.03 MB Peak=15.03 MB E KrpcDeferredRpcs: 
> Total=0 Peak=45.12 KB E   KrpcDataStreamSender (dst_id=6): Total=16.00 KB 
> Peak=16.00 KB E Fragment f9423526590ed30b:732f1db10001: 
> Reservation=26.00 MB OtherMemory=9.75 MB Total=35.75 MB Peak=35.75 MB E   
> Runtime Filter Bank: Reservation=2.00 MB ReservationLimit=2.00 MB 
> OtherMemory=0 Total=2.00 MB Peak=2.00 MB E   HDFS_SCAN_NODE (id=0): 
> Reservation=24.00 MB OtherMemory=9.30 MB Total=33.30 MB Peak=33.30 MB E   
>   Exprs: Total=260.00 KB Peak=260.00 KB E   KrpcDataStreamSender 
> (dst_id=4): Total=426.57 KB Peak=458.57 KB E KrpcDataStreamSender 
> (dst_id=4) Exprs: Total=256.00 KB Peak=256.00 KB E Fragment 
> f9423526590ed30b:732f1db10004: Reservation=0 OtherMemory=0 Total=0 
> Peak=29.59 MB E   HDFS_SCAN_NODE (id=1): Reservation=0 OtherMemory=0 
> Total=0 Peak=29.32 MB E   KrpcDataStreamSender (dst_id=5): Total=0 
> Peak=266.57 KB EE   Memory limit exceeded: 
> ParquetColumnReader::ReadDataPage() failed to allocate 65533 bytes for 
> decompressed data. E   HDFS_SCAN_NODE (id=0) could not allocate 64.00 KB 
> without exceeding limit. E   Error occurred on backend 
> impala-boost-static-burst-slave-el7-1aa3.vpc.cloudera.com:22001 by fragment 
> f9423526590ed30b:732f1db10001 E   Memory left in process limit: 11.14 GB 
> E   Memory left in query limit: 22.16 KB E   
> Query(f9423526590ed30b:732f1db1): Limit=200.00 MB Reservation=160.00 
> MB ReservationLimit=160.00 MB OtherMemory=39.98 MB Total=199.98 MB 
> Peak=199.98 MB E Fragment f9423526590ed30b:732f1db10007: 
> Reservation=134.00 MB OtherMemory=30.22 MB Total=164.22 MB 

[jira] [Commented] (IMPALA-6975) TestRuntimeRowFilters.test_row_filters failing with Memory limit exceeded

2018-05-04 Thread Sailesh Mukil (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464266#comment-16464266
 ] 

Sailesh Mukil commented on IMPALA-6975:
---

For some reason, this test is consistently failing on exhaustive-RHEL7. There 
are 3 instances of this test failing, all in that same environment.

> TestRuntimeRowFilters.test_row_filters failing with Memory limit exceeded
> -
>
> Key: IMPALA-6975
> URL: https://issues.apache.org/jira/browse/IMPALA-6975
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0
>Reporter: Sailesh Mukil
>Priority: Critical
>  Labels: build-failure
>
> {code:java}
> Error Message
> query_test/test_runtime_filters.py:171: in test_row_filters 
> test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS' : str(WAIT_TIME_MS)}) 
> common/impala_test_suite.py:405: in run_test_case result = 
> self.__execute_query(target_impalad_client, query, user=user) 
> common/impala_test_suite.py:620: in __execute_query return 
> impalad_client.execute(query, user=user) common/impala_connection.py:160: in 
> execute return self.__beeswax_client.execute(sql_stmt, user=user) 
> beeswax/impala_beeswax.py:173: in execute handle = 
> self.__execute_query(query_string.strip(), user=user) 
> beeswax/impala_beeswax.py:341: in __execute_query 
> self.wait_for_completion(handle) beeswax/impala_beeswax.py:361: in 
> wait_for_completion raise ImpalaBeeswaxException("Query aborted:" + 
> error_log, None) E   ImpalaBeeswaxException: ImpalaBeeswaxException: E
> Query aborted:Memory limit exceeded: ParquetColumnReader::ReadDataPage() 
> failed to allocate 65533 bytes for decompressed data. E   HDFS_SCAN_NODE 
> (id=0) could not allocate 64.00 KB without exceeding limit. E   Error 
> occurred on backend 
> impala-boost-static-burst-slave-el7-1aa3.vpc.cloudera.com:22001 by fragment 
> f9423526590ed30b:732f1db10001 E   Memory left in process limit: 11.14 GB 
> E   Memory left in query limit: 22.16 KB E   
> Query(f9423526590ed30b:732f1db1): Limit=200.00 MB Reservation=160.00 
> MB ReservationLimit=160.00 MB OtherMemory=39.98 MB Total=199.98 MB 
> Peak=199.98 MB E Fragment f9423526590ed30b:732f1db10007: 
> Reservation=134.00 MB OtherMemory=30.22 MB Total=164.22 MB Peak=164.22 MB E   
> Runtime Filter Bank: Reservation=2.00 MB ReservationLimit=2.00 MB 
> OtherMemory=0 Total=2.00 MB Peak=2.00 MB E   AGGREGATION_NODE (id=3): 
> Total=4.00 KB Peak=4.00 KB E Exprs: Total=4.00 KB Peak=4.00 KB E  
>  HASH_JOIN_NODE (id=2): Reservation=132.00 MB OtherMemory=198.25 KB 
> Total=132.19 MB Peak=132.19 MB E Exprs: Total=25.12 KB Peak=25.12 KB 
> E Hash Join Builder (join_node_id=2): Total=157.12 KB Peak=157.12 KB 
> E   Hash Join Builder (join_node_id=2) Exprs: Total=149.12 KB 
> Peak=149.12 KB E   EXCHANGE_NODE (id=4): Reservation=14.91 MB 
> OtherMemory=67.76 KB Total=14.97 MB Peak=14.97 MB E KrpcDeferredRpcs: 
> Total=67.76 KB Peak=67.76 KB E   EXCHANGE_NODE (id=5): Reservation=15.03 
> MB OtherMemory=0 Total=15.03 MB Peak=15.03 MB E KrpcDeferredRpcs: 
> Total=0 Peak=45.12 KB E   KrpcDataStreamSender (dst_id=6): Total=16.00 KB 
> Peak=16.00 KB E Fragment f9423526590ed30b:732f1db10001: 
> Reservation=26.00 MB OtherMemory=9.75 MB Total=35.75 MB Peak=35.75 MB E   
> Runtime Filter Bank: Reservation=2.00 MB ReservationLimit=2.00 MB 
> OtherMemory=0 Total=2.00 MB Peak=2.00 MB E   HDFS_SCAN_NODE (id=0): 
> Reservation=24.00 MB OtherMemory=9.30 MB Total=33.30 MB Peak=33.30 MB E   
>   Exprs: Total=260.00 KB Peak=260.00 KB E   KrpcDataStreamSender 
> (dst_id=4): Total=426.57 KB Peak=458.57 KB E KrpcDataStreamSender 
> (dst_id=4) Exprs: Total=256.00 KB Peak=256.00 KB E Fragment 
> f9423526590ed30b:732f1db10004: Reservation=0 OtherMemory=0 Total=0 
> Peak=29.59 MB E   HDFS_SCAN_NODE (id=1): Reservation=0 OtherMemory=0 
> Total=0 Peak=29.32 MB E   KrpcDataStreamSender (dst_id=5): Total=0 
> Peak=266.57 KB EE   Memory limit exceeded: 
> ParquetColumnReader::ReadDataPage() failed to allocate 65533 bytes for 
> decompressed data. E   HDFS_SCAN_NODE (id=0) could not allocate 64.00 KB 
> without exceeding limit. E   Error occurred on backend 
> impala-boost-static-burst-slave-el7-1aa3.vpc.cloudera.com:22001 by fragment 
> f9423526590ed30b:732f1db10001 E   Memory left in process limit: 11.14 GB 
> E   Memory left in query limit: 22.16 KB E   
> Query(f9423526590ed30b:732f1db1): Limit=200.00 MB Reservation=160.00 
> MB ReservationLimit=160.00 MB OtherMemory=39.98 MB Total=199.98 MB 
> Peak=199.98 MB E Fragment f9423526590ed30b:732f1db10007: 
> 

[jira] [Commented] (IMPALA-6975) TestRuntimeRowFilters.test_row_filters failing with Memory limit exceeded

2018-05-04 Thread Sailesh Mukil (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464234#comment-16464234
 ] 

Sailesh Mukil commented on IMPALA-6975:
---

[~tarmstrong] Could you provide any guidance here? We can find someone to fix 
it.

> TestRuntimeRowFilters.test_row_filters failing with Memory limit exceeded
> -
>
> Key: IMPALA-6975
> URL: https://issues.apache.org/jira/browse/IMPALA-6975
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0
>Reporter: Sailesh Mukil
>Priority: Critical
>  Labels: build-failure
>
> {code:java}
> Error Message
> query_test/test_runtime_filters.py:171: in test_row_filters 
> test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS' : str(WAIT_TIME_MS)}) 
> common/impala_test_suite.py:405: in run_test_case result = 
> self.__execute_query(target_impalad_client, query, user=user) 
> common/impala_test_suite.py:620: in __execute_query return 
> impalad_client.execute(query, user=user) common/impala_connection.py:160: in 
> execute return self.__beeswax_client.execute(sql_stmt, user=user) 
> beeswax/impala_beeswax.py:173: in execute handle = 
> self.__execute_query(query_string.strip(), user=user) 
> beeswax/impala_beeswax.py:341: in __execute_query 
> self.wait_for_completion(handle) beeswax/impala_beeswax.py:361: in 
> wait_for_completion raise ImpalaBeeswaxException("Query aborted:" + 
> error_log, None) E   ImpalaBeeswaxException: ImpalaBeeswaxException: E
> Query aborted:Memory limit exceeded: ParquetColumnReader::ReadDataPage() 
> failed to allocate 65533 bytes for decompressed data. E   HDFS_SCAN_NODE 
> (id=0) could not allocate 64.00 KB without exceeding limit. E   Error 
> occurred on backend 
> impala-boost-static-burst-slave-el7-1aa3.vpc.cloudera.com:22001 by fragment 
> f9423526590ed30b:732f1db10001 E   Memory left in process limit: 11.14 GB 
> E   Memory left in query limit: 22.16 KB E   
> Query(f9423526590ed30b:732f1db1): Limit=200.00 MB Reservation=160.00 
> MB ReservationLimit=160.00 MB OtherMemory=39.98 MB Total=199.98 MB 
> Peak=199.98 MB E Fragment f9423526590ed30b:732f1db10007: 
> Reservation=134.00 MB OtherMemory=30.22 MB Total=164.22 MB Peak=164.22 MB E   
> Runtime Filter Bank: Reservation=2.00 MB ReservationLimit=2.00 MB 
> OtherMemory=0 Total=2.00 MB Peak=2.00 MB E   AGGREGATION_NODE (id=3): 
> Total=4.00 KB Peak=4.00 KB E Exprs: Total=4.00 KB Peak=4.00 KB E  
>  HASH_JOIN_NODE (id=2): Reservation=132.00 MB OtherMemory=198.25 KB 
> Total=132.19 MB Peak=132.19 MB E Exprs: Total=25.12 KB Peak=25.12 KB 
> E Hash Join Builder (join_node_id=2): Total=157.12 KB Peak=157.12 KB 
> E   Hash Join Builder (join_node_id=2) Exprs: Total=149.12 KB 
> Peak=149.12 KB E   EXCHANGE_NODE (id=4): Reservation=14.91 MB 
> OtherMemory=67.76 KB Total=14.97 MB Peak=14.97 MB E KrpcDeferredRpcs: 
> Total=67.76 KB Peak=67.76 KB E   EXCHANGE_NODE (id=5): Reservation=15.03 
> MB OtherMemory=0 Total=15.03 MB Peak=15.03 MB E KrpcDeferredRpcs: 
> Total=0 Peak=45.12 KB E   KrpcDataStreamSender (dst_id=6): Total=16.00 KB 
> Peak=16.00 KB E Fragment f9423526590ed30b:732f1db10001: 
> Reservation=26.00 MB OtherMemory=9.75 MB Total=35.75 MB Peak=35.75 MB E   
> Runtime Filter Bank: Reservation=2.00 MB ReservationLimit=2.00 MB 
> OtherMemory=0 Total=2.00 MB Peak=2.00 MB E   HDFS_SCAN_NODE (id=0): 
> Reservation=24.00 MB OtherMemory=9.30 MB Total=33.30 MB Peak=33.30 MB E   
>   Exprs: Total=260.00 KB Peak=260.00 KB E   KrpcDataStreamSender 
> (dst_id=4): Total=426.57 KB Peak=458.57 KB E KrpcDataStreamSender 
> (dst_id=4) Exprs: Total=256.00 KB Peak=256.00 KB E Fragment 
> f9423526590ed30b:732f1db10004: Reservation=0 OtherMemory=0 Total=0 
> Peak=29.59 MB E   HDFS_SCAN_NODE (id=1): Reservation=0 OtherMemory=0 
> Total=0 Peak=29.32 MB E   KrpcDataStreamSender (dst_id=5): Total=0 
> Peak=266.57 KB EE   Memory limit exceeded: 
> ParquetColumnReader::ReadDataPage() failed to allocate 65533 bytes for 
> decompressed data. E   HDFS_SCAN_NODE (id=0) could not allocate 64.00 KB 
> without exceeding limit. E   Error occurred on backend 
> impala-boost-static-burst-slave-el7-1aa3.vpc.cloudera.com:22001 by fragment 
> f9423526590ed30b:732f1db10001 E   Memory left in process limit: 11.14 GB 
> E   Memory left in query limit: 22.16 KB E   
> Query(f9423526590ed30b:732f1db1): Limit=200.00 MB Reservation=160.00 
> MB ReservationLimit=160.00 MB OtherMemory=39.98 MB Total=199.98 MB 
> Peak=199.98 MB E Fragment f9423526590ed30b:732f1db10007: 
> Reservation=134.00 MB OtherMemory=30.22 MB Total=164.22 MB Peak=164.22 MB E   
> 

[jira] [Created] (IMPALA-6975) TestRuntimeRowFilters.test_row_filters failing with Memory limit exceeded

2018-05-04 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-6975:
-

 Summary: TestRuntimeRowFilters.test_row_filters failing with 
Memory limit exceeded
 Key: IMPALA-6975
 URL: https://issues.apache.org/jira/browse/IMPALA-6975
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.0
Reporter: Sailesh Mukil



{code:java}
Error Message
query_test/test_runtime_filters.py:171: in test_row_filters 
test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS' : str(WAIT_TIME_MS)}) 
common/impala_test_suite.py:405: in run_test_case result = 
self.__execute_query(target_impalad_client, query, user=user) 
common/impala_test_suite.py:620: in __execute_query return 
impalad_client.execute(query, user=user) common/impala_connection.py:160: in 
execute return self.__beeswax_client.execute(sql_stmt, user=user) 
beeswax/impala_beeswax.py:173: in execute handle = 
self.__execute_query(query_string.strip(), user=user) 
beeswax/impala_beeswax.py:341: in __execute_query 
self.wait_for_completion(handle) beeswax/impala_beeswax.py:361: in 
wait_for_completion raise ImpalaBeeswaxException("Query aborted:" + 
error_log, None) E   ImpalaBeeswaxException: ImpalaBeeswaxException: EQuery 
aborted:Memory limit exceeded: ParquetColumnReader::ReadDataPage() failed to 
allocate 65533 bytes for decompressed data. E   HDFS_SCAN_NODE (id=0) could not 
allocate 64.00 KB without exceeding limit. E   Error occurred on backend 
impala-boost-static-burst-slave-el7-1aa3.vpc.cloudera.com:22001 by fragment 
f9423526590ed30b:732f1db10001 E   Memory left in process limit: 11.14 GB E  
 Memory left in query limit: 22.16 KB E   
Query(f9423526590ed30b:732f1db1): Limit=200.00 MB Reservation=160.00 MB 
ReservationLimit=160.00 MB OtherMemory=39.98 MB Total=199.98 MB Peak=199.98 MB 
E Fragment f9423526590ed30b:732f1db10007: Reservation=134.00 MB 
OtherMemory=30.22 MB Total=164.22 MB Peak=164.22 MB E   Runtime Filter 
Bank: Reservation=2.00 MB ReservationLimit=2.00 MB OtherMemory=0 Total=2.00 MB 
Peak=2.00 MB E   AGGREGATION_NODE (id=3): Total=4.00 KB Peak=4.00 KB E  
   Exprs: Total=4.00 KB Peak=4.00 KB E   HASH_JOIN_NODE (id=2): 
Reservation=132.00 MB OtherMemory=198.25 KB Total=132.19 MB Peak=132.19 MB E
 Exprs: Total=25.12 KB Peak=25.12 KB E Hash Join Builder 
(join_node_id=2): Total=157.12 KB Peak=157.12 KB E   Hash Join Builder 
(join_node_id=2) Exprs: Total=149.12 KB Peak=149.12 KB E   EXCHANGE_NODE 
(id=4): Reservation=14.91 MB OtherMemory=67.76 KB Total=14.97 MB Peak=14.97 MB 
E KrpcDeferredRpcs: Total=67.76 KB Peak=67.76 KB E   EXCHANGE_NODE 
(id=5): Reservation=15.03 MB OtherMemory=0 Total=15.03 MB Peak=15.03 MB E   
  KrpcDeferredRpcs: Total=0 Peak=45.12 KB E   KrpcDataStreamSender 
(dst_id=6): Total=16.00 KB Peak=16.00 KB E Fragment 
f9423526590ed30b:732f1db10001: Reservation=26.00 MB OtherMemory=9.75 MB 
Total=35.75 MB Peak=35.75 MB E   Runtime Filter Bank: Reservation=2.00 MB 
ReservationLimit=2.00 MB OtherMemory=0 Total=2.00 MB Peak=2.00 MB E   
HDFS_SCAN_NODE (id=0): Reservation=24.00 MB OtherMemory=9.30 MB Total=33.30 MB 
Peak=33.30 MB E Exprs: Total=260.00 KB Peak=260.00 KB E   
KrpcDataStreamSender (dst_id=4): Total=426.57 KB Peak=458.57 KB E 
KrpcDataStreamSender (dst_id=4) Exprs: Total=256.00 KB Peak=256.00 KB E 
Fragment f9423526590ed30b:732f1db10004: Reservation=0 OtherMemory=0 Total=0 
Peak=29.59 MB E   HDFS_SCAN_NODE (id=1): Reservation=0 OtherMemory=0 
Total=0 Peak=29.32 MB E   KrpcDataStreamSender (dst_id=5): Total=0 
Peak=266.57 KB EE   Memory limit exceeded: 
ParquetColumnReader::ReadDataPage() failed to allocate 65533 bytes for 
decompressed data. E   HDFS_SCAN_NODE (id=0) could not allocate 64.00 KB 
without exceeding limit. E   Error occurred on backend 
impala-boost-static-burst-slave-el7-1aa3.vpc.cloudera.com:22001 by fragment 
f9423526590ed30b:732f1db10001 E   Memory left in process limit: 11.14 GB E  
 Memory left in query limit: 22.16 KB E   
Query(f9423526590ed30b:732f1db1): Limit=200.00 MB Reservation=160.00 MB 
ReservationLimit=160.00 MB OtherMemory=39.98 MB Total=199.98 MB Peak=199.98 MB 
E Fragment f9423526590ed30b:732f1db10007: Reservation=134.00 MB 
OtherMemory=30.22 MB Total=164.22 MB Peak=164.22 MB E   Runtime Filter 
Bank: Reservation=2.00 MB ReservationLimit=2.00 MB OtherMemory=0 Total=2.00 MB 
Peak=2.00 MB E   AGGREGATION_NODE (id=3): Total=4.00 KB Peak=4.00 KB E  
   Exprs: Total=4.00 KB Peak=4.00 KB E   HASH_JOIN_NODE (id=2): 
Reservation=132.00 MB OtherMemory=198.25 KB Total=132.19 MB Peak=132.19 MB E
 Exprs: Total=25.12 KB Peak=25.12 KB E Hash Join Builder 
(join_node_id=2): Total=157.12 KB Peak=157.12 KB E   Hash Join Builder 
(join_node_id=2) Exprs: 

[jira] [Commented] (IMPALA-6227) TestAdmissionControllerStress can be flaky

2018-05-03 Thread Sailesh Mukil (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463188#comment-16463188
 ] 

Sailesh Mukil commented on IMPALA-6227:
---

Hit this again:


{code:java}

custom_cluster.test_admission_controller.TestAdmissionControllerStress.test_mem_limit[num_queries:
 50 | submission_delay_ms: 50 | exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
text/none | round_robin_submission: True] (from pytest)

Failing for the past 1 build (Since Failed#8 )
Took 2 min 30 sec.
Error Message
AssertionError: Timed out waiting 60 seconds for metrics admitted,timed-out 
delta 5 current {'dequeued': 20, 'rejected': 20, 'released': 24, 'admitted': 
30, 'queued': 20, 'timed-out': 0} initial {'dequeued': 14, 'rejected': 20, 
'released': 18, 'admitted': 26, 'queued': 20, 'timed-out': 0} assert 
(1524822944.9910979 - 1524822883.858825) < 60  +  where 1524822944.9910979 = 
time()
Stacktrace
custom_cluster/test_admission_controller.py:943: in test_mem_limit
{'request_pool': self.pool_name, 'mem_limit': query_mem_limit})
custom_cluster/test_admission_controller.py:844: in run_admission_test
['admitted', 'timed-out'], curr_metrics, expected_admitted)
custom_cluster/test_admission_controller.py:547: in wait_for_metric_changes
assert (time() - start_time < STRESS_TIMEOUT),\
E   AssertionError: Timed out waiting 60 seconds for metrics admitted,timed-out 
delta 5 current {'dequeued': 20, 'rejected': 20, 'released': 24, 'admitted': 
30, 'queued': 20, 'timed-out': 0} initial {'dequeued': 14, 'rejected': 20, 
'released': 18, 'admitted': 26, 'queued': 20, 'timed-out': 0}
E   assert (1524822944.9910979 - 1524822883.858825) < 60
E+  where 1524822944.9910979 = time()
Standard Output
Starting State Store logging to 
/data/jenkins/workspace/impala-cdh6.0.x-exhaustive/repos/Impala/logs/custom_cluster_tests/statestored.INFO
Starting Catalog Service logging to 
/data/jenkins/workspace/impala-cdh6.0.x-exhaustive/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
Starting Impala Daemon logging to 
/data/jenkins/workspace/impala-cdh6.0.x-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad.INFO
Starting Impala Daemon logging to 
/data/jenkins/workspace/impala-cdh6.0.x-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
Starting Impala Daemon logging to 
/data/jenkins/workspace/impala-cdh6.0.x-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
Impala Cluster Running with 3 nodes (3 coordinators, 3 executors).
Standard Error
MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
MainThread: Getting num_known_live_backends from 
ec2-m2-4xlarge-centos-6-4-07fb.vpc.cloudera.com:25000
MainThread: Debug webpage not yet available.
MainThread: Debug webpage not yet available.
MainThread: Waiting for num_known_live_backends=3. Current value: 0
MainThread: Getting num_known_live_backends from 
ec2-m2-4xlarge-centos-6-4-07fb.vpc.cloudera.com:25000
MainThread: Waiting for num_known_live_backends=3. Current value: 0
MainThread: Getting num_known_live_backends from 
ec2-m2-4xlarge-centos-6-4-07fb.vpc.cloudera.com:25000
MainThread: Waiting for num_known_live_backends=3. Current value: 1
MainThread: Getting num_known_live_backends from 
ec2-m2-4xlarge-centos-6-4-07fb.vpc.cloudera.com:25000
MainThread: Waiting for num_known_live_backends=3. Current value: 2
MainThread: Getting num_known_live_backends from 
ec2-m2-4xlarge-centos-6-4-07fb.vpc.cloudera.com:25000
MainThread: num_known_live_backends has reached value: 3
MainThread: Getting num_known_live_backends from 
ec2-m2-4xlarge-centos-6-4-07fb.vpc.cloudera.com:25001
MainThread: num_known_live_backends has reached value: 3
MainThread: Getting num_known_live_backends from 
ec2-m2-4xlarge-centos-6-4-07fb.vpc.cloudera.com:25002
MainThread: num_known_live_backends has reached value: 3
MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
MainThread: Getting metric: statestore.live-backends from 
ec2-m2-4xlarge-centos-6-4-07fb.vpc.cloudera.com:25010
MainThread: Metric 'statestore.live-backends' has reached desired value: 4
MainThread: Getting num_known_live_backends from 
ec2-m2-4xlarge-centos-6-4-07fb.vpc.cloudera.com:25000
MainThread: num_known_live_backends has reached value: 3
MainThread: Getting num_known_live_backends from 
ec2-m2-4xlarge-centos-6-4-07fb.vpc.cloudera.com:25001
MainThread: num_known_live_backends has reached value: 3
MainThread: Getting num_known_live_backends from 
ec2-m2-4xlarge-centos-6-4-07fb.vpc.cloudera.com:25002
MainThread: num_known_live_backends has reached value: 3
-- connecting to: localhost:21000
MainThread: Starting test case with parameters: num_queries: 50 | 
submission_delay_ms: 50 | exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 
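
The assertion in the stack trace above comes from a polling helper in the test. 
A minimal sketch of that pattern, with all names assumed from the stack trace 
rather than taken from the actual test code:

{code:python}
import time

STRESS_TIMEOUT = 60  # seconds; value taken from the error message above

def wait_for_metric_changes(get_metrics, metric_names, initial, expected_delta):
    """Poll until the watched metrics grow by expected_delta in total,
    asserting that the whole wait stays under STRESS_TIMEOUT."""
    start_time = time.time()
    while True:
        current = get_metrics()
        delta = sum(current[m] - initial[m] for m in metric_names)
        if delta >= expected_delta:
            return current
        assert time.time() - start_time < STRESS_TIMEOUT, \
            "Timed out waiting %s seconds for metrics %s delta %s" % \
            (STRESS_TIMEOUT, ",".join(metric_names), expected_delta)
        time.sleep(1)
{code}

Under load, the queued queries can legitimately take longer than the fixed 
60-second budget, which may be what makes this test flaky.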

[jira] [Created] (IMPALA-6970) DiskMgr::AllocateBuffersForRange crashes on failed DCHECK

2018-05-03 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-6970:
-

 Summary: DiskMgr::AllocateBuffersForRange crashes on failed DCHECK
 Key: IMPALA-6970
 URL: https://issues.apache.org/jira/browse/IMPALA-6970
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.13.0
Reporter: Sailesh Mukil


Similar to IMPALA-6587, but the DCHECK failed in a slightly different way. 
Cannot tell whether the root cause is the same without further investigation.

{code:java}
FSF0503 09:30:26.715791 30750 reservation-tracker.cc:376] Check failed: bytes 
<= unused_reservation() (8388608 vs. 6291456) 
*** Check failure stack trace: ***
@  0x4277c1d  google::LogMessage::Fail()
@  0x42794c2  google::LogMessage::SendToLog()
@  0x42775f7  google::LogMessage::Flush()
@  0x427abbe  google::LogMessageFatal::~LogMessageFatal()
@  0x1ef1343  impala::ReservationTracker::AllocateFromLocked()
@  0x1ef111d  impala::ReservationTracker::AllocateFrom()
@  0x1ee8c57  impala::BufferPool::Client::PrepareToAllocateBuffer()
@  0x1ee5543  impala::BufferPool::AllocateBuffer()
@  0x2f50f68  impala::io::DiskIoMgr::AllocateBuffersForRange()
@  0x1f74762  impala::HdfsScanNodeBase::StartNextScanRange()
@  0x1f6b052  impala::HdfsScanNode::ScannerThread()
@  0x1f6a4ea  
_ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
@  0x1f6c5cc  
_ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
@  0x1bd4748  boost::function0<>::operator()()
@  0x1ebf349  impala::Thread::SuperviseThread()
@  0x1ec74e5  boost::_bi::list5<>::operator()<>()
@  0x1ec7409  boost::_bi::bind_t<>::operator()()
@  0x1ec73cc  boost::detail::thread_data<>::run()
@  0x31a1f0a  thread_proxy
@   0x36d1607851  (unknown)
@   0x36d12e894d  (unknown)

{code}

Git hash of Impala used in job: ba84ad03cb83d7f7aed8524fcfbb0e2cdc9fdd53




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-6968) TestBlockVerificationGcmDisabled failure in exhaustive

2018-05-03 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-6968:
-

 Summary: TestBlockVerificationGcmDisabled failure in exhaustive
 Key: IMPALA-6968
 URL: https://issues.apache.org/jira/browse/IMPALA-6968
 Project: IMPALA
  Issue Type: Task
  Components: Infrastructure
Affects Versions: Impala 2.13.0
Reporter: Sailesh Mukil
Assignee: Tim Armstrong


{code:java}
/data/jenkins/workspace/impala-cdh5-trunk-exhaustive-release-thrift/repos/Impala/be/src/runtime/tmp-file-mgr-test.cc:550
Value of: read_status.code()
  Actual: 0
Expected: TErrorCode::SCRATCH_READ_VERIFY_FAILED
Which is: 118
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IMPALA-6587) Crash in DiskMgr::AllocateBuffersForRange

2018-05-03 Thread Sailesh Mukil (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463130#comment-16463130
 ] 

Sailesh Mukil commented on IMPALA-6587:
---

We're still seeing this crash after this patch has gone in. [~tarmstrong] 
Should I reopen this JIRA, or file a new one? The DCHECK failed in a slightly 
different way:

{code:java}
FSF0503 09:30:26.715791 30750 reservation-tracker.cc:376] Check failed: bytes 
<= unused_reservation() (8388608 vs. 6291456) 
*** Check failure stack trace: ***
@  0x4277c1d  google::LogMessage::Fail()
@  0x42794c2  google::LogMessage::SendToLog()
@  0x42775f7  google::LogMessage::Flush()
@  0x427abbe  google::LogMessageFatal::~LogMessageFatal()
@  0x1ef1343  impala::ReservationTracker::AllocateFromLocked()
@  0x1ef111d  impala::ReservationTracker::AllocateFrom()
@  0x1ee8c57  impala::BufferPool::Client::PrepareToAllocateBuffer()
@  0x1ee5543  impala::BufferPool::AllocateBuffer()
@  0x2f50f68  impala::io::DiskIoMgr::AllocateBuffersForRange()
@  0x1f74762  impala::HdfsScanNodeBase::StartNextScanRange()
@  0x1f6b052  impala::HdfsScanNode::ScannerThread()
@  0x1f6a4ea  
_ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
@  0x1f6c5cc  
_ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
@  0x1bd4748  boost::function0<>::operator()()
@  0x1ebf349  impala::Thread::SuperviseThread()
@  0x1ec74e5  boost::_bi::list5<>::operator()<>()
@  0x1ec7409  boost::_bi::bind_t<>::operator()()
@  0x1ec73cc  boost::detail::thread_data<>::run()
@  0x31a1f0a  thread_proxy
@   0x36d1607851  (unknown)
@   0x36d12e894d  (unknown)
{code}

Git hash of Impala used in job: ba84ad03cb83d7f7aed8524fcfbb0e2cdc9fdd53

I can provide more details if necessary.

> Crash in DiskMgr::AllocateBuffersForRange
> -
>
> Key: IMPALA-6587
> URL: https://issues.apache.org/jira/browse/IMPALA-6587
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 2.12.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: broken-build, crash
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> {noformat}
> F0224 17:43:08.522589 13124 reservation-tracker.cc:376] Check failed: bytes 
> <= unused_reservation() (8192 vs. 0) 
> {noformat}
> {noformat}
> #0  0x003cb32328e5 in raise () from /lib64/libc.so.6
> #1  0x003cb32340c5 in abort () from /lib64/libc.so.6
> #2  0x03c5a244 in google::DumpStackTraceAndExit() ()
> #3  0x03c50cbd in google::LogMessage::Fail() ()
> #4  0x03c52562 in google::LogMessage::SendToLog() ()
> #5  0x03c50697 in google::LogMessage::Flush() ()
> #6  0x03c53c5e in google::LogMessageFatal::~LogMessageFatal() ()
> #7  0x01b7a813 in impala::ReservationTracker::AllocateFromLocked 
> (this=0x1a75d2a98, bytes=8192) at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-integration/repos/Impala/be/src/runtime/bufferpool/reservation-tracker.cc:376
> #8  0x01b7a5ed in impala::ReservationTracker::AllocateFrom 
> (this=0x1a75d2a98, bytes=8192) at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-integration/repos/Impala/be/src/runtime/bufferpool/reservation-tracker.cc:370
> #9  0x01b72127 in impala::BufferPool::Client::PrepareToAllocateBuffer 
> (this=0x1a75d2a80, len=8192, reserved=true, success=0x0) at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-integration/repos/Impala/be/src/runtime/bufferpool/buffer-pool.cc:567
> #10 0x01b6ea13 in impala::BufferPool::AllocateBuffer (this=0xa6af380, 
> client=0x14121248, len=8192, handle=0x7fede6224260) at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-integration/repos/Impala/be/src/runtime/bufferpool/buffer-pool.cc:229
> #11 0x02b894f0 in impala::io::DiskIoMgr::AllocateBuffersForRange 
> (this=0xb06fd40, reader=0x1ecf10300, bp_client=0x14121248, range=0x14711180, 
> max_bytes=8192) at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-integration/repos/Impala/be/src/runtime/io/disk-io-mgr.cc:470
> #12 0x01bef7ff in impala::HdfsScanNode::ScannerThread 
> (this=0x14121100, scanner_thread_reservation=8192) at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-integration/repos/Impala/be/src/exec/hdfs-scan-node.cc:393
> #13 0x01beec52 in impala::HdfsScanNode::::operator()(void) 
> const (__closure=0x7fede6224bc8) at 
> 

[jira] [Created] (IMPALA-6967) GVO should only allow patches that apply cleanly to both master and 2.x

2018-05-03 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-6967:
-

 Summary: GVO should only allow patches that apply cleanly to both 
master and 2.x
 Key: IMPALA-6967
 URL: https://issues.apache.org/jira/browse/IMPALA-6967
 Project: IMPALA
  Issue Type: Task
  Components: Infrastructure
Reporter: Sailesh Mukil


Following this thread:
https://lists.apache.org/thread.html/bba3c5a87635ad3c70c40ac120de2ddb41c3d0e2f5db0b29bc0243ff@%3Cdev.impala.apache.org%3E

It would take load off authors if the GVO could automatically tell if a patch 
that's being pushed to master would cleanly cherry-pick to 2.x.

At the beginning of the GVO, we should try to cherry-pick to 2.x and fail if 
there are conflicts, unless the commit message has the line:
"Cherry-picks: not for 2.x"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6967) GVO should only allow patches that apply cleanly to both master and 2.x

2018-05-03 Thread Sailesh Mukil (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462965#comment-16462965
 ] 

Sailesh Mukil commented on IMPALA-6967:
---

CC: [~lv] [~jbapple]

> GVO should only allow patches that apply cleanly to both master and 2.x
> ---
>
> Key: IMPALA-6967
> URL: https://issues.apache.org/jira/browse/IMPALA-6967
> Project: IMPALA
>  Issue Type: Task
>  Components: Infrastructure
>Reporter: Sailesh Mukil
>Priority: Major
>  Labels: jenkins
>
> Following this thread:
> https://lists.apache.org/thread.html/bba3c5a87635ad3c70c40ac120de2ddb41c3d0e2f5db0b29bc0243ff@%3Cdev.impala.apache.org%3E
> It would take load off authors if the GVO could automatically tell if a patch 
> that's being pushed to master would cleanly cherry-pick to 2.x.
> At the beginning of the GVO, we should try to cherry-pick to 2.x and fail if 
> there are conflicts, unless the commit message has the line:
> "Cherry-picks: not for 2.x"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6340) There is no error when inserting an invalid value into a decimal column under decimal_v2

2018-04-30 Thread Sailesh Mukil (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458899#comment-16458899
 ] 

Sailesh Mukil commented on IMPALA-6340:
---

[~tarasbob] Can we close this?

> There is no error when inserting an invalid value into a decimal column under 
> decimal_v2
> 
>
> Key: IMPALA-6340
> URL: https://issues.apache.org/jira/browse/IMPALA-6340
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Taras Bobrovytsky
>Assignee: Taras Bobrovytsky
>Priority: Blocker
>  Labels: correctness
>
> The following series of commands does not result in an error or a warning 
> when decimal_v2 is enabled.
> {code}
> set decimal_v2=1;
> create table t1 (c1 decimal(38,37));
> insert into t1 select 11.11;
> {code}
> We end up inserting a NULL into the column without any warnings.
> If these commands are executed with decimal_v2 disabled, we get the following 
> warning:
> {code}
> WARNINGS: UDF WARNING: Decimal expression overflowed, returning NULL
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6908) IsConnResetTException() should include ECONNRESET

2018-04-30 Thread Sailesh Mukil (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458815#comment-16458815
 ] 

Sailesh Mukil commented on IMPALA-6908:
---

I looked at this for a bit. We match on "SSL_read: Connection reset by peer", 
which is the TLS variant of the same ECONNRESET error. I think we can safely 
add plain ECONNRESET to the list for clusters that do not have TLS turned on.

> IsConnResetTException() should include ECONNRESET
> -
>
> Key: IMPALA-6908
> URL: https://issues.apache.org/jira/browse/IMPALA-6908
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0
>Reporter: Michael Ho
>Assignee: Sailesh Mukil
>Priority: Major
>
> {{IsConnReset()}} aims to check if the given exception is due to a stale 
> connection. Apparently, it's missing the case in which the error code is 
> ECONNRESET.
> {noformat}
> bool IsConnResetTException(const TTransportException& e) {
>   // Strings taken from TTransport::readAll(). This happens iff 
> TSocket::read() returns 0.
>   // As readAll() is reading non-zero length payload, this can only mean 
> recv() called
>   // by read() returns 0. According to man page of recv(), this implies a 
> stream socket
>   // peer has performed an orderly shutdown.
>   return (e.getType() == TTransportException::END_OF_FILE &&
>  strstr(e.what(), "No more data to read.") != nullptr) ||
>  (e.getType() == TTransportException::INTERNAL_ERROR &&
>  strstr(e.what(), "SSL_read: Connection reset by peer") != 
> nullptr);
> }
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-6318) Test suite may hang on test_query_cancellation_during_fetch

2018-04-30 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil updated IMPALA-6318:
--
Fix Version/s: (was: Impala 2.12.0)

> Test suite may hang on test_query_cancellation_during_fetch
> ---
>
> Key: IMPALA-6318
> URL: https://issues.apache.org/jira/browse/IMPALA-6318
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 2.11.0
> Environment: I managed to investigate this issue only once so far, it 
> was hanging in some of our Jenkins build jobs.
>Reporter: Gabor Kaszab
>Assignee: Gabor Kaszab
>Priority: Major
>  Labels: hangs, test_issue
> Attachments: Screen Shot 2017-12-13 at 8.56.54.png
>
>
> test_query_cancellation_during_fetch steps:
>   1) Runs a query in Impala shell that goes quickly to fetching state, where 
> the fetching would take several minutes.
>   2) While the query is running, the script polls the Impala debug page to 
> wait until the query gets to "FINISHED" state. This state means that the 
> results are ready for fetching. (There is a 15 try threshold for the polling 
> part.)
>   3) Once the query gets to "FINISHED" state a CTRL-C signal is sent to 
> Impala shell to cancel the query.
>   4) Query output is fetched and verified.
> Initial assumption
> =
> My initial assumption was that the query was somehow stuck in step 2) while 
> waiting for the desired query state (and the retry threshold somehow wasn't 
> applied). However, when I checked the Impala debug page, the query had 
> apparently gone from in-flight to completed, with 2048 rows already fetched 
> (see picture attached). Impala logs also show that the query had been 
> cancelled.
> {code:java}
> I1209 08:29:35.281550 18194 coordinator.cc:99] Exec() 
> query_id=d248bc6079f33f66:1b638a7 stmt=with v as (values (1 as x), 
> (2), (3), (4)) select * from v, v v2, v v3, v v4, v v5, v v6, v v7, v v8, v 
> v9, v v10, v v11
> {code}
> {code:java}
> I1209 08:29:35.895359 18196 query-state.cc:384] Instance completed. 
> instance_id=d248bc6079f33f66:1b638a7 #in-flight=0 status=CANCELLED: 
> Cancelled
> I1209 08:29:35.895372 18196 query-state.cc:396] Cancel: 
> query_id=d248bc6079f33f66:1b638a7
> I1209 08:29:35.895407 18196 query-exec-mgr.cc:149] ReleaseQueryState(): 
> query_id=d248bc6079f33f66:1b638a7 refcnt=2
> I1209 08:29:35.908305 18194 query-exec-mgr.cc:149] ReleaseQueryState(): 
> query_id=d248bc6079f33f66:1b638a7 refcnt=1
> {code}
> This means that the step 2) and even step 3) had finished properly and the 
> query was cancelled during the fetching phase.
> The interesting part: when I checked the running processes on the host, I 
> observed a running impala-shell.py still executing the query.
> {code:java}
> jenkins  18187  6223  0 Dec09 ?00:00:00 
> /Impala/shell/impala_shell.py -i localhost:21000 -q with v as 
> (values (1 as x), (2), (3), (4)) select * from v, v v2, v v3, v v4, v v5, v 
> v6, v v7, v v8, v v9, v v10, v v11;
> {code}
> I attached gdb to the running process, but the backtrace didn't show 
> anything meaningful.
> Summary
> 
>   - The query shows completed on Impala debug page with a few lines had 
> already been fetched (as desired).
>   - Impala logs show that the query had been cancelled (as desired).
>   - An impala_shell.py is still showing up in 'ps -ef' that seems to run the 
> query.
>   - According to 'top' there is no process that pikes in cpu usage.
> Assumption
> 
> As the debug page shows that the query is completed I assume that the 
> 'waiting for state' and the actual cancellation of the query finished 
> successfully so the execution should hang on step 4) where the results are 
> retrieved from ImpalaShell.
> {code:java}
> 1) p = ImpalaShell(args)
> 2) self.wait_for_query_state(stmt, cancel_at_state)
> 3) os.kill(p.pid(), signal.SIGINT)
> 4) result = p.get_result()
> {code}
> The get_result() method contains a shell_process.communicate() call that 
> fetches the stdout and stderr from the underlying process. According to the 
> Python docs, communicate() doesn't work well when the data size is big. 
> Taking into account that this query fetches and prints results for more than 
> 30 minutes, we can consider the stdout of the ImpalaShell large.
> https://docs.python.org/2/library/subprocess.html
> "Note The data read is buffered in memory, so do not use this method if the 
> data size is large or unlimited."
> If this is indeed the root of the issue then the possible solution is to 
> modify util.py:ImpalaShell to decide, based on an input parameter, whether 
> the Popen call connects to stdout with a pipe or does not connect to it at 
> all. This would be suitable 
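
For illustration, a minimal sketch of the change proposed in the description 
above; the function names and the spool_stdout parameter are hypothetical, and 
the real change would live in the util.py ImpalaShell wrapper:

{code:python}
import subprocess
import tempfile

def start_impala_shell(cmd_args, spool_stdout=False):
    # spool_stdout=True avoids communicate()'s in-memory buffering by
    # sending the shell's stdout to a temporary file on disk instead.
    out = tempfile.TemporaryFile() if spool_stdout else subprocess.PIPE
    proc = subprocess.Popen(["impala_shell.py"] + list(cmd_args),
                            stdout=out, stderr=subprocess.PIPE)
    return proc, out

def get_result(proc, out):
    if out is subprocess.PIPE:
        stdout, _ = proc.communicate()  # fine for small outputs only
        return stdout
    proc.wait()
    out.seek(0)
    return out.read()  # size-independent: the data was spooled to disk
{code}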

[jira] [Updated] (IMPALA-6332) Impala webserver should return HTTP error code for missing query profiles

2018-04-30 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil updated IMPALA-6332:
--
Fix Version/s: (was: Impala 2.13.0)

> Impala webserver should return HTTP error code for missing query profiles
> -
>
> Key: IMPALA-6332
> URL: https://issues.apache.org/jira/browse/IMPALA-6332
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.10.0, Impala 2.11.0
>Reporter: Zoram Thanga
>Priority: Minor
>
> When we try to read the thrift runtime profile of a query soon after the 
> query is finished, sometimes the profile cannot be found in either the query 
> map or query log. Then the server sends back "Query id $0 not found." from 
> ImpalaServer::GetRuntimeProfileStr(). This is followed up in 
> ImpalaHttpHandler::QueryProfileEncodedHandler() as:
> ss.str(Substitute("Could not obtain runtime profile: $0", 
> status.GetDetail()));
> The string us returned to the caller, but the HTTP response code is OK. This 
> can fool clients into thinking that they successfully read a valid thrift 
> profile. But since that's not true, clients that deserialize the thrift 
> profile may think that they received a corrupted profile.
> We should change the code to send back a non-OK response, such as 404 in this 
> situation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-6440) Impala cannot read / write HBase tables when metadata is created with newer versions of Hive

2018-04-30 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil updated IMPALA-6440:
--
Fix Version/s: (was: Impala 2.13.0)

> Impala cannot read / write HBase tables when metadata is created with newer 
> versions of Hive
> 
>
> Key: IMPALA-6440
> URL: https://issues.apache.org/jira/browse/IMPALA-6440
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.11.0
>Reporter: Zach Amsden
>Assignee: Adrian Ng
>Priority: Major
>
> Due to https://issues.apache.org/jira/browse/HIVE-18366 the way we fetch 
> table properties needs to be changed.  Ideally this should be backwards 
> compatible to allow both newer and older versions of Hive to be used.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-5893) Remove old kinit code for Impala 3

2018-04-20 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-5893.
---
   Resolution: Fixed
Fix Version/s: Impala 3.0

> Remove old kinit code for Impala 3
> --
>
> Key: IMPALA-5893
> URL: https://issues.apache.org/jira/browse/IMPALA-5893
> Project: IMPALA
>  Issue Type: Task
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Major
> Fix For: Impala 3.0
>
>
> Once we switch to Kudu's kinit code and we're confident that it works well 
> for all our use cases, we should remove the old kinit code including the 
> following flags:
> * kerberos_reinit_interval
> * use_kudu_kinit



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IMPALA-4744) Apache Impala release should include release tag or hash in version string

2018-04-20 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-4744.
---
   Resolution: Fixed
Fix Version/s: Impala 2.13.0
   Impala 3.0

Thanks [~jbapple]. Updated:
https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release

Closing this.

> Apache Impala release should include release tag or hash in version string
> --
>
> Key: IMPALA-4744
> URL: https://issues.apache.org/jira/browse/IMPALA-4744
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 2.8.0
>Reporter: Tim Armstrong
>Assignee: Sailesh Mukil
>Priority: Major
> Fix For: Impala 3.0, Impala 2.13.0
>
>
> If I build the Apache Impala release from a source tarball, I get this 
> version string:
> {code}
> tarmstrong@tarmstrong-box:~/apache-impala-incubating-2.8.0/apache-impala-incubating-2.8.0$
>  impala-shell.sh 
> INFO:bootstrap_virtualenv:Installing Kudu into the virtualenv
> Starting Impala Shell without Kerberos authentication
> Connected to localhost:21000
> Server version: impalad version 2.8.0-RELEASE DEBUG (build Could not obtain 
> git hash)
> {code}
> It would be nice if we had a way to put  something more friendly in there, 
> e.g. the release tag or the git hash.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6861) Avoid spurious OpenSSL warning printed by KRPC

2018-04-16 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-6861:
-

 Summary: Avoid spurious OpenSSL warning printed by KRPC
 Key: IMPALA-6861
 URL: https://issues.apache.org/jira/browse/IMPALA-6861
 Project: IMPALA
  Issue Type: Task
  Components: Distributed Exec
Affects Versions: Impala 3.0, Impala 2.12.0
Reporter: Sailesh Mukil
Assignee: Sailesh Mukil


This warning has no effect; we should opt for an initialization codepath that 
does not print this error message.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6859) De-templatize RpcMgrTestBase

2018-04-16 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-6859:
-

 Summary: De-templatize RpcMgrTestBase
 Key: IMPALA-6859
 URL: https://issues.apache.org/jira/browse/IMPALA-6859
 Project: IMPALA
  Issue Type: Task
  Components: Backend
Affects Versions: Impala 3.0
Reporter: Sailesh Mukil
Assignee: Sailesh Mukil


Now that we've gotten rid of the old way of Kinit-ing (IMPALA-5893), we can 
de-templatize RpcMgrTestBase, since there's now only one way to run the 
kerberos tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6833) Use include-what-you-use to reduce build times

2018-04-10 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-6833:
-

 Summary: Use include-what-you-use to reduce build times
 Key: IMPALA-6833
 URL: https://issues.apache.org/jira/browse/IMPALA-6833
 Project: IMPALA
  Issue Type: Task
  Components: Infrastructure
Reporter: Sailesh Mukil


Kudu has a systematic way of doing this:
https://github.com/apache/kudu/tree/master/build-support/iwyu

We can adopt the same approach to improve our build times and end up with 
slightly cleaner code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-6806) TLS certificate with Intermediate CA in server cert file fails with KRPC

2018-04-04 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-6806:
-

 Summary: TLS certificate with Intermediate CA in server cert file 
fails with KRPC
 Key: IMPALA-6806
 URL: https://issues.apache.org/jira/browse/IMPALA-6806
 Project: IMPALA
  Issue Type: Bug
  Components: Security
Affects Versions: Impala 2.12.0
Reporter: Sailesh Mukil
Assignee: Sailesh Mukil


Take 2 certificate files: cert.pem and truststore.pem

cert.pem has 2 certificates in it:
A cert for that node (with CN="hostname", and signed by CN=CertToolkitIntCA)
And the intermediate CA cert (with CN=CertToolkitIntCA, and signed by 
CN=CertToolkitRootCA)

truststore.pem has 1 certificate in it:
A cert which is the root CA (with CN=CertToolkitRootCA, self-signed)

This format of certificates doesn't seem to verify on the OpenSSL command 
line, but it works with Thrift. It also doesn't work with KRPC.

Workaround for this issue w/ KRPC turned on:
If we move the second certificate from cert.pem (CN=CertToolkitIntCA) into 
truststore.pem, then this seems to work.

We'll need to dig into whether this is a PEM file format issue, or a KRPC 
issue. But the above workaround should unblock us for now.
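
For reference, a minimal sketch of the workaround (moving the intermediate CA 
block out of cert.pem into truststore.pem); the file names come from the 
description above, while the splitting logic is an assumption about the PEM 
layout:

{code:python}
MARKER = "-----BEGIN CERTIFICATE-----"

# cert.pem holds the node cert followed by the intermediate CA cert.
with open("cert.pem") as f:
    blocks = [MARKER + b for b in f.read().split(MARKER) if b.strip()]

# Keep only the node's own certificate in cert.pem ...
with open("cert.pem", "w") as f:
    f.write(blocks[0])

# ... and append the intermediate CA cert(s) to the truststore,
# next to the self-signed root CA.
with open("truststore.pem", "a") as f:
    f.write("".join(blocks[1:]))
{code}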



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

