[jira] [Commented] (IMPALA-7073) Failed test: query_test.test_scanners.TestScannerReservation.test_scanners

2018-05-24 Thread Tim Armstrong (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490275#comment-16490275
 ] 

Tim Armstrong commented on IMPALA-7073:
---

I added this test so I'll take it for now. I'm guessing it's an out-of-date 
profile, i.e. IMPALA-6338, but there's not enough context here to diagnose.
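
For context, the failing line is a `row_regex:` expectation checked against the
query's runtime profile. A minimal sketch of how such a check could work
(function and variable names here are illustrative, not the actual
test_result_verifier.py code):

```python
import re

# Illustrative sketch of a "row_regex:" expectation check; names are
# hypothetical, not the actual test_result_verifier code.
def row_regex_matches(expected_line, runtime_profile):
    pattern = expected_line[len("row_regex:"):]
    # The expectation passes if any single profile line matches the regex.
    return any(re.search(pattern, line)
               for line in runtime_profile.splitlines())

profile = ("HDFS_SCAN_NODE (id=0)\n"
           "   - InitialRangeActualReservation: (Avg: 4.00 MB ; ...)\n")
assert row_regex_matches(
    "row_regex:.*InitialRangeActualReservation.*Avg: 4.00 MB.*", profile)
```

An out-of-date profile (IMPALA-6338) would make such a check fail even though
the reservation behaviour itself is correct.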

> Failed test: query_test.test_scanners.TestScannerReservation.test_scanners
> --
>
> Key: IMPALA-7073
> URL: https://issues.apache.org/jira/browse/IMPALA-7073
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Dimitris Tsirogiannis
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: broken-build, test-failure
>
> Possibly flaky test: 
> {code:java}
> Stacktrace
> query_test/test_scanners.py:1064: in test_scanners
> self.run_test_case('QueryTest/scanner-reservation', vector)
> common/impala_test_suite.py:451: in run_test_case
> verify_runtime_profile(test_section['RUNTIME_PROFILE'], 
> result.runtime_profile)
> common/test_result_verifier.py:590: in verify_runtime_profile
> actual))
> E   AssertionError: Did not find matches for lines in runtime profile:
> E   EXPECTED LINES:
> E   row_regex:.*InitialRangeActualReservation.*Avg: 4.00 MB.*{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Work started] (IMPALA-6933) test_kudu.TestCreateExternalTable on S3 failing with "AlreadyExistsException: Database already exists"

2018-05-24 Thread Vuk Ercegovac (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-6933 started by Vuk Ercegovac.
-
> test_kudu.TestCreateExternalTable on S3 failing with "AlreadyExistsException: 
> Database already exists"
> --
>
> Key: IMPALA-6933
> URL: https://issues.apache.org/jira/browse/IMPALA-6933
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0
>Reporter: David Knupp
>Assignee: Vuk Ercegovac
>Priority: Critical
>  Labels: kudu, test-infra
>
> Error Message
> {noformat}
> test setup failure
> {noformat}
> Stacktrace
> {noformat}
> conftest.py:347: in conn
> with __unique_conn(db_name=db_name, timeout=timeout) as conn:
> /usr/lib64/python2.6/contextlib.py:16: in __enter__
> return self.gen.next()
> conftest.py:380: in __unique_conn
> cur.execute("CREATE DATABASE %s" % db_name)
> ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:302: in 
> execute
> configuration=configuration)
> ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:343: in 
> execute_async
> self._execute_async(op)
> ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:362: in 
> _execute_async
> operation_fn()
> ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:340: in 
> op
> async=True)
> ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:1027: 
> in execute
> return self._operation('ExecuteStatement', req)
> ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:957: in 
> _operation
> resp = self._rpc(kind, request)
> ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:925: in 
> _rpc
> err_if_rpc_not_ok(response)
> ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:704: in 
> err_if_rpc_not_ok
> raise HiveServer2Error(resp.status.errorMessage)
> E   HiveServer2Error: ImpalaRuntimeException: Error making 'createDatabase' 
> RPC to Hive Metastore: 
> E   CAUSED BY: AlreadyExistsException: Database f0mraw already exists
> {noformat}
> Tests affected:
> * query_test.test_kudu.TestCreateExternalTable.test_unsupported_binary_col
> * query_test.test_kudu.TestCreateExternalTable.test_drop_external_table
> * query_test.test_kudu.TestCreateExternalTable.test_explicit_name
> * query_test.test_kudu.TestCreateExternalTable.test_explicit_name_preference
> * query_test.test_kudu.TestCreateExternalTable.test_explicit_name_doesnt_exist
> * 
> query_test.test_kudu.TestCreateExternalTable.test_explicit_name_doesnt_exist_but_implicit_does
> * query_test.test_kudu.TestCreateExternalTable.test_table_without_partitioning
> * query_test.test_kudu.TestCreateExternalTable.test_column_name_case
> * query_test.test_kudu.TestCreateExternalTable.test_conflicting_column_name






[jira] [Created] (IMPALA-7073) Failed test: query_test.test_scanners.TestScannerReservation.test_scanners

2018-05-24 Thread Dimitris Tsirogiannis (JIRA)
Dimitris Tsirogiannis created IMPALA-7073:
-

 Summary: Failed test: 
query_test.test_scanners.TestScannerReservation.test_scanners
 Key: IMPALA-7073
 URL: https://issues.apache.org/jira/browse/IMPALA-7073
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.0
Reporter: Dimitris Tsirogiannis


Possibly flaky test: 
{code:java}
Stacktrace
query_test/test_scanners.py:1064: in test_scanners
self.run_test_case('QueryTest/scanner-reservation', vector)
common/impala_test_suite.py:451: in run_test_case
verify_runtime_profile(test_section['RUNTIME_PROFILE'], 
result.runtime_profile)
common/test_result_verifier.py:590: in verify_runtime_profile
actual))
E   AssertionError: Did not find matches for lines in runtime profile:
E   EXPECTED LINES:
E   row_regex:.*InitialRangeActualReservation.*Avg: 4.00 MB.*{code}






[jira] [Work started] (IMPALA-5740) Clarify STRING size limit

2018-05-24 Thread Alex Rodoni (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-5740 started by Alex Rodoni.
---
> Clarify STRING size limit
> -
>
> Key: IMPALA-5740
> URL: https://issues.apache.org/jira/browse/IMPALA-5740
> Project: IMPALA
>  Issue Type: Documentation
>  Components: Docs
>Affects Versions: Impala 2.10.0
>Reporter: Tim Armstrong
>Assignee: Alex Rodoni
>Priority: Critical
>
> The Impala docs currently state that strings have a maximum size of 32kb. 
> http://impala.apache.org/docs/build/html/topics/impala_string.html
> This is misleading and causes confusion. We should clarify that 32KB is not 
> an actual limit, while making clear that performance and memory consumption 
> degrade with larger strings.
> Here's what the behaviour is; I'm unsure of the best way to succinctly 
> characterise it.
> * We expect that queries operating on 32KB strings will work reliably and not 
> hit significant performance or memory problems (unless you have very complex 
> queries, very many columns, etc).
> * There is an absolute hard limit of ~ 2GB on strings.
> * Memory consumption of queries will grow as string sizes increase and the 
> probability of hitting "memory limit exceeded" will increase.
> * Performance of queries will decrease as strings get larger.
> * The row size (i.e. total size of all string and other columns) is limited 
> in various places by the implementation of various operators.
> ** Rows coming from the right side of any hash join
> ** Rows coming from either side of a spilling hash join
> ** Rows being sorted
> ** Rows in a grouping aggregation.
> In Impala <= 2.9 the default limit in those places is 8MB.
> With the IMPALA-3200 changes that default number may decrease significantly, 
> to 2MB or less, but there will be a new query option, something like 
> MAX_ROW_SIZE, that will make this row size limit configurable on a per-query 
> basis.
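
If the option lands as described above, per-query tuning might look like the
following. This is a hypothetical sketch: the option name MAX_ROW_SIZE and its
accepted units are taken from the tentative description above and depend on how
IMPALA-3200 lands.

```sql
-- Hypothetical usage sketch of the MAX_ROW_SIZE query option described above.
SET MAX_ROW_SIZE=8m;
-- Subsequent sorts, grouping aggregations, and spilling hash joins in this
-- session could then handle rows up to roughly 8MB.
```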






[jira] [Commented] (IMPALA-6933) test_kudu.TestCreateExternalTable on S3 failing with "AlreadyExistsException: Database already exists"

2018-05-24 Thread Vuk Ercegovac (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489967#comment-16489967
 ] 

Vuk Ercegovac commented on IMPALA-6933:
---

Tracing through this, I see that the tests in test_kudu.py derive from 
kudu_test_suite.py.

kudu_test_suite overrides get_db_name as well as auto_create_db. The 
implementation of get_db_name creates a db name differently from the default 
way (the failing name here was f0mraw). The default way prepends the test name 
to the generated name, so in this case I suspect we were simply unlucky and 
collided. Looking into making the Kudu name generation more similar to the 
default handling.
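
The direction described above can be sketched as follows. This is illustrative
only: the real helpers live in conftest.py / kudu_test_suite.py and the names
below are hypothetical.

```python
import random
import string

def random_suffix(length=6):
    """A short random identifier like the 'f0mraw' name in the stack trace."""
    first = random.choice(string.ascii_lowercase)
    rest = "".join(random.choice(string.ascii_lowercase + string.digits)
                   for _ in range(length - 1))
    return first + rest

# Illustrative sketch: a bare 6-character random name can collide across
# concurrently running suites, while prefixing it with the test name (the
# default framework behaviour described above) makes collisions far less
# likely because the random part only has to be unique within one test.
def unique_db_name(test_name=None):
    suffix = random_suffix()
    return "{0}_{1}".format(test_name, suffix) if test_name else suffix
```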

> test_kudu.TestCreateExternalTable on S3 failing with "AlreadyExistsException: 
> Database already exists"
> --
>
> Key: IMPALA-6933
> URL: https://issues.apache.org/jira/browse/IMPALA-6933
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0
>Reporter: David Knupp
>Assignee: Vuk Ercegovac
>Priority: Critical
>  Labels: kudu, test-infra
>
> Error Message
> {noformat}
> test setup failure
> {noformat}
> Stacktrace
> {noformat}
> conftest.py:347: in conn
> with __unique_conn(db_name=db_name, timeout=timeout) as conn:
> /usr/lib64/python2.6/contextlib.py:16: in __enter__
> return self.gen.next()
> conftest.py:380: in __unique_conn
> cur.execute("CREATE DATABASE %s" % db_name)
> ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:302: in 
> execute
> configuration=configuration)
> ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:343: in 
> execute_async
> self._execute_async(op)
> ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:362: in 
> _execute_async
> operation_fn()
> ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:340: in 
> op
> async=True)
> ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:1027: 
> in execute
> return self._operation('ExecuteStatement', req)
> ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:957: in 
> _operation
> resp = self._rpc(kind, request)
> ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:925: in 
> _rpc
> err_if_rpc_not_ok(response)
> ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:704: in 
> err_if_rpc_not_ok
> raise HiveServer2Error(resp.status.errorMessage)
> E   HiveServer2Error: ImpalaRuntimeException: Error making 'createDatabase' 
> RPC to Hive Metastore: 
> E   CAUSED BY: AlreadyExistsException: Database f0mraw already exists
> {noformat}
> Tests affected:
> * query_test.test_kudu.TestCreateExternalTable.test_unsupported_binary_col
> * query_test.test_kudu.TestCreateExternalTable.test_drop_external_table
> * query_test.test_kudu.TestCreateExternalTable.test_explicit_name
> * query_test.test_kudu.TestCreateExternalTable.test_explicit_name_preference
> * query_test.test_kudu.TestCreateExternalTable.test_explicit_name_doesnt_exist
> * 
> query_test.test_kudu.TestCreateExternalTable.test_explicit_name_doesnt_exist_but_implicit_does
> * query_test.test_kudu.TestCreateExternalTable.test_table_without_partitioning
> * query_test.test_kudu.TestCreateExternalTable.test_column_name_case
> * query_test.test_kudu.TestCreateExternalTable.test_conflicting_column_name






[jira] [Comment Edited] (IMPALA-7012) NullPointerException with CTAS query

2018-05-24 Thread Tianyi Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489930#comment-16489930
 ] 

Tianyi Wang edited comment on IMPALA-7012 at 5/24/18 10:44 PM:
---

This is the output at commit 4bd7cc8dbf2f07db3468e1feb595cd16a7cd81e3, the 
parent of IMPALA-3916 on master:
{noformat}
ERROR: AnalysisException: Syntax error in line 1:
...nth) stored as parquet as /* +noclustered */select at1...
^
Encountered: Unknown last token with id: 212
Expected: SELECT, VALUES, WITH
CAUSED BY: Exception: Syntax error
{noformat}
The query that breaks it can be as simple as:
{noformat}
/*+*/;{noformat}


was (Author: tianyiwang):
This is the output at commit 4bd7cc8dbf2f07db3468e1feb595cd16a7cd81e3, the 
parent of IMPALA-3916 on master:
{noformat}
Encountered: Unknown last token with id: 212
Expected: ALTER, COMPUTE, CREATE, DELETE, DESCRIBE, DROP, EXPLAIN, GRANT, 
INSERT, INVALIDATE, LOAD, REFRESH, REVOKE, SELECT, SET, SHOW, TRUNCATE, UPDATE, 
UPSERT, USE, VALUES, WITH
{noformat}
The query that breaks it can be as simple as:
{noformat}
/*+*/;{noformat}

> NullPointerException with CTAS query
> 
>
> Key: IMPALA-7012
> URL: https://issues.apache.org/jira/browse/IMPALA-7012
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Tianyi Wang
>Priority: Critical
>
> {noformat}
> [localhost:21000] default> create table alltypesinsert partitioned by (year, 
> month) stored as parquet as /* +noclustered */select at1.id, at1.bool_col, 
> at1.tinyint_col, at1.smallint_col, at1.int_col, at1.bigint_col,   
>
> at1.float_col, at1.double_col, at1.date_string_col, at1.string_col, 
> at1.timestamp_col,   
>   at1.year, at2.id as month
> from  functional.alltypes at1, functional.alltypes at2;
> Query: create table alltypesinsert partitioned by (year, month) stored as 
> parquet as /* +noclustered */
> select at1.id, at1.bool_col, at1.tinyint_col, at1.smallint_col, at1.int_col, 
> at1.bigint_col,
>   at1.float_col, at1.double_col, at1.date_string_col, at1.string_col, 
> at1.timestamp_col,
>   at1.year, at2.id as month
> from  functional.alltypes at1, functional.alltypes at2
> Query submitted at: 2018-05-10 13:46:02 (Coordinator: 
> http://tarmstrong-box:25000)
> ERROR: NullPointerException: null
> {noformat}
> {noformat}
> I0510 13:46:02.977249  4238 Frontend.java:987] Analyzing query: create table 
> alltypesinsert partitioned by (year, month) stored as parquet as /* 
> +noclustered */
> select at1.id, at1.bool_col, at1.tinyint_col, at1.smallint_col, at1.int_col, 
> at1.bigint_col,
>   at1.float_col, at1.double_col, at1.date_string_col, at1.string_col, 
> at1.timestamp_col,
>   at1.year, at2.id as month
> from  functional.alltypes at1, functional.alltypes at2
> I0510 13:46:03.025013  4238 jni-util.cc:230] java.lang.NullPointerException
> at 
> org.apache.impala.analysis.SqlScanner.isReserved(SqlScanner.java:725)
> at 
> org.apache.impala.analysis.SqlParser.getErrorMsg(SqlParser.java:1532)
> at org.apache.impala.service.Frontend.parse(Frontend.java:975)
> at 
> org.apache.impala.service.Frontend.createExecRequest(Frontend.java:990)
> at 
> org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:156)
> I0510 13:46:03.124739  4238 status.cc:125] NullPointerException: null
> @  0x18782ef  impala::Status::Status()
> @  0x1e55652  impala::JniUtil::GetJniExceptionMsg()
> @  0x1d133ed  impala::JniUtil::CallJniMethod<>()
> @  0x1d10047  impala::Frontend::GetExecRequest()
> @  0x1d3205a  impala::ImpalaServer::ExecuteInternal()
> @  0x1d31ba2  impala::ImpalaServer::Execute()
> @  0x1d9be70  impala::ImpalaServer::query()
> @  0x2ee378e  beeswax::BeeswaxServiceProcessor::process_query()
> @  0x2ee34dc  beeswax::BeeswaxServiceProcessor::dispatchCall()
> @  0x2ebcf9d  impala::ImpalaServiceProcessor::dispatchCall()
> @  0x1836690  apache::thrift::TDispatchProcessor::process()
> @  0x1b9649d  
> apache::thrift::server::TAcceptQueueServer::Task::run()
> @  0x1b8d9c5  impala::ThriftThread::RunRunnable()
> @  0x1b8f0c9  boost::_mfi::mf2<>::operator()()
> @  0x1b8ef5f  boost::_bi::list3<>::operator()<>()
> @  0x1b8ecab  boost::_bi::bind_t<>::operator()()
> @  0x1b8ebbe  
> boost::detail::function::void_function_obj_invoker0<>::invoke()
> @  0x1bd3b1a  boost::function0<>::operator()()
> @  0x1ebec51  impala::Thread::SuperviseThread()
> @

[jira] [Commented] (IMPALA-7012) NullPointerException with CTAS query

2018-05-24 Thread Tianyi Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489930#comment-16489930
 ] 

Tianyi Wang commented on IMPALA-7012:
-

This is the output at commit 4bd7cc8dbf2f07db3468e1feb595cd16a7cd81e3, the 
parent of IMPALA-3916 on master:
{noformat}
Encountered: Unknown last token with id: 212
Expected: ALTER, COMPUTE, CREATE, DELETE, DESCRIBE, DROP, EXPLAIN, GRANT, 
INSERT, INVALIDATE, LOAD, REFRESH, REVOKE, SELECT, SET, SHOW, TRUNCATE, UPDATE, 
UPSERT, USE, VALUES, WITH
{noformat}
The query that breaks it can be as simple as:
{noformat}
/*+*/;{noformat}

> NullPointerException with CTAS query
> 
>
> Key: IMPALA-7012
> URL: https://issues.apache.org/jira/browse/IMPALA-7012
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Tianyi Wang
>Priority: Critical
>
> {noformat}
> [localhost:21000] default> create table alltypesinsert partitioned by (year, 
> month) stored as parquet as /* +noclustered */select at1.id, at1.bool_col, 
> at1.tinyint_col, at1.smallint_col, at1.int_col, at1.bigint_col,   
>
> at1.float_col, at1.double_col, at1.date_string_col, at1.string_col, 
> at1.timestamp_col,   
>   at1.year, at2.id as month
> from  functional.alltypes at1, functional.alltypes at2;
> Query: create table alltypesinsert partitioned by (year, month) stored as 
> parquet as /* +noclustered */
> select at1.id, at1.bool_col, at1.tinyint_col, at1.smallint_col, at1.int_col, 
> at1.bigint_col,
>   at1.float_col, at1.double_col, at1.date_string_col, at1.string_col, 
> at1.timestamp_col,
>   at1.year, at2.id as month
> from  functional.alltypes at1, functional.alltypes at2
> Query submitted at: 2018-05-10 13:46:02 (Coordinator: 
> http://tarmstrong-box:25000)
> ERROR: NullPointerException: null
> {noformat}
> {noformat}
> I0510 13:46:02.977249  4238 Frontend.java:987] Analyzing query: create table 
> alltypesinsert partitioned by (year, month) stored as parquet as /* 
> +noclustered */
> select at1.id, at1.bool_col, at1.tinyint_col, at1.smallint_col, at1.int_col, 
> at1.bigint_col,
>   at1.float_col, at1.double_col, at1.date_string_col, at1.string_col, 
> at1.timestamp_col,
>   at1.year, at2.id as month
> from  functional.alltypes at1, functional.alltypes at2
> I0510 13:46:03.025013  4238 jni-util.cc:230] java.lang.NullPointerException
> at 
> org.apache.impala.analysis.SqlScanner.isReserved(SqlScanner.java:725)
> at 
> org.apache.impala.analysis.SqlParser.getErrorMsg(SqlParser.java:1532)
> at org.apache.impala.service.Frontend.parse(Frontend.java:975)
> at 
> org.apache.impala.service.Frontend.createExecRequest(Frontend.java:990)
> at 
> org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:156)
> I0510 13:46:03.124739  4238 status.cc:125] NullPointerException: null
> @  0x18782ef  impala::Status::Status()
> @  0x1e55652  impala::JniUtil::GetJniExceptionMsg()
> @  0x1d133ed  impala::JniUtil::CallJniMethod<>()
> @  0x1d10047  impala::Frontend::GetExecRequest()
> @  0x1d3205a  impala::ImpalaServer::ExecuteInternal()
> @  0x1d31ba2  impala::ImpalaServer::Execute()
> @  0x1d9be70  impala::ImpalaServer::query()
> @  0x2ee378e  beeswax::BeeswaxServiceProcessor::process_query()
> @  0x2ee34dc  beeswax::BeeswaxServiceProcessor::dispatchCall()
> @  0x2ebcf9d  impala::ImpalaServiceProcessor::dispatchCall()
> @  0x1836690  apache::thrift::TDispatchProcessor::process()
> @  0x1b9649d  
> apache::thrift::server::TAcceptQueueServer::Task::run()
> @  0x1b8d9c5  impala::ThriftThread::RunRunnable()
> @  0x1b8f0c9  boost::_mfi::mf2<>::operator()()
> @  0x1b8ef5f  boost::_bi::list3<>::operator()<>()
> @  0x1b8ecab  boost::_bi::bind_t<>::operator()()
> @  0x1b8ebbe  
> boost::detail::function::void_function_obj_invoker0<>::invoke()
> @  0x1bd3b1a  boost::function0<>::operator()()
> @  0x1ebec51  impala::Thread::SuperviseThread()
> @  0x1ec6ded  boost::_bi::list5<>::operator()<>()
> @  0x1ec6d11  boost::_bi::bind_t<>::operator()()
> @  0x1ec6cd4  boost::detail::thread_data<>::run()
> @  0x31b3a4a  thread_proxy
> @ 0x7fcf12d536ba  start_thread
> @ 0x7fcf12a8941d  clone
> I0510 13:46:03.124944  4238 impala-server.cc:1010] UnregisterQuery(): 
> query_id=6b4791bb7a54de54:16bdcba7
> I0510 13:46:03.124948  4238 impala-server.cc:1097] Cancel(): 
> 

[jira] [Commented] (IMPALA-7072) Kudu's kinit does not support auth_to_config rules with Heimdal kerberos

2018-05-24 Thread Sailesh Mukil (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489863#comment-16489863
 ] 

Sailesh Mukil commented on IMPALA-7072:
---

CC: [~kwho]

> Kudu's kinit does not support auth_to_config rules with Heimdal kerberos
> 
>
> Key: IMPALA-7072
> URL: https://issues.apache.org/jira/browse/IMPALA-7072
> Project: IMPALA
>  Issue Type: Bug
>  Components: Security
>Affects Versions: Impala 2.12.0
>Reporter: Sailesh Mukil
>Priority: Critical
>
> On deployments that use Heimdal kerberos configured with 'auth_to_local' 
> rules set, and with the Impala startup flag 'use_kudu_kinit'=true, the 
> auth_to_local rules will not be respected, as they are not supported by 
> Kudu's kinit.
> The implication is that from Impala 2.12.0 onwards, clusters with the above 
> configuration will not be able to use KRPC with kerberos enabled.
> A workaround is to remove the auth_to_local rules for such deployments.
> We need a good long-term solution to fix this.






[jira] [Created] (IMPALA-7072) Kudu's kinit does not support auth_to_config rules with Heimdal kerberos

2018-05-24 Thread Sailesh Mukil (JIRA)
Sailesh Mukil created IMPALA-7072:
-

 Summary: Kudu's kinit does not support auth_to_config rules with 
Heimdal kerberos
 Key: IMPALA-7072
 URL: https://issues.apache.org/jira/browse/IMPALA-7072
 Project: IMPALA
  Issue Type: Bug
  Components: Security
Affects Versions: Impala 2.12.0
Reporter: Sailesh Mukil


On deployments that use Heimdal kerberos configured with 'auth_to_local' rules 
set, and with the Impala startup flag 'use_kudu_kinit'=true, the auth_to_local 
rules will not be respected, as they are not supported by Kudu's kinit.

The implication is that from Impala 2.12.0 onwards, clusters with the above 
configuration will not be able to use KRPC with kerberos enabled.

A workaround is to remove the auth_to_local rules for such deployments.

We need a good long-term solution to fix this.






[jira] [Commented] (IMPALA-7012) NullPointerException with CTAS query

2018-05-24 Thread Tim Armstrong (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489849#comment-16489849
 ] 

Tim Armstrong commented on IMPALA-7012:
---

I'm not sure; I was probably doing something wrong here, but it shouldn't 
generate an NPE regardless.

> NullPointerException with CTAS query
> 
>
> Key: IMPALA-7012
> URL: https://issues.apache.org/jira/browse/IMPALA-7012
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Tianyi Wang
>Priority: Critical
>
> {noformat}
> [localhost:21000] default> create table alltypesinsert partitioned by (year, 
> month) stored as parquet as /* +noclustered */select at1.id, at1.bool_col, 
> at1.tinyint_col, at1.smallint_col, at1.int_col, at1.bigint_col,   
>
> at1.float_col, at1.double_col, at1.date_string_col, at1.string_col, 
> at1.timestamp_col,   
>   at1.year, at2.id as month
> from  functional.alltypes at1, functional.alltypes at2;
> Query: create table alltypesinsert partitioned by (year, month) stored as 
> parquet as /* +noclustered */
> select at1.id, at1.bool_col, at1.tinyint_col, at1.smallint_col, at1.int_col, 
> at1.bigint_col,
>   at1.float_col, at1.double_col, at1.date_string_col, at1.string_col, 
> at1.timestamp_col,
>   at1.year, at2.id as month
> from  functional.alltypes at1, functional.alltypes at2
> Query submitted at: 2018-05-10 13:46:02 (Coordinator: 
> http://tarmstrong-box:25000)
> ERROR: NullPointerException: null
> {noformat}
> {noformat}
> I0510 13:46:02.977249  4238 Frontend.java:987] Analyzing query: create table 
> alltypesinsert partitioned by (year, month) stored as parquet as /* 
> +noclustered */
> select at1.id, at1.bool_col, at1.tinyint_col, at1.smallint_col, at1.int_col, 
> at1.bigint_col,
>   at1.float_col, at1.double_col, at1.date_string_col, at1.string_col, 
> at1.timestamp_col,
>   at1.year, at2.id as month
> from  functional.alltypes at1, functional.alltypes at2
> I0510 13:46:03.025013  4238 jni-util.cc:230] java.lang.NullPointerException
> at 
> org.apache.impala.analysis.SqlScanner.isReserved(SqlScanner.java:725)
> at 
> org.apache.impala.analysis.SqlParser.getErrorMsg(SqlParser.java:1532)
> at org.apache.impala.service.Frontend.parse(Frontend.java:975)
> at 
> org.apache.impala.service.Frontend.createExecRequest(Frontend.java:990)
> at 
> org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:156)
> I0510 13:46:03.124739  4238 status.cc:125] NullPointerException: null
> @  0x18782ef  impala::Status::Status()
> @  0x1e55652  impala::JniUtil::GetJniExceptionMsg()
> @  0x1d133ed  impala::JniUtil::CallJniMethod<>()
> @  0x1d10047  impala::Frontend::GetExecRequest()
> @  0x1d3205a  impala::ImpalaServer::ExecuteInternal()
> @  0x1d31ba2  impala::ImpalaServer::Execute()
> @  0x1d9be70  impala::ImpalaServer::query()
> @  0x2ee378e  beeswax::BeeswaxServiceProcessor::process_query()
> @  0x2ee34dc  beeswax::BeeswaxServiceProcessor::dispatchCall()
> @  0x2ebcf9d  impala::ImpalaServiceProcessor::dispatchCall()
> @  0x1836690  apache::thrift::TDispatchProcessor::process()
> @  0x1b9649d  
> apache::thrift::server::TAcceptQueueServer::Task::run()
> @  0x1b8d9c5  impala::ThriftThread::RunRunnable()
> @  0x1b8f0c9  boost::_mfi::mf2<>::operator()()
> @  0x1b8ef5f  boost::_bi::list3<>::operator()<>()
> @  0x1b8ecab  boost::_bi::bind_t<>::operator()()
> @  0x1b8ebbe  
> boost::detail::function::void_function_obj_invoker0<>::invoke()
> @  0x1bd3b1a  boost::function0<>::operator()()
> @  0x1ebec51  impala::Thread::SuperviseThread()
> @  0x1ec6ded  boost::_bi::list5<>::operator()<>()
> @  0x1ec6d11  boost::_bi::bind_t<>::operator()()
> @  0x1ec6cd4  boost::detail::thread_data<>::run()
> @  0x31b3a4a  thread_proxy
> @ 0x7fcf12d536ba  start_thread
> @ 0x7fcf12a8941d  clone
> I0510 13:46:03.124944  4238 impala-server.cc:1010] UnregisterQuery(): 
> query_id=6b4791bb7a54de54:16bdcba7
> I0510 13:46:03.124948  4238 impala-server.cc:1097] Cancel(): 
> query_id=6b4791bb7a54de54:16bdcba7
> {noformat}
> This is on commit hash 3e736450354e55244e16924cfeb223a30629351d  . It looks 
> like the code was added by IMPALA-3916




[jira] [Commented] (IMPALA-7012) NullPointerException with CTAS query

2018-05-24 Thread Tianyi Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489841#comment-16489841
 ] 

Tianyi Wang commented on IMPALA-7012:
-

Is this supposed to be a syntax error?

> NullPointerException with CTAS query
> 
>
> Key: IMPALA-7012
> URL: https://issues.apache.org/jira/browse/IMPALA-7012
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Tianyi Wang
>Priority: Critical
>
> {noformat}
> [localhost:21000] default> create table alltypesinsert partitioned by (year, 
> month) stored as parquet as /* +noclustered */select at1.id, at1.bool_col, 
> at1.tinyint_col, at1.smallint_col, at1.int_col, at1.bigint_col,   
>
> at1.float_col, at1.double_col, at1.date_string_col, at1.string_col, 
> at1.timestamp_col,   
>   at1.year, at2.id as month
> from  functional.alltypes at1, functional.alltypes at2;
> Query: create table alltypesinsert partitioned by (year, month) stored as 
> parquet as /* +noclustered */
> select at1.id, at1.bool_col, at1.tinyint_col, at1.smallint_col, at1.int_col, 
> at1.bigint_col,
>   at1.float_col, at1.double_col, at1.date_string_col, at1.string_col, 
> at1.timestamp_col,
>   at1.year, at2.id as month
> from  functional.alltypes at1, functional.alltypes at2
> Query submitted at: 2018-05-10 13:46:02 (Coordinator: 
> http://tarmstrong-box:25000)
> ERROR: NullPointerException: null
> {noformat}
> {noformat}
> I0510 13:46:02.977249  4238 Frontend.java:987] Analyzing query: create table 
> alltypesinsert partitioned by (year, month) stored as parquet as /* 
> +noclustered */
> select at1.id, at1.bool_col, at1.tinyint_col, at1.smallint_col, at1.int_col, 
> at1.bigint_col,
>   at1.float_col, at1.double_col, at1.date_string_col, at1.string_col, 
> at1.timestamp_col,
>   at1.year, at2.id as month
> from  functional.alltypes at1, functional.alltypes at2
> I0510 13:46:03.025013  4238 jni-util.cc:230] java.lang.NullPointerException
> at 
> org.apache.impala.analysis.SqlScanner.isReserved(SqlScanner.java:725)
> at 
> org.apache.impala.analysis.SqlParser.getErrorMsg(SqlParser.java:1532)
> at org.apache.impala.service.Frontend.parse(Frontend.java:975)
> at 
> org.apache.impala.service.Frontend.createExecRequest(Frontend.java:990)
> at 
> org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:156)
> I0510 13:46:03.124739  4238 status.cc:125] NullPointerException: null
> @  0x18782ef  impala::Status::Status()
> @  0x1e55652  impala::JniUtil::GetJniExceptionMsg()
> @  0x1d133ed  impala::JniUtil::CallJniMethod<>()
> @  0x1d10047  impala::Frontend::GetExecRequest()
> @  0x1d3205a  impala::ImpalaServer::ExecuteInternal()
> @  0x1d31ba2  impala::ImpalaServer::Execute()
> @  0x1d9be70  impala::ImpalaServer::query()
> @  0x2ee378e  beeswax::BeeswaxServiceProcessor::process_query()
> @  0x2ee34dc  beeswax::BeeswaxServiceProcessor::dispatchCall()
> @  0x2ebcf9d  impala::ImpalaServiceProcessor::dispatchCall()
> @  0x1836690  apache::thrift::TDispatchProcessor::process()
> @  0x1b9649d  
> apache::thrift::server::TAcceptQueueServer::Task::run()
> @  0x1b8d9c5  impala::ThriftThread::RunRunnable()
> @  0x1b8f0c9  boost::_mfi::mf2<>::operator()()
> @  0x1b8ef5f  boost::_bi::list3<>::operator()<>()
> @  0x1b8ecab  boost::_bi::bind_t<>::operator()()
> @  0x1b8ebbe  
> boost::detail::function::void_function_obj_invoker0<>::invoke()
> @  0x1bd3b1a  boost::function0<>::operator()()
> @  0x1ebec51  impala::Thread::SuperviseThread()
> @  0x1ec6ded  boost::_bi::list5<>::operator()<>()
> @  0x1ec6d11  boost::_bi::bind_t<>::operator()()
> @  0x1ec6cd4  boost::detail::thread_data<>::run()
> @  0x31b3a4a  thread_proxy
> @ 0x7fcf12d536ba  start_thread
> @ 0x7fcf12a8941d  clone
> I0510 13:46:03.124944  4238 impala-server.cc:1010] UnregisterQuery(): 
> query_id=6b4791bb7a54de54:16bdcba7
> I0510 13:46:03.124948  4238 impala-server.cc:1097] Cancel(): 
> query_id=6b4791bb7a54de54:16bdcba7
> {noformat}
> This is on commit hash 3e736450354e55244e16924cfeb223a30629351d. It looks
> like the code was added by IMPALA-3916.
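
The stack trace above points at SqlScanner.isReserved() dereferencing something null while building a parse-error message. A minimal sketch of the guard that would avoid such an NPE is below (Python, purely illustrative — the real fix is in Java, and the None-token trigger and word list are assumptions, not the diagnosed cause):

```python
# Illustrative subset; the real scanner has the full SQL reserved-word list.
RESERVED_WORDS = {"select", "table", "from", "as"}

def is_reserved(token_text):
    """Return True if token_text is a reserved word; tolerate a missing
    token (None) instead of dereferencing it on the error path."""
    if token_text is None:
        return False
    return token_text.lower() in RESERVED_WORDS

print(is_reserved(None))      # False instead of raising
print(is_reserved("SELECT"))  # True
```

The point is only that the parser's error-message path should be defensive about tokens it cannot resolve.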



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: 

[jira] [Created] (IMPALA-7071) Make get_fs_path() idempotent

2018-05-24 Thread Dan Hecht (JIRA)
Dan Hecht created IMPALA-7071:
-

 Summary: Make get_fs_path() idempotent
 Key: IMPALA-7071
 URL: https://issues.apache.org/jira/browse/IMPALA-7071
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Affects Versions: Impala 3.1.0
Reporter: Dan Hecht
Assignee: Dan Hecht


To avoid errors like in IMPALA-7068, make get_fs_path() idempotent.
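The intent can be sketched as follows; the prefix value and function body are assumptions about the test-infra helper, not its actual implementation:

```python
# Sketch only: the real get_fs_path() lives in Impala's Python test
# infrastructure and may differ; FILESYSTEM_PREFIX is an assumed example.
FILESYSTEM_PREFIX = "s3a://impala-cdh5-s3-test"

def get_fs_path(path):
    """Qualify a warehouse path with the filesystem prefix, but only once,
    so applying the function twice yields the same result (idempotence)."""
    if path.startswith(FILESYSTEM_PREFIX):
        return path
    return FILESYSTEM_PREFIX + path

once = get_fs_path("/test-warehouse/tbl")
twice = get_fs_path(once)
print(once == twice)  # True: a second application is a no-op
```

Without the startswith() guard, double application produces exactly the doubled-prefix paths seen in IMPALA-7068.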







[jira] [Assigned] (IMPALA-7070) Failed test: query_test.test_nested_types.TestParquetArrayEncodings.test_thrift_array_of_arrays on S3

2018-05-24 Thread Hubert Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hubert Sun reassigned IMPALA-7070:
--

Assignee: Lars Volker

> Failed test: 
> query_test.test_nested_types.TestParquetArrayEncodings.test_thrift_array_of_arrays
>  on S3
> -
>
> Key: IMPALA-7070
> URL: https://issues.apache.org/jira/browse/IMPALA-7070
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Dimitris Tsirogiannis
>Assignee: Lars Volker
>Priority: Blocker
>  Labels: broken-build, test-failure
>
>  
> {code:java}
> Error Message
> query_test/test_nested_types.py:406: in test_thrift_array_of_arrays "col1 
> array") query_test/test_nested_types.py:579: in 
> _create_test_table check_call(["hadoop", "fs", "-put", local_path, 
> location], shell=False) /usr/lib64/python2.6/subprocess.py:505: in check_call 
> raise CalledProcessError(retcode, cmd) E   CalledProcessError: Command 
> '['hadoop', 'fs', '-put', 
> '/data/jenkins/workspace/impala-asf-2.x-core-s3/repos/Impala/testdata/parquet_nested_types_encodings/bad-thrift.parquet',
>  
> 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays']'
>  returned non-zero exit status 1
> Stacktrace
> query_test/test_nested_types.py:406: in test_thrift_array_of_arrays
> "col1 array")
> query_test/test_nested_types.py:579: in _create_test_table
> check_call(["hadoop", "fs", "-put", local_path, location], shell=False)
> /usr/lib64/python2.6/subprocess.py:505: in check_call
> raise CalledProcessError(retcode, cmd)
> E   CalledProcessError: Command '['hadoop', 'fs', '-put', 
> '/data/jenkins/workspace/impala-asf-2.x-core-s3/repos/Impala/testdata/parquet_nested_types_encodings/bad-thrift.parquet',
>  
> 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays']'
>  returned non-zero exit status 1
> Standard Error
> SET sync_ddl=False;
> -- executing against localhost:21000
> DROP DATABASE IF EXISTS `test_thrift_array_of_arrays_11da5fde` CASCADE;
> SET sync_ddl=False;
> -- executing against localhost:21000
> CREATE DATABASE `test_thrift_array_of_arrays_11da5fde`;
> MainThread: Created database "test_thrift_array_of_arrays_11da5fde" for test 
> ID 
> "query_test/test_nested_types.py::TestParquetArrayEncodings::()::test_thrift_array_of_arrays[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]"
> -- executing against localhost:21000
> create table test_thrift_array_of_arrays_11da5fde.ThriftArrayOfArrays (col1 
> array) stored as parquet location 
> 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays';
> 18/05/20 18:31:03 WARN impl.MetricsConfig: Cannot locate configuration: tried 
> hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
> 18/05/20 18:31:03 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 
> 10 second(s).
> 18/05/20 18:31:03 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> started
> 18/05/20 18:31:06 INFO Configuration.deprecation: 
> fs.s3a.server-side-encryption-key is deprecated. Instead, use 
> fs.s3a.server-side-encryption.key
> put: rename 
> `s3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays/bad-thrift.parquet._COPYING_'
>  to 
> `s3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays/bad-thrift.parquet':
>  Input/output error
> 18/05/20 18:31:08 INFO impl.MetricsSystemImpl: Stopping s3a-file-system 
> metrics system...
> 18/05/20 18:31:08 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> stopped.
> 18/05/20 18:31:08 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> shutdown complete.{code}
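
One possible mitigation for transient S3 failures like the rename I/O error above is to retry the upload. This is an illustrative sketch, not part of the Impala test suite; the retry count and backoff are assumptions:

```python
import subprocess
import time

def put_with_retries(local_path, dest, attempts=3, backoff_s=2):
    """Retry `hadoop fs -put` a few times to ride out transient S3 errors
    (e.g. a failed rename). Raises CalledProcessError if all attempts fail."""
    for attempt in range(1, attempts + 1):
        try:
            subprocess.check_call(["hadoop", "fs", "-put", local_path, dest])
            return
        except subprocess.CalledProcessError:
            if attempt == attempts:
                raise  # give up after the last attempt
            time.sleep(backoff_s * attempt)  # simple linear backoff
```

Whether a retry is appropriate here depends on whether the S3 rename error is actually transient.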






[jira] [Resolved] (IMPALA-7067) sleep(100000) command from test_shell_commandline.py can hang around and cause test_metrics_are_zero to fail

2018-05-24 Thread Tim Armstrong (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-7067.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0
   Impala 2.13.0

> sleep(100000) command from test_shell_commandline.py can hang around and
> cause test_metrics_are_zero to fail
> 
>
> Key: IMPALA-7067
> URL: https://issues.apache.org/jira/browse/IMPALA-7067
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: flaky
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> {noformat}
> 03:25:47 [gw6] PASSED 
> shell/test_shell_commandline.py::TestImpalaShell::test_cancellation 
> ...
> 03:27:01 verifiers/test_verify_metrics.py:34: in test_metrics_are_zero
> 03:27:01 verifier.verify_metrics_are_zero()
> 03:27:01 verifiers/metric_verifier.py:47: in verify_metrics_are_zero
> 03:27:01 self.wait_for_metric(metric, 0, timeout)
> 03:27:01 verifiers/metric_verifier.py:62: in wait_for_metric
> 03:27:01 self.impalad_service.wait_for_metric_value(metric_name, 
> expected_value, timeout)
> 03:27:01 common/impala_service.py:135: in wait_for_metric_value
> 03:27:01 json.dumps(self.read_debug_webpage('rpcz?json')))
> 03:27:01 E   AssertionError: Metric value impala-server.mem-pool.total-bytes 
> did not reach value 0 in 60s
> {noformat}
> I used the json dump from memz and the logs to trace it back to the 
> sleep(100000) query hanging around
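
For context, the verifier's wait loop has roughly this shape (names and intervals are illustrative, not the exact impala_service.py API):

```python
import time

def wait_for_metric_value(read_metric, expected, timeout_s=60, interval_s=1):
    """Poll a metric until it reaches the expected value or the timeout
    expires; on timeout, raise the kind of assertion seen in the failure."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        value = read_metric()
        if value == expected:
            return value
        time.sleep(interval_s)
    raise AssertionError(
        "Metric value did not reach %s in %ss" % (expected, timeout_s))

# A query that keeps memory reserved (like a stray sleep query) keeps
# impala-server.mem-pool.total-bytes above zero until it is cancelled,
# so the loop times out.
samples = iter([8192, 4096, 0])
print(wait_for_metric_value(lambda: next(samples), 0, interval_s=0))  # 0
```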





[jira] [Commented] (IMPALA-6953) Improve encapsulation within DiskIoMgr

2018-05-24 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489756#comment-16489756
 ] 

ASF subversion and git services commented on IMPALA-6953:
-

Commit 890457c01878cb3ee3b2045b63bbcabf772d88bc in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=890457c ]

IMPALA-6953: part 1: clean up DiskIoMgr

There should be no behavioural changes as a result of
this refactoring.

Make DiskQueue an encapsulated class.

Remove friend classes where possible, either by using public
methods or moving code between classes.

Move method into protected in some cases.

Split GetNextRequestRange() into two methods that
operate on DiskQueue and RequestContext state. The methods
belong to the respective classes.

Reduce transitive #include dependencies to hopefully help
with build time.

Testing:
Ran core tests.

Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1
Reviewed-on: http://gerrit.cloudera.org:8080/10245
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 


> Improve encapsulation within DiskIoMgr
> --
>
> Key: IMPALA-6953
> URL: https://issues.apache.org/jira/browse/IMPALA-6953
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>
> While DiskIoMgr is still fresh in my mind, I should do some refactoring to 
> improve the encapsulation within io::. Currently lots of classes are friends 
> with each other and some code is not in the most appropriate class.







[jira] [Assigned] (IMPALA-7068) Failed test: metadata.test_partition_metadata.TestPartitionMetadataUncompressedTextOnly.test_unsupported_text_compression on S3

2018-05-24 Thread Dan Hecht (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Hecht reassigned IMPALA-7068:
-

Assignee: Dan Hecht

> Failed test: 
> metadata.test_partition_metadata.TestPartitionMetadataUncompressedTextOnly.test_unsupported_text_compression
>  on S3
> ---
>
> Key: IMPALA-7068
> URL: https://issues.apache.org/jira/browse/IMPALA-7068
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog, Infrastructure
>Affects Versions: Impala 3.0
>Reporter: Dimitris Tsirogiannis
>Assignee: Dan Hecht
>Priority: Blocker
>  Labels: S3, broken-build, test-failure
>
> This is from executing the failed test. It seems that the S3 prefix 
> (s3a://impala-cdh5-s3-tests) is added twice to the table location, resulting 
> in an invalid S3 path. 
> {code:java}
> Error Message
> metadata/test_partition_metadata.py:177: in test_unsupported_text_compression 
> FQ_TBL_NAME, TBL_LOCATION)) common/impala_connection.py:160: in execute   
>   return self.__beeswax_client.execute(sql_stmt, user=user) 
> beeswax/impala_beeswax.py:173: in execute handle = 
> self.__execute_query(query_string.strip(), user=user) 
> beeswax/impala_beeswax.py:339: in __execute_query handle = 
> self.execute_query_async(query_string, user=user) 
> beeswax/impala_beeswax.py:335: in execute_query_async return 
> self.__do_rpc(lambda: self.imp_service.query(query,)) 
> beeswax/impala_beeswax.py:460: in __do_rpc raise 
> ImpalaBeeswaxException(self.__build_error_message(b), b) E   
> ImpalaBeeswaxException: ImpalaBeeswaxException: EINNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'> EMESSAGE: AnalysisException: Bucket 
> impala-cdh5-s3-tests3a does not exist E   CAUSED BY: FileNotFoundException: 
> Bucket impala-cdh5-s3-tests3a does not exist
> Stacktrace
> metadata/test_partition_metadata.py:177: in test_unsupported_text_compression
> FQ_TBL_NAME, TBL_LOCATION))
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:339: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:335: in execute_query_async
> return self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:460: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>
> EMESSAGE: AnalysisException: Bucket impala-cdh5-s3-tests3a does not exist
> E   CAUSED BY: FileNotFoundException: Bucket impala-cdh5-s3-tests3a does not 
> exist
> Standard Error
> -- connecting to: localhost:21000
> SET sync_ddl=False;
> -- executing against localhost:21000
> DROP DATABASE IF EXISTS `test_unsupported_text_compression_695d360a` CASCADE;
> SET sync_ddl=False;
> -- executing against localhost:21000
> CREATE DATABASE `test_unsupported_text_compression_695d360a`;
> MainThread: Created database "test_unsupported_text_compression_695d360a" for 
> test ID 
> "metadata/test_partition_metadata.py::TestPartitionMetadataUncompressedTextOnly::()::test_unsupported_text_compression[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: text/none]"
> MainThread: Starting new HTTPS connection (1): 
> impala-cdh5-s3-test.s3.amazonaws.com
> -- executing against localhost:21000
> create external table 
> test_unsupported_text_compression_695d360a.multi_text_compression like 
> functional.alltypes location 
> 's3a://impala-cdh5-s3-tests3a://impala-cdh5-s3-test/test-warehouse/test_unsupported_text_compression_695d360a.db/multi_text_compression';
> {code}






[jira] [Resolved] (IMPALA-7043) Failure in HBase splitting should not fail dataload

2018-05-24 Thread Joe McDonnell (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-7043.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0
   Impala 2.13.0

> Failure in HBase splitting should not fail dataload
> ---
>
> Key: IMPALA-7043
> URL: https://issues.apache.org/jira/browse/IMPALA-7043
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> Dataload splits two of the HBase tables to provide consistent state for 
> frontend tests. However, sometimes HBase will change and the splitting code 
> will fail. Since this is happening during dataload, the whole invocation of 
> buildall.sh fails. This means that no tests run and any minor problem with 
> HBase can impact all testing, even of things that are not impacted by the 
> HBase splitting.
>  
> The HBase splitting should not fail dataload. Some tests may fail, but the 
> tests that are unrelated can run and pass.
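
The failure-isolation idea can be sketched like this (illustrative Python, not the actual dataload/buildall.sh code — function names are assumptions):

```python
import logging

def split_hbase_tables_best_effort(split_fn):
    """Attempt the HBase table splits, but log and continue on failure so
    dataload completes and unrelated tests can still run. Returns True if
    the split succeeded, False if it failed and was skipped."""
    try:
        split_fn()
        return True
    except Exception:
        logging.exception("HBase splitting failed; continuing dataload. "
                          "Tests that rely on the split regions may fail.")
        return False

print(split_hbase_tables_best_effort(lambda: None))  # True
```

The trade-off is deliberate: a handful of frontend tests may fail later, but one HBase hiccup no longer wastes an entire build.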






[jira] [Commented] (IMPALA-7055) test_avro_writer failing on upstream Jenkins (Expected exception: "Writing to table format AVRO is not supported")

2018-05-24 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489642#comment-16489642
 ] 

ASF subversion and git services commented on IMPALA-7055:
-

Commit 2362b672ccd94ed97331fe9c84ac1603ecb3772f in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=2362b67 ]

IMPALA-7055: fix race with DML errors

Error statuses could be lost because backend_exec_complete_barrier_
went to 0 before the query was transitioned to an error state.
Reordering the UpdateExecState() and backend_exec_complete_barrier_
calls prevents this race.

Change-Id: Idafd0b342e77a065be7cc28fa8c8a9df445622c2
Reviewed-on: http://gerrit.cloudera.org:8080/10491
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
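
The ordering the fix describes can be modeled like this (a Python sketch, not Impala's C++ classes; names are illustrative):

```python
import threading

class QueryState:
    """Model of the race: the coordinator must record a backend's error
    status *before* releasing the completion barrier, otherwise a waiter
    can see the barrier reach zero while the query still looks healthy."""

    def __init__(self, num_backends):
        self._lock = threading.Lock()
        self._status = "OK"
        self._barrier = threading.Semaphore(0)
        self._num_backends = num_backends

    def backend_exec_complete(self, status):
        # Mirrors the fix: update the exec state first...
        with self._lock:
            if status != "OK" and self._status == "OK":
                self._status = status
        # ...then notify the completion barrier, so the error is never lost.
        self._barrier.release()

    def wait_for_backends(self):
        for _ in range(self._num_backends):
            self._barrier.acquire()
        with self._lock:
            return self._status

qs = QueryState(2)
threads = [threading.Thread(target=qs.backend_exec_complete, args=(s,))
           for s in ("OK", "DML error")]
for t in threads:
    t.start()
result = qs.wait_for_backends()
for t in threads:
    t.join()
print(result)  # the "DML error" status is observed reliably
```

With the opposite ordering (release the barrier, then record the status), the waiter could return between the two steps and report success for a failed DML.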


> test_avro_writer failing on upstream Jenkins (Expected exception: "Writing to 
> table format AVRO is not supported")
> --
>
> Key: IMPALA-7055
> URL: https://issues.apache.org/jira/browse/IMPALA-7055
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: David Knupp
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: correctness, flaky
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> This failure occurred while verifying https://gerrit.cloudera.org/c/10455/, 
> but it is not related to that patch. The failing build is 
> https://jenkins.impala.io/job/gerrit-verify-dryrun/2511/ 
> (https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/2232/)
> Test appears to be (from 
> [avro-writer.test|https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/avro-writer.test]):
> {noformat}
> ---- QUERY
> SET ALLOW_UNSUPPORTED_FORMATS=0;
> insert into __avro_write select 1, "b", 2.2;
> ---- CATCH
> Writing to table format AVRO is not supported. Use query option 
> ALLOW_UNSUPPORTED_FORMATS
> {noformat}
> Error output:
> {noformat}
> 01:50:18 ] FAIL 
> query_test/test_compressed_formats.py::TestTableWriters::()::test_avro_writer[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: text/none]
> 01:50:18 ] === FAILURES 
> ===
> 01:50:18 ]  TestTableWriters.test_avro_writer[exec_option: {'batch_size': 0, 
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': 
> False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: text/none] 
> 01:50:18 ] [gw9] linux2 -- Python 2.7.12 
> /home/ubuntu/Impala/bin/../infra/python/env/bin/python
> 01:50:18 ] query_test/test_compressed_formats.py:189: in test_avro_writer
> 01:50:18 ] self.run_test_case('QueryTest/avro-writer', vector)
> 01:50:18 ] common/impala_test_suite.py:420: in run_test_case
> 01:50:18 ] assert False, "Expected exception: %s" % expected_str
> 01:50:18 ] E   AssertionError: Expected exception: Writing to table format 
> AVRO is not supported. Use query option ALLOW_UNSUPPORTED_FORMATS
> 01:50:18 ]  Captured stderr setup 
> -
> 01:50:18 ] -- connecting to: localhost:21000
> 01:50:18 ] - Captured stderr call 
> -
> 01:50:18 ] -- executing against localhost:21000
> 01:50:18 ] use functional;
> 01:50:18 ] 
> 01:50:18 ] SET batch_size=0;
> 01:50:18 ] SET num_nodes=0;
> 01:50:18 ] SET disable_codegen_rows_threshold=5000;
> 01:50:18 ] SET disable_codegen=False;
> 01:50:18 ] SET abort_on_error=1;
> 01:50:18 ] SET exec_single_node_rows_threshold=0;
> 01:50:18 ] -- executing against localhost:21000
> 01:50:18 ] drop table if exists __avro_write;
> 01:50:18 ] 
> 01:50:18 ] -- executing against localhost:21000
> 01:50:18 ] SET COMPRESSION_CODEC=NONE;
> 01:50:18 ] 
> 01:50:18 ] -- executing against localhost:21000
> 01:50:18 ] 
> 01:50:18 ] create table __avro_write (i int, s string, d double)
> 01:50:18 ] stored as AVRO
> 01:50:18 ] TBLPROPERTIES ('avro.schema.literal'='{
> 01:50:18 ]   "name": "my_record",
> 01:50:18 ]   "type": "record",
> 01:50:18 ]   "fields": [
> 01:50:18 ]   {"name":"i", "type":["int", "null"]},
> 01:50:18 ]   {"name":"s", "type":["string", "null"]},
> 01:50:18 ]   {"name":"d", "type":["double", "null"]}]}');
> 01:50:18 ] 
> 01:50:18 ] -- executing against localhost:21000
> 01:50:18 ] SET 

[jira] [Resolved] (IMPALA-7055) test_avro_writer failing on upstream Jenkins (Expected exception: "Writing to table format AVRO is not supported")

2018-05-24 Thread Tim Armstrong (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-7055.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0
   Impala 2.13.0

> test_avro_writer failing on upstream Jenkins (Expected exception: "Writing to 
> table format AVRO is not supported")
> --
>
> Key: IMPALA-7055
> URL: https://issues.apache.org/jira/browse/IMPALA-7055
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: David Knupp
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: correctness, flaky
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> This failure occurred while verifying https://gerrit.cloudera.org/c/10455/, 
> but it is not related to that patch. The failing build is 
> https://jenkins.impala.io/job/gerrit-verify-dryrun/2511/ 
> (https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/2232/)
> Test appears to be (from 
> [avro-writer.test|https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/avro-writer.test]):
> {noformat}
> ---- QUERY
> SET ALLOW_UNSUPPORTED_FORMATS=0;
> insert into __avro_write select 1, "b", 2.2;
> ---- CATCH
> Writing to table format AVRO is not supported. Use query option 
> ALLOW_UNSUPPORTED_FORMATS
> {noformat}
> Error output:
> {noformat}
> 01:50:18 ] FAIL 
> query_test/test_compressed_formats.py::TestTableWriters::()::test_avro_writer[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: text/none]
> 01:50:18 ] === FAILURES 
> ===
> 01:50:18 ]  TestTableWriters.test_avro_writer[exec_option: {'batch_size': 0, 
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': 
> False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: text/none] 
> 01:50:18 ] [gw9] linux2 -- Python 2.7.12 
> /home/ubuntu/Impala/bin/../infra/python/env/bin/python
> 01:50:18 ] query_test/test_compressed_formats.py:189: in test_avro_writer
> 01:50:18 ] self.run_test_case('QueryTest/avro-writer', vector)
> 01:50:18 ] common/impala_test_suite.py:420: in run_test_case
> 01:50:18 ] assert False, "Expected exception: %s" % expected_str
> 01:50:18 ] E   AssertionError: Expected exception: Writing to table format 
> AVRO is not supported. Use query option ALLOW_UNSUPPORTED_FORMATS
> 01:50:18 ]  Captured stderr setup 
> -
> 01:50:18 ] -- connecting to: localhost:21000
> 01:50:18 ] - Captured stderr call 
> -
> 01:50:18 ] -- executing against localhost:21000
> 01:50:18 ] use functional;
> 01:50:18 ] 
> 01:50:18 ] SET batch_size=0;
> 01:50:18 ] SET num_nodes=0;
> 01:50:18 ] SET disable_codegen_rows_threshold=5000;
> 01:50:18 ] SET disable_codegen=False;
> 01:50:18 ] SET abort_on_error=1;
> 01:50:18 ] SET exec_single_node_rows_threshold=0;
> 01:50:18 ] -- executing against localhost:21000
> 01:50:18 ] drop table if exists __avro_write;
> 01:50:18 ] 
> 01:50:18 ] -- executing against localhost:21000
> 01:50:18 ] SET COMPRESSION_CODEC=NONE;
> 01:50:18 ] 
> 01:50:18 ] -- executing against localhost:21000
> 01:50:18 ] 
> 01:50:18 ] create table __avro_write (i int, s string, d double)
> 01:50:18 ] stored as AVRO
> 01:50:18 ] TBLPROPERTIES ('avro.schema.literal'='{
> 01:50:18 ]   "name": "my_record",
> 01:50:18 ]   "type": "record",
> 01:50:18 ]   "fields": [
> 01:50:18 ]   {"name":"i", "type":["int", "null"]},
> 01:50:18 ]   {"name":"s", "type":["string", "null"]},
> 01:50:18 ]   {"name":"d", "type":["double", "null"]}]}');
> 01:50:18 ] 
> 01:50:18 ] -- executing against localhost:21000
> 01:50:18 ] SET COMPRESSION_CODEC="";
> 01:50:18 ] 
> 01:50:18 ] -- executing against localhost:21000
> 01:50:18 ] SET COMPRESSION_CODEC=NONE;
> 01:50:18 ] 
> 01:50:18 ] -- executing against localhost:21000
> 01:50:18 ] 
> 01:50:18 ] SET ALLOW_UNSUPPORTED_FORMATS=1;
> 01:50:18 ] 
> 01:50:18 ] -- executing against localhost:21000
> 01:50:18 ] 
> 01:50:18 ] insert into __avro_write select 0, "a", 1.1;
> 01:50:18 ] 
> 01:50:18 ] -- executing against localhost:21000
> 01:50:18 ] SET COMPRESSION_CODEC="";
> 01:50:18 ] 
> 01:50:18 ] -- executing against localhost:21000
> 01:50:18 ] SET ALLOW_UNSUPPORTED_FORMATS="0";
> 01:50:18 ] 
> 01:50:18 ] -- executing against localhost:21000
> 


[jira] [Commented] (IMPALA-6119) Inconsistent file metadata updates when multiple partitions point to the same path

2018-05-24 Thread Gabor Kaszab (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489604#comment-16489604
 ] 

Gabor Kaszab commented on IMPALA-6119:
--

[~bharathv] If I understand your proposal correctly, then within 
resetAndLoadFileMetadata(), changing the for loop that goes through the received 
partitions so that it adds the new file descriptor to each of them should fix 
this issue. Do I understand it correctly?

My issue with this is that resetAndLoadFileMetadata() is apparently not invoked 
when I do an insert into my test table; it most probably goes the other 
direction, towards refreshFileMetadata(). I guess I could do something similar 
in that function as well. However, the 'partitions' parameter of these functions 
would hold only the b=1 partition (using the test case in the description), so 
the other partitions pointing to the same location would still have to be found, 
and we are back to choosing between solutions (1) and (2), right?

Am I missing something?
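For what it's worth, the idea under discussion — applying a newly discovered file descriptor to every partition that shares the updated storage location, not just the partition being inserted into — can be sketched roughly like this. This is Python pseudocode of the Java catalog logic only; the function and field names (refresh_file_metadata, "location", "files") are illustrative, not actual Impala identifiers:

```python
from collections import defaultdict

def refresh_file_metadata(all_partitions, updated_location, new_file_descs):
    """Apply a freshly listed set of file descriptors to *every* partition
    whose storage location matches, so b=1 and b=2 stay consistent when
    they point at the same HDFS directory."""
    by_location = defaultdict(list)
    for part in all_partitions:
        by_location[part["location"]].append(part)
    for part in by_location[updated_location]:
        part["files"] = list(new_file_descs)

parts = [
    {"name": "b=1", "location": "/test-warehouse/test/b=1", "files": ["data.0"]},
    {"name": "b=2", "location": "/test-warehouse/test/b=1", "files": ["data.0"]},
]
# An insert into b=1 produces data.1; both partitions must pick it up.
refresh_file_metadata(parts, "/test-warehouse/test/b=1", ["data.0", "data.1"])
```

This sidesteps the solution (1) vs (2) choice only at the sketch level; in the real catalog the location-to-partition index would still have to be built or maintained somewhere.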

> Inconsistent file metadata updates when multiple partitions point to the same 
> path
> --
>
> Key: IMPALA-6119
> URL: https://issues.apache.org/jira/browse/IMPALA-6119
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
>Reporter: bharath v
>Assignee: Gabor Kaszab
>Priority: Critical
>  Labels: correctness, ramp-up
>
> Following steps can give inconsistent results.
> {noformat}
> // Create a partitioned table
> create table test(a int) partitioned by (b int);
> // Create two partitions b=1/b=2 mapped to the same HDFS location.
> insert into test partition(b=1) values (1);
> alter table test add partition (b=2) location 
> 'hdfs://localhost:20500/test-warehouse/test/b=1/' 
> [localhost:21000] > show partitions test;
> Query: show partitions test
> +---+---++--+--+---++---++
> | b | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | 
> Incremental stats | Location   |
> +---+---++--+--+---++---++
> | 1 | -1| 1  | 2B   | NOT CACHED   | NOT CACHED| TEXT   | 
> false | hdfs://localhost:20500/test-warehouse/test/b=1 |
> | 2 | -1| 1  | 2B   | NOT CACHED   | NOT CACHED| TEXT   | 
> false | hdfs://localhost:20500/test-warehouse/test/b=1 |
> | Total | -1| 2  | 4B   | 0B   |   || 
>   ||
> +---+---++--+--+---++---++
> // Insert new data into one of the partitions
> insert into test partition(b=1) values (2);
> // Newly added file is reflected only in the added partition files. 
> show files in test;
> Query: show files in test
> ++--+---+
> | Path
>| Size | Partition |
> ++--+---+
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=2   |
> ++--+---+
> invalidate metadata test;
>  show files in test;
> // After invalidation, the newly added file now shows up in both the 
> partitions.
> Query: show files in test
> ++--+---+
> | Path
>| Size | Partition |
> ++--+---+
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=1   |
> | 
> 

[jira] [Comment Edited] (IMPALA-7008) TestSpillingDebugActionDimensions.test_spilling test setup fails

2018-05-24 Thread David Knupp (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489602#comment-16489602
 ] 

David Knupp edited comment on IMPALA-7008 at 5/24/18 7:01 PM:
--

[~joemcdonnell] We're trying various experiments -- running the tests in serial 
seems to work, as does running them on a beefier VM. Still looking for a root 
cause as to why it started happening.


was (Author: dknupp):
We're trying various experiments -- running the tests in serial seems to work, 
as does running them on a beefier VM. Still looking for a root cause as to why 
it started happening.

> TestSpillingDebugActionDimensions.test_spilling test setup fails
> 
>
> Key: IMPALA-7008
> URL: https://issues.apache.org/jira/browse/IMPALA-7008
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0, Impala 2.13.0
>Reporter: Sailesh Mukil
>Assignee: David Knupp
>Priority: Blocker
>  Labels: broken-build, flaky
>
> We've seen multiple instances of this test failing with the following error:
> {code:java}
> Error Message
> test setup failure
> Stacktrace
> Slave 'gw0' crashed while running 
> "query_test/test_spilling.py::TestSpillingDebugActionDimensions::()::test_spilling[exec_option:
>  {'debug_action': None, 'default_spillable_buffer_size': '256k'} | 
> table_format: parquet/none]"
> {code}
> We need to investigate why this is happening.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7008) TestSpillingDebugActionDimensions.test_spilling test setup fails

2018-05-24 Thread David Knupp (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489602#comment-16489602
 ] 

David Knupp commented on IMPALA-7008:
-

We're trying various experiments -- running the tests in serial seems to work, 
as does running them on a beefier VM. Still looking for a root cause as to why 
it started happening.
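To make "running the tests in serial" concrete: the crashes above come from pytest-xdist worker slaves (the "gw" processes), and distribution is controlled by the -n flag. A minimal sketch of building the two invocations — assuming pytest-xdist semantics where -n0 keeps everything in the main process; the helper name is hypothetical:

```python
def pytest_args(test_path, workers=0):
    """Build a py.test argv; with pytest-xdist, -n0 keeps everything in the
    main process (serial), while -nN spawns N 'gw' worker slaves."""
    return [test_path, "-n%d" % workers]

serial = pytest_args("query_test/test_spilling.py", workers=0)
parallel = pytest_args("query_test/test_spilling.py", workers=4)
```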

> TestSpillingDebugActionDimensions.test_spilling test setup fails
> 
>
> Key: IMPALA-7008
> URL: https://issues.apache.org/jira/browse/IMPALA-7008
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0, Impala 2.13.0
>Reporter: Sailesh Mukil
>Assignee: David Knupp
>Priority: Blocker
>  Labels: broken-build, flaky
>
> We've seen multiple instances of this test failing with the following error:
> {code:java}
> Error Message
> test setup failure
> Stacktrace
> Slave 'gw0' crashed while running 
> "query_test/test_spilling.py::TestSpillingDebugActionDimensions::()::test_spilling[exec_option:
>  {'debug_action': None, 'default_spillable_buffer_size': '256k'} | 
> table_format: parquet/none]"
> {code}
> We need to investigate why this is happening.






[jira] [Resolved] (IMPALA-6600) py.test error "Replacing crashed slave gw1" in test_spilling

2018-05-24 Thread David Knupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-6600.
-
Resolution: Duplicate

> py.test error "Replacing crashed slave gw1" in test_spilling
> 
>
> Key: IMPALA-6600
> URL: https://issues.apache.org/jira/browse/IMPALA-6600
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Lars Volker
>Assignee: David Knupp
>Priority: Major
>  Labels: broken-build, flaky
>
> I saw a build fail with "Replacing crashed slave gw1". Here's the failing 
> part of the log:
> {noformat}
> 12:18:33 [gw0] PASSED 
> query_test/test_tablesample.py::TestTableSample::test_tablesample[repeatable: 
> False | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
> 0} | table_format: parquet/none] 
> 12:18:33 [gw1] node down: Not properly terminated
> 12:18:33 [gw1] FAILED 
> query_test/test_spilling.py::TestSpillingDebugActionDimensions::test_spilling[exec_option:
>  {'debug_action': None, 'default_spillable_buffer_size': '256k'} | 
> table_format: parquet/none] 
> 12:18:33 Replacing crashed slave gw1
> 12:18:34 
> 12:18:34 unittests/test_file_parser.py::TestTestFileParser::test_valid_parse 
> {noformat}
> Here is the summary:
> {noformat}
> 12:44:14 === FAILURES 
> ===
> 12:44:14 _ query_test/test_spilling.py 
> __
> 12:44:14 [gw1] linux2 -- Python 2.6.6 
> /data/jenkins/workspace/impala-asf-master-core-local/repos/Impala/bin/../infra/python/env/bin/python
> 12:44:14 Slave 'gw1' crashed while running 
> "query_test/test_spilling.py::TestSpillingDebugActionDimensions::()::test_spilling[exec_option:
>  {'debug_action': None, 'default_spillable_buffer_size': '256k'} | 
> table_format: parquet/none]"
> 12:44:14 == 1 failed, 1494 passed, 404 skipped, 9 xfailed in 10374.68 
> seconds ===
> {noformat}
> [~dknupp] - I’m assigning this to you thinking you might have an idea what’s 
> going on here; feel free to find another person or assign back to me if 
> you're swamped.
> I’ve seen this happen in a private Jenkins run. Please ping me if you would 
> like access to the build artifacts.
> I've also seen a similar error message in IMPALA-5724 and in [this GRPC issue 
> on Github|https://github.com/grpc/grpc/issues/3577].






[jira] [Commented] (IMPALA-6600) py.test error "Replacing crashed slave gw1" in test_spilling

2018-05-24 Thread David Knupp (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489592#comment-16489592
 ] 

David Knupp commented on IMPALA-6600:
-

I feel like this is a duplicate of IMPALA-7008.

> py.test error "Replacing crashed slave gw1" in test_spilling
> 
>
> Key: IMPALA-6600
> URL: https://issues.apache.org/jira/browse/IMPALA-6600
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Lars Volker
>Assignee: David Knupp
>Priority: Major
>  Labels: broken-build, flaky
>
> I saw a build fail with "Replacing crashed slave gw1". Here's the failing 
> part of the log:
> {noformat}
> 12:18:33 [gw0] PASSED 
> query_test/test_tablesample.py::TestTableSample::test_tablesample[repeatable: 
> False | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
> 0} | table_format: parquet/none] 
> 12:18:33 [gw1] node down: Not properly terminated
> 12:18:33 [gw1] FAILED 
> query_test/test_spilling.py::TestSpillingDebugActionDimensions::test_spilling[exec_option:
>  {'debug_action': None, 'default_spillable_buffer_size': '256k'} | 
> table_format: parquet/none] 
> 12:18:33 Replacing crashed slave gw1
> 12:18:34 
> 12:18:34 unittests/test_file_parser.py::TestTestFileParser::test_valid_parse 
> {noformat}
> Here is the summary:
> {noformat}
> 12:44:14 === FAILURES 
> ===
> 12:44:14 _ query_test/test_spilling.py 
> __
> 12:44:14 [gw1] linux2 -- Python 2.6.6 
> /data/jenkins/workspace/impala-asf-master-core-local/repos/Impala/bin/../infra/python/env/bin/python
> 12:44:14 Slave 'gw1' crashed while running 
> "query_test/test_spilling.py::TestSpillingDebugActionDimensions::()::test_spilling[exec_option:
>  {'debug_action': None, 'default_spillable_buffer_size': '256k'} | 
> table_format: parquet/none]"
> 12:44:14 == 1 failed, 1494 passed, 404 skipped, 9 xfailed in 10374.68 
> seconds ===
> {noformat}
> [~dknupp] - I’m assigning this to you thinking you might have an idea what’s 
> going on here; feel free to find another person or assign back to me if 
> you're swamped.
> I’ve seen this happen in a private Jenkins run. Please ping me if you would 
> like access to the build artifacts.
> I've also seen a similar error message in IMPALA-5724 and in [this GRPC issue 
> on Github|https://github.com/grpc/grpc/issues/3577].






[jira] [Resolved] (IMPALA-7023) TestInsertQueries.test_insert_overwrite fails by hitting memory limit

2018-05-24 Thread Joe McDonnell (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-7023.
---
Resolution: Fixed

> TestInsertQueries.test_insert_overwrite fails by hitting memory limit
> -
>
> Key: IMPALA-7023
> URL: https://issues.apache.org/jira/browse/IMPALA-7023
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Blocker
>  Labels: broken-build
>
> This failure is seen on exhaustive builds on both master and 2.x:
> {noformat}
> Error Message
> ImpalaBeeswaxException: ImpalaBeeswaxException:  INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>  MESSAGE: AnalysisException: Failed to 
> evaluate expr: 20 CAUSED BY: InternalException: Memory limit exceeded: Error 
> occurred on backend 
> impala-boost-static-burst-slave-el7-03d4.vpc.cloudera.com:22000 by fragment 
> 0:0 Memory left in process limit: -4.29 GB 
> Query(d24ea53242b4cedc:fc8e0885): Total=0 Peak=0   : Total=0 
> Peak=0Process: memory limit exceeded. Limit=12.00 GB Total=16.29 GB 
> Peak=16.29 GB   Buffer Pool: Free Buffers: Total=160.00 KB   Buffer Pool: 
> Clean Pages: Total=0   Buffer Pool: Unused Reservation: Total=-328.00 KB   
> Data Stream Service Queue: Limit=614.40 MB Total=0 Peak=116.12 KB   Data 
> Stream Manager Early RPCs: Total=0 Peak=6.76 KB   TCMalloc Overhead: 
> Total=103.56 MB   RequestPool=fe-eval-exprs: Total=0 Peak=52.83 KB 
> Query(d24ea53242b4cedc:fc8e0885): Total=0 Peak=0   
> RequestPool=default-pool: Total=5.20 GB Peak=5.20 GB 
> Query(f4014f7bb49ea78:6b926b19): memory limit exceeded. Limit=64.00 
> MB Reservation=0 ReservationLimit=32.00 MB OtherMemory=1.76 GB Total=1.76 GB 
> Peak=1.96 GB Query(2c44b65fbcb4e1ce:3d73badf): memory limit 
> exceeded. Limit=64.00 MB Reservation=0 ReservationLimit=32.00 MB 
> OtherMemory=1.04 GB Total=1.04 GB Peak=1.61 GB 
> Query(214cc23c1376176f:7844977b): memory limit exceeded. Limit=64.00 
> MB Reservation=0 ReservationLimit=32.00 MB OtherMemory=1.23 GB Total=1.23 GB 
> Peak=1.23 GB Query(8949bdf792a32ad2:33a36c03): memory limit 
> exceeded. Limit=64.00 MB Reservation=0 ReservationLimit=32.00 MB 
> OtherMemory=642.20 MB Total=642.20 MB Peak=642.20 MB 
> Query(5412ff4e6065721:519d3e61): memory limit exceeded. Limit=64.00 
> MB Reservation=0 ReservationLimit=32.00 MB OtherMemory=556.49 MB Total=556.49 
> MB Peak=556.49 MB   Untracked Memory: Total=10.98 GB
> Stacktrace
> query_test/test_insert.py:132: in test_insert_overwrite
> multiple_impalad=vector.get_value('exec_option')['sync_ddl'] == 1)
> common/impala_test_suite.py:405: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:620: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:339: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:335: in execute_query_async
> return self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:460: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> E    INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>
> E    MESSAGE: AnalysisException: Failed to evaluate expr: 20
> E   CAUSED BY: InternalException: Memory limit exceeded: Error occurred on 
> backend impala-boost-static-burst-slave-el7-03d4.vpc.cloudera.com:22000 by 
> fragment 0:0
> E   Memory left in process limit: -4.29 GB
> E   Query(d24ea53242b4cedc:fc8e0885): Total=0 Peak=0
> E : Total=0 Peak=0Process: memory limit exceeded. Limit=12.00 GB 
> Total=16.29 GB Peak=16.29 GB
> E Buffer Pool: Free Buffers: Total=160.00 KB
> E Buffer Pool: Clean Pages: Total=0
> E Buffer Pool: Unused Reservation: Total=-328.00 KB
> E Data Stream Service Queue: Limit=614.40 MB Total=0 Peak=116.12 KB
> E Data Stream Manager Early RPCs: Total=0 Peak=6.76 KB
> E TCMalloc Overhead: Total=103.56 MB
> E RequestPool=fe-eval-exprs: Total=0 Peak=52.83 KB
> E   Query(d24ea53242b4cedc:fc8e0885): Total=0 Peak=0
> E RequestPool=default-pool: Total=5.20 GB Peak=5.20 GB
> E   Query(f4014f7bb49ea78:6b926b19): memory limit exceeded. 
> Limit=64.00 MB Reservation=0 ReservationLimit=32.00 MB OtherMemory=1.76 GB 
> Total=1.76 GB Peak=1.96 GB
> E   

[jira] [Updated] (IMPALA-7023) TestInsertQueries.test_insert_overwrite fails by hitting memory limit

2018-05-24 Thread Joe McDonnell (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-7023:
--
Fix Version/s: Impala 3.1.0
   Impala 2.13.0

> TestInsertQueries.test_insert_overwrite fails by hitting memory limit
> -
>
> Key: IMPALA-7023
> URL: https://issues.apache.org/jira/browse/IMPALA-7023
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> This failure is seen on exhaustive builds on both master and 2.x:
> {noformat}
> Error Message
> ImpalaBeeswaxException: ImpalaBeeswaxException:  INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>  MESSAGE: AnalysisException: Failed to 
> evaluate expr: 20 CAUSED BY: InternalException: Memory limit exceeded: Error 
> occurred on backend 
> impala-boost-static-burst-slave-el7-03d4.vpc.cloudera.com:22000 by fragment 
> 0:0 Memory left in process limit: -4.29 GB 
> Query(d24ea53242b4cedc:fc8e0885): Total=0 Peak=0   : Total=0 
> Peak=0Process: memory limit exceeded. Limit=12.00 GB Total=16.29 GB 
> Peak=16.29 GB   Buffer Pool: Free Buffers: Total=160.00 KB   Buffer Pool: 
> Clean Pages: Total=0   Buffer Pool: Unused Reservation: Total=-328.00 KB   
> Data Stream Service Queue: Limit=614.40 MB Total=0 Peak=116.12 KB   Data 
> Stream Manager Early RPCs: Total=0 Peak=6.76 KB   TCMalloc Overhead: 
> Total=103.56 MB   RequestPool=fe-eval-exprs: Total=0 Peak=52.83 KB 
> Query(d24ea53242b4cedc:fc8e0885): Total=0 Peak=0   
> RequestPool=default-pool: Total=5.20 GB Peak=5.20 GB 
> Query(f4014f7bb49ea78:6b926b19): memory limit exceeded. Limit=64.00 
> MB Reservation=0 ReservationLimit=32.00 MB OtherMemory=1.76 GB Total=1.76 GB 
> Peak=1.96 GB Query(2c44b65fbcb4e1ce:3d73badf): memory limit 
> exceeded. Limit=64.00 MB Reservation=0 ReservationLimit=32.00 MB 
> OtherMemory=1.04 GB Total=1.04 GB Peak=1.61 GB 
> Query(214cc23c1376176f:7844977b): memory limit exceeded. Limit=64.00 
> MB Reservation=0 ReservationLimit=32.00 MB OtherMemory=1.23 GB Total=1.23 GB 
> Peak=1.23 GB Query(8949bdf792a32ad2:33a36c03): memory limit 
> exceeded. Limit=64.00 MB Reservation=0 ReservationLimit=32.00 MB 
> OtherMemory=642.20 MB Total=642.20 MB Peak=642.20 MB 
> Query(5412ff4e6065721:519d3e61): memory limit exceeded. Limit=64.00 
> MB Reservation=0 ReservationLimit=32.00 MB OtherMemory=556.49 MB Total=556.49 
> MB Peak=556.49 MB   Untracked Memory: Total=10.98 GB
> Stacktrace
> query_test/test_insert.py:132: in test_insert_overwrite
> multiple_impalad=vector.get_value('exec_option')['sync_ddl'] == 1)
> common/impala_test_suite.py:405: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:620: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:339: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:335: in execute_query_async
> return self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:460: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> E    INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>
> E    MESSAGE: AnalysisException: Failed to evaluate expr: 20
> E   CAUSED BY: InternalException: Memory limit exceeded: Error occurred on 
> backend impala-boost-static-burst-slave-el7-03d4.vpc.cloudera.com:22000 by 
> fragment 0:0
> E   Memory left in process limit: -4.29 GB
> E   Query(d24ea53242b4cedc:fc8e0885): Total=0 Peak=0
> E : Total=0 Peak=0Process: memory limit exceeded. Limit=12.00 GB 
> Total=16.29 GB Peak=16.29 GB
> E Buffer Pool: Free Buffers: Total=160.00 KB
> E Buffer Pool: Clean Pages: Total=0
> E Buffer Pool: Unused Reservation: Total=-328.00 KB
> E Data Stream Service Queue: Limit=614.40 MB Total=0 Peak=116.12 KB
> E Data Stream Manager Early RPCs: Total=0 Peak=6.76 KB
> E TCMalloc Overhead: Total=103.56 MB
> E RequestPool=fe-eval-exprs: Total=0 Peak=52.83 KB
> E   Query(d24ea53242b4cedc:fc8e0885): Total=0 Peak=0
> E RequestPool=default-pool: Total=5.20 GB Peak=5.20 GB
> E   Query(f4014f7bb49ea78:6b926b19): memory limit exceeded. 
> Limit=64.00 MB 

[jira] [Resolved] (IMPALA-6813) Hedged reads metrics broken when scanning non-HDFS based table

2018-05-24 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-6813.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0
   Impala 2.13.0

> Hedged reads metrics broken when scanning non-HDFS based table
> --
>
> Key: IMPALA-6813
> URL: https://issues.apache.org/jira/browse/IMPALA-6813
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.12.0
>Reporter: Mostafa Mokhtar
>Assignee: Sailesh Mukil
>Priority: Blocker
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> When preads are enabled, ADLS scans can crash while updating the hedged reads metrics
> {code}
> (gdb) bt
> #0  0x003346c32625 in raise () from /lib64/libc.so.6
> #1  0x003346c33e05 in abort () from /lib64/libc.so.6
> #2  0x7f185be140b5 in os::abort(bool) ()
>from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so
> #3  0x7f185bfb6443 in VMError::report_and_die() ()
>from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so
> #4  0x7f185be195bf in JVM_handle_linux_signal ()
>from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so
> #5  0x7f185be0fb03 in signalHandler(int, siginfo*, void*) ()
>from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so
> #6  
> #7  0x7f185bbc1a7b in jni_invoke_nonstatic(JNIEnv_*, JavaValue*, 
> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) ()
>from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so
> #8  0x7f185bbc7e81 in jni_CallObjectMethodV ()
>from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so
> #9  0x0212e2b7 in invokeMethod ()
> #10 0x02131297 in hdfsGetHedgedReadMetrics ()
> #11 0x011601c0 in impala::io::ScanRange::Close() ()
> #12 0x01158a95 in 
> impala::io::DiskIoMgr::HandleReadFinished(impala::io::DiskIoMgr::DiskQueue*, 
> impala::io::RequestContext*, std::unique_ptr std::default_delete >) ()
> #13 0x01158e1c in 
> impala::io::DiskIoMgr::ReadRange(impala::io::DiskIoMgr::DiskQueue*, 
> impala::io::RequestContext*, impala::io::ScanRange*) ()
> #14 0x01159052 in 
> impala::io::DiskIoMgr::WorkLoop(impala::io::DiskIoMgr::DiskQueue*) ()
> #15 0x00d5fcaf in 
> impala::Thread::SuperviseThread(std::basic_string std::char_traits, std::allocator > const&, 
> std::basic_string 
> const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) ()
> #16 0x00d604aa in boost::detail::thread_data void (*)(std::basic_string > const&, std::basic_string std::allocator > const&, boost::function, 
> impala::ThreadDebugInfo const*, impala::Promise*), 
> boost::_bi::list5 std::char_traits, std::allocator > >, 
> boost::_bi::value std::allocator > >, boost::_bi::value, 
> boost::_bi::value, 
> boost::_bi::value > > >::run() ()
> #17 0x012d6dfa in ?? ()
> #18 0x003347007aa1 in start_thread () from /lib64/libpthread.so.0
> #19 0x003346ce893d in clone () from /lib64/libc.so.6
> {code}
> {code}
> CREATE TABLE adls.lineitem (
>   l_orderkey BIGINT,
>   l_partkey BIGINT,
>   l_suppkey BIGINT,
>   l_linenumber BIGINT,
>   l_quantity DOUBLE,
>   l_extendedprice DOUBLE,
>   l_discount DOUBLE,
>   l_tax DOUBLE,
>   l_returnflag STRING,
>   l_linestatus STRING,
>   l_commitdate STRING,
>   l_receiptdate STRING,
>   l_shipinstruct STRING,
>   l_shipmode STRING,
>   l_comment STRING,
>   l_shipdate STRING
> )
> STORED AS PARQUET
> LOCATION 'adl://foo.azuredatalakestore.net/adls-test.db/lineitem'
> {code}
> select * from adls.lineitem limit 10;
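The backtrace shows hdfsGetHedgedReadMetrics() being invoked from ScanRange::Close() regardless of the underlying filesystem, even though hedged-read metrics only exist for HDFS. The shape of the guard (not the actual backend code — the function names and scheme handling here are illustrative) is to skip the metrics call for non-HDFS handles such as adl:// or s3a://:

```python
from urllib.parse import urlparse

def is_hdfs_path(path):
    """Hedged-read metrics are only meaningful for HDFS-backed files."""
    scheme = urlparse(path).scheme
    # An empty scheme means a default-filesystem path; treat it as HDFS here.
    return scheme in ("", "hdfs")

def maybe_update_hedged_read_metrics(path, fetch_metrics):
    if is_hdfs_path(path):          # skip ADLS/S3 handles entirely
        return fetch_metrics(path)  # stand-in for hdfsGetHedgedReadMetrics()
    return None
```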








[jira] [Commented] (IMPALA-7069) Java UDF tests can trigger a crash in Java ClassLoader

2018-05-24 Thread Tim Armstrong (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489522#comment-16489522
 ] 

Tim Armstrong commented on IMPALA-7069:
---

I saw it on a commit branched off fd7e7c93c5d2ae153784548a2f83f423e85dda43.
AFAIK we didn't see it before yesterday.

> Java UDF tests can trigger a crash in Java ClassLoader
> --
>
> Key: IMPALA-7069
> URL: https://issues.apache.org/jira/browse/IMPALA-7069
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Vuk Ercegovac
>Priority: Critical
>  Labels: crash, flaky
> Attachments: hs_err_pid22764.log, hs_err_pid29246.log, 
> hs_err_pid8975.log, hs_err_pid9694.log
>
>
> I hit this crash on a GVO, but was able to reproduce it on master on my 
> desktop.
> Repro steps:
> {code}
> git checkout c1362afb9a072e49df470d9068d44cdbdf5cdec5
> ./buildall.sh -debug -noclean -notests -skiptests -ninja
> start-impala-cluster.py
> while impala-py.test tests/query_test/test_udfs.py -k 'hive or java or jar' 
> -n4 --verbose; do date; done
> {code}
> I generally hit the crash within an hour of looping the test.
> {noformat}
> Stack: [0x7fa04791f000,0x7fa04812],  sp=0x7fa04811aff0,  free 
> space=8175k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> V  [libjvm.so+0x8a8107]
> V  [libjvm.so+0x96cf5f]
> v  ~RuntimeStub::_complete_monitor_locking_Java
> J 2758 C2 
> java.util.concurrent.ConcurrentHashMap.putVal(Ljava/lang/Object;Ljava/lang/Object;Z)Ljava/lang/Object;
>  (362 bytes) @ 0x7fa0c73637d4 [0x7fa0c7362d00+0xad4]
> J 2311 C2 
> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class; (122 
> bytes) @ 0x7fa0c70a09a8 [0x7fa0c70a08e0+0xc8]
> J 3953 C2 
> java.net.FactoryURLClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;
>  (40 bytes) @ 0x7fa0c71ce0f0 [0x7fa0c71ce0a0+0x50]
> J 2987 C2 
> java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class; (7 
> bytes) @ 0x7fa0c72ddb64 [0x7fa0c72ddb20+0x44]
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6648eb]
> V  [libjvm.so+0x661ec4]
> V  [libjvm.so+0x662523]
> V  [libjvm.so+0x9e398d]
> V  [libjvm.so+0x9e2326]
> V  [libjvm.so+0x9e2b50]
> V  [libjvm.so+0x42c099]
> V  [libjvm.so+0x9dc786]
> V  [libjvm.so+0x6a5edf]
> V  [libjvm.so+0x6a70cb]  JVM_DefineClass+0xbb
> V  [libjvm.so+0xa31ea5]
> V  [libjvm.so+0xa37ea7]
> J 4842  
> sun.misc.Unsafe.defineClass(Ljava/lang/String;[BIILjava/lang/ClassLoader;Ljava/security/ProtectionDomain;)Ljava/lang/Class;
>  (0 bytes) @ 0x7fa0c7af120b [0x7fa0c7af1100+0x10b]
> J 13229 C2 sun.reflect.MethodAccessorGenerator$1.run()Ljava/lang/Object; (5 
> bytes) @ 0x7fa0c8cf2a74 [0x7fa0c8cf2940+0x134]
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6648eb]
> V  [libjvm.so+0x6b5949]  JVM_DoPrivileged+0x429
> J 1035  
> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;)Ljava/lang/Object;
>  (0 bytes) @ 0x7fa0c7220c7f [0x7fa0c7220bc0+0xbf]
> J 20421 C2 
> sun.reflect.MethodAccessorGenerator.generate(Ljava/lang/Class;Ljava/lang/String;[Ljava/lang/Class;Ljava/lang/Class;[Ljava/lang/Class;IZZLjava/lang/Class;)Lsun/reflect/MagicAccessorImpl;
>  (762 bytes) @ 0x7fa0c89bb848 [0x7fa0c89b9da0+0x1aa8]
> J 4163 C2 
> sun.reflect.NativeMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
>  (104 bytes) @ 0x7fa0c789cca8 [0x7fa0c789c8c0+0x3e8]
> J 2379 C2 org.apache.impala.hive.executor.UdfExecutor.evaluate()V (396 bytes) 
> @ 0x7fa0c711c638 [0x7fa0c711c400+0x238]
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6648eb]
> V  [libjvm.so+0x6822d7]
> V  [libjvm.so+0x6862c9]
> C  [impalad+0x2a004fa]  JNIEnv_::CallNonvirtualVoidMethodA(_jobject*, 
> _jclass*, _jmethodID*, jvalue const*)+0x40
> C  [impalad+0x29fe4ff]  
> impala::HiveUdfCall::Evaluate(impala::ScalarExprEvaluator*, impala::TupleRow 
> const*) const+0x44b
> C  [impalad+0x29ffde9]  
> impala::HiveUdfCall::GetSmallIntVal(impala::ScalarExprEvaluator*, 
> impala::TupleRow const*) const+0xbb
> C  [impalad+0x2a0948a]  
> impala::ScalarExprEvaluator::GetValue(impala::ScalarExpr const&, 
> impala::TupleRow const*)+0x14c
> C  [impalad+0x2a48eb1]  
> impala::ScalarFnCall::EvaluateNonConstantChildren(impala::ScalarExprEvaluator*,
>  impala::TupleRow const*) const+0x9d
> C  [impalad+0x2a4abba]  impala_udf::BooleanVal 
> impala::ScalarFnCall::InterpretEval(impala::ScalarExprEvaluator*,
>  impala::TupleRow const*) const+0x18c
> C  [impalad+0x2a4907d]  
> impala::ScalarFnCall::GetBooleanVal(impala::ScalarExprEvaluator*, 
> impala::TupleRow const*) const+0x179
> C  [impalad+0x2a09c7f]  
> 

[jira] [Work started] (IMPALA-6812) Kudu scans not returning all rows

2018-05-24 Thread Thomas Tauber-Marshall (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-6812 started by Thomas Tauber-Marshall.
--
> Kudu scans not returning all rows
> -
>
> Key: IMPALA-6812
> URL: https://issues.apache.org/jira/browse/IMPALA-6812
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.0, Impala 2.13.0
>Reporter: Tianyi Wang
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>  Labels: broken-build
>
> In a 2.x exhaustive build, test_column_storage_attributes failed:
> {noformat}
> Error Message
> query_test/test_kudu.py:383: in test_column_storage_attributes assert 
> cursor.fetchall() == \ E   assert [] == [(26, True, 0, 0, 0, 0, ...)] E 
> Right contains more items, first extra item: (26, True, 0, 0, 0, 0, ...) E
>  Use -v to get the full diff
> Stacktrace
> query_test/test_kudu.py:383: in test_column_storage_attributes
> assert cursor.fetchall() == \
> E   assert [] == [(26, True, 0, 0, 0, 0, ...)]
> E Right contains more items, first extra item: (26, True, 0, 0, 0, 0, ...)
> E Use -v to get the full diff
> {noformat}
> The last alter column query in the log is:
> {noformat}
>  alter table test_column_storage_attributes_b9040aa.storage_attrs alter 
> column decimal_col
> set encoding DICT_ENCODING compression NO_COMPRESSION
> {noformat}






[jira] [Work started] (IMPALA-6947) kudu: GetTableLocations RPC timing out with ASAN

2018-05-24 Thread Thomas Tauber-Marshall (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-6947 started by Thomas Tauber-Marshall.
--
> kudu: GetTableLocations RPC timing out with ASAN
> 
>
> Key: IMPALA-6947
> URL: https://issues.apache.org/jira/browse/IMPALA-6947
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.13.0
>Reporter: Michael Brown
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>
> {noformat}
> query_test/test_kudu.py:84: in test_kudu_insert
> self.run_test_case('QueryTest/kudu_insert', vector, 
> use_db=unique_database)
> common/impala_test_suite.py:398: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:613: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:341: in __execute_query
> self.wait_for_completion(handle)
> beeswax/impala_beeswax.py:361: in wait_for_completion
> raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EQuery aborted:Kudu error(s) reported, first error: Timed out: 
> GetTableLocations { table: 'impala::test_kudu_insert_70eff904.kudu_test', 
> partition-key: (HASH (a, b): 2), attempt: 1 } failed: GetTableLocations RPC 
> to 127.0.0.1:7051 timed out after 10.000s (SENT)
> E   
> E   Key already present in Kudu table 
> 'impala::test_kudu_insert_70eff904.kudu_test'. (1 of 3 similar)
> E   Error in Kudu table 'impala::test_kudu_insert_70eff904.kudu_test': Timed 
> out: GetTableLocations { table: 
> 'impala::test_kudu_insert_70eff904.kudu_test', partition-key: (HASH (a, b): 
> 2), attempt: 1 } failed: GetTableLocations RPC to 127.0.0.1:7051 timed out 
> after 10.000s (SENT) (1 of 21 similar)
> {noformat}






[jira] [Resolved] (IMPALA-7063) Miniprofile 2 compilation broken on trunk

2018-05-24 Thread Philip Zeyliger (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger resolved IMPALA-7063.
-
Resolution: Fixed
  Assignee: Philip Zeyliger

> Miniprofile 2 compilation broken on trunk
> -
>
> Key: IMPALA-7063
> URL: https://issues.apache.org/jira/browse/IMPALA-7063
> Project: IMPALA
>  Issue Type: Task
>  Components: Frontend
>Reporter: Philip Zeyliger
>Assignee: Philip Zeyliger
>Priority: Major
>
> The commit for IMPALA-7019 used {{FileStatus.isErasureCoded()}} which doesn't 
> exist in Hadoop 2.






[jira] [Commented] (IMPALA-387) Make "REFRESH" command a SQL statement rather than RPC

2018-05-24 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489472#comment-16489472
 ] 

ASF subversion and git services commented on IMPALA-387:


Commit c98c01c55d7f6af7e536347986c5b22841bc78e7 in impala's branch 
refs/heads/2.x from [~csringhofer]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=c98c01c ]

IMPALA-6131: Track time of last statistics update in metadata

The timestamp of the last COMPUTE STATS operation is saved to
table property "impala.lastComputeStatsTime". The format is
the same as in "transient_lastDdlTime", so the two can be
compared to check if the schema has changed since computing
statistics.

Other changes:
- Handling of "transient_lastDdlTime" is simplified - the old
  logic set it to current time + 1, if the old version was
  >= current time, to ensure that it is always increased by
  DDL operations. This was useful in the past, as IMPALA-387
  used lastDdlTime to check if partition data needs to be
  reloaded, but since IMPALA-1480, Impala does not rely on
  lastDdlTime at all.

- Computing / setting stats on HDFS tables no longer increases
  "transient_lastDdlTime".

- When Kudu tables are (re)loaded, it is checked if their
  HMS representation is up to date, and if it is, then
  IMetaStoreClient.alter_table() is not called. The old
  logic always called alter_table() after loading metadata
  from Kudu. This change was needed to ensure that
  "transient_lastDdlTime" works similarly in HDFS and Kudu
  tables, and should also make (re)loading Kudu tables faster.

Notes:
- Kudu will be able to sync its tables to HMS in the near
  future (see KUDU-2191), so the Kudu metadata handling in
  Impala may need to be redesigned.

Testing:
tests/metadata/test_last_ddl_time_update.py is extended by
- also checking "impala.lastComputeStatsTime"
- testing more SQL statements
- tests for Kudu tables

Note that test_last_ddl_time_update.py is run only in
exhaustive testing.

Change-Id: Ibda49725d3e76456f2d1b3edd1bf117b0174e234
Reviewed-on: http://gerrit.cloudera.org:8080/10484
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins 


> Make "REFRESH" command a SQL statement rather than RPC
> --
>
> Key: IMPALA-387
> URL: https://issues.apache.org/jira/browse/IMPALA-387
> Project: IMPALA
>  Issue Type: New Feature
>Affects Versions: Impala 1.0
>Reporter: Lenni Kuff
>Assignee: Alan Choi
>Priority: Major
> Fix For: Impala 1.1
>
>
> It would be good to make "REFRESH" a first-class SQL statement rather than 
> just an RPC. This would allow users to submit refreshes outside of the 
> impala-shell (for example - via JDBC/ODBC). Initially, we would need to 
> support both a full catalog refresh as well as a table-level refresh:
> REFRESH;
> REFRESH <table>;
> IMPALA-339 may introduce some additional syntax to choose between a RELOAD 
> and a REFRESH so that should be covered as well.
> As part of the change the impala-shell should be updated to submit refreshes 
> using the regular "query" API rather than calling ResetCatalog/ResetTable
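The staleness check described in the commit message above (comparing "impala.lastComputeStatsTime" against "transient_lastDdlTime", both Unix timestamps in seconds) can be sketched as follows. This is an editor's illustration in Python, not Impala code: the two property names come from the commit message, while the helper function and the sample values are hypothetical.

```python
# Minimal sketch of the staleness check: both table properties hold Unix
# timestamps in seconds (the "transient_lastDdlTime" format), so comparing
# them tells whether the schema changed after the last COMPUTE STATS.
# Property names are from the commit message; the helper is illustrative.

def stats_are_stale(tbl_properties):
    """Return True if DDL happened after the last COMPUTE STATS."""
    last_ddl = int(tbl_properties.get("transient_lastDdlTime", 0))
    last_stats = int(tbl_properties.get("impala.lastComputeStatsTime", 0))
    if last_stats == 0:
        return True  # stats were never computed
    return last_ddl > last_stats

print(stats_are_stale({"transient_lastDdlTime": "1527100000",
                       "impala.lastComputeStatsTime": "1527000000"}))  # prints True
```

Because both properties use the same format, no extra metadata beyond the two HMS table properties is needed for the comparison.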






[jira] [Commented] (IMPALA-7063) Miniprofile 2 compilation broken on trunk

2018-05-24 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489474#comment-16489474
 ] 

ASF subversion and git services commented on IMPALA-7063:
-

Commit 879f106d1b28380fbe71680d04cf6df568b3daa3 in impala's branch 
refs/heads/master from [~philip]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=879f106 ]

IMPALA-7063: Fix compilation for MiniProfile2 after Erasure Coding changes.

This commit shims out HDFS's FileStatus.isErasureCoded() to manage
working with multiple versions of Hadoop.

I tested compilation with both profiles.

Cherry-picks: not for 2.x.

Change-Id: I423087078f84b0806545322519f224d58815123d
Reviewed-on: http://gerrit.cloudera.org:8080/10487
Reviewed-by: Philip Zeyliger 
Tested-by: Impala Public Jenkins 


> Miniprofile 2 compilation broken on trunk
> -
>
> Key: IMPALA-7063
> URL: https://issues.apache.org/jira/browse/IMPALA-7063
> Project: IMPALA
>  Issue Type: Task
>  Components: Frontend
>Reporter: Philip Zeyliger
>Priority: Major
>
> The commit for IMPALA-7019 used {{FileStatus.isErasureCoded()}} which doesn't 
> exist in Hadoop 2.






[jira] [Commented] (IMPALA-6131) Track time of last statistics update in metadata

2018-05-24 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489471#comment-16489471
 ] 

ASF subversion and git services commented on IMPALA-6131:
-

Commit c98c01c55d7f6af7e536347986c5b22841bc78e7 in impala's branch 
refs/heads/2.x from [~csringhofer]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=c98c01c ]

IMPALA-6131: Track time of last statistics update in metadata

The timestamp of the last COMPUTE STATS operation is saved to
table property "impala.lastComputeStatsTime". The format is
the same as in "transient_lastDdlTime", so the two can be
compared to check if the schema has changed since computing
statistics.

Other changes:
- Handling of "transient_lastDdlTime" is simplified - the old
  logic set it to current time + 1, if the old version was
  >= current time, to ensure that it is always increased by
  DDL operations. This was useful in the past, as IMPALA-387
  used lastDdlTime to check if partition data needs to be
  reloaded, but since IMPALA-1480, Impala does not rely on
  lastDdlTime at all.

- Computing / setting stats on HDFS tables no longer increases
  "transient_lastDdlTime".

- When Kudu tables are (re)loaded, it is checked if their
  HMS representation is up to date, and if it is, then
  IMetaStoreClient.alter_table() is not called. The old
  logic always called alter_table() after loading metadata
  from Kudu. This change was needed to ensure that
  "transient_lastDdlTime" works similarly in HDFS and Kudu
  tables, and should also make (re)loading Kudu tables faster.

Notes:
- Kudu will be able to sync its tables to HMS in the near
  future (see KUDU-2191), so the Kudu metadata handling in
  Impala may need to be redesigned.

Testing:
tests/metadata/test_last_ddl_time_update.py is extended by
- also checking "impala.lastComputeStatsTime"
- testing more SQL statements
- tests for Kudu tables

Note that test_last_ddl_time_update.py is run only in
exhaustive testing.

Change-Id: Ibda49725d3e76456f2d1b3edd1bf117b0174e234
Reviewed-on: http://gerrit.cloudera.org:8080/10484
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins 


> Track time of last statistics update in metadata
> 
>
> Key: IMPALA-6131
> URL: https://issues.apache.org/jira/browse/IMPALA-6131
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend, Frontend
>Reporter: Lars Volker
>Assignee: Csaba Ringhofer
>Priority: Major
>  Labels: ramp-up
>
> Currently we (ab-)use {{transient_lastDdlTime}} to track the last update time 
> of statistics. Instead we should introduce a separate counter to track the 
> last update. With that we should also remove all occurrences of 
> {{catalog_.updateLastDdlTime()}} from {{CatalogOpExecutor}} and fall back to 
> Hive's default behavior.






[jira] [Commented] (IMPALA-4025) add functions PERCENTILE_DISC(), PERCENTILE_CONT(), and MEDIAN()

2018-05-24 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489475#comment-16489475
 ] 

ASF subversion and git services commented on IMPALA-4025:
-

Commit 1ca077fd065bc5689e7614073c174b5a2bb11d96 in impala's branch 
refs/heads/master from [~tianyiwang]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=1ca077f ]

IMPALA-4025: Part 1: Generalize and cleanup StmtRewriter

This patch generalizes StmtRewriter, allowing it to be subclassed. The
base class would traverse the stmt tree while the subclasses can install
hooks to execute specific rewrite rules at certain places. Existing
rewriting rules are moved into SubqueryRewriter.

Change-Id: I9e7a6108d3d49be12ae032fdb54b5c3c23152a47
Reviewed-on: http://gerrit.cloudera.org:8080/10495
Reviewed-by: Vuk Ercegovac 
Tested-by: Impala Public Jenkins 


> add functions PERCENTILE_DISC(), PERCENTILE_CONT(), and MEDIAN()
> 
>
> Key: IMPALA-4025
> URL: https://issues.apache.org/jira/browse/IMPALA-4025
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend, Frontend
>Affects Versions: Impala 2.2.4
>Reporter: Greg Rahn
>Assignee: Tianyi Wang
>Priority: Major
>  Labels: built-in-function, sql-language
>
> Add the following functions as both an aggregate function and window/analytic 
> function:
> * PERCENTILE_CONT
> * PERCENTILE_DISC
> * MEDIAN (implemented as PERCENTILE_CONT(0.5))
> h6. Syntax
> {code}
> PERCENTILE_CONT(<percentile>) WITHIN GROUP (ORDER BY <expr> [ASC|DESC] 
> [NULLS {FIRST | LAST}]) [ OVER ([<window-partition-clause>]) ]
> PERCENTILE_DISC(<percentile>) WITHIN GROUP (ORDER BY <expr> [ASC|DESC] 
> [NULLS {FIRST | LAST}]) [ OVER ([<window-partition-clause>]) ]
> MEDIAN(expr) [ OVER () ]
> {code}
> h6. Notes from other systems
> *Greenplum*
> {code}
> PERCENTILE_CONT(_percentage_) WITHIN GROUP (ORDER BY _expression_)
> {code}
> http://gpdb.docs.pivotal.io/4320/admin_guide/query.html
> Greenplum Database provides the MEDIAN aggregate function, which returns the 
> fiftieth percentile of the PERCENTILE_CONT result and special aggregate 
> expressions for inverse distribution functions as follows:
> Currently you can use only these two expressions with the keyword WITHIN 
> GROUP.
> Note: aggregate function only
> *Oracle*
> {code}
> PERCENTILE_CONT(expr) WITHIN GROUP (ORDER BY expr [ DESC | ASC ]) [ OVER 
> (query_partition_clause) ]
> {code}
> http://docs.oracle.com/database/121/SQLRF/functions141.htm#SQLRF00687
> Note: implemented as both an aggregate and window function
> *Vertica*
> {code}
> PERCENTILE_CONT ( %_number ) WITHIN GROUP (... ORDER BY expression [ ASC | 
> DESC ] ) OVER (... [ window-partition-clause ] )
> {code}
> https://my.vertica.com/docs/7.2.x/HTML/index.htm#Authoring/SQLReferenceManual/Functions/Analytic/PERCENTILE_CONTAnalytic.htm
> Note: window function only
> *Teradata*
> {code}
> PERCENTILE_CONT(<percentile>) WITHIN GROUP (ORDER BY <expr> 
> [asc | desc] [nulls {first | last}])
> {code}
> Note: aggregate function only
> *Netezza*
> {code}
> SELECT <fn>(<percentile>) WITHIN GROUP (ORDER BY <expr> [asc|desc] [nulls 
> {first | last}]) FROM <table> [GROUP BY <expr-list>];
> {code}
> https://www.ibm.com/support/knowledgecenter/SSULQD_7.2.1/com.ibm.nz.dbu.doc/c_dbuser_inverse_distribution_funcs_family_syntax.html
> Note: aggregate function only
> *Redshift*
> {code}
> PERCENTILE_CONT ( percentile ) WITHIN GROUP (ORDER BY expr) OVER (  [ 
> PARTITION BY expr_list ]  )
> {code}
> https://www.ibm.com/support/knowledgecenter/SSULQD_7.2.1/com.ibm.nz.dbu.doc/c_dbuser_inverse_distribution_funcs_family_syntax.html
> Note: window function only
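For reference, the standard SQL semantics of the two inverse distribution functions requested here can be sketched in a few lines of Python (an editor's illustration of the semantics, not Impala's implementation): PERCENTILE_CONT interpolates linearly between adjacent sorted values, while PERCENTILE_DISC returns the first actual value whose cumulative distribution reaches the given fraction; MEDIAN(x) is then PERCENTILE_CONT(0.5).

```python
# Illustrative sketch of standard SQL inverse distribution semantics.
import math

def percentile_cont(values, p):
    """Continuous percentile: interpolate between adjacent sorted values."""
    xs = sorted(values)
    rn = p * (len(xs) - 1)               # continuous row number in [0, n-1]
    lo, hi = math.floor(rn), math.ceil(rn)
    return xs[lo] + (rn - lo) * (xs[hi] - xs[lo])

def percentile_disc(values, p):
    """Discrete percentile: first value whose cumulative distribution >= p."""
    xs = sorted(values)
    n = len(xs)
    for i, v in enumerate(xs, start=1):
        if i / n >= p:
            return v

data = [10, 20, 30, 40]
print(percentile_cont(data, 0.5))   # prints 25.0 (interpolated, not a row value)
print(percentile_disc(data, 0.5))   # prints 20 (always an actual row value)
```

The difference matters for the design: PERCENTILE_DISC always returns a value from the input column (so it keeps the column's type), while PERCENTILE_CONT may produce an interpolated value that appears nowhere in the data.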






[jira] [Commented] (IMPALA-1480) Slow DDL statements for tables with large number of partitions

2018-05-24 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489473#comment-16489473
 ] 

ASF subversion and git services commented on IMPALA-1480:
-

Commit c98c01c55d7f6af7e536347986c5b22841bc78e7 in impala's branch 
refs/heads/2.x from [~csringhofer]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=c98c01c ]

IMPALA-6131: Track time of last statistics update in metadata

The timestamp of the last COMPUTE STATS operation is saved to
table property "impala.lastComputeStatsTime". The format is
the same as in "transient_lastDdlTime", so the two can be
compared to check if the schema has changed since computing
statistics.

Other changes:
- Handling of "transient_lastDdlTime" is simplified - the old
  logic set it to current time + 1, if the old version was
  >= current time, to ensure that it is always increased by
  DDL operations. This was useful in the past, as IMPALA-387
  used lastDdlTime to check if partition data needs to be
  reloaded, but since IMPALA-1480, Impala does not rely on
  lastDdlTime at all.

- Computing / setting stats on HDFS tables no longer increases
  "transient_lastDdlTime".

- When Kudu tables are (re)loaded, it is checked if their
  HMS representation is up to date, and if it is, then
  IMetaStoreClient.alter_table() is not called. The old
  logic always called alter_table() after loading metadata
  from Kudu. This change was needed to ensure that
  "transient_lastDdlTime" works similarly in HDFS and Kudu
  tables, and should also make (re)loading Kudu tables faster.

Notes:
- Kudu will be able to sync its tables to HMS in the near
  future (see KUDU-2191), so the Kudu metadata handling in
  Impala may need to be redesigned.

Testing:
tests/metadata/test_last_ddl_time_update.py is extended by
- also checking "impala.lastComputeStatsTime"
- testing more SQL statements
- tests for Kudu tables

Note that test_last_ddl_time_update.py is run only in
exhaustive testing.

Change-Id: Ibda49725d3e76456f2d1b3edd1bf117b0174e234
Reviewed-on: http://gerrit.cloudera.org:8080/10484
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins 


> Slow DDL statements for tables with large number of partitions
> --
>
> Key: IMPALA-1480
> URL: https://issues.apache.org/jira/browse/IMPALA-1480
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 2.0
>Reporter: Dimitris Tsirogiannis
>Assignee: Dimitris Tsirogiannis
>Priority: Critical
>  Labels: impala, performance
> Fix For: Impala 2.5.0
>
>
> Impala users sometimes report that DDL statements (e.g. alter table partition 
> set location...) are taking multiple seconds (>5) for partitioned tables with 
> large number of partitions. The same operations are significantly faster in 
> hive (sub-second response time). 
> Use case:
> * 2 node cluster
> * Single table (24 columns, 3 partition keys) with 2500 partitions
> * alter table foo partition (foo_i = i) set location 'hdfs://.' takes 
> approximately 5-6sec (0.2 in HIVE)
> * 1 sec delay in the alter stmt is caused by 
> https://issues.apache.org/jira/browse/HIVE-5524






[jira] [Commented] (IMPALA-7069) Java UDF tests can trigger a crash in Java ClassLoader

2018-05-24 Thread Dan Hecht (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489466#comment-16489466
 ] 

Dan Hecht commented on IMPALA-7069:
---

Any idea of the earliest git hash at which we first saw this? Presumably it's a 
regression, right?

> Java UDF tests can trigger a crash in Java ClassLoader
> --
>
> Key: IMPALA-7069
> URL: https://issues.apache.org/jira/browse/IMPALA-7069
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Vuk Ercegovac
>Priority: Critical
>  Labels: crash, flaky
> Attachments: hs_err_pid22764.log, hs_err_pid29246.log, 
> hs_err_pid8975.log, hs_err_pid9694.log
>
>
> I hit this crash on a GVO, but was able to reproduce it on master on my 
> desktop.
> Repro steps:
> {code}
> git checkout c1362afb9a072e49df470d9068d44cdbdf5cdec5
> ./buildall.sh -debug -noclean -notests -skiptests -ninja
> start-impala-cluster.py
> while impala-py.test tests/query_test/test_udfs.py -k 'hive or java or jar' 
> -n4 --verbose; do date; done
> {code}
> I generally hit the crash within an hour of looping the test.
> {noformat}
> Stack: [0x7fa04791f000,0x7fa04812],  sp=0x7fa04811aff0,  free 
> space=8175k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> V  [libjvm.so+0x8a8107]
> V  [libjvm.so+0x96cf5f]
> v  ~RuntimeStub::_complete_monitor_locking_Java
> J 2758 C2 
> java.util.concurrent.ConcurrentHashMap.putVal(Ljava/lang/Object;Ljava/lang/Object;Z)Ljava/lang/Object;
>  (362 bytes) @ 0x7fa0c73637d4 [0x7fa0c7362d00+0xad4]
> J 2311 C2 
> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class; (122 
> bytes) @ 0x7fa0c70a09a8 [0x7fa0c70a08e0+0xc8]
> J 3953 C2 
> java.net.FactoryURLClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;
>  (40 bytes) @ 0x7fa0c71ce0f0 [0x7fa0c71ce0a0+0x50]
> J 2987 C2 
> java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class; (7 
> bytes) @ 0x7fa0c72ddb64 [0x7fa0c72ddb20+0x44]
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6648eb]
> V  [libjvm.so+0x661ec4]
> V  [libjvm.so+0x662523]
> V  [libjvm.so+0x9e398d]
> V  [libjvm.so+0x9e2326]
> V  [libjvm.so+0x9e2b50]
> V  [libjvm.so+0x42c099]
> V  [libjvm.so+0x9dc786]
> V  [libjvm.so+0x6a5edf]
> V  [libjvm.so+0x6a70cb]  JVM_DefineClass+0xbb
> V  [libjvm.so+0xa31ea5]
> V  [libjvm.so+0xa37ea7]
> J 4842  
> sun.misc.Unsafe.defineClass(Ljava/lang/String;[BIILjava/lang/ClassLoader;Ljava/security/ProtectionDomain;)Ljava/lang/Class;
>  (0 bytes) @ 0x7fa0c7af120b [0x7fa0c7af1100+0x10b]
> J 13229 C2 sun.reflect.MethodAccessorGenerator$1.run()Ljava/lang/Object; (5 
> bytes) @ 0x7fa0c8cf2a74 [0x7fa0c8cf2940+0x134]
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6648eb]
> V  [libjvm.so+0x6b5949]  JVM_DoPrivileged+0x429
> J 1035  
> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;)Ljava/lang/Object;
>  (0 bytes) @ 0x7fa0c7220c7f [0x7fa0c7220bc0+0xbf]
> J 20421 C2 
> sun.reflect.MethodAccessorGenerator.generate(Ljava/lang/Class;Ljava/lang/String;[Ljava/lang/Class;Ljava/lang/Class;[Ljava/lang/Class;IZZLjava/lang/Class;)Lsun/reflect/MagicAccessorImpl;
>  (762 bytes) @ 0x7fa0c89bb848 [0x7fa0c89b9da0+0x1aa8]
> J 4163 C2 
> sun.reflect.NativeMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
>  (104 bytes) @ 0x7fa0c789cca8 [0x7fa0c789c8c0+0x3e8]
> J 2379 C2 org.apache.impala.hive.executor.UdfExecutor.evaluate()V (396 bytes) 
> @ 0x7fa0c711c638 [0x7fa0c711c400+0x238]
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6648eb]
> V  [libjvm.so+0x6822d7]
> V  [libjvm.so+0x6862c9]
> C  [impalad+0x2a004fa]  JNIEnv_::CallNonvirtualVoidMethodA(_jobject*, 
> _jclass*, _jmethodID*, jvalue const*)+0x40
> C  [impalad+0x29fe4ff]  
> impala::HiveUdfCall::Evaluate(impala::ScalarExprEvaluator*, impala::TupleRow 
> const*) const+0x44b
> C  [impalad+0x29ffde9]  
> impala::HiveUdfCall::GetSmallIntVal(impala::ScalarExprEvaluator*, 
> impala::TupleRow const*) const+0xbb
> C  [impalad+0x2a0948a]  
> impala::ScalarExprEvaluator::GetValue(impala::ScalarExpr const&, 
> impala::TupleRow const*)+0x14c
> C  [impalad+0x2a48eb1]  
> impala::ScalarFnCall::EvaluateNonConstantChildren(impala::ScalarExprEvaluator*,
>  impala::TupleRow const*) const+0x9d
> C  [impalad+0x2a4abba]  impala_udf::BooleanVal 
> impala::ScalarFnCall::InterpretEval(impala::ScalarExprEvaluator*,
>  impala::TupleRow const*) const+0x18c
> C  [impalad+0x2a4907d]  
> impala::ScalarFnCall::GetBooleanVal(impala::ScalarExprEvaluator*, 
> impala::TupleRow const*) const+0x179
> C  [impalad+0x2a09c7f]  
> impala::ScalarExprEvaluator::GetBooleanVal(impala::TupleRow*)+0x37
> C  

[jira] [Commented] (IMPALA-3316) convert_legacy_hive_parquet_utc_timestamps=true makes reading parquet tables 30x slower

2018-05-24 Thread Boris Tyukin (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489443#comment-16489443
 ] 

Boris Tyukin commented on IMPALA-3316:
--

[~attilaj] would you be so kind as to share an update on the fix? IMHO this should be 
classified as a major issue, not a minor one, and deserves more attention: Parquet 
is a recommended format for Impala, and most companies I know use Hive to 
process data for Impala to consume. I wonder why it does not get more 
attention. 

> convert_legacy_hive_parquet_utc_timestamps=true makes reading parquet tables 
> 30x slower
> ---
>
> Key: IMPALA-3316
> URL: https://issues.apache.org/jira/browse/IMPALA-3316
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: impala 2.3
> Environment: CDH 5.5.2/ Impala 2.3
> Parquet table with a timestamp column
> Secure cluster
> convert_legacy_hive_parquet_utc_timestamps=true
> Timestamp column is not being filtered on
>Reporter: Ruslan Dautkhanov
>Assignee: Attila Jeges
>Priority: Minor
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> Enabling convert_legacy_hive_parquet_utc_timestamps=true
> makes simple queries that don't even filter on a timestamp attribute perform 
> really poorly.
> Parquet table.
> Impala 2.3 / CDH 5.5.2.
> convert_legacy_hive_parquet_utc_timestamps=true makes following simple query 
> 30x slower (1.1minutes -> over 30 minutes).
> {quote} select * from parquet_table_with_a_timestamp_attribute where 
> bigint_attribute=1000771658169 {quote}
> Notice I did not even filter on a timestamp attribute.
> Made multiple tests with and without 
> convert_legacy_hive_parquet_utc_timestamps=true impalad present.
> Also, from https://issues.cloudera.org/browse/IMPALA-1658
> {quote} Casey Ching added a comment - 15/Jun/15 5:12 PM
> Btw, a perf test showed enabling this flag was 10x slower. {quote}
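The cost being reported comes from doing a UTC-to-local timezone conversion on every timestamp value read. A rough, self-contained sketch of that per-row overhead (illustrative Python only, not Impala's C++ implementation; the function names are invented for this example):

```python
import time
from datetime import datetime, timezone

def to_local_per_row(epochs):
    # One timezone conversion per value, analogous to the per-slot
    # UTC->local conversion the startup flag enables.
    return [datetime.fromtimestamp(e, tz=timezone.utc).astimezone() for e in epochs]

def passthrough(epochs):
    # Baseline: timestamps materialized as stored, with no local conversion.
    return [datetime.fromtimestamp(e, tz=timezone.utc) for e in epochs]

epochs = range(50_000)
t0 = time.perf_counter()
to_local_per_row(epochs)
t1 = time.perf_counter()
passthrough(epochs)
t2 = time.perf_counter()
print(f"with conversion: {t1 - t0:.3f}s, without: {t2 - t1:.3f}s")
```

Even in this toy form, the converting path does strictly more work per value, which is why the penalty scales with the number of timestamp values scanned rather than with the selectivity of the query.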






[jira] [Created] (IMPALA-7070) Failed test: query_test.test_nested_types.TestParquetArrayEncodings.test_thrift_array_of_arrays on S3

2018-05-24 Thread Dimitris Tsirogiannis (JIRA)
Dimitris Tsirogiannis created IMPALA-7070:
-

 Summary: Failed test: 
query_test.test_nested_types.TestParquetArrayEncodings.test_thrift_array_of_arrays
 on S3
 Key: IMPALA-7070
 URL: https://issues.apache.org/jira/browse/IMPALA-7070
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.0
Reporter: Dimitris Tsirogiannis


 
{code:java}
Error Message

query_test/test_nested_types.py:406: in test_thrift_array_of_arrays "col1 
array") query_test/test_nested_types.py:579: in _create_test_table  
   check_call(["hadoop", "fs", "-put", local_path, location], shell=False) 
/usr/lib64/python2.6/subprocess.py:505: in check_call raise 
CalledProcessError(retcode, cmd) E   CalledProcessError: Command '['hadoop', 
'fs', '-put', 
'/data/jenkins/workspace/impala-asf-2.x-core-s3/repos/Impala/testdata/parquet_nested_types_encodings/bad-thrift.parquet',
 
's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays']'
 returned non-zero exit status 1

Stacktrace

query_test/test_nested_types.py:406: in test_thrift_array_of_arrays
"col1 array")
query_test/test_nested_types.py:579: in _create_test_table
check_call(["hadoop", "fs", "-put", local_path, location], shell=False)
/usr/lib64/python2.6/subprocess.py:505: in check_call
raise CalledProcessError(retcode, cmd)
E   CalledProcessError: Command '['hadoop', 'fs', '-put', 
'/data/jenkins/workspace/impala-asf-2.x-core-s3/repos/Impala/testdata/parquet_nested_types_encodings/bad-thrift.parquet',
 
's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays']'
 returned non-zero exit status 1

Standard Error

SET sync_ddl=False;
-- executing against localhost:21000
DROP DATABASE IF EXISTS `test_thrift_array_of_arrays_11da5fde` CASCADE;

SET sync_ddl=False;
-- executing against localhost:21000
CREATE DATABASE `test_thrift_array_of_arrays_11da5fde`;

MainThread: Created database "test_thrift_array_of_arrays_11da5fde" for test ID 
"query_test/test_nested_types.py::TestParquetArrayEncodings::()::test_thrift_array_of_arrays[exec_option:
 {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
'exec_single_node_rows_threshold': 0} | table_format: parquet/none]"
-- executing against localhost:21000
create table test_thrift_array_of_arrays_11da5fde.ThriftArrayOfArrays (col1 
array) stored as parquet location 
's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays';

18/05/20 18:31:03 WARN impl.MetricsConfig: Cannot locate configuration: tried 
hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
18/05/20 18:31:03 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 10 
second(s).
18/05/20 18:31:03 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
started
18/05/20 18:31:06 INFO Configuration.deprecation: 
fs.s3a.server-side-encryption-key is deprecated. Instead, use 
fs.s3a.server-side-encryption.key
put: rename 
`s3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays/bad-thrift.parquet._COPYING_'
 to 
`s3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays/bad-thrift.parquet':
 Input/output error
18/05/20 18:31:08 INFO impl.MetricsSystemImpl: Stopping s3a-file-system metrics 
system...
18/05/20 18:31:08 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
stopped.
18/05/20 18:31:08 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
shutdown complete.{code}
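The failure above is a transient S3 rename error surfacing through a single `check_call` in `_create_test_table`. One way to make such a step tolerant of flaky S3 I/O is to retry the put with backoff; a hypothetical wrapper (the injectable `run` parameter exists only to make the sketch testable, and is not part of the actual test suite):

```python
import subprocess
import time

def put_with_retries(local_path, dest, run=subprocess.check_call,
                     attempts=3, backoff_s=2.0):
    """Run `hadoop fs -put`, retrying on transient failures.

    Re-raises CalledProcessError only after the final attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            run(["hadoop", "fs", "-put", local_path, dest])
            return
        except subprocess.CalledProcessError:
            if attempt == attempts:
                raise
            time.sleep(backoff_s * attempt)  # linear backoff between tries
```

Note that a retry only papers over the flakiness; if the rename error is deterministic for a given object store, the retries all fail and the original exception still propagates.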






[jira] [Resolved] (IMPALA-6998) test_bloom_wait_time fails due to late arrival of filters on Isilon

2018-05-24 Thread Sailesh Mukil (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-6998.
---
   Resolution: Fixed
Fix Version/s: Impala 2.13.0

> test_bloom_wait_time fails due to late arrival of filters on Isilon
> ---
>
> Key: IMPALA-6998
> URL: https://issues.apache.org/jira/browse/IMPALA-6998
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.13.0
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Critical
>  Labels: broken-build
> Fix For: Impala 2.13.0
>
>
> This is likely a flaky issue and was seen on an instance of an Isilon run:
> {code:java}
> Error Message
> query_test/test_runtime_filters.py:92: in test_bloom_wait_time assert 
> duration < 60, \ E   AssertionError: Query took too long (118.044356108s, 
> possibly waiting for missing filters?) E   assert 118.04435610771179 < 60
> Stacktrace
> query_test/test_runtime_filters.py:92: in test_bloom_wait_time
> assert duration < 60, \
> E   AssertionError: Query took too long (118.044356108s, possibly waiting for 
> missing filters?)
> E   assert 118.04435610771179 < 60
> Standard Error
> -- executing against localhost:21000
> use functional_parquet;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> SET RUNTIME_FILTER_WAIT_TIME_MS=60;
> -- executing against localhost:21000
> SET RUNTIME_FILTER_MODE=GLOBAL;
> -- executing against localhost:21000
> SET RUNTIME_FILTER_MAX_SIZE=64K;
> -- executing against localhost:21000
> with l as (select * from tpch.lineitem UNION ALL select * from tpch.lineitem)
> select STRAIGHT_JOIN count(*) from (select * from tpch.lineitem a LIMIT 1) a
> join (select * from l LIMIT 50) b on a.l_orderkey = -b.l_orderkey;
> -- executing against localhost:21000
> SET RUNTIME_FILTER_WAIT_TIME_MS="0";
> -- executing against localhost:21000
> SET RUNTIME_FILTER_MODE="GLOBAL";
> -- executing against localhost:21000
> SET RUNTIME_FILTER_MAX_SIZE="16777216";
> {code}
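The assertion that failed is a wall-clock bound on the query: with a 60 ms filter wait time, the whole query should finish well under 60 s unless the join is scanning unfiltered data. The shape of that check can be sketched as follows (an illustrative helper with invented names, not the actual test code):

```python
import time

def assert_completes_within(run_query, limit_s):
    """Run run_query and assert it finished within limit_s seconds,
    mirroring the duration check in test_bloom_wait_time."""
    start = time.time()
    run_query()
    duration = time.time() - start
    assert duration < limit_s, (
        "Query took too long (%ss, possibly waiting for missing filters?)"
        % duration)
    return duration
```

On slow filesystems such as Isilon, filters can legitimately arrive after the wait window expires, so the query falls back to scanning without them and blows past the bound, which is why this reads as a flaky, environment-dependent failure.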





[jira] [Commented] (IMPALA-6119) Inconsistent file metadata updates when multiple partitions point to the same path

2018-05-24 Thread bharath v (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489393#comment-16489393
 ] 

bharath v commented on IMPALA-6119:
---

Not sure about Spark, but based on my local experiments, it looks like we are 
consistent with Hive.

> Inconsistent file metadata updates when multiple partitions point to the same 
> path
> --
>
> Key: IMPALA-6119
> URL: https://issues.apache.org/jira/browse/IMPALA-6119
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
>Reporter: bharath v
>Assignee: Gabor Kaszab
>Priority: Critical
>  Labels: correctness, ramp-up
>
> Following steps can give inconsistent results.
> {noformat}
> // Create a partitioned table
> create table test(a int) partitioned by (b int);
> // Create two partitions b=1/b=2 mapped to the same HDFS location.
> insert into test partition(b=1) values (1);
> alter table test add partition (b=2) location 
> 'hdfs://localhost:20500/test-warehouse/test/b=1/' 
> [localhost:21000] > show partitions test;
> Query: show partitions test
> +---+---++--+--+---++---++
> | b | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | 
> Incremental stats | Location   |
> +---+---++--+--+---++---++
> | 1 | -1| 1  | 2B   | NOT CACHED   | NOT CACHED| TEXT   | 
> false | hdfs://localhost:20500/test-warehouse/test/b=1 |
> | 2 | -1| 1  | 2B   | NOT CACHED   | NOT CACHED| TEXT   | 
> false | hdfs://localhost:20500/test-warehouse/test/b=1 |
> | Total | -1| 2  | 4B   | 0B   |   || 
>   ||
> +---+---++--+--+---++---++
> // Insert new data into one of the partitions
> insert into test partition(b=1) values (2);
> // Newly added file is reflected only in the added partition files. 
> show files in test;
> Query: show files in test
> ++--+---+
> | Path
>| Size | Partition |
> ++--+---+
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=2   |
> ++--+---+
> invalidate metadata test;
>  show files in test;
> // After invalidation, the newly added file now shows up in both the 
> partitions.
> Query: show files in test
> ++--+---+
> | Path
>| Size | Partition |
> ++--+---+
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=2   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=2   |
> ++--+---+
> {noformat}
> So, depending whether the user invalidates the table, they can see different 
> results. The bug is in the following code.
> {noformat}
> private FileMetadataLoadStats resetAndLoadFileMetadata(
>   Path partDir, List partitions) throws IOException {
> FileMetadataLoadStats 
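The fix direction discussed in the comments, updating every partition that shares a directory rather than only the one being written, can be sketched as follows (a simplified Python model with invented names, not the actual Java catalog code):

```python
from collections import defaultdict

def load_file_metadata(partitions, list_dir):
    """Assign the same file list to every partition sharing a directory.

    partitions: list of dicts with a "location" key.
    list_dir:   callable returning the file names under a path.

    Grouping by location first means an insert into b=1 is also visible
    through b=2 when both partitions point at the same HDFS path, which
    is the consistency the bug report asks for.
    """
    by_dir = defaultdict(list)
    for part in partitions:
        by_dir[part["location"]].append(part)
    for path, parts in by_dir.items():
        fds = list_dir(path)           # one listing per unique directory
        for part in parts:
            part["files"] = list(fds)  # every co-located partition sees all files
    return partitions
```

The bug arises precisely because the incremental-update path refreshes only the partition named in the DML statement, while a full `invalidate metadata` reload effectively performs the grouped listing above, so the two code paths disagree until invalidation.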

[jira] [Comment Edited] (IMPALA-6119) Inconsistent file metadata updates when multiple partitions point to the same path

2018-05-24 Thread bharath v (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489304#comment-16489304
 ] 

bharath v edited comment on IMPALA-6119 at 5/24/18 4:53 PM:


Thanks for digging into this. I vaguely remember that we considered the above 
solutions and, like you mentioned, (1) is more computationally heavy and (2) 
adds more memory, and the Catalog is already notorious for its memory usage. :) 

What are your thoughts on the fix I proposed? We may exploit the references and 
update the source fds directly. 


was (Author: bharathv):
Thanks for digging into this. I vaguely remember that we considered the above 
solutions and like you mentioned  (1) is more computationally heavy and (2) 
adds more memory and the Catalog is already notorious for its memory usage.:) 

What are your on the fix I proposed? We make exploit the references and update 
the source fds directly. 

> Inconsistent file metadata updates when multiple partitions point to the same 
> path
> --
>
> Key: IMPALA-6119
> URL: https://issues.apache.org/jira/browse/IMPALA-6119
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
>Reporter: bharath v
>Assignee: Gabor Kaszab
>Priority: Critical
>  Labels: correctness, ramp-up
>
> Following steps can give inconsistent results.
> {noformat}
> // Create a partitioned table
> create table test(a int) partitioned by (b int);
> // Create two partitions b=1/b=2 mapped to the same HDFS location.
> insert into test partition(b=1) values (1);
> alter table test add partition (b=2) location 
> 'hdfs://localhost:20500/test-warehouse/test/b=1/' 
> [localhost:21000] > show partitions test;
> Query: show partitions test
> +---+---++--+--+---++---++
> | b | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | 
> Incremental stats | Location   |
> +---+---++--+--+---++---++
> | 1 | -1| 1  | 2B   | NOT CACHED   | NOT CACHED| TEXT   | 
> false | hdfs://localhost:20500/test-warehouse/test/b=1 |
> | 2 | -1| 1  | 2B   | NOT CACHED   | NOT CACHED| TEXT   | 
> false | hdfs://localhost:20500/test-warehouse/test/b=1 |
> | Total | -1| 2  | 4B   | 0B   |   || 
>   ||
> +---+---++--+--+---++---++
> // Insert new data into one of the partitions
> insert into test partition(b=1) values (2);
> // Newly added file is reflected only in the added partition files. 
> show files in test;
> Query: show files in test
> ++--+---+
> | Path
>| Size | Partition |
> ++--+---+
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=2   |
> ++--+---+
> invalidate metadata test;
>  show files in test;
> // After invalidation, the newly added file now shows up in both the 
> partitions.
> Query: show files in test
> ++--+---+
> | Path
>| Size | Partition |
> ++--+---+
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=1   |
> | 
> 

[jira] [Commented] (IMPALA-6119) Inconsistent file metadata updates when multiple partitions point to the same path

2018-05-24 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489375#comment-16489375
 ] 

Philip Zeyliger commented on IMPALA-6119:
-

Does it make sense to be allowing two partitions to have the same location? Is 
Impala's behavior consistent with Hive and Spark when this happens?

> Inconsistent file metadata updates when multiple partitions point to the same 
> path
> --
>
> Key: IMPALA-6119
> URL: https://issues.apache.org/jira/browse/IMPALA-6119
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
>Reporter: bharath v
>Assignee: Gabor Kaszab
>Priority: Critical
>  Labels: correctness, ramp-up
>
> Following steps can give inconsistent results.
> {noformat}
> // Create a partitioned table
> create table test(a int) partitioned by (b int);
> // Create two partitions b=1/b=2 mapped to the same HDFS location.
> insert into test partition(b=1) values (1);
> alter table test add partition (b=2) location 
> 'hdfs://localhost:20500/test-warehouse/test/b=1/' 
> [localhost:21000] > show partitions test;
> Query: show partitions test
> +---+---++--+--+---++---++
> | b | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | 
> Incremental stats | Location   |
> +---+---++--+--+---++---++
> | 1 | -1| 1  | 2B   | NOT CACHED   | NOT CACHED| TEXT   | 
> false | hdfs://localhost:20500/test-warehouse/test/b=1 |
> | 2 | -1| 1  | 2B   | NOT CACHED   | NOT CACHED| TEXT   | 
> false | hdfs://localhost:20500/test-warehouse/test/b=1 |
> | Total | -1| 2  | 4B   | 0B   |   || 
>   ||
> +---+---++--+--+---++---++
> // Insert new data into one of the partitions
> insert into test partition(b=1) values (2);
> // Newly added file is reflected only in the added partition files. 
> show files in test;
> Query: show files in test
> ++--+---+
> | Path
>| Size | Partition |
> ++--+---+
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=2   |
> ++--+---+
> invalidate metadata test;
>  show files in test;
> // After invalidation, the newly added file now shows up in both the 
> partitions.
> Query: show files in test
> ++--+---+
> | Path
>| Size | Partition |
> ++--+---+
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=2   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=2   |
> ++--+---+
> {noformat}
> So, depending whether the user invalidates the table, they can see different 
> results. The bug is in the following code.
> {noformat}
> private FileMetadataLoadStats resetAndLoadFileMetadata(
>   Path partDir, List 

[jira] [Work started] (IMPALA-5552) Proxy user list should support groups

2018-05-24 Thread Fredy Wijaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-5552 started by Fredy Wijaya.

> Proxy user list should support groups
> -
>
> Key: IMPALA-5552
> URL: https://issues.apache.org/jira/browse/IMPALA-5552
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Tristan Stevens
>Assignee: Fredy Wijaya
>Priority: Critical
>
> The authorized_proxy_user_config takes a map of user->doAsUser* - i.e. user 
> is allowed to impersonate any users in the list of doAsUsers.
> For enterprise deployments, this would be better specified as a list of 
> groups, rather than a list of users:
> user1->group*
> When accepting a query, Impala will check that the doAs user is a member of 
> any of the list of groups specified for the connecting user.
> HiveServer2 does this via Hadoop-level proxy user privileges (e.g.
> {{<property>
>   <name>hadoop.proxyuser.user1.hosts</name>
>   <value>doAsUser1,doAsUser2</value>
> </property>
> <property>
>   <name>hadoop.proxyuser.user1.groups</name>
>   <value>doAsGroup1,doAsGroup2</value>
> </property>}}
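The membership check the request describes, verifying that the doAs user belongs to one of the groups configured for the connecting user, can be sketched like this (illustrative Python with invented names; the real implementation would live in the Impala frontend and resolve groups via Hadoop's group mapping):

```python
def is_authorized_doas(connected_user, doas_user, proxy_groups, user_groups):
    """Return True if connected_user may impersonate doas_user.

    proxy_groups: map of connecting user -> set of allowed groups,
                  where "*" means any group (user1->group*).
    user_groups:  map of user -> set of groups the user belongs to.
    """
    allowed = proxy_groups.get(connected_user, set())
    if "*" in allowed:
        return True
    # Authorized iff the doAs user is a member of at least one allowed group.
    return bool(allowed & user_groups.get(doas_user, set()))
```

Specifying groups instead of enumerating users keeps the flag stable as people join or leave teams, which is the operational advantage the request is after.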






[jira] [Commented] (IMPALA-7069) Java UDF tests can trigger a crash in Java ClassLoader

2018-05-24 Thread Tim Armstrong (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489358#comment-16489358
 ] 

Tim Armstrong commented on IMPALA-7069:
---

Assigned to Vuk since he's looked at the UDF loading most recently. I attached 
the hs_err_pid files from my repro attempts.

> Java UDF tests can trigger a crash in Java ClassLoader
> --
>
> Key: IMPALA-7069
> URL: https://issues.apache.org/jira/browse/IMPALA-7069
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Vuk Ercegovac
>Priority: Critical
>  Labels: crash, flaky
> Attachments: hs_err_pid22764.log, hs_err_pid29246.log, 
> hs_err_pid8975.log, hs_err_pid9694.log
>
>
> I hit this crash on a GVO, but was able to reproduce it on master on my 
> desktop.
> Repro steps:
> {code}
> git checkout c1362afb9a072e49df470d9068d44cdbdf5cdec5
> ./buildall.sh -debug -noclean -notests -skiptests -ninja
> start-impala-cluster.py
> while impala-py.test tests/query_test/test_udfs.py -k 'hive or java or jar' 
> -n4 --verbose; do date; done
> {code}
> I generally hit the crash within an hour of looping the test.
> {noformat}
> Stack: [0x7fa04791f000,0x7fa04812],  sp=0x7fa04811aff0,  free 
> space=8175k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> V  [libjvm.so+0x8a8107]
> V  [libjvm.so+0x96cf5f]
> v  ~RuntimeStub::_complete_monitor_locking_Java
> J 2758 C2 
> java.util.concurrent.ConcurrentHashMap.putVal(Ljava/lang/Object;Ljava/lang/Object;Z)Ljava/lang/Object;
>  (362 bytes) @ 0x7fa0c73637d4 [0x7fa0c7362d00+0xad4]
> J 2311 C2 
> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class; (122 
> bytes) @ 0x7fa0c70a09a8 [0x7fa0c70a08e0+0xc8]
> J 3953 C2 
> java.net.FactoryURLClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;
>  (40 bytes) @ 0x7fa0c71ce0f0 [0x7fa0c71ce0a0+0x50]
> J 2987 C2 
> java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class; (7 
> bytes) @ 0x7fa0c72ddb64 [0x7fa0c72ddb20+0x44]
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6648eb]
> V  [libjvm.so+0x661ec4]
> V  [libjvm.so+0x662523]
> V  [libjvm.so+0x9e398d]
> V  [libjvm.so+0x9e2326]
> V  [libjvm.so+0x9e2b50]
> V  [libjvm.so+0x42c099]
> V  [libjvm.so+0x9dc786]
> V  [libjvm.so+0x6a5edf]
> V  [libjvm.so+0x6a70cb]  JVM_DefineClass+0xbb
> V  [libjvm.so+0xa31ea5]
> V  [libjvm.so+0xa37ea7]
> J 4842  
> sun.misc.Unsafe.defineClass(Ljava/lang/String;[BIILjava/lang/ClassLoader;Ljava/security/ProtectionDomain;)Ljava/lang/Class;
>  (0 bytes) @ 0x7fa0c7af120b [0x7fa0c7af1100+0x10b]
> J 13229 C2 sun.reflect.MethodAccessorGenerator$1.run()Ljava/lang/Object; (5 
> bytes) @ 0x7fa0c8cf2a74 [0x7fa0c8cf2940+0x134]
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6648eb]
> V  [libjvm.so+0x6b5949]  JVM_DoPrivileged+0x429
> J 1035  
> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;)Ljava/lang/Object;
>  (0 bytes) @ 0x7fa0c7220c7f [0x7fa0c7220bc0+0xbf]
> J 20421 C2 
> sun.reflect.MethodAccessorGenerator.generate(Ljava/lang/Class;Ljava/lang/String;[Ljava/lang/Class;Ljava/lang/Class;[Ljava/lang/Class;IZZLjava/lang/Class;)Lsun/reflect/MagicAccessorImpl;
>  (762 bytes) @ 0x7fa0c89bb848 [0x7fa0c89b9da0+0x1aa8]
> J 4163 C2 
> sun.reflect.NativeMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
>  (104 bytes) @ 0x7fa0c789cca8 [0x7fa0c789c8c0+0x3e8]
> J 2379 C2 org.apache.impala.hive.executor.UdfExecutor.evaluate()V (396 bytes) 
> @ 0x7fa0c711c638 [0x7fa0c711c400+0x238]
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6648eb]
> V  [libjvm.so+0x6822d7]
> V  [libjvm.so+0x6862c9]
> C  [impalad+0x2a004fa]  JNIEnv_::CallNonvirtualVoidMethodA(_jobject*, 
> _jclass*, _jmethodID*, jvalue const*)+0x40
> C  [impalad+0x29fe4ff]  
> impala::HiveUdfCall::Evaluate(impala::ScalarExprEvaluator*, impala::TupleRow 
> const*) const+0x44b
> C  [impalad+0x29ffde9]  
> impala::HiveUdfCall::GetSmallIntVal(impala::ScalarExprEvaluator*, 
> impala::TupleRow const*) const+0xbb
> C  [impalad+0x2a0948a]  
> impala::ScalarExprEvaluator::GetValue(impala::ScalarExpr const&, 
> impala::TupleRow const*)+0x14c
> C  [impalad+0x2a48eb1]  
> impala::ScalarFnCall::EvaluateNonConstantChildren(impala::ScalarExprEvaluator*,
>  impala::TupleRow const*) const+0x9d
> C  [impalad+0x2a4abba]  impala_udf::BooleanVal 
> impala::ScalarFnCall::InterpretEval(impala::ScalarExprEvaluator*,
>  impala::TupleRow const*) const+0x18c
> C  [impalad+0x2a4907d]  
> impala::ScalarFnCall::GetBooleanVal(impala::ScalarExprEvaluator*, 
> impala::TupleRow const*) const+0x179
> C  [impalad+0x2a09c7f]  
> 

[jira] [Updated] (IMPALA-7069) Java UDF tests can trigger a crash in Java ClassLoader

2018-05-24 Thread Tim Armstrong (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7069:
--
Summary: Java UDF tests can trigger a crash in Java ClassLoader  (was: Java 
UDF tests can trigger a crash in 
java.util.concurrent.ConcurrentHashMap.putVal())

> Java UDF tests can trigger a crash in Java ClassLoader
> --
>
> Key: IMPALA-7069
> URL: https://issues.apache.org/jira/browse/IMPALA-7069
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Vuk Ercegovac
>Priority: Critical
>  Labels: crash, flaky
> Attachments: hs_err_pid22764.log, hs_err_pid29246.log, 
> hs_err_pid8975.log, hs_err_pid9694.log
>
>
> I hit this crash on a GVO, but was able to reproduce it on master on my 
> desktop.
> Repro steps:
> {code}
> git checkout c1362afb9a072e49df470d9068d44cdbdf5cdec5
> ./buildall.sh -debug -noclean -notests -skiptests -ninja
> start-impala-cluster.py
> while impala-py.test tests/query_test/test_udfs.py -k 'hive or java or jar' 
> -n4 --verbose; do date; done
> {code}
> I generally hit the crash within an hour of looping the test.
> {noformat}
> Stack: [0x7fa04791f000,0x7fa04812],  sp=0x7fa04811aff0,  free 
> space=8175k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> V  [libjvm.so+0x8a8107]
> V  [libjvm.so+0x96cf5f]
> v  ~RuntimeStub::_complete_monitor_locking_Java
> J 2758 C2 
> java.util.concurrent.ConcurrentHashMap.putVal(Ljava/lang/Object;Ljava/lang/Object;Z)Ljava/lang/Object;
>  (362 bytes) @ 0x7fa0c73637d4 [0x7fa0c7362d00+0xad4]
> J 2311 C2 
> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class; (122 
> bytes) @ 0x7fa0c70a09a8 [0x7fa0c70a08e0+0xc8]
> J 3953 C2 
> java.net.FactoryURLClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;
>  (40 bytes) @ 0x7fa0c71ce0f0 [0x7fa0c71ce0a0+0x50]
> J 2987 C2 
> java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class; (7 
> bytes) @ 0x7fa0c72ddb64 [0x7fa0c72ddb20+0x44]
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6648eb]
> V  [libjvm.so+0x661ec4]
> V  [libjvm.so+0x662523]
> V  [libjvm.so+0x9e398d]
> V  [libjvm.so+0x9e2326]
> V  [libjvm.so+0x9e2b50]
> V  [libjvm.so+0x42c099]
> V  [libjvm.so+0x9dc786]
> V  [libjvm.so+0x6a5edf]
> V  [libjvm.so+0x6a70cb]  JVM_DefineClass+0xbb
> V  [libjvm.so+0xa31ea5]
> V  [libjvm.so+0xa37ea7]
> J 4842  
> sun.misc.Unsafe.defineClass(Ljava/lang/String;[BIILjava/lang/ClassLoader;Ljava/security/ProtectionDomain;)Ljava/lang/Class;
>  (0 bytes) @ 0x7fa0c7af120b [0x7fa0c7af1100+0x10b]
> J 13229 C2 sun.reflect.MethodAccessorGenerator$1.run()Ljava/lang/Object; (5 
> bytes) @ 0x7fa0c8cf2a74 [0x7fa0c8cf2940+0x134]
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6648eb]
> V  [libjvm.so+0x6b5949]  JVM_DoPrivileged+0x429
> J 1035  
> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;)Ljava/lang/Object;
>  (0 bytes) @ 0x7fa0c7220c7f [0x7fa0c7220bc0+0xbf]
> J 20421 C2 
> sun.reflect.MethodAccessorGenerator.generate(Ljava/lang/Class;Ljava/lang/String;[Ljava/lang/Class;Ljava/lang/Class;[Ljava/lang/Class;IZZLjava/lang/Class;)Lsun/reflect/MagicAccessorImpl;
>  (762 bytes) @ 0x7fa0c89bb848 [0x7fa0c89b9da0+0x1aa8]
> J 4163 C2 
> sun.reflect.NativeMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
>  (104 bytes) @ 0x7fa0c789cca8 [0x7fa0c789c8c0+0x3e8]
> J 2379 C2 org.apache.impala.hive.executor.UdfExecutor.evaluate()V (396 bytes) 
> @ 0x7fa0c711c638 [0x7fa0c711c400+0x238]
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6648eb]
> V  [libjvm.so+0x6822d7]
> V  [libjvm.so+0x6862c9]
> C  [impalad+0x2a004fa]  JNIEnv_::CallNonvirtualVoidMethodA(_jobject*, 
> _jclass*, _jmethodID*, jvalue const*)+0x40
> C  [impalad+0x29fe4ff]  
> impala::HiveUdfCall::Evaluate(impala::ScalarExprEvaluator*, impala::TupleRow 
> const*) const+0x44b
> C  [impalad+0x29ffde9]  
> impala::HiveUdfCall::GetSmallIntVal(impala::ScalarExprEvaluator*, 
> impala::TupleRow const*) const+0xbb
> C  [impalad+0x2a0948a]  
> impala::ScalarExprEvaluator::GetValue(impala::ScalarExpr const&, 
> impala::TupleRow const*)+0x14c
> C  [impalad+0x2a48eb1]  
> impala::ScalarFnCall::EvaluateNonConstantChildren(impala::ScalarExprEvaluator*,
>  impala::TupleRow const*) const+0x9d
> C  [impalad+0x2a4abba]  impala_udf::BooleanVal 
> impala::ScalarFnCall::InterpretEval(impala::ScalarExprEvaluator*,
>  impala::TupleRow const*) const+0x18c
> C  [impalad+0x2a4907d]  
> impala::ScalarFnCall::GetBooleanVal(impala::ScalarExprEvaluator*, 
> impala::TupleRow const*) const+0x179
> C  [impalad+0x2a09c7f]  
> 

[jira] [Created] (IMPALA-7069) Java UDF tests can trigger a crash in java.util.concurrent.ConcurrentHashMap.putVal()

2018-05-24 Thread Tim Armstrong (JIRA)
Tim Armstrong created IMPALA-7069:
-

 Summary: Java UDF tests can trigger a crash in 
java.util.concurrent.ConcurrentHashMap.putVal()
 Key: IMPALA-7069
 URL: https://issues.apache.org/jira/browse/IMPALA-7069
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.1.0
Reporter: Tim Armstrong
Assignee: Vuk Ercegovac
 Attachments: hs_err_pid22764.log, hs_err_pid29246.log, 
hs_err_pid8975.log, hs_err_pid9694.log

I hit this crash on a GVO, but was able to reproduce it on master on my desktop.

Repro steps:
{code}
git checkout c1362afb9a072e49df470d9068d44cdbdf5cdec5
./buildall.sh -debug -noclean -notests -skiptests -ninja
start-impala-cluster.py
while impala-py.test tests/query_test/test_udfs.py -k 'hive or java or jar' -n4 
--verbose; do date; done
{code}
I generally hit the crash within an hour of looping the test.

{noformat}
Stack: [0x7fa04791f000,0x7fa04812],  sp=0x7fa04811aff0,  free 
space=8175k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x8a8107]
V  [libjvm.so+0x96cf5f]
v  ~RuntimeStub::_complete_monitor_locking_Java
J 2758 C2 
[jira] [Created] (IMPALA-7069) Java UDF tests can trigger a crash in java.util.concurrent.ConcurrentHashMap.putVal()

2018-05-24 Thread Tim Armstrong (JIRA)
Tim Armstrong created IMPALA-7069:
-

 Summary: Java UDF tests can trigger a crash in 
java.util.concurrent.ConcurrentHashMap.putVal()
 Key: IMPALA-7069
 URL: https://issues.apache.org/jira/browse/IMPALA-7069
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.1.0
Reporter: Tim Armstrong
Assignee: Vuk Ercegovac
 Attachments: hs_err_pid22764.log, hs_err_pid29246.log, 
hs_err_pid8975.log, hs_err_pid9694.log

I hit this crash on a GVO, but was able to reproduce it on master on my desktop.

Repro steps:
{code}
git checkout c1362afb9a072e49df470d9068d44cdbdf5cdec5
./buildall.sh -debug -noclean -notests -skiptests -ninja
start-impala-cluster.py
while impala-py.test tests/query_test/test_udfs.py -k 'hive or java or jar' -n4 
--verbose; do date; done
{code}
I generally hit the crash within an hour of looping the test.

{noformat}
Stack: [0x7fa04791f000,0x7fa04812],  sp=0x7fa04811aff0,  free 
space=8175k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x8a8107]
V  [libjvm.so+0x96cf5f]
v  ~RuntimeStub::_complete_monitor_locking_Java
J 2758 C2 
java.util.concurrent.ConcurrentHashMap.putVal(Ljava/lang/Object;Ljava/lang/Object;Z)Ljava/lang/Object;
 (362 bytes) @ 0x7fa0c73637d4 [0x7fa0c7362d00+0xad4]
J 2311 C2 java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class; 
(122 bytes) @ 0x7fa0c70a09a8 [0x7fa0c70a08e0+0xc8]
J 3953 C2 
java.net.FactoryURLClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class; 
(40 bytes) @ 0x7fa0c71ce0f0 [0x7fa0c71ce0a0+0x50]
J 2987 C2 java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class; 
(7 bytes) @ 0x7fa0c72ddb64 [0x7fa0c72ddb20+0x44]
v  ~StubRoutines::call_stub
V  [libjvm.so+0x6648eb]
V  [libjvm.so+0x661ec4]
V  [libjvm.so+0x662523]
V  [libjvm.so+0x9e398d]
V  [libjvm.so+0x9e2326]
V  [libjvm.so+0x9e2b50]
V  [libjvm.so+0x42c099]
V  [libjvm.so+0x9dc786]
V  [libjvm.so+0x6a5edf]
V  [libjvm.so+0x6a70cb]  JVM_DefineClass+0xbb
V  [libjvm.so+0xa31ea5]
V  [libjvm.so+0xa37ea7]
J 4842  
sun.misc.Unsafe.defineClass(Ljava/lang/String;[BIILjava/lang/ClassLoader;Ljava/security/ProtectionDomain;)Ljava/lang/Class;
 (0 bytes) @ 0x7fa0c7af120b [0x7fa0c7af1100+0x10b]
J 13229 C2 sun.reflect.MethodAccessorGenerator$1.run()Ljava/lang/Object; (5 
bytes) @ 0x7fa0c8cf2a74 [0x7fa0c8cf2940+0x134]
v  ~StubRoutines::call_stub
V  [libjvm.so+0x6648eb]
V  [libjvm.so+0x6b5949]  JVM_DoPrivileged+0x429
J 1035  
java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;)Ljava/lang/Object;
 (0 bytes) @ 0x7fa0c7220c7f [0x7fa0c7220bc0+0xbf]
J 20421 C2 
sun.reflect.MethodAccessorGenerator.generate(Ljava/lang/Class;Ljava/lang/String;[Ljava/lang/Class;Ljava/lang/Class;[Ljava/lang/Class;IZZLjava/lang/Class;)Lsun/reflect/MagicAccessorImpl;
 (762 bytes) @ 0x7fa0c89bb848 [0x7fa0c89b9da0+0x1aa8]
J 4163 C2 
sun.reflect.NativeMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
 (104 bytes) @ 0x7fa0c789cca8 [0x7fa0c789c8c0+0x3e8]
J 2379 C2 org.apache.impala.hive.executor.UdfExecutor.evaluate()V (396 bytes) @ 
0x7fa0c711c638 [0x7fa0c711c400+0x238]
v  ~StubRoutines::call_stub
V  [libjvm.so+0x6648eb]
V  [libjvm.so+0x6822d7]
V  [libjvm.so+0x6862c9]
C  [impalad+0x2a004fa]  JNIEnv_::CallNonvirtualVoidMethodA(_jobject*, _jclass*, 
_jmethodID*, jvalue const*)+0x40
C  [impalad+0x29fe4ff]  
impala::HiveUdfCall::Evaluate(impala::ScalarExprEvaluator*, impala::TupleRow 
const*) const+0x44b
C  [impalad+0x29ffde9]  
impala::HiveUdfCall::GetSmallIntVal(impala::ScalarExprEvaluator*, 
impala::TupleRow const*) const+0xbb
C  [impalad+0x2a0948a]  
impala::ScalarExprEvaluator::GetValue(impala::ScalarExpr const&, 
impala::TupleRow const*)+0x14c
C  [impalad+0x2a48eb1]  
impala::ScalarFnCall::EvaluateNonConstantChildren(impala::ScalarExprEvaluator*, 
impala::TupleRow const*) const+0x9d
C  [impalad+0x2a4abba]  impala_udf::BooleanVal 
impala::ScalarFnCall::InterpretEval(impala::ScalarExprEvaluator*,
 impala::TupleRow const*) const+0x18c
C  [impalad+0x2a4907d]  
impala::ScalarFnCall::GetBooleanVal(impala::ScalarExprEvaluator*, 
impala::TupleRow const*) const+0x179
C  [impalad+0x2a09c7f]  
impala::ScalarExprEvaluator::GetBooleanVal(impala::TupleRow*)+0x37
C  [impalad+0x1b70efb]  
impala::ExecNode::EvalPredicate(impala::ScalarExprEvaluator*, 
impala::TupleRow*)+0x23
C  [impalad+0x1b70efb]  
impala::ExecNode::EvalPredicate(impala::ScalarExprEvaluator*, 
impala::TupleRow*)+0x23
C  [impalad+0x1b6fdf0]  
impala::ExecNode::EvalConjuncts(impala::ScalarExprEvaluator* const*, int, 
impala::TupleRow*)+0x42
C  [impalad+0x1bb60e3]  
impala::HdfsScanner::EvalConjuncts(impala::TupleRow*)+0x4d
C  [impalad+0x1bb08fd]  
impala::HdfsScanner::WriteCompleteTuple(impala::MemPool*, 

[jira] [Created] (IMPALA-7068) Failed test: metadata.test_partition_metadata.TestPartitionMetadataUncompressedTextOnly.test_unsupported_text_compression on S3

2018-05-24 Thread Dimitris Tsirogiannis (JIRA)
Dimitris Tsirogiannis created IMPALA-7068:
-

 Summary: Failed test: 
metadata.test_partition_metadata.TestPartitionMetadataUncompressedTextOnly.test_unsupported_text_compression
 on S3
 Key: IMPALA-7068
 URL: https://issues.apache.org/jira/browse/IMPALA-7068
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog, Infrastructure
Affects Versions: Impala 3.0
Reporter: Dimitris Tsirogiannis


This is from executing the failed test. It seems that the S3 prefix 
(s3a://impala-cdh5-s3-tests) is added twice to the table location, resulting in 
an invalid S3 path. 
{code:java}
Error Message
metadata/test_partition_metadata.py:177: in test_unsupported_text_compression   
  FQ_TBL_NAME, TBL_LOCATION)) common/impala_connection.py:160: in execute 
return self.__beeswax_client.execute(sql_stmt, user=user) 
beeswax/impala_beeswax.py:173: in execute handle = 
self.__execute_query(query_string.strip(), user=user) 
beeswax/impala_beeswax.py:339: in __execute_query handle = 
self.execute_query_async(query_string, user=user) 
beeswax/impala_beeswax.py:335: in execute_query_async return 
self.__do_rpc(lambda: self.imp_service.query(query,)) 
beeswax/impala_beeswax.py:460: in __do_rpc raise 
ImpalaBeeswaxException(self.__build_error_message(b), b) E   
ImpalaBeeswaxException: ImpalaBeeswaxException: EINNER EXCEPTION:  EMESSAGE: AnalysisException: Bucket 
impala-cdh5-s3-tests3a does not exist E   CAUSED BY: FileNotFoundException: 
Bucket impala-cdh5-s3-tests3a does not exist
Stacktrace
metadata/test_partition_metadata.py:177: in test_unsupported_text_compression
FQ_TBL_NAME, TBL_LOCATION))
common/impala_connection.py:160: in execute
return self.__beeswax_client.execute(sql_stmt, user=user)
beeswax/impala_beeswax.py:173: in execute
handle = self.__execute_query(query_string.strip(), user=user)
beeswax/impala_beeswax.py:339: in __execute_query
handle = self.execute_query_async(query_string, user=user)
beeswax/impala_beeswax.py:335: in execute_query_async
return self.__do_rpc(lambda: self.imp_service.query(query,))
beeswax/impala_beeswax.py:460: in __do_rpc
raise ImpalaBeeswaxException(self.__build_error_message(b), b)
E   ImpalaBeeswaxException: ImpalaBeeswaxException:
EINNER EXCEPTION: 
EMESSAGE: AnalysisException: Bucket impala-cdh5-s3-tests3a does not exist
E   CAUSED BY: FileNotFoundException: Bucket impala-cdh5-s3-tests3a does not 
exist
Standard Error
-- connecting to: localhost:21000
SET sync_ddl=False;
-- executing against localhost:21000
DROP DATABASE IF EXISTS `test_unsupported_text_compression_695d360a` CASCADE;

SET sync_ddl=False;
-- executing against localhost:21000
CREATE DATABASE `test_unsupported_text_compression_695d360a`;

MainThread: Created database "test_unsupported_text_compression_695d360a" for 
test ID 
"metadata/test_partition_metadata.py::TestPartitionMetadataUncompressedTextOnly::()::test_unsupported_text_compression[exec_option:
 {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
'exec_single_node_rows_threshold': 0} | table_format: text/none]"
MainThread: Starting new HTTPS connection (1): 
impala-cdh5-s3-test.s3.amazonaws.com
-- executing against localhost:21000
create external table 
test_unsupported_text_compression_695d360a.multi_text_compression like 
functional.alltypes location 
's3a://impala-cdh5-s3-tests3a://impala-cdh5-s3-test/test-warehouse/test_unsupported_text_compression_695d360a.db/multi_text_compression';
{code}
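The malformed bucket name in the error ("impala-cdh5-s3-tests3a") is exactly what results when a filesystem prefix is prepended to a location that is already a fully qualified URI. A minimal, hypothetical sketch of the failure mode (the names here are illustrative, not Impala's actual path-handling code):

```python
# Illustrative sketch only; names are made up, not Impala's actual code.
S3_PREFIX = "s3a://impala-cdh5-s3-test"  # assumed default-filesystem prefix

def qualify_buggy(location: str) -> str:
    """Buggy: unconditionally prepends the prefix, even to a full URI."""
    return S3_PREFIX + location

def qualify_fixed(location: str) -> str:
    """Only qualify relative paths; leave fully qualified URIs untouched."""
    return location if "://" in location else S3_PREFIX + location

loc = "s3a://impala-cdh5-s3-test/test-warehouse/t"
print(qualify_buggy(loc))  # prefix doubled: the bucket parses as "impala-cdh5-s3-tests3a"
print(qualify_fixed(loc))  # left unchanged
```

Concatenating the prefix onto `s3a://bucket/...` fuses the prefix's bucket name with the scheme of the inner URI, which matches the "Bucket impala-cdh5-s3-tests3a does not exist" error above.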



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org





[jira] [Commented] (IMPALA-7067) sleep(100000) command from test_shell_commandline.py can hang around and cause test_metrics_are_zero to fail

2018-05-24 Thread Tim Armstrong (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489329#comment-16489329
 ] 

Tim Armstrong commented on IMPALA-7067:
---

It's a test bug - the fragment thread is stuck in the sleep() call, so it can't 
clean itself up.
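The verifier in the quoted traceback polls a metric until it reaches the expected value or a timeout elapses; here is a minimal sketch of that polling pattern (the function name is borrowed from the traceback, but the body is assumed, not the actual test-framework code):

```python
import time

def wait_for_metric(read_metric, expected, timeout_s=60.0, interval_s=0.5):
    """Poll read_metric() until it returns `expected` or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        value = read_metric()
        if value == expected:
            return value
        if time.monotonic() >= deadline:
            raise AssertionError(
                "Metric value %r did not reach value %r in %ss"
                % (value, expected, timeout_s))
        time.sleep(interval_s)

# A metric that drains to zero within the timeout converges quickly:
values = iter([3, 1, 0])
print(wait_for_metric(lambda: next(values), 0, timeout_s=5, interval_s=0))  # prints 0
```

If the metric never drains (e.g. because a fragment thread is stuck in sleep()), the poll times out and raises, which is the AssertionError seen in the failure.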

> sleep(100000) command from test_shell_commandline.py can hang around and 
> cause test_metrics_are_zero to fail
> 
>
> Key: IMPALA-7067
> URL: https://issues.apache.org/jira/browse/IMPALA-7067
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: flaky
>
> {noformat}
> 03:25:47 [gw6] PASSED 
> shell/test_shell_commandline.py::TestImpalaShell::test_cancellation 
> ...
> 03:27:01 verifiers/test_verify_metrics.py:34: in test_metrics_are_zero
> 03:27:01 verifier.verify_metrics_are_zero()
> 03:27:01 verifiers/metric_verifier.py:47: in verify_metrics_are_zero
> 03:27:01 self.wait_for_metric(metric, 0, timeout)
> 03:27:01 verifiers/metric_verifier.py:62: in wait_for_metric
> 03:27:01 self.impalad_service.wait_for_metric_value(metric_name, 
> expected_value, timeout)
> 03:27:01 common/impala_service.py:135: in wait_for_metric_value
> 03:27:01 json.dumps(self.read_debug_webpage('rpcz?json')))
> 03:27:01 E   AssertionError: Metric value impala-server.mem-pool.total-bytes 
> did not reach value 0 in 60s
> {noformat}
> I used the json dump from memz and the logs to trace it back to the 
> sleep(100000) query hanging around






[jira] [Commented] (IMPALA-6119) Inconsistent file metadata updates when multiple partitions point to the same path

2018-05-24 Thread bharath v (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489304#comment-16489304
 ] 

bharath v commented on IMPALA-6119:
---

Thanks for digging into this. I vaguely remember that we considered the above 
solutions and, like you mentioned, (1) is more computationally heavy and (2) 
adds more memory, and the Catalog is already notorious for its memory usage. :)

What are your thoughts on the fix I proposed? We could exploit the references 
and update the source fds directly.

> Inconsistent file metadata updates when multiple partitions point to the same 
> path
> --
>
> Key: IMPALA-6119
> URL: https://issues.apache.org/jira/browse/IMPALA-6119
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
>Reporter: bharath v
>Assignee: Gabor Kaszab
>Priority: Critical
>  Labels: correctness, ramp-up
>
> Following steps can give inconsistent results.
> {noformat}
> // Create a partitioned table
> create table test(a int) partitioned by (b int);
> // Create two partitions b=1/b=2 mapped to the same HDFS location.
> insert into test partition(b=1) values (1);
> alter table test add partition (b=2) location 
> 'hdfs://localhost:20500/test-warehouse/test/b=1/' 
> [localhost:21000] > show partitions test;
> Query: show partitions test
> +---+---++--+--+---++---++
> | b | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | 
> Incremental stats | Location   |
> +---+---++--+--+---++---++
> | 1 | -1| 1  | 2B   | NOT CACHED   | NOT CACHED| TEXT   | 
> false | hdfs://localhost:20500/test-warehouse/test/b=1 |
> | 2 | -1| 1  | 2B   | NOT CACHED   | NOT CACHED| TEXT   | 
> false | hdfs://localhost:20500/test-warehouse/test/b=1 |
> | Total | -1| 2  | 4B   | 0B   |   || 
>   ||
> +---+---++--+--+---++---++
> // Insert new data into one of the partitions
> insert into test partition(b=1) values (2);
> // Newly added file is reflected only in the added partition files. 
> show files in test;
> Query: show files in test
> ++--+---+
> | Path
>| Size | Partition |
> ++--+---+
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=2   |
> ++--+---+
> invalidate metadata test;
>  show files in test;
> // After invalidation, the newly added file now shows up in both the 
> partitions.
> Query: show files in test
> ++--+---+
> | Path
>| Size | Partition |
> ++--+---+
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=2   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=2   |
> ++--+---+
> {noformat}
> So, depending whether the user invalidates the 

[jira] [Commented] (IMPALA-7067) sleep(100000) command from test_shell_commandline.py can hang around and cause test_metrics_are_zero to fail

2018-05-24 Thread Dan Hecht (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489278#comment-16489278
 ] 

Dan Hecht commented on IMPALA-7067:
---

Does the close RPC show up in the log? I.e., do we think this is a shell bug or 
an Impala bug?

> sleep(100000) command from test_shell_commandline.py can hang around and 
> cause test_metrics_are_zero to fail
> 
>
> Key: IMPALA-7067
> URL: https://issues.apache.org/jira/browse/IMPALA-7067
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: flaky
>
> {noformat}
> 03:25:47 [gw6] PASSED 
> shell/test_shell_commandline.py::TestImpalaShell::test_cancellation 
> ...
> 03:27:01 verifiers/test_verify_metrics.py:34: in test_metrics_are_zero
> 03:27:01 verifier.verify_metrics_are_zero()
> 03:27:01 verifiers/metric_verifier.py:47: in verify_metrics_are_zero
> 03:27:01 self.wait_for_metric(metric, 0, timeout)
> 03:27:01 verifiers/metric_verifier.py:62: in wait_for_metric
> 03:27:01 self.impalad_service.wait_for_metric_value(metric_name, 
> expected_value, timeout)
> 03:27:01 common/impala_service.py:135: in wait_for_metric_value
> 03:27:01 json.dumps(self.read_debug_webpage('rpcz?json')))
> 03:27:01 E   AssertionError: Metric value impala-server.mem-pool.total-bytes 
> did not reach value 0 in 60s
> {noformat}
> I used the json dump from memz and the logs to trace it back to the 
> sleep(100000) query hanging around






[jira] [Commented] (IMPALA-6119) Inconsistent file metadata updates when multiple partitions point to the same path

2018-05-24 Thread Gabor Kaszab (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489122#comment-16489122
 ] 

Gabor Kaszab commented on IMPALA-6119:
--

I traced the issue to a different place. What I found is that when an 
"insert into test partition(b=1) values (2);" is invoked, only the b=1 
partition is reloaded. However, the b=2 partition should also be reloaded so 
that it becomes aware of the new file created by the insert.

In updatePartitionsFromHms()
{code:java}
if (loadPartitionFileMetadata) {
  if (partitionsToUpdate != null) {
// Only reload file metadata of partitions specified in 'partitionsToUpdate'
Preconditions.checkState(partitionsToUpdateFileMdByPath.isEmpty());
partitionsToUpdateFileMdByPath = getPartitionsByPath(partitionsToUpdate);
  }
  loadMetadataAndDiskIds(partitionsToUpdateFileMdByPath, true);
}
{code}
getPartitionsByPath() in this case receives the b=1 partition and returns its 
path and the partition itself. As a result, loadMetadataAndDiskIds() is called 
only for the b=1 partition.

I experimented with modifying getPartitionsByPath() to also find all the 
partitions that point to the same location as the ones received as parameters, 
and that appears to fix the issue. However, it might not be the most efficient 
solution.

+My general fix proposal is then the following:+
 - When a particular partition is being reloaded, find all the other 
partitions that have the same 'location' and reload them as well. Finding these 
partitions in an efficient way is not that straightforward, though.

+2 proposals for finding the partitions with the same location:+

1) When a set of partitions is received in updatePartitionsFromHms(), go 
through the set and compare each partition with all the partitions in the 
table to check which ones have the same 'location'. When a big insert affects 
many partitions and, at the same time, the total number of partitions is high, 
this operation can be computationally heavy, and it has to be done for every 
insert even when no partitions point to the same location.

2) Similarly to HdfsTable::partitionMap_, we can keep track of a (path -> set 
of partitions) mapping. This way, when a set of partitions is received in 
updatePartitionsFromHms(), it is enough to go through the received list, and 
we can find the partitions pointing to the same location via this mapping in 
constant time. One downside is that wherever partitionMap_ is changed, this 
mapping has to be maintained as well. Still, this seems like the better 
approach to me.
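The constant-time lookup in proposal 2 can be sketched as follows. This is an illustrative Python model (class and method names invented for the sketch), not the actual Catalog code, which is Java:

```python
from collections import defaultdict

class PartitionIndex:
    """Toy model of proposal 2: an auxiliary path -> partition-ids index."""
    def __init__(self):
        self.partition_map = {}          # partition id -> location
        self.by_path = defaultdict(set)  # location -> set of partition ids

    def add(self, part_id, location):
        self.partition_map[part_id] = location
        self.by_path[location].add(part_id)

    def drop(self, part_id):
        location = self.partition_map.pop(part_id)
        self.by_path[location].discard(part_id)
        if not self.by_path[location]:
            del self.by_path[location]   # keep the index free of empty entries

    def partitions_to_reload(self, updated_ids):
        """Expand the updated set with every partition sharing a location."""
        out = set()
        for part_id in updated_ids:
            out |= self.by_path[self.partition_map[part_id]]
        return out

idx = PartitionIndex()
idx.add("b=1", "/test-warehouse/test/b=1")
idx.add("b=2", "/test-warehouse/test/b=1")  # added at the same location as b=1
idx.add("b=3", "/test-warehouse/test/b=3")
print(sorted(idx.partitions_to_reload({"b=1"})))  # ['b=1', 'b=2']
```

The drop() bookkeeping illustrates the maintenance cost: every place that mutates the partition map must keep the path index in sync.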

 

> Inconsistent file metadata updates when multiple partitions point to the same 
> path
> --
>
> Key: IMPALA-6119
> URL: https://issues.apache.org/jira/browse/IMPALA-6119
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
>Reporter: bharath v
>Assignee: Gabor Kaszab
>Priority: Critical
>  Labels: correctness, ramp-up
>
> Following steps can give inconsistent results.
> {noformat}
> // Create a partitioned table
> create table test(a int) partitioned by (b int);
> // Create two partitions b=1/b=2 mapped to the same HDFS location.
> insert into test partition(b=1) values (1);
> alter table test add partition (b=2) location 
> 'hdfs://localhost:20500/test-warehouse/test/b=1/' 
> [localhost:21000] > show partitions test;
> Query: show partitions test
> +---+---++--+--+---++---++
> | b | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | 
> Incremental stats | Location   |
> +---+---++--+--+---++---++
> | 1 | -1| 1  | 2B   | NOT CACHED   | NOT CACHED| TEXT   | 
> false | hdfs://localhost:20500/test-warehouse/test/b=1 |
> | 2 | -1| 1  | 2B   | NOT CACHED   | NOT CACHED| TEXT   | 
> false | hdfs://localhost:20500/test-warehouse/test/b=1 |
> | Total | -1| 2  | 4B   | 0B   |   || 
>   ||
> +---+---++--+--+---++---++
> // Insert new data into one of the partitions
> insert into test partition(b=1) values (2);
> // Newly added file is reflected only in the added partition files. 
> show files in test;
> Query: show files in test
> 

[jira] [Work started] (IMPALA-6119) Inconsistent file metadata updates when multiple partitions point to the same path

2018-05-24 Thread Gabor Kaszab (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-6119 started by Gabor Kaszab.

> Inconsistent file metadata updates when multiple partitions point to the same 
> path
> --
>
> Key: IMPALA-6119
> URL: https://issues.apache.org/jira/browse/IMPALA-6119
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
>Reporter: bharath v
>Assignee: Gabor Kaszab
>Priority: Critical
>  Labels: correctness, ramp-up
>
> Following steps can give inconsistent results.
> {noformat}
> // Create a partitioned table
> create table test(a int) partitioned by (b int);
> // Create two partitions b=1/b=2 mapped to the same HDFS location.
> insert into test partition(b=1) values (1);
> alter table test add partition (b=2) location 
> 'hdfs://localhost:20500/test-warehouse/test/b=1/' 
> [localhost:21000] > show partitions test;
> Query: show partitions test
> +---+---++--+--+---++---++
> | b | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | 
> Incremental stats | Location   |
> +---+---++--+--+---++---++
> | 1 | -1| 1  | 2B   | NOT CACHED   | NOT CACHED| TEXT   | 
> false | hdfs://localhost:20500/test-warehouse/test/b=1 |
> | 2 | -1| 1  | 2B   | NOT CACHED   | NOT CACHED| TEXT   | 
> false | hdfs://localhost:20500/test-warehouse/test/b=1 |
> | Total | -1| 2  | 4B   | 0B   |   || 
>   ||
> +---+---++--+--+---++---++
> // Insert new data into one of the partitions
> insert into test partition(b=1) values (2);
> // Newly added file is reflected only in the added partition files. 
> show files in test;
> Query: show files in test
> ++--+---+
> | Path
>| Size | Partition |
> ++--+---+
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=2   |
> ++--+---+
> invalidate metadata test;
>  show files in test;
> // After invalidation, the newly added file now shows up in both the 
> partitions.
> Query: show files in test
> ++--+---+
> | Path
>| Size | Partition |
> ++--+---+
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=2   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=2   |
> ++--+---+
> {noformat}
> So, depending whether the user invalidates the table, they can see different 
> results. The bug is in the following code.
> {noformat}
> private FileMetadataLoadStats resetAndLoadFileMetadata(
>   Path partDir, List partitions) throws IOException {
> FileMetadataLoadStats loadStats = new FileMetadataLoadStats(partDir);
> 
> 
> 
>  for (HdfsPartition partition: partitions) 
> 

[jira] [Comment Edited] (IMPALA-6119) Inconsistent file metadata updates when multiple partitions point to the same path

2018-05-24 Thread Gabor Kaszab (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489122#comment-16489122
 ] 

Gabor Kaszab edited comment on IMPALA-6119 at 5/24/18 2:42 PM:
---

I traced the issue to a different place. What I found is that when an 
"insert into test partition(b=1) values (2);" is invoked, only the b=1 
partition is reloaded. However, the b=2 partition should also be reloaded so 
that it becomes aware of the new file created by the insert.

In updatePartitionsFromHms()
{code:java}
if (loadPartitionFileMetadata) {
  if (partitionsToUpdate != null) {
// Only reload file metadata of partitions specified in 'partitionsToUpdate'
Preconditions.checkState(partitionsToUpdateFileMdByPath.isEmpty());
partitionsToUpdateFileMdByPath = getPartitionsByPath(partitionsToUpdate);
  }
  loadMetadataAndDiskIds(partitionsToUpdateFileMdByPath, true);
}
{code}
getPartitionsByPath() in this case receives the b=1 partition and returns its 
path and the partition itself. As a result, loadMetadataAndDiskIds() is called 
only for the b=1 partition.

I experimented with modifying getPartitionsByPath() to also find all the 
partitions that point to the same location as the ones received as parameters, 
and that appears to fix the issue. However, it might not be the most efficient 
solution.

+My general fix proposal is then the following:+
 - When a particular partition is being reloaded, find all the other 
partitions that have the same 'location' and reload them as well. Finding these 
partitions in an efficient way is not that straightforward, though.

+2 proposals for finding the partitions with the same location:+

1) When a set of partitions is received in updatePartitionsFromHms(), go 
through the set and compare each partition with all the partitions in the 
table to check which ones have the same 'location'. When a big insert affects 
many partitions and, at the same time, the total number of partitions is high, 
this operation can be computationally heavy, and it has to be done for every 
insert even when no partitions point to the same location.

2) Similarly to HdfsTable::partitionMap_, we can keep track of a (path -> set 
of partitions) mapping. This way, when a set of partitions is received in 
updatePartitionsFromHms(), it is enough to go through the received list, and 
we can find the partitions pointing to the same location via this mapping in 
constant time. One downside is that wherever partitionMap_ is changed, this 
mapping has to be maintained as well. Still, this seems like the better 
approach to me.



[jira] [Commented] (IMPALA-7060) Restrict Impala to only support timezones that work in Hive (IANA + Java)

2018-05-24 Thread Jim Apple (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489039#comment-16489039
 ] 

Jim Apple commented on IMPALA-7060:
---

For a user that depends on these, a breaking change in 3.1 or 3.2 could be a 
very bad user experience.

Given our tradition of holding breaking changes until a major version bump, I 
strongly suggest we talk about this on the dev@ mailing list before committing 
it without bumping the major version.

I'm not necessarily opposed to going straight to 4.0 if this is so important 
it can't wait.

> Restrict Impala to only support timezones that work in Hive (IANA + Java)
> -
>
> Key: IMPALA-7060
> URL: https://issues.apache.org/jira/browse/IMPALA-7060
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Csaba Ringhofer
>Assignee: Csaba Ringhofer
>Priority: Major
>
> IANA timezone integration (IMPALA-3307) will drop the support for many 
> non-standard timezone aliases. As IMPALA-3307 is a very large change, its 
> release may be delayed - meanwhile, it would be better to discourage Impala 
> 3.x users from using timezone names that will not be supported in the future. 
> For this reason, the current boost based implementation could drop the 
> support for non-standard aliases.
> Note that the current implementation has some major issues:
> - Some of the aliases are ambiguous - Impala will always interpret it the 
> same way, but this is based on an arbitrary ordering of timezones, so the 
> timezone may actually be different from what the user wanted. I think that 
> it is better to print a warning in this case.
> - If the name is not among the standard names, the lookup among aliases is 
> extremely slow (linear search on all supported standard timezones).
> - Most of the aliases are not covered by test cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7060) Restrict Impala to only support timezones that work in Hive (IANA + Java)

2018-05-24 Thread Csaba Ringhofer (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1640#comment-1640
 ] 

Csaba Ringhofer commented on IMPALA-7060:
-

Yes, this is a breaking change, but the plan is to release it in 3.x as an 
exception. The original plan was to release IMPALA-3307 in 3.0, but that 
train was missed. The current plan is to release IMPALA-3307 in a minor 
version when it is ready, but that should normally not be possible, as it is 
a breaking change - so this issue (IMPALA-7060) contains the breaking parts, 
and the intention is to release it as soon as possible, before 3.x actually 
gets adopted in production. The rationale is that this is a low-risk change, 
and the timezone aliases it removes are quite problematic.

> Restrict Impala to only support timezones that work in Hive (IANA + Java)
> -
>
> Key: IMPALA-7060
> URL: https://issues.apache.org/jira/browse/IMPALA-7060
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Csaba Ringhofer
>Assignee: Csaba Ringhofer
>Priority: Major
>
> IANA timezone integration (IMPALA-3307) will drop the support for many 
> non-standard timezone aliases. As IMPALA-3307 is a very large change, its 
> release may be delayed - meanwhile, it would be better to discourage Impala 
> 3.x users from using timezone names that will not be supported in the future. 
> For this reason, the current boost based implementation could drop the 
> support for non-standard aliases.
> Note that the current implementation has some major issues:
> - Some of the aliases are ambiguous - Impala will always interpret it the 
> same way, but this is based on an arbitrary ordering of timezones, so the 
> timezone may actually be different from what the user wanted. I think that 
> it is better to print a warning in this case.
> - If the name is not among the standard names, the lookup among aliases is 
> extremely slow (linear search on all supported standard timezones).
> - Most of the aliases are not covered by test cases.






[jira] [Resolved] (IMPALA-5842) Write page index in Parquet files

2018-05-24 Thread Zoltán Borók-Nagy (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy resolved IMPALA-5842.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0
   Impala 2.13.0

> Write page index in Parquet files
> -
>
> Key: IMPALA-5842
> URL: https://issues.apache.org/jira/browse/IMPALA-5842
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 2.10.0
>Reporter: Lars Volker
>Assignee: Zoltán Borók-Nagy
>Priority: Critical
>  Labels: parquet
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> Once PARQUET-922 has been resolved, we should start writing page indices to 
> Parquet files.






[jira] [Resolved] (IMPALA-7048) Failed test: query_test.test_parquet_page_index.TestHdfsParquetTableIndexWriter.test_write_index_many_columns_tables

2018-05-24 Thread Zoltán Borók-Nagy (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy resolved IMPALA-7048.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0
   Impala 2.13.0

> Failed test: 
> query_test.test_parquet_page_index.TestHdfsParquetTableIndexWriter.test_write_index_many_columns_tables
> 
>
> Key: IMPALA-7048
> URL: https://issues.apache.org/jira/browse/IMPALA-7048
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Dimitris Tsirogiannis
>Assignee: Zoltán Borók-Nagy
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> The following test fails when the filesystem is LOCAL:
> {code:java}
> query_test.test_parquet_page_index.TestHdfsParquetTableIndexWriter.test_write_index_many_columns_tables[exec_option:
>  \{'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (from 
> pytest) {code}
> Zoltan, assigning to you since this looks suspiciously related to the fix for 
> IMPALA-5842. 




