[jira] [Comment Edited] (IMPALA-7282) Sentry privilege disappears after a catalog refresh

2020-02-19 Thread Fang-Yu Rao (Jira)


[ https://issues.apache.org/jira/browse/IMPALA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040534#comment-17040534 ]

Fang-Yu Rao edited comment on IMPALA-7282 at 2/20/20 5:50 AM:
--

Hi [~joemcdonnell] and [~vihangk1], it seems there are two issues reported 
here: one originally reported by [~fredyw] and the other reported by 
[~aholley]. The former does not involve the {{REVOKE}} statement, whereas the 
latter does.

In the first issue, the behavior of Hive matches the Impala behavior described 
at the very beginning. That is, if a user who was previously granted the 
{{SELECT}} privilege on a database is then granted the {{ALL}} privilege on 
the same database, the {{SHOW GRANT ROLE}} statement will only show this 
user's {{ALL}} privilege on that database.
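
For concreteness, the first scenario can be sketched as follows (the database 
and role names are taken from the transcript in the issue description; the 
expected result paraphrases the behavior described in this comment):
{code:sql}
-- Grant a narrower privilege first, then a broader one on the same database.
GRANT SELECT ON DATABASE functional TO ROLE foo_role;
GRANT ALL ON DATABASE functional TO ROLE foo_role;

-- Per the comment above, SHOW GRANT ROLE then reports only the ALL privilege.
SHOW GRANT ROLE foo_role;
{code}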

For the second issue, the behavior of Hive differs from that of Impala. 
Specifically, unlike an Impala user, who will not possess any privilege on the 
specified column {{col}} after the following 3 SQL statements, a Hive user 
would still possess the {{SELECT}} privilege on the table {{tbl}} afterwards.
{code:sql}
GRANT SELECT(col) ON TABLE tbl TO ROLE foo_role;
GRANT SELECT ON SERVER TO ROLE foo_role;
REVOKE SELECT ON SERVER FROM ROLE foo_role;
{code}
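Sketching the divergence described above (the expected results paraphrase the 
behavior reported in this comment, not actual shell output):
{code:sql}
-- After the three statements above have run:
SHOW GRANT ROLE foo_role;
-- Impala: foo_role retains no privilege, not even SELECT(col) on tbl.
-- Hive:   foo_role still reports the SELECT privilege on table tbl.
{code}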
In my view, the behavior of Impala should be acceptable even though it differs 
from Hive's in this case. When we revoke from a user the {{SELECT}} privilege 
on the {{SERVER}}, which is a superset of the table {{tbl}}, it should be fine 
that in the end this user does not possess the {{SELECT}} privilege on any 
resource under this {{SERVER}}.





was (Author: fangyurao):
Hi [~joemcdonnell] and [~vihangk1], it seems there are two issues reported 
here: one originally reported by [~fredyw] and the other reported by 
[~aholley]. The former does not involve the {{REVOKE}} statement, whereas the 
latter does.

In the first issue, the behavior of Hive matches the Impala behavior described 
at the very beginning. That is, if a user who was previously granted the 
{{SELECT}} privilege on a database is then granted the {{ALL}} privilege on 
the same database, the {{SHOW GRANT ROLE}} statement will only show this 
user's {{ALL}} privilege on that database.

For the second issue, the behavior of Hive differs from that of Impala. 
Specifically, unlike an Impala user, who will not possess any privilege on the 
specified column {{col}} after the following 3 SQL statements, a Hive user 
would still possess the {{SELECT}} privilege on the table {{tbl}} afterwards.
{code:sql}
GRANT SELECT(col) ON TABLE tbl TO ROLE foo_role;
GRANT SELECT ON SERVER TO ROLE foo_role;
REVOKE SELECT ON SERVER FROM ROLE foo_role;
{code}
In my view, the behavior of Impala should be acceptable. When we revoke from a 
user the {{SELECT}} privilege on the {{SERVER}}, which is a superset of the 
table {{tbl}}, it should be fine that in the end this user does not possess 
the {{SELECT}} privilege on any resource under this {{SERVER}}.




> Sentry privilege disappears after a catalog refresh
> ---
>
> Key: IMPALA-7282
> URL: https://issues.apache.org/jira/browse/IMPALA-7282
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog, Security
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Priority: Critical
>  Labels: security
>
> {noformat}
> [localhost:21000] default> grant select on database functional to role foo_role;
> Query: grant select on database functional to role foo_role
> +---------------------------------+
> | summary                         |
> +---------------------------------+
> | Privilege(s) have been granted. |
> +---------------------------------+
> Fetched 1 row(s) in 0.05s
> [localhost:21000] default> grant all on database functional to role foo_role;
> Query: grant all on database functional to role foo_role
> +---------------------------------+
> | summary                         |
> +---------------------------------+
> | Privilege(s) have been granted. |
> +---------------------------------+
> Fetched 1 row(s) in 0.03s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +----------+------------+-------+--------+-----+-----------+--------------+-------------+
> | scope    | database   | table | column | uri | privilege | grant_option | create_time |
> +----------+------------+-------+--------+-----+-----------+--------------+-------------+
> | database | functional |       |        |     | select    | false        | NULL        |
> | database | functional |       |        |     | all       | false        | NULL        |
> +----------+------------+-------+--------+-----+-----------+--------------+-------------+
> Fetched 2 row(s) in 0.02s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +----------+------------+-------+--------+-----+-----------+--------------+-------------------------------+
> | scope    | database   | table | column | uri | privilege | grant_option | create_time                   |
> +----------+------------+-------+--------+-----+-----------+--------------+-------------------------------+
> | database | functional |       |        |     | all       | false        | Wed, Jul 11 2018 15:38:41.113 |
> +----------+------------+-------+--------+-----+-----------+--------------+-------------------------------+
> Fetched 1 row(s) in 0.01s
> {noformat}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (IMPALA-7282) Sentry privilege disappears after a catalog refresh

2020-02-19 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040534#comment-17040534
 ] 

Fang-Yu Rao edited comment on IMPALA-7282 at 2/20/20 5:31 AM:
--

Hi [~joemcdonnell] and [~vihangk1], it seems there are two issues reported 
here. One is the issue originally reported by [~fredyw] and the other is the 
issue reported by [~aholley]. The former issue  does not involve the {{REVOKE}} 
statement, whereas the latter involves the {{REVOKE}} statement.

In the first issue, the behavior of Hive is the same as Impala described at the 
very beginning. That is, if a user previously being granted the {{SELECT}} 
privilege on a database is granted the {{ALL}} privilege on the same database, 
then {{SHOW GRANT ROLE}} statement will only show this user's {{ALL}} privilege 
on the database.

For the second issue, the behavior of Hive is different from that of Impala. 
Specifically, a Hive user would still possess the {{SELECT}} privilege on the 
table {{tbl}} after the following 3 SQL statements.
{code:java}
GRANT SELECT(id) ON TABLE {{tbl}} TO ROLE foo_role;
GRANT SELECT ON SERVER TO ROLE foo_role;
REVOKE SELECT ON SERVER FROM ROLE foo_role; 
{code}






was (Author: fangyurao):
Hi [~joemcdonnell] and [~vihangk1], it seems there are two issues reported 
here. One is the issue originally reported by [~fredyw] and the other is the 
issue reported by [~aholley]. The former issue  does not involve the {{REVOKE}} 
statement, whereas the latter involves the {{REVOKE}} statement.

In the first issue, the behavior of Hive is the same as Impala described at the 
very beginning. That is, if a user previously being granted the {{SELECT}} 
privilege on a database is granted the {{ALL}} privilege on the same database, 
then {{SHOW GRANT ROLE}} statement will only show this user's {{ALL}} privilege 
on the database.

For the second issue, the behavior of Hive is different from that of Impala. 
Specifically, a Hive user would still possess the {{SELECT}} privilege on the 
table {{default}} after the following 3 SQL statements.
{code:java}
GRANT SELECT(id) ON TABLE default TO ROLE foo_role;
GRANT SELECT ON SERVER TO ROLE foo_role;
REVOKE SELECT ON SERVER FROM ROLE foo_role; 
{code}





> Sentry privilege disappears after a catalog refresh
> ---
>
> Key: IMPALA-7282
> URL: https://issues.apache.org/jira/browse/IMPALA-7282
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog, Security
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Priority: Critical
>  Labels: security
>
> {noformat}
> [localhost:21000] default> grant select on database functional to role 
> foo_role;
> Query: grant select on database functional to role foo_role
> +-+
> | summary |
> +-+
> | Privilege(s) have been granted. |
> +-+
> Fetched 1 row(s) in 0.05s
> [localhost:21000] default> grant all on database functional to role foo_role;
> Query: grant all on database functional to role foo_role
> +-+
> | summary |
> +-+
> | Privilege(s) have been granted. |
> +-+
> Fetched 1 row(s) in 0.03s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +--++---++-+---+--+-+
> | scope| database   | table | column | uri | privilege | grant_option | 
> create_time |
> +--++---++-+---+--+-+
> | database | functional |   || | select| false| 
> NULL|
> | database | functional |   || | all   | false| 
> NULL|
> +--++---++-+---+--+-+
> Fetched 2 row(s) in 0.02s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +--++---++-+---+--+---+
> | scope| database   | table | column | uri | privilege | grant_option | 
> create_time   |
> +--++---++-+---+--+---+
> | database | functional |   || | all   | false| 
> Wed, Jul 11 2018 15:38:41.113 |
> +--++---++-+---+--+---+
> Fetched 1 row(s) in 0.01s
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (IMPALA-7282) Sentry privilege disappears after a catalog refresh

2020-02-19 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040534#comment-17040534
 ] 

Fang-Yu Rao edited comment on IMPALA-7282 at 2/20/20 5:28 AM:
--

Hi [~joemcdonnell] and [~vihangk1], it seems there are two issues reported 
here. One is the issue originally reported by [~fredyw] and the other is the 
issue reported by [~aholley]. The former issue  does not involve the {{REVOKE}} 
statement, whereas the latter involves the {{REVOKE}} statement.

In the first issue, the behavior of Hive is the same as Impala described at the 
very beginning. That is, if a user previously being granted the {{SELECT}} 
privilege on a database is granted the {{ALL}} privilege on the same database, 
then {{SHOW GRANT ROLE}} statement will only show this user's {{ALL}} privilege 
on the database.

For the second issue, the behavior of Hive is different from that of Impala. 
Specifically, a Hive user would still possess the {{SELECT}} privilege on the 
table {{default}} after the following 3 SQL statements.
{code:java}
GRANT SELECT(id) ON TABLE default TO ROLE foo_role;
GRANT SELECT ON SERVER TO ROLE foo_role;
REVOKE SELECT ON SERVER FROM ROLE foo_role; 
{code}






was (Author: fangyurao):
Hi [~joemcdonnell] and [~vihangk1], it seems there are two issues reported 
here. One is the issue originally reported by [~fredyw] and the other is the 
issue reported by [~aholley]. The former issue  does not involve the {{REVOKE}} 
statement, whereas the latter involves the {{REVOKE}} statement.

In the first issue, the behavior of Hive is the same as Impala described at the 
very beginning. That is, if a user previously being granted the {{SELECT}} 
privilege on a database is granted the {{ALL}} privilege on the same database, 
then {{SHOW GRANT ROLE}} statement will only show this user's {{ALL}} privilege 
on the database.






> Sentry privilege disappears after a catalog refresh
> ---
>
> Key: IMPALA-7282
> URL: https://issues.apache.org/jira/browse/IMPALA-7282
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog, Security
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Priority: Critical
>  Labels: security
>
> {noformat}
> [localhost:21000] default> grant select on database functional to role 
> foo_role;
> Query: grant select on database functional to role foo_role
> +-+
> | summary |
> +-+
> | Privilege(s) have been granted. |
> +-+
> Fetched 1 row(s) in 0.05s
> [localhost:21000] default> grant all on database functional to role foo_role;
> Query: grant all on database functional to role foo_role
> +-+
> | summary |
> +-+
> | Privilege(s) have been granted. |
> +-+
> Fetched 1 row(s) in 0.03s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +--++---++-+---+--+-+
> | scope| database   | table | column | uri | privilege | grant_option | 
> create_time |
> +--++---++-+---+--+-+
> | database | functional |   || | select| false| 
> NULL|
> | database | functional |   || | all   | false| 
> NULL|
> +--++---++-+---+--+-+
> Fetched 2 row(s) in 0.02s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +--++---++-+---+--+---+
> | scope| database   | table | column | uri | privilege | grant_option | 
> create_time   |
> +--++---++-+---+--+---+
> | database | functional |   || | all   | false| 
> Wed, Jul 11 2018 15:38:41.113 |
> +--++---++-+---+--+---+
> Fetched 1 row(s) in 0.01s
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Comment Edited] (IMPALA-7282) Sentry privilege disappears after a catalog refresh

2020-02-19 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040534#comment-17040534
 ] 

Fang-Yu Rao edited comment on IMPALA-7282 at 2/20/20 5:03 AM:
--

Hi [~joemcdonnell] and [~vihangk1], it seems there are two issues reported 
here. One is the issue originally reported by [~fredyw] and the other is the 
issue reported by [~aholley]. The former issue  does not involve the {{REVOKE}} 
statement, whereas the latter involves the {{REVOKE}} statement.

In the first issue, the behavior of Hive is the same as Impala described at the 
very beginning. That is, if a user previously being granted the {{SELECT}} 
privilege on a database is granted the {{ALL}} privilege on the same database, 
then {{SHOW GRANT ROLE}} statement will only show this user's {{ALL}} privilege 
on the database.







was (Author: fangyurao):
Hi [~joemcdonnell] and [~vihangk1], it seems there are two issues reported 
here. One is the issue originally reported by [~fredyw] and the other is the 
issue reported by [~jeszyb]. The former issue  does not involve the {{REVOKE}} 
statement, whereas the latter involves the {{REVOKE}} statement.

In the first issue, the behavior of Hive is the same as Impala described at the 
very beginning. That is, if a user previously being granted the {{SELECT}} 
privilege on a database is granted the {{ALL}} privilege on the same database, 
then {{SHOW GRANT ROLE}} statement will only show this user's {{ALL}} privilege 
on the database.






> Sentry privilege disappears after a catalog refresh
> ---
>
> Key: IMPALA-7282
> URL: https://issues.apache.org/jira/browse/IMPALA-7282
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog, Security
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Priority: Critical
>  Labels: security
>
> {noformat}
> [localhost:21000] default> grant select on database functional to role 
> foo_role;
> Query: grant select on database functional to role foo_role
> +-+
> | summary |
> +-+
> | Privilege(s) have been granted. |
> +-+
> Fetched 1 row(s) in 0.05s
> [localhost:21000] default> grant all on database functional to role foo_role;
> Query: grant all on database functional to role foo_role
> +-+
> | summary |
> +-+
> | Privilege(s) have been granted. |
> +-+
> Fetched 1 row(s) in 0.03s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +--++---++-+---+--+-+
> | scope| database   | table | column | uri | privilege | grant_option | 
> create_time |
> +--++---++-+---+--+-+
> | database | functional |   || | select| false| 
> NULL|
> | database | functional |   || | all   | false| 
> NULL|
> +--++---++-+---+--+-+
> Fetched 2 row(s) in 0.02s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +--++---++-+---+--+---+
> | scope| database   | table | column | uri | privilege | grant_option | 
> create_time   |
> +--++---++-+---+--+---+
> | database | functional |   || | all   | false| 
> Wed, Jul 11 2018 15:38:41.113 |
> +--++---++-+---+--+---+
> Fetched 1 row(s) in 0.01s
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Comment Edited] (IMPALA-7282) Sentry privilege disappears after a catalog refresh

2020-02-19 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040534#comment-17040534
 ] 

Fang-Yu Rao edited comment on IMPALA-7282 at 2/20/20 4:59 AM:
--

Hi [~joemcdonnell] and [~vihangk1], it seems there are two issues reported 
here. One is the issue originally reported by [~fredyw] and the other is the 
issue reported by [~jeszyb]. The former issue  does not involve the {{REVOKE}} 
statement, whereas the latter involves the {{REVOKE}} statement.

In the first issue, the behavior of Hive is the same as Impala described at the 
very beginning. If a user previously being granted the {{SELECT}} privilege on 
a database is granted the {{ALL}} privilege on the same database, then {{SHOW 
GRANT ROLE}} statement will only show this user's {{ALL}} privilege on the 
database.







was (Author: fangyurao):
Hi [~joemcdonnell] and [~vihangk1], it seems there are two issues reported 
here. One is the issue originally reported by [~fredyw] and the other is the 
issue reported by [~jeszyb]. The former issue  does not involve the {{REVOKE}} 
statement, whereas the latter involves the {{REVOKE}} statement.

Will take a closer look at both and keep you posted.




> Sentry privilege disappears after a catalog refresh
> ---
>
> Key: IMPALA-7282
> URL: https://issues.apache.org/jira/browse/IMPALA-7282
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog, Security
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Priority: Critical
>  Labels: security
>
> {noformat}
> [localhost:21000] default> grant select on database functional to role 
> foo_role;
> Query: grant select on database functional to role foo_role
> +-+
> | summary |
> +-+
> | Privilege(s) have been granted. |
> +-+
> Fetched 1 row(s) in 0.05s
> [localhost:21000] default> grant all on database functional to role foo_role;
> Query: grant all on database functional to role foo_role
> +-+
> | summary |
> +-+
> | Privilege(s) have been granted. |
> +-+
> Fetched 1 row(s) in 0.03s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +--++---++-+---+--+-+
> | scope| database   | table | column | uri | privilege | grant_option | 
> create_time |
> +--++---++-+---+--+-+
> | database | functional |   || | select| false| 
> NULL|
> | database | functional |   || | all   | false| 
> NULL|
> +--++---++-+---+--+-+
> Fetched 2 row(s) in 0.02s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +--++---++-+---+--+---+
> | scope| database   | table | column | uri | privilege | grant_option | 
> create_time   |
> +--++---++-+---+--+---+
> | database | functional |   || | all   | false| 
> Wed, Jul 11 2018 15:38:41.113 |
> +--++---++-+---+--+---+
> Fetched 1 row(s) in 0.01s
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Comment Edited] (IMPALA-7282) Sentry privilege disappears after a catalog refresh

2020-02-19 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040534#comment-17040534
 ] 

Fang-Yu Rao edited comment on IMPALA-7282 at 2/20/20 2:27 AM:
--

Hi [~joemcdonnell] and [~vihangk1], it seems there are two issues reported 
here. One is the issue originally reported by [~fredyw] and the other is the 
issue reported by [~jeszyb]. The former does not involve the {{REVOKE}} 
statement, whereas the latter does.

I will take a closer look at both issues and keep you posted.





was (Author: fangyurao):
Hi [~joemcdonnell] and [~vihangk1], I took a look at the description above. I 
am able to reproduce the issue reported by [~fredyw].

I have also briefly compared Impala's behavior with Hive's using the SQL 
statements provided by [~fredyw] and found that they differ. Specifically, in 
the end, the role {{foo_role}} would still possess the {{SELECT}} privilege in 
Hive even though we explicitly revoke the {{ALL}} privilege from {{foo_role}}. 
The two privileges are tracked separately.



> Sentry privilege disappears after a catalog refresh
> ---
>
> Key: IMPALA-7282
> URL: https://issues.apache.org/jira/browse/IMPALA-7282
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog, Security
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Priority: Critical
>  Labels: security
>
> {noformat}
> [localhost:21000] default> grant select on database functional to role foo_role;
> Query: grant select on database functional to role foo_role
> +---------------------------------+
> | summary                         |
> +---------------------------------+
> | Privilege(s) have been granted. |
> +---------------------------------+
> Fetched 1 row(s) in 0.05s
> [localhost:21000] default> grant all on database functional to role foo_role;
> Query: grant all on database functional to role foo_role
> +---------------------------------+
> | summary                         |
> +---------------------------------+
> | Privilege(s) have been granted. |
> +---------------------------------+
> Fetched 1 row(s) in 0.03s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +----------+------------+-------+--------+-----+-----------+--------------+-------------+
> | scope    | database   | table | column | uri | privilege | grant_option | create_time |
> +----------+------------+-------+--------+-----+-----------+--------------+-------------+
> | database | functional |       |        |     | select    | false        | NULL        |
> | database | functional |       |        |     | all       | false        | NULL        |
> +----------+------------+-------+--------+-----+-----------+--------------+-------------+
> Fetched 2 row(s) in 0.02s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +----------+------------+-------+--------+-----+-----------+--------------+-------------------------------+
> | scope    | database   | table | column | uri | privilege | grant_option | create_time                   |
> +----------+------------+-------+--------+-----+-----------+--------------+-------------------------------+
> | database | functional |       |        |     | all       | false        | Wed, Jul 11 2018 15:38:41.113 |
> +----------+------------+-------+--------+-----+-----------+--------------+-------------------------------+
> Fetched 1 row(s) in 0.01s
> {noformat}






[jira] [Comment Edited] (IMPALA-7282) Sentry privilege disappears after a catalog refresh

2020-02-19 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040534#comment-17040534
 ] 

Fang-Yu Rao edited comment on IMPALA-7282 at 2/20/20 2:15 AM:
--

Hi [~joemcdonnell] and [~vihangk1], I took a look at the description above. I 
am able to reproduce the issue reported by [~fredyw].

I have also briefly compared Impala's behavior with Hive's using the SQL 
statements provided by [~fredyw] and found that they differ. Specifically, in 
the end, the role {{foo_role}} would still possess the {{SELECT}} privilege in 
Hive even though we explicitly revoke the {{ALL}} privilege from {{foo_role}}. 
The two privileges are tracked separately.




was (Author: fangyurao):
Hi [~joemcdonnell] and [~vihangk1], I took a look at the description above. I 
am able to reproduce the issue reported by [~fredyw].

As for [~vihangk1]'s question regarding whether this should be an issue, I 
briefly compared the case in which Impala is using Ranger as the authorization 
provider.

If we grant a user {{non_owner}} the {{SELECT}} privilege on a database, e.g., 
{{functional}}, and then grant {{non_owner}} the {{ALL}} privilege on 
{{SERVER}}, {{non_owner}} would possess 2 privileges.

Now if we revoke the {{ALL}} privilege from the user {{non_owner}}, the user 
will still possess the {{SELECT}} privilege on the database {{functional}}.

I will try to see how Hive behaves with Sentry as the authorization provider 
in the situation described above and keep you posted.


> Sentry privilege disappears after a catalog refresh
> ---
>
> Key: IMPALA-7282
> URL: https://issues.apache.org/jira/browse/IMPALA-7282
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog, Security
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Priority: Critical
>  Labels: security
>
> {noformat}
> [localhost:21000] default> grant select on database functional to role foo_role;
> Query: grant select on database functional to role foo_role
> +---------------------------------+
> | summary                         |
> +---------------------------------+
> | Privilege(s) have been granted. |
> +---------------------------------+
> Fetched 1 row(s) in 0.05s
> [localhost:21000] default> grant all on database functional to role foo_role;
> Query: grant all on database functional to role foo_role
> +---------------------------------+
> | summary                         |
> +---------------------------------+
> | Privilege(s) have been granted. |
> +---------------------------------+
> Fetched 1 row(s) in 0.03s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +----------+------------+-------+--------+-----+-----------+--------------+-------------+
> | scope    | database   | table | column | uri | privilege | grant_option | create_time |
> +----------+------------+-------+--------+-----+-----------+--------------+-------------+
> | database | functional |       |        |     | select    | false        | NULL        |
> | database | functional |       |        |     | all       | false        | NULL        |
> +----------+------------+-------+--------+-----+-----------+--------------+-------------+
> Fetched 2 row(s) in 0.02s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +----------+------------+-------+--------+-----+-----------+--------------+-------------------------------+
> | scope    | database   | table | column | uri | privilege | grant_option | create_time                   |
> +----------+------------+-------+--------+-----+-----------+--------------+-------------------------------+
> | database | functional |       |        |     | all       | false        | Wed, Jul 11 2018 15:38:41.113 |
> +----------+------------+-------+--------+-----+-----------+--------------+-------------------------------+
> Fetched 1 row(s) in 0.01s
> {noformat}






[jira] [Work started] (IMPALA-8013) Switch from boost:: to std:: locks

2020-02-19 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8013 started by Tim Armstrong.
-
> Switch from boost:: to std:: locks
> --
>
> Key: IMPALA-8013
> URL: https://issues.apache.org/jira/browse/IMPALA-8013
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Trivial
>
> We use boost::unique_lock, boost::lock_guard, boost::mutex, etc throughout 
> the backend. There are now standard library equivalents. It would be good to 
> switch to them and remove the dependency on that part of boost.






[jira] [Work started] (IMPALA-9399) Optimise RuntimeProfile::ToThrift()

2020-02-19 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9399 started by Tim Armstrong.
-
> Optimise RuntimeProfile::ToThrift()
> ---
>
> Key: IMPALA-9399
> URL: https://issues.apache.org/jira/browse/IMPALA-9399
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: perf
>
> I looked at the RuntimeProfile::ToThrift() part of the dop16 profile from the 
> parent JIRA - 
> https://issues.apache.org/jira/secure/attachment/12993392/coord_q5_dop16.svg.
> There are some minor inefficiencies within ToThrift(), e.g. constructing 
> vectors then copying them immediately.
> The majority of time, though, is spent in TRuntimeProfileNode() and 
> ~TRuntimeProfileNode(). The only place those would be invoked from ToThrift() 
> is in this line at 
> https://github.com/apache/impala/blob/f2f348c0f93208a0f34c33b6a4dc82f4d9d4b290/be/src/util/runtime-profile.cc#L1174:
> {code}
>   nodes->push_back(TRuntimeProfileNode());
> {code}
> I scratched my head and stared at our code a bit, then went and looked at 
> the std::vector and thrift-generated code. I believe this line in std::vector 
> is the problem 
> https://github.com/gcc-mirror/gcc/blob/releases/gcc-4.9.2/libstdc%2B%2B-v3/include/bits/vector.tcc#L421
> {code}
>   __new_finish
> = std::__uninitialized_move_if_noexcept_a
> (this->_M_impl._M_start, this->_M_impl._M_finish,
>  __new_start, _M_get_Tp_allocator());
> {code}
> It can't use the move constructor of TRuntimeProfileNode because of C++ 
> exception-safety guarantees and the fact that the move constructor is not 
> marked noexcept. I don't think there's an easy way to avoid this without 
> changing the code generated by thrift.
> I played around with the compiler explorer and it looks like it generates 
> much better code with the noexcept added: https://godbolt.org/z/ZTSMHY






[jira] [Work started] (IMPALA-9373) Trial run of IWYU on codebase

2020-02-19 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9373 started by Tim Armstrong.
-
> Trial run of IWYU on codebase
> -
>
> Key: IMPALA-9373
> URL: https://issues.apache.org/jira/browse/IMPALA-9373
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Infrastructure
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>
> I did a trial run and implemented some of the suggestions of IWYU to confirm 
> that it made sense. I'll post a patch with the changes and leave notes here.






[jira] [Commented] (IMPALA-4224) Add backend support for join build sinks in parallel plans

2020-02-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040568#comment-17040568
 ] 

ASF subversion and git services commented on IMPALA-4224:
-

Commit 0bb056e525794fca41cd333bc2896098566945bb in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0bb056e ]

IMPALA-4224: execute separate join builds fragments

This enables parallel plans with the join build in a
separate fragment and fixes all of the ensuing fallout.
After this change, mt_dop plans with joins have separate
build fragments. There is still a 1:1 relationship between
join nodes and builders, so the builders are only accessed
by the join node's thread after it is handed off. This lets
us defer the work required to make PhjBuilder and NljBuilder
safe to be shared between nodes.

Planner changes:
* Combined the parallel and distributed planning code paths.
* Misc fixes to generate reasonable thrift structures in the
  query exec requests, i.e. containing the right nodes.
* Fixes to resource calculations for the separate build plans.
** Calculate separate join/build resource consumption.
** Simplified the resource estimation by calculating resource
   consumption for each fragment separately, and assuming that
   all fragments hit their peak resource consumption at the
   same time. IMPALA-9255 is the follow-on to make the resource
   estimation more accurate.

Scheduler changes:
* Various fixes to handle multiple TPlanExecInfos correctly,
  which are generated by the planner for the different cohorts.
* Add logic to colocate build fragments with parent fragments.

Runtime filter changes:
* Build sinks now produce runtime filters, which required
  planner and coordinator fixes to handle.

DataSink changes:
* Close the input plan tree before calling FlushFinal() to release
  resources. This depends on Send() not holding onto references
  to input batches, which was true except for NljBuilder. This
  invariant is documented.

Join builder changes:
* Add a common base class for PhjBuilder and NljBuilder with
  functions to handle synchronisation with the join node.
* Close plan tree earlier in FragmentInstanceState::Exec()
  so that peak resource requirements are lower.
* The NLJ always copies input batches, so that it can close
  its input tree.

JoinNode changes:
* Join node blocks waiting for build-side to be ready,
  then eventually signals that it's done, allowing the builder
  to be cleaned up.
* NLJ and PHJ nodes handle both the integrated builder and
  the external builder. There is a 1:1 relationship between
  the node and the builder, so we don't deal with thread safety
  yet.
* Buffer reservations are transferred between the builder and join
  node when running with the separate builder. This is not really
  necessary right now, since it is all single-threaded, but will
  be important for the shared broadcast.
  - The builder transfers memory for probe buffers to the join node
at the end of each build phase.
  - At end of each probe phase, reservation needs to be handed back
to builder (or released).

ExecSummary changes:
* The summary logic was modified to handle connecting fragments
  via join builds. The logic is an extension of what was used
  for exchanges.

Testing:
* Enable --unlock_mt_dop for end-to-end tests
* Migrate some tests to run as part of end-to-end tests instead of
  custom cluster.
* Add mt_dop dimension to various end-to-end tests to provide
  coverage of join queries, spill-to-disk and cancellation.
* Ran a single node TPC-H and TPC-DS stress test with mt_dop=0
  and mt_dop=4.

Perf:
* Ran TPC-H scale factor 30 locally with mt_dop=0. No significant
  change.

Change-Id: I4403c8e62d9c13854e7830602ee613f8efc80c58
Reviewed-on: http://gerrit.cloudera.org:8080/14859
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 


> Add backend support for join build sinks in parallel plans
> --
>
> Key: IMPALA-4224
> URL: https://issues.apache.org/jira/browse/IMPALA-4224
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.8.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: multithreading
>
> Now that IMPALA-3567 is solved, the next step is to add the plumbing to have 
> a join builder as the sink of a plan fragment to implement the parallel plans 
> added in http://gerrit.cloudera.org:8080/2846
> This JIRA tracks making the plans executable, without sharing of the join 
> build for broadcast join.
> Steps required:
> * Enable the join build sink in the planner
> * Update planner to include all required state in the thrift objects (the 
> join build sinks are missing various required info).
> * 

[jira] [Commented] (IMPALA-9255) Avoid unnecessary concurrent resource consumption by fragments.

2020-02-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040569#comment-17040569
 ] 

ASF subversion and git services commented on IMPALA-9255:
-

Commit 0bb056e525794fca41cd333bc2896098566945bb in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0bb056e ]

IMPALA-4224: execute separate join builds fragments

This enables parallel plans with the join build in a
separate fragment and fixes all of the ensuing fallout.
After this change, mt_dop plans with joins have separate
build fragments. There is still a 1:1 relationship between
join nodes and builders, so the builders are only accessed
by the join node's thread after it is handed off. This lets
us defer the work required to make PhjBuilder and NljBuilder
safe to be shared between nodes.

Planner changes:
* Combined the parallel and distributed planning code paths.
* Misc fixes to generate reasonable thrift structures in the
  query exec requests, i.e. containing the right nodes.
* Fixes to resource calculations for the separate build plans.
** Calculate separate join/build resource consumption.
** Simplified the resource estimation by calculating resource
   consumption for each fragment separately, and assuming that
   all fragments hit their peak resource consumption at the
   same time. IMPALA-9255 is the follow-on to make the resource
   estimation more accurate.

Scheduler changes:
* Various fixes to handle multiple TPlanExecInfos correctly,
  which are generated by the planner for the different cohorts.
* Add logic to colocate build fragments with parent fragments.

Runtime filter changes:
* Build sinks now produce runtime filters, which required
  planner and coordinator fixes to handle.

DataSink changes:
* Close the input plan tree before calling FlushFinal() to release
  resources. This depends on Send() not holding onto references
  to input batches, which was true except for NljBuilder. This
  invariant is documented.

Join builder changes:
* Add a common base class for PhjBuilder and NljBuilder with
  functions to handle synchronisation with the join node.
* Close plan tree earlier in FragmentInstanceState::Exec()
  so that peak resource requirements are lower.
* The NLJ always copies input batches, so that it can close
  its input tree.

JoinNode changes:
* Join node blocks waiting for build-side to be ready,
  then eventually signals that it's done, allowing the builder
  to be cleaned up.
* NLJ and PHJ nodes handle both the integrated builder and
  the external builder. There is a 1:1 relationship between
  the node and the builder, so we don't deal with thread safety
  yet.
* Buffer reservations are transferred between the builder and join
  node when running with the separate builder. This is not really
  necessary right now, since it is all single-threaded, but will
  be important for the shared broadcast.
  - The builder transfers memory for probe buffers to the join node
at the end of each build phase.
  - At end of each probe phase, reservation needs to be handed back
to builder (or released).

ExecSummary changes:
* The summary logic was modified to handle connecting fragments
  via join builds. The logic is an extension of what was used
  for exchanges.

Testing:
* Enable --unlock_mt_dop for end-to-end tests
* Migrate some tests to run as part of end-to-end tests instead of
  custom cluster.
* Add mt_dop dimension to various end-to-end tests to provide
  coverage of join queries, spill-to-disk and cancellation.
* Ran a single node TPC-H and TPC-DS stress test with mt_dop=0
  and mt_dop=4.

Perf:
* Ran TPC-H scale factor 30 locally with mt_dop=0. No significant
  change.

Change-Id: I4403c8e62d9c13854e7830602ee613f8efc80c58
Reviewed-on: http://gerrit.cloudera.org:8080/14859
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 


> Avoid unnecessary concurrent resource consumption by fragments.
> ---
>
> Key: IMPALA-9255
> URL: https://issues.apache.org/jira/browse/IMPALA-9255
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: multithreading
>
> In Planner.computeResourceReqs(), we assume that all fragments' peak resource 
> consumption occurs at the same time, resulting in higher than necessary 
> estimates. There are a few things that need to change here to address this 
> inefficiency:
>  * We should consider changing the backend logic so that fragments don't 
> start up and claim resources until their dependencies are ready.
>  * We need to change the backend logic so that resources can be released when 
> needed, e.g. the join build fragment's thread should exit once the build is 
> done, instead of 

[jira] [Resolved] (IMPALA-4224) Add backend support for join build sinks in parallel plans

2020-02-19 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-4224.
---
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Add backend support for join build sinks in parallel plans
> --
>
> Key: IMPALA-4224
> URL: https://issues.apache.org/jira/browse/IMPALA-4224
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.8.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: multithreading
> Fix For: Impala 3.4.0
>
>
> Now that IMPALA-3567 is solved, the next step is to add the plumbing to have 
> a join builder as the sink of a plan fragment to implement the parallel plans 
> added in http://gerrit.cloudera.org:8080/2846
> This JIRA tracks making the plans executable, without sharing of the join 
> build for broadcast join.
> Steps required:
> * Enable the join build sink in the planner
> * Update planner to include all required state in the thrift objects (the 
> join build sinks are missing various required info).
> * Update planner resource requirement calculations - join build fragment 
> needs real resource estimates
> * Update scheduler to schedule join build fragment co-located with their 
> parent fragment. This depends on the build plans being sent pre-order. Pass 
> the source fragment instance id into the join nodes so they can locate the 
> input fragment instance.
> * Update scheduler to correctly handle multiple build plans.
> * Instantiate the join builders as input sinks to the plan. This requires 
> getting some data from the thrift structs instead of passed in from the 
> PHJNode
> * Ensure the join builders function correctly as plan sinks (e.g. add an 
> indefinite wait to the join node to prevent it from crashing, ensure that the 
> builder consumes the whole input). Initially we probably wait to have the 
> build thread block in Close(). 
> * Update the join node so that in the non-subplan mt_dop > 0 case, it looks 
> up the input fragment instance and waits for it to finish the build (with 
> cancellation). Need to find all the places it looks for the right child.
> *  After that the join node "owns" the builder so the control flow should be 
> the same mostly. The main difference is that the buffer pool client and 
> memory tracking is set up differently. Maybe need to change the Close() call 
> as well?
> * Figure out any resource management, etc, issues across the build and probe 
> (threads, memory, etc). Fix up the builder thread behaviour so that Close() 
> doesn't block and the thread is released.
> This, I think, needs to be one change because the intermediate states aren't 
> testable or functional.
> Testing:
> * Existing mt join tests are useful and will exercise the new behaviour
> * Ensure spilling is tested with multithreading (new dimension to spilling 
> tests?)
> * Ensure cancellation is tested.






[jira] [Commented] (IMPALA-9396) Add arm64 openjdk version for Ubuntu 18.04 in bootstrap_system.sh

2020-02-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040567#comment-17040567
 ] 

ASF subversion and git services commented on IMPALA-9396:
-

Commit 88b6dd6a712cde32465308291af6341ce31676de in impala's branch 
refs/heads/master from zhaorenhai
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=88b6dd6 ]

IMPALA-9396 Add arm64 openjdk version for Ubuntu 18.04

Add arm64 openjdk version for Ubuntu 18.04 in bootstrap_system.sh

Change-Id: I8a7572439cef8f53499c3b20a09269613feea2cf
Reviewed-on: http://gerrit.cloudera.org:8080/15237
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 


> Add arm64 openjdk version for Ubuntu 18.04 in bootstrap_system.sh
> -
>
> Key: IMPALA-9396
> URL: https://issues.apache.org/jira/browse/IMPALA-9396
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: zhaorenhai
>Assignee: zhaorenhai
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> Add arm64 openjdk version for Ubuntu 18.04 in bootstrap_system.sh






[jira] [Commented] (IMPALA-9392) Replace Boost-1.57.0-p3 to Boost-1.61.0.-p2 for aarch64 supporting

2020-02-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040566#comment-17040566
 ] 

ASF subversion and git services commented on IMPALA-9392:
-

Commit a69b19a59ba6802583e9c01e7d3dae0561e9b5d8 in impala's branch 
refs/heads/master from zhaorenhai
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=a69b19a ]

IMPALA-9392 Change Boost version for aarch64 supporting

Replace Boost version from 1.57.0-p3 to 1.61.0.-p2
for aarch64 supporting in impala-config.sh,
and add 'std' namespace before function 'make_shared'
in impala-server.cc and impala-hs2-server.cc

Change-Id: I12f1d8fa1f3e35a62f3f42b3a2d19b85ca8c7a3d
Reviewed-on: http://gerrit.cloudera.org:8080/15231
Tested-by: Impala Public Jenkins 
Reviewed-by: Tim Armstrong 


> Replace Boost-1.57.0-p3 to Boost-1.61.0.-p2 for aarch64 supporting
> --
>
> Key: IMPALA-9392
> URL: https://issues.apache.org/jira/browse/IMPALA-9392
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: zhaorenhai
>Assignee: zhaorenhai
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> Replace Boost version from  Boost-1.57.0-p3 to Boost-1.61.0.-p2 in 
> bin/impala-config.sh for aarch64 supporting






[jira] [Updated] (IMPALA-9392) Replace Boost-1.57.0-p3 to Boost-1.61.0.-p2 for aarch64 supporting

2020-02-19 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-9392:
--
Fix Version/s: Impala 3.4.0

> Replace Boost-1.57.0-p3 to Boost-1.61.0.-p2 for aarch64 supporting
> --
>
> Key: IMPALA-9392
> URL: https://issues.apache.org/jira/browse/IMPALA-9392
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: zhaorenhai
>Assignee: zhaorenhai
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> Replace Boost version from  Boost-1.57.0-p3 to Boost-1.61.0.-p2 in 
> bin/impala-config.sh for aarch64 supporting






[jira] [Updated] (IMPALA-9396) Add arm64 openjdk version for Ubuntu 18.04 in bootstrap_system.sh

2020-02-19 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-9396:
--
Fix Version/s: Impala 3.4.0

> Add arm64 openjdk version for Ubuntu 18.04 in bootstrap_system.sh
> -
>
> Key: IMPALA-9396
> URL: https://issues.apache.org/jira/browse/IMPALA-9396
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: zhaorenhai
>Assignee: zhaorenhai
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> Add arm64 openjdk version for Ubuntu 18.04 in bootstrap_system.sh






[jira] [Resolved] (IMPALA-9392) Replace Boost-1.57.0-p3 to Boost-1.61.0.-p2 for aarch64 supporting

2020-02-19 Thread zhaorenhai (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaorenhai resolved IMPALA-9392.

Resolution: Fixed

> Replace Boost-1.57.0-p3 to Boost-1.61.0.-p2 for aarch64 supporting
> --
>
> Key: IMPALA-9392
> URL: https://issues.apache.org/jira/browse/IMPALA-9392
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: zhaorenhai
>Assignee: zhaorenhai
>Priority: Major
>
> Replace Boost version from  Boost-1.57.0-p3 to Boost-1.61.0.-p2 in 
> bin/impala-config.sh for aarch64 supporting






[jira] [Resolved] (IMPALA-9396) Add arm64 openjdk version for Ubuntu 18.04 in bootstrap_system.sh

2020-02-19 Thread zhaorenhai (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaorenhai resolved IMPALA-9396.

Resolution: Fixed

> Add arm64 openjdk version for Ubuntu 18.04 in bootstrap_system.sh
> -
>
> Key: IMPALA-9396
> URL: https://issues.apache.org/jira/browse/IMPALA-9396
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: zhaorenhai
>Assignee: zhaorenhai
>Priority: Major
>
> Add arm64 openjdk version for Ubuntu 18.04 in bootstrap_system.sh






[jira] [Resolved] (IMPALA-9386) IMPALA-9287 breaks catalog table loading

2020-02-19 Thread Vihang Karajgaonkar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar resolved IMPALA-9386.
-
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

Resolving this as the offending patch was reverted. I will use 
https://gerrit.cloudera.org/#/c/15223/ to update the patch for IMPALA-9287

> IMPALA-9287 breaks catalog table loading
> 
>
> Key: IMPALA-9386
> URL: https://issues.apache.org/jira/browse/IMPALA-9386
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
> Fix For: Impala 3.4.0
>
>
> IMPALA-9287 brings in hive-metastore dependency which brings in hive-serde 
> which brings in a different version of flatbuffers and may cause the table 
> loading to fail depending of which version of flatbuffer hive-metastore 
> brings in. We should exclude hive-serde from the dependency.






[jira] [Comment Edited] (IMPALA-9386) IMPALA-9287 breaks catalog table loading

2020-02-19 Thread Vihang Karajgaonkar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040543#comment-17040543
 ] 

Vihang Karajgaonkar edited comment on IMPALA-9386 at 2/20/20 1:09 AM:
--

Resolving this as the offending patch was reverted. I will use 
https://gerrit.cloudera.org/#/c/15223/ to update the patch for IMPALA-9287


was (Author: vihangk1):
Resolving this as the offending patch was reverted. I will update use 
https://gerrit.cloudera.org/#/c/15223/ to update the patch for IMPALA-9287

> IMPALA-9287 breaks catalog table loading
> 
>
> Key: IMPALA-9386
> URL: https://issues.apache.org/jira/browse/IMPALA-9386
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
> Fix For: Impala 3.4.0
>
>
> IMPALA-9287 brings in hive-metastore dependency which brings in hive-serde 
> which brings in a different version of flatbuffers and may cause the table 
> loading to fail depending on which version of flatbuffer hive-metastore 
> brings in. We should exclude hive-serde from the dependency.






[jira] [Assigned] (IMPALA-7784) Partition pruning handles escaped strings incorrectly

2020-02-19 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang reassigned IMPALA-7784:
--

Assignee: Quanlong Huang  (was: Bharath Vissapragada)

> Partition pruning handles escaped strings incorrectly
> -
>
> Key: IMPALA-7784
> URL: https://issues.apache.org/jira/browse/IMPALA-7784
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Csaba Ringhofer
>Assignee: Quanlong Huang
>Priority: Critical
>  Labels: correctness
>
> Repro:
> {code}
> create table tpart (i int) partitioned by (p string);
> insert into tpart partition (p="\"") values (1);
> select  * from tpart where p = "\"";
> Result:
> Fetched 0 row(s)
> select  * from tpart where p = '"';
> Result:
> 1,
> {code}
> Hive returns the row for both queries.
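For reference, the two predicates above use different quoting for the same 
value; a quick check in Python (a sketch of the string semantics, not Impala's 
parser) shows both literal forms denote the identical one-character string:

```python
# Both SQL literals denote a single double-quote character, so partition
# pruning should match the same partition for either predicate.
double_quoted = "\""  # analogous to SQL: p = "\""
single_quoted = '"'   # analogous to SQL: p = '"'

assert double_quoted == single_quoted == '"'
print(len(double_quoted))  # 1
```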






[jira] [Comment Edited] (IMPALA-7282) Sentry privilege disappears after a catalog refresh

2020-02-19 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040534#comment-17040534
 ] 

Fang-Yu Rao edited comment on IMPALA-7282 at 2/20/20 12:44 AM:
---

Hi [~joemcdonnell] and [~vihangk1], I took a look at the description above. I 
am able to reproduce the issue reported by [~fredyw].

As for [~vihangk1]'s question regarding whether this should be an issue, I 
briefly compared the case in which Impala is using Ranger as the authorization 
provider.

If we grant a user {{non_owner}} the {{SELECT}} privilege on a database, e.g., 
{{functional}}, and then grant {{non_owner}} the {{ALL}} privilege on 
{{SERVER}}, {{non_owner}} would possess 2 privileges.

Now if we revoke the {{ALL}} privilege on {{SERVER}} from {{non_owner}}, 
{{non_owner}} will still possess the {{SELECT}} privilege on the database 
{{functional}}.

I will try to see how Hive behaves in the situation described above with 
Sentry as the authorization provider and keep you posted.
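The expected bookkeeping can be sketched as a toy model (a Python sketch with 
invented names, not Impala or Ranger code): each grant is an independent 
(privilege, resource) entry, so revoking one entry leaves the others intact.

```python
# Toy model with invented names: privileges stored as independent
# (privilege, resource) entries in a set.
grants = set()

grants.add(("SELECT", "database:functional"))  # grant select on database functional
grants.add(("ALL", "server"))                  # grant all on server
assert len(grants) == 2                        # non_owner possesses 2 privileges

grants.discard(("ALL", "server"))              # revoke all on server
# The SELECT privilege on the database survives, as described above.
assert grants == {("SELECT", "database:functional")}
```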



was (Author: fangyurao):
Hi [~joemcdonnell] and [~vihangk1], I took a look at the description above. I 
am able to reproduce the issue reported by [~fredyw].

As for [~vihangk1]'s question regarding whether this should be an issue, I 
briefly compared the case in which Impala is using Ranger as the authorization 
provider.

If we grant a user {{non_owner}} the {{SELECT}} privilege of a database, e.g., 
{{functional}}, and then grant {{non_owner}} the {{ALL}} privilege on 
{{SERVER}} to {{non_owner}}, {{non_owner}} would possess 2 privileges.

Now if we revoke the {{ALL}} privilege from the user {{non_owner}}, it will 
still possess the {{SELECT}} privilege on the database {{functional}}.

I will try to see how Hive behaves with Sentry being the authorization provider 
in this situation described above and keep you posted.


> Sentry privilege disappears after a catalog refresh
> ---
>
> Key: IMPALA-7282
> URL: https://issues.apache.org/jira/browse/IMPALA-7282
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog, Security
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Priority: Critical
>  Labels: security
>
> {noformat}
> [localhost:21000] default> grant select on database functional to role 
> foo_role;
> Query: grant select on database functional to role foo_role
> +-+
> | summary |
> +-+
> | Privilege(s) have been granted. |
> +-+
> Fetched 1 row(s) in 0.05s
> [localhost:21000] default> grant all on database functional to role foo_role;
> Query: grant all on database functional to role foo_role
> +-+
> | summary |
> +-+
> | Privilege(s) have been granted. |
> +-+
> Fetched 1 row(s) in 0.03s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +--++---++-+---+--+-+
> | scope| database   | table | column | uri | privilege | grant_option | 
> create_time |
> +--++---++-+---+--+-+
> | database | functional |   || | select| false| 
> NULL|
> | database | functional |   || | all   | false| 
> NULL|
> +--++---++-+---+--+-+
> Fetched 2 row(s) in 0.02s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +--++---++-+---+--+---+
> | scope| database   | table | column | uri | privilege | grant_option | 
> create_time   |
> +--++---++-+---+--+---+
> | database | functional |   || | all   | false| 
> Wed, Jul 11 2018 15:38:41.113 |
> +--++---++-+---+--+---+
> Fetched 1 row(s) in 0.01s
> {noformat}






[jira] [Comment Edited] (IMPALA-7282) Sentry privilege disappears after a catalog refresh

2020-02-19 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040534#comment-17040534
 ] 

Fang-Yu Rao edited comment on IMPALA-7282 at 2/20/20 12:44 AM:
---

Hi [~joemcdonnell] and [~vihangk1], I took a look at the description above. I 
am able to reproduce the issue reported by [~fredyw].

As for [~vihangk1]'s question regarding whether this should be an issue, I 
briefly compared the case in which Impala is using Ranger as the authorization 
provider.

If we grant a user {{non_owner}} the {{SELECT}} privilege on a database, e.g., 
{{functional}}, and then grant {{non_owner}} the {{ALL}} privilege on 
{{SERVER}}, {{non_owner}} would possess 2 privileges.

Now if we revoke the {{ALL}} privilege on {{SERVER}} from {{non_owner}}, 
{{non_owner}} will still possess the {{SELECT}} privilege on the database 
{{functional}}.

I will try to see how Hive behaves in the situation described above with 
Sentry as the authorization provider and keep you posted.



was (Author: fangyurao):
Hi [~joemcdonnell] and [~vihangk1], I took a look at the description above. I 
am able to reproduce the issue reported by [~fredyw].

As for [~vihangk1]'s question regarding whether this should be an issue, I 
briefly compared the case in which Impala is using Ranger as the authorization 
provider. If we grant a user {{non_owner}} the {{SELECT}} privilege of a 
database, e.g., {{functional}}, and then grant {{non_owner}} the {{ALL}} 
privilege on {{SERVER}} to {{non_owner}}, {{non_owner}} would possess 2 
privileges.

Now if we revoke the {{ALL}} privilege from the user {{non_owner}}, it will 
still possess the {{SELECT}} privilege on the database {{functional}}.

I will try to see how Hive behaves with Sentry being the authorization provider 
in this situation described above and keep you posted.


> Sentry privilege disappears after a catalog refresh
> ---
>
> Key: IMPALA-7282
> URL: https://issues.apache.org/jira/browse/IMPALA-7282
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog, Security
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Priority: Critical
>  Labels: security
>
> {noformat}
> [localhost:21000] default> grant select on database functional to role 
> foo_role;
> Query: grant select on database functional to role foo_role
> +-+
> | summary |
> +-+
> | Privilege(s) have been granted. |
> +-+
> Fetched 1 row(s) in 0.05s
> [localhost:21000] default> grant all on database functional to role foo_role;
> Query: grant all on database functional to role foo_role
> +-+
> | summary |
> +-+
> | Privilege(s) have been granted. |
> +-+
> Fetched 1 row(s) in 0.03s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +--++---++-+---+--+-+
> | scope| database   | table | column | uri | privilege | grant_option | 
> create_time |
> +--++---++-+---+--+-+
> | database | functional |   || | select| false| 
> NULL|
> | database | functional |   || | all   | false| 
> NULL|
> +--++---++-+---+--+-+
> Fetched 2 row(s) in 0.02s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +--++---++-+---+--+---+
> | scope| database   | table | column | uri | privilege | grant_option | 
> create_time   |
> +--++---++-+---+--+---+
> | database | functional |   || | all   | false| 
> Wed, Jul 11 2018 15:38:41.113 |
> +--++---++-+---+--+---+
> Fetched 1 row(s) in 0.01s
> {noformat}






[jira] [Commented] (IMPALA-7282) Sentry privilege disappears after a catalog refresh

2020-02-19 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040534#comment-17040534
 ] 

Fang-Yu Rao commented on IMPALA-7282:
-

Hi [~joemcdonnell] and [~vihangk1], I took a look at the description above. I 
am able to reproduce the issue reported by [~fredyw].

As for [~vihangk1]'s question regarding whether this should be an issue, I 
briefly compared the case in which Impala is using Ranger as the authorization 
provider. If we grant a user {{non_owner}} the {{SELECT}} privilege on a 
database, e.g., {{functional}}, and then grant {{non_owner}} the {{ALL}} 
privilege on {{SERVER}}, {{non_owner}} would possess 2 privileges.

Now if we revoke the {{ALL}} privilege on {{SERVER}} from {{non_owner}}, 
{{non_owner}} will still possess the {{SELECT}} privilege on the database 
{{functional}}.

I will try to see how Hive behaves in the situation described above with 
Sentry as the authorization provider and keep you posted.


> Sentry privilege disappears after a catalog refresh
> ---
>
> Key: IMPALA-7282
> URL: https://issues.apache.org/jira/browse/IMPALA-7282
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog, Security
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Priority: Critical
>  Labels: security
>
> {noformat}
> [localhost:21000] default> grant select on database functional to role 
> foo_role;
> Query: grant select on database functional to role foo_role
> +-+
> | summary |
> +-+
> | Privilege(s) have been granted. |
> +-+
> Fetched 1 row(s) in 0.05s
> [localhost:21000] default> grant all on database functional to role foo_role;
> Query: grant all on database functional to role foo_role
> +-+
> | summary |
> +-+
> | Privilege(s) have been granted. |
> +-+
> Fetched 1 row(s) in 0.03s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +--++---++-+---+--+-+
> | scope| database   | table | column | uri | privilege | grant_option | 
> create_time |
> +--++---++-+---+--+-+
> | database | functional |   || | select| false| 
> NULL|
> | database | functional |   || | all   | false| 
> NULL|
> +--++---++-+---+--+-+
> Fetched 2 row(s) in 0.02s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +--++---++-+---+--+---+
> | scope| database   | table | column | uri | privilege | grant_option | 
> create_time   |
> +--++---++-+---+--+---+
> | database | functional |   || | all   | false| 
> Wed, Jul 11 2018 15:38:41.113 |
> +--++---++-+---+--+---+
> Fetched 1 row(s) in 0.01s
> {noformat}






[jira] [Commented] (IMPALA-9386) IMPALA-9287 breaks catalog table loading

2020-02-19 Thread Vihang Karajgaonkar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040533#comment-17040533
 ] 

Vihang Karajgaonkar commented on IMPALA-9386:
-

The code review brings back the previously reverted changes, modified so that 
they no longer break table loading due to conflicting flatbuffer libraries in 
the classpath.

> IMPALA-9287 breaks catalog table loading
> 
>
> Key: IMPALA-9386
> URL: https://issues.apache.org/jira/browse/IMPALA-9386
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
>
> IMPALA-9287 brings in hive-metastore dependency which brings in hive-serde 
> which brings in a different version of flatbuffers and may cause the table 
> loading to fail depending on which version of flatbuffer hive-metastore 
> brings in. We should exclude hive-serde from the dependency.






[jira] [Commented] (IMPALA-9378) CPU usage for runtime profiles with multithreading

2020-02-19 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040532#comment-17040532
 ] 

Tim Armstrong commented on IMPALA-9378:
---

https://gerrit.cloudera.org/#/c/15236/ - IMPALA-9381 should reduce about half 
of the time spent in CloseImpalaOperation(), according to the flame graph.

> CPU usage for runtime profiles with multithreading
> --
>
> Key: IMPALA-9378
> URL: https://issues.apache.org/jira/browse/IMPALA-9378
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
> Attachments: coord_q5_dop0.svg, coord_q5_dop16.svg
>
>
> [~drorke] reports that significant amounts of time can be spent on the 
> runtime profile with higher values of mt_dop. This can impact query 
> performance from the client's point of view since profile serialisation is on 
> the critical path for closing the query. Also serialising the profile for the 
> webserver holds the ClientRequestState's lock, so can block query progress.
> We should figure out how to make this more efficient.






[jira] [Work started] (IMPALA-9381) Lazily convert and/or cache different representations of the query profile

2020-02-19 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9381 started by Tim Armstrong.
-
> Lazily convert and/or cache different representations of the query profile
> --
>
> Key: IMPALA-9381
> URL: https://issues.apache.org/jira/browse/IMPALA-9381
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>
> There are some obvious inefficiencies with how the query state record works:
> * We do an unnecessary copy of the archive string when adding it to the query 
> log
> https://github.com/apache/impala/blob/79aae231443a305ce8503dbc7b4335e8ae3f3946/be/src/service/impala-server.cc#L1812.
> * We eagerly convert the profile to text and JSON, when in many cases they 
> won't be needed - 
> https://github.com/apache/impala/blob/79aae231443a305ce8503dbc7b4335e8ae3f3946/be/src/service/impala-server.cc#L1839
>  . I think it is generally rare for more than one profile format to be 
> downloaded from the web UI. I know of tools that scrape the thrift profile, 
> but the human-readable version would usually only be consumed by humans. We 
> could avoid this by only storing the thrift representation of the profile, 
> then reconstituting the other representations from thrift if requested.
> * After ComputeExecSummary(), the profile shouldn't change, but we'll 
> regenerate the thrift representation for every web request to get the 
> encoded profile. This may waste a lot of CPU for tools scraping the profiles.
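The "store only the thrift form, reconstitute on demand" idea described above 
amounts to lazy per-format caching. A minimal generic sketch in Python 
(invented names; Impala's implementation is C++ and not shown here):

```python
# Hedged sketch with invented names, not Impala's actual code: keep only the
# canonical representation and lazily build/cache the others on first request.
class ProfileCache:
    def __init__(self, thrift_bytes):
        self._thrift = thrift_bytes  # canonical form, always kept
        self._cache = {}             # lazily filled: format -> rendered form
        self.conversions = 0         # counts how often we actually convert

    def get(self, fmt, render):
        # render converts the canonical form; it runs at most once per format.
        if fmt not in self._cache:
            self._cache[fmt] = render(self._thrift)
            self.conversions += 1
        return self._cache[fmt]

p = ProfileCache(b"profile")
text = p.get("text", lambda t: t.decode())
text_again = p.get("text", lambda t: t.decode())
assert text == text_again == "profile"
assert p.conversions == 1  # second request served from the cache
```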






[jira] [Commented] (IMPALA-9386) IMPALA-9287 breaks catalog table loading

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040511#comment-17040511
 ] 

Joe McDonnell commented on IMPALA-9386:
---

[~vihangk1] IMPALA-9287 was reverted. Unless I'm missing something, the code 
review above is changing code that was reverted, and this issue is no longer 
present in master.

> IMPALA-9287 breaks catalog table loading
> 
>
> Key: IMPALA-9386
> URL: https://issues.apache.org/jira/browse/IMPALA-9386
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
>
> IMPALA-9287 brings in hive-metastore dependency which brings in hive-serde 
> which brings in a different version of flatbuffers and may cause the table 
> loading to fail depending on which version of flatbuffer hive-metastore 
> brings in. We should exclude hive-serde from the dependency.






[jira] [Commented] (IMPALA-9357) custom_cluster.test_event_processing.TestEventProcessing.test_self_events seems flaky in centos6

2020-02-19 Thread Vihang Karajgaonkar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040502#comment-17040502
 ] 

Vihang Karajgaonkar commented on IMPALA-9357:
-

I tried to reproduce this locally but could not: I ran the test in a loop 50 
times and did not see a single failure.

> custom_cluster.test_event_processing.TestEventProcessing.test_self_events 
> seems flaky in centos6
> 
>
> Key: IMPALA-9357
> URL: https://issues.apache.org/jira/browse/IMPALA-9357
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Fang-Yu Rao
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
>  Labels: broken-build, flaky-test
>
> The EE test {{test_self_events}} seems flaky. The stacktrace is provided 
> below.
> {code:java}
> Stacktrace
> custom_cluster/test_event_processing.py:167: in test_self_events
> self.__run_self_events_test(unique_database, True)
> custom_cluster/test_event_processing.py:232: in __run_self_events_test
> self.__exec_sql_and_check_selfevent_counter(stmt, use_impala)
> custom_cluster/test_event_processing.py:388: in 
> __exec_sql_and_check_selfevent_counter
> assert self_events_after > self_events
> E   assert 1 > 1
> {code}
> Since this test was recently added by 
> https://issues.apache.org/jira/browse/IMPALA-9101 and reviewed by 
> [~stigahuang], maybe you could provide some insight into it? The JIRA is 
> assigned to [~vihangk1] for now, but please feel free to re-assign it to 
> others as appropriate. Thanks!
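Independently of the root cause, counter assertions like {{assert 
self_events_after > self_events}} are often de-flaked by polling with a 
timeout instead of asserting once. A generic sketch (invented names, not the 
actual test code):

```python
import time

# Generic de-flaking sketch (invented names): poll a counter until it
# advances past a baseline, instead of asserting immediately after the
# triggering statement.
def wait_for_increase(read_counter, baseline, timeout_s=30.0, interval_s=0.2):
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if read_counter() > baseline:
            return True
        time.sleep(interval_s)
    return False

# Toy demonstration: the counter catches up after a few polls.
ticks = {"n": 0}
def read_counter():
    ticks["n"] += 1  # stands in for a metric that advances over time
    return ticks["n"]

assert wait_for_increase(read_counter, baseline=2)
```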






[jira] [Commented] (IMPALA-9386) IMPALA-9287 breaks catalog table loading

2020-02-19 Thread Vihang Karajgaonkar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040498#comment-17040498
 ] 

Vihang Karajgaonkar commented on IMPALA-9386:
-

Hi [~joemcdonnell], the code review is already out and I am waiting on 
reviewers. This is not a complex change, and it would be great if we could get 
it into 3.4. I am not sure there will be a 3.5 in the near future (as per your 
email "Impala 3.4.0 release" on d...@impala.apache.org), so I think we should 
try to commit this for 3.4.

> IMPALA-9287 breaks catalog table loading
> 
>
> Key: IMPALA-9386
> URL: https://issues.apache.org/jira/browse/IMPALA-9386
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
>
> IMPALA-9287 brings in hive-metastore dependency which brings in hive-serde 
> which brings in a different version of flatbuffers and may cause the table 
> loading to fail depending on which version of flatbuffer hive-metastore 
> brings in. We should exclude hive-serde from the dependency.






[jira] [Commented] (IMPALA-9091) query_test.test_scanners.TestScannerReservation.test_scanners flaky

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040496#comment-17040496
 ] 

Joe McDonnell commented on IMPALA-9091:
---

[~boroknagyz] Can you triage this and let me know if this is something that we 
should fix in Impala 3.4?

> query_test.test_scanners.TestScannerReservation.test_scanners flaky
> ---
>
> Key: IMPALA-9091
> URL: https://issues.apache.org/jira/browse/IMPALA-9091
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Tim Armstrong
>Assignee: Zoltán Borók-Nagy
>Priority: Critical
>  Labels: flaky
>
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/8463
> {noformat}
> E   AssertionError: Did not find matches for lines in runtime profile:
> E   EXPECTED LINES:
> E   row_regex:.*ParquetRowGroupIdealReservation.*Avg: 3.50 MB.*
> E   
> E   ACTUAL PROFILE:
> E   Query (id=3b48738ce971e36b:b6f52bf5):
> E DEBUG MODE WARNING: Query profile created while running a DEBUG build 
> of Impala. Use RELEASE builds to measure query performance.
> E Summary:
> E   Session ID: 2e4c96b22f2ac6e3:88afa967b63e7983
> E   Session Type: BEESWAX
> E   Start Time: 2019-10-24 21:22:06.311001000
> E   End Time: 2019-10-24 21:22:06.520778000
> E   Query Type: QUERY
> E   Query State: FINISHED
> E   Query Status: OK
> E   Impala Version: impalad version 3.4.0-SNAPSHOT DEBUG (build 
> 8c60e91f7c3812aca14739535a994d21c51fc0b0)
> E   User: ubuntu
> E   Connected User: ubuntu
> E   Delegated User: 
> E   Network Address: :::127.0.0.1:37312
> E   Default Db: functional
> E   Sql Statement: select * from tpch_parquet.lineitem
> E   where l_orderkey < 10
> E   Coordinator: ip-172-31-20-105:22000
> E   Query Options (set by configuration): 
> ABORT_ON_ERROR=1,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,DISABLE_CODEGEN_ROWS_THRESHOLD=0,TIMEZONE=Universal,CLIENT_IDENTIFIER=query_test/test_scanners.py::TestScannerReservation::()::test_scanners[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table_form
> E   Query Options (set by configuration and planner): 
> ABORT_ON_ERROR=1,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,MT_DOP=0,DISABLE_CODEGEN_ROWS_THRESHOLD=0,TIMEZONE=Universal,CLIENT_IDENTIFIER=query_test/test_scanners.py::TestScannerReservation::()::test_scanners[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table_form
> E   Plan: 
> E   
> E   Max Per-Host Resource Reservation: Memory=40.00MB Threads=3
> E   Per-Host Resource Estimates: Memory=1.26GB
> E   Analyzed query: SELECT * FROM tpch_parquet.lineitem WHERE l_orderkey < 
> CAST(10
> E   AS BIGINT)
> E   
> E   F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> E   |  Per-Host Resources: mem-estimate=10.69MB mem-reservation=0B 
> thread-reservation=1
> E   PLAN-ROOT SINK
> E   |  output exprs: tpch_parquet.lineitem.l_orderkey, 
> tpch_parquet.lineitem.l_partkey, tpch_parquet.lineitem.l_suppkey, 
> tpch_parquet.lineitem.l_linenumber, tpch_parquet.lineitem.l_quantity, 
> tpch_parquet.lineitem.l_extendedprice, tpch_parquet.lineitem.l_discount, 
> tpch_parquet.lineitem.l_tax, tpch_parquet.lineitem.l_returnflag, 
> tpch_parquet.lineitem.l_linestatus, tpch_parquet.lineitem.l_shipdate, 
> tpch_parquet.lineitem.l_commitdate, tpch_parquet.lineitem.l_receiptdate, 
> tpch_parquet.lineitem.l_shipinstruct, tpch_parquet.lineitem.l_shipmode, 
> tpch_parquet.lineitem.l_comment
> E   |  mem-estimate=0B mem-reservation=0B thread-reservation=0
> E   |
> E   01:EXCHANGE [UNPARTITIONED]
> E   |  mem-estimate=10.69MB mem-reservation=0B thread-reservation=0
> E   |  tuple-ids=0 row-size=231B cardinality=600.12K
> E   |  in pipelines: 00(GETNEXT)
> E   |
> E   F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
> E   Per-Host Resources: mem-estimate=1.25GB mem-reservation=40.00MB 
> thread-reservation=2
> E   00:SCAN HDFS [tpch_parquet.lineitem, RANDOM]
> E  HDFS partitions=1/1 files=3 size=193.97MB
> E  predicates: l_orderkey < CAST(10 AS BIGINT)
> E  stored statistics:
> Etable: rows=6.00M size=193.97MB
> Ecolumns: all
> E  extrapolated-rows=disabled max-scan-range-rows=2.14M
> E  parquet statistics predicates: l_orderkey < CAST(10 AS BIGINT)
> E  parquet dictionary predicates: l_orderkey < CAST(10 AS BIGINT)
> E  mem-estimate=1.25GB mem-reservation=40.00MB thread-reservation=1
> E  tuple-ids=0 row-size=231B cardinality=600.12K
> E  in pipelines: 00(GETNEXT)
> E   
> E   

[jira] [Commented] (IMPALA-7282) Sentry privilege disappears after a catalog refresh

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040494#comment-17040494
 ] 

Joe McDonnell commented on IMPALA-7282:
---

[~vihangk1] [~fangyurao] Can you triage this and let me know if this needs a 
fix in Impala 3.4?

> Sentry privilege disappears after a catalog refresh
> ---
>
> Key: IMPALA-7282
> URL: https://issues.apache.org/jira/browse/IMPALA-7282
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog, Security
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Priority: Critical
>  Labels: security
>
> {noformat}
> [localhost:21000] default> grant select on database functional to role 
> foo_role;
> Query: grant select on database functional to role foo_role
> +-+
> | summary |
> +-+
> | Privilege(s) have been granted. |
> +-+
> Fetched 1 row(s) in 0.05s
> [localhost:21000] default> grant all on database functional to role foo_role;
> Query: grant all on database functional to role foo_role
> +-+
> | summary |
> +-+
> | Privilege(s) have been granted. |
> +-+
> Fetched 1 row(s) in 0.03s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +--++---++-+---+--+-+
> | scope| database   | table | column | uri | privilege | grant_option | 
> create_time |
> +--++---++-+---+--+-+
> | database | functional |   || | select| false| 
> NULL|
> | database | functional |   || | all   | false| 
> NULL|
> +--++---++-+---+--+-+
> Fetched 2 row(s) in 0.02s
> [localhost:21000] default> show grant role foo_role;
> Query: show grant role foo_role
> +--++---++-+---+--+---+
> | scope| database   | table | column | uri | privilege | grant_option | 
> create_time   |
> +--++---++-+---+--+---+
> | database | functional |   || | all   | false| 
> Wed, Jul 11 2018 15:38:41.113 |
> +--++---++-+---+--+---+
> Fetched 1 row(s) in 0.01s
> {noformat}






[jira] [Commented] (IMPALA-8521) Lots of "unreleased ByteBuffers allocated by read()" errors from HDFS client

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040493#comment-17040493
 ] 

Joe McDonnell commented on IMPALA-8521:
---

[~stakiar] [~tarmstrong] Is this something we should fix for Impala 3.4?

> Lots of "unreleased ByteBuffers allocated by read()" errors from HDFS client
> 
>
> Key: IMPALA-8521
> URL: https://issues.apache.org/jira/browse/IMPALA-8521
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Sahil Takiar
>Priority: Critical
>
> I'm looking at some job logs and seeing a bunch of errors like this. I don't 
> know if it's benign or if it's something more serious.
> {noformat}
> I0507 07:34:53.934693 20195 scan-range.cc:607] 
> dd4d6eb8d2ad9587:6b44fe1b0002] Cache read failed for scan range: 
> file=hdfs://localhost:20500/test-warehouse/f861f1a3/nation.tbl disk_id=0 
> offset=1024  exclusive_hdfs_fh=0xec09220 num_remote_bytes=0 cancel_status= 
> buffer_queue=0 num_buffers_in_readers=0 unused_iomgr_buffers=0 
> unused_iomgr_buffer_bytes=0 blocked_on_buffer=0. Switching to disk read path.
> W0507 07:34:53.934787 20195 DFSInputStream.java:668] 
> dd4d6eb8d2ad9587:6b44fe1b0002] closing file 
> /test-warehouse/f861f1a3/nation.tbl, but there are still unreleased 
> ByteBuffers allocated by read().  Please release 
> java.nio.DirectByteBufferR[pos=1024 lim=2048 cap=2199].
> {noformat}






[jira] [Commented] (IMPALA-8830) Coordinator-only queries get queued when there are no executor groups

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040488#comment-17040488
 ] 

Joe McDonnell commented on IMPALA-8830:
---

[~bikramjeet.vig] [~tarmstrong] Is this a blocker for Impala 3.4?

> Coordinator-only queries get queued when there are no executor groups
> -
>
> Key: IMPALA-8830
> URL: https://issues.apache.org/jira/browse/IMPALA-8830
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Blocker
>  Labels: admission-control, resource-management
>
> Reproduction:
> {noformat}
> tarmstrong@tarmstrong-box:~/Impala/incubator-impala$ start-impala-cluster.py 
> -s1 --use_exclusive_coordinators;
> [localhost:21000] default> select * from tpch.lineitem order by l_orderkey 
> limit 5;
> ERROR: Admission for query exceeded timeout 6ms in pool default-pool. 
> Queued reason: No healthy executor groups found for pool default-pool.
> [localhost:21000] default> select 1;
> ERROR: Admission for query exceeded timeout 6ms in pool default-pool. 
> Queued reason: No healthy executor groups found for pool default-pool.
> {noformat}
> I expected that the second query should run immediately since it doesn't 
> actually need to be scheduled on any executors. I suspect this may be a 
> regression from the executor group changes, but didn't confirm.






[jira] [Commented] (IMPALA-8533) Impala daemon crash on sort

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040484#comment-17040484
 ] 

Joe McDonnell commented on IMPALA-8533:
---

[~kdeschle] FYI, This is on my list of things we should fix for Impala 3.4.

> Impala daemon crash on sort
> ---
>
> Key: IMPALA-8533
> URL: https://issues.apache.org/jira/browse/IMPALA-8533
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Jacob Evan Beard
>Assignee: Kurt Deschler
>Priority: Blocker
>  Labels: crash
> Attachments: fatal_error.txt, hs_err_pid8552.log, query.txt
>
>
> Running the attached data generation query crashes the Impala coordinator 
> daemon.






[jira] [Commented] (IMPALA-9357) custom_cluster.test_event_processing.TestEventProcessing.test_self_events seems flaky in centos6

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040482#comment-17040482
 ] 

Joe McDonnell commented on IMPALA-9357:
---

[~vihangk1] This is on my list of things we should fix for Impala 3.4.

> custom_cluster.test_event_processing.TestEventProcessing.test_self_events 
> seems flaky in centos6
> 
>
> Key: IMPALA-9357
> URL: https://issues.apache.org/jira/browse/IMPALA-9357
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Fang-Yu Rao
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
>  Labels: broken-build, flaky-test
>
> The EE test {{test_self_events}} seems flaky. The stack trace is provided 
> below.
> {code:java}
> Stacktrace
> custom_cluster/test_event_processing.py:167: in test_self_events
> self.__run_self_events_test(unique_database, True)
> custom_cluster/test_event_processing.py:232: in __run_self_events_test
> self.__exec_sql_and_check_selfevent_counter(stmt, use_impala)
> custom_cluster/test_event_processing.py:388: in 
> __exec_sql_and_check_selfevent_counter
> assert self_events_after > self_events
> E   assert 1 > 1
> {code}
> Since this test was recently added by 
> https://issues.apache.org/jira/browse/IMPALA-9101 and reviewed by 
> [~stigahuang], maybe you could provide some insight into it? The JIRA is 
> assigned to [~vihangk1] for now, but please feel free to re-assign it to 
> others as appropriate. Thanks!
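One way to deflake this kind of check is to poll the metric with a deadline instead of comparing it once, since the event processor updates the counter asynchronously. A minimal Python sketch of the idea (not the actual test code; `get_counter` and the timeout values are hypothetical):

```python
import time


def wait_for_counter_increase(get_counter, baseline, timeout_s=10.0, interval_s=0.5):
    """Poll get_counter() until it exceeds baseline or the timeout elapses.

    Returns the last observed value so the caller can assert on it. This
    avoids a single racy `assert after > before` against a metric that is
    updated asynchronously.
    """
    deadline = time.time() + timeout_s
    value = get_counter()
    while value <= baseline and time.time() < deadline:
        time.sleep(interval_s)
        value = get_counter()
    return value
```

The test would then assert on the returned value, tolerating a short delay before the self-events counter advances.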






[jira] [Commented] (IMPALA-9351) AnalyzeDDLTest.TestCreateTableLikeFileOrc failed due to non-existing path

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040483#comment-17040483
 ] 

Joe McDonnell commented on IMPALA-9351:
---

[~norbertluksa] FYI, This is on my list of things we should fix for Impala 3.4.

> AnalyzeDDLTest.TestCreateTableLikeFileOrc failed due to non-existing path
> -
>
> Key: IMPALA-9351
> URL: https://issues.apache.org/jira/browse/IMPALA-9351
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Fang-Yu Rao
>Assignee: Norbert Luksa
>Priority: Blocker
>  Labels: broken-build, flaky-test
>
> AnalyzeDDLTest.TestCreateTableLikeFileOrc failed due to a non-existing path. 
> Specifically, we see the following error message.
> {code:java}
> Error Message
> Error during analysis:
> org.apache.impala.common.AnalysisException: Cannot infer schema, path does 
> not exist: 
> hdfs://localhost:20500/test-warehouse/functional_orc_def.db/complextypes_fileformat/00_0
> sql:
> create table if not exists newtbl_DNE like orc 
> '/test-warehouse/functional_orc_def.db/complextypes_fileformat/00_0'
> {code}
> The stack trace is provided in the following.
> {code:java}
> Stacktrace
> java.lang.AssertionError: 
> Error during analysis:
> org.apache.impala.common.AnalysisException: Cannot infer schema, path does 
> not exist: 
> hdfs://localhost:20500/test-warehouse/functional_orc_def.db/complextypes_fileformat/00_0
> sql:
> create table if not exists newtbl_DNE like orc 
> '/test-warehouse/functional_orc_def.db/complextypes_fileformat/00_0'
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.impala.common.FrontendFixture.analyzeStmt(FrontendFixture.java:397)
>   at 
> org.apache.impala.common.FrontendTestBase.AnalyzesOk(FrontendTestBase.java:244)
>   at 
> org.apache.impala.common.FrontendTestBase.AnalyzesOk(FrontendTestBase.java:185)
>   at 
> org.apache.impala.analysis.AnalyzeDDLTest.TestCreateTableLikeFileOrc(AnalyzeDDLTest.java:2045)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:272)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:236)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:386)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:323)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:143)
> {code}
> This test was recently added by [~norbertluksa] and reviewed (+2) by 
> [~boroknagyz], so maybe [~boroknagyz] could provide some insight? Thanks!




[jira] [Commented] (IMPALA-9386) IMPALA-9287 breaks catalog table loading

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040481#comment-17040481
 ] 

Joe McDonnell commented on IMPALA-9386:
---

[~vihangk1] We reverted IMPALA-9287, so can you update the priority? Is there 
content from this Jira that should go into Impala 3.4?

> IMPALA-9287 breaks catalog table loading
> 
>
> Key: IMPALA-9386
> URL: https://issues.apache.org/jira/browse/IMPALA-9386
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
>
> IMPALA-9287 brings in the hive-metastore dependency, which pulls in 
> hive-serde, which in turn brings in a different version of flatbuffers. This 
> may cause table loading to fail depending on which flatbuffers version 
> hive-metastore brings in. We should exclude hive-serde from the dependency.






[jira] [Resolved] (IMPALA-9383) HS2 HTTP server hangs on large chunked requests

2020-02-19 Thread Thomas Tauber-Marshall (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Tauber-Marshall resolved IMPALA-9383.

Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> HS2 HTTP server hangs on large chunked requests
> ---
>
> Key: IMPALA-9383
> URL: https://issues.apache.org/jira/browse/IMPALA-9383
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
> Fix For: Impala 3.4.0
>
>
> There's a bug in THttpTransport that causes it to hang when sent a large, 
> chunked request.
> The issue is that every call to THttpTransport::read() starts by calling 
> refill(), which reads more data off the socket, but for chunked requests 
> each call to read() only processes a single chunk.
> So, if more than one chunk is read off the socket at a time, you can end up 
> with chunks still waiting to be processed but no more data left to read off 
> the socket, and the next call to THttpTransport::read() will hang when it 
> calls refill().
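The fix described above can be illustrated with a toy chunked reader in Python (a sketch of the idea only, not Thrift's actual C++ code; the framing parser is deliberately simplified): decode any chunks already sitting in the buffer before going back to the socket, so read() never blocks while payload is pending.

```python
class ChunkedReader:
    """Toy model: drain buffered chunks before refilling from the socket."""

    def __init__(self, recv):
        self.recv = recv      # callable returning raw bytes off the "socket"
        self.buf = b""        # raw bytes not yet stripped of chunk framing
        self.decoded = b""    # payload ready to hand to the caller

    def _refill(self):
        self.buf += self.recv()

    def _decode_buffered_chunks(self):
        # Simplified framing: "<hex size>\r\n<payload>\r\n"; size 0 ends the stream.
        while b"\r\n" in self.buf:
            header, rest = self.buf.split(b"\r\n", 1)
            size = int(header, 16)
            if len(rest) < size + 2:
                return  # incomplete chunk; genuinely need more socket data
            self.decoded += rest[:size]
            self.buf = rest[size + 2:]
            if size == 0:
                return

    def read(self, n):
        # Decode what is already buffered first; only touch the socket when
        # nothing is pending. Refilling unconditionally is what caused the hang.
        self._decode_buffered_chunks()
        if not self.decoded:
            self._refill()
            self._decode_buffered_chunks()
        out, self.decoded = self.decoded[:n], self.decoded[n:]
        return out
```

With this ordering, two chunks arriving in one socket read are both served without a second (blocking) refill.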






[jira] [Commented] (IMPALA-9405) Improvements for Frontend#metaStoreClientPool_

2020-02-19 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040464#comment-17040464
 ] 

Sahil Takiar commented on IMPALA-9405:
--

[~boroknagyz], [~vihangk1] does this make sense?

> Improvements for Frontend#metaStoreClientPool_
> --
>
> Key: IMPALA-9405
> URL: https://issues.apache.org/jira/browse/IMPALA-9405
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Sahil Takiar
>Priority: Major
>
> While trying to resurrect {{tests/experiments/test_catalog_hms_failures.py}} 
> I noticed the test {{TestCatalogHMSFailures::test_start_catalog_before_hms}} 
> has started to fail. The reason is that when this test was written, only the 
> catalogd was connecting to HMS, but with catalog v2 and ACID integration this 
> is no longer the case.
> It looks like catalog v2 honors {{initial_hms_cnxn_timeout_s}} (at least 
> {{DirectMetaProvider}} honors the flag, and I *think* that is part of the 
> metadata v2 code), but the {{Frontend}} Java class has a member variable 
> {{metaStoreClientPool_}} that does not use the flag. It looks like that pool 
> was added for ACID integration.
> The flag {{initial_hms_cnxn_timeout_s}} was added in IMPALA-4278 to help with 
> concurrent startup of Impala and HMS.
> Somewhat related to this issue: there seem to be multiple places where 
> Impala creates a {{MetaStoreClientPool}}. I think it would make more sense 
> to have one global pool that is used across the process. Doing so would 
> improve connection re-use and possibly decrease the number of HMS 
> connections. There is actually a TODO in {{DirectMetaProvider}} as well that 
> says {{msClientPool_}} should be a process-wide singleton.






[jira] [Created] (IMPALA-9405) Improvements for Frontend#metaStoreClientPool_

2020-02-19 Thread Sahil Takiar (Jira)
Sahil Takiar created IMPALA-9405:


 Summary: Improvements for Frontend#metaStoreClientPool_
 Key: IMPALA-9405
 URL: https://issues.apache.org/jira/browse/IMPALA-9405
 Project: IMPALA
  Issue Type: Improvement
  Components: Frontend
Reporter: Sahil Takiar


While trying to resurrect {{tests/experiments/test_catalog_hms_failures.py}} I 
noticed the test {{TestCatalogHMSFailures::test_start_catalog_before_hms}} has 
started to fail. The reason is that when this test was written, only the 
catalogd was connecting to HMS, but with catalog v2 and ACID integration this 
is no longer the case.

It looks like catalog v2 honors {{initial_hms_cnxn_timeout_s}} (at least 
{{DirectMetaProvider}} honors the flag, and I *think* that is part of the 
metadata v2 code), but the {{Frontend}} Java class has a member variable 
{{metaStoreClientPool_}} that does not use the flag. It looks like that pool 
was added for ACID integration.

The flag {{initial_hms_cnxn_timeout_s}} was added in IMPALA-4278 to help with 
concurrent startup of Impala and HMS.

Somewhat related to this issue: there seem to be multiple places where 
Impala creates a {{MetaStoreClientPool}}. I think it would make more sense to 
have one global pool that is used across the process. Doing so would 
improve connection re-use and possibly decrease the number of HMS connections. 
There is actually a TODO in {{DirectMetaProvider}} as well that says 
{{msClientPool_}} should be a process-wide singleton.
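As an illustration of the suggested direction (a Python sketch of the pattern only, with hypothetical names, rather than Impala's actual Java code): a lazily initialized, process-wide pool that every component shares.

```python
import threading


class SharedClientPool:
    """Sketch of a process-wide client pool. All names are hypothetical;
    this only illustrates the singleton pattern the TODO describes."""

    _instance = None
    _lock = threading.Lock()

    def __init__(self, size):
        # Placeholder "connections"; a real pool would open HMS clients here.
        self.clients = [object() for _ in range(size)]

    @classmethod
    def instance(cls, size=5):
        # Double-checked locking: the first caller creates the pool; later
        # callers (e.g. the frontend, the metadata provider) reuse the same one.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = cls(size)
        return cls._instance
```

With one shared pool, the number of HMS connections is bounded by a single pool size instead of growing with each component that builds its own pool.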






[jira] [Updated] (IMPALA-8693) RPC failure between coordinator and executor leads to bogus "Cancelled" error

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-8693:
--
Target Version: Impala 4.0  (was: Impala 3.4.0)

> RPC failure between coordinator and executor leads to bogus "Cancelled" error
> -
>
> Key: IMPALA-8693
> URL: https://issues.apache.org/jira/browse/IMPALA-8693
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Michael Ho
>Priority: Critical
>  Labels: observability
> Attachments: 2b4de6f8564f0cef_9ff7b875_INFO
>
>
> Impala returns bogus "Cancelled" error for some RPC failures. I saw an 
> example caused by KUDU-2871. I've attached log lines relating to the query. 
> The shell output looked like this:
> {noformat}
> Query submitted at: 2019-06-19 09:54:40 (Coordinator: 
> https://quasar-epvzqp-4.vpc.cloudera.com:25000)
> Query progress can be monitored at: 
> https://quasar-epvzqp-4.vpc.cloudera.com:25000/query_plan?query_id=2b4de6f8564f0cef:9ff7b875
> ERROR: Cancelled
> {noformat}






[jira] [Commented] (IMPALA-9391) Impala 3.3.0 can't support Transactional (ACID) tables

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040413#comment-17040413
 ] 

Joe McDonnell commented on IMPALA-9391:
---

To use Hive ACID tables, Impala needs to be compiled with Hive 3 support by 
building with the USE_CDP_HIVE environment variable set to "true". Without 
USE_CDP_HIVE=true, Impala compiles against Hive 2, which does not have Hive 
ACID support. The error you are seeing is the error that Impala would throw if 
it does not have Hive ACID support, so that may be the cause.

This error text is a bit misleading; it should state clearly when Impala has 
been compiled without Hive ACID support.

> Impala 3.3.0  can't support Transactional (ACID) tables
> ---
>
> Key: IMPALA-9391
> URL: https://issues.apache.org/jira/browse/IMPALA-9391
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.3.0
> Environment: hdp : CDH-6.2.1  
> impala: Apache-impala-3.3.0
>Reporter: liu
>Priority: Critical
> Attachments: 112233.jpg
>
>
> IMPALA-8813 already added support for creating, inserting into, and reading 
> insert-only ACID tables, but I created the Hive table like this:
> {code:java}
> CREATE TABLE test_orc9 (
> id INT,
> name STRING
> )
> CLUSTERED BY (id)  INTO 5 BUCKETS
> STORED AS ORC
> TBLPROPERTIES("transactional"="true","transactional_properties"="insert_only","compress.mode"="SNAPPY")
> ;
> {code}
> Query sql: select * from test_orc9
> Error: AnalysisException: Table ods.test_orc9 not supported. Transactional 
> (ACID) tables are only supported for read
>  






[jira] [Updated] (IMPALA-8582) HDFS Datanodes fail to start with USE_CDP_HIVE=true on Centos 6

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-8582:
--
Target Version: Impala 4.0  (was: Impala 3.4.0)

> HDFS Datanodes fail to start with USE_CDP_HIVE=true on Centos 6
> ---
>
> Key: IMPALA-8582
> URL: https://issues.apache.org/jira/browse/IMPALA-8582
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.3.0
>Reporter: Joe McDonnell
>Priority: Critical
>  Labels: broken-build
>
> On Centos 6, the HDFS Datanode won't start up with this error:
> {noformat}
> 2019-05-22 22:35:49,852 WARN org.apache.hadoop.util.NativeCodeLoader: Unable 
> to load native-hadoop library for your platform... using builtin-java classes 
> where applicable
> ...
> 2019-05-22 22:35:52,497 ERROR 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
> java.lang.RuntimeException: Cannot start datanode because the configured max 
> locked memory size (dfs.datanode.max.locked.memory) is greater than zero and 
> native code is not available.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1379)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.&lt;init&gt;(DataNode.java:500)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2782)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2690)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2732)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2876)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2900)
> 2019-05-22 22:35:52,506 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: java.lang.RuntimeException: Cannot start datanode because the 
> configured max locked memory size (dfs.datanode.max.locked.memory) is greater 
> than zero and native code is not available.{noformat}
> There must be something about the CDP version of Hadoop binaries that impacts 
> this. As far as I know, the CDP Hadoop binaries are built on Centos 7. This 
> is likely to be fixed by getting appropriate binaries. Anecdotally, this 
> seems fine on Ubuntu 16.04 and Centos 7. 






[jira] [Commented] (IMPALA-8582) HDFS Datanodes fail to start with USE_CDP_HIVE=true on Centos 6

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040408#comment-17040408
 ] 

Joe McDonnell commented on IMPALA-8582:
---

This has never worked, so I'm not considering it a blocker for Impala 3.4. In 
the Impala 4 discussions, we'll decide whether Centos 6 will be one of the 
platforms.

> HDFS Datanodes fail to start with USE_CDP_HIVE=true on Centos 6
> ---
>
> Key: IMPALA-8582
> URL: https://issues.apache.org/jira/browse/IMPALA-8582
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.3.0
>Reporter: Joe McDonnell
>Priority: Critical
>  Labels: broken-build
>
> On Centos 6, the HDFS Datanode won't start up with this error:
> {noformat}
> 2019-05-22 22:35:49,852 WARN org.apache.hadoop.util.NativeCodeLoader: Unable 
> to load native-hadoop library for your platform... using builtin-java classes 
> where applicable
> ...
> 2019-05-22 22:35:52,497 ERROR 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
> java.lang.RuntimeException: Cannot start datanode because the configured max 
> locked memory size (dfs.datanode.max.locked.memory) is greater than zero and 
> native code is not available.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1379)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.&lt;init&gt;(DataNode.java:500)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2782)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2690)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2732)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2876)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2900)
> 2019-05-22 22:35:52,506 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: java.lang.RuntimeException: Cannot start datanode because the 
> configured max locked memory size (dfs.datanode.max.locked.memory) is greater 
> than zero and native code is not available.{noformat}
> There must be something about the CDP version of Hadoop binaries that impacts 
> this. As far as I know, the CDP Hadoop binaries are built on Centos 7. This 
> is likely to be fixed by getting appropriate binaries. Anecdotally, this 
> seems fine on Ubuntu 16.04 and Centos 7. 






[jira] [Resolved] (IMPALA-8500) test_timestamp_out_of_range fails with NoSuchObjectException: test_timestamp_out_of_range_dc37915d on S3

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-8500.
---
Fix Version/s: Impala 3.3.0
   Resolution: Fixed

This is an S3 consistency issue, and we have added support for running the 
tests with S3Guard. I'm resolving this as fixed by the S3Guard change.

> test_timestamp_out_of_range fails with NoSuchObjectException: 
> test_timestamp_out_of_range_dc37915d on S3
> 
>
> Key: IMPALA-8500
> URL: https://issues.apache.org/jira/browse/IMPALA-8500
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Tim Armstrong
>Assignee: Joe McDonnell
>Priority: Critical
>  Labels: broken-build
> Fix For: Impala 3.3.0
>
>
> I'm not sure what to make of this. Maybe you have an idea:
> {noformat}
> Error Message
> test setup failure
> Stacktrace
> conftest.py:319: in cleanup
> {'sync_ddl': sync_ddl})
> common/impala_test_suite.py:620: in wrapper
> return function(*args, **kwargs)
> common/impala_test_suite.py:628: in execute_query_expect_success
> result = cls.__execute_query(impalad_client, query, query_options, user)
> common/impala_test_suite.py:722: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:180: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:187: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:362: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:356: in execute_query_async
> handle = self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:516: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION: 
> EMESSAGE: ImpalaRuntimeException: Error making 'dropDatabase' RPC to Hive 
> Metastore: 
> E   CAUSED BY: NoSuchObjectException: test_timestamp_out_of_range_dc37915d
> {noformat}
> It does look like it created the database and used it fine:
> {noformat}
> -- 2019-05-02 11:09:21,883 INFO MainThread: Started query 
> e847e67be097c2a0:6c29d4f4
> SET 
> client_identifier=query_test/test_scanners.py::TestParquet::()::test_timestamp_out_of_range[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action':None;'exec_single_node_rows_t;
> -- executing against localhost:21000
> use test_timestamp_out_of_range_dc37915d;
> -- 2019-05-02 11:09:22,266 INFO MainThread: Started query 
> ab498d07351289b2:03f61983
> SET 
> client_identifier=query_test/test_scanners.py::TestParquet::()::test_timestamp_out_of_range[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action':None;'exec_single_node_rows_t;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> SELECT * FROM out_of_range_timestamp;
> -- 2019-05-02 11:09:22,273 INFO MainThread: Started query 
> be4886563821196f:871eb37f
> -- executing against localhost:21000
> SELECT * FROM out_of_range_time_of_day;
> {noformat}
> Maybe some S3 consistency issue?






[jira] [Resolved] (IMPALA-7482) Deadlock with unknown lock holder in JVM in java.security.Provider.getService()

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-7482.
---
Fix Version/s: Not Applicable
   Resolution: Cannot Reproduce

> Deadlock with unknown lock holder in JVM in 
> java.security.Provider.getService()
> ---
>
> Key: IMPALA-7482
> URL: https://issues.apache.org/jira/browse/IMPALA-7482
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Tim Armstrong
>Assignee: Joe McDonnell
>Priority: Critical
>  Labels: hang
> Fix For: Not Applicable
>
> Attachments: vb1220-jstack2.out
>
>
> We've seen several instances of these mystery deadlocks in impalad's embedded 
> JVM. The signature is a deadlock stemming from sun.security.provider.Sun 
> being locked by an unknown owner.
> {noformat}
> Found one Java-level deadlock:
> =
> "Thread-24":
>   waiting to lock monitor 0x12364688 (object 0x8027ef30, a 
> sun.security.provider.Sun),
>   which is held by UNKNOWN_owner_addr=0x14120800
> {noformat}
> If this happens in HDFS, it causes HDFS I/O to hang and queries to get stuck. 
> If it happens in the Kudu client it also causes hangs.






[jira] [Commented] (IMPALA-7482) Deadlock with unknown lock holder in JVM in java.security.Provider.getService()

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040396#comment-17040396
 ] 

Joe McDonnell commented on IMPALA-7482:
---

JDK-8215355 was fixed in 8u251. I am not aware of this reproducing on a JDK 
with the fix. I'm going to resolve this, but if it reoccurs we can reopen it.

> Deadlock with unknown lock holder in JVM in 
> java.security.Provider.getService()
> ---
>
> Key: IMPALA-7482
> URL: https://issues.apache.org/jira/browse/IMPALA-7482
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Tim Armstrong
>Assignee: Joe McDonnell
>Priority: Critical
>  Labels: hang
> Attachments: vb1220-jstack2.out
>
>
> We've seen several instances of these mystery deadlocks in impalad's embedded 
> JVM. The signature is a deadlock stemming from sun.security.provider.Sun 
> being locked by an unknown owner.
> {noformat}
> Found one Java-level deadlock:
> =
> "Thread-24":
>   waiting to lock monitor 0x12364688 (object 0x8027ef30, a 
> sun.security.provider.Sun),
>   which is held by UNKNOWN_owner_addr=0x14120800
> {noformat}
> If this happens in HDFS, it causes HDFS I/O to hang and queries to get stuck. 
> If it happens in the Kudu client it also causes hangs.






[jira] [Updated] (IMPALA-8533) Impala daemon crash on sort

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-8533:
--
Target Version: Impala 3.4.0  (was: Impala 3.3.0)

> Impala daemon crash on sort
> ---
>
> Key: IMPALA-8533
> URL: https://issues.apache.org/jira/browse/IMPALA-8533
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Jacob Evan Beard
>Assignee: Kurt Deschler
>Priority: Blocker
>  Labels: crash
> Attachments: fatal_error.txt, hs_err_pid8552.log, query.txt
>
>
> Running the attached data generation query crashes the Impala coordinator 
> daemon.






[jira] [Commented] (IMPALA-9351) AnalyzeDDLTest.TestCreateTableLikeFileOrc failed due to non-existing path

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040344#comment-17040344
 ] 

Joe McDonnell commented on IMPALA-9351:
---

[~norbertluksa] One workaround is to use a different complex types table. The 
"complextypestbl" table has a complex schema, and we load our own file into it 
rather than write it with Hive. The file should always be in the same location. 
So, if you used the path 
"/test-warehouse/complextypestbl_orc_def/nullable.orc", I think it should work. 

[https://github.com/apache/impala/blob/master/testdata/datasets/functional/functional_schema_template.sql#L686-L709]

> AnalyzeDDLTest.TestCreateTableLikeFileOrc failed due to non-existing path
> -
>
> Key: IMPALA-9351
> URL: https://issues.apache.org/jira/browse/IMPALA-9351
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Fang-Yu Rao
>Assignee: Norbert Luksa
>Priority: Blocker
>  Labels: broken-build, flaky-test
>
> AnalyzeDDLTest.TestCreateTableLikeFileOrc failed due to a non-existing path. 
> Specifically, we see the following error message.
> {code:java}
> Error Message
> Error during analysis:
> org.apache.impala.common.AnalysisException: Cannot infer schema, path does 
> not exist: 
> hdfs://localhost:20500/test-warehouse/functional_orc_def.db/complextypes_fileformat/00_0
> sql:
> create table if not exists newtbl_DNE like orc 
> '/test-warehouse/functional_orc_def.db/complextypes_fileformat/00_0'
> {code}
> The stack trace is provided in the following.
> {code:java}
> Stacktrace
> java.lang.AssertionError: 
> Error during analysis:
> org.apache.impala.common.AnalysisException: Cannot infer schema, path does 
> not exist: 
> hdfs://localhost:20500/test-warehouse/functional_orc_def.db/complextypes_fileformat/00_0
> sql:
> create table if not exists newtbl_DNE like orc 
> '/test-warehouse/functional_orc_def.db/complextypes_fileformat/00_0'
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.impala.common.FrontendFixture.analyzeStmt(FrontendFixture.java:397)
>   at 
> org.apache.impala.common.FrontendTestBase.AnalyzesOk(FrontendTestBase.java:244)
>   at 
> org.apache.impala.common.FrontendTestBase.AnalyzesOk(FrontendTestBase.java:185)
>   at 
> org.apache.impala.analysis.AnalyzeDDLTest.TestCreateTableLikeFileOrc(AnalyzeDDLTest.java:2045)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:272)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:236)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:386)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:323)
>   at 
> 

[jira] [Commented] (IMPALA-8870) Bump guava version when building against Hive 3

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040330#comment-17040330
 ] 

Joe McDonnell commented on IMPALA-8870:
---

Changing Target Version to Impala 4.0, as this is not required for Impala 3.4.

> Bump guava version when building against Hive 3
> ---
>
> Key: IMPALA-8870
> URL: https://issues.apache.org/jira/browse/IMPALA-8870
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Fang-Yu Rao
>Priority: Blocker
>
> Guava is pinned to 14.0.1 
> https://github.com/apache/impala/blob/8094811/impala-parent/pom.xml#L59
> {code}
> <guava.version>14.0.1</guava.version>
> {code}
> I think this has likely changed in Hive 3 and we probably want to revisit 
> this.






[jira] [Updated] (IMPALA-8870) Bump guava version when building against Hive 3

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-8870:
--
Target Version: Impala 4.0  (was: Impala 3.4.0)

> Bump guava version when building against Hive 3
> ---
>
> Key: IMPALA-8870
> URL: https://issues.apache.org/jira/browse/IMPALA-8870
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Fang-Yu Rao
>Priority: Blocker
>
> Guava is pinned to 14.0.1 
> https://github.com/apache/impala/blob/8094811/impala-parent/pom.xml#L59
> {code}
> <guava.version>14.0.1</guava.version>
> {code}
> I think this has likely changed in Hive 3 and we probably want to revisit 
> this.






[jira] [Commented] (IMPALA-9386) IMPALA-9287 breaks catalog table loading

2020-02-19 Thread Vihang Karajgaonkar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040328#comment-17040328
 ] 

Vihang Karajgaonkar commented on IMPALA-9386:
-

Hi [~stigahuang], can you take a look at the patch above?

> IMPALA-9287 breaks catalog table loading
> 
>
> Key: IMPALA-9386
> URL: https://issues.apache.org/jira/browse/IMPALA-9386
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
>
> IMPALA-9287 brings in a hive-metastore dependency, which pulls in hive-serde, 
> which in turn brings in a different version of flatbuffers. Table loading may 
> fail depending on which flatbuffers version hive-metastore brings in. We 
> should exclude hive-serde from the dependency.
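> A sketch of that exclusion in Maven terms (coordinates assumed; the actual 
> group/artifact names in the Impala pom may differ):
> {code:xml}
> <dependency>
>   <groupId>org.apache.hive</groupId>
>   <artifactId>hive-metastore</artifactId>
>   <exclusions>
>     <exclusion>
>       <groupId>org.apache.hive</groupId>
>       <artifactId>hive-serde</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>
> {code}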






[jira] [Updated] (IMPALA-9355) TestExchangeMemUsage.test_exchange_mem_usage_scaling doesn't hit the memory limit

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-9355:
--
Labels: broken-build flaky  (was: broken-build flaky-test)

> TestExchangeMemUsage.test_exchange_mem_usage_scaling doesn't hit the memory 
> limit
> -
>
> Key: IMPALA-9355
> URL: https://issues.apache.org/jira/browse/IMPALA-9355
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Fang-Yu Rao
>Assignee: Joe McDonnell
>Priority: Critical
>  Labels: broken-build, flaky
>
> The EE test {{test_exchange_mem_usage_scaling}} failed because the query at 
> [https://github.com/apache/impala/blame/master/testdata/workloads/functional-query/queries/QueryTest/exchange-mem-scaling.test#L7-L15]
>  does not hit the memory limit (170m) specified at 
> [https://github.com/apache/impala/blame/master/testdata/workloads/functional-query/queries/QueryTest/exchange-mem-scaling.test#L7].
>  We may need to further reduce that limit. The error message is given below. 
> The same issue occurred earlier as 
> https://issues.apache.org/jira/browse/IMPALA-7873 and was resolved then.
> {code:java}
> FAIL 
> query_test/test_mem_usage_scaling.py::TestExchangeMemUsage::()::test_exchange_mem_usage_scaling[protocol:
>  beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none]
> === FAILURES 
> ===
>  TestExchangeMemUsage.test_exchange_mem_usage_scaling[protocol: beeswax | 
> exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none] 
> [gw3] linux2 -- Python 2.7.12 
> /home/ubuntu/Impala/bin/../infra/python/env/bin/python
> query_test/test_mem_usage_scaling.py:386: in test_exchange_mem_usage_scaling
> self.run_test_case('QueryTest/exchange-mem-scaling', vector)
> common/impala_test_suite.py:674: in run_test_case
> expected_str, query)
> E   AssertionError: Expected exception: Memory limit exceeded
> E   
> E   when running:
> E   
> E   set mem_limit=170m;
> E   set num_scanner_threads=1;
> E   select *
> E   from tpch_parquet.lineitem l1
> E join tpch_parquet.lineitem l2 on l1.l_orderkey = l2.l_orderkey and
> E l1.l_partkey = l2.l_partkey and l1.l_suppkey = l2.l_suppkey
> E and l1.l_linenumber = l2.l_linenumber
> E   order by l1.l_orderkey desc, l1.l_partkey, l1.l_suppkey, l1.l_linenumber
> E   limit 5
> {code}
> [~tarmstr...@cloudera.com] and [~joemcdonnell] reviewed the patch at 
> [https://gerrit.cloudera.org/c/11965/]. Assigning this JIRA to 
> [~joemcdonnell] for now; please re-assign it as appropriate. Thanks!
>  






[jira] [Updated] (IMPALA-7883) TestScannersFuzzing::test_fuzz_decimal_tbl generates crash in ParquetMetadataUtils::ValidateRowGroupColumn

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-7883:
--
Target Version: Impala 4.0  (was: Impala 3.4.0)

> TestScannersFuzzing::test_fuzz_decimal_tbl generates crash in 
> ParquetMetadataUtils::ValidateRowGroupColumn
> --
>
> Key: IMPALA-7883
> URL: https://issues.apache.org/jira/browse/IMPALA-7883
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Joe McDonnell
>Assignee: Csaba Ringhofer
>Priority: Critical
>  Labels: broken-build
>
> An exhaustive test run ran into a SIGSEGV when running 
> TestScannersFuzzing::test_fuzz_decimal_tbl(). It produces the following stack:
> {noformat}
> #0 0x7f0bda92f1f7 in raise () from /lib64/libc.so.6
> #1 0x7f0bda9308e8 in abort () from /lib64/libc.so.6
> #2 0x7f0bddb00185 in os::abort(bool) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #3 0x7f0bddca2593 in VMError::report_and_die() () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #4 0x7f0bddb0568f in JVM_handle_linux_signal () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #5 0x7f0bddafbbe3 in signalHandler(int, siginfo*, void*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #6 
> #7 0x02f0d254 in impala::ParquetMetadataUtils::ValidateRowGroupColumn 
> (file_metadata=..., filename=0x1dab5d5b8 
> "hdfs://localhost:20500/test-warehouse/test_fuzz_decimal_tbl_4a8e12be.db/decimal_tbl/d6=1/copy2_6a476efdb58955a1-fcfa2ac5_1388116616_data.0.parq",
>  row_group_idx=0, col_idx=3, schema_element=..., state=0x4fc1dee0) at 
> /data/jenkins/workspace/impala-cdh6.1.0-exhaustive/repos/Impala/be/src/exec/parquet-metadata-utils.cc:156
> #8 0x022ad8c3 in impala::BaseScalarColumnReader::Reset 
> (this=0x22877800, file_desc=..., col_chunk=..., row_group_idx=0) at 
> /data/jenkins/workspace/impala-cdh6.1.0-exhaustive/repos/Impala/be/src/exec/parquet-column-readers.cc:928
> #9 0x022357b0 in impala::HdfsParquetScanner::InitScalarColumns 
> (this=0x22608800) at 
> /data/jenkins/workspace/impala-cdh6.1.0-exhaustive/repos/Impala/be/src/exec/hdfs-parquet-scanner.cc:1501
> #10 0x0222d60f in impala::HdfsParquetScanner::NextRowGroup 
> (this=0x22608800) at 
> /data/jenkins/workspace/impala-cdh6.1.0-exhaustive/repos/Impala/be/src/exec/hdfs-parquet-scanner.cc:626
> #11 0x0222bd26 in impala::HdfsParquetScanner::GetNextInternal 
> (this=0x22608800, row_batch=0x505cbbc0) at 
> /data/jenkins/workspace/impala-cdh6.1.0-exhaustive/repos/Impala/be/src/exec/hdfs-parquet-scanner.cc:419
> #12 0x0222a249 in impala::HdfsParquetScanner::ProcessSplit 
> (this=0x22608800) at 
> /data/jenkins/workspace/impala-cdh6.1.0-exhaustive/repos/Impala/be/src/exec/hdfs-parquet-scanner.cc:336
> #13 0x021ae94a in impala::HdfsScanNode::ProcessSplit 
> (this=0x156b7000, filter_ctxs=..., expr_results_pool=0x7f0b22a61420, 
> scan_range=0x22f64140, scanner_thread_reservation=0x7f0b22a61378) at 
> /data/jenkins/workspace/impala-cdh6.1.0-exhaustive/repos/Impala/be/src/exec/hdfs-scan-node.cc:497
> #14 0x021adb97 in impala::HdfsScanNode::ScannerThread 
> (this=0x156b7000, first_thread=true, scanner_thread_reservation=40960) at 
> /data/jenkins/workspace/impala-cdh6.1.0-exhaustive/repos/Impala/be/src/exec/hdfs-scan-node.cc:402
> #15 0x021acf88 in impala::HdfsScanNodeoperator()(void) 
> const (__closure=0x7f0b22a61ba8) at 
> /data/jenkins/workspace/impala-cdh6.1.0-exhaustive/repos/Impala/be/src/exec/hdfs-scan-node.cc:323
> #16 0x021af426 in 
> boost::detail::function::void_function_obj_invoker0,
>  void>::invoke(boost::detail::function::function_buffer &) 
> (function_obj_ptr=...) at 
> /data/jenkins/workspace/impala-cdh6.1.0-exhaustive/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153
> #17 0x01cc7614 in boost::function0::operator() 
> (this=0x7f0b22a61ba0) at 
> /data/jenkins/workspace/impala-cdh6.1.0-exhaustive/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:767
> #18 0x020fe391 in impala::Thread::SuperviseThread(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) (name=..., category=..., 
> functor=..., parent_thread_info=0x7f0b2025c850, 
> thread_started=0x7f0b2025b220) at 
> /data/jenkins/workspace/impala-cdh6.1.0-exhaustive/repos/Impala/be/src/util/thread.cc:359{noformat}
> This reproduces fairly regularly (but not 100% consistently) on master when 
> running:
> {noformat}
> export SCANNER_FUZZ_SEED=1542888792
> tests/run-tests.py 

[jira] [Updated] (IMPALA-7083) AnalysisException for GROUP BY and ORDER BY expressions that are folded to constants from 2.9 onwards

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-7083:
--
Target Version: Impala 4.0  (was: Impala 3.4.0)

> AnalysisException for GROUP BY and ORDER BY expressions that are folded to 
> constants from 2.9 onwards
> -
>
> Key: IMPALA-7083
> URL: https://issues.apache.org/jira/browse/IMPALA-7083
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.9.0
>Reporter: Eric Lin
>Priority: Critical
>  Labels: regression
>
> To reproduce, please run the Impala query below:
> {code}
> DROP TABLE IF EXISTS test;
> CREATE TABLE test (a int);
> SELECT   ( 
> CASE 
>WHEN (1 =1) 
>THEN 1
>ELSE a
> end) AS b
> FROM  test 
> GROUP BY 1 
> ORDER BY ( 
> CASE 
>WHEN (1 =1) 
>THEN 1
>ELSE a
> end);
> {code}
> It will fail with the error below:
> {code}
> ERROR: AnalysisException: ORDER BY expression not produced by aggregation 
> output (missing from GROUP BY clause?): (CASE WHEN TRUE THEN 1 ELSE a END)
> {code}
> However, if I replace the column name "a" with a constant value, it works:
> {code}
> SELECT   ( 
> CASE 
>WHEN (1 =1) 
>THEN 1
>ELSE 2
> end) AS b
> FROM  test 
> GROUP BY 1 
> ORDER BY ( 
> CASE 
>WHEN (1 =1) 
>THEN 1
>ELSE 2
> end);
> {code}
> This issue is identified in CDH 5.12.x (Impala 2.9); there is no issue in CDH 
> 5.11.x (Impala 2.8).
> We know that it can be worked around by re-writing the query as below:
> {code}
> SELECT   ( 
> CASE 
>WHEN (1 =1) 
>THEN 1
>ELSE a
> end) AS b
> FROM  test 
> GROUP BY 1 
> ORDER BY 1;
> {code}






[jira] [Updated] (IMPALA-6692) When partition exchange is followed by sort each sort node becomes a synchronization point across the cluster

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-6692:
--
Target Version: Impala 4.0  (was: Impala 3.4.0)

> When partition exchange is followed by sort each sort node becomes a 
> synchronization point across the cluster
> -
>
> Key: IMPALA-6692
> URL: https://issues.apache.org/jira/browse/IMPALA-6692
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend, Distributed Exec
>Affects Versions: Impala 2.10.0
>Reporter: Mostafa Mokhtar
>Priority: Critical
>  Labels: perf, resource-management
> Attachments: Kudu table insert without KRPC no sort.txt, Kudu table 
> insert without KRPC.txt, kudu_partial_sort_insert_vd1129.foo.com_2.txt, 
> profile-spilling.txt
>
>
> Issue described in this JIRA applies to 
> * Analytical functions
> * Writes to Partitioned Parquet tables
> * Writes to Kudu tables
> When inserting into a Kudu table from Impala the plan is something like HDFS 
> SCAN -> Partition Exchange -> Partial Sort -> Kudu Insert.
> The query initially makes good progress then significantly slows down and 
> very few nodes make progress.
> While the insert is running the query goes through different phases 
> * Phase 1
> ** Scan is reading data fast, sending data through to exchange 
> ** Partial Sort keeps accumulating batches
> ** Network and CPU are busy; life appears to be OK
> * Phase 2
> ** One of the Sort operators reaches its memory limit and stops calling 
> ExchangeNode::GetNext for a while
> ** This creates back pressure against the DataStreamSenders
> ** The Partial Sort doesn't call GetNext until it has finished sorting GBs of 
> data (Partial sort memory is unbounded as of 03/16/2018)
> ** All exchange operators in the cluster eventually get blocked on that Sort 
> operator and can no longer make progress
> ** After a while the Sort is able to accept more batches which temporarily 
> unblocks execution across the cluster
> ** Another sort operator reaches its memory limit and this loop repeats itself
> Below are stacks from one of the blocked hosts
> _Sort node waiting on data from exchange node as it didn't start sorting 
> since the memory limit for the sort wasn't reached_
> {code}
> Thread 90 (Thread 0x7f8d7d233700 (LWP 21625)):
> #0  0x003a6f00b68c in pthread_cond_wait@@GLIBC_2.3.2 () from 
> /lib64/libpthread.so.0
> #1  0x7fab1422174c in 
> std::condition_variable::wait(std::unique_lock&) () from 
> /opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.205/lib/impala/lib/libstdc++.so.6
> #2  0x00b4d5aa in void 
> std::_V2::condition_variable_any::wait 
> >(boost::unique_lock&) ()
> #3  0x00b4ab6a in 
> impala::KrpcDataStreamRecvr::SenderQueue::GetBatch(impala::RowBatch**) ()
> #4  0x00b4b0c8 in 
> impala::KrpcDataStreamRecvr::GetBatch(impala::RowBatch**) ()
> #5  0x00dca7c5 in 
> impala::ExchangeNode::FillInputRowBatch(impala::RuntimeState*) ()
> #6  0x00dcacae in 
> impala::ExchangeNode::GetNext(impala::RuntimeState*, impala::RowBatch*, 
> bool*) ()
> #7  0x01032ac3 in 
> impala::PartialSortNode::GetNext(impala::RuntimeState*, impala::RowBatch*, 
> bool*) ()
> #8  0x00ba9c92 in impala::FragmentInstanceState::ExecInternal() ()
> #9  0x00bac7df in impala::FragmentInstanceState::Exec() ()
> #10 0x00b9ab1a in 
> impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) ()
> #11 0x00d5da9f in 
> impala::Thread::SuperviseThread(std::basic_string std::char_traits, std::allocator > const&, 
> std::basic_string, std::allocator > 
> const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) ()
> #12 0x00d5e29a in boost::detail::thread_data void (*)(std::basic_string, std::allocator 
> > const&, std::basic_string, 
> std::allocator > const&, boost::function, 
> impala::ThreadDebugInfo const*, impala::Promise*), 
> boost::_bi::list5 std::char_traits, std::allocator > >, 
> boost::_bi::value, 
> std::allocator > >, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> > > >::run() ()
> #13 0x012d70ba in thread_proxy ()
> #14 0x003a6f007aa1 in start_thread () from /lib64/libpthread.so.0
> #15 0x003a6ece893d in clone () from /lib64/libc.so.6
> {code}
> _DataStreamSender blocked due to back pressure from the DataStreamRecvr on 
> the node which has a Sort that is spilling_
> {code}
> Thread 89 (Thread 0x7fa8f6a15700 (LWP 21626)):
> #0  0x003a6f00ba5e in pthread_cond_timedwait@@GLIBC_2.3.2 () from 
> /lib64/libpthread.so.0
> #1  0x01237e77 in 
> impala::KrpcDataStreamSender::Channel::WaitForRpc(std::unique_lock*)
>  ()
> #2  0x01238b8d in 
> 

[jira] [Commented] (IMPALA-6692) When partition exchange is followed by sort each sort node becomes a synchronization point across the cluster

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040322#comment-17040322
 ] 

Joe McDonnell commented on IMPALA-6692:
---

Setting target version to Impala 4.0, because this is not a blocker for Impala 
3.4.

> When partition exchange is followed by sort each sort node becomes a 
> synchronization point across the cluster
> -
>
> Key: IMPALA-6692
> URL: https://issues.apache.org/jira/browse/IMPALA-6692
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend, Distributed Exec
>Affects Versions: Impala 2.10.0
>Reporter: Mostafa Mokhtar
>Priority: Critical
>  Labels: perf, resource-management
> Attachments: Kudu table insert without KRPC no sort.txt, Kudu table 
> insert without KRPC.txt, kudu_partial_sort_insert_vd1129.foo.com_2.txt, 
> profile-spilling.txt
>
>
> Issue described in this JIRA applies to 
> * Analytical functions
> * Writes to Partitioned Parquet tables
> * Writes to Kudu tables
> When inserting into a Kudu table from Impala the plan is something like HDFS 
> SCAN -> Partition Exchange -> Partial Sort -> Kudu Insert.
> The query initially makes good progress then significantly slows down and 
> very few nodes make progress.
> While the insert is running the query goes through different phases 
> * Phase 1
> ** Scan is reading data fast, sending data through to exchange 
> ** Partial Sort keeps accumulating batches
> ** Network and CPU are busy; life appears to be OK
> * Phase 2
> ** One of the Sort operators reaches its memory limit and stops calling 
> ExchangeNode::GetNext for a while
> ** This creates back pressure against the DataStreamSenders
> ** The Partial Sort doesn't call GetNext until it has finished sorting GBs of 
> data (Partial sort memory is unbounded as of 03/16/2018)
> ** All exchange operators in the cluster eventually get blocked on that Sort 
> operator and can no longer make progress
> ** After a while the Sort is able to accept more batches which temporarily 
> unblocks execution across the cluster
> ** Another sort operator reaches its memory limit and this loop repeats itself
> Below are stacks from one of the blocked hosts
> _Sort node waiting on data from exchange node as it didn't start sorting 
> since the memory limit for the sort wasn't reached_
> {code}
> Thread 90 (Thread 0x7f8d7d233700 (LWP 21625)):
> #0  0x003a6f00b68c in pthread_cond_wait@@GLIBC_2.3.2 () from 
> /lib64/libpthread.so.0
> #1  0x7fab1422174c in 
> std::condition_variable::wait(std::unique_lock&) () from 
> /opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.205/lib/impala/lib/libstdc++.so.6
> #2  0x00b4d5aa in void 
> std::_V2::condition_variable_any::wait 
> >(boost::unique_lock&) ()
> #3  0x00b4ab6a in 
> impala::KrpcDataStreamRecvr::SenderQueue::GetBatch(impala::RowBatch**) ()
> #4  0x00b4b0c8 in 
> impala::KrpcDataStreamRecvr::GetBatch(impala::RowBatch**) ()
> #5  0x00dca7c5 in 
> impala::ExchangeNode::FillInputRowBatch(impala::RuntimeState*) ()
> #6  0x00dcacae in 
> impala::ExchangeNode::GetNext(impala::RuntimeState*, impala::RowBatch*, 
> bool*) ()
> #7  0x01032ac3 in 
> impala::PartialSortNode::GetNext(impala::RuntimeState*, impala::RowBatch*, 
> bool*) ()
> #8  0x00ba9c92 in impala::FragmentInstanceState::ExecInternal() ()
> #9  0x00bac7df in impala::FragmentInstanceState::Exec() ()
> #10 0x00b9ab1a in 
> impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) ()
> #11 0x00d5da9f in 
> impala::Thread::SuperviseThread(std::basic_string std::char_traits, std::allocator > const&, 
> std::basic_string, std::allocator > 
> const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) ()
> #12 0x00d5e29a in boost::detail::thread_data void (*)(std::basic_string, std::allocator 
> > const&, std::basic_string, 
> std::allocator > const&, boost::function, 
> impala::ThreadDebugInfo const*, impala::Promise*), 
> boost::_bi::list5 std::char_traits, std::allocator > >, 
> boost::_bi::value, 
> std::allocator > >, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> > > >::run() ()
> #13 0x012d70ba in thread_proxy ()
> #14 0x003a6f007aa1 in start_thread () from /lib64/libpthread.so.0
> #15 0x003a6ece893d in clone () from /lib64/libc.so.6
> {code}
> _DataStreamSender blocked due to back pressure from the DataStreamRecvr on 
> the node which has a Sort that is spilling_
> {code}
> Thread 89 (Thread 0x7fa8f6a15700 (LWP 21626)):
> #0  0x003a6f00ba5e in pthread_cond_timedwait@@GLIBC_2.3.2 () from 
> /lib64/libpthread.so.0
> #1  0x01237e77 in 
> impala::KrpcDataStreamSender::Channel::WaitForRpc(std::unique_lock*)

[jira] [Commented] (IMPALA-8908) Bad error message when failing to connect to HTTPS endpoint with shell

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040318#comment-17040318
 ] 

Joe McDonnell commented on IMPALA-8908:
---

Bumping Target Version to reflect that it is not blocking Impala 3.4.

> Bad error message when failing to connect to HTTPS endpoint with shell
> --
>
> Key: IMPALA-8908
> URL: https://issues.apache.org/jira/browse/IMPALA-8908
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Tamas Mate
>Priority: Critical
>  Labels: observability, ramp-up
>
> Legitimate connection errors get masked with an UnboundLocalError. It looks 
> like THRIFT-3634 fixed this.
> {noformat}
> $ impala-shell.sh -i ip-10-97-80-186.cloudera.site --protocol=hs2 --ldap 
> --user csso_tarmstrong --ssl
> Starting Impala Shell using LDAP-based authentication
> SSL is enabled. Impala server certificates will NOT be verified (set 
> --ca_cert to change)
> LDAP password for csso_tarmstrong:
> Traceback (most recent call last):
>   File "/home/tarmstrong/Impala/incubator-impala/shell/impala_shell.py", line 
> 1880, in 
> impala_shell_main()
>   File "/home/tarmstrong/Impala/incubator-impala/shell/impala_shell.py", line 
> 1841, in impala_shell_main
> with ImpalaShell(options, query_options) as shell:
>   File "/home/tarmstrong/Impala/incubator-impala/shell/impala_shell.py", line 
> 243, in __init__
> self.do_connect(options.impalad)
>   File "/home/tarmstrong/Impala/incubator-impala/shell/impala_shell.py", line 
> 812, in do_connect
> self._connect()
>   File "/home/tarmstrong/Impala/incubator-impala/shell/impala_shell.py", line 
> 860, in _connect
> self.server_version, self.webserver_address = self.imp_client.connect()
>   File "/home/tarmstrong/Impala/incubator-impala/shell/impala_client.py", 
> line 176, in connect
> self.transport = self._get_transport(self.client_connect_timeout_ms)
>   File "/home/tarmstrong/Impala/incubator-impala/shell/impala_client.py", 
> line 472, in _get_transport
> transport.open()
>   File "/home/tarmstrong/Impala/incubator-impala/shell/thrift_sasl.py", line 
> 61, in open
> self._trans.open()
>   File 
> "/opt/Impala-Toolchain/thrift-0.9.3-p7/python/lib/python2.7/site-packages/thrift/transport/TSSLSocket.py",
>  line 258, in open
> logger.error('Error while connecting with %s.', ip_port, exc_info=True)
> UnboundLocalError: local variable 'ip_port' referenced before assignment
> {noformat}
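A minimal sketch of this failure pattern, simplified from what THRIFT-3634 describes (function and variable names here are invented, not the real TSSLSocket code):

```python
import logging

logger = logging.getLogger(__name__)

def open_connection(resolve, connect):
    """Sketch of the bug pattern: ip_port is only bound inside the loop."""
    try:
        for ip_port in resolve():
            connect(ip_port)
            return
    except Exception:
        # If resolve() raised before the first iteration, ip_port was never
        # assigned, so this logging call itself raises UnboundLocalError and
        # masks the real connection error.
        logger.error('Error while connecting with %s.', ip_port, exc_info=True)

def failing_resolve():
    raise OSError("getaddrinfo failed")  # simulated resolution failure

masked = False
try:
    open_connection(failing_resolve, lambda ip_port: None)
except UnboundLocalError:
    masked = True  # the root-cause OSError never surfaces
```

This is why the shell reports `UnboundLocalError: local variable 'ip_port' referenced before assignment` instead of the underlying connection failure.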






[jira] [Assigned] (IMPALA-9313) TSAN data race in TmpFileMgr::File::Blacklist

2020-02-19 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned IMPALA-9313:


Assignee: Sahil Takiar

> TSAN data race in TmpFileMgr::File::Blacklist
> -
>
> Key: IMPALA-9313
> URL: https://issues.apache.org/jira/browse/IMPALA-9313
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> Happens occasionally in buffer-pool-test
> {code:java}
> WARNING: ThreadSanitizer: data race (pid=100106)
>   Write of size 1 at 0x7b0c122d5330 by thread T173:
> #0 impala::TmpFileMgr::File::Blacklist(impala::ErrorMsg const&) 
> /home/systest/Impala/be/src/runtime/tmp-file-mgr.cc:276:16 
> (buffer-pool-test+0x1dbea20)
> #1 
> impala::TmpFileMgr::FileGroup::RecoverWriteError(impala::TmpFileMgr::WriteHandle*,
>  impala::Status const&) 
> /home/systest/Impala/be/src/runtime/tmp-file-mgr.cc:543:18 
> (buffer-pool-test+0x1dc2117)
> #2 
> impala::TmpFileMgr::FileGroup::WriteComplete(impala::TmpFileMgr::WriteHandle*,
>  impala::Status const&) 
> /home/systest/Impala/be/src/runtime/tmp-file-mgr.cc:520:14 
> (buffer-pool-test+0x1dc1fb6)
> #3 impala::TmpFileMgr::FileGroup::Write(impala::MemRange, 
> std::function, 
> std::unique_ptr std::default_delete 
> >*)::$_0::operator()(impala::Status const&) const /home/syste
> st/Impala/be/src/runtime/tmp-file-mgr.cc:422:37 (buffer-pool-test+0x1dc353b)
> #4 std::_Function_handler impala::TmpFileMgr::FileGroup::Write(impala::MemRange, std::function (impala::Status const&)>, std::unique_ptr std::default_delete >*)::$_0
> >::_M_invoke(std::_Any_data const&, impala::Status const&) 
> >/home/systest/Impala/toolchain/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2/../../../../include/c++/4.9.2/functional:2039:2
> > (buffer-pool-test+0x1dc3343)
> #5 std::function::operator()(impala::Status 
> const&) const 
> /home/systest/Impala/toolchain/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2/../../../../include/c++/4.9.2/functional:2439:14
>  (buffer-pool-test+0x1dc5972)
> #6 impala::io::RequestContext::WriteDone(impala::io::WriteRange*, 
> impala::Status const&) 
> /home/systest/Impala/be/src/runtime/io/request-context.cc:229:3 
> (buffer-pool-test+0x26b90a3)
> #7 impala::io::DiskIoMgr::Write(impala::io::RequestContext*, 
> impala::io::WriteRange*) 
> /home/systest/Impala/be/src/runtime/io/disk-io-mgr.cc:516:19 
> (buffer-pool-test+0x26a9dc7)
> #8 impala::io::DiskQueue::DiskThreadLoop(impala::io::DiskIoMgr*) 
> /home/systest/Impala/be/src/runtime/io/disk-io-mgr.cc:497:15 
> (buffer-pool-test+0x26a8c16)
> #9 boost::_mfi::mf1 impala::io::DiskIoMgr*>::operator()(impala::io::DiskQueue*, 
> impala::io::DiskIoMgr*) const 
> /home/systest/Impala/toolchain/boost-1.57.0-p3/include/boost/bind/mem_fn_template.hpp:165:16
>  (buffer-pool-test+0x26b3a7d)
> #10 void boost::_bi::list2, 
> boost::_bi::value 
> >::operator() impala::io::DiskIoMgr*>, boost::_bi::list0>(boost::_bi::type, 
> boost::_mfi::mf1 ala::io::DiskQueue, impala::io::DiskIoMgr*>&, boost::_bi::list0&, int) 
> /home/systest/Impala/toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:313:9
>  (buffer-pool-test+0x26b39bd)
> #11 boost::_bi::bind_t impala::io::DiskQueue, impala::io::DiskIoMgr*>, 
> boost::_bi::list2, 
> boost::_bi::value > >::operator()() 
> /home/systest/Impala/toolchain/boost-1.57.0-p3/include/boost/bind/bind_template.hpp:20:16 (buffer-pool-test+0x26b3923)
> #12 
> boost::detail::function::void_function_obj_invoker0 boost::_mfi::mf1, 
> boost::_bi::list2, 
> boost::_bi::value > >, void>:
> :invoke(boost::detail::function::function_buffer&) 
> /home/systest/Impala/toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153:11
>  (buffer-pool-test+0x26b36c1)
> #13 boost::function0::operator()() const 
> /home/systest/Impala/toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:766:14
>  (buffer-pool-test+0x1c22681)
> #14 impala::Thread::SuperviseThread(std::string const&, std::string 
> const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) 
> /home/systest/Impala/be/src/util/thread.cc:360:3 (buffer-pool-test+0x2176a06)
> #15 void boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> 
> >::operator()  (*)(std::string const&, std::string const&, boost::function, 
> impala::ThreadDebugInfo const*, impala::Promise (impala::PromiseMode)0>*), boost::_bi::list0>(boost::_bi::type, void 
> (*&)(std::string const&, std::string const&, boost::function,
>  impala::ThreadDebugInfo const*, impala::Promise (impala::PromiseMode)0>*), boost::_bi::list0&, int) 
> 

[jira] [Updated] (IMPALA-8908) Bad error message when failing to connect to HTTPS endpoint with shell

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-8908:
--
Target Version: Impala 4.0  (was: Impala 3.4.0)

> Bad error message when failing to connect to HTTPS endpoint with shell
> --
>
> Key: IMPALA-8908
> URL: https://issues.apache.org/jira/browse/IMPALA-8908
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Tamas Mate
>Priority: Critical
>  Labels: observability, ramp-up
>
> Legitimate connection errors get masked with an UnboundLocalError. It looks 
> like THRIFT-3634 fixed this.
> {noformat}
> $ impala-shell.sh -i ip-10-97-80-186.cloudera.site --protocol=hs2 --ldap 
> --user csso_tarmstrong --ssl
> Starting Impala Shell using LDAP-based authentication
> SSL is enabled. Impala server certificates will NOT be verified (set 
> --ca_cert to change)
> LDAP password for csso_tarmstrong:
> Traceback (most recent call last):
>   File "/home/tarmstrong/Impala/incubator-impala/shell/impala_shell.py", line 
> 1880, in 
> impala_shell_main()
>   File "/home/tarmstrong/Impala/incubator-impala/shell/impala_shell.py", line 
> 1841, in impala_shell_main
> with ImpalaShell(options, query_options) as shell:
>   File "/home/tarmstrong/Impala/incubator-impala/shell/impala_shell.py", line 
> 243, in __init__
> self.do_connect(options.impalad)
>   File "/home/tarmstrong/Impala/incubator-impala/shell/impala_shell.py", line 
> 812, in do_connect
> self._connect()
>   File "/home/tarmstrong/Impala/incubator-impala/shell/impala_shell.py", line 
> 860, in _connect
> self.server_version, self.webserver_address = self.imp_client.connect()
>   File "/home/tarmstrong/Impala/incubator-impala/shell/impala_client.py", 
> line 176, in connect
> self.transport = self._get_transport(self.client_connect_timeout_ms)
>   File "/home/tarmstrong/Impala/incubator-impala/shell/impala_client.py", 
> line 472, in _get_transport
> transport.open()
>   File "/home/tarmstrong/Impala/incubator-impala/shell/thrift_sasl.py", line 
> 61, in open
> self._trans.open()
>   File 
> "/opt/Impala-Toolchain/thrift-0.9.3-p7/python/lib/python2.7/site-packages/thrift/transport/TSSLSocket.py",
>  line 258, in open
> logger.error('Error while connecting with %s.', ip_port, exc_info=True)
> UnboundLocalError: local variable 'ip_port' referenced before assignment
> {noformat}
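The root cause is a common Python pitfall: a local variable is referenced in an exception handler before it was ever assigned, so the handler itself raises UnboundLocalError and masks the real connection error. A minimal sketch of the failure pattern and the fix follows; `resolve` is a hypothetical stand-in for the name-resolution call, not Thrift's actual code.

```python
import logging
import socket

logging.basicConfig()
logger = logging.getLogger("sketch")

def resolve(host):
    # Hypothetical stand-in for socket.getaddrinfo(): fails for any host here.
    raise socket.gaierror("name resolution failed")

def open_buggy(host):
    try:
        ip_port = resolve(host)   # raises before ip_port is ever bound
        return ip_port
    except socket.gaierror:
        # The handler references ip_port, which was never assigned, so the
        # real error is masked by an UnboundLocalError.
        logger.error('Error while connecting with %s.', ip_port)

def open_fixed(host):
    ip_port = None                # bind the name up front, so the handler is safe
    try:
        ip_port = resolve(host)
        return ip_port
    except socket.gaierror:
        logger.error('Error while connecting with %s.', ip_port)
```

With the fix, the handler logs the genuine connection failure instead of crashing, which is what the shell needs in order to print a useful error message.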



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7471) Impala crashes or returns incorrect results when querying parquet nested types

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-7471:
--
Target Version: Impala 4.0  (was: Impala 3.4.0)

> Impala crashes or returns incorrect results when querying parquet nested types
> --
>
> Key: IMPALA-7471
> URL: https://issues.apache.org/jira/browse/IMPALA-7471
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Csaba Ringhofer
>Priority: Critical
>  Labels: correctness, crash, parquet
> Attachments: test_users_131786401297925138_0.parquet
>
>
> From 
> http://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Impala-bug-with-nested-arrays-of-structures-where-some-of/m-p/78507/highlight/false#M4779
> {quote}We found a case where Impala returns incorrect values from a simple 
> query. Our data contains a nested array of structures, and the structures 
> contain other structures.
> We generated minimal sample data that allows reproducing the issue.
>  
> SQL to create a table:
> {quote}
> {code}
> CREATE TABLE plat_test.test_users (
>   id INT,
>   name STRING,   
>   devices ARRAY<
> STRUCT<
>   id:STRING,
>   device_info:STRUCT<
> model:STRING
>   >
> >
>   >
> )
> STORED AS PARQUET
> {code}
> {quote}
> Please put the attached parquet file in the table's location and refresh the 
> table.
> In the sample data we have 2 users, one with 2 devices and the second with 3. 
> Some of the devices.device_info.model fields are NULL.
>  
> When I issue a query:
> {quote}
> {code}
> SELECT u.name, d.device_info.model as model
> FROM test_users u,
> u.devices d;
> {code}
>  {quote}
> I'm expecting to get 5 records in the results, but I'm getting only one (see 1.png).
> If I change query to:
>  {quote}
> {code}
> SELECT u.name, d.device_info.model as model
> FROM test_users u
> LEFT OUTER JOIN u.devices d;
>  {code}
> {quote}
> I'm getting two records in the results, but that is still not what it should be.
> We found some workaround to this problem. If we add to the result columns 
> device.id we will get all records from parquet file:
> {quote}
> {code}
> SELECT u.name, d.id, d.device_info.model as model
> FROM test_users u
> , u.devices d
>  {code}
> {quote}
> And the result is shown in 3.png.
>  
> But we can't rely on this workaround, because we don't need device.id in all 
> queries; Impala optimizes it away, and as a result we get unpredictable 
> results.
>  
> I tested Hive query on this table and it returns expected results:
> {quote}
> {code}
> SELECT u.name, d.device_info.model
> FROM test_users u
> lateral view outer inline (u.devices) d;
>  {code}
> {quote}
> The results are shown in 4.png.
> Please advise whether this is a problem in the Impala engine or a mistake in 
> our query.
>  
> Best regards,
> Come2Play team.
> {quote}
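The expected semantics of the flattening join can be sketched in plain Python: with 2 users holding 2 and 3 devices respectively, `FROM test_users u, u.devices d` should produce 5 rows, including those where the model is NULL. The user names and model values below are made up to mirror the description, not taken from the attached parquet file.

```python
# Hypothetical in-memory version of the sample data: 2 users,
# one with 2 devices and one with 3; some models are None (NULL).
users = [
    {"name": "alice", "devices": [
        {"id": "d1", "device_info": {"model": None}},
        {"id": "d2", "device_info": {"model": "m2"}},
    ]},
    {"name": "bob", "devices": [
        {"id": "d3", "device_info": {"model": "m3"}},
        {"id": "d4", "device_info": {"model": None}},
        {"id": "d5", "device_info": {"model": "m5"}},
    ]},
]

# Equivalent of: SELECT u.name, d.device_info.model FROM test_users u, u.devices d
rows = [(u["name"], d["device_info"]["model"])
        for u in users for d in u["devices"]]
print(len(rows))  # 5 -- one row per (user, device) pair, NULL models preserved
```

Any result with fewer than one row per (user, device) pair, as reported above, indicates rows are being dropped during the unnest.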






[jira] [Updated] (IMPALA-7371) TestInsertQueries.test_insert fails on S3 with 0 rows returned

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-7371:
--
Target Version: Impala 4.0  (was: Impala 3.4.0)

> TestInsertQueries.test_insert fails on S3 with 0 rows returned
> --
>
> Key: IMPALA-7371
> URL: https://issues.apache.org/jira/browse/IMPALA-7371
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.0
>Reporter: David Knupp
>Assignee: Bharath Vissapragada
>Priority: Critical
> Attachments: catalogd_excerpt.INFO, impalad_excerpt.INFO, 
> profile.txt, profile_excerpt.log
>
>
> Stacktrace
> {noformat}
> query_test/test_insert.py:118: in test_insert
> multiple_impalad=vector.get_value('exec_option')['sync_ddl'] == 1)
> /data/jenkins/workspace/impala-cdh6.0.x-core-s3/repos/Impala/tests/common/impala_test_suite.py:426:
>  in run_test_case
> self.__verify_results_and_errors(vector, test_section, result, use_db)
> /data/jenkins/workspace/impala-cdh6.0.x-core-s3/repos/Impala/tests/common/impala_test_suite.py:299:
>  in __verify_results_and_errors
> replace_filenames_with_placeholder)
> /data/jenkins/workspace/impala-cdh6.0.x-core-s3/repos/Impala/tests/common/test_result_verifier.py:434:
>  in verify_raw_results
> VERIFIER_MAP[verifier](expected, actual)
> /data/jenkins/workspace/impala-cdh6.0.x-core-s3/repos/Impala/tests/common/test_result_verifier.py:261:
>  in verify_query_result_is_equal
> assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E 75,false,0,0,0,0,0,0,'04/01/09','0' != None
> E 76,true,1,1,1,10,1.10023841858,10.1,'04/01/09','1' != None
> E 77,false,2,2,2,20,2.20047683716,20.2,'04/01/09','2' != None
> E 78,true,3,3,3,30,3.29952316284,30.3,'04/01/09','3' != None
> E 79,false,4,4,4,40,4.40095367432,40.4,'04/01/09','4' != None
> E 80,true,5,5,5,50,5.5,50.5,'04/01/09','5' != None
> E 81,false,6,6,6,60,6.59904632568,60.6,'04/01/09','6' != None
> E 82,true,7,7,7,70,7.69809265137,70.7,'04/01/09','7' != None
> E 83,false,8,8,8,80,8.80190734863,80.8,'04/01/09','8' != None
> E 84,true,9,9,9,90,9.89618530273,90.91,'04/01/09','9' != 
> None
> E 85,false,0,0,0,0,0,0,'04/02/09','0' != None
> E 86,true,1,1,1,10,1.10023841858,10.1,'04/02/09','1' != None
> E 87,false,2,2,2,20,2.20047683716,20.2,'04/02/09','2' != None
> E 88,true,3,3,3,30,3.29952316284,30.3,'04/02/09','3' != None
> E 89,false,4,4,4,40,4.40095367432,40.4,'04/02/09','4' != None
> E 90,true,5,5,5,50,5.5,50.5,'04/02/09','5' != None
> E 91,false,6,6,6,60,6.59904632568,60.6,'04/02/09','6' != None
> E 92,true,7,7,7,70,7.69809265137,70.7,'04/02/09','7' != None
> E 93,false,8,8,8,80,8.80190734863,80.8,'04/02/09','8' != None
> E 94,true,9,9,9,90,9.89618530273,90.91,'04/02/09','9' != 
> None
> E 95,false,0,0,0,0,0,0,'04/03/09','0' != None
> E 96,true,1,1,1,10,1.10023841858,10.1,'04/03/09','1' != None
> E 97,false,2,2,2,20,2.20047683716,20.2,'04/03/09','2' != None
> E 98,true,3,3,3,30,3.29952316284,30.3,'04/03/09','3' != None
> E 99,false,4,4,4,40,4.40095367432,40.4,'04/03/09','4' != None
> E Number of rows returned (expected vs actual): 25 != 0
> {noformat}






[jira] [Updated] (IMPALA-6890) split-hbase.sh: Can't get master address from ZooKeeper; znode data == null

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-6890:
--
Target Version: Impala 4.0  (was: Impala 2.13.0, Impala 3.4.0)

> split-hbase.sh: Can't get master address from ZooKeeper; znode data == null
> ---
>
> Key: IMPALA-6890
> URL: https://issues.apache.org/jira/browse/IMPALA-6890
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.12.0
>Reporter: Vuk Ercegovac
>Assignee: Joe McDonnell
>Priority: Critical
>
> {noformat}
> 20:57:13 FAILED (Took: 7 min 58 sec)
> 20:57:13 
> '/data/jenkins/workspace/impala-cdh5-2.12.0_5.15.0-exhaustive-thrift/repos/Impala/testdata/bin/split-hbase.sh'
>  failed. Tail of log:
> 20:57:13 Wed Apr 18 20:49:43 PDT 2018, 
> RpcRetryingCaller{globalStartTime=1524109783051, pause=100, retries=31}, 
> org.apache.hadoop.hbase.MasterNotRunningException: java.io.IOException: Can't 
> get master address from ZooKeeper; znode data == null
> 20:57:13 Wed Apr 18 20:49:43 PDT 2018, 
> RpcRetryingCaller{globalStartTime=1524109783051, pause=100, retries=31}, 
> org.apache.hadoop.hbase.MasterNotRunningException: java.io.IOException: Can't 
> get master address from ZooKeeper; znode data == null
> 20:57:13 Wed Apr 18 20:49:44 PDT 2018, 
> RpcRetryingCaller{globalStartTime=1524109783051, pause=100, retries=31}, 
> org.apache.hadoop.hbase.MasterNotRunningException: java.io.IOException: Can't 
> get master address from ZooKeeper; znode data == null
> ...
> 20:57:13 Wed Apr 18 20:57:13 PDT 2018, 
> RpcRetryingCaller{globalStartTime=1524109783051, pause=100, retries=31}, 
> org.apache.hadoop.hbase.MasterNotRunningException: java.io.IOException: Can't 
> get master address from ZooKeeper; znode data == null
> 20:57:13 
> 20:57:13  at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:157)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4329)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4321)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.HBaseAdmin.getClusterStatus(HBaseAdmin.java:2952)
> 20:57:13  at 
> org.apache.impala.datagenerator.HBaseTestDataRegionAssigment.(HBaseTestDataRegionAssigment.java:74)
> 20:57:13  at 
> org.apache.impala.datagenerator.HBaseTestDataRegionAssigment.main(HBaseTestDataRegionAssigment.java:310)
> 20:57:13 Caused by: org.apache.hadoop.hbase.MasterNotRunningException: 
> java.io.IOException: Can't get master address from ZooKeeper; znode data == 
> null
> 20:57:13  at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1698)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1718)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1875)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134)
> 20:57:13  ... 5 more
> 20:57:13 Caused by: java.io.IOException: Can't get master address from 
> ZooKeeper; znode data == null
> 20:57:13  at 
> org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:154)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1648)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1689)
> 20:57:13  ... 9 more
> 20:57:13 Error in 
> /data/jenkins/workspace/impala-cdh5-2.12.0_5.15.0-exhaustive-thrift/repos/Impala/testdata/bin/split-hbase.sh
>  at line 41: "$JAVA" ${JAVA_KERBEROS_MAGIC} \
> 20:57:13 Error in 
> /data/jenkins/workspace/impala-cdh5-2.12.0_5.15.0-exhaustive-thrift/repos/Impala/bin/run-all-tests.sh
>  at line 48: # Run End-to-end Tests{noformat}
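The retry parameters in the log (pause=100, retries=31) explain the roughly eight minutes of wall time before the failure surfaces. A minimal sketch of the bounded-retry pattern follows; it is simplified, not HBase's actual RpcRetryingCaller, which also backs off between attempts rather than using a fixed pause.

```python
import time

def call_with_retries(fn, retries=31, pause=0.1):
    """Retry pattern similar in spirit to HBase's RpcRetryingCaller:
    a bounded number of attempts with a pause between them."""
    last_err = None
    for _ in range(retries):
        try:
            return fn()
        except IOError as e:
            last_err = e           # remember the failure and try again
            time.sleep(pause)
    raise last_err                 # all attempts exhausted

# Usage sketch: a master lookup that always fails exhausts every retry,
# matching the repeated "znode data == null" lines in the log.
calls = []
def get_master_address():
    calls.append(1)
    raise IOError("Can't get master address from ZooKeeper; znode data == null")
```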






[jira] [Commented] (IMPALA-6890) split-hbase.sh: Can't get master address from ZooKeeper; znode data == null

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-6890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040302#comment-17040302
 ] 

Joe McDonnell commented on IMPALA-6890:
---

Changing target version to Impala 4, as this is an intermittent test-only issue.







[jira] [Updated] (IMPALA-6294) Concurrent hung with lots of spilling make slow progress due to blocking in DataStreamRecvr and DataStreamSender

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-6294:
--
Target Version: Impala 4.0  (was: Impala 3.4.0)

> Concurrent hung with lots of spilling make slow progress due to blocking in 
> DataStreamRecvr and DataStreamSender
> 
>
> Key: IMPALA-6294
> URL: https://issues.apache.org/jira/browse/IMPALA-6294
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Mostafa Mokhtar
>Assignee: Michael Ho
>Priority: Critical
> Attachments: IMPALA-6285 TPCDS Q3 slow broadcast, 
> slow_broadcast_q3_reciever.txt, slow_broadcast_q3_sender.txt
>
>
> While running a highly concurrent spilling workload on a large cluster, 
> queries start running slower; even lightweight queries are affected by this 
> slowdown.
> {code}
>   EXCHANGE_NODE (id=9):(Total: 3m1s, non-child: 3m1s, % non-child: 
> 100.00%)
>  - ConvertRowBatchTime: 999.990us
>  - PeakMemoryUsage: 0
>  - RowsReturned: 108.00K (108001)
>  - RowsReturnedRate: 593.00 /sec
> DataStreamReceiver:
>   BytesReceived(4s000ms): 254.47 KB, 338.82 KB, 338.82 KB, 852.43 
> KB, 1.32 MB, 1.33 MB, 1.50 MB, 2.53 MB, 2.99 MB, 3.00 MB, 3.00 MB, 3.00 MB, 
> 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.16 MB, 3.49 MB, 3.80 
> MB, 4.15 MB, 4.55 MB, 4.84 MB, 4.99 MB, 5.07 MB, 5.41 MB, 5.75 MB, 5.92 MB, 
> 6.00 MB, 6.00 MB, 6.00 MB, 6.07 MB, 6.28 MB, 6.33 MB, 6.43 MB, 6.67 MB, 6.91 
> MB, 7.29 MB, 8.03 MB, 9.12 MB, 9.68 MB, 9.90 MB, 9.97 MB, 10.44 MB, 11.25 MB
>- BytesReceived: 11.73 MB (12301692)
>- DeserializeRowBatchTimer: 957.990ms
>- FirstBatchArrivalWaitTime: 0.000ns
>- PeakMemoryUsage: 644.44 KB (659904)
>- SendersBlockedTimer: 0.000ns
>- SendersBlockedTotalTimer(*): 0.000ns
> {code}
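As a sanity check on the profile above, the RowsReturnedRate can be re-derived from the other counters. The small discrepancy versus the reported 593.00/sec is expected, since the profile computes the rate over a slightly different interval than the node's total time.

```python
rows_returned = 108_001       # RowsReturned counter from the profile
exchange_secs = 3 * 60 + 1    # EXCHANGE_NODE "Total: 3m1s"

rate = rows_returned / exchange_secs
print(round(rate))            # ~597 rows/sec, close to the reported 593.00/sec
```

Either way, a few hundred rows per second through an exchange node is far below the sender-side NetworkThroughput of 139.58 MB/sec, which is what points at blocking rather than raw network bandwidth.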
> {code}
> DataStreamSender (dst_id=9):(Total: 1s819ms, non-child: 1s819ms, % 
> non-child: 100.00%)
>- BytesSent: 234.64 MB (246033840)
>- NetworkThroughput(*): 139.58 MB/sec
>- OverallThroughput: 128.92 MB/sec
>- PeakMemoryUsage: 33.12 KB (33920)
>- RowsReturned: 108.00K (108001)
>- SerializeBatchTime: 133.998ms
>- TransmitDataRPCTime: 1s680ms
>- UncompressedRowBatchSize: 446.42 MB (468102200)
> {code}
> Timeouts seen in IMPALA-6285 are caused by this issue
> {code}
> I1206 12:44:14.925405 25274 status.cc:58] RPC recv timed out: Client 
> foo-17.domain.com:22000 timed-out during recv call.
> @   0x957a6a  impala::Status::Status()
> @  0x11dd5fe  
> impala::DataStreamSender::Channel::DoTransmitDataRpc()
> @  0x11ddcd4  
> impala::DataStreamSender::Channel::TransmitDataHelper()
> @  0x11de080  impala::DataStreamSender::Channel::TransmitData()
> @  0x11e1004  impala::ThreadPool<>::WorkerThread()
> @   0xd10063  impala::Thread::SuperviseThread()
> @   0xd107a4  boost::detail::thread_data<>::run()
> @  0x128997a  (unknown)
> @ 0x7f68c5bc7e25  start_thread
> @ 0x7f68c58f534d  __clone
> {code}
> A similar behavior was also observed with KRPC enabled IMPALA-6048






[jira] [Commented] (IMPALA-5746) Remote fragments continue to hold onto memory after stopping the coordinator daemon

2020-02-19 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040296#comment-17040296
 ] 

Joe McDonnell commented on IMPALA-5746:
---

Since this is now about adding a test case, I'm going to change the target 
version to Impala 4.

> Remote fragments continue to hold onto memory after stopping the coordinator 
> daemon
> ---
>
> Key: IMPALA-5746
> URL: https://issues.apache.org/jira/browse/IMPALA-5746
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 2.10.0
>Reporter: Mostafa Mokhtar
>Assignee: Sahil Takiar
>Priority: Critical
> Attachments: remote_fragments_holding_memory.txt
>
>
> Repro 
> # Start running queries 
> # Kill the coordinator node 
> # On the running Impalad, check the memz tab: remote fragments continue to run 
> and hold on to resources.
> Remote fragments held on to memory for more than 30 minutes after stopping the 
> coordinator service.
> Attached is a thread dump from an Impalad running remote fragments.
> Snapshot of memz tab 30 minutes after killing the coordinator
> {code}
> Process: Limit=201.73 GB Total=5.32 GB Peak=179.36 GB
>   Free Disk IO Buffers: Total=1.87 GB Peak=1.87 GB
>   RequestPool=root.default: Total=1.35 GB Peak=178.51 GB
> Query(f64169d4bb3c901c:3a21d8ae): Total=2.64 MB Peak=104.73 MB
>   Fragment f64169d4bb3c901c:3a21d8ae0051: Total=2.64 MB Peak=2.67 MB
> AGGREGATION_NODE (id=15): Total=2.54 MB Peak=2.57 MB
>   Exprs: Total=30.12 KB Peak=30.12 KB
> EXCHANGE_NODE (id=14): Total=0 Peak=0
> DataStreamRecvr: Total=0 Peak=12.29 KB
> DataStreamSender (dst_id=17): Total=85.31 KB Peak=85.31 KB
> CodeGen: Total=1.53 KB Peak=374.50 KB
>   Block Manager: Limit=161.39 GB Total=512.00 KB Peak=1.54 MB
> Query(2a4f12b3b4b1dc8c:db7e8cf2): Total=258.29 MB Peak=412.98 MB
>   Fragment 2a4f12b3b4b1dc8c:db7e8cf2008c: Total=2.29 MB Peak=2.29 MB
> SORT_NODE (id=11): Total=4.00 KB Peak=4.00 KB
> AGGREGATION_NODE (id=20): Total=2.27 MB Peak=2.27 MB
>   Exprs: Total=25.12 KB Peak=25.12 KB
> EXCHANGE_NODE (id=19): Total=0 Peak=0
> DataStreamRecvr: Total=0 Peak=0
> DataStreamSender (dst_id=21): Total=3.88 KB Peak=3.88 KB
> CodeGen: Total=4.17 KB Peak=1.05 MB
>   Block Manager: Limit=161.39 GB Total=256.25 MB Peak=321.66 MB
> Query(68421d2a5dea0775:83f5d972): Total=282.77 MB Peak=443.53 MB
>   Fragment 68421d2a5dea0775:83f5d972004a: Total=26.77 MB Peak=26.92 MB
> SORT_NODE (id=8): Total=8.00 KB Peak=8.00 KB
>   Exprs: Total=4.00 KB Peak=4.00 KB
> ANALYTIC_EVAL_NODE (id=7): Total=4.00 KB Peak=4.00 KB
>   Exprs: Total=4.00 KB Peak=4.00 KB
> SORT_NODE (id=6): Total=24.00 MB Peak=24.00 MB
> AGGREGATION_NODE (id=12): Total=2.72 MB Peak=2.83 MB
>   Exprs: Total=85.12 KB Peak=85.12 KB
> EXCHANGE_NODE (id=11): Total=0 Peak=0
> DataStreamRecvr: Total=0 Peak=84.80 KB
> DataStreamSender (dst_id=13): Total=1.27 KB Peak=1.27 KB
> CodeGen: Total=24.80 KB Peak=4.13 MB
>   Block Manager: Limit=161.39 GB Total=280.50 MB Peak=286.52 MB
> Query(e94c89fa89a74d27:82812bf9): Total=258.29 MB Peak=436.85 MB
>   Fragment e94c89fa89a74d27:82812bf9008e: Total=2.29 MB Peak=2.29 MB
> SORT_NODE (id=11): Total=4.00 KB Peak=4.00 KB
> AGGREGATION_NODE (id=20): Total=2.27 MB Peak=2.27 MB
>   Exprs: Total=25.12 KB Peak=25.12 KB
> EXCHANGE_NODE (id=19): Total=0 Peak=0
> DataStreamRecvr: Total=0 Peak=0
> DataStreamSender (dst_id=21): Total=3.88 KB Peak=3.88 KB
> CodeGen: Total=4.17 KB Peak=1.05 MB
>   Block Manager: Limit=161.39 GB Total=256.25 MB Peak=321.62 MB
> Query(4e43dad3bdc935d8:938b8b7e): Total=2.65 MB Peak=105.60 MB
>   Fragment 4e43dad3bdc935d8:938b8b7e0052: Total=2.65 MB Peak=2.68 MB
> AGGREGATION_NODE (id=15): Total=2.55 MB Peak=2.57 MB
>   Exprs: Total=30.12 KB Peak=30.12 KB
> EXCHANGE_NODE (id=14): Total=0 Peak=0
> DataStreamRecvr: Total=0 Peak=13.68 KB
> DataStreamSender (dst_id=17): Total=91.41 KB Peak=91.41 KB
> CodeGen: Total=1.53 KB Peak=374.50 KB
>   Block Manager: Limit=161.39 GB Total=512.00 KB Peak=1.30 MB
> Query(b34bdd65f1ed017e:5a0291bd): Total=2.37 MB Peak=106.56 MB
>   Fragment b34bdd65f1ed017e:5a0291bd004b: Total=2.37 MB Peak=2.37 MB
> SORT_NODE (id=6): Total=4.00 KB Peak=4.00 KB
> AGGREGATION_NODE (id=10): Total=2.35 MB Peak=2.35 MB
>   Exprs: Total=34.12 KB Peak=34.12 KB
> EXCHANGE_NODE (id=9): Total=0 Peak=0
> 

[jira] [Updated] (IMPALA-5746) Remote fragments continue to hold onto memory after stopping the coordinator daemon

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-5746:
--
Target Version: Impala 4.0  (was: Impala 3.4.0)


[jira] [Updated] (IMPALA-2422) % escaping does not work correctly in a LIKE clause

2020-02-19 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-2422:
--
Target Version: Impala 4.0  (was: Impala 2.13.0)

> % escaping does not work correctly in a LIKE clause
> ---
>
> Key: IMPALA-2422
> URL: https://issues.apache.org/jira/browse/IMPALA-2422
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend, Frontend
>Affects Versions: Impala 2.2.4, Impala 2.3.0, Impala 2.5.0, Impala 2.4.0, 
> Impala 2.6.0, Impala 2.7.0
>Reporter: Huaisi Xu
>Priority: Critical
>  Labels: correctness, downgraded, incompatibility
>
> {code:java}
> [localhost:21000] > select '%' like "\%";
> Query: select '%' like "\%"
> +---+
> | '%' like '\%' |
> +---+
> | false   |   -> should return true.
> +---+
> Fetched 1 row(s) in 0.01s
> {code}
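For reference, the expected semantics can be sketched outside Impala: an escaped wildcard should match itself literally. The Python translation below is an illustrative stand-in for a SQL LIKE evaluator, not Impala code; the function name and the backslash escape default are assumptions.

```python
import re

def sql_like(value: str, pattern: str, escape: str = "\\") -> bool:
    """Sketch of SQL LIKE semantics: % matches any run of characters,
    _ matches any single character, and an escaped wildcard is literal."""
    out, i = [], 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            out.append(re.escape(pattern[i + 1]))  # escaped char is literal
            i += 2
        elif ch == "%":
            out.append(".*")
            i += 1
        elif ch == "_":
            out.append(".")
            i += 1
        else:
            out.append(re.escape(ch))
            i += 1
    return re.fullmatch("".join(out), value) is not None

print(sql_like("%", "\\%"))   # -> True: escaped % matches a literal %
print(sql_like("abc", "%"))   # -> True
print(sql_like("axb", "a\\%b"))  # -> False: \% is not a wildcard
```

Under these semantics the query above should indeed return true, which is the behavior the bug report expects.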



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9404) Instantiations/ExprTest.MathConversionFunctions fails in TSAN builds

2020-02-19 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040288#comment-17040288
 ] 

Sahil Takiar commented on IMPALA-9404:
--

I'm not sure why this is failing, but I don't think it is due to data races 
being detected by TSAN, because these tests shouldn't contain any race 
conditions.

> Instantiations/ExprTest.MathConversionFunctions fails in TSAN builds
> 
>
> Key: IMPALA-9404
> URL: https://issues.apache.org/jira/browse/IMPALA-9404
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Sahil Takiar
>Priority: Major
>
> {code:java}
> [ RUN  ] Instantiations/ExprTest.MathConversionFunctions/0
> /home/systest/Impala/be/src/exprs/expr-test.cc:644: Failure
> Value of: ConvertValue(result)
>   Actual: 0
> Expected: expected_result
> Which is: 536870912
> length(unhex(repeat('a', 1024 * 1024 * 1024)))
> [  FAILED  ] Instantiations/ExprTest.MathConversionFunctions/0, where 
> GetParam() = 2-byte object <00-01> (285270 ms)
> [ RUN  ] Instantiations/ExprTest.MathConversionFunctions/1
> [   OK ] Instantiations/ExprTest.MathConversionFunctions/1 (257337 ms)
> [ RUN  ] Instantiations/ExprTest.MathConversionFunctions/2
> /home/systest/Impala/be/src/exprs/expr-test.cc:644: Failure
> Value of: ConvertValue(result)
>   Actual: 0
> Expected: expected_result
> Which is: 536870912
> length(unhex(repeat('a', 1024 * 1024 * 1024)))
> [  FAILED  ] Instantiations/ExprTest.MathConversionFunctions/2, where 
> GetParam() = 2-byte object <01-01> (498111 ms) {code}
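The expected value 536870912 is (1024^3)/2: unhex() decodes each pair of hex digits into one byte, so a 1 GiB string of hex characters decodes to 512 MiB. A small sketch, using Python's `bytes.fromhex` as a stand-in for Impala's `unhex()`, shows the halving on a scaled-down input:

```python
# bytes.fromhex as a stand-in for Impala's unhex(): every pair of hex
# digits becomes one byte, so the output is half the input length.
hex_chars = "a" * 1024           # small stand-in for repeat('a', 1024**3)
decoded = bytes.fromhex(hex_chars)
print(len(decoded))              # -> 512
print((1024 ** 3) // 2)          # -> 536870912, the expected_result above
```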






[jira] [Created] (IMPALA-9404) Instantiations/ExprTest.MathConversionFunctions fails in TSAN builds

2020-02-19 Thread Sahil Takiar (Jira)
Sahil Takiar created IMPALA-9404:


 Summary: Instantiations/ExprTest.MathConversionFunctions fails in 
TSAN builds
 Key: IMPALA-9404
 URL: https://issues.apache.org/jira/browse/IMPALA-9404
 Project: IMPALA
  Issue Type: Sub-task
  Components: Backend
Reporter: Sahil Takiar


{code:java}
[ RUN  ] Instantiations/ExprTest.MathConversionFunctions/0
/home/systest/Impala/be/src/exprs/expr-test.cc:644: Failure
Value of: ConvertValue(result)
  Actual: 0
Expected: expected_result
Which is: 536870912
length(unhex(repeat('a', 1024 * 1024 * 1024)))
[  FAILED  ] Instantiations/ExprTest.MathConversionFunctions/0, where 
GetParam() = 2-byte object <00-01> (285270 ms)
[ RUN  ] Instantiations/ExprTest.MathConversionFunctions/1
[   OK ] Instantiations/ExprTest.MathConversionFunctions/1 (257337 ms)
[ RUN  ] Instantiations/ExprTest.MathConversionFunctions/2
/home/systest/Impala/be/src/exprs/expr-test.cc:644: Failure
Value of: ConvertValue(result)
  Actual: 0
Expected: expected_result
Which is: 536870912
length(unhex(repeat('a', 1024 * 1024 * 1024)))
[  FAILED  ] Instantiations/ExprTest.MathConversionFunctions/2, where 
GetParam() = 2-byte object <01-01> (498111 ms) {code}






[jira] [Created] (IMPALA-9403) Allow TSAN to be set on codegen

2020-02-19 Thread Sahil Takiar (Jira)
Sahil Takiar created IMPALA-9403:


 Summary: Allow TSAN to be set on codegen
 Key: IMPALA-9403
 URL: https://issues.apache.org/jira/browse/IMPALA-9403
 Project: IMPALA
  Issue Type: Sub-task
  Components: Backend
Reporter: Sahil Takiar


Similar to this commit, but for TSAN. Requires adding the {{-fsanitize=thread}} 
flag to {{CLANG_IR_CXX_FLAGS}}.






[jira] [Commented] (IMPALA-9395) RuntimeFilter causes impalad crash

2020-02-19 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040276#comment-17040276
 ] 

Tim Armstrong commented on IMPALA-9395:
---

I'm pretty sure this is a race with the local thread setting the filter here 
https://github.com/apache/impala/blob/f403a96700e47df184ff782378342569be8f1c58/be/src/runtime/runtime-filter-bank.cc#L243

In all other cases the pattern is
{code}
  lock_guard<mutex> l(filter_state->lock);
  if (filter_state->HasFilter()) return;
  filter_state->SetFilter(...);
{code}
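The shape of that race-free pattern can be sketched in Python (class and method names here are hypothetical stand-ins, not Impala's actual API): the existence check and the set must happen under a single lock acquisition, so a local thread and an RPC thread cannot interleave between them.

```python
import threading

class FilterState:
    """Hypothetical sketch of the check-then-set pattern: HasFilter() and
    SetFilter() are applied under one lock acquisition."""
    def __init__(self):
        self._lock = threading.Lock()
        self._filter = None

    def set_filter(self, f) -> bool:
        with self._lock:              # lock is released even on early return
            if self._filter is not None:
                return False          # another thread already set the filter
            self._filter = f
            return True

fs = FilterState()
print(fs.set_filter("bloom-1"))  # -> True
print(fs.set_filter("bloom-2"))  # -> False (first filter wins)
```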

> RuntimeFilter causes impalad crash
> --
>
> Key: IMPALA-9395
> URL: https://issues.apache.org/jira/browse/IMPALA-9395
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Attila Jeges
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: broken-build, flaky
>
> impalad crashed while running TestRuntimeFilters::test_basic_filters: 
> {code}
> 16:55:53  TestRuntimeFilters.test_basic_filters[protocol: beeswax | 
> exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> seq/none] 
> 16:55:53 query_test/test_runtime_filters.py:55: in test_basic_filters
> 16:55:53 test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS' : 
> str(WAIT_TIME_MS)})
> 16:55:53 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_test_suite.py:659:
>  in run_test_case
> 16:55:53 result = exec_fn(query, user=test_section.get('USER', 
> '').strip() or None)
> 16:55:53 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_test_suite.py:594:
>  in __exec_in_impala
> 16:55:53 result = self.__execute_query(target_impalad_client, query, 
> user=user)
> 16:55:53 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_test_suite.py:933:
>  in __execute_query
> 16:55:53 return impalad_client.execute(query, user=user)
> 16:55:53 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_connection.py:205:
>  in execute
> 16:55:53 return self.__beeswax_client.execute(sql_stmt, user=user)
> 16:55:53 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/beeswax/impala_beeswax.py:187:
>  in execute
> 16:55:53 handle = self.__execute_query(query_string.strip(), user=user)
> 16:55:53 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/beeswax/impala_beeswax.py:365:
>  in __execute_query
> 16:55:53 self.wait_for_finished(handle)
> 16:55:53 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/beeswax/impala_beeswax.py:386:
>  in wait_for_finished
> 16:55:53 raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> 16:55:53 E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> 16:55:53 EQuery aborted:ExecQueryFInstances rpc 
> query_id=1b4b52ada7e0b713:a6ca09e2 failed: Exec() rpc failed: Network 
> error: Client connection negotiation failed: client connection to 
> 127.0.0.1:27002: connect: Connection refused (error 111)
> {code}
> minidump:
> {code}
> Operating system: Linux
>   0.0.0 Linux 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 20 
> 20:32:50 UTC 2017 x86_64
> CPU: amd64
>  family 6 model 85 stepping 4
>  1 CPU
> GPU: UNKNOWN
> Crash reason:  SIGABRT
> Crash address: 0x7d16057
> Process uptime: not available
> Thread 122 (crashed)
>  0  libc-2.17.so + 0x351f7
> rax = 0x   rdx = 0x0006
> rcx = 0x   rbx = 0x07208c80
> rsi = 0x6366   rdi = 0x6057
> rbp = 0x7fb39ecf7da0   rsp = 0x7fb39ecf7a38
>  r8 = 0xr9 = 0x7fb39ecf78b0
> r10 = 0x0008   r11 = 0x0202
> r12 = 0x07208d00   r13 = 0x004b
> r14 = 0x07210644   r15 = 0x07208c80
> rip = 0x7fb40b9501f7
> Found by: given as instruction pointer in context
>  1  libc-2.17.so + 0x368e8
> rbp = 0x7fb39ecf7da0   rsp = 0x7fb39ecf7a40
> rip = 0x7fb40b9518e8
> Found by: stack scanning
>  2  impalad!google_breakpad::ExceptionHandler::HandleSignal(int, siginfo_t*, 
> void*) + 0x1e0
> rbp = 0x7fb39ecf7da0   rsp = 0x7fb39ecf7ac8
> rip = 0x04e866f0
> Found by: stack scanning
>  3  impalad!boost::detail::lexical_converter_impl impala::extdatasource::TColumnDesc>::try_convert(impala::extdatasource::TColumnDesc
>  const&, std::string&) [converter_lexical.hpp : 508 + 0x1]
> rbp = 0x7fb39ecf7da0   rsp = 0x7fb39ecf7ae0
> rip = 

[jira] [Commented] (IMPALA-9395) RuntimeFilter causes impalad crash

2020-02-19 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040263#comment-17040263
 ] 

Tim Armstrong commented on IMPALA-9395:
---

Looks like this query caused the crash:
{noformat}
I0216 07:41:57.279309 18338 impala-beeswax-server.cc:509] 
TClientRequest.queryOptions: TQueryOptions {
...
  74: client_identifier (string) = 
"query_test/test_runtime_filters.py::TestRuntimeFilters::()::test_basic_filters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_single_node_rows_threshold':0}|tab",
...
I0216 07:41:57.281234 18338 impala-server.cc:1042] 
3a4cdb1f69e0fe66:1afd3e99] Registered query 
query_id=3a4cdb1f69e0fe66:1afd3e99 
session_id=a449f7ded8f911f4:b44a4608ace48c82
I0216 07:41:57.281312 18338 Frontend.java:1499] 
3a4cdb1f69e0fe66:1afd3e99] Analyzing query: select straight_join 
count(*)
from alltypes a join [BROADCAST] alltypessmall c
on a.month = c.month join [BROADCAST] alltypesagg b
on a.month = b.id where b.int_col < 0 db: functional_seq
{noformat}

> RuntimeFilter causes impalad crash
> --
>
> Key: IMPALA-9395
> URL: https://issues.apache.org/jira/browse/IMPALA-9395
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Attila Jeges
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: broken-build, flaky
>
> impalad crashed while running TestRuntimeFilters::test_basic_filters: 
> {code}
> 16:55:53  TestRuntimeFilters.test_basic_filters[protocol: beeswax | 
> exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> seq/none] 
> 16:55:53 query_test/test_runtime_filters.py:55: in test_basic_filters
> 16:55:53 test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS' : 
> str(WAIT_TIME_MS)})
> 16:55:53 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_test_suite.py:659:
>  in run_test_case
> 16:55:53 result = exec_fn(query, user=test_section.get('USER', 
> '').strip() or None)
> 16:55:53 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_test_suite.py:594:
>  in __exec_in_impala
> 16:55:53 result = self.__execute_query(target_impalad_client, query, 
> user=user)
> 16:55:53 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_test_suite.py:933:
>  in __execute_query
> 16:55:53 return impalad_client.execute(query, user=user)
> 16:55:53 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_connection.py:205:
>  in execute
> 16:55:53 return self.__beeswax_client.execute(sql_stmt, user=user)
> 16:55:53 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/beeswax/impala_beeswax.py:187:
>  in execute
> 16:55:53 handle = self.__execute_query(query_string.strip(), user=user)
> 16:55:53 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/beeswax/impala_beeswax.py:365:
>  in __execute_query
> 16:55:53 self.wait_for_finished(handle)
> 16:55:53 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/beeswax/impala_beeswax.py:386:
>  in wait_for_finished
> 16:55:53 raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> 16:55:53 E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> 16:55:53 EQuery aborted:ExecQueryFInstances rpc 
> query_id=1b4b52ada7e0b713:a6ca09e2 failed: Exec() rpc failed: Network 
> error: Client connection negotiation failed: client connection to 
> 127.0.0.1:27002: connect: Connection refused (error 111)
> {code}
> minidump:
> {code}
> Operating system: Linux
>   0.0.0 Linux 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 20 
> 20:32:50 UTC 2017 x86_64
> CPU: amd64
>  family 6 model 85 stepping 4
>  1 CPU
> GPU: UNKNOWN
> Crash reason:  SIGABRT
> Crash address: 0x7d16057
> Process uptime: not available
> Thread 122 (crashed)
>  0  libc-2.17.so + 0x351f7
> rax = 0x   rdx = 0x0006
> rcx = 0x   rbx = 0x07208c80
> rsi = 0x6366   rdi = 0x6057
> rbp = 0x7fb39ecf7da0   rsp = 0x7fb39ecf7a38
>  r8 = 0xr9 = 0x7fb39ecf78b0
> r10 = 0x0008   r11 = 0x0202
> r12 = 0x07208d00   r13 = 0x004b
> r14 = 0x07210644   r15 = 0x07208c80
> rip = 0x7fb40b9501f7
> Found by: given as instruction pointer in context
>  1  libc-2.17.so + 0x368e8
> rbp = 0x7fb39ecf7da0   rsp = 

[jira] [Commented] (IMPALA-9152) AuthorizationStmtTest.testColumnMaskEnabled failed in precommits.

2020-02-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040248#comment-17040248
 ] 

ASF subversion and git services commented on IMPALA-9152:
-

Commit f403a96700e47df184ff782378342569be8f1c58 in impala's branch 
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=f403a96 ]

IMPALA-9152: Make Impala ranger plugin singleton in FE Tests

Fix the flakiness of the Ranger FE tests in AuthorizationStmtTest, which is
caused by a row-filter policy not being cleanly deleted. There is a bug in
Ranger (RANGER-2727) where a policy deleted in the Ranger Admin server can
still exist in the Ranger plugins when create-policy and get-policy requests
run concurrently. The bug is more likely to be hit when more Ranger plugins
are running, since each plugin instance polls for policies every 30 seconds.

The impalad and catalogd servers each initialize only one ImpalaRangerPlugin
instance. However, AuthorizationStmtTest has embedded Frontend and
CatalogServiceCatalog objects, so it initializes two Ranger plugin instances
in total. What's worse, the JUnit testing framework creates a new object for
each test method run. There are currently 29 test methods in
AuthorizationStmtTest, which means 29 AuthorizationStmtTest objects are
created, leaving 58 Ranger plugin instances running and making RANGER-2727
easy to hit.

The failure can be reproduced by adding the following new test and run
it with all existing tests:
  @Test
  public void testRangerPolicyRepeatedly() throws Exception {
if (authzProvider_ == AuthorizationProvider.SENTRY) return;
for (int i = 0; i < 100; ++i) {
  testRowFilterEnabled();
  testColumnMaskEnabled();
}
  }
We only explicitly create policies for column masking and row filtering
(the other tests use grant/revoke requests). This new test increases the
number of CreatePolicy requests, and thus the chance of a CreatePolicy
request running concurrently with the GetPolicies polling requests from
Ranger plugin instances created by previous tests.

The fix is to make ImpalaRangerPlugin a singleton class so that we have
only one Ranger plugin instance, which dramatically reduces the chance of
hitting RANGER-2727. The thorough fix is to bump the CDP version once
RANGER-2727 is resolved. The code added in the previous patch
(c1244c2f04e629cc07b0830a597c70317be92768) is removed.

Tests:
 - Ran AuthorizationStmtTest with the above new test.

Change-Id: I91f2bad1a9ce897b45cfc42f97b192fe2f8c7e06
Reviewed-on: http://gerrit.cloudera.org:8080/15235
Reviewed-by: Csaba Ringhofer 
Tested-by: Impala Public Jenkins 


> AuthorizationStmtTest.testColumnMaskEnabled failed in precommits.
> -
>
> Key: IMPALA-9152
> URL: https://issues.apache.org/jira/browse/IMPALA-9152
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Anurag Mantripragada
>Assignee: Quanlong Huang
>Priority: Critical
>
> [~stigahuang] since you are going to work on this stuff, assigning it to you. 
> Please feel free to reassign it. Thank you!
> Encountered here:
> [https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/8738]
> Looks like the test was expecting column masking error but encountered row 
> filtering error.
> {code:java}
> got error:
> Impala does not support row filtering yet. Row filtering is enabled on table: 
> functional.alltypes_view
> expected:
> Impala does not support column masking yet. Column masking is enabled on 
> column: functional.alltypes_view.string_col {code}






[jira] [Assigned] (IMPALA-9402) bootstrap_system.sh fails to configure PostgreSQL 9 on some CentOS 7 systems

2020-02-19 Thread Laszlo Gaal (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Gaal reassigned IMPALA-9402:
---

Assignee: Laszlo Gaal

> bootstrap_system.sh fails to configure PostgreSQL 9 on some CentOS 7 systems
> 
>
> Key: IMPALA-9402
> URL: https://issues.apache.org/jira/browse/IMPALA-9402
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Laszlo Gaal
>Assignee: Laszlo Gaal
>Priority: Critical
>
> {{bin/bootstrap_system.sh}} relaxes the default PostgreSQL security rules to 
> allow password-less operations during data load. It uses {{sed}} to 
> manipulate the contents of the {{pg_hba.conf}} file, see the lines 
> [https://github.com/apache/impala/blob/master/bin/bootstrap_system.sh#L310]
> Unfortunately the {{sed}} patterns are too system-dependent, which caused a 
> setup failure when a CentOS 7 test platform was set up with PostgreSQL 9.2: 
> the default content of pg_hba.conf did not match what the script expected.
> This failure broke the build in the dataload phase with the symptom:
> {code:java}
> 23:36:49 dropdb: could not connect to database template1: FATAL:  Peer 
> authentication failed for user "hiveuser"
> 23:36:49 createdb: could not connect to database template1: FATAL:  Peer 
> authentication failed for user "hiveuser" {code}
> This blocks automatic deployment of an Impala build on such a platform.






[jira] [Created] (IMPALA-9402) bootstrap_system.sh fails to configure PostgreSQL 9 on some CentOS 7 systems

2020-02-19 Thread Laszlo Gaal (Jira)
Laszlo Gaal created IMPALA-9402:
---

 Summary: bootstrap_system.sh fails to configure PostgreSQL 9 on 
some CentOS 7 systems
 Key: IMPALA-9402
 URL: https://issues.apache.org/jira/browse/IMPALA-9402
 Project: IMPALA
  Issue Type: Bug
Reporter: Laszlo Gaal


{{bin/bootstrap_system.sh}} relaxes the default PostgreSQL security rules to 
allow password-less operations during data load. It uses {{sed}} to manipulate 
the contents of the {{pg_hba.conf}} file, see the lines 
[https://github.com/apache/impala/blob/master/bin/bootstrap_system.sh#L310]

Unfortunately the {{sed}} patterns are too system-dependent, which caused a 
setup failure when a CentOS 7 test platform was set up with PostgreSQL 9.2: 
the default content of pg_hba.conf did not match what the script expected.

This failure broke the build in the dataload phase with the symptom:
{code:java}
23:36:49 dropdb: could not connect to database template1: FATAL:  Peer 
authentication failed for user "hiveuser"
23:36:49 createdb: could not connect to database template1: FATAL:  Peer 
authentication failed for user "hiveuser" {code}
This blocks automatic deployment of an Impala build on such a platform.
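One way to make such an edit independent of the distro's default pg_hba.conf layout is to prepend a permissive rule instead of rewriting existing lines. This is an illustrative sketch, not the actual bootstrap_system.sh change; PostgreSQL applies the first matching rule, so a prepended line takes effect regardless of what follows:

```python
import os
import tempfile

def allow_passwordless(pg_hba_path: str) -> None:
    """Prepend a trust rule so local TCP connections need no password.
    Prepending avoids matching any distro-specific existing lines."""
    with open(pg_hba_path) as f:
        original = f.read()
    rule = "host    all    all    127.0.0.1/32    trust\n"
    with open(pg_hba_path, "w") as f:
        f.write(rule + original)  # new rule first: PostgreSQL uses first match

# Demo on a stand-in file (a real run would target the server's data dir).
path = os.path.join(tempfile.mkdtemp(), "pg_hba.conf")
with open(path, "w") as f:
    f.write("local   all   all   peer\n")   # stand-in for a distro default
allow_passwordless(path)
print(open(path).readline().split()[-1])    # -> trust
```

A reload (`pg_ctl reload` or `SELECT pg_reload_conf()`) would still be needed for the server to pick up the change.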


