[jira] [Resolved] (IMPALA-10942) Fix memory leak in admission controller

2022-04-05 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-10942.
-
Resolution: Fixed

> Fix memory leak in admission controller
> ---
>
> Key: IMPALA-10942
> URL: https://issues.apache.org/jira/browse/IMPALA-10942
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0.0
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Critical
> Attachments: impala4-coordinator-memory-leak-heap-growth-profile.pdf
>
>
> [https://github.com/apache/impala/blob/1a61a8025c87c37921a1bba4c49f754d8bd10bcc/be/src/scheduling/admission-controller.cc#L1196]
> The intent of the aforementioned line in the code was to check and remove the
> queue_node if the query was not queued. Instead it ends up checking whether
> the flag pointer is null or not. Since a valid pointer is always passed, the
> queue_node is never removed, resulting in a memory leak.
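> A minimal sketch of the bug pattern (names invented for illustration; not
> the actual Impala code):
> {noformat}
> struct QueueNode { /* per-query admission state */ };
> void RemoveNode(QueueNode* node) { delete node; }
>
> void FinalizeAdmissionBuggy(bool* queued, QueueNode* node) {
>   // BUG: tests the pointer, which is always valid, so the node is never
>   // removed and leaks.
>   if (!queued) RemoveNode(node);
> }
>
> void FinalizeAdmissionFixed(bool* queued, QueueNode* node) {
>   // Intended check: dereference the flag; remove the node only when the
>   // query was not queued.
>   if (!*queued) RemoveNode(node);
> }
> {noformat}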

[jira] [Created] (IMPALA-11219) Add metric "admission-controller.total-dequeue-failed-coordinator-limited" for each resource pool

2022-04-01 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-11219:
---

 Summary: Add metric  
"admission-controller.total-dequeue-failed-coordinator-limited" for each 
resource pool
 Key: IMPALA-11219
 URL: https://issues.apache.org/jira/browse/IMPALA-11219
 Project: IMPALA
  Issue Type: Bug
Reporter: Bikramjeet Vig

[jira] [Created] (IMPALA-11200) Redundant additions to ExecOption field in query profile of grouping aggregator node when inside a subplan

2022-03-24 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-11200:
---

 Summary: Redundant additions to ExecOption field in query profile 
of grouping aggregator node when inside a subplan 
 Key: IMPALA-11200
 URL: https://issues.apache.org/jira/browse/IMPALA-11200
 Project: IMPALA
  Issue Type: Bug
Reporter: Bikramjeet Vig


"Codegen Enabled" is added repeatedly to the "ExecOption" field in the query
profile when a grouping agg is part of a subplan that is created to unnest a
complex type.
I was able to reproduce this using one of the queries we used for end-to-end
testing:
{noformat}
use tpch_nested_parquet;
select c_custkey, v.* from customer c,
  (select o_orderpriority, count(o_orderkey) c, sum(o_totalprice) s,
  avg(o_totalprice) a, max(o_orderstatus) mx,
  min(o_orderdate) mn
   from c.c_orders
   group by o_orderpriority) v
where c_custkey < 4;
{noformat}
From the query profile:
{noformat}
 AGGREGATION_NODE (id=4):
 - InactiveTotalTime: 0.000ns
 - PeakMemoryUsage: 36.04 MB (37794944)
 - RowsReturned: 0 (0)
 - RowsReturnedRate: 0
 - TotalTime: 1.571ms
GroupingAggregator 0:
  ExecOption: Codegen Enabled, Codegen Enabled, Codegen Enabled 
  <== THIS PART!
   - BuildTime: 68.253us
   - GetResultsTime: 72.634us
{noformat}
This happens because "Codegen Enabled" is appended to ExecOption every time the
agg node is Opened, and the subplan is invoked again and again to unnest.
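One possible shape of the fix, sketched with invented names (the actual Impala
profile API may differ): record the codegen status once and skip the append on
later Opens.
{noformat}
#include <string>

// Hypothetical stand-in for the runtime profile's ExecOption field.
struct ProfileSketch {
  std::string exec_option;
  void AppendExecOption(const std::string& opt) {
    if (!exec_option.empty()) exec_option += ", ";
    exec_option += opt;
  }
};

struct GroupingAggregatorSketch {
  ProfileSketch profile;
  bool codegen_status_added = false;

  void Open() {
    // Open() runs once per subplan iteration; without the guard the profile
    // accumulates "Codegen Enabled, Codegen Enabled, ...".
    if (!codegen_status_added) {
      profile.AppendExecOption("Codegen Enabled");
      codegen_status_added = true;
    }
  }
};
{noformat}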

[jira] [Resolved] (IMPALA-11063) Add metrics for the state of each executor group set

2022-01-20 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-11063.
-
Target Version: Impala 4.1.0
Resolution: Fixed

> Add metrics for the state of each executor group set
> 
>
> Key: IMPALA-11063
> URL: https://issues.apache.org/jira/browse/IMPALA-11063
> Project: IMPALA
>  Issue Type: Task
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Critical
>
> Add metrics similar to the following for each executor group set:
> {noformat}
> cluster-membership.executor-groups.total
> cluster-membership.executor-groups.total-healthy
> cluster-membership.backends.total{noformat}

[jira] [Resolved] (IMPALA-11033) Add support for multiple executor group sets

2022-01-12 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-11033.
-
Fix Version/s: Impala 4.1.0
   Resolution: Fixed

> Add support for multiple executor group sets
> 
>
> Key: IMPALA-11033
> URL: https://issues.apache.org/jira/browse/IMPALA-11033
> Project: IMPALA
>  Issue Type: Task
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Critical
> Fix For: Impala 4.1.0
>
>
> An executor group set is defined as the set of all executor groups that are
> associated with the same resource pool and have that resource pool's name as
> their prefix.
> For example, given a resource pool named "sample-pool", executor groups such
> as "sample-pool-group1" and "sample-pool-group2" both belong to the executor
> group set associated with that resource pool.
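> A minimal sketch of this membership rule (illustrative only; not the actual
> Impala code):
> {noformat}
> #include <string>
>
> // True when 'group' belongs to the group set of 'pool', i.e. the resource
> // pool name is a prefix of the executor group name.
> bool BelongsToGroupSet(const std::string& pool, const std::string& group) {
>   return group.rfind(pool, 0) == 0;  // group starts with pool
> }
>
> // BelongsToGroupSet("sample-pool", "sample-pool-group1") -> true
> // BelongsToGroupSet("sample-pool", "other-pool-group1")  -> false
> {noformat}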

[jira] [Assigned] (IMPALA-11063) Add metrics for the state of each executor group set

2021-12-20 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig reassigned IMPALA-11063:
---

Assignee: Bikramjeet Vig

> Add metrics for the state of each executor group set
> 
>
> Key: IMPALA-11063
> URL: https://issues.apache.org/jira/browse/IMPALA-11063
> Project: IMPALA
>  Issue Type: Task
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Critical
>
> Add metrics similar to the following for each executor group set:
> {noformat}
> cluster-membership.executor-groups.total
> cluster-membership.executor-groups.total-healthy
> cluster-membership.backends.total{noformat}

[jira] [Created] (IMPALA-11063) Add metrics for the state of each executor group set

2021-12-20 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-11063:
---

 Summary: Add metrics for the state of each executor group set
 Key: IMPALA-11063
 URL: https://issues.apache.org/jira/browse/IMPALA-11063
 Project: IMPALA
  Issue Type: Task
Reporter: Bikramjeet Vig


Add metrics similar to the following for each executor group set:
{noformat}
cluster-membership.executor-groups.total
cluster-membership.executor-groups.total-healthy
cluster-membership.backends.total{noformat}

[jira] [Updated] (IMPALA-11033) Add support for multiple executor group sets

2021-12-20 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-11033:

Description: 
An executor group set is defined as the set of all executor groups that are
associated with the same resource pool and have that resource pool's name as
their prefix.
For example, given a resource pool named "sample-pool", executor groups such
as "sample-pool-group1" and "sample-pool-group2" both belong to the executor
group set associated with that resource pool.

  was:Add support for classifying executor groups into two types, REGULAR and
LARGE, and plugging them into the Frontend to unblock IMPALA-10992


> Add support for multiple executor group sets
> 
>
> Key: IMPALA-11033
> URL: https://issues.apache.org/jira/browse/IMPALA-11033
> Project: IMPALA
>  Issue Type: Task
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Critical
>
> An executor group set is defined as the set of all executor groups that are
> associated with the same resource pool and have that resource pool's name as
> their prefix.
> For example, given a resource pool named "sample-pool", executor groups such
> as "sample-pool-group1" and "sample-pool-group2" both belong to the executor
> group set associated with that resource pool.

[jira] [Updated] (IMPALA-11033) Add support for multiple executor group sets

2021-12-20 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-11033:

Summary: Add support for multiple executor group sets  (was: Add support 
for setting up classification among multiple executor groups )

> Add support for multiple executor group sets
> 
>
> Key: IMPALA-11033
> URL: https://issues.apache.org/jira/browse/IMPALA-11033
> Project: IMPALA
>  Issue Type: Task
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Critical
>
> Add support for classifying executor groups into two types, REGULAR and
> LARGE, and plugging them into the Frontend to unblock IMPALA-10992

[jira] [Updated] (IMPALA-10993) Add query option that would allow the user to specify the size of the cluster that they want the query to run on.

2021-12-20 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-10993:

Priority: Critical  (was: Major)

> Add query option that would allow the user to specify the size of the cluster
> that they want the query to run on.
> -
>
> Key: IMPALA-10993
> URL: https://issues.apache.org/jira/browse/IMPALA-10993
> Project: IMPALA
>  Issue Type: Task
>Reporter: Amogh Margoor
>Assignee: Amogh Margoor
>Priority: Critical
>
> Add a query option that would allow the user to specify the size of the
> cluster that they want the query to run on. The scheduler would then run the
> query on a resource group whose size is larger than this number. For
> instance, if the user specifies 10 and the next larger resource group is of
> size 30, it would pick the executor group mapping to that resource group (of
> size 30) to schedule the query on.
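> A minimal sketch of this selection rule (names invented; the actual
> scheduler logic may differ):
> {noformat}
> #include <map>
> #include <string>
>
> // Maps executor group size -> group name, ordered by size. Returns the
> // first group larger than the requested size, or "" if none is big enough.
> std::string PickGroup(const std::map<int, std::string>& groups_by_size,
>                       int requested_size) {
>   auto it = groups_by_size.upper_bound(requested_size);
>   return it == groups_by_size.end() ? "" : it->second;
> }
>
> // With groups {{10, "small"}, {30, "large"}} and requested_size = 10,
> // PickGroup returns "large", matching the example above.
> {noformat}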

[jira] [Updated] (IMPALA-10991) Changes to support multiple resource groups

2021-12-20 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-10991:

Priority: Critical  (was: Major)

> Changes to support multiple resource groups
> ---
>
> Key: IMPALA-10991
> URL: https://issues.apache.org/jira/browse/IMPALA-10991
> Project: IMPALA
>  Issue Type: Epic
>Reporter: Amogh Margoor
>Assignee: Amogh Margoor
>Priority: Critical
>
> Epic to track issues to add support for multiple resource groups.

[jira] [Updated] (IMPALA-11033) Add support for setting up classification among multiple executor groups

2021-12-20 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-11033:

Priority: Critical  (was: Major)

> Add support for setting up classification among multiple executor groups 
> -
>
> Key: IMPALA-11033
> URL: https://issues.apache.org/jira/browse/IMPALA-11033
> Project: IMPALA
>  Issue Type: Task
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Critical
>
> Add support for classifying executor groups into two types, REGULAR and
> LARGE, and plugging them into the Frontend to unblock IMPALA-10992

[jira] [Updated] (IMPALA-10943) Add additional test to verify support for multiple executor groups that map to different resource groups

2021-12-20 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-10943:

Priority: Critical  (was: Major)

> Add additional test to verify support for multiple executor groups that map 
> to different resource groups 
> -
>
> Key: IMPALA-10943
> URL: https://issues.apache.org/jira/browse/IMPALA-10943
> Project: IMPALA
>  Issue Type: Task
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Critical
> Fix For: Impala 4.1.0
>
>


[jira] [Updated] (IMPALA-10992) Planner changes to estimate peak memory.

2021-12-20 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-10992:

Priority: Critical  (was: Major)

> Planner changes to estimate peak memory.
> -
>
> Key: IMPALA-10992
> URL: https://issues.apache.org/jira/browse/IMPALA-10992
> Project: IMPALA
>  Issue Type: Task
>Reporter: Amogh Margoor
>Assignee: Qifan Chen
>Priority: Critical
>
> To run large queries on a larger executor group mapping to a different
> resource group, we need to identify large queries at compile time. In the
> first phase, this identification can use the estimated peak memory to
> classify large queries. This Jira is to keep track of that support.
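> A minimal sketch of such a first-phase classification (names and the
> threshold parameter are invented for illustration):
> {noformat}
> #include <cstdint>
>
> enum class QuerySize { REGULAR, LARGE };
>
> // Classify a query as LARGE when its estimated peak memory exceeds a
> // configured threshold.
> QuerySize Classify(int64_t estimated_peak_mem_bytes,
>                    int64_t large_query_threshold_bytes) {
>   return estimated_peak_mem_bytes > large_query_threshold_bytes
>              ? QuerySize::LARGE : QuerySize::REGULAR;
> }
> {noformat}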

[jira] [Assigned] (IMPALA-11033) Add support for setting up classification among multiple executor groups

2021-11-19 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig reassigned IMPALA-11033:
---

Assignee: Bikramjeet Vig

> Add support for setting up classification among multiple executor groups 
> -
>
> Key: IMPALA-11033
> URL: https://issues.apache.org/jira/browse/IMPALA-11033
> Project: IMPALA
>  Issue Type: Task
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Major
>
> Add support for classifying executor groups into two types, REGULAR and
> LARGE, and plugging them into the Frontend to unblock IMPALA-10992

[jira] [Created] (IMPALA-11033) Add support for setting up classification among multiple executor groups

2021-11-19 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-11033:
---

 Summary: Add support for setting up classification among multiple 
executor groups 
 Key: IMPALA-11033
 URL: https://issues.apache.org/jira/browse/IMPALA-11033
 Project: IMPALA
  Issue Type: Task
Reporter: Bikramjeet Vig


Add support for classifying executor groups into two types, REGULAR and LARGE,
and plugging them into the Frontend to unblock IMPALA-10992

[jira] [Resolved] (IMPALA-10943) Add additional test to verify support for multiple executor groups that map to different resource groups

2021-11-19 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-10943.
-
Fix Version/s: Impala 4.1.0
   Resolution: Fixed

> Add additional test to verify support for multiple executor groups that map 
> to different resource groups 
> -
>
> Key: IMPALA-10943
> URL: https://issues.apache.org/jira/browse/IMPALA-10943
> Project: IMPALA
>  Issue Type: Task
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Major
> Fix For: Impala 4.1.0
>
>

[jira] [Created] (IMPALA-11012) test_show_create_table flaky failure

2021-11-09 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-11012:
---

 Summary: test_show_create_table flaky failure
 Key: IMPALA-11012
 URL: https://issues.apache.org/jira/browse/IMPALA-11012
 Project: IMPALA
  Issue Type: Bug
Reporter: Bikramjeet Vig
Assignee: Zoltán Borók-Nagy


This test failed in one of the recent [GVO 
builds|https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/4902/]:
Stacktrace:
{noformat}
metadata/test_show_create_table.py:64: in test_show_create_table
unique_database)
metadata/test_show_create_table.py:127: in __run_show_create_table_test_case
result = self.__exec(test_case.show_create_table_sql)
metadata/test_show_create_table.py:135: in __exec
return self.execute_query_expect_success(self.client, sql_str)
common/impala_test_suite.py:831: in wrapper
return function(*args, **kwargs)
common/impala_test_suite.py:839: in execute_query_expect_success
result = cls.__execute_query(impalad_client, query, query_options, user)
common/impala_test_suite.py:956: in __execute_query
return impalad_client.execute(query, user=user)
common/impala_connection.py:212: in execute
return self.__beeswax_client.execute(sql_stmt, user=user)
beeswax/impala_beeswax.py:189: in execute
handle = self.__execute_query(query_string.strip(), user=user)
beeswax/impala_beeswax.py:365: in __execute_query
handle = self.execute_query_async(query_string, user=user)
beeswax/impala_beeswax.py:359: in execute_query_async
handle = self.__do_rpc(lambda: self.imp_service.query(query,))
beeswax/impala_beeswax.py:522: in __do_rpc
raise ImpalaBeeswaxException(self.__build_error_message(b), b)
E   ImpalaBeeswaxException: ImpalaBeeswaxException:
EINNER EXCEPTION: 
EMESSAGE: AnalysisException: 
org.apache.impala.catalog.TableLoadingException: Error opening Iceberg table 
'test_show_create_table_8b557a01.iceberg_ctas'
E   CAUSED BY: TableLoadingException: Error opening Iceberg table 
'test_show_create_table_8b557a01.iceberg_ctas'
E   CAUSED BY: InconsistentMetadataFetchException: Catalog object 
TCatalogObject(type:TABLE, catalog_version:6846, 
table:TTable(db_name:test_show_create_table_8b557a01, tbl_name:iceberg_ctas)) 
changed version between accesses.
{noformat}

Reached out to Zoltan about this and he mentioned that this is probably due to
a failure to detect a self-event on an Iceberg table, which resulted in a
change to the table's catalog version during some operation.

[jira] [Updated] (IMPALA-10999) Flakiness in TestAsyncLoadData.test_async_load

2021-11-01 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-10999:

Description: 
This test failed in one of the recent GVO builds.
[Link|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/15097/testReport/junit/metadata.test_load/TestAsyncLoadData/test_async_load_enable_async_load_data_execution__False___protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/]

 
{noformat}
Error Message
metadata/test_load.py:197: in test_async_load assert(exec_end_state == 
finished_state) E   assert 3 == 4
Stacktrace
metadata/test_load.py:197: in test_async_load
assert(exec_end_state == finished_state)
E   assert 3 == 4
Standard Error
SET 
client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
-- connecting to: localhost:21000
-- connecting to localhost:21050 with impyla
-- 2021-10-30 01:38:55,203 INFO MainThread: Closing active operation
-- connecting to localhost:28000 with impyla
-- 2021-10-30 01:38:55,237 INFO MainThread: Closing active operation
SET 
client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
SET sync_ddl=False;
-- executing against localhost:21000

DROP DATABASE IF EXISTS `test_async_load_ff1c20a7` CASCADE;

-- 2021-10-30 01:38:55,281 INFO MainThread: Started query 
df43a0ff6165a9eb:33b0d69f
SET 
client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
SET sync_ddl=False;
-- executing against localhost:21000

CREATE DATABASE `test_async_load_ff1c20a7`;

-- 2021-10-30 01:39:01,148 INFO MainThread: Started query 
e64bd28a97339b44:e76523a8
-- 2021-10-30 01:39:01,253 INFO MainThread: Created database 
"test_async_load_ff1c20a7" for test ID 
"metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:
 False | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
text/none]"
SET 
client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
-- connecting to: localhost:21000
-- executing against localhost:21000

create table test_async_load_ff1c20a7.test_load_nopart_beeswax_False like 
functional.alltypesnopart location 
'/test-warehouse/test_load_staging_beeswax_False';

-- 2021-10-30 01:39:09,435 INFO MainThread: Started query 
e543635533874c9e:fe238ca9
-- executing against localhost:21000

select count(*) from test_async_load_ff1c20a7.test_load_nopart_beeswax_False;

-- 2021-10-30 01:39:13,178 INFO MainThread: Started query 
5c4969e81b1b614b:26754a22
SET 
client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
-- executing against localhost:21000

use functional;

-- 2021-10-30 01:39:13,413 INFO MainThread: Started query 
d340e3650cba2d6f:a35a14bb
SET 
client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
SET batch_size=0;
SET num_nodes=0;
SET disable_codegen_rows_threshold=0;
SET disable_codegen=False;
SET abort_on_error=1;
SET enable_async_load_data_execution=False;
SET debug_action=CRS_DELAY_BEFORE_LOAD_DATA:SLEEP@3000;
SET exec_single_node_rows_threshold=0;
-- executing async: localhost:21000

load data inpath '/test-warehouse/test_load_staging_beeswax_False'   
into table test_async_load_ff1c20a7.test_load_nopart_beeswax_False;

-- 2021-10-30 01:39:16,472 INFO MainThread: Started query 
ac404aa17515170d:cbd45efd
-- getting 

[jira] [Created] (IMPALA-10999) Flakiness in TestAsyncLoadData.test_async_load

2021-11-01 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-10999:
---

 Summary: Flakiness in TestAsyncLoadData.test_async_load
 Key: IMPALA-10999
 URL: https://issues.apache.org/jira/browse/IMPALA-10999
 Project: IMPALA
  Issue Type: Bug
Reporter: Bikramjeet Vig
Assignee: Qifan Chen


This test failed in one of the recent GVO builds.
[Link|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/15097/testReport/junit/metadata.test_load/TestAsyncLoadData/test_async_load_enable_async_load_data_execution__False___protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/]

 
{noformat}
Error Message
metadata/test_load.py:197: in test_async_load assert(exec_end_state == 
finished_state) E   assert 3 == 4
Stacktrace
metadata/test_load.py:197: in test_async_load
assert(exec_end_state == finished_state)
E   assert 3 == 4
Standard Error
SET 
client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
-- connecting to: localhost:21000
-- connecting to localhost:21050 with impyla
-- 2021-10-30 01:38:55,203 INFO MainThread: Closing active operation
-- connecting to localhost:28000 with impyla
-- 2021-10-30 01:38:55,237 INFO MainThread: Closing active operation
SET 
client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
SET sync_ddl=False;
-- executing against localhost:21000

DROP DATABASE IF EXISTS `test_async_load_ff1c20a7` CASCADE;

-- 2021-10-30 01:38:55,281 INFO MainThread: Started query 
df43a0ff6165a9eb:33b0d69f
SET 
client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
SET sync_ddl=False;
-- executing against localhost:21000

CREATE DATABASE `test_async_load_ff1c20a7`;

-- 2021-10-30 01:39:01,148 INFO MainThread: Started query 
e64bd28a97339b44:e76523a8
-- 2021-10-30 01:39:01,253 INFO MainThread: Created database 
"test_async_load_ff1c20a7" for test ID 
"metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:
 False | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
text/none]"
SET 
client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
-- connecting to: localhost:21000
-- executing against localhost:21000

create table test_async_load_ff1c20a7.test_load_nopart_beeswax_False like 
functional.alltypesnopart location 
'/test-warehouse/test_load_staging_beeswax_False';

-- 2021-10-30 01:39:09,435 INFO MainThread: Started query 
e543635533874c9e:fe238ca9
-- executing against localhost:21000

select count(*) from test_async_load_ff1c20a7.test_load_nopart_beeswax_False;

-- 2021-10-30 01:39:13,178 INFO MainThread: Started query 
5c4969e81b1b614b:26754a22
SET 
client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
-- executing against localhost:21000

use functional;

-- 2021-10-30 01:39:13,413 INFO MainThread: Started query 
d340e3650cba2d6f:a35a14bb
SET 
client_identifier=metadata/test_load.py::TestAsyncLoadData::()::test_async_load[enable_async_load_data_execution:False|protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node;
SET batch_size=0;
SET num_nodes=0;
SET disable_codegen_rows_threshold=0;
SET disable_codegen=False;
SET abort_on_error=1;
SET enable_async_load_data_execution=False;
SET debug_action=CRS_DELAY_BEFORE_LOAD_DATA:SLEEP@3000;
SET exec_single_node_rows_threshold=0;
-- executing async: localhost:21000

load data inpath '/test-warehouse/test_load_staging_beeswax_False'   
into table 

[jira] [Closed] (IMPALA-10978) Add a developer query option MEM_LIMIT_COORDINATOR

2021-10-22 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig closed IMPALA-10978.
---
Resolution: Duplicate

> Add a developer query option MEM_LIMIT_COORDINATOR
> --
>
> Key: IMPALA-10978
> URL: https://issues.apache.org/jira/browse/IMPALA-10978
> Project: IMPALA
>  Issue Type: Task
>Reporter: Bikramjeet Vig
>Priority: Major
>
> Add a developer query option MEM_LIMIT_COORDINATOR to independently control
> the mem_limit applied to the coordinator. This would be similar in
> functionality to MEM_LIMIT_EXECUTORS added in IMPALA-8928.

[jira] [Created] (IMPALA-10981) Re-visit memory estimates for HdfsTableSink

2021-10-22 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-10981:
---

 Summary: Re-visit memory estimates for HdfsTableSink
 Key: IMPALA-10981
 URL: https://issues.apache.org/jira/browse/IMPALA-10981
 Project: IMPALA
  Issue Type: Improvement
Reporter: Bikramjeet Vig


There have been cases where the memory estimate for the table sink was off by
10x. Re-visit the implementation and look at factors like the input expressions
and the underlying storage layer to find any discrepancies in the current
estimation logic.

[jira] [Created] (IMPALA-10978) Add a developer query option MEM_LIMIT_COORDINATOR

2021-10-20 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-10978:
---

 Summary: Add a developer query option MEM_LIMIT_COORDINATOR
 Key: IMPALA-10978
 URL: https://issues.apache.org/jira/browse/IMPALA-10978
 Project: IMPALA
  Issue Type: Task
Reporter: Bikramjeet Vig


Add a developer query option MEM_LIMIT_COORDINATOR to independently control the
mem_limit applied to the coordinator. This would be similar in functionality to
MEM_LIMIT_EXECUTORS added in IMPALA-8928.

[jira] [Updated] (IMPALA-10942) Fix memory leak in admission controller

2021-09-30 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-10942:

Priority: Critical  (was: Major)

> Fix memory leak in admission controller
> ---
>
> Key: IMPALA-10942
> URL: https://issues.apache.org/jira/browse/IMPALA-10942
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Critical
>
> [https://github.com/apache/impala/blob/1a61a8025c87c37921a1bba4c49f754d8bd10bcc/be/src/scheduling/admission-controller.cc#L1196]
> The intent of the aforementioned line in the code was to check and remove the
> queue_node if the query was not queued. Instead it ends up checking whether
> the flag pointer is null or not. Since a valid pointer is always passed, the
> queue_node is never removed, resulting in a memory leak.

[jira] [Created] (IMPALA-10943) Add additional test to verify support for multiple executor groups that map to different resource groups

2021-09-30 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-10943:
---

 Summary: Add additional test to verify support for multiple 
executor groups that map to different resource groups 
 Key: IMPALA-10943
 URL: https://issues.apache.org/jira/browse/IMPALA-10943
 Project: IMPALA
  Issue Type: Task
Reporter: Bikramjeet Vig
Assignee: Bikramjeet Vig

[jira] [Created] (IMPALA-10942) Fix memory leak in admission controller

2021-09-30 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-10942:
---

 Summary: Fix memory leak in admission controller
 Key: IMPALA-10942
 URL: https://issues.apache.org/jira/browse/IMPALA-10942
 Project: IMPALA
  Issue Type: Bug
Reporter: Bikramjeet Vig
Assignee: Bikramjeet Vig


[https://github.com/apache/impala/blob/1a61a8025c87c37921a1bba4c49f754d8bd10bcc/be/src/scheduling/admission-controller.cc#L1196]

The intent of the aforementioned line in the code was to check and remove the
queue_node if the query was not queued. Instead it ends up checking whether the
flag pointer is null or not. Since a valid pointer is always passed, the
queue_node is never removed, resulting in a memory leak.

[jira] [Commented] (IMPALA-10877) test_admission_control_with_multiple_coords fails due to an assert

2021-08-23 Thread Bikramjeet Vig (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403439#comment-17403439
 ] 

Bikramjeet Vig commented on IMPALA-10877:
-

Looked at the logs; it seems a scan fragment was stuck, probably due to the
sleep in the query, and was taking up a minimal amount of memory on one of the
executors. This prevented the query from being admitted, as the mem_limit was
exactly equal to the process mem_limit. *To fix this flakiness*, we can reduce
the mem_limit to something slightly smaller so that it still only lets a single
query run at a time but leaves enough slack for a stuck fragment to finish up
before getting cancelled.

Attempt to admit query:
{noformat}
f446176d7cdc1a47:960d3634] Stats: agg_num_running=0, agg_num_queued=0, 
agg_mem_reserved=4.94 MB,  local_host(local_mem_admitted=0, 
num_admitted_running=0, num_queued=0, backend_mem_reserved=4.00 MB, 
topN_query_stats: queries=[c54c69cff22d2536:aae16d2c], 
total_mem_consumed=4.00 MB, fraction_of_pool_total_mem=1; pool_level_stats: 
num_running=1, min=4.00 MB, max=4.00 MB, pool_total_mem=4.00 MB, 
average_per_query=4.00 MB)
{noformat}
c54c69cff22d2536:aae16d2c is the stuck query, taking up only 4 MB of memory.
Hence we get the queued reason as follows:
{noformat}
Could not dequeue query id=f446176d7cdc1a47:960d3634 reason: Not enough 
memory available on host 
impala-ec2-centos74-m5-4xlarge-ondemand-1daa.vpc.cloudera.com:27003. Needed 
4.00 GB but only 4.00 GB out of 4.00 GB was available.
{noformat}
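Note that the rounded log output hides the deficit: with the stuck query
holding 4 MB, only 4.00 GB minus 4 MB is actually available. A minimal sketch
of the proposed test fix (the slack value is invented for illustration):
{noformat}
#include <cstdint>

// Set the test's query mem_limit slightly below the process mem_limit so a
// stuck fragment's few MB cannot block admission.
constexpr int64_t kProcessMemLimit = 4LL * 1024 * 1024 * 1024;  // 4 GB
constexpr int64_t kSlack = 64LL * 1024 * 1024;                  // 64 MB (invented)
constexpr int64_t kQueryMemLimit = kProcessMemLimit - kSlack;

// Two such queries still cannot be admitted together (2 * kQueryMemLimit
// exceeds the process limit), so the single-query property is preserved.
{noformat}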

> test_admission_control_with_multiple_coords fails due to an assert
> --
>
> Key: IMPALA-10877
> URL: https://issues.apache.org/jira/browse/IMPALA-10877
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Abhishek Rawat
>Assignee: Bikramjeet Vig
>Priority: Major
>
> The testcase fails due to following assert:
> {code:java}
> custom_cluster/test_executor_groups.py:579: in 
> test_admission_control_with_multiple_coords
> "admission-controller.agg-num-running.default-pool", 1, timeout=30)
> common/impala_service.py:143: in wait_for_metric_value
> self.__metric_timeout_assert(metric_name, expected_value, timeout)
> common/impala_service.py:210: in __metric_timeout_assert
> assert 0, assert_string
> E   AssertionError: Metric admission-controller.agg-num-running.default-pool 
> did not reach value 1 in 30s.
> E   Dumping debug webpages in JSON format...
> E   Dumped memz JSON to 
> $IMPALA_HOME/logs/metric_timeout_diags_20210819_21:39:45/json/memz.json
> E   Dumped metrics JSON to 
> $IMPALA_HOME/logs/metric_timeout_diags_20210819_21:39:45/json/metrics.json
> E   Dumped queries JSON to 
> $IMPALA_HOME/logs/metric_timeout_diags_20210819_21:39:45/json/queries.json
> E   Dumped sessions JSON to 
> $IMPALA_HOME/logs/metric_timeout_diags_20210819_21:39:45/json/sessions.json
> E   Dumped threadz JSON to 
> $IMPALA_HOME/logs/metric_timeout_diags_20210819_21:39:45/json/threadz.json
> E   Dumped rpcz JSON to 
> $IMPALA_HOME/logs/metric_timeout_diags_20210819_21:39:45/json/rpcz.json
> E   Dumping minidumps for impalads/catalogds...
> E   Dumped minidump for Impalad PID 8103
> E   Dumped minidump for Impalad PID 8106
> E   Dumped minidump for Impalad PID 10328
> E   Dumped minidump for Impalad PID 10331
> E   Dumped minidump for Catalogd PID 8041
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-10866) Ensure consistency between failure detection and registration/Ack of a coordinator by the admission service

2021-08-17 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-10866:

Description: 
Ensure consistency between failure detection and registration/Ack of a
coordinator by the admission service.
 Currently the admission service utilizes the statestore membership updates to
detect a coordinator going down, but it still services RPCs from that
coordinator if it is still up and able to contact the admission service.
 Using the current mechanisms of statestore updates (IMPALA-10594), admission
heartbeats (IMPALA-10590, IMPALA-10720) and coordinator registration
(IMPALA-9976), ensure that consistency is maintained between these mechanisms.
 A possible implementation is:
 - Use statestore as the only source of truth.
 ** Consistency: Only allow a coord to register if it is registered with the
statestore
 ** Atomicity: If the statestore update signals that a coord is down, remove
all its state (running and queued queries) before you allow it to register
again
 OR
 -Eventual consistency: We remove queries between subsequent statestore
updates, and if the coord comes back up and sends the full admission state, we
can update the state of that query id if it has not been removed yet (since
the full admission state only contains running queries)- Can't use this
because only changes to the membership initiate the query removal process,
which would only happen once if a coord is removed.

  was:
Ensure consistency between failure detection and registration/Ack of a
coordinator by the admission service.
 Currently the admission service utilizes the statestore membership updates to
detect a coordinator going down, but it still services RPCs from that
coordinator if it is still up and able to contact the admission service.
 Using the current mechanisms of statestore updates (IMPALA-10594), admission
heartbeats (IMPALA-10590, IMPALA-10720) and coordinator registration
(IMPALA-9976), ensure that consistency is maintained between these mechanisms.
 A possible implementation is:
 - Use statestore as the only source of truth.
 ** Consistency: Only allow a coord to register if it is registered with the
statestore
 ** Atomicity: If the statestore update signals that a coord is down, remove
all its state (running and queued queries) before you allow it to register
again
OR
Eventual consistency: We remove queries between subsequent statestore updates,
and if the coord comes back up and sends the full admission state, we can
update the state of that query id if it has not been removed yet (since the
full admission state only contains running queries)


> Ensure consistency between failure detection and registration/Ack of a 
> coordinator by the admission service
> ---
>
> Key: IMPALA-10866
> URL: https://issues.apache.org/jira/browse/IMPALA-10866
> Project: IMPALA
>  Issue Type: Sub-task
>Affects Versions: Impala 4.0.0
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Major
>
> Ensure consistency between failure detection and registration/Ack of a
> coordinator by the admission service.
>  Currently the admission service utilizes the statestore membership updates
> to detect a coordinator going down, but it still services RPCs from that
> coordinator if it is still up and able to contact the admission service.
>  Using the current mechanisms of statestore updates (IMPALA-10594),
> admission heartbeats (IMPALA-10590, IMPALA-10720) and coordinator
> registration (IMPALA-9976), ensure that consistency is maintained between
> these mechanisms.
>  A possible implementation is:
>  - Use statestore as the only source of truth.
>  ** Consistency: Only allow a coord to register if it is registered with the
> statestore
>  ** Atomicity: If the statestore update signals that a coord is down, remove
> all its state (running and queued queries) before you allow it to register
> again
>  OR
>  -Eventual consistency: We remove queries between subsequent statestore
> updates, and if the coord comes back up and sends the full admission state,
> we can update the state of that query id if it has not been removed yet
> (since the full admission state only contains running queries)- Can't use
> this because only changes to the membership initiate the query removal
> process, which would only happen once if a coord is removed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-10866) Ensure consistency between failure detection and registration/Ack of a coordinator by the admission service

2021-08-17 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-10866:

Description: 
Ensure consistency between failure detection and registration/Ack of a
coordinator by the admission service.
 Currently the admission service utilizes the statestore membership updates to
detect a coordinator going down, but it still services RPCs from that
coordinator if it is still up and able to contact the admission service.
 Using the current mechanisms of statestore updates (IMPALA-10594), admission
heartbeats (IMPALA-10590, IMPALA-10720) and coordinator registration
(IMPALA-9976), ensure that consistency is maintained between these mechanisms.
 A possible implementation is:
 - Use statestore as the only source of truth.
 ** Consistency: Only allow a coord to register if it is registered with the
statestore
 ** Atomicity: If the statestore update signals that a coord is down, remove
all its state (running and queued queries) before you allow it to register
again
OR
Eventual consistency: We remove queries between subsequent statestore updates,
and if the coord comes back up and sends the full admission state, we can
update the state of that query id if it has not been removed yet (since the
full admission state only contains running queries)

  was:
Ensure consistency between failure detection and registration/Ack of a
coordinator by the admission service.
 Currently the admission service utilizes the statestore membership updates to
detect a coordinator going down, but it still services RPCs from that
coordinator if it is still up and able to contact the admission service.
 Using the current mechanisms of statestore updates (IMPALA-10594), admission
heartbeats (IMPALA-10590, IMPALA-10720) and coordinator registration
(IMPALA-9976), ensure that consistency is maintained between these mechanisms.
 A possible implementation is:
 - Use statestore as the only source of truth.
 ** Consistency: Only allow a coord to register if it is registered with the
statestore
 ** Atomicity: If the statestore update signals that a coord is down, remove
all its state (running and queued queries) before you allow it to register
again


> Ensure consistency between failure detection and registration/Ack of a 
> coordinator by the admission service
> ---
>
> Key: IMPALA-10866
> URL: https://issues.apache.org/jira/browse/IMPALA-10866
> Project: IMPALA
>  Issue Type: Sub-task
>Affects Versions: Impala 4.0.0
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Major
>
> Ensure consistency between failure detection and registration/Ack of a
> coordinator by the admission service.
>  Currently the admission service utilizes the statestore membership updates
> to detect a coordinator going down, but it still services RPCs from that
> coordinator if it is still up and able to contact the admission service.
>  Using the current mechanisms of statestore updates (IMPALA-10594),
> admission heartbeats (IMPALA-10590, IMPALA-10720) and coordinator
> registration (IMPALA-9976), ensure that consistency is maintained between
> these mechanisms.
>  A possible implementation is:
>  - Use statestore as the only source of truth.
>  ** Consistency: Only allow a coord to register if it is registered with the
> statestore
>  ** Atomicity: If the statestore update signals that a coord is down, remove
> all its state (running and queued queries) before you allow it to register
> again
> OR
> Eventual consistency: We remove queries between subsequent statestore
> updates, and if the coord comes back up and sends the full admission state,
> we can update the state of that query id if it has not been removed yet
> (since the full admission state only contains running queries)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-10866) Ensure consistency between failure detection and registration/Ack of a coordinator by the admission service

2021-08-17 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-10866:
---

 Summary: Ensure consistency between failure detection and 
registration/Ack of a coordinator by the admission service
 Key: IMPALA-10866
 URL: https://issues.apache.org/jira/browse/IMPALA-10866
 Project: IMPALA
  Issue Type: Sub-task
Affects Versions: Impala 4.0.0
Reporter: Bikramjeet Vig
Assignee: Bikramjeet Vig


Ensure consistency between failure detection and registration/Ack of a
coordinator by the admission service.
 Currently the admission service utilizes the statestore membership updates to
detect a coordinator going down, but it still services RPCs from that
coordinator if it is still up and able to contact the admission service.
 Using the current mechanisms of statestore updates (IMPALA-10594), admission
heartbeats (IMPALA-10590, IMPALA-10720) and coordinator registration
(IMPALA-9976), ensure that consistency is maintained between these mechanisms.
 A possible implementation is:
 - Use statestore as the only source of truth.
 ** Consistency: Only allow a coord to register if it is registered with the
statestore
 ** Atomicity: If the statestore update signals that a coord is down, remove
all its state (running and queued queries) before you allow it to register
again



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-10720) Add versioning to admission heartbeats

2021-08-17 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-10720.
-
Fix Version/s: Impala 4.1.0
   Resolution: Fixed

> Add versioning to admission heartbeats
> --
>
> Key: IMPALA-10720
> URL: https://issues.apache.org/jira/browse/IMPALA-10720
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 4.0.0
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Major
> Fix For: Impala 4.1.0
>
>
> IMPALA-10590 added the mechanism of syncing currently running queries with
> the admission service by sending a list of running query ids. An
> out-of-order heartbeat can cause new queries to fail, as they might not be a
> part of that heartbeat. This can be fixed by keeping track of the latest
> heartbeat version that was processed.
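
A minimal sketch of that versioning guard (hypothetical names):
{code:python}
last_heartbeat_version = {}  # coord_id -> highest heartbeat version processed

def sync_running_queries(coord_id, running_query_ids):
    pass  # placeholder for the actual admission-state sync

def process_heartbeat(coord_id, version, running_query_ids):
    # Drop out-of-order heartbeats so a stale running-query list cannot
    # clobber state for queries admitted after that heartbeat was sent.
    if version <= last_heartbeat_version.get(coord_id, -1):
        return
    last_heartbeat_version[coord_id] = version
    sync_running_queries(coord_id, running_query_ids)
{code}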



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-8762) Track number of running queries on all backends in admission controller

2021-07-28 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-8762.

Fix Version/s: Impala 4.1.0
   Resolution: Fixed

> Track number of running queries on all backends in admission controller
> ---
>
> Key: IMPALA-8762
> URL: https://issues.apache.org/jira/browse/IMPALA-8762
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.3.0
>Reporter: Lars Volker
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: admission-control, scalability
> Fix For: Impala 4.1.0
>
>
> To support running multiple coordinators with executor groups and slot based 
> admission checks, all executors need to include the number of currently 
> running queries in their statestore updates, similar to mem reserved.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-8762) Track number of running queries on all backends in admission controller

2021-07-13 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig reassigned IMPALA-8762:
--

Assignee: Bikramjeet Vig

> Track number of running queries on all backends in admission controller
> ---
>
> Key: IMPALA-8762
> URL: https://issues.apache.org/jira/browse/IMPALA-8762
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.3.0
>Reporter: Lars Volker
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: admission-control, scalability
>
> To support running multiple coordinators with executor groups and slot based 
> admission checks, all executors need to include the number of currently 
> running queries in their statestore updates, similar to mem reserved.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-10783) run_and_verify_query_cancellation_test flakiness and improper error handling in TestImpalaShell

2021-07-08 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-10783:
---

 Summary: run_and_verify_query_cancellation_test flakiness and 
improper error handling in TestImpalaShell
 Key: IMPALA-10783
 URL: https://issues.apache.org/jira/browse/IMPALA-10783
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 4.0
Reporter: Bikramjeet Vig
Assignee: Bikramjeet Vig


Some tests in TestImpalaShell run impala-shell in a separate process but don't
handle the case where the test fails and the impala-shell process lingers on.

One such test, run_and_verify_query_cancellation_test, failed due to
flakiness, and since it ran a query that returned a large result, the
impala-shell process lingered on while fetching results. This caused the query
to hold on to resources and starve the cluster of memory, which caused other
tests to fail because not enough memory was available.

The flakiness in run_and_verify_query_cancellation_test was:
{noformat}
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/shell/test_shell_commandline.py:414:
 in test_query_cancellation_during_wait_to_finish
self.run_and_verify_query_cancellation_test(vector, stmt, "RUNNING")
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/shell/test_shell_commandline.py:422:
 in run_and_verify_query_cancellation_test
wait_for_query_state(vector, stmt, cancel_at_state)
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/shell/util.py:330:
 in wait_for_query_state
raise Exception(exc_text)
E   Exception: The found in flight query is not the one under test: set all
{noformat}
The test checked for running queries too quickly while impala-shell was still
starting up. impala-shell runs "set all" when it starts, which the test picked
up, raising an error because the in-flight query it found was not the one
under test.
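
A hedged sketch of the corresponding test-side guard (invented names, not the
actual harness code):
{code:python}
SHELL_STARTUP_STMTS = {"set all"}  # statements impala-shell issues on startup

def find_query_under_test(in_flight_queries, stmt):
    # Skip the shell's own startup statements instead of treating the first
    # in-flight query found as the one under test.
    for q in in_flight_queries:
        if q["stmt"].strip().lower() in SHELL_STARTUP_STMTS:
            continue
        if q["stmt"] == stmt:
            return q
    return None  # caller keeps polling until the statement shows up
{code}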

This lingering query caused other tests to fail with errors like:

{noformat}
query_test/test_tpcds_queries.py:107: in test_tpcds_q18a
self.run_test_case(self.get_workload() + '-q18a', vector)
common/impala_test_suite.py:678: in run_test_case
result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
common/impala_test_suite.py:616: in __exec_in_impala
result = self.__execute_query(target_impalad_client, query, user=user)
common/impala_test_suite.py:936: in __execute_query
return impalad_client.execute(query, user=user)
common/impala_connection.py:205: in execute
return self.__beeswax_client.execute(sql_stmt, user=user)
beeswax/impala_beeswax.py:189: in execute
handle = self.__execute_query(query_string.strip(), user=user)
beeswax/impala_beeswax.py:367: in __execute_query
self.wait_for_finished(handle)
beeswax/impala_beeswax.py:388: in wait_for_finished
raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
E   ImpalaBeeswaxException: ImpalaBeeswaxException:
EQuery aborted:Failed to get minimum memory reservation of 452.19 MB on 
daemon impala-ec2-centos74-m5-4xlarge-ondemand-191d.vpc.cloudera.com:27002 for 
query 394b7f96d554f99c:6882496c due to following error: Failed to 
increase reservation by 452.19 MB because it would exceed the applicable 
reservation limit for the "Process" ReservationTracker: reservation_limit=10.20 
GB reservation=9.91 GB used_reservation=0 child_reservations=9.91 GB
E   The top 5 queries that allocated memory under this tracker are:
E   Query(fa4ece9474a3f865:1b284e67): Reservation=9.60 GB 
ReservationLimit=9.60 GB OtherMemory=118.01 MB Total=9.71 GB Peak=9.71 GB
E   Query(534d07950247ae68:6f5a410d): Reservation=123.50 MB 
ReservationLimit=9.60 GB OtherMemory=2.68 MB Total=126.18 MB Peak=317.02 MB
E   Query(2e4f087aa8263e23:e697d8e8): Reservation=50.81 MB 
ReservationLimit=9.60 GB OtherMemory=42.62 MB Total=93.43 MB Peak=173.74 MB
E   Query(6e459d892dfa5050:5959219b): Reservation=28.88 MB 
ReservationLimit=9.60 GB OtherMemory=18.77 MB Total=47.64 MB Peak=53.11 MB
E   Query(ad455bea2e0adc64:2b0bbf35): Reservation=17.94 MB 
ReservationLimit=9.60 GB OtherMemory=15.22 MB Total=33.16 MB Peak=163.99 MB
E   
E   
E   
E   
E   
E   Memory is likely oversubscribed. Reducing query concurrency or configuring 
admission control may help avoid this error.
{noformat}

Logs confirmed that fa4ece9474a3f865:1b284e67 is the query id of the 
query that run_and_verify_query_cancellation_test ran.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-10767) Fix handling of queued queries for coordinator failure modes and during cancellation

2021-06-24 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-10767:

Description: 
IMPALA-10594 and IMPALA-10590 do not ensure that queued queries are removed
from the admission-controller and admission_state_map_. A situation can arise
where the coordinator that got killed did not get a chance to call
GetQueryStatus(), which calls WaitOnQueued() for queued queries. This results
in a memory leak where the queue_node in the admission-controller and the
admission_state in admission_state_map_ are never removed.
 Moreover, queued queries can get into an undesirable state: if the failed
coord is not in the cluster_membership, the query will stay in the queue
indefinitely, as it would keep hitting the unable-to-dequeue condition where
the coordinator is not registered in the cluster_membership yet.

Another undesirable condition can arise for queued queries that were canceled:
these never get removed from the admission_state_map_, as entries in it are
only removed when a running query is released, running queries are synced via
admission heartbeat, and all running queries are removed when the coordinator
goes down. (running queries refers to the queries that have been successfully
admitted)

  was:
IMPALA-10594 and IMPALA-10590 do not ensure that queued queries are removed
from the admission-controller and admission_state_map_. A situation can arise
where the coordinator that got killed did not get a chance to call
GetQueryStatus(), which calls WaitOnQueued() for queued queries. This results
in a memory leak where the queue_node in the admission-controller and the
admission_state in admission_state_map_ are never removed.
Moreover, queued queries can get into an undesirable state: if the failed
coord is not in the cluster_membership, the query will stay in the queue
indefinitely, as it would keep hitting the unable-to-dequeue condition where
the coordinator is not registered in the cluster_membership yet.

Another undesirable condition can arise for queued queries that were canceled:
these never get removed from the admission_state_map_, as entries in it are
only removed when a running query is released, running queries are synced via
admission heartbeat, and all running queries are removed when the coordinator
goes down.


> Fix handling of queued queries for coordinator failure modes and during 
> cancellation
> 
>
> Key: IMPALA-10767
> URL: https://issues.apache.org/jira/browse/IMPALA-10767
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Major
>
> IMPALA-10594 and IMPALA-10590 do not ensure that queued queries are removed
> from the admission-controller and admission_state_map_. A situation can
> arise where the coordinator that got killed did not get a chance to call
> GetQueryStatus(), which calls WaitOnQueued() for queued queries. This
> results in a memory leak where the queue_node in the admission-controller
> and the admission_state in admission_state_map_ are never removed.
>  Moreover, queued queries can get into an undesirable state: if the failed
> coord is not in the cluster_membership, the query will stay in the queue
> indefinitely, as it would keep hitting the unable-to-dequeue condition where
> the coordinator is not registered in the cluster_membership yet.
> Another undesirable condition can arise for queued queries that were
> canceled: these never get removed from the admission_state_map_, as entries
> in it are only removed when a running query is released, running queries are
> synced via admission heartbeat, and all running queries are removed when the
> coordinator goes down. (running queries refers to the queries that have been
> successfully admitted)
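
A minimal sketch of the cleanup this calls for (hypothetical names):
{code:python}
queued_nodes = {}      # query_id -> queue node
admission_states = {}  # query_id -> admission state

def on_coordinator_down(coord_id, queries_by_coord):
    # Remove queued queries too, not just admitted ones, so a coordinator
    # that dies before WaitOnQueued() runs cannot leak queue nodes.
    for qid in queries_by_coord.pop(coord_id, set()):
        queued_nodes.pop(qid, None)
        admission_states.pop(qid, None)

def on_queued_query_cancelled(query_id):
    # Cancelled-while-queued queries must be dropped from both maps as well.
    queued_nodes.pop(query_id, None)
    admission_states.pop(query_id, None)
{code}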



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-9976) Implement recovery for the admission control service

2021-06-24 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig reassigned IMPALA-9976:
--

Assignee: Bikramjeet Vig

> Implement recovery for the admission control service
> 
>
> Key: IMPALA-9976
> URL: https://issues.apache.org/jira/browse/IMPALA-9976
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Reporter: Thomas Tauber-Marshall
>Assignee: Bikramjeet Vig
>Priority: Major
>
> If the new admission control daemon fails, it would be good to be able to 
> recover gracefully.
> Most of the admission control state is already stored in the statestore, so 
> it should be possible to launch a new admission control daemon and have it 
> catch up by retrieving everything from the statestore. Already running 
> queries should not be affected.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-10767) Fix handling of queued queries for coordinator failure modes and during cancellation

2021-06-24 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-10767:
---

 Summary: Fix handling of queued queries for coordinator failure 
modes and during cancellation
 Key: IMPALA-10767
 URL: https://issues.apache.org/jira/browse/IMPALA-10767
 Project: IMPALA
  Issue Type: Sub-task
Reporter: Bikramjeet Vig
Assignee: Bikramjeet Vig


IMPALA-10594 and IMPALA-10590 do not ensure that queued queries are removed
from the admission-controller and admission_state_map_. A situation can arise
where the coordinator that got killed did not get a chance to call
GetQueryStatus(), which calls WaitOnQueued() for queued queries. This results
in a memory leak where the queue_node in the admission-controller and the
admission_state in admission_state_map_ are never removed.
Moreover, queued queries can get into an undesirable state: if the failed
coord is not in the cluster_membership, the query will stay in the queue
indefinitely, as it would keep hitting the unable-to-dequeue condition where
the coordinator is not registered in the cluster_membership yet.

Another undesirable condition can arise for queued queries that were canceled:
these never get removed from the admission_state_map_, as entries in it are
only removed when a running query is released, running queries are synced via
admission heartbeat, and all running queries are removed when the coordinator
goes down.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-10720) Add versioning to admission heartbeats

2021-05-26 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-10720:
---

 Summary: Add versioning to admission heartbeats
 Key: IMPALA-10720
 URL: https://issues.apache.org/jira/browse/IMPALA-10720
 Project: IMPALA
  Issue Type: Sub-task
  Components: Backend
Affects Versions: Impala 4.0
Reporter: Bikramjeet Vig
Assignee: Bikramjeet Vig


IMPALA-10590 added the mechanism of syncing currently running queries with the
admission service by sending a list of running query ids. An out-of-order
heartbeat can cause new queries to fail, as they might not be a part of that
heartbeat. This can be fixed by keeping track of the latest heartbeat version
that was processed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IMPALA-10596) TestAdmissionControllerStressWithACService.test_mem_limit fails with "Invalid or unknown query handle" when canceling a query

2021-04-05 Thread Bikramjeet Vig (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315056#comment-17315056
 ] 

Bikramjeet Vig commented on IMPALA-10596:
-

The previous [patch|http://gerrit.cloudera.org:8080/17256] finally allowed the 
actual error to be logged.
{noformat}
custom_cluster/test_admission_controller.py:2078: in test_mem_limit
{'request_pool': self.pool_name, 'mem_limit': query_mem_limit})
custom_cluster/test_admission_controller.py:1936: in run_admission_test
assert metric_deltas['dequeued'] == 0,\
E   AssertionError: Queued queries should not run until others are made to 
finish
E   assert 4 == 0
{noformat}
As I suspected, after turning on result spooling by default, the queries are
finishing early and releasing resources, which allows queued queries to run.
This test uses a long-running query that returns a lot of rows in order to
control how long resources are held, and result spooling interferes with that
control. Since adapting this to result spooling would not add any test
coverage benefit, we can just turn it off for these stress tests.
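
A sketch of what disabling it might look like in the test (the option plumbing
is assumed; SPOOL_QUERY_RESULTS is the query option that controls result
spooling):
{code:python}
# Assumed harness shape: extend the query options the stress test already
# passes (see the stack trace above) with result spooling disabled.
query_options = {
    'request_pool': 'default-pool',   # placeholder pool name
    'mem_limit': '2g',                # placeholder limit
    'spool_query_results': 'false',   # opt out of result spooling
}
{code}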

> TestAdmissionControllerStressWithACService.test_mem_limit fails with "Invalid 
> or unknown query handle" when canceling a query
> -
>
> Key: IMPALA-10596
> URL: https://issues.apache.org/jira/browse/IMPALA-10596
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Andrew Sherman
>Assignee: Bikramjeet Vig
>Priority: Critical
>
> TestAdmissionControllerStress.test_mem_limit fails similarly
> {code}
> custom_cluster/test_admission_controller.py:1437: in teardown
> client.cancel(thread.query_handle)
> common/impala_connection.py:215: in cancel
> return self.__beeswax_client.cancel_query(operation_handle.get_handle())
> beeswax/impala_beeswax.py:369: in cancel_query
> return self.__do_rpc(lambda: self.imp_service.Cancel(query_id))
> beeswax/impala_beeswax.py:520: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION: 
> EMESSAGE: Invalid or unknown query handle: 
> 174962332188aac2:1713d0fe.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-10397) TestAutoScaling.test_single_workload failed in exhaustive release build

2021-03-31 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-10397.
-
Resolution: Fixed

> TestAutoScaling.test_single_workload failed in exhaustive release build
> ---
>
> Key: IMPALA-10397
> URL: https://issues.apache.org/jira/browse/IMPALA-10397
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Zoltán Borók-Nagy
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: broken-build
> Fix For: Impala 4.0
>
>
> TestAutoScaling.test_single_workload failed in an exhaustive release build.
> *Error details*
> AssertionError: Number of backends did not reach 5 within 45 s assert 
> any(<generator object <genexpr> at 0x7f772c155e10>)
> *Stack trace*
> {noformat}
> custom_cluster/test_auto_scaling.py:95: in test_single_workload
>  assert any(self._get_num_backends() >= cluster_size or sleep(1)
> E AssertionError: Number of backends did not reach 5 within 45 s
> E assert any(<generator object <genexpr> at 0x7f772c155e10>){noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (IMPALA-10596) TestAdmissionControllerStressWithACService.test_mem_limit fails with "Invalid or unknown query handle" when canceling a query

2021-03-31 Thread Bikramjeet Vig (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17311963#comment-17311963
 ] 

Bikramjeet Vig edited comment on IMPALA-10596 at 3/31/21, 5:56 PM:
---

From the logs it seems like the client running the query was closed earlier
than the teardown() method. If the client is closed, an UnregisterQuery() is
triggered, which is why, when the teardown issues a cancel on that query_id,
it hits the aforementioned error.
The last part of the log is always "MainThread: Found all 8 admitted threads
after 0.0 seconds".
I think what might be happening here is that it hits an exception which causes
the SubmitQueryThread to exit and close the client in its _finally_ block
before teardown gets a chance to cancel the query. We also don't get to see
the log lines printed before the client is closed; I'm not sure if that is
because the logger does not get a chance to flush before the process dies.

Filtered logs from one of the failed builds:
{noformat}
19643:I0317 04:04:55.293568 21378 impala-server.cc:1388] UnregisterQuery(): 
query_id=6a4a4ca00aeb3326:0eb710ec
19644:I0317 04:04:55.293584 21378 coordinator.cc:706] ExecState: query 
id=6a4a4ca00aeb3326:0eb710ec execution cancelled
19645:I0317 04:04:55.293594 21378 coordinator-backend-state.cc:974] 
query_id=6a4a4ca00aeb3326:0eb710ec target backend=127.0.0.1:27000: 
Sending CancelQueryFInstances rpc
19668:I0317 04:04:55.300276 20592 control-service.cc:219] 
CancelQueryFInstances(): query_id=6a4a4ca00aeb3326:0eb710ec
19669:I0317 04:04:55.300292 20592 query-exec-mgr.cc:126] QueryState: 
query_id=6a4a4ca00aeb3326:0eb710ec refcnt=5
19670:I0317 04:04:55.300297 20592 query-state.cc:974] Cancel: 
query_id=6a4a4ca00aeb3326:0eb710ec
19671:I0317 04:04:55.300305 20592 krpc-data-stream-mgr.cc:339] cancelling 
active streams for query_id=6a4a4ca00aeb3326:0eb710ec
19672:I0317 04:04:55.300380 21466 krpc-data-stream-mgr.cc:308] 
6a4a4ca00aeb3326:0eb710ec] DeregisterRecvr(): 
fragment_instance_id=6a4a4ca00aeb3326:0eb710ec, node=31
19673:I0317 04:04:55.300405 21378 coordinator-backend-state.cc:974] 
query_id=6a4a4ca00aeb3326:0eb710ec target backend=127.0.0.1:27002: Not 
cancelling because the backend is already done: 
19674:I0317 04:04:55.300420 21378 coordinator-backend-state.cc:974] 
query_id=6a4a4ca00aeb3326:0eb710ec target backend=127.0.0.1:27001: Not 
cancelling because the backend is already done: 
19675:I0317 04:04:55.300426 21378 coordinator.cc:994] CancelBackends() 
query_id=6a4a4ca00aeb3326:0eb710ec, tried to cancel 1 backends
19680:I0317 04:04:55.300539 21378 coordinator.cc:1370] Release admission 
control resources for query_id=6a4a4ca00aeb3326:0eb710ec
19704:I0317 04:04:55.309382 22599 impala-server.cc:1577] Cancel(): 
query_id=6a4a4ca00aeb3326:0eb710ec
19705:I0317 04:04:55.309398 22599 impala-server.cc:1538] Invalid or unknown 
query handle: 6a4a4ca00aeb3326:0eb710ec.
19751:I0317 04:04:55.317633 21466 query-state.cc:957] 
6a4a4ca00aeb3326:0eb710ec] Instance completed. 
instance_id=6a4a4ca00aeb3326:0eb710ec #in-flight=14 status=CANCELLED: 
Cancelled
19752:I0317 04:04:55.317644 21448 query-state.cc:468] 
6a4a4ca00aeb3326:0eb710ec] UpdateBackendExecState(): last report for 
6a4a4ca00aeb3326:0eb710ec
19754:I0317 04:04:55.320560 21464 query-state.cc:957] 
6a4a4ca00aeb3326:0eb710ec0002] Instance completed. 
instance_id=6a4a4ca00aeb3326:0eb710ec0002 #in-flight=12 status=CANCELLED: 
Cancelled
19765:I0317 04:04:55.334306 20592 impala-server.cc:1551] Invalid or unknown 
query handle: 6a4a4ca00aeb3326:0eb710ec.
19766:I0317 04:04:55.334322 20592 control-service.cc:179] ReportExecStatus(): 
Received report for unknown query ID (probably closed or cancelled): 
6a4a4ca00aeb3326:0eb710ec remote host=127.0.0.1:59410
19767:I0317 04:04:55.334400 21448 query-state.cc:738] 
6a4a4ca00aeb3326:0eb710ec] Cancelling fragment instances as directed by 
the coordinator. Returned status: ReportExecStatus(): Received report for 
unknown query ID (probably closed or cancelled): 
6a4a4ca00aeb3326:0eb710ec remote host=127.0.0.1:59410
19768:I0317 04:04:55.334416 21448 query-state.cc:974] 
6a4a4ca00aeb3326:0eb710ec] Cancel: 
query_id=6a4a4ca00aeb3326:0eb710ec
19773:I0317 04:04:55.348507 21253 impala-server.cc:1420] Query successfully 
unregistered: query_id=6a4a4ca00aeb3326:0eb710ec
19775:I0317 04:04:55.376655 21253 query-exec-mgr.cc:213] ReleaseQueryState(): 
deleted query_id=6a4a4ca00aeb3326:0eb710ec
{noformat}

I'll try to simulate this locally and see if I hit the same behavior.


was (Author: bikramjeet.vig):
From the logs it seems like the client running the query was closed earlier
than the teardown() method. If the client is closed an 

[jira] [Commented] (IMPALA-10596) TestAdmissionControllerStressWithACService.test_mem_limit fails with "Invalid or unknown query handle" when canceling a query

2021-03-31 Thread Bikramjeet Vig (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312587#comment-17312587
 ] 

Bikramjeet Vig commented on IMPALA-10596:
-

So I tried to simulate this on my local machine by introducing a failed assert
right after we call self.wait_for_admitted_threads(metric_deltas['admitted'])
and the log line "Found all 6 admitted threads after 0.0 seconds" is printed.
I noticed that teardown is not called right away; instead, all the running
threads just wait in the "while not self.shutdown" loop, and when they error
out of that, the client gets closed and eventually teardown is called, which
then errors out since the query_id was already closed when the client closed.
I'll push out a patch that removes the thread.query_handle if the client is
closed. That should fix this issue *and expose the real assert that was hit
and failed the test*. However, I am still not sure why the test just keeps
waiting for the other threads to finish before calling teardown.
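
A sketch of that patch idea (hypothetical names, mirroring the description
above):
{code:python}
def submit_query_thread_body(thread, client):
    try:
        thread.query_handle = client.execute_async(thread.query)
        # ... the "while not self.shutdown" wait loop ...
    finally:
        # Closing the client unregisters the query server-side, so drop the
        # handle first; teardown() then skips cancelling a dead handle.
        thread.query_handle = None
        client.close()
{code}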

> TestAdmissionControllerStressWithACService.test_mem_limit fails with "Invalid 
> or unknown query handle" when canceling a query
> -
>
> Key: IMPALA-10596
> URL: https://issues.apache.org/jira/browse/IMPALA-10596
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Andrew Sherman
>Assignee: Bikramjeet Vig
>Priority: Critical
>
> TestAdmissionControllerStress.test_mem_limit fails similarly
> {code}
> custom_cluster/test_admission_controller.py:1437: in teardown
> client.cancel(thread.query_handle)
> common/impala_connection.py:215: in cancel
> return self.__beeswax_client.cancel_query(operation_handle.get_handle())
> beeswax/impala_beeswax.py:369: in cancel_query
> return self.__do_rpc(lambda: self.imp_service.Cancel(query_id))
> beeswax/impala_beeswax.py:520: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION: 
> EMESSAGE: Invalid or unknown query handle: 
> 174962332188aac2:1713d0fe.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-10596) TestAdmissionControllerStressWithACService.test_mem_limit fails with "Invalid or unknown query handle" when canceling a query

2021-03-31 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig reassigned IMPALA-10596:
---

Assignee: Bikramjeet Vig  (was: Thomas Tauber-Marshall)

> TestAdmissionControllerStressWithACService.test_mem_limit fails with "Invalid 
> or unknown query handle" when canceling a query
> -
>
> Key: IMPALA-10596
> URL: https://issues.apache.org/jira/browse/IMPALA-10596
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Andrew Sherman
>Assignee: Bikramjeet Vig
>Priority: Critical
>
> TestAdmissionControllerStress.test_mem_limit fails similarly
> {code}
> custom_cluster/test_admission_controller.py:1437: in teardown
> client.cancel(thread.query_handle)
> common/impala_connection.py:215: in cancel
> return self.__beeswax_client.cancel_query(operation_handle.get_handle())
> beeswax/impala_beeswax.py:369: in cancel_query
> return self.__do_rpc(lambda: self.imp_service.Cancel(query_id))
> beeswax/impala_beeswax.py:520: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION: 
> EMESSAGE: Invalid or unknown query handle: 
> 174962332188aac2:1713d0fe.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10596) TestAdmissionControllerStressWithACService.test_mem_limit fails with "Invalid or unknown query handle" when canceling a query

2021-03-30 Thread Bikramjeet Vig (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17311963#comment-17311963
 ] 

Bikramjeet Vig commented on IMPALA-10596:
-

From the logs it seems like the client running the query was closed earlier
than the teardown() method. If the client is closed, an UnregisterQuery() is
triggered, which is why, when the teardown issues a cancel on that query_id,
it hits the aforementioned error.
The last part of the log is always "MainThread: Found all 8 admitted threads
after 0.0 seconds".
I think what might be happening here is that it hits an exception which causes
the SubmitQueryThread to exit and close the client in its _finally_ block
before teardown gets a chance to cancel the query. We also don't get to see
the log lines printed before the client is closed; I'm not sure if that is
because the logger does not get a chance to flush before the process dies.

Filtered logs from one of the failed builds:
{noformat}
19643:I0317 04:04:55.293568 21378 impala-server.cc:1388] UnregisterQuery(): 
query_id=6a4a4ca00aeb3326:0eb710ec
19644:I0317 04:04:55.293584 21378 coordinator.cc:706] ExecState: query 
id=6a4a4ca00aeb3326:0eb710ec execution cancelled
19645:I0317 04:04:55.293594 21378 coordinator-backend-state.cc:974] 
query_id=6a4a4ca00aeb3326:0eb710ec target backend=127.0.0.1:27000: 
Sending CancelQueryFInstances rpc
19668:I0317 04:04:55.300276 20592 control-service.cc:219] 
CancelQueryFInstances(): query_id=6a4a4ca00aeb3326:0eb710ec
19669:I0317 04:04:55.300292 20592 query-exec-mgr.cc:126] QueryState: 
query_id=6a4a4ca00aeb3326:0eb710ec refcnt=5
19670:I0317 04:04:55.300297 20592 query-state.cc:974] Cancel: 
query_id=6a4a4ca00aeb3326:0eb710ec
19671:I0317 04:04:55.300305 20592 krpc-data-stream-mgr.cc:339] cancelling 
active streams for query_id=6a4a4ca00aeb3326:0eb710ec
19672:I0317 04:04:55.300380 21466 krpc-data-stream-mgr.cc:308] 
6a4a4ca00aeb3326:0eb710ec] DeregisterRecvr(): 
fragment_instance_id=6a4a4ca00aeb3326:0eb710ec, node=31
19673:I0317 04:04:55.300405 21378 coordinator-backend-state.cc:974] 
query_id=6a4a4ca00aeb3326:0eb710ec target backend=127.0.0.1:27002: Not 
cancelling because the backend is already done: 
19674:I0317 04:04:55.300420 21378 coordinator-backend-state.cc:974] 
query_id=6a4a4ca00aeb3326:0eb710ec target backend=127.0.0.1:27001: Not 
cancelling because the backend is already done: 
19675:I0317 04:04:55.300426 21378 coordinator.cc:994] CancelBackends() 
query_id=6a4a4ca00aeb3326:0eb710ec, tried to cancel 1 backends
19680:I0317 04:04:55.300539 21378 coordinator.cc:1370] Release admission 
control resources for query_id=6a4a4ca00aeb3326:0eb710ec
19704:I0317 04:04:55.309382 22599 impala-server.cc:1577] Cancel(): 
query_id=6a4a4ca00aeb3326:0eb710ec
19705:I0317 04:04:55.309398 22599 impala-server.cc:1538] Invalid or unknown 
query handle: 6a4a4ca00aeb3326:0eb710ec.
19751:I0317 04:04:55.317633 21466 query-state.cc:957] 
6a4a4ca00aeb3326:0eb710ec] Instance completed. 
instance_id=6a4a4ca00aeb3326:0eb710ec #in-flight=14 status=CANCELLED: 
Cancelled
19752:I0317 04:04:55.317644 21448 query-state.cc:468] 
6a4a4ca00aeb3326:0eb710ec] UpdateBackendExecState(): last report for 
6a4a4ca00aeb3326:0eb710ec
19754:I0317 04:04:55.320560 21464 query-state.cc:957] 
6a4a4ca00aeb3326:0eb710ec0002] Instance completed. 
instance_id=6a4a4ca00aeb3326:0eb710ec0002 #in-flight=12 status=CANCELLED: 
Cancelled
19765:I0317 04:04:55.334306 20592 impala-server.cc:1551] Invalid or unknown 
query handle: 6a4a4ca00aeb3326:0eb710ec.
19766:I0317 04:04:55.334322 20592 control-service.cc:179] ReportExecStatus(): 
Received report for unknown query ID (probably closed or cancelled): 
6a4a4ca00aeb3326:0eb710ec remote host=127.0.0.1:59410
19767:I0317 04:04:55.334400 21448 query-state.cc:738] 
6a4a4ca00aeb3326:0eb710ec] Cancelling fragment instances as directed by 
the coordinator. Returned status: ReportExecStatus(): Received report for 
unknown query ID (probably closed or cancelled): 
6a4a4ca00aeb3326:0eb710ec remote host=127.0.0.1:59410
19768:I0317 04:04:55.334416 21448 query-state.cc:974] 
6a4a4ca00aeb3326:0eb710ec] Cancel: 
query_id=6a4a4ca00aeb3326:0eb710ec
19773:I0317 04:04:55.348507 21253 impala-server.cc:1420] Query successfully 
unregistered: query_id=6a4a4ca00aeb3326:0eb710ec
19775:I0317 04:04:55.376655 21253 query-exec-mgr.cc:213] ReleaseQueryState(): 
deleted query_id=6a4a4ca00aeb3326:0eb710ec
{noformat}

I'll try to simulate this on my local setup and see if I hit the same behavior.

> TestAdmissionControllerStressWithACService.test_mem_limit fails with "Invalid 
> or unknown query handle" when canceling a query
> 

[jira] [Resolved] (IMPALA-8202) TestAdmissionControllerStress.test_mem_limit teardown() fails with "Invalid or unknown query handle"

2021-03-30 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-8202.

Resolution: Fixed

The recent re-opening of this issue is due to a separate problem, which is 
tracked in IMPALA-10596.

> TestAdmissionControllerStress.test_mem_limit teardown() fails with "Invalid 
> or unknown query handle"
> 
>
> Key: IMPALA-8202
> URL: https://issues.apache.org/jira/browse/IMPALA-8202
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Andrew Sherman
>Assignee: Bikramjeet Vig
>Priority: Critical
>  Labels: broken-build
> Fix For: Impala 4.0
>
>
> teardown() attempts to close each submission thread that was used. But one of 
> them times out.
> {quote}
> 06:05:22  ERROR at teardown of 
> TestAdmissionControllerStress.test_mem_limit[num_queries: 50 | protocol: 
> beeswax | table_format: text/none | exec_option: {'batch_size': 0, 
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': 
> False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | 
> submission_delay_ms: 50 | round_robin_submission: True] 
> 06:05:22 custom_cluster/test_admission_controller.py:1004: in teardown
> 06:05:22 client.cancel(thread.query_handle)
> 06:05:22 common/impala_connection.py:183: in cancel
> 06:05:22 return 
> self.__beeswax_client.cancel_query(operation_handle.get_handle())
> 06:05:22 beeswax/impala_beeswax.py:364: in cancel_query
> 06:05:22 return self.__do_rpc(lambda: self.imp_service.Cancel(query_id))
> 06:05:22 beeswax/impala_beeswax.py:512: in __do_rpc
> 06:05:22 raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> 06:05:22 E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> 06:05:22 EINNER EXCEPTION: 
> 06:05:22 EMESSAGE: Invalid or unknown query handle
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IMPALA-10397) TestAutoScaling.test_single_workload failed in exhaustive release build

2021-03-29 Thread Bikramjeet Vig (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17311057#comment-17311057
 ] 

Bikramjeet Vig commented on IMPALA-10397:
-

This definitely seems like an outcome of my previous change. The part of the 
test that relies on query rate has historically been flaky; I think we can just 
get rid of it, since we already verify that the cluster grew before this point 
(which is what the test is meant to check). I'll remove the flaky part in my 
next patch.
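
For reference, the shape of the two checks (a simplified sketch with made-up 
helper names, not the exact test code):
{noformat}
# Simplified sketch; helper names are made up. The cluster-growth check polls a
# deterministic cluster metric, while the query-rate check depends on timing
# and is the part that flakes on slower builds.
from time import sleep, time

def wait_for_cluster_size(get_num_backends, expected, timeout_s=45):
  deadline = time() + timeout_s
  while time() < deadline:
    if get_num_backends() >= expected:
      return True
    sleep(1)
  return False

# Keep: deterministic check that the cluster actually grew.
#   assert wait_for_cluster_size(self._get_num_backends, cluster_size)
# Drop: timing-sensitive check on the achieved query rate.
#   assert measured_query_rate >= expected_rate
{noformat}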

> TestAutoScaling.test_single_workload failed in exhaustive release build
> ---
>
> Key: IMPALA-10397
> URL: https://issues.apache.org/jira/browse/IMPALA-10397
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Zoltán Borók-Nagy
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: broken-build
> Fix For: Impala 4.0
>
>
> TestAutoScaling.test_single_workload failed in an exhaustive release build.
> *Error details*
> AssertionError: Number of backends did not reach 5 within 45 s assert 
> any(<generator object <genexpr> at 0x7f772c155e10>)
> *Stack trace*
> {noformat}
> custom_cluster/test_auto_scaling.py:95: in test_single_workload
>  assert any(self._get_num_backends() >= cluster_size or sleep(1)
> E AssertionError: Number of backends did not reach 5 within 45 s
> E assert any(<generator object <genexpr> at 0x7f772c155e10>){noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10397) TestAutoScaling.test_single_workload failed in exhaustive release build

2021-03-22 Thread Bikramjeet Vig (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17306497#comment-17306497
 ] 

Bikramjeet Vig commented on IMPALA-10397:
-

Looked at the logs, and it seems like the autoscaler never started up a new 
cluster. The autoscaler checks for queued queries every 1 sec, so this can only 
happen if it never notices any queries queuing. Since this is happening on 
release builds, I think the queries are fast enough to finish early and never 
get queued. Increasing their runtime should fix this issue.
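
Something along these lines should do it (a sketch; I'm assuming the workload 
query can simply be padded with Impala's sleep() builtin, and the exact query 
and duration will be picked in the patch):
{noformat}
# Sketch: make the workload query slow enough that it stays running (and
# queueing) long enough for the autoscaler's 1-second poll to observe it, even
# on release builds. sleep(ms) is evaluated per scanned row here, and the query
# scans a table so it is not scheduled as a coordinator-only query (which would
# bypass the queue).
SLOW_QUERY = ("select count(*) from functional_parquet.alltypes "
              "where id is not null and sleep(10)")

def submit_workload(client, num_queries):
  # execute_async is illustrative; the test's own client wrapper applies.
  return [client.execute_async(SLOW_QUERY) for _ in range(num_queries)]
{noformat}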

> TestAutoScaling.test_single_workload failed in exhaustive release build
> ---
>
> Key: IMPALA-10397
> URL: https://issues.apache.org/jira/browse/IMPALA-10397
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Zoltán Borók-Nagy
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: broken-build
> Fix For: Impala 4.0
>
>
> TestAutoScaling.test_single_workload failed in an exhaustive release build.
> *Error details*
> AssertionError: Number of backends did not reach 5 within 45 s assert 
> any(<generator object <genexpr> at 0x7f772c155e10>)
> *Stack trace*
> {noformat}
> custom_cluster/test_auto_scaling.py:95: in test_single_workload
>  assert any(self._get_num_backends() >= cluster_size or sleep(1)
> E AssertionError: Number of backends did not reach 5 within 45 s
> E assert any(<generator object <genexpr> at 0x7f772c155e10>){noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Reopened] (IMPALA-10397) TestAutoScaling.test_single_workload failed in exhaustive release build

2021-03-22 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig reopened IMPALA-10397:
-

Happened again in some of the release builds.

> TestAutoScaling.test_single_workload failed in exhaustive release build
> ---
>
> Key: IMPALA-10397
> URL: https://issues.apache.org/jira/browse/IMPALA-10397
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Zoltán Borók-Nagy
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: broken-build
> Fix For: Impala 4.0
>
>
> TestAutoScaling.test_single_workload failed in an exhaustive release build.
> *Error details*
> AssertionError: Number of backends did not reach 5 within 45 s assert 
> any(<generator object <genexpr> at 0x7f772c155e10>)
> *Stack trace*
> {noformat}
> custom_cluster/test_auto_scaling.py:95: in test_single_workload
>  assert any(self._get_num_backends() >= cluster_size or sleep(1)
> E AssertionError: Number of backends did not reach 5 within 45 s
> E assert any(<generator object <genexpr> at 0x7f772c155e10>){noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-10499) test_misc failing

2021-02-10 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-10499:
---

 Summary: test_misc failing
 Key: IMPALA-10499
 URL: https://issues.apache.org/jira/browse/IMPALA-10499
 Project: IMPALA
  Issue Type: Bug
Reporter: Bikramjeet Vig
Assignee: Tamas Mate


IMPALA-10379 added this test recently.
{noformat}
query_test/test_queries.py:187: in test_misc
self.run_test_case('QueryTest/misc', vector)
common/impala_test_suite.py:691: in run_test_case
self.__verify_results_and_errors(vector, test_section, result, use_db)
common/impala_test_suite.py:527: in __verify_results_and_errors
replace_filenames_with_placeholder)
common/test_result_verifier.py:409: in verify_raw_results
verify_results(expected_types, actual_types, order_matters=True)
common/test_result_verifier.py:305: in verify_results
assert expected_results == actual_results
E   assert ['INT'] == ['TINYINT']
E At index 0 diff: 'INT' != 'TINYINT'
E Use -v to get the full diff
{noformat}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-10397) TestAutoScaling.test_single_workload failed in exhaustive release build

2021-02-10 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-10397.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> TestAutoScaling.test_single_workload failed in exhaustive release build
> ---
>
> Key: IMPALA-10397
> URL: https://issues.apache.org/jira/browse/IMPALA-10397
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Zoltán Borók-Nagy
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: broken-build
> Fix For: Impala 4.0
>
>
> TestAutoScaling.test_single_workload failed in an exhaustive release build.
> *Error details*
> AssertionError: Number of backends did not reach 5 within 45 s assert 
> any(<generator object <genexpr> at 0x7f772c155e10>)
> *Stack trace*
> {noformat}
> custom_cluster/test_auto_scaling.py:95: in test_single_workload
>  assert any(self._get_num_backends() >= cluster_size or sleep(1)
> E AssertionError: Number of backends did not reach 5 within 45 s
> E assert any(<generator object <genexpr> at 0x7f772c155e10>){noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-10480) heap-use-after-free crash in ASAN build

2021-02-09 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-10480.
-
Resolution: Duplicate

> heap-use-after-free crash in ASAN build
> ---
>
> Key: IMPALA-10480
> URL: https://issues.apache.org/jira/browse/IMPALA-10480
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: broken-build
>
> Likely candidates that triggered this:
> {noformat}
>  
> query_test.test_tpch_nested_queries.TestTpchNestedQuery.test_tpch_q20[protocol:
>  beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> orc/def/block] 8.4 sec 1
>  query_test.test_tpch_queries.TestTpchQuery.test_tpch[protocol: beeswax | 
> exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> orc/def/block-TPC-H: Q2]8.4 sec 1
>  query_test.test_queries.TestHdfsQueries.test_hdfs_scan_node[protocol: 
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> rc/snap/block]  8.4 sec 1
> {noformat}
> Error:
> {noformat}
> ==28216==ERROR: AddressSanitizer: heap-use-after-free on address 
> 0x7fb838f33800 at pc 0x01b74b61 bp 0x7fb91d19f0c0 sp 0x7fb91d19e870
> READ of size 1048576 at 0x7fb838f33800 thread T82 (rpc reactor-287)
> #0 0x1b74b60 in read_iovec(void*, __sanitizer::__sanitizer_iovec*, 
> unsigned long, unsigned long) 
> /mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:904
> #1 0x1b8b1c1 in read_msghdr(void*, __sanitizer::__sanitizer_msghdr*, 
> long) 
> /mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2781
> #2 0x1b8daa3 in __interceptor_sendmsg 
> /mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2796
> #3 0x3b1fc7c in kudu::Socket::Writev(iovec const*, int, long*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/net/socket.cc:447:3
> #4 0x36ef1d5 in kudu::rpc::OutboundTransfer::SendBuffer(kudu::Socket&) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/transfer.cc:227:26
> #5 0x36f7c90 in kudu::rpc::Connection::WriteHandler(ev::io&, int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/connection.cc:802:31
> #6 0x598c3d2 in ev_invoke_pending 
> (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x598c3d2)
> #7 0x3681ffc in kudu::rpc::ReactorThread::InvokePendingCb(ev_loop*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:196:3
> #8 0x598fa7f in ev_run 
> (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x598fa7f)
> #9 0x36821f1 in kudu::rpc::ReactorThread::RunThread() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:497:9
> #10 0x369392b in boost::_bi::bind_t kudu::rpc::ReactorThread>, 
> boost::_bi::list1 > 
> >::operator()() 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222:16
> #11 0x23f26b6 in boost::function0::operator()() const 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14
> #12 0x23eef29 in kudu::Thread::SuperviseThread(void*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/thread.cc:675:3
> #13 0x7fc169a0fe24 in start_thread (/lib64/libpthread.so.0+0x7e24)
> #14 0x7fc16645934c in __clone (/lib64/libc.so.6+0xf834c)
> 0x7fb838f33800 is located 0 bytes inside of 1048577-byte region 
> [0x7fb838f33800,0x7fb839033801)
> freed by thread T117 here:
> #0 0x1bfab40 in operator delete(void*) 
> /mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/asan_new_delete.cc:137
> #1 0x7fc166d5c5a9 in __gnu_cxx::new_allocator::deallocate(char*, 
> unsigned long) 
> /mnt/source/gcc/build-7.5.0/x86_64-pc-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:125
> #2 0x7fc166d5c5a9 in std::allocator_traits 
> 

[jira] [Commented] (IMPALA-10480) heap-use-after-free crash in ASAN build

2021-02-09 Thread Bikramjeet Vig (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282125#comment-17282125
 ] 

Bikramjeet Vig commented on IMPALA-10480:
-

[~fangyurao] Thanks for letting me know; I'll mark this as a duplicate.

> heap-use-after-free crash in ASAN build
> ---
>
> Key: IMPALA-10480
> URL: https://issues.apache.org/jira/browse/IMPALA-10480
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: broken-build
>
> Likely candidates that triggered this:
> {noformat}
>  
> query_test.test_tpch_nested_queries.TestTpchNestedQuery.test_tpch_q20[protocol:
>  beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> orc/def/block] 8.4 sec 1
>  query_test.test_tpch_queries.TestTpchQuery.test_tpch[protocol: beeswax | 
> exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> orc/def/block-TPC-H: Q2]8.4 sec 1
>  query_test.test_queries.TestHdfsQueries.test_hdfs_scan_node[protocol: 
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> rc/snap/block]  8.4 sec 1
> {noformat}
> Error:
> {noformat}
> ==28216==ERROR: AddressSanitizer: heap-use-after-free on address 
> 0x7fb838f33800 at pc 0x01b74b61 bp 0x7fb91d19f0c0 sp 0x7fb91d19e870
> READ of size 1048576 at 0x7fb838f33800 thread T82 (rpc reactor-287)
> #0 0x1b74b60 in read_iovec(void*, __sanitizer::__sanitizer_iovec*, 
> unsigned long, unsigned long) 
> /mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:904
> #1 0x1b8b1c1 in read_msghdr(void*, __sanitizer::__sanitizer_msghdr*, 
> long) 
> /mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2781
> #2 0x1b8daa3 in __interceptor_sendmsg 
> /mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2796
> #3 0x3b1fc7c in kudu::Socket::Writev(iovec const*, int, long*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/net/socket.cc:447:3
> #4 0x36ef1d5 in kudu::rpc::OutboundTransfer::SendBuffer(kudu::Socket&) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/transfer.cc:227:26
> #5 0x36f7c90 in kudu::rpc::Connection::WriteHandler(ev::io&, int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/connection.cc:802:31
> #6 0x598c3d2 in ev_invoke_pending 
> (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x598c3d2)
> #7 0x3681ffc in kudu::rpc::ReactorThread::InvokePendingCb(ev_loop*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:196:3
> #8 0x598fa7f in ev_run 
> (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x598fa7f)
> #9 0x36821f1 in kudu::rpc::ReactorThread::RunThread() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:497:9
> #10 0x369392b in boost::_bi::bind_t kudu::rpc::ReactorThread>, 
> boost::_bi::list1 > 
> >::operator()() 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222:16
> #11 0x23f26b6 in boost::function0::operator()() const 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14
> #12 0x23eef29 in kudu::Thread::SuperviseThread(void*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/thread.cc:675:3
> #13 0x7fc169a0fe24 in start_thread (/lib64/libpthread.so.0+0x7e24)
> #14 0x7fc16645934c in __clone (/lib64/libc.so.6+0xf834c)
> 0x7fb838f33800 is located 0 bytes inside of 1048577-byte region 
> [0x7fb838f33800,0x7fb839033801)
> freed by thread T117 here:
> #0 0x1bfab40 in operator delete(void*) 
> /mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/asan_new_delete.cc:137
> #1 0x7fc166d5c5a9 in __gnu_cxx::new_allocator::deallocate(char*, 
> unsigned long) 
> 

[jira] [Created] (IMPALA-10497) test_no_fd_caching_on_cached_data failing

2021-02-09 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-10497:
---

 Summary: test_no_fd_caching_on_cached_data failing
 Key: IMPALA-10497
 URL: https://issues.apache.org/jira/browse/IMPALA-10497
 Project: IMPALA
  Issue Type: Bug
Reporter: Bikramjeet Vig
Assignee: Riza Suminto



{noformat}
Error Message
assert 1 == 0  +  where 1 = >()  +
where > = 
.cached_handles
Stacktrace
custom_cluster/test_hdfs_fd_caching.py:202: in test_no_fd_caching_on_cached_data
assert self.cached_handles() == 0
E   assert 1 == 0
E+  where 1 = >()
E+where > = 
.cached_handles
Standard Error
-- 2021-02-08 06:40:41,413 INFO MainThread: Starting cluster with command: 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/bin/start-impala-cluster.py
 '--state_store_args=--statestore_update_frequency_ms=50 
--statestore_priority_update_frequency_ms=50 
--statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3 
--log_dir=/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests
 --log_level=1 '--impalad_args=--max_cached_file_handles=16 
--unused_file_handle_timeout_sec=5 --data_cache=/tmp:500MB 
--always_use_data_cache=true ' '--state_store_args=None ' 
'--catalogd_args=--load_catalog_in_background=false ' 
--impalad_args=--default_query_options=
06:40:42 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
06:40:42 MainThread: Starting State Store logging to 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/statestored.INFO
06:40:42 MainThread: Starting Catalog Service logging to 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
06:40:42 MainThread: Starting Impala Daemon logging to 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/impalad.INFO
06:40:42 MainThread: Starting Impala Daemon logging to 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
06:40:42 MainThread: Starting Impala Daemon logging to 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
06:40:45 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
06:40:45 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
06:40:45 MainThread: Getting num_known_live_backends from 
impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25000
06:40:45 MainThread: Debug webpage not yet available: ('Connection aborted.', 
error(111, 'Connection refused'))
06:40:47 MainThread: Debug webpage did not become available in expected time.
06:40:47 MainThread: Waiting for num_known_live_backends=3. Current value: None
06:40:48 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
06:40:48 MainThread: Getting num_known_live_backends from 
impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25000
06:40:48 MainThread: Waiting for num_known_live_backends=3. Current value: 0
06:40:49 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
06:40:49 MainThread: Getting num_known_live_backends from 
impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25000
06:40:49 MainThread: num_known_live_backends has reached value: 3
06:40:49 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
06:40:49 MainThread: Getting num_known_live_backends from 
impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25001
06:40:49 MainThread: num_known_live_backends has reached value: 3
06:40:50 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
06:40:50 MainThread: Getting num_known_live_backends from 
impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25002
06:40:50 MainThread: num_known_live_backends has reached value: 3
06:40:50 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3 
executors).
-- 2021-02-08 06:40:51,049 DEBUGMainThread: Found 3 impalad/1 statestored/1 
catalogd process(es)
-- 2021-02-08 06:40:51,049 INFO MainThread: Getting metric: 
statestore.live-backends from 
impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25010
-- 2021-02-08 06:40:51,050 INFO MainThread: Starting new HTTP connection 
(1): impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com
-- 2021-02-08 06:40:51,052 INFO MainThread: Metric 
'statestore.live-backends' has reached desired value: 4
-- 2021-02-08 06:40:51,052 DEBUGMainThread: Getting num_known_live_backends 
from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25000
-- 2021-02-08 06:40:51,053 INFO MainThread: Starting new HTTP connection 
(1): impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com
-- 2021-02-08 06:40:51,054 INFO MainThread: num_known_live_backends has 
reached value: 3
-- 2021-02-08 06:40:51,054 DEBUG


[jira] [Created] (IMPALA-10480) heap-use-after-free crash in ASAN build

2021-02-05 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-10480:
---

 Summary: heap-use-after-free crash in ASAN build
 Key: IMPALA-10480
 URL: https://issues.apache.org/jira/browse/IMPALA-10480
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 4.0
Reporter: Bikramjeet Vig
Assignee: Bikramjeet Vig


Likely candidates that triggered this:

{noformat}
 
query_test.test_tpch_nested_queries.TestTpchNestedQuery.test_tpch_q20[protocol: 
beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
orc/def/block]   8.4 sec 1
 query_test.test_tpch_queries.TestTpchQuery.test_tpch[protocol: beeswax | 
exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
orc/def/block-TPC-H: Q2]  8.4 sec 1
 query_test.test_queries.TestHdfsQueries.test_hdfs_scan_node[protocol: beeswax 
| exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
rc/snap/block]8.4 sec 1
{noformat}

Error:
{noformat}
==28216==ERROR: AddressSanitizer: heap-use-after-free on address 0x7fb838f33800 
at pc 0x01b74b61 bp 0x7fb91d19f0c0 sp 0x7fb91d19e870
READ of size 1048576 at 0x7fb838f33800 thread T82 (rpc reactor-287)
#0 0x1b74b60 in read_iovec(void*, __sanitizer::__sanitizer_iovec*, unsigned 
long, unsigned long) 
/mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:904
#1 0x1b8b1c1 in read_msghdr(void*, __sanitizer::__sanitizer_msghdr*, long) 
/mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2781
#2 0x1b8daa3 in __interceptor_sendmsg 
/mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2796
#3 0x3b1fc7c in kudu::Socket::Writev(iovec const*, int, long*) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/net/socket.cc:447:3
#4 0x36ef1d5 in kudu::rpc::OutboundTransfer::SendBuffer(kudu::Socket&) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/transfer.cc:227:26
#5 0x36f7c90 in kudu::rpc::Connection::WriteHandler(ev::io&, int) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/connection.cc:802:31
#6 0x598c3d2 in ev_invoke_pending 
(/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x598c3d2)
#7 0x3681ffc in kudu::rpc::ReactorThread::InvokePendingCb(ev_loop*) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:196:3
#8 0x598fa7f in ev_run 
(/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x598fa7f)
#9 0x36821f1 in kudu::rpc::ReactorThread::RunThread() 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:497:9
#10 0x369392b in boost::_bi::bind_t, 
boost::_bi::list1 > 
>::operator()() 
/data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222:16
#11 0x23f26b6 in boost::function0::operator()() const 
/data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14
#12 0x23eef29 in kudu::Thread::SuperviseThread(void*) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/thread.cc:675:3
#13 0x7fc169a0fe24 in start_thread (/lib64/libpthread.so.0+0x7e24)
#14 0x7fc16645934c in __clone (/lib64/libc.so.6+0xf834c)

0x7fb838f33800 is located 0 bytes inside of 1048577-byte region 
[0x7fb838f33800,0x7fb839033801)
freed by thread T117 here:
#0 0x1bfab40 in operator delete(void*) 
/mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/asan_new_delete.cc:137
#1 0x7fc166d5c5a9 in __gnu_cxx::new_allocator::deallocate(char*, 
unsigned long) 
/mnt/source/gcc/build-7.5.0/x86_64-pc-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:125
#2 0x7fc166d5c5a9 in std::allocator_traits 
>::deallocate(std::allocator&, char*, unsigned long) 
/mnt/source/gcc/build-7.5.0/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/alloc_traits.h:462
#3 0x7fc166d5c5a9 in std::__cxx11::basic_string, std::allocator >::_M_destroy(unsigned long) 
/mnt/source/gcc/build-7.5.0/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:226
#4 0x7fc166d5c5a9 in std::__cxx11::basic_string, std::allocator >::reserve(unsigned 


[jira] [Resolved] (IMPALA-10373) Run impala docker containers as a regular linux user with uid/gid 1000

2020-12-03 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-10373.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Run impala docker containers as a regular linux user with uid/gid 1000
> --
>
> Key: IMPALA-10373
> URL: https://issues.apache.org/jira/browse/IMPALA-10373
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Major
> Fix For: Impala 4.0
>
>
> The convention in Linux is that anything below 1000 is reserved for 
> system accounts, services, and other special accounts, while regular user UIDs 
> and GIDs start at 1000. This will ensure that the 'impala' user created 
> to run the impala executable inside the docker container gets assigned 
> uid and gid 1000.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-8990) TestAdmissionController.test_set_request_pool seems flaky

2020-12-03 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-8990.

Fix Version/s: Impala 4.0
   Resolution: Fixed

> TestAdmissionController.test_set_request_pool seems flaky
> -
>
> Key: IMPALA-8990
> URL: https://issues.apache.org/jira/browse/IMPALA-8990
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Michael Ho
>Assignee: Bikramjeet Vig
>Priority: Critical
>  Labels: flaky
> Fix For: Impala 4.0
>
>
> Expected query error didn't occur. Happened once so far. [~bikram], can you 
> please take a look ?
> {noformat}
> Error Message
> AssertionError: Query should return error assert False
> Stacktrace
> hs2/hs2_test_suite.py:63: in add_session
> lambda: fn(self))
> hs2/hs2_test_suite.py:44: in add_session_helper
> fn()
> hs2/hs2_test_suite.py:63: in 
> lambda: fn(self))
> custom_cluster/test_admission_controller.py:312: in test_set_request_pool
> self.__check_pool_rejected(client, 'root.queueA', "exceeded timeout")
> custom_cluster/test_admission_controller.py:195: in __check_pool_rejected
> assert False, "Query should return error"
> E   AssertionError: Query should return error
> E   assert False
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)



[jira] [Created] (IMPALA-10373) Run impala docker containers as a regular linux user with uid/gid 1000

2020-12-01 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-10373:
---

 Summary: Run impala docker containers as a regular linux user with 
uid/gid 1000
 Key: IMPALA-10373
 URL: https://issues.apache.org/jira/browse/IMPALA-10373
 Project: IMPALA
  Issue Type: Improvement
Reporter: Bikramjeet Vig
Assignee: Bikramjeet Vig


The convention in Linux is that anything below 1000 is reserved for system 
accounts, services, and other special accounts, while regular user UIDs and 
GIDs start at 1000. This will ensure that the 'impala' user created to run the 
impala executable inside the docker container gets assigned uid and gid 1000.
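
A quick sanity check for this from inside the container (a sketch; assumes the 
image creates a user named 'impala' and runs as it by default):
{noformat}
# Sketch: verify the container user landed on uid/gid 1000.
import os
import pwd

info = pwd.getpwnam("impala")  # assumes the image created an 'impala' user
assert (info.pw_uid, info.pw_gid) == (1000, 1000)
assert os.getuid() == info.pw_uid  # holds when run as the container's default user
{noformat}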



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org




[jira] [Updated] (IMPALA-10348) test_sequential_startup_wait failed due to query admission exceeding timeout 60000ms

2020-11-24 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-10348:

Attachment: test_log.txt

> test_sequential_startup_wait failed due to query admission exceeding timeout 
> 60000ms
> 
>
> Key: IMPALA-10348
> URL: https://issues.apache.org/jira/browse/IMPALA-10348
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Fang-Yu Rao
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: broken-build, flaky
> Attachments: 
> catalogd.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.ERROR.20201119-201129.19576,
>  
> catalogd.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.INFO.20201119-201129.19576,
>  
> catalogd.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.WARNING.20201119-201129.19576,
>  
> impalad.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.ERROR.20201119-201129.19631,
>  
> impalad.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.INFO.20201119-201129.19631,
>  
> impalad.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.WARNING.20201119-201129.19631,
>  
> statestored.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.ERROR.20201119-201129.19560,
>  
> statestored.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.INFO.20201119-201129.19560,
>  
> statestored.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.WARNING.20201119-201129.19560,
>  test_log.txt
>
>
> We found in a recent s3 build that the test of 
> {{test_sequential_startup_wait}} failed due to admission for query exceeding 
> timeout 60000ms.
> The error message in the console output of the Jenkins job is the following.
> {noformat}
> Error Message:
> E Query aborted: Admission for query exceeded timeout 60000ms in pool 
> default-pool. Queued reason: Waiting for executors to start. Only DDL queries 
> and queries scheduled only on the coordinator (either NUM_NODES set to 1 or 
> when small query optimization is triggered) can currently run. Additional 
> Details: Not Applicable
> {noformat}
> Before the timeout was reached we saw the following entries in the 
> corresponding log file produced by an impalad.
> {noformat}
> W1119 20:11:45.731312 20667 executor-group.cc:164] 
> e64b11989b018148:20eb7ce5] Executor group default-pool-group1 is 
> unhealthy: 1 out of 3 are available.
> W1119 20:11:45.731338 20667 admission-controller.cc:1558] 
> e64b11989b018148:20eb7ce5] Waiting for executors to start. Only DDL 
> queries and queries scheduled only on the coordinator (either NUM_NODES set 
> to 1 or when small query optimization is triggered) can currently run.
> I1119 20:11:45.731348 20667 admission-controller.cc:1210] 
> e64b11989b018148:20eb7ce5] Queuing, query 
> id=e64b11989b018148:20eb7ce5 reason: Waiting for executors to start. 
> Only DDL queries and queries scheduled only on the coordinator (either 
> NUM_NODES set to 1 or when small query optimization is triggered) can 
> currently run.
> I1119 20:11:45.773303 20040 admission-controller.cc:1876] Could not dequeue 
> query id=e64b11989b018148:20eb7ce5 reason: Waiting for executors to 
> start. Only DDL queries and queries scheduled only on the coordinator (either 
> NUM_NODES set to 1 or when small query optimization is triggered) can 
> currently run.
> {noformat}
> The corresponding log files are also provided.
> The test was recently revised in IMPALA-8830, maybe [~bikramjeet.vig] could 
> provide some insight into it. Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10348) test_sequential_startup_wait failed due to query admission exceeding timeout 60000ms

2020-11-24 Thread Bikramjeet Vig (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17238456#comment-17238456
 ] 

Bikramjeet Vig commented on IMPALA-10348:
-

Update:
I compiled a timeline of events from the coordinator, statestore, and test logs 
to get an idea of what was happening. It seems the last executor started at 
20:11:49 but was not recognized by the coordinator until 20:15:20, which caused 
the query queued at 20:11:45 to time out at 20:12:45.
Not sure what was preventing the coordinator from recognizing the executor and 
updating its membership view; will have to dig in more.

{noformat}
W1119 20:11:45.731312 20667 executor-group.cc:164] 
e64b11989b018148:20eb7ce5] Executor group default-pool-group1 is 
unhealthy: 1 out of 3 are available.

I1119 20:11:45.731348 20667 admission-controller.cc:1210] 
e64b11989b018148:20eb7ce5] Queuing, query 
id=e64b11989b018148:20eb7ce5 reason: Waiting for executors to start. 
Only DDL queries and queries scheduled only on the coordinator (either 
NUM_NODES set to 1 or when small query optimization is triggered) can currently 
run.

I1119 20:11:49.496661 21016 statestore.cc:650] Subscriber 
'impa...@impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com:27002' 
registered (registration id: 054017c8c8b88935:a73adf2ac1968294)

W1119 20:11:49.542403 20040 executor-group.cc:164] Executor group 
default-pool-group1 is unhealthy: 2 out of 3 are available.

20:11:54 MainThread: Found 4 impalad/1 statestored/1 catalogd process(es)
20:11:54 MainThread: Getting num_known_live_backends from 
impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com:25000
20:11:54 MainThread: Waiting for num_known_live_backends=4. Current value: 3


I1119 20:12:45.731456 20667 admission-controller.cc:1272] 
e64b11989b018148:20eb7ce5] Admission for query exceeded timeout 60000ms 
in pool default-pool. Queued reason: Waiting for executors to start. Only DDL 
queries and queries scheduled only on the coordinator (either NUM_NODES set to 
1 or when small query optimization is triggered) can currently run. Additional 
Details: Not Applicable


20:15:20 MainThread: Getting num_known_live_backends from 
impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com:25000
20:15:20 MainThread: num_known_live_backends has reached value: 4
{noformat}
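
For anyone needing to redo this kind of analysis, here is a minimal sketch of 
how such a cross-daemon timeline can be compiled: merge glog-style lines by 
their date and timestamp prefixes. The file names are placeholders, not the 
logs attached to this issue, and it assumes all daemons share one machine clock.

{code:python}
import re

# glog-style prefix, e.g. "W1119 20:11:45.731312 20667 executor-group.cc:164]"
LOG_LINE = re.compile(r'^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d+)')

def print_timeline(paths):
    events = []
    for path in paths:
        with open(path) as f:
            for line in f:
                m = LOG_LINE.match(line)
                if m:
                    # Sort key is (MMDD, HH:MM:SS.micros); zero-padded strings
                    # compare correctly in lexicographic order.
                    events.append((m.groups(), line.rstrip()))
    for _, line in sorted(events):
        print(line)

# Placeholder file names for illustration only.
print_timeline(['impalad.INFO', 'statestored.INFO'])
{code}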


> test_sequential_startup_wait failed due to query admission exceeding timeout 
> 60000ms
> 
>
> Key: IMPALA-10348
> URL: https://issues.apache.org/jira/browse/IMPALA-10348
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Fang-Yu Rao
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: broken-build, flaky
> Attachments: 
> catalogd.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.ERROR.20201119-201129.19576,
>  
> catalogd.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.INFO.20201119-201129.19576,
>  
> catalogd.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.WARNING.20201119-201129.19576,
>  
> impalad.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.ERROR.20201119-201129.19631,
>  
> impalad.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.INFO.20201119-201129.19631,
>  
> impalad.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.WARNING.20201119-201129.19631,
>  
> statestored.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.ERROR.20201119-201129.19560,
>  
> statestored.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.INFO.20201119-201129.19560,
>  
> statestored.impala-ec2-centos74-m5-4xlarge-ondemand-0d33.vpc.cloudera.com.jenkins.log.WARNING.20201119-201129.19560
>
>
> We found in a recent s3 build that the test of 
> {{test_sequential_startup_wait}} failed due to admission for query exceeding 
> timeout 60000ms.
> The error message in the console output of the Jenkins job is the following.
> {noformat}
> Error Message:
> E   Query aborted:Admission for query exceeded timeout 60000ms in pool 
> default-pool. Queued reason: Waiting for executors to start. Only DDL queries 
> and queries scheduled only on the coordinator (either NUM_NODES set to 1 or 
> when small query optimization is triggered) can currently run. Additional 
> Details: Not Applicable
> {noformat}
> Before the timeout was reached we saw the following entries in the 
> corresponding log file produced by an impalad.
> {noformat}
> W1119 20:11:45.731312 20667 executor-group.cc:164] 
> e64b11989b018148:20eb7ce5] Executor group default-pool-group1 is 
> unhealthy: 1 out of 3 are available.
> W1119 20:11:45.731338 20667 admission-controller.cc:1558] 
> 

[jira] [Resolved] (IMPALA-8202) TestAdmissionControllerStress.test_mem_limit teardown() fails with "Invalid or unknown query handle"

2020-11-24 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-8202.

Fix Version/s: Impala 4.0
   Resolution: Fixed

> TestAdmissionControllerStress.test_mem_limit teardown() fails with "Invalid 
> or unknown query handle"
> 
>
> Key: IMPALA-8202
> URL: https://issues.apache.org/jira/browse/IMPALA-8202
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Andrew Sherman
>Assignee: Bikramjeet Vig
>Priority: Critical
>  Labels: broken-build
> Fix For: Impala 4.0
>
>
> teardown() attempts to close each submission thread that was used. But one of 
> them times out.
> {quote}
> 06:05:22  ERROR at teardown of 
> TestAdmissionControllerStress.test_mem_limit[num_queries: 50 | protocol: 
> beeswax | table_format: text/none | exec_option: {'batch_size': 0, 
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': 
> False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | 
> submission_delay_ms: 50 | round_robin_submission: True] 
> 06:05:22 custom_cluster/test_admission_controller.py:1004: in teardown
> 06:05:22 client.cancel(thread.query_handle)
> 06:05:22 common/impala_connection.py:183: in cancel
> 06:05:22 return 
> self.__beeswax_client.cancel_query(operation_handle.get_handle())
> 06:05:22 beeswax/impala_beeswax.py:364: in cancel_query
> 06:05:22 return self.__do_rpc(lambda: self.imp_service.Cancel(query_id))
> 06:05:22 beeswax/impala_beeswax.py:512: in __do_rpc
> 06:05:22 raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> 06:05:22 E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> 06:05:22 EINNER EXCEPTION: 
> 06:05:22 EMESSAGE: Invalid or unknown query handle
> {quote}
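
A sketch of a more tolerant teardown (not necessarily the committed fix): a 
query can finish and be unregistered before teardown runs, so an 
invalid-handle error on cancel should arguably be swallowed rather than fail 
the test. The helper name is hypothetical; the exception and module path are 
taken from the traceback above.

{code:python}
from beeswax.impala_beeswax import ImpalaBeeswaxException

def cancel_submission_threads(client, threads):
    for thread in threads:
        if thread.query_handle is None:
            continue
        try:
            client.cancel(thread.query_handle)
        except ImpalaBeeswaxException as e:
            # The handle becomes invalid once the query is unregistered,
            # which is expected for queries that already completed.
            if 'Invalid or unknown query handle' not in str(e):
                raise
{code}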



--
This message was sent by Atlassian Jira
(v8.3.4#803005)





[jira] [Commented] (IMPALA-9355) TestExchangeMemUsage.test_exchange_mem_usage_scaling doesn't hit the memory limit

2020-11-19 Thread Bikramjeet Vig (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17235862#comment-17235862
 ] 

Bikramjeet Vig commented on IMPALA-9355:


Saw this again during GVO: 
https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/3498/

> TestExchangeMemUsage.test_exchange_mem_usage_scaling doesn't hit the memory 
> limit
> -
>
> Key: IMPALA-9355
> URL: https://issues.apache.org/jira/browse/IMPALA-9355
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Fang-Yu Rao
>Assignee: Qifan Chen
>Priority: Critical
>  Labels: broken-build, flaky
>
> The EE test {{test_exchange_mem_usage_scaling}} failed because the query at 
> [https://github.com/apache/impala/blame/master/testdata/workloads/functional-query/queries/QueryTest/exchange-mem-scaling.test#L7-L15]
>  did not hit the specified memory limit (170m) set at 
> [https://github.com/apache/impala/blame/master/testdata/workloads/functional-query/queries/QueryTest/exchange-mem-scaling.test#L7].
>  We may need to further reduce the specified limit. The error message is 
> given below. Recall that the same issue occurred at 
> https://issues.apache.org/jira/browse/IMPALA-7873 but was resolved there.
> {code:java}
> FAIL 
> query_test/test_mem_usage_scaling.py::TestExchangeMemUsage::()::test_exchange_mem_usage_scaling[protocol:
>  beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none]
> === FAILURES 
> ===
>  TestExchangeMemUsage.test_exchange_mem_usage_scaling[protocol: beeswax | 
> exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none] 
> [gw3] linux2 -- Python 2.7.12 
> /home/ubuntu/Impala/bin/../infra/python/env/bin/python
> query_test/test_mem_usage_scaling.py:386: in test_exchange_mem_usage_scaling
> self.run_test_case('QueryTest/exchange-mem-scaling', vector)
> common/impala_test_suite.py:674: in run_test_case
> expected_str, query)
> E   AssertionError: Expected exception: Memory limit exceeded
> E   
> E   when running:
> E   
> E   set mem_limit=170m;
> E   set num_scanner_threads=1;
> E   select *
> E   from tpch_parquet.lineitem l1
> E join tpch_parquet.lineitem l2 on l1.l_orderkey = l2.l_orderkey and
> E l1.l_partkey = l2.l_partkey and l1.l_suppkey = l2.l_suppkey
> E and l1.l_linenumber = l2.l_linenumber
> E   order by l1.l_orderkey desc, l1.l_partkey, l1.l_suppkey, l1.l_linenumber
> E   limit 5
> {code}
> [~tarmstr...@cloudera.com] and [~joemcdonnell] reviewed the patch at 
> [https://gerrit.cloudera.org/c/11965/]. Assigning this JIRA to [~joemcdonnell] 
> for now. Please re-assign the JIRA to others as appropriate. Thanks!
>  
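
If the memory estimates have drifted again, one low-risk option is to tighten 
the limit in exchange-mem-scaling.test so the query reliably exceeds it. A 
sketch follows; the 150m value is a hypothetical starting point that would 
need to be validated against current estimates before committing.

{noformat}
====
---- QUERY
-- 150m is a hypothetical lower limit; validate against current estimates
set mem_limit=150m;
set num_scanner_threads=1;
select *
from tpch_parquet.lineitem l1
  join tpch_parquet.lineitem l2 on l1.l_orderkey = l2.l_orderkey and
    l1.l_partkey = l2.l_partkey and l1.l_suppkey = l2.l_suppkey
    and l1.l_linenumber = l2.l_linenumber
order by l1.l_orderkey desc, l1.l_partkey, l1.l_suppkey, l1.l_linenumber
limit 5
---- CATCH
Memory limit exceeded
====
{noformat}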



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Resolved] (IMPALA-10245) Test fails in TestKuduReadTokenSplit.test_kudu_scanner

2020-10-23 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-10245.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Test fails in TestKuduReadTokenSplit.test_kudu_scanner
> --
>
> Key: IMPALA-10245
> URL: https://issues.apache.org/jira/browse/IMPALA-10245
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Quanlong Huang
>Assignee: Bikramjeet Vig
>Priority: Critical
> Fix For: Impala 4.0
>
>
> Tests with erasure-coding enabled failed in: 
> query_test.test_kudu.TestKuduReadTokenSplit.test_kudu_scanner[protocol: 
> beeswax | exec_option: \{'kudu_read_mode': 'READ_AT_SNAPSHOT', 'batch_size': 
> 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': 
> False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | 
> table_format: kudu/none] (from pytest)
> {code:java}
> query_test/test_kudu.py:1508: in test_kudu_scanner
> targeted_kudu_scan_range_length=None, plans=plans)
> query_test/test_kudu.py:1542: in __get_num_scanner_instances
> assert len(matches.groups()) == 1
> E   AttributeError: 'NoneType' object has no attribute 'groups' {code}
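
The failure pattern suggests re.search() returned None because the expected 
string was missing from the plan. A defensive sketch (the regex and helper 
shape are illustrative, not the test's actual code) that turns the 
AttributeError into a readable assertion failure with the plan text attached:

{code:python}
import re

def get_num_scanner_instances(plan_text):
    # re.search() returns None on no match; assert on the match object first
    # so a missing pattern fails with the plan text instead of AttributeError.
    matches = re.search(r'instances=(\d+)', plan_text)  # illustrative regex
    assert matches is not None, 'pattern not found in plan:\n%s' % plan_text
    assert len(matches.groups()) == 1
    return int(matches.group(1))
{code}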



--
This message was sent by Atlassian Jira
(v8.3.4#803005)






[jira] [Assigned] (IMPALA-10245) Test fails in TestKuduReadTokenSplit.test_kudu_scanner

2020-10-20 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig reassigned IMPALA-10245:
---

Assignee: Bikramjeet Vig  (was: Thomas Tauber-Marshall)

> Test fails in TestKuduReadTokenSplit.test_kudu_scanner
> --
>
> Key: IMPALA-10245
> URL: https://issues.apache.org/jira/browse/IMPALA-10245
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Quanlong Huang
>Assignee: Bikramjeet Vig
>Priority: Critical
>
> Tests with erasure-coding enabled failed in: 
> query_test.test_kudu.TestKuduReadTokenSplit.test_kudu_scanner[protocol: 
> beeswax | exec_option: \{'kudu_read_mode': 'READ_AT_SNAPSHOT', 'batch_size': 
> 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': 
> False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | 
> table_format: kudu/none] (from pytest)
> {code:java}
> query_test/test_kudu.py:1508: in test_kudu_scanner
> targeted_kudu_scan_range_length=None, plans=plans)
> query_test/test_kudu.py:1542: in __get_num_scanner_instances
> assert len(matches.groups()) == 1
> E   AttributeError: 'NoneType' object has no attribute 'groups' {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Created] (IMPALA-10210) Avoid authentication for connection from a trusted domain over http

2020-10-02 Thread Bikramjeet Vig (Jira)
Bikramjeet Vig created IMPALA-10210:
---

 Summary: Avoid authentication for connection from a trusted domain 
over http
 Key: IMPALA-10210
 URL: https://issues.apache.org/jira/browse/IMPALA-10210
 Project: IMPALA
  Issue Type: Improvement
Reporter: Bikramjeet Vig
Assignee: Bikramjeet Vig


Add the ability to skip authentication over HTTP for both the hs2 and the 
Impala debug web service.
The current idea is to still require that the client specify a username in its 
request via a basic auth header so Impala can attribute the connection to that 
username.
This change should also add the ability to use the "X-Forwarded-For" header to 
get the real client IP address in case proxies are used in between.
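
A rough sketch of the proxy-aware address resolution described above; the 
function names and the reverse-DNS check are hypothetical, not Impala's actual 
implementation:

{code:python}
import socket

def effective_client_ip(remote_ip, headers):
    # "X-Forwarded-For: client, proxy1, proxy2": the left-most entry is the
    # original client, assuming the proxies in between set it honestly.
    xff = headers.get('X-Forwarded-For')
    if xff:
        return xff.split(',')[0].strip()
    return remote_ip

def is_from_trusted_domain(ip, trusted_domain):
    # Reverse-resolve the address and compare the hostname's domain suffix.
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
    except (socket.herror, socket.gaierror):
        return False
    return hostname.endswith(trusted_domain)
{code}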



--
This message was sent by Atlassian Jira
(v8.3.4#803005)





[jira] [Resolved] (IMPALA-9503) Expose 'healthz' endpoint for statestored and catalogd

2020-08-06 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-9503.

Resolution: Duplicate

> Expose 'healthz' endpoint for statestored and catalogd
> --
>
> Key: IMPALA-9503
> URL: https://issues.apache.org/jira/browse/IMPALA-9503
> Project: IMPALA
>  Issue Type: Task
>Reporter: Abhishek Rawat
>Assignee: Bikramjeet Vig
>Priority: Major
>
> IMPALA-8895 exposed the endpoints for impalads. It seems only coordinators 
> and executors expose the 'healthz' endpoint. It would be good to expose the 
> endpoint on statestored and catalogd as well.
>  
> {code:java}
> curl http://localhost:25010/healthz
> No URI handler for '/healthz'
> curl http://localhost:25020/healthz
> No URI handler for '/healthz'{code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Assigned] (IMPALA-9503) Expose 'healthz' endpoint for statestored and catalogd

2020-08-06 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig reassigned IMPALA-9503:
--

Assignee: Bikramjeet Vig  (was: Alice Fan)

> Expose 'healthz' endpoint for statestored and catalogd
> --
>
> Key: IMPALA-9503
> URL: https://issues.apache.org/jira/browse/IMPALA-9503
> Project: IMPALA
>  Issue Type: Task
>Reporter: Abhishek Rawat
>Assignee: Bikramjeet Vig
>Priority: Major
>
> IMPALA-8895 exposed the endpoints for impalads. It seems only coordinators 
> and executors expose the 'healthz' endpoint. It would be good to expose the 
> endpoint on statestored and catalogd as well.
>  
> {code:java}
> curl http://localhost:25010/healthz
> No URI handler for '/healthz'
> curl http://localhost:25020/healthz
> No URI handler for '/healthz'{code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Resolved] (IMPALA-10037) BytesRead check in TestMtDopScanNode.test_mt_dop_scan_node is flaky

2020-08-06 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-10037.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> BytesRead check in TestMtDopScanNode.test_mt_dop_scan_node is flaky
> ---
>
> Key: IMPALA-10037
> URL: https://issues.apache.org/jira/browse/IMPALA-10037
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Critical
>  Labels: broken-build, flaky
> Fix For: Impala 4.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

