[jira] [Commented] (TEZ-4540) Reading proto data more than 2GB from multiple splits fails

2024-06-20 Thread Jira


[ 
https://issues.apache.org/jira/browse/TEZ-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856507#comment-17856507
 ] 

László Bodor commented on TEZ-4540:
---

merged to master, thanks [~Aggarwal_Raghav] for the patch and [~zabetak] for 
the review!

> Reading proto data more than 2GB from multiple splits fails
> ---
>
> Key: TEZ-4540
> URL: https://issues.apache.org/jira/browse/TEZ-4540
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.10.2
>Reporter: Raghav Aggarwal
>Assignee: Raghav Aggarwal
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Refer to this: HIVE-28026 and https://github.com/apache/hive/pull/5033



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (TEZ-4540) Reading proto data more than 2GB from multiple splits fails

2024-06-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor resolved TEZ-4540.
---
Resolution: Fixed

> Reading proto data more than 2GB from multiple splits fails
> ---
>
> Key: TEZ-4540
> URL: https://issues.apache.org/jira/browse/TEZ-4540
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.10.2
>Reporter: Raghav Aggarwal
>Assignee: Raghav Aggarwal
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Refer to this: HIVE-28026 and https://github.com/apache/hive/pull/5033





[jira] [Commented] (TEZ-4554) Counter for used/all nodes within a DAG

2024-06-20 Thread Jira


[ 
https://issues.apache.org/jira/browse/TEZ-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856408#comment-17856408
 ] 

László Bodor commented on TEZ-4554:
---

https://github.com/apache/tez/pull/362

> Counter for used/all nodes within a DAG
> ---
>
> Key: TEZ-4554
> URL: https://issues.apache.org/jira/browse/TEZ-4554
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> This is the Tez container node counter corresponding to HIVE-28201.
> Since Tez containers are ephemeral (unlike LLAP nodes), it doesn't make 
> sense to distinguish between all nodes and used nodes; we count every 
> node that ran at least one task attempt:
> NODES_USED_COUNT
> The number of used containers has already been implemented in TEZ-2119.
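
A minimal sketch of the counting rule described above, assuming nodes are identified by a string ID. The class and method names are hypothetical and are not taken from the Tez code base or from the linked pull request; only the NODES_USED_COUNT name comes from the issue.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical tracker, not Tez code: counts distinct nodes that ran at least one task attempt.
class NodesUsedTracker {
  private final Set<String> nodesUsed = new HashSet<>();

  // Record that a task attempt ran on the given node (the string ID is an assumption).
  void onTaskAttemptStarted(String nodeId) {
    nodesUsed.add(nodeId);
  }

  // The value a counter such as NODES_USED_COUNT would report under this reading.
  int nodesUsedCount() {
    return nodesUsed.size();
  }

  public static void main(String[] args) {
    NodesUsedTracker tracker = new NodesUsedTracker();
    tracker.onTaskAttemptStarted("node-a");
    tracker.onTaskAttemptStarted("node-a"); // same node again, counted once
    tracker.onTaskAttemptStarted("node-b");
    System.out.println(tracker.nodesUsedCount()); // prints 2
  }
}
```

A set is used so that several attempts on the same node count once, which matches "nodes that ran at least 1 task attempt".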





[jira] [Updated] (TEZ-4554) Counter for used/all nodes within a DAG

2024-06-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4554:
--
Summary: Counter for used/all nodes within a DAG  (was: Counter for used 
nodes within a DAG)

> Counter for used/all nodes within a DAG
> ---
>
> Key: TEZ-4554
> URL: https://issues.apache.org/jira/browse/TEZ-4554
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> This is the Tez container node counter corresponding to HIVE-28201.
> Since Tez containers are ephemeral (unlike LLAP nodes), it doesn't make 
> sense to distinguish between all nodes and used nodes; we count every 
> node that ran at least one task attempt:
> NODES_USED_COUNT
> The number of used containers has already been implemented in TEZ-2119.





[jira] [Assigned] (TEZ-4554) Counter for used nodes within a DAG

2024-06-19 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned TEZ-4554:
-

Assignee: László Bodor

> Counter for used nodes within a DAG
> ---
>
> Key: TEZ-4554
> URL: https://issues.apache.org/jira/browse/TEZ-4554
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> This is the Tez container node counter corresponding to HIVE-28201.
> Since Tez containers are ephemeral (unlike LLAP nodes), it doesn't make 
> sense to distinguish between all nodes and used nodes; we count every 
> node that ran at least one task attempt:
> NODES_USED_COUNT
> The number of used containers has already been implemented in TEZ-2119.





[jira] [Assigned] (TEZ-4570) Implement data-via-events for ordered outputs

2024-06-18 Thread Jonathan Turner Eagles (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Turner Eagles reassigned TEZ-4570:
---

Assignee: Jonathan Turner Eagles

> Implement data-via-events for ordered outputs
> -
>
> Key: TEZ-4570
> URL: https://issues.apache.org/jira/browse/TEZ-4570
> Project: Apache Tez
>  Issue Type: New Feature
>Reporter: Jonathan Turner Eagles
>Assignee: Jonathan Turner Eagles
>Priority: Major
>
> Currently, data-via-events is only implemented by the unordered outputs and 
> unordered fetch.





[jira] [Commented] (TEZ-4569) SCATTER_GATHER + BROADCAST hangs on DAG Recovery

2024-06-16 Thread Shohei Okumiya (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17855389#comment-17855389
 ] 

Shohei Okumiya commented on TEZ-4569:
-

We have another discussion here.

https://lists.apache.org/thread/q7cnz81k39wzd29hrp08o5vohbrdlhk2

> SCATTER_GATHER + BROADCAST hangs on DAG Recovery
> 
>
> Key: TEZ-4569
> URL: https://issues.apache.org/jira/browse/TEZ-4569
> Project: Apache Tez
>  Issue Type: Improvement
>Affects Versions: 0.9.2, 0.10.3
>Reporter: Shohei Okumiya
>Assignee: Shohei Okumiya
>Priority: Major
> Attachments: image-2024-06-11-20-45-12-540.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> A Tez DAG fails to initialize when the Application Master is preempted at a 
> particular point in the DAG's execution.
> 
> The problem typically happens with a Hive Map Join (Broadcast Hash Join) when 
> the broadcast edge is multi-staged. In the following case, the smaller side 
> includes one aggregation, so the condition is satisfied.
> 
> {code:java}
> CREATE TABLE small AS SELECT 1 AS id;
> CREATE TABLE big AS SELECT 1 AS id UNION ALL SELECT 2 AS id UNION ALL SELECT 3 AS id;
> SELECT *
> FROM big
> JOIN (SELECT id, count(*) AS num FROM small GROUP BY id) s ON big.id = s.id;
> {code}
> Once it happens, the retried AM fails to configure the Map Join vertex. In the 
> following case, Map 1 never starts.
> 
> {code:java}
> --
>         VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> --
> Map 2 ..         container     SUCCEEDED      1          1        0        0       0       1
> Reducer 3 ..     container     SUCCEEDED      1          1        0        0       0       0
> Map 1            container  INITIALIZING     -1          0        0       -1       0       0
> --
> {code}
> Tez starts Map 2 and Map 1 once their splits are configured. The hang issue 
> happens when an AM is retried before it starts Reducer 3.
> !image-2024-06-11-20-45-12-540.png!
>  





[jira] [Created] (TEZ-4571) Shared fetch enabled fetches all partitions on task 0000s directly to disk for non-broadcast edges

2024-06-14 Thread Jonathan Turner Eagles (Jira)
Jonathan Turner Eagles created TEZ-4571:
---

 Summary: Shared fetch enabled fetches all partitions on task 0000s 
directly to disk for non-broadcast edges
 Key: TEZ-4571
 URL: https://issues.apache.org/jira/browse/TEZ-4571
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Jonathan Turner Eagles


I think the idea of shared fetch is to fetch once per node for broadcast input. 
However, the enabled check in the fetcher doesn't consider the edge type; it 
only checks that 1) shared fetch is enabled and 2) task for the vertex is . 
For a broadcast edge this is perhaps correct, but for non-broadcast edges, all 
partitions are fetched to disk with no possibility of sharing.
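
One possible shape for a guard that also looks at the edge type, as a sketch only. The issue does not propose a concrete fix, and every name below is hypothetical rather than part of Tez's fetcher API; the second existing condition is passed in as an opaque boolean because the issue text does not spell it out here.

```java
// Hypothetical model of the guard; these names are not Tez's fetcher API.
class SharedFetchGuard {
  // Assumption: only the broadcast / non-broadcast distinction matters for this check.
  enum EdgeKind { BROADCAST, OTHER }

  // existingTaskCondition stands for condition 2) in the description, whatever it evaluates to.
  static boolean useSharedFetch(boolean sharedFetchEnabled,
                                boolean existingTaskCondition,
                                EdgeKind edgeKind) {
    return sharedFetchEnabled
        && existingTaskCondition
        && edgeKind == EdgeKind.BROADCAST; // the extra check the description says is missing
  }

  public static void main(String[] args) {
    System.out.println(useSharedFetch(true, true, EdgeKind.BROADCAST)); // prints true
    System.out.println(useSharedFetch(true, true, EdgeKind.OTHER));     // prints false
  }
}
```

With such a check, non-broadcast edges would fall back to the regular fetch path instead of writing every partition to disk.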





[jira] [Created] (TEZ-4570) Implement data-via-events for ordered outputs

2024-06-14 Thread Jonathan Turner Eagles (Jira)
Jonathan Turner Eagles created TEZ-4570:
---

 Summary: Implement data-via-events for ordered outputs
 Key: TEZ-4570
 URL: https://issues.apache.org/jira/browse/TEZ-4570
 Project: Apache Tez
  Issue Type: New Feature
Reporter: Jonathan Turner Eagles


Currently, data-via-events is only implemented by the unordered outputs and 
unordered fetch.





[jira] [Updated] (TEZ-4569) SCATTER_GATHER + BROADCAST hangs on DAG Recovery

2024-06-12 Thread Shohei Okumiya (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shohei Okumiya updated TEZ-4569:

Affects Version/s: 0.9.2

> SCATTER_GATHER + BROADCAST hangs on DAG Recovery
> 
>
> Key: TEZ-4569
> URL: https://issues.apache.org/jira/browse/TEZ-4569
> Project: Apache Tez
>  Issue Type: Improvement
>Affects Versions: 0.9.2, 0.10.3
>Reporter: Shohei Okumiya
>Assignee: Shohei Okumiya
>Priority: Major
> Attachments: image-2024-06-11-20-45-12-540.png
>
>
> A Tez DAG fails to initialize when the Application Master is preempted at a 
> particular point in the DAG's execution.
> 
> The problem typically happens with a Hive Map Join (Broadcast Hash Join) when 
> the broadcast edge is multi-staged. In the following case, the smaller side 
> includes one aggregation, so the condition is satisfied.
> 
> {code:java}
> CREATE TABLE small AS SELECT 1 AS id;
> CREATE TABLE big AS SELECT 1 AS id UNION ALL SELECT 2 AS id UNION ALL SELECT 3 AS id;
> SELECT *
> FROM big
> JOIN (SELECT id, count(*) AS num FROM small GROUP BY id) s ON big.id = s.id;
> {code}
> Once it happens, the retried AM fails to configure the Map Join vertex. In the 
> following case, Map 1 never starts.
> 
> {code:java}
> --
>         VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> --
> Map 2 ..         container     SUCCEEDED      1          1        0        0       0       1
> Reducer 3 ..     container     SUCCEEDED      1          1        0        0       0       0
> Map 1            container  INITIALIZING     -1          0        0       -1       0       0
> --
> {code}
> Tez starts Map 2 and Map 1 once their splits are configured. The hang issue 
> happens when an AM is retried before it starts Reducer 3.
> !image-2024-06-11-20-45-12-540.png!
>  





[jira] [Commented] (TEZ-4569) SCATTER_GATHER + BROADCAST hangs on DAG Recovery

2024-06-11 Thread Shohei Okumiya (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17854034#comment-17854034
 ] 

Shohei Okumiya commented on TEZ-4569:
-

From a quick check, this part looks suspicious. The validation assumes all 
predecessors have been configured when any successor starts. If a vertex 
accepts both a broadcast edge and a data source, the assumption could be wrong.

[https://github.com/apache/tez/blob/rel/release-0.10.3/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java#L2846]

I see two possible approaches:
 # Don't start any successors unless all predecessors have been initialized.
 # Correctly restore the state of the AM even when a vertex false-starts 
itself.
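
To make the first approach concrete, here is a minimal sketch of a start gate, assuming the DAG is given as a map from each vertex to its predecessors. This is not how VertexImpl is structured; the class, the method, and the data layout are all hypothetical, and the vertex names are borrowed from the reproduction test only for readability.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical gate, not Tez code: a vertex may start only when all its predecessors are initialized.
class StartGate {
  static boolean canStart(String vertex,
                          Map<String, List<String>> predecessors,
                          Set<String> initialized) {
    for (String predecessor : predecessors.getOrDefault(vertex, List.of())) {
      if (!initialized.contains(predecessor)) {
        return false; // at least one predecessor is not initialized yet
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Assumed topology: TableScan -> Aggregation -> MapJoin.
    Map<String, List<String>> predecessors = Map.of(
        "Aggregation", List.of("TableScan"),
        "MapJoin", List.of("Aggregation"));
    Set<String> initialized = Set.of("TableScan");

    System.out.println(canStart("Aggregation", predecessors, initialized)); // prints true
    System.out.println(canStart("MapJoin", predecessors, initialized));     // prints false
  }
}
```

The second approach would instead leave the start rule alone and focus on what the recovered AM reconstructs, which a predicate like this does not capture.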

> SCATTER_GATHER + BROADCAST hangs on DAG Recovery
> 
>
> Key: TEZ-4569
> URL: https://issues.apache.org/jira/browse/TEZ-4569
> Project: Apache Tez
>  Issue Type: Improvement
>Affects Versions: 0.10.3
>Reporter: Shohei Okumiya
>Assignee: Shohei Okumiya
>Priority: Major
> Attachments: image-2024-06-11-20-45-12-540.png
>
>
> A Tez DAG fails to initialize when the Application Master is preempted at a 
> particular point in the DAG's execution.
> 
> The problem typically happens with a Hive Map Join (Broadcast Hash Join) when 
> the broadcast edge is multi-staged. In the following case, the smaller side 
> includes one aggregation, so the condition is satisfied.
> 
> {code:java}
> CREATE TABLE small AS SELECT 1 AS id;
> CREATE TABLE big AS SELECT 1 AS id UNION ALL SELECT 2 AS id UNION ALL SELECT 3 AS id;
> SELECT *
> FROM big
> JOIN (SELECT id, count(*) AS num FROM small GROUP BY id) s ON big.id = s.id;
> {code}
> Once it happens, the retried AM fails to configure the Map Join vertex. In the 
> following case, Map 1 never starts.
> 
> {code:java}
> --
>         VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> --
> Map 2 ..         container     SUCCEEDED      1          1        0        0       0       1
> Reducer 3 ..     container     SUCCEEDED      1          1        0        0       0       0
> Map 1            container  INITIALIZING     -1          0        0       -1       0       0
> --
> {code}
> Tez starts Map 2 and Map 1 once their splits are configured. The hang issue 
> happens when an AM is retried before it starts Reducer 3.
> !image-2024-06-11-20-45-12-540.png!
>  





[jira] [Commented] (TEZ-4569) SCATTER_GATHER + BROADCAST hangs on DAG Recovery

2024-06-11 Thread Shohei Okumiya (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17854024#comment-17854024
 ] 

Shohei Okumiya commented on TEZ-4569:
-

I created a test case to reproduce the issue first. The 
[testTableScanTemporalFailure|https://github.com/okumin/tez/commit/deac035274bd0b958fbfdf3557dc7120c16fddc5#diff-ad65a331fa51a07f3cc5301ca7df09c199e9730a6f889bc8b1859554ccfc0519R199-R217]
 is the most straightforward reproduction.
{code:java}
2024-06-11 20:18:55,719 INFO  [Time-limited test] client.DAGClientImpl (DAGClientImpl.java:log(709)) - DAG: State: RUNNING Progress: 200% TotalTasks: 1 Succeeded: 2 Running: 0 Failed: 0 Killed: 0 KilledTaskAttempts: 1
2024-06-11 20:18:55,720 INFO  [Time-limited test] client.DAGClientImpl (DAGClientImpl.java:log(709)) -     VertexStatus: VertexName: TableScan Progress: 100% TotalTasks: 1 Succeeded: 1 Running: 0 Failed: 0 Killed: 0 KilledTaskAttempts: 1
2024-06-11 20:18:55,721 INFO  [Time-limited test] client.DAGClientImpl (DAGClientImpl.java:log(709)) -     VertexStatus: VertexName: Aggregation Progress: 100% TotalTasks: 1 Succeeded: 1 Running: 0 Failed: 0 Killed: 0
2024-06-11 20:18:55,721 INFO  [Time-limited test] client.DAGClientImpl (DAGClientImpl.java:log(709)) -     VertexStatus: VertexName: MapJoin Progress: 0% TotalTasks: -1 Succeeded: 0 Running: 0 Failed: 0 Killed: 0
2024-06-11 20:19:00,756 INFO  [Time-limited test] client.DAGClientImpl (DAGClientImpl.java:log(709)) - DAG: State: RUNNING Progress: 200% TotalTasks: 1 Succeeded: 2 Running: 0 Failed: 0 Killed: 0 KilledTaskAttempts: 1
2024-06-11 20:19:00,757 INFO  [Time-limited test] client.DAGClientImpl (DAGClientImpl.java:log(709)) -     VertexStatus: VertexName: TableScan Progress: 100% TotalTasks: 1 Succeeded: 1 Running: 0 Failed: 0 Killed: 0 KilledTaskAttempts: 1
2024-06-11 20:19:00,757 INFO  [Time-limited test] client.DAGClientImpl (DAGClientImpl.java:log(709)) -     VertexStatus: VertexName: Aggregation Progress: 100% TotalTasks: 1 Succeeded: 1 Running: 0 Failed: 0 Killed: 0
2024-06-11 20:19:00,758 INFO  [Time-limited test] client.DAGClientImpl (DAGClientImpl.java:log(709)) -     VertexStatus: VertexName: MapJoin Progress: 0% TotalTasks: -1 Succeeded: 0 Running: 0 Failed: 0 Killed: 0
{code}

> SCATTER_GATHER + BROADCAST hangs on DAG Recovery
> 
>
> Key: TEZ-4569
> URL: https://issues.apache.org/jira/browse/TEZ-4569
> Project: Apache Tez
>  Issue Type: Improvement
>Affects Versions: 0.10.3
>Reporter: Shohei Okumiya
>Assignee: Shohei Okumiya
>Priority: Major
> Attachments: image-2024-06-11-20-45-12-540.png
>
>
> A Tez DAG fails to initialize when the Application Master is preempted at a 
> particular point in the DAG's execution.
> 
> The problem typically happens with a Hive Map Join (Broadcast Hash Join) when 
> the broadcast edge is multi-staged. In the following case, the smaller side 
> includes one aggregation, so the condition is satisfied.
> 
> {code:java}
> CREATE TABLE small AS SELECT 1 AS id;
> CREATE TABLE big AS SELECT 1 AS id UNION ALL SELECT 2 AS id UNION ALL SELECT 3 AS id;
> SELECT *
> FROM big
> JOIN (SELECT id, count(*) AS num FROM small GROUP BY id) s ON big.id = s.id;
> {code}
> Once it happens, the retried AM fails to configure the Map Join vertex. In the 
> following case, Map 1 never starts.
> 
> {code:java}
> --
>         VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> --
> Map 2 ..         container     SUCCEEDED      1          1        0        0       0       1
> Reducer 3 ..     container     SUCCEEDED      1          1        0        0       0       0
> Map 1            container  INITIALIZING     -1          0        0       -1       0       0
> --
> {code}
> Tez starts Map 2 and Map 1 once their splits are configured. The hang issue 
> happens when an AM is retried before it starts Reducer 3.
> !image-2024-06-11-20-45-12-540.png!
>  





[jira] [Created] (TEZ-4569) SCATTER_GATHER + BROADCAST hangs on DAG Recovery

2024-06-11 Thread Shohei Okumiya (Jira)
Shohei Okumiya created TEZ-4569:
---

 Summary: SCATTER_GATHER + BROADCAST hangs on DAG Recovery
 Key: TEZ-4569
 URL: https://issues.apache.org/jira/browse/TEZ-4569
 Project: Apache Tez
  Issue Type: Improvement
Affects Versions: 0.10.3
Reporter: Shohei Okumiya
Assignee: Shohei Okumiya
 Attachments: image-2024-06-11-20-45-12-540.png

A Tez DAG fails to initialize when the Application Master is preempted at a 
particular point in the DAG's execution.

The problem typically happens with a Hive Map Join (Broadcast Hash Join) when 
the broadcast edge is multi-staged. In the following case, the smaller side 
includes one aggregation, so the condition is satisfied.

{code:java}
CREATE TABLE small AS SELECT 1 AS id;
CREATE TABLE big AS SELECT 1 AS id UNION ALL SELECT 2 AS id UNION ALL SELECT 3 AS id;
SELECT *
FROM big
JOIN (SELECT id, count(*) AS num FROM small GROUP BY id) s ON big.id = s.id;
{code}
Once it happens, the retried AM fails to configure the Map Join vertex. In the 
following case, Map 1 never starts.

{code:java}
--
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--
Map 2 ..         container     SUCCEEDED      1          1        0        0       0       1
Reducer 3 ..     container     SUCCEEDED      1          1        0        0       0       0
Map 1            container  INITIALIZING     -1          0        0       -1       0       0
--
{code}
Tez starts Map 2 and Map 1 once their splits are configured. The hang issue 
happens when an AM is retried before it starts Reducer 3.

!image-2024-06-11-20-45-12-540.png!

 





[jira] [Resolved] (TEZ-4568) ProfileServlet: add html to output formats and prepare for profiler 3.0

2024-06-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor resolved TEZ-4568.
---
Resolution: Fixed

> ProfileServlet: add html to output formats and prepare for profiler 3.0
> ---
>
> Key: TEZ-4568
> URL: https://issues.apache.org/jira/browse/TEZ-4568
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> this is the same as HIVE-28305
> https://github.com/apache/tez/blob/38c5aaccdf8e4f7db210975021c78b6db556c87f/tez-common/src/main/java/org/apache/tez/common/web/ProfileServlet.java#L143
> in recent versions of async-profiler, SVG is not accepted at all, and 
> unfortunately, HTML cannot even be chosen due to a strict parse:
> https://github.com/apache/tez/blob/38c5aaccdf8e4f7db210975021c78b6db556c87f/tez-common/src/main/java/org/apache/tez/common/web/ProfileServlet.java#L353
> for backward compatibility, SVG is fine, but HTML should be added to the enum
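
A small sketch of why a strict enum parse rejects a value that the enum does not list, and how adding the constant changes that. The enum below is an illustrative subset invented for this example; the real Output enum in ProfileServlet may have different members, and the fallback behaviour shown is an assumption, not the servlet's actual logic.

```java
import java.util.Locale;

// Hypothetical stand-in for the servlet's output handling; not the actual ProfileServlet code.
class ProfileOutputSketch {
  // Illustrative subset: SVG kept for backward compatibility, HTML newly added.
  enum Output { SVG, HTML }

  // Case-insensitive parse that falls back instead of failing on unknown values.
  static Output parse(String raw, Output fallback) {
    if (raw == null) {
      return fallback;
    }
    try {
      return Output.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException e) {
      return fallback; // the value is not a constant of the enum
    }
  }

  public static void main(String[] args) {
    System.out.println(parse("html", Output.SVG)); // prints HTML
    System.out.println(parse("png", Output.SVG));  // prints SVG
  }
}
```

Without the HTML constant, `valueOf("HTML")` would throw and the request could never select that format, which is the situation the description refers to.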





[jira] [Commented] (TEZ-4568) ProfileServlet: add html to output formats and prepare for profiler 3.0

2024-06-06 Thread Jira


[ 
https://issues.apache.org/jira/browse/TEZ-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852720#comment-17852720
 ] 

László Bodor commented on TEZ-4568:
---

merged to master, thanks [~ayushtkn] for the review!

> ProfileServlet: add html to output formats and prepare for profiler 3.0
> ---
>
> Key: TEZ-4568
> URL: https://issues.apache.org/jira/browse/TEZ-4568
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> this is the same as HIVE-28305
> https://github.com/apache/tez/blob/38c5aaccdf8e4f7db210975021c78b6db556c87f/tez-common/src/main/java/org/apache/tez/common/web/ProfileServlet.java#L143
> in recent versions of async-profiler, SVG is not accepted at all, and 
> unfortunately, HTML cannot even be chosen due to a strict parse:
> https://github.com/apache/tez/blob/38c5aaccdf8e4f7db210975021c78b6db556c87f/tez-common/src/main/java/org/apache/tez/common/web/ProfileServlet.java#L353
> for backward compatibility, SVG is fine, but HTML should be added to the enum





[jira] [Updated] (TEZ-4568) ProfileServlet: add html to output formats and prepare for profiler 3.0

2024-06-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4568:
--
Summary: ProfileServlet: add html to output formats and prepare for 
profiler 3.0  (was: ProfileServlet: add html to output formats)

> ProfileServlet: add html to output formats and prepare for profiler 3.0
> ---
>
> Key: TEZ-4568
> URL: https://issues.apache.org/jira/browse/TEZ-4568
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> this is the same as HIVE-28305
> https://github.com/apache/tez/blob/38c5aaccdf8e4f7db210975021c78b6db556c87f/tez-common/src/main/java/org/apache/tez/common/web/ProfileServlet.java#L143
> in recent versions of async-profiler, SVG is not accepted at all, and 
> unfortunately, HTML cannot even be chosen due to a strict parse:
> https://github.com/apache/tez/blob/38c5aaccdf8e4f7db210975021c78b6db556c87f/tez-common/src/main/java/org/apache/tez/common/web/ProfileServlet.java#L353
> for backward compatibility, SVG is fine, but HTML should be added to the enum





[jira] [Updated] (TEZ-4568) ProfileServlet: add html to output formats and prepare for profiler 3.0

2024-06-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4568:
--
Fix Version/s: 0.10.4

> ProfileServlet: add html to output formats and prepare for profiler 3.0
> ---
>
> Key: TEZ-4568
> URL: https://issues.apache.org/jira/browse/TEZ-4568
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> this is the same as HIVE-28305
> https://github.com/apache/tez/blob/38c5aaccdf8e4f7db210975021c78b6db556c87f/tez-common/src/main/java/org/apache/tez/common/web/ProfileServlet.java#L143
> in recent versions of async-profiler, SVG is not accepted at all, and 
> unfortunately, HTML cannot even be chosen due to a strict parse:
> https://github.com/apache/tez/blob/38c5aaccdf8e4f7db210975021c78b6db556c87f/tez-common/src/main/java/org/apache/tez/common/web/ProfileServlet.java#L353
> for backward compatibility, SVG is fine, but HTML should be added to the enum





[jira] [Updated] (TEZ-4568) ProfileServlet: add html to output formats

2024-06-05 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4568:
--
Description: 
this is the same as HIVE-28305

https://github.com/apache/tez/blob/38c5aaccdf8e4f7db210975021c78b6db556c87f/tez-common/src/main/java/org/apache/tez/common/web/ProfileServlet.java#L143

in recent versions of async-profiler, SVG is not accepted at all, and 
unfortunately, HTML cannot even be chosen due to a strict parse:
https://github.com/apache/tez/blob/38c5aaccdf8e4f7db210975021c78b6db556c87f/tez-common/src/main/java/org/apache/tez/common/web/ProfileServlet.java#L353

for backward compatibility, SVG is fine, but HTML should be added to the enum

> ProfileServlet: add html to output formats
> --
>
> Key: TEZ-4568
> URL: https://issues.apache.org/jira/browse/TEZ-4568
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> this is the same as HIVE-28305
> https://github.com/apache/tez/blob/38c5aaccdf8e4f7db210975021c78b6db556c87f/tez-common/src/main/java/org/apache/tez/common/web/ProfileServlet.java#L143
> in recent versions of async-profiler, SVG is not accepted at all, and 
> unfortunately, HTML cannot even be chosen due to a strict parse:
> https://github.com/apache/tez/blob/38c5aaccdf8e4f7db210975021c78b6db556c87f/tez-common/src/main/java/org/apache/tez/common/web/ProfileServlet.java#L353
> for backward compatibility, SVG is fine, but HTML should be added to the enum





[jira] [Assigned] (TEZ-4568) ProfileServlet: add html to output formats

2024-06-05 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned TEZ-4568:
-

Assignee: László Bodor

> ProfileServlet: add html to output formats
> --
>
> Key: TEZ-4568
> URL: https://issues.apache.org/jira/browse/TEZ-4568
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>






[jira] [Created] (TEZ-4568) ProfileServlet: add html to output formats

2024-06-05 Thread Jira
László Bodor created TEZ-4568:
-

 Summary: ProfileServlet: add html to output formats
 Key: TEZ-4568
 URL: https://issues.apache.org/jira/browse/TEZ-4568
 Project: Apache Tez
  Issue Type: Bug
Reporter: László Bodor








[jira] [Commented] (TEZ-4567) Failed to load Lz4 codec

2024-05-28 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17850257#comment-17850257
 ] 

Bilwa S T commented on TEZ-4567:


cc [~ayushsaxena] [~abstractdog] 

> Failed to load Lz4 codec
> 
>
> Key: TEZ-4567
> URL: https://issues.apache.org/jira/browse/TEZ-4567
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.10.3
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
>
> We currently use Hadoop 3.3.6. As of HADOOP-17292, Lz4 is a provided 
> dependency in Hadoop Common 3.3.1 for Lz4Codec, so we need to add the 
> dependency in Tez as well. Otherwise, we get the exception below when we 
> run a Hive job on Tez:
> {code:java}
> Caused by: java.lang.NoClassDefFoundError: net/jpountz/lz4/LZ4Factory
>  at org.apache.hadoop.io.compress.lz4.Lz4Compressor.<init>(Lz4Compressor.java:66)
>  at org.apache.hadoop.io.compress.Lz4Codec.createCompressor(Lz4Codec.java:119)
>  at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:152)
>  at org.apache.hadoop.io.compress.CompressionCodec$Util.createOutputStreamWithCodecPool(CompressionCodec.java:131)
>  at org.apache.hadoop.io.compress.Lz4Codec.createOutputStream(Lz4Codec.java:70)
>  at org.apache.hadoop.hive.ql.exec.Utilities.createCompressedStream(Utilities.java:949)
>  at org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter(HiveIgnoreKeyTextOutputFormat.java:80)
>  at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:297)
>  at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:282)
>  at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:801)
>  at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:752)
>  at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:922)
>  at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
>  at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
>  at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
>  at org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:133)
>  at org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:45)
>  at org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:110)
>  at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFInline.process(GenericUDTFInline.java:64)
>  at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
>  at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
>  at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
>  at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
>  at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>  at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:154)
>  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:556)
>  at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>  ... 19 more
> Caused by: java.lang.ClassNotFoundException: net.jpountz.lz4.LZ4Factory
>  at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
>  at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
>  at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
>  ... 53 more
> {code}
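
A quick way to confirm whether the class named in the trace is visible to a given JVM, as a minimal sketch. The helper class is hypothetical and only uses the standard `Class.forName` call; the class name `net.jpountz.lz4.LZ4Factory` is taken from the exception above. Where this would be run (for example inside a task container) is left open.

```java
// Hypothetical diagnostic helper, not part of Tez or Hadoop.
class Lz4ClasspathCheck {
  // Returns true if the named class can be located by this class's loader, without initializing it.
  static boolean isOnClasspath(String className) {
    try {
      Class.forName(className, false, Lz4ClasspathCheck.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // The result depends on whether the Lz4 jar is on the classpath of this JVM.
    System.out.println("LZ4Factory present: " + isOnClasspath("net.jpountz.lz4.LZ4Factory"));
  }
}
```

If the check reports false on the classpath used by the tasks, that would be consistent with the NoClassDefFoundError in the trace.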





[jira] [Created] (TEZ-4567) Failed to load Lz4 codec

2024-05-28 Thread Bilwa S T (Jira)
Bilwa S T created TEZ-4567:
--

 Summary: Failed to load Lz4 codec
 Key: TEZ-4567
 URL: https://issues.apache.org/jira/browse/TEZ-4567
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.10.3
Reporter: Bilwa S T
Assignee: Bilwa S T


We currently use Hadoop 3.3.6. As of HADOOP-17292, Lz4 is a provided 
dependency in Hadoop Common 3.3.1 for Lz4Codec, so we need to add the 
dependency in Tez as well. Otherwise, we get the exception below when we 
run a Hive job on Tez:
{code:java}
Caused by: java.lang.NoClassDefFoundError: net/jpountz/lz4/LZ4Factory
    at org.apache.hadoop.io.compress.lz4.Lz4Compressor.<init>(Lz4Compressor.java:66)
    at org.apache.hadoop.io.compress.Lz4Codec.createCompressor(Lz4Codec.java:119)
    at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:152)
    at org.apache.hadoop.io.compress.CompressionCodec$Util.createOutputStreamWithCodecPool(CompressionCodec.java:131)
    at org.apache.hadoop.io.compress.Lz4Codec.createOutputStream(Lz4Codec.java:70)
    at org.apache.hadoop.hive.ql.exec.Utilities.createCompressedStream(Utilities.java:949)
    at org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter(HiveIgnoreKeyTextOutputFormat.java:80)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:297)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:282)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:801)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:752)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:922)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:133)
    at org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:45)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:110)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFInline.process(GenericUDTFInline.java:64)
    at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:154)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:556)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
    ... 19 more
Caused by: java.lang.ClassNotFoundException: net.jpountz.lz4.LZ4Factory
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
    ... 53 more
{code}
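
The trace above bottoms out in a ClassNotFoundException for 
net.jpountz.lz4.LZ4Factory, i.e. the class is simply absent from the task's 
classpath. A minimal sketch of a classpath probe that could confirm this 
before running a job is shown below. The class name Lz4ClasspathCheck and the 
helper isLoadable are hypothetical names used for illustration only; they are 
not part of Tez, Hive, or Hadoop.

```java
class Lz4ClasspathCheck {

    /**
     * Returns true if the named class can be found by this class's loader.
     * The class is not initialized (second argument is false), so the probe
     * has no side effects beyond class loading.
     */
    public static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, Lz4ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // ClassNotFoundException: no such class on the classpath.
            // LinkageError (e.g. NoClassDefFoundError): class found but unusable.
            return false;
        }
    }

    public static void main(String[] args) {
        // The class named in the stack trace above.
        String lz4Factory = "net.jpountz.lz4.LZ4Factory";
        System.out.println(lz4Factory + " loadable: " + isLoadable(lz4Factory));
    }
}
```

Run with the same classpath as the failing task, a result of "false" would be 
consistent with the missing dependency described in this issue.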





[jira] [Updated] (TEZ-4567) Failed to load Lz4 codec

2024-05-28 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated TEZ-4567:
---
Description: 
We currently use Hadoop 3.3.6. As part of HADOOP-17292, Lz4 became a provided 
dependency in Hadoop Common 3.3.1 for Lz4Codec, so we need to add the 
dependency in Tez as well. Otherwise, running a Hive job on Tez fails with the 
exception below:
{code:java}
Caused by: java.lang.NoClassDefFoundError: net/jpountz/lz4/LZ4Factory
    at org.apache.hadoop.io.compress.lz4.Lz4Compressor.<init>(Lz4Compressor.java:66)
    at org.apache.hadoop.io.compress.Lz4Codec.createCompressor(Lz4Codec.java:119)
    at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:152)
    at org.apache.hadoop.io.compress.CompressionCodec$Util.createOutputStreamWithCodecPool(CompressionCodec.java:131)
    at org.apache.hadoop.io.compress.Lz4Codec.createOutputStream(Lz4Codec.java:70)
    at org.apache.hadoop.hive.ql.exec.Utilities.createCompressedStream(Utilities.java:949)
    at org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter(HiveIgnoreKeyTextOutputFormat.java:80)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:297)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:282)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:801)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:752)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:922)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:133)
    at org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:45)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:110)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFInline.process(GenericUDTFInline.java:64)
    at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:154)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:556)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
    ... 19 more
Caused by: java.lang.ClassNotFoundException: net.jpountz.lz4.LZ4Factory
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
    ... 53 more
{code}

  was:
Currently we use hadoop 3.3.6 version of hadoop. As part of this Jira 
HADOOP-17292, Lz4 is a provided dependency in Hadoop Common 3.3.1 for Lz4Codec, 
so we need to add the dependency in tez as well. Otherwise we get the below 
exception when we run hive job on tez
{code:java}
Caused by: java.lang.NoClassDefFoundError: net/jpountz/lz4/LZ4Factory
    at org.apache.hadoop.io.compress.lz4.Lz4Compressor.<init>(Lz4Compressor.java:66)
    at org.apache.hadoop.io.compress.Lz4Codec.createCompressor(Lz4Codec.java:119)
    at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:152)
    at org.apache.hadoop.io.compress.CompressionCodec$Util.createOutputStreamWithCodecPool(CompressionCodec.java:131)
    at org.apache.hadoop.io.compress.Lz4Codec.createOutputStream(Lz4Codec.java:70)
    at org.apache.hadoop.hive.ql.exec.Utilities.createCompressedStream(Utilities.java:949)
    at org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter(HiveIgnoreKeyTextOutputFormat.java:80)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:297)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:282

[jira] [Commented] (TEZ-4564) TezClient to expose Tez AM host:port

2024-05-27 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849884#comment-17849884
 ] 

Ayush Saxena commented on TEZ-4564:
---

Committed to master.
Thanx [~abstractdog] for the contribution!!!

> TezClient to expose Tez AM host:port
> 
>
> Key: TEZ-4564
> URL: https://issues.apache.org/jira/browse/TEZ-4564
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently, there is no easy way to retrieve the AM's host and (RPC) port from 
> a TezClient (or even a DagClient). While implementing HIVE-28095, I came to 
> think this would be useful, as we might be interested in it later for 
> query tracking/history.





[jira] [Resolved] (TEZ-4564) TezClient to expose Tez AM host:port

2024-05-27 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved TEZ-4564.
---
Resolution: Fixed

> TezClient to expose Tez AM host:port
> 
>
> Key: TEZ-4564
> URL: https://issues.apache.org/jira/browse/TEZ-4564
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently, there is no easy way to retrieve the AM's host and (RPC) port from 
> a TezClient (or even a DagClient). While implementing HIVE-28095, I came to 
> think this would be useful, as we might be interested in it later for 
> query tracking/history.





[jira] [Updated] (TEZ-4355) Unit test precommit improvements - full coverage

2024-05-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4355:
--
Summary: Unit test precommit improvements - full coverage  (was: Unit test 
precommit improvements - parallel, full coverage)

> Unit test precommit improvements - full coverage
> 
>
> Key: TEZ-4355
> URL: https://issues.apache.org/jira/browse/TEZ-4355
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> 1. What about running all unit tests in precommit? With the current precommit 
> load in the Tez project, it's worth trying (though it needs some flakiness 
> fixes first).
> 2. Run tests in splits, in parallel: two different, deterministic 
> splits could be a) the tez-tests module vs. b) all the rest.
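
The two-way split proposed in item 2 can be sketched as a deterministic 
partition of module names. This is only an illustration: the class name 
PrecommitSplits and the labels "split-a"/"split-b" are hypothetical, and the 
module list in main is a small sample (tez-tests comes from the description 
above; the others appear as jar names elsewhere in this archive), not the 
project's full module list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PrecommitSplits {

    /**
     * Partitions modules into two deterministic splits:
     * split-a holds only tez-tests, split-b holds everything else.
     * Input order is preserved, so the result is stable across runs.
     */
    public static Map<String, List<String>> split(List<String> modules) {
        List<String> splitA = new ArrayList<>();
        List<String> splitB = new ArrayList<>();
        for (String module : modules) {
            if (module.equals("tez-tests")) {
                splitA.add(module);
            } else {
                splitB.add(module);
            }
        }
        Map<String, List<String>> splits = new LinkedHashMap<>();
        splits.put("split-a", splitA);
        splits.put("split-b", splitB);
        return splits;
    }

    public static void main(String[] args) {
        // Sample module names only; an assumption for illustration.
        List<String> modules =
            List.of("tez-api", "tez-dag", "tez-runtime-internals", "tez-tests");
        split(modules).forEach((name, mods) ->
            System.out.println(name + " -> " + String.join(",", mods)));
    }
}
```

Each split is printed as a comma-separated list, which is the form Maven's 
-pl option accepts for selecting modules.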





[jira] [Commented] (TEZ-4557) Revert TEZ-4303, NoClassDefFoundError because of missing httpclient jar

2024-05-22 Thread Raghav Aggarwal (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848656#comment-17848656
 ] 

Raghav Aggarwal commented on TEZ-4557:
--

[~ayushtkn], can you provide your input here?

I don't have a Hive 4 cluster (with Ranger) to test this, but irrespective of 
that, I think the issue will occur. My understanding is that Hadoop depends on 
the httpclient jar, which was shipped transitively via hadoop-common. After the 
exclusion, this Hadoop functionality is broken unless Tez has a direct 
dependency on httpclient, which is not the case.

The exclusion from Tez would have made sense if there were two different 
versions of httpclient (one coming transitively from Hadoop and the other from 
Tez via a direct dependency).
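
To tell the two situations apart (the jar missing entirely versus two copies 
on the classpath), one option is to list every classpath location that 
provides a given class. The sketch below does this with the standard 
ClassLoader.getResources call; the class name ClassLocations and its helper 
are hypothetical and exist only for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

class ClassLocations {

    /**
     * Returns every location visible to this class's loader that contains
     * the given class file. An empty list means the class is missing; more
     * than one entry means several jars ship the same class.
     */
    public static List<URL> locations(String className) {
        String resource = className.replace('.', '/') + ".class";
        List<URL> found = new ArrayList<>();
        try {
            Enumeration<URL> urls =
                ClassLocations.class.getClassLoader().getResources(resource);
            while (urls.hasMoreElements()) {
                found.add(urls.nextElement());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        // The class named in the NoClassDefFoundError quoted in this issue.
        String uriBuilder = "org.apache.http.client.utils.URIBuilder";
        List<URL> urls = locations(uriBuilder);
        System.out.println(uriBuilder + " found in " + urls.size() + " location(s)");
        urls.forEach(u -> System.out.println("  " + u));
    }
}
```

Zero locations would match the failure reported here; two or more would be 
the version-conflict case in which an exclusion is the usual remedy.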

> Revert TEZ-4303, NoClassDefFoundError because of missing httpclient jar
> ---
>
> Key: TEZ-4557
> URL: https://issues.apache.org/jira/browse/TEZ-4557
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Raghav Aggarwal
>Assignee: Raghav Aggarwal
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Inserting data into a table located in an encryption zone using Hive on Tez 
> fails because the httpclient jar has been excluded from Hadoop's transitive 
> dependencies. The same query passes with MR.
> Tez: 0.10.2,0.10.3
> Hadoop: 3.3.6
> Hive: 3.1.2
>  
> Steps to reproduce issue:
> 1. Create an encryption key using the Ranger keyadmin user.
> 2. hdfs crypto -createZone -keyName test_key -path /user/raghav/encrypt_zone
> 3. create table tbl(id int) location '/user/raghav/encrypt_zone';
> 4. insert into tbl values(1);
>  
> Stacktrace:
> {code:java}
> Caused by: java.lang.NoClassDefFoundError: 
> org/apache/http/client/utils/URIBuilder
>     at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.createURL(KMSClientProvider.java:468)
>     at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:823)
>     at 
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:354)
>     at 
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:350)
>     at 
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:175)
>     at 
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:350)
>     at 
> org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:535)
>     at 
> org.apache.hadoop.hdfs.HdfsKMSUtil.decryptEncryptedDataEncryptionKey(HdfsKMSUtil.java:216)
>     at 
> org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:1002)
>     at 
> org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:983)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem.safelyCreateWrappedOutputStream(DistributedFileSystem.java:734)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$300(DistributedFileSystem.java:149)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:572)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:566)
>     at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:580)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:507)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1233)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1109)
>     at 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter(HiveIgnoreKeyTextOutputFormat.java:81)
>     at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:297)
>     at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:282)
>     at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:801)
>     at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:752)
>     at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:922)
>     at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
>

[jira] [Resolved] (TEZ-4566) NPE in TezChild while fetching attemptId when container is asked to shut down

2024-05-22 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved TEZ-4566.
---
Fix Version/s: 0.10.4
   Resolution: Fixed

> NPE in TezChild while fetching attemptId when container is asked to shut down
> -
>
> Key: TEZ-4566
> URL: https://issues.apache.org/jira/browse/TEZ-4566
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {noformat}
> 2024-05-21T08:50:28,800  WARN [LocalTaskExecutionThread #0] 
> common.TezUtilsInternal: Not configured with appender named: CLA. Cannot 
> reconfigure logger output
> 2024-05-21T08:50:28,800  INFO [Dispatcher thread {Central}] impl.VertexImpl: 
> Task Completion: vertex_1716306608007_0001_13_00 [Map 1], tasks=4, failed=0, 
> killed=0, success=2, completed=2, commits=0, err=null
> 2024-05-21T08:50:28,800  INFO [TezChild] task.ContainerReporter: Attempting 
> to fetch new task for container container_1716306608007_0001_00_24
> 2024-05-21T08:50:28,800  INFO [Dispatcher thread {Central}] 
> HistoryEventHandler.criticalEvents: 
> [HISTORY][DAG:dag_1716306608007_0001_13][Event:CONTAINER_STOPPED]: 
> containerId=container_1716306608007_0001_00_24, 
> stoppedTime=1716306628800, exitStatus=0
> 2024-05-21T08:50:28,800  INFO [TezChild] app.TezTaskCommunicatorImpl: 
> Container with id: container_1716306608007_0001_00_24 is valid, but no 
> longer registered, and will be killed
> 2024-05-21T08:50:28,800  INFO [TezChild] task.ContainerReporter: Got 
> TaskUpdate for containerId= container_1716306608007_0001_00_24: 0 ms 
> after starting to poll. TaskInfo: shouldDie: true
> 2024-05-21T08:50:28,800  INFO [Dispatcher thread {Central}] impl.VertexImpl: 
> Source task attempt completed for vertex: vertex_1716306608007_0001_13_01 
> [Reducer 2] attempt: attempt_1716306608007_0001_13_00_01_0 with state: 
> SUCCEEDED vertexState: RUNNING 
> 2024-05-21T08:50:28,801  INFO [LocalContainerLauncher-SubTaskRunner] 
> launcher.LocalContainerLauncher: Ignoring stop request for containerId: 
> container_1716306608007_0001_00_24
> 2024-05-21T08:50:28,800  INFO [CallbackExecutor] 
> launcher.LocalContainerLauncher: Container: 
> container_1716306608007_0001_00_24: Execution Failed:
> java.lang.NullPointerException: null
>         at org.apache.tez.runtime.task.TezChild.run(TezChild.java:252) 
> ~[tez-runtime-internals-0.10.3.jar:0.10.3]
>         at 
> org.apache.tez.dag.app.launcher.LocalContainerLauncher$1.call(LocalContainerLauncher.java:409)
>  ~[tez-dag-0.10.3.jar:0.10.3]
>         at 
> org.apache.tez.dag.app.launcher.LocalContainerLauncher$1.call(LocalContainerLauncher.java:400)
>  ~[tez-dag-0.10.3.jar:0.10.3]
>         at 
> com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)
>  ~[guava-22.0.jar:?]
>         at 
> com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)
>  ~[guava-22.0.jar:?]
>         at 
> com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75)
>  ~[guava-22.0.jar:?]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[?:1.8.0_342]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_342]
>         at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_342]
> {noformat}
> Can be reproduced by running {{TestCrudCompactorOnTez}} in the Hive code





[jira] [Commented] (TEZ-4566) NPE in TezChild while fetching attemptId when container is asked to shut down

2024-05-22 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848465#comment-17848465
 ] 

Ayush Saxena commented on TEZ-4566:
---

Committed to master.

Thanx [~abstractdog] for the review!!!

> NPE in TezChild while fetching attemptId when container is asked to shut down
> -
>
> Key: TEZ-4566
> URL: https://issues.apache.org/jira/browse/TEZ-4566
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {noformat}
> 2024-05-21T08:50:28,800  WARN [LocalTaskExecutionThread #0] 
> common.TezUtilsInternal: Not configured with appender named: CLA. Cannot 
> reconfigure logger output
> 2024-05-21T08:50:28,800  INFO [Dispatcher thread {Central}] impl.VertexImpl: 
> Task Completion: vertex_1716306608007_0001_13_00 [Map 1], tasks=4, failed=0, 
> killed=0, success=2, completed=2, commits=0, err=null
> 2024-05-21T08:50:28,800  INFO [TezChild] task.ContainerReporter: Attempting 
> to fetch new task for container container_1716306608007_0001_00_24
> 2024-05-21T08:50:28,800  INFO [Dispatcher thread {Central}] 
> HistoryEventHandler.criticalEvents: 
> [HISTORY][DAG:dag_1716306608007_0001_13][Event:CONTAINER_STOPPED]: 
> containerId=container_1716306608007_0001_00_24, 
> stoppedTime=1716306628800, exitStatus=0
> 2024-05-21T08:50:28,800  INFO [TezChild] app.TezTaskCommunicatorImpl: 
> Container with id: container_1716306608007_0001_00_24 is valid, but no 
> longer registered, and will be killed
> 2024-05-21T08:50:28,800  INFO [TezChild] task.ContainerReporter: Got 
> TaskUpdate for containerId= container_1716306608007_0001_00_24: 0 ms 
> after starting to poll. TaskInfo: shouldDie: true
> 2024-05-21T08:50:28,800  INFO [Dispatcher thread {Central}] impl.VertexImpl: 
> Source task attempt completed for vertex: vertex_1716306608007_0001_13_01 
> [Reducer 2] attempt: attempt_1716306608007_0001_13_00_01_0 with state: 
> SUCCEEDED vertexState: RUNNING 
> 2024-05-21T08:50:28,801  INFO [LocalContainerLauncher-SubTaskRunner] 
> launcher.LocalContainerLauncher: Ignoring stop request for containerId: 
> container_1716306608007_0001_00_24
> 2024-05-21T08:50:28,800  INFO [CallbackExecutor] 
> launcher.LocalContainerLauncher: Container: 
> container_1716306608007_0001_00_24: Execution Failed:
> java.lang.NullPointerException: null
>         at org.apache.tez.runtime.task.TezChild.run(TezChild.java:252) 
> ~[tez-runtime-internals-0.10.3.jar:0.10.3]
>         at 
> org.apache.tez.dag.app.launcher.LocalContainerLauncher$1.call(LocalContainerLauncher.java:409)
>  ~[tez-dag-0.10.3.jar:0.10.3]
>         at 
> org.apache.tez.dag.app.launcher.LocalContainerLauncher$1.call(LocalContainerLauncher.java:400)
>  ~[tez-dag-0.10.3.jar:0.10.3]
>         at 
> com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)
>  ~[guava-22.0.jar:?]
>         at 
> com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)
>  ~[guava-22.0.jar:?]
>         at 
> com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75)
>  ~[guava-22.0.jar:?]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[?:1.8.0_342]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_342]
>         at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_342]
> {noformat}
> Can be reproduced by running {{TestCrudCompactorOnTez}} in the Hive code





[jira] [Updated] (TEZ-4566) NPE in TezChild while fetching attemptId when container is asked to shut down

2024-05-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4566:
--
Summary: NPE in TezChild while fetching attemptId when container is asked 
to shut down  (was: NPE in TezChild while fetching attemptId)

> NPE in TezChild while fetching attemptId when container is asked to shut down
> -
>
> Key: TEZ-4566
> URL: https://issues.apache.org/jira/browse/TEZ-4566
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {noformat}
> 2024-05-21T08:50:28,800  WARN [LocalTaskExecutionThread #0] 
> common.TezUtilsInternal: Not configured with appender named: CLA. Cannot 
> reconfigure logger output
> 2024-05-21T08:50:28,800  INFO [Dispatcher thread {Central}] impl.VertexImpl: 
> Task Completion: vertex_1716306608007_0001_13_00 [Map 1], tasks=4, failed=0, 
> killed=0, success=2, completed=2, commits=0, err=null
> 2024-05-21T08:50:28,800  INFO [TezChild] task.ContainerReporter: Attempting 
> to fetch new task for container container_1716306608007_0001_00_24
> 2024-05-21T08:50:28,800  INFO [Dispatcher thread {Central}] 
> HistoryEventHandler.criticalEvents: 
> [HISTORY][DAG:dag_1716306608007_0001_13][Event:CONTAINER_STOPPED]: 
> containerId=container_1716306608007_0001_00_24, 
> stoppedTime=1716306628800, exitStatus=0
> 2024-05-21T08:50:28,800  INFO [TezChild] app.TezTaskCommunicatorImpl: 
> Container with id: container_1716306608007_0001_00_24 is valid, but no 
> longer registered, and will be killed
> 2024-05-21T08:50:28,800  INFO [TezChild] task.ContainerReporter: Got 
> TaskUpdate for containerId= container_1716306608007_0001_00_24: 0 ms 
> after starting to poll. TaskInfo: shouldDie: true
> 2024-05-21T08:50:28,800  INFO [Dispatcher thread {Central}] impl.VertexImpl: 
> Source task attempt completed for vertex: vertex_1716306608007_0001_13_01 
> [Reducer 2] attempt: attempt_1716306608007_0001_13_00_01_0 with state: 
> SUCCEEDED vertexState: RUNNING 
> 2024-05-21T08:50:28,801  INFO [LocalContainerLauncher-SubTaskRunner] 
> launcher.LocalContainerLauncher: Ignoring stop request for containerId: 
> container_1716306608007_0001_00_24
> 2024-05-21T08:50:28,800  INFO [CallbackExecutor] 
> launcher.LocalContainerLauncher: Container: 
> container_1716306608007_0001_00_24: Execution Failed:
> java.lang.NullPointerException: null
>         at org.apache.tez.runtime.task.TezChild.run(TezChild.java:252) 
> ~[tez-runtime-internals-0.10.3.jar:0.10.3]
>         at 
> org.apache.tez.dag.app.launcher.LocalContainerLauncher$1.call(LocalContainerLauncher.java:409)
>  ~[tez-dag-0.10.3.jar:0.10.3]
>         at 
> org.apache.tez.dag.app.launcher.LocalContainerLauncher$1.call(LocalContainerLauncher.java:400)
>  ~[tez-dag-0.10.3.jar:0.10.3]
>         at 
> com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)
>  ~[guava-22.0.jar:?]
>         at 
> com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)
>  ~[guava-22.0.jar:?]
>         at 
> com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75)
>  ~[guava-22.0.jar:?]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[?:1.8.0_342]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_342]
>         at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_342]
> {noformat}
> Can be reproduced by running {{TestCrudCompactorOnTez}} in the Hive code





[jira] [Created] (TEZ-4566) NPE in TezChild while fetching attemptId

2024-05-21 Thread Ayush Saxena (Jira)
Ayush Saxena created TEZ-4566:
-

 Summary: NPE in TezChild while fetching attemptId
 Key: TEZ-4566
 URL: https://issues.apache.org/jira/browse/TEZ-4566
 Project: Apache Tez
  Issue Type: Bug
Reporter: Ayush Saxena
Assignee: Ayush Saxena


{noformat}
2024-05-21T08:50:28,800  WARN [LocalTaskExecutionThread #0] 
common.TezUtilsInternal: Not configured with appender named: CLA. Cannot 
reconfigure logger output
2024-05-21T08:50:28,800  INFO [Dispatcher thread {Central}] impl.VertexImpl: 
Task Completion: vertex_1716306608007_0001_13_00 [Map 1], tasks=4, failed=0, 
killed=0, success=2, completed=2, commits=0, err=null
2024-05-21T08:50:28,800  INFO [TezChild] task.ContainerReporter: Attempting to 
fetch new task for container container_1716306608007_0001_00_24
2024-05-21T08:50:28,800  INFO [Dispatcher thread {Central}] 
HistoryEventHandler.criticalEvents: 
[HISTORY][DAG:dag_1716306608007_0001_13][Event:CONTAINER_STOPPED]: 
containerId=container_1716306608007_0001_00_24, stoppedTime=1716306628800, 
exitStatus=0
2024-05-21T08:50:28,800  INFO [TezChild] app.TezTaskCommunicatorImpl: Container 
with id: container_1716306608007_0001_00_24 is valid, but no longer 
registered, and will be killed
2024-05-21T08:50:28,800  INFO [TezChild] task.ContainerReporter: Got TaskUpdate 
for containerId= container_1716306608007_0001_00_24: 0 ms after starting to 
poll. TaskInfo: shouldDie: true
2024-05-21T08:50:28,800  INFO [Dispatcher thread {Central}] impl.VertexImpl: 
Source task attempt completed for vertex: vertex_1716306608007_0001_13_01 
[Reducer 2] attempt: attempt_1716306608007_0001_13_00_01_0 with state: 
SUCCEEDED vertexState: RUNNING 
2024-05-21T08:50:28,801  INFO [LocalContainerLauncher-SubTaskRunner] 
launcher.LocalContainerLauncher: Ignoring stop request for containerId: 
container_1716306608007_0001_00_24
2024-05-21T08:50:28,800  INFO [CallbackExecutor] 
launcher.LocalContainerLauncher: Container: 
container_1716306608007_0001_00_24: Execution Failed:
java.lang.NullPointerException: null
        at org.apache.tez.runtime.task.TezChild.run(TezChild.java:252) 
~[tez-runtime-internals-0.10.3.jar:0.10.3]
        at 
org.apache.tez.dag.app.launcher.LocalContainerLauncher$1.call(LocalContainerLauncher.java:409)
 ~[tez-dag-0.10.3.jar:0.10.3]
        at 
org.apache.tez.dag.app.launcher.LocalContainerLauncher$1.call(LocalContainerLauncher.java:400)
 ~[tez-dag-0.10.3.jar:0.10.3]
        at 
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)
 ~[guava-22.0.jar:?]
        at 
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)
 ~[guava-22.0.jar:?]
        at 
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75)
 ~[guava-22.0.jar:?]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_342]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_342]
        at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_342]
{noformat}
Can be reproduced by running {{TestCrudCompactorOnTez}} in the Hive code
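
The log shows the container receiving "shouldDie: true" immediately before 
the NullPointerException, which suggests that no task was assigned when the 
attempt id was read. A minimal sketch of the kind of guard that avoids this 
class of failure is shown below. It is not the committed patch and does not 
use Tez's real classes: ShutdownGuardSketch, TaskUpdate, and describe are 
hypothetical stand-ins.

```java
class ShutdownGuardSketch {

    /** Hypothetical stand-in for the reply a container gets when polling for work. */
    static final class TaskUpdate {
        final boolean shouldDie;
        final String attemptId; // null when no task was assigned

        TaskUpdate(boolean shouldDie, String attemptId) {
            this.shouldDie = shouldDie;
            this.attemptId = attemptId;
        }
    }

    /**
     * Reads the attempt id only when a task was actually assigned.
     * Checking shouldDie and null first avoids dereferencing a missing task.
     */
    public static String describe(TaskUpdate update) {
        if (update.shouldDie || update.attemptId == null) {
            return "shutting down, no task attempt to report";
        }
        return "running attempt " + update.attemptId;
    }

    public static void main(String[] args) {
        // Shutdown request without a task, as in the log above.
        System.out.println(describe(new TaskUpdate(true, null)));
        // Normal case, using an attempt id taken from the log above.
        System.out.println(
            describe(new TaskUpdate(false, "attempt_1716306608007_0001_13_00_01_0")));
    }
}
```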





[jira] [Updated] (TEZ-4564) TezClient to expose Tez AM host:port

2024-05-17 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4564:
--
Description: Currently, there is no easy way to retrieve the AM's host and 
(RPC) port from a TezClient (or even a DagClient). While implementing 
HIVE-28095, I came to think this would be useful, as we might be interested in 
it later for query tracking/history.

> TezClient to expose Tez AM host:port
> 
>
> Key: TEZ-4564
> URL: https://issues.apache.org/jira/browse/TEZ-4564
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, there is no easy way to retrieve the AM's host and (RPC) port from 
> a TezClient (or even a DagClient). While implementing HIVE-28095, I came to 
> think this would be useful, as we might be interested in it later for 
> query tracking/history.





[jira] [Updated] (TEZ-4565) TestAnalyzer subtest testInternalPreemption is flaky

2024-05-16 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4565:
--
Fix Version/s: 0.10.4

> TestAnalyzer subtest testInternalPreemption is flaky
> 
>
> Key: TEZ-4565
> URL: https://issues.apache.org/jira/browse/TEZ-4565
> Project: Apache Tez
>  Issue Type: Test
>Reporter: Jonathan Turner Eagles
>Assignee: Jonathan Turner Eagles
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Created] (TEZ-4565) TestAnalyzer subtest testInternalPreemption is flaky

2024-05-16 Thread Jonathan Turner Eagles (Jira)
Jonathan Turner Eagles created TEZ-4565:
---

 Summary: TestAnalyzer subtest testInternalPreemption is flaky
 Key: TEZ-4565
 URL: https://issues.apache.org/jira/browse/TEZ-4565
 Project: Apache Tez
  Issue Type: Test
Reporter: Jonathan Turner Eagles
Assignee: Jonathan Turner Eagles








[jira] [Assigned] (TEZ-4564) TezClient to expose Tez AM host:port

2024-05-16 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned TEZ-4564:
-

Assignee: László Bodor

> TezClient to expose Tez AM host:port
> 
>
> Key: TEZ-4564
> URL: https://issues.apache.org/jira/browse/TEZ-4564
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>






[jira] [Updated] (TEZ-4564) TezClient to expose Tez AM host:port

2024-05-16 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4564:
--
Fix Version/s: 0.10.4

> TezClient to expose Tez AM host:port
> 
>
> Key: TEZ-4564
> URL: https://issues.apache.org/jira/browse/TEZ-4564
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Fix For: 0.10.4
>
>






[jira] [Updated] (TEZ-4564) TezClient to expose Tez AM host:port

2024-05-16 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4564:
--
Summary: TezClient to expose Tez AM host:port  (was: DAGClient to expose 
Tez AM host:port)

> TezClient to expose Tez AM host:port
> 
>
> Key: TEZ-4564
> URL: https://issues.apache.org/jira/browse/TEZ-4564
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: László Bodor
>Priority: Major
>






[jira] [Created] (TEZ-4564) DAGClient to expose Tez AM host:port

2024-05-16 Thread Jira
László Bodor created TEZ-4564:
-

 Summary: DAGClient to expose Tez AM host:port
 Key: TEZ-4564
 URL: https://issues.apache.org/jira/browse/TEZ-4564
 Project: Apache Tez
  Issue Type: Improvement
Reporter: László Bodor








[jira] [Updated] (TEZ-4547) Add Tez AM JobID to the JobConf

2024-05-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4547:
--
Fix Version/s: 0.10.4

> Add Tez AM JobID to the JobConf
> ---
>
> Key: TEZ-4547
> URL: https://issues.apache.org/jira/browse/TEZ-4547
> Project: Apache Tez
>  Issue Type: Improvement
>Affects Versions: 0.10.2
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Tez creates JobIDs for tasks by appending the vertex index to the cluster 
> timestamp to avoid multiple jobs run in a single Tez session sharing a JobID. 
> Hadoop's MagicS3GuardCommitter needs a job-wide UUID to ensure that the task 
> committers and the job committer write to/read from the same paths and can 
> hence actually commit data. Adding the AM's JobID to the Configuration 
> objects allows applications like Hive to pass that as the UUID to the 
> committer.
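
A minimal sketch of the idea described above, under stated assumptions: a plain Map stands in for Hadoop's Configuration object, and the key names and ID string formats are hypothetical, not the ones Tez or Hadoop actually use. It only illustrates that per-vertex JobIDs differ while an AM-wide JobID placed in every configuration gives all committers one shared value.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a Map stands in for a Hadoop Configuration, and the
// key names and ID formats below are hypothetical.
final class AmJobIdSketch {
    static final String TASK_JOB_ID_KEY = "example.task.job.id"; // hypothetical key
    static final String AM_JOB_ID_KEY = "example.am.job.id";     // hypothetical key

    // Per-vertex ID: the vertex index is appended to the cluster timestamp,
    // so two vertices in the same session never share a JobID.
    static String perVertexJobId(long clusterTimestamp, int vertexIndex, int appId) {
        return "job_" + clusterTimestamp + vertexIndex + "_" + appId;
    }

    // AM-wide ID: independent of the vertex, hence identical for every task.
    static String amJobId(long clusterTimestamp, int appId) {
        return "job_" + clusterTimestamp + "_" + appId;
    }

    // Builds the configuration a task of the given vertex would see.
    static Map<String, String> confForVertex(long clusterTimestamp, int appId, int vertexIndex) {
        Map<String, String> conf = new HashMap<>();
        conf.put(TASK_JOB_ID_KEY, perVertexJobId(clusterTimestamp, vertexIndex, appId));
        conf.put(AM_JOB_ID_KEY, amJobId(clusterTimestamp, appId));
        return conf;
    }
}
```

An application could then hand the value stored under the AM-wide key to the committer as its job-wide UUID, so that task committers and the job committer resolve the same paths.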





[jira] [Assigned] (TEZ-4547) Add Tez AM JobID to the JobConf

2024-05-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned TEZ-4547:
-

Assignee: Venkatasubrahmanian Narayanan

> Add Tez AM JobID to the JobConf
> ---
>
> Key: TEZ-4547
> URL: https://issues.apache.org/jira/browse/TEZ-4547
> Project: Apache Tez
>  Issue Type: Improvement
>Affects Versions: 0.10.2
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Tez creates JobIDs for tasks by appending the vertex index to the cluster 
> timestamp to avoid multiple jobs run in a single Tez session sharing a JobID. 
> Hadoop's MagicS3GuardCommitter needs a job-wide UUID to ensure that the task 
> committers and the job committer write to/read from the same paths and can 
> hence actually commit data. Adding the AM's JobID to the Configuration 
> objects allows applications like Hive to pass that as the UUID to the 
> committer.





[jira] [Commented] (TEZ-4019) Modify Tez shuffle handler to use AuxiliaryLocalPathHandler instead of LocalDirAllocator

2024-05-14 Thread Jira


[ 
https://issues.apache.org/jira/browse/TEZ-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846209#comment-17846209
 ] 

László Bodor commented on TEZ-4019:
---

merged to master, thanks [~jeagles] for the patch!

> Modify Tez shuffle handler to use AuxiliaryLocalPathHandler instead of 
> LocalDirAllocator
> 
>
> Key: TEZ-4019
> URL: https://issues.apache.org/jira/browse/TEZ-4019
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> As with the MR shuffle handler, this new API (YARN-7244), exposed in Hadoop 
> version 2.8.2 and up, helps keep the NM's view of which disks are good to use 
> and the auxiliary services' view in sync. Tez currently compiles with 2.7, but 
> when we move off that we should allow this improved behavior to come in.





[jira] [Assigned] (TEZ-4019) Modify Tez shuffle handler to use AuxiliaryLocalPathHandler instead of LocalDirAllocator

2024-05-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned TEZ-4019:
-

Assignee: Jonathan Turner Eagles  (was: Kuhu Shukla)

> Modify Tez shuffle handler to use AuxiliaryLocalPathHandler instead of 
> LocalDirAllocator
> 
>
> Key: TEZ-4019
> URL: https://issues.apache.org/jira/browse/TEZ-4019
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Kuhu Shukla
>Assignee: Jonathan Turner Eagles
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> As with the MR shuffle handler, this new API (YARN-7244), exposed in Hadoop 
> version 2.8.2 and up, helps keep the NM's view of which disks are good to use 
> and the auxiliary services' view in sync. Tez currently compiles with 2.7, but 
> when we move off that we should allow this improved behavior to come in.





[jira] [Updated] (TEZ-4019) Modify Tez shuffle handler to use AuxiliaryLocalPathHandler instead of LocalDirAllocator

2024-05-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4019:
--
Fix Version/s: 0.10.4

> Modify Tez shuffle handler to use AuxiliaryLocalPathHandler instead of 
> LocalDirAllocator
> 
>
> Key: TEZ-4019
> URL: https://issues.apache.org/jira/browse/TEZ-4019
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> As with the MR shuffle handler, this new API (YARN-7244), exposed in Hadoop 
> version 2.8.2 and up, helps keep the NM's view of which disks are good to use 
> and the auxiliary services' view in sync. Tez currently compiles with 2.7, but 
> when we move off that we should allow this improved behavior to come in.





[jira] [Resolved] (TEZ-4019) Modify Tez shuffle handler to use AuxiliaryLocalPathHandler instead of LocalDirAllocator

2024-05-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor resolved TEZ-4019.
---
Resolution: Fixed

> Modify Tez shuffle handler to use AuxiliaryLocalPathHandler instead of 
> LocalDirAllocator
> 
>
> Key: TEZ-4019
> URL: https://issues.apache.org/jira/browse/TEZ-4019
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Kuhu Shukla
>Assignee: Jonathan Turner Eagles
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> As with the MR shuffle handler, this new API (YARN-7244), exposed in Hadoop 
> version 2.8.2 and up, helps keep the NM's view of which disks are good to use 
> and the auxiliary services' view in sync. Tez currently compiles with 2.7, but 
> when we move off that we should allow this improved behavior to come in.





[jira] [Commented] (TEZ-4542) Tez application may fail due to int overflow when record size is large and sort memory is low.

2024-05-14 Thread Chenyu Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846208#comment-17846208
 ] 

Chenyu Zheng commented on TEZ-4542:
---

Thanks [~abstractdog] and [~rbalamohan] for the review!

[~abstractdog] BTW, do you mind taking a look at HIVE-27985?

> Tez application may fail due to int overflow when record size is large and 
> sort memory is low.
> --
>
> Key: TEZ-4542
> URL: https://issues.apache.org/jira/browse/TEZ-4542
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.9.2
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> A Tez application failed, and we found this error stack:
> {code:java}
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:292)
>   ... 18 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IllegalArgumentException
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:402)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:907)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:643)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:675)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:753)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:314)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:277)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:270)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:256)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:361)
>   ... 19 more
> Caused by: java.lang.IllegalArgumentException
>   at java.nio.Buffer.position(Buffer.java:244)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.<init>(PipelinedSorter.java:936)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.sort(PipelinedSorter.java:350)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.collect(PipelinedSorter.java:406)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.write(PipelinedSorter.java:379)
>   at 
> org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput$1.write(OrderedPartitionedKVOutput.java:167)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor$TezKVOutputCollector.collect(TezProcessor.java:204)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.collect(ReduceSinkOperator.java:541)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:385)
>   ... 28 more {code}
> After adding debug logging, the problem is easy to find: the variable 
> `dataSize` in {{PipelinedSorter::SortSpan}} overflows. 
> The problem is triggered when the following two conditions are met at the 
> same time:
>  * The vertex has too many I/Os, so the memory allocated to each I/O for 
> sorting is too small.
>  * The average record size is larger than 2K, so `dataSize` in 
> {{PipelinedSorter::SortSpan}} overflows. The code then does not try to 
> allocate less meta space, and an exception is raised.
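
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the actual PipelinedSorter code: the class and method names, the 16-byte per-record metadata size, and the 8 MB capacity are assumptions chosen for illustration. It shows how an int product that wraps negative skips the "allocate less meta space" branch, so that a later Buffer.position() call throws IllegalArgumentException, and how doing the same arithmetic in long avoids it.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only; names and constants are assumptions, not Tez code.
final class SortSpanOverflowSketch {
    static final int META_BYTES_PER_ITEM = 16; // assumed metadata size per record

    // int arithmetic: maxItems * perItem can wrap to a negative value, so the
    // capacity check is false and the meta region is never shrunk.
    static int metaSizeWithIntMath(int capacity, int maxItems, int perItem) {
        int metaSize = META_BYTES_PER_ITEM * maxItems;
        int dataSize = maxItems * perItem; // may overflow
        if (capacity < metaSize + dataSize) {
            metaSize = META_BYTES_PER_ITEM * (capacity / (perItem + META_BYTES_PER_ITEM));
        }
        return metaSize;
    }

    // long arithmetic: the product cannot wrap for int inputs, so the check
    // behaves as intended and the meta region is sized to fit the buffer.
    static int metaSizeWithLongMath(int capacity, int maxItems, int perItem) {
        long metaSize = (long) META_BYTES_PER_ITEM * maxItems;
        long dataSize = (long) maxItems * perItem;
        if (capacity < metaSize + dataSize) {
            metaSize = (long) META_BYTES_PER_ITEM * (capacity / (perItem + META_BYTES_PER_ITEM));
        }
        return (int) metaSize;
    }

    // Returns true if positioning a buffer of the given capacity at metaSize
    // throws IllegalArgumentException (position beyond the limit).
    static boolean positionFails(int capacity, int metaSize) {
        ByteBuffer buffer = ByteBuffer.allocate(capacity);
        try {
            buffer.position(metaSize);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }
}
```

With a small sort buffer (for example 8 MB), about one million expected items, and records of roughly 3000 bytes, the int version keeps a meta region larger than the whole buffer, while the long version shrinks it to fit.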





[jira] [Updated] (TEZ-4542) Tez application may fail due to int overflow when record size is large and sort memory is low.

2024-05-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4542:
--
Fix Version/s: 0.10.4

> Tez application may fail due to int overflow when record size is large and 
> sort memory is low.
> --
>
> Key: TEZ-4542
> URL: https://issues.apache.org/jira/browse/TEZ-4542
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.9.2
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> A Tez application failed, and we found this error stack:
> {code:java}
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:292)
>   ... 18 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IllegalArgumentException
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:402)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:907)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:643)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:675)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:753)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:314)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:277)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:270)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:256)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:361)
>   ... 19 more
> Caused by: java.lang.IllegalArgumentException
>   at java.nio.Buffer.position(Buffer.java:244)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.<init>(PipelinedSorter.java:936)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.sort(PipelinedSorter.java:350)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.collect(PipelinedSorter.java:406)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.write(PipelinedSorter.java:379)
>   at 
> org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput$1.write(OrderedPartitionedKVOutput.java:167)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor$TezKVOutputCollector.collect(TezProcessor.java:204)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.collect(ReduceSinkOperator.java:541)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:385)
>   ... 28 more {code}
> After adding debug logging, the problem is easy to find: the variable 
> `dataSize` in {{PipelinedSorter::SortSpan}} overflows. 
> The problem is triggered when the following two conditions are met at the 
> same time:
>  * The vertex has too many I/Os, so the memory allocated to each I/O for 
> sorting is too small.
>  * The average record size is larger than 2K, so `dataSize` in 
> {{PipelinedSorter::SortSpan}} overflows. The code then does not try to 
> allocate less meta space, and an exception is raised.





[jira] [Resolved] (TEZ-4542) Tez application may fail due to int overflow when record size is large and sort memory is low.

2024-05-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor resolved TEZ-4542.
---
Resolution: Fixed

> Tez application may fail due to int overflow when record size is large and 
> sort memory is low.
> --
>
> Key: TEZ-4542
> URL: https://issues.apache.org/jira/browse/TEZ-4542
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.9.2
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> A Tez application failed, and we found this error stack:
> {code:java}
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:292)
>   ... 18 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IllegalArgumentException
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:402)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:907)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:643)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:675)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:753)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:314)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:277)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:270)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:256)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:361)
>   ... 19 more
> Caused by: java.lang.IllegalArgumentException
>   at java.nio.Buffer.position(Buffer.java:244)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.<init>(PipelinedSorter.java:936)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.sort(PipelinedSorter.java:350)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.collect(PipelinedSorter.java:406)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.write(PipelinedSorter.java:379)
>   at 
> org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput$1.write(OrderedPartitionedKVOutput.java:167)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor$TezKVOutputCollector.collect(TezProcessor.java:204)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.collect(ReduceSinkOperator.java:541)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:385)
>   ... 28 more {code}
> After adding debug logging, the problem is easy to find: the variable 
> `dataSize` in {{PipelinedSorter::SortSpan}} overflows. 
> The problem is triggered when the following two conditions are met at the 
> same time:
>  * The vertex has too many I/Os, so the memory allocated to each I/O for 
> sorting is too small.
>  * The average record size is larger than 2K, so `dataSize` in 
> {{PipelinedSorter::SortSpan}} overflows. The code then does not try to 
> allocate less meta space, and an exception is raised.





[jira] [Commented] (TEZ-4542) Tez application may fail due to int overflow when record size is large and sort memory is low.

2024-05-14 Thread Jira


[ 
https://issues.apache.org/jira/browse/TEZ-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846206#comment-17846206
 ] 

László Bodor commented on TEZ-4542:
---

merged to master, thanks [~zhengchenyu] for the patch and [~rbalamohan] for the 
review!

> Tez application may fail due to int overflow when record size is large and 
> sort memory is low.
> --
>
> Key: TEZ-4542
> URL: https://issues.apache.org/jira/browse/TEZ-4542
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.9.2
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> A Tez application failed, and we found this error stack:
> {code:java}
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:292)
>   ... 18 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IllegalArgumentException
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:402)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:907)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:643)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:675)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:753)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:314)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:277)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:270)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:256)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:361)
>   ... 19 more
> Caused by: java.lang.IllegalArgumentException
>   at java.nio.Buffer.position(Buffer.java:244)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.<init>(PipelinedSorter.java:936)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.sort(PipelinedSorter.java:350)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.collect(PipelinedSorter.java:406)
>   at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.write(PipelinedSorter.java:379)
>   at 
> org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput$1.write(OrderedPartitionedKVOutput.java:167)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor$TezKVOutputCollector.collect(TezProcessor.java:204)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.collect(ReduceSinkOperator.java:541)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:385)
>   ... 28 more {code}
> After adding debug logging, the problem is easy to find: the variable 
> `dataSize` in {{PipelinedSorter::SortSpan}} overflows. 
> The problem is triggered when the following two conditions are met at the 
> same time:
>  * The vertex has too many I/Os, so the memory allocated to each I/O for 
> sorting is too small.
>  * The average record size is larger than 2K, so `dataSize` in 
> {{PipelinedSorter::SortSpan}} overflows. The code then does not try to 
> allocate less meta space, and an exception is raised.





[jira] [Resolved] (TEZ-4563) Bump org.bouncycastle:bcprov-jdk18on from 1.77 to 1.78

2024-05-09 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved TEZ-4563.
---
Fix Version/s: 0.10.4
   Resolution: Fixed

> Bump org.bouncycastle:bcprov-jdk18on from 1.77 to 1.78 
> ---
>
> Key: TEZ-4563
> URL: https://issues.apache.org/jira/browse/TEZ-4563
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> PR by dependabot
> https://github.com/apache/tez/pull/352





[jira] [Created] (TEZ-4563) Bump org.bouncycastle:bcprov-jdk18on from 1.77 to 1.78

2024-05-07 Thread Ayush Saxena (Jira)
Ayush Saxena created TEZ-4563:
-

 Summary: Bump org.bouncycastle:bcprov-jdk18on from 1.77 to 1.78 
 Key: TEZ-4563
 URL: https://issues.apache.org/jira/browse/TEZ-4563
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Ayush Saxena


PR by dependabot
https://github.com/apache/tez/pull/352





[jira] [Created] (TEZ-4562) Fix Tez Job Analyzer after TEZ_DAG_EXTRA_INFO

2024-05-07 Thread Jonathan Turner Eagles (Jira)
Jonathan Turner Eagles created TEZ-4562:
---

 Summary: Fix Tez Job Analyzer after TEZ_DAG_EXTRA_INFO
 Key: TEZ-4562
 URL: https://issues.apache.org/jira/browse/TEZ-4562
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Jonathan Turner Eagles
Assignee: Jonathan Turner Eagles


TEZ-3611 split DAG INFO and DAG EXTRA INFO but tez job analyzer wasn't updated 
to account for the change.





[jira] [Resolved] (TEZ-4558) Update build setup maven version and enforcer minimum to correct minimum

2024-05-07 Thread Jonathan Turner Eagles (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Turner Eagles resolved TEZ-4558.
-
Fix Version/s: 0.10.4
   Resolution: Fixed

> Update build setup maven version and enforcer minimum to correct minimum
> 
>
> Key: TEZ-4558
> URL: https://issues.apache.org/jira/browse/TEZ-4558
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Jonathan Turner Eagles
>Assignee: Jonathan Turner Eagles
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The build cannot succeed with the minimum Maven version stated in the build 
> instructions and enforced by the Maven version requirement.
> maven-enforcer-plugin: requireMavenVersion 3.0.2
> [MVNVM] Using maven: 3.1.0
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:3.0.0:enforce 
> (enforce-maven-version) on project tez: The plugin 
> org.apache.maven.plugins:maven-enforcer-plugin:3.0.0 requires Maven version 
> 3.1.1 -> [Help 1]





[jira] [Comment Edited] (TEZ-4355) Unit test precommit improvements - parallel, full coverage

2024-05-07 Thread Jira


[ 
https://issues.apache.org/jira/browse/TEZ-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844191#comment-17844191
 ] 

László Bodor edited comment on TEZ-4355 at 5/7/24 8:26 AM:
---

I feel full test coverage becomes more important after regressions like 
TEZ-4559.
I believe we should run all unit tests in the precommit; it takes ~1h as far as 
I can remember. I will come back to this soon, cc: [~ayushtkn]


was (Author: abstractdog):
I feel the full test coverage becomes more important after regressions like 
TEZ-4559
I believe we should run all unit tests in the precommit, it takes ~1h as far as 
I can remember

> Unit test precommit improvements - parallel, full coverage
> --
>
> Key: TEZ-4355
> URL: https://issues.apache.org/jira/browse/TEZ-4355
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> 1. What about running all unit tests in precommit? With the current precommit 
> load in Tez project, it's worth trying (however it needs some flakiness fixes)
> 2. Run tests in splits in a parallel fashion: 2 different, deterministic 
> splits could be a) tez-tests module vs. b) all the rest





[jira] [Commented] (TEZ-4559) Fix Retry logic in case of Recovery

2024-05-07 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844194#comment-17844194
 ] 

Ayush Saxena commented on TEZ-4559:
---

Committed to master.
Thanx [~abstractdog] for the contribution!!!

> Fix Retry logic in case of Recovery
> ---
>
> Key: TEZ-4559
> URL: https://issues.apache.org/jira/browse/TEZ-4559
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: László Bodor
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> These tests are broken:
> TestAMRecovery, TestDAGRecovery, TestRecovery
> They were broken by TEZ-4543, where we simply returned a failed DAG if the 
> requested DAG status could not be found. This completely breaks recovery 
> scenarios where the dagClient might keep asking for the failed DAG's status 
> (while the AM restarts after a failure).
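
To illustrate why treating "status not found" as a failed DAG breaks recovery, here is a minimal client-side sketch. It is not the TEZ-4559 patch: the class, the method, the status strings, and the use of an empty Optional for "not found" are all assumptions made for this example. The point is only that a poller which retries on a missing status can survive an AM restart, whereas one that maps a missing status to failure cannot.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Illustrative sketch only; not the Tez DAGClient API.
final class DagStatusRetrySketch {
    // Polls the status source until it yields a status or attempts run out.
    // An empty Optional models "status not found", e.g. while the AM restarts.
    static String pollStatus(Supplier<Optional<String>> statusSource,
                             int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<String> status = statusSource.get();
            if (status.isPresent()) {
                return status.get();
            }
            if (attempt < maxAttempts) {
                try {
                    // Not found: wait and ask again instead of reporting failure.
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        // Still unknown after all attempts; the caller decides what to do next.
        return "UNKNOWN";
    }
}
```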





[jira] [Commented] (TEZ-4355) Unit test precommit improvements - parallel, full coverage

2024-05-07 Thread Jira


[ 
https://issues.apache.org/jira/browse/TEZ-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844191#comment-17844191
 ] 

László Bodor commented on TEZ-4355:
---

I feel full test coverage becomes more important after regressions like 
TEZ-4559.
I believe we should run all unit tests in the precommit; it takes ~1h as far as 
I can remember.

> Unit test precommit improvements - parallel, full coverage
> --
>
> Key: TEZ-4355
> URL: https://issues.apache.org/jira/browse/TEZ-4355
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> 1. What about running all unit tests in precommit? With the current precommit 
> load in Tez project, it's worth trying (however it needs some flakiness fixes)
> 2. Run tests in splits in a parallel fashion: 2 different, deterministic 
> splits could be a) tez-tests module vs. b) all the rest



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (TEZ-4355) Unit test precommit improvements - parallel, full coverage

2024-05-07 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned TEZ-4355:
-

Assignee: László Bodor

> Unit test precommit improvements - parallel, full coverage
> --
>
> Key: TEZ-4355
> URL: https://issues.apache.org/jira/browse/TEZ-4355
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> 1. What about running all unit tests in precommit? With the current precommit 
> load in Tez project, it's worth trying (however it needs some flakiness fixes)
> 2. Run tests in splits in a parallel fashion: 2 different, deterministic 
> splits could be a) tez-tests module vs. b) all the rest





[jira] [Updated] (TEZ-4561) Improve reported exception when DAGAppMaster is shutting down

2024-05-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4561:
--
Description: 
https://github.com/apache/tez/blob/66a6ca64b5edde0d30bea0962cb132f3c4982469/tez-dag/src/main/java/org/apache/tez/dag/app/DAGAppMaster.java#L1683

the AM can return this exception during a shutdown like below:
{code}
TezUncheckedException: Cannot get ApplicationACLs before all services have 
started
   at 
org.apache.tez.dag.app.DAGAppMaster$RunningAppContext.getApplicationACLs(DAGAppMaster.java:1733)
   at 
org.apache.tez.dag.app.rm.container.AMContainerImpl$LaunchRequestTransition.transition(AMContainerImpl.java:513)
   at 
org.apache.tez.dag.app.rm.container.AMContainerImpl$LaunchRequestTransition.transition(AMContainerImpl.java:470)
   at 
org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
   at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
   at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
   at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:493)
   at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:64)
   at 
org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:441)
   at 
org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:78)
   at 
org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:68)
   at 
org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:40)
   at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:200)
   at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:118)
   at java.base/java.lang.Thread.run(Thread.java:829)
{code}
which is confusing: it doesn't make the log reader aware that 
getServiceState() != STATE.STARTED is not an initialization problem here 
(especially confusing for an AM that has already been running for a long time); 
the state is STATE.STOPPED instead

we should check for that and report it (maybe even with a timestamp of when the 
shutdown hook was started)
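
A minimal sketch of the check proposed above, assuming a simplified stand-in for the service lifecycle. The names ServiceState, AclAccessSketch and shutdownStartedAtMillis are hypothetical and are not taken from DAGAppMaster; the point is only that a stopped AM reports a shutdown (with a timestamp) instead of the misleading "before all services have started" message:

```java
// Hypothetical, simplified stand-in for the service lifecycle states.
enum ServiceState { NOTINITED, INITED, STARTED, STOPPED }

// Hypothetical sketch; this is not the actual DAGAppMaster code.
final class AclAccessSketch {
  private volatile ServiceState state = ServiceState.NOTINITED;
  // Assumed to be recorded when shutdown begins; -1 means "not shutting down".
  private volatile long shutdownStartedAtMillis = -1L;

  void start() {
    state = ServiceState.STARTED;
  }

  void stop(long nowMillis) {
    shutdownStartedAtMillis = nowMillis;
    state = ServiceState.STOPPED;
  }

  String getApplicationACLs() {
    ServiceState current = state;
    if (current == ServiceState.STOPPED) {
      // Shutdown case: say so explicitly and include when it started.
      throw new IllegalStateException(
          "Cannot get ApplicationACLs: AM is shutting down (shutdown started at "
              + shutdownStartedAtMillis + " ms since epoch)");
    }
    if (current != ServiceState.STARTED) {
      // Genuine initialization case: keep the original wording.
      throw new IllegalStateException(
          "Cannot get ApplicationACLs before all services have started");
    }
    return "acls"; // placeholder for the real ACL map
  }
}
```

With this split, a long-running AM that is stopping would log the shutdown message, and only an AM that never reached the started state would log the initialization message.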

> Improve reported exception when DAGAppMaster is shutting down
> -
>
> Key: TEZ-4561
> URL: https://issues.apache.org/jira/browse/TEZ-4561
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: László Bodor
>Priority: Major
>
> https://github.com/apache/tez/blob/66a6ca64b5edde0d30bea0962cb132f3c4982469/tez-dag/src/main/java/org/apache/tez/dag/app/DAGAppMaster.java#L1683
> the AM can return this exception during a shutdown like below:
> {code}
> TezUncheckedException: Cannot get ApplicationACLs before all services have 
> started
>at 
> org.apache.tez.dag.app.DAGAppMaster$RunningAppContext.getApplicationACLs(DAGAppMaster.java:1733)
>at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$LaunchRequestTransition.transition(AMContainerImpl.java:513)
>at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$LaunchRequestTransition.transition(AMContainerImpl.java:470)
>at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
>at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:493)
>at 
> org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:64)
>at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:441)
>at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:78)
>at 
> org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:68)
>at 
> org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:40)
>at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:200)
>at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:118)
>at java.base/java.lang.Thread.run(Thread.java:829)
> {code}
> which is confusing: it doesn't make the log reader aware that 
> getServiceState() != STATE.STARTED is not an initialization problem here 
> (especially confusing for an AM that has already been running for a long 
> time); the state is STATE.STOPPED instead
> we should check for that and report it (maybe even with a timestamp of when the 
> shutdown hook was started)





[jira] [Created] (TEZ-4561) Improve reported exception when DAGAppMaster is shutting down

2024-05-06 Thread Jira
László Bodor created TEZ-4561:
-

 Summary: Improve reported exception when DAGAppMaster is shutting 
down
 Key: TEZ-4561
 URL: https://issues.apache.org/jira/browse/TEZ-4561
 Project: Apache Tez
  Issue Type: Improvement
Reporter: László Bodor








[jira] [Updated] (TEZ-4559) Fix Retry logic in case of Recovery

2024-05-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4559:
--
Description: 
These tests are broken
TestAMRecovery, TestDAGRecovery, TestRecovery

This was broken by TEZ-4543, where we simply returned a failed DAG if the 
requested DAG status cannot be found. This completely breaks recovery scenarios 
where the dagClient might keep asking for the failed DAG's status (while the AM 
restarts after a failure).

  was:
These tests are broken
TestAMRecovery, TestDAGRecovery, TestRecovery

This was broken by TEZ-4543, where we simply returned a failed DAG if the 
requested DAG status


> Fix Retry logic in case of Recovery
> ---
>
> Key: TEZ-4559
> URL: https://issues.apache.org/jira/browse/TEZ-4559
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: László Bodor
>Priority: Major
>
> These tests are broken
> TestAMRecovery, TestDAGRecovery, TestRecovery
> This was broken by TEZ-4543, where we simply returned a failed DAG if the 
> requested DAG status cannot be found. This completely breaks recovery 
> scenarios where the dagClient might keep asking for the failed DAG's status 
> (while the AM restarts after a failure).
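
One way to picture the recovery scenario described above is a client that tolerates a "DAG status not found" answer for a bounded number of polls before giving up. This is only a sketch under assumptions: StatusSource, RecoveryAwarePollerSketch and maxNotFoundAttempts are hypothetical names, a null return stands in for "status not found", and this is not the actual fix for this issue:

```java
// Hypothetical callback: returns the DAG status, or null when the AM
// reports that the requested DAG status cannot be found.
interface StatusSource {
  String fetch();
}

// Hypothetical sketch; not the actual DAGClient retry logic.
final class RecoveryAwarePollerSketch {
  // Keeps polling while the AM may still be recovering; only after
  // maxNotFoundAttempts consecutive "not found" answers is the DAG
  // reported as failed.
  static String pollWithRecoveryGrace(StatusSource source, int maxNotFoundAttempts) {
    for (int attempt = 1; attempt <= maxNotFoundAttempts; attempt++) {
      String status = source.fetch();
      if (status != null) {
        return status; // the restarted AM has recovered the DAG
      }
      // A real client would sleep between attempts; omitted to keep the sketch short.
    }
    return "FAILED";
  }
}
```

The contrast with returning a failed DAG on the first "not found" answer is that a restarting AM gets a chance to recover the DAG before the client treats it as failed.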





[jira] [Updated] (TEZ-4559) Fix Retry logic in case of Recovery

2024-05-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4559:
--
Description: 
These tests are broken
TestAMRecovery, TestDAGRecovery, TestRecovery

This was broken by TEZ-4543, where we simply returned a failed DAG if the 
requested DAG status

  was:
These tests are broken
TestAMRecovery, TestDAGRecovery, TestRecovery


> Fix Retry logic in case of Recovery
> ---
>
> Key: TEZ-4559
> URL: https://issues.apache.org/jira/browse/TEZ-4559
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: László Bodor
>Priority: Major
>
> These tests are broken
> TestAMRecovery, TestDAGRecovery, TestRecovery
> This was broken by TEZ-4543, where we simply returned a failed DAG if the 
> requested DAG status





[jira] [Assigned] (TEZ-4559) Fix Retry logic in case of Recovery

2024-05-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned TEZ-4559:
-

Assignee: László Bodor

> Fix Retry logic in case of Recovery
> ---
>
> Key: TEZ-4559
> URL: https://issues.apache.org/jira/browse/TEZ-4559
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: László Bodor
>Priority: Major
>
> These tests are broken
> TestAMRecovery, TestDAGRecovery, TestRecovery





[jira] [Updated] (TEZ-4552) Upgrade protobuf to 3.24.4 due to CVE.

2024-05-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4552:
--
Fix Version/s: 0.10.4

> Upgrade protobuf to 3.24.4 due to CVE.
> --
>
> Key: TEZ-4552
> URL: https://issues.apache.org/jira/browse/TEZ-4552
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> I found that there are 3 CVE issues that we need to deal with. These CVE 
> issues are related to protobuf. Our protobuf uses 3.21.1, which is an old 
> version. This PR will try to upgrade the protobuf version to solve the CVE 
> issue.
>  * 
> [CVE-2022-3171|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-3171]
>  * 
> [CVE-2022-3509|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-3509]
>  * 
> [CVE-2022-3510|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-3510]





[jira] [Commented] (TEZ-4552) Upgrade protobuf to 3.24.4 due to CVE.

2024-05-06 Thread Jira


[ 
https://issues.apache.org/jira/browse/TEZ-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843662#comment-17843662
 ] 

László Bodor commented on TEZ-4552:
---

merged to master, thanks [~slfan1989] for this patch!

> Upgrade protobuf to 3.24.4 due to CVE.
> --
>
> Key: TEZ-4552
> URL: https://issues.apache.org/jira/browse/TEZ-4552
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> I found that there are 3 CVE issues that we need to deal with. These CVE 
> issues are related to protobuf. Our protobuf uses 3.21.1, which is an old 
> version. This PR will try to upgrade the protobuf version to solve the CVE 
> issue.
>  * 
> [CVE-2022-3171|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-3171]
>  * 
> [CVE-2022-3509|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-3509]
>  * 
> [CVE-2022-3510|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-3510]





[jira] [Resolved] (TEZ-4552) Upgrade protobuf to 3.24.4 due to CVE.

2024-05-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor resolved TEZ-4552.
---
Resolution: Fixed

> Upgrade protobuf to 3.24.4 due to CVE.
> --
>
> Key: TEZ-4552
> URL: https://issues.apache.org/jira/browse/TEZ-4552
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> I found that there are 3 CVE issues that we need to deal with. These CVE 
> issues are related to protobuf. Our protobuf uses 3.21.1, which is an old 
> version. This PR will try to upgrade the protobuf version to solve the CVE 
> issue.
>  * 
> [CVE-2022-3171|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-3171]
>  * 
> [CVE-2022-3509|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-3509]
>  * 
> [CVE-2022-3510|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-3510]





[jira] [Resolved] (TEZ-4560) Upgrade bouncycastle to 1.77 due to CVE.

2024-05-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor resolved TEZ-4560.
---
Resolution: Fixed

> Upgrade bouncycastle to 1.77 due to CVE.
> 
>
> Key: TEZ-4560
> URL: https://issues.apache.org/jira/browse/TEZ-4560
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> There are 2 CVE issues in bcprov-jdk15on, CVE-2023-33202 and CVE-2023-33201. 
> We can find more information at the following link:
> [https://mvnrepository.com/artifact/org.bouncycastle/bcprov-jdk15on/1.70]
> The links to the CVEs are as follows: 
> [CVE-2023-33202|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-33202]
> [CVE-2023-33201|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-33201]
> We can upgrade bcprov-jdk15on to bcprov-jdk18on to address the CVE issues.





[jira] [Commented] (TEZ-4560) Upgrade bouncycastle to 1.77 due to CVE.

2024-05-06 Thread Jira


[ 
https://issues.apache.org/jira/browse/TEZ-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843661#comment-17843661
 ] 

László Bodor commented on TEZ-4560:
---

merged to master, thanks [~slfan1989] for this patch!

> Upgrade bouncycastle to 1.77 due to CVE.
> 
>
> Key: TEZ-4560
> URL: https://issues.apache.org/jira/browse/TEZ-4560
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> There are 2 CVE issues in bcprov-jdk15on, CVE-2023-33202 and CVE-2023-33201. 
> We can find more information at the following link:
> [https://mvnrepository.com/artifact/org.bouncycastle/bcprov-jdk15on/1.70]
> The links to the CVEs are as follows: 
> [CVE-2023-33202|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-33202]
> [CVE-2023-33201|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-33201]
> We can upgrade bcprov-jdk15on to bcprov-jdk18on to address the CVE issues.





[jira] [Updated] (TEZ-4560) Upgrade bouncycastle to 1.77 due to CVE.

2024-05-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4560:
--
Fix Version/s: 0.10.4

> Upgrade bouncycastle to 1.77 due to CVE.
> 
>
> Key: TEZ-4560
> URL: https://issues.apache.org/jira/browse/TEZ-4560
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> There are 2 CVE issues in bcprov-jdk15on, CVE-2023-33202 and CVE-2023-33201. 
> We can find more information at the following link:
> [https://mvnrepository.com/artifact/org.bouncycastle/bcprov-jdk15on/1.70]
> The links to the CVEs are as follows: 
> [CVE-2023-33202|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-33202]
> [CVE-2023-33201|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-33201]
> We can upgrade bcprov-jdk15on to bcprov-jdk18on to address the CVE issues.





[jira] [Commented] (TEZ-4551) Upgrade commons-io to 2.16.0.

2024-05-06 Thread Jira


[ 
https://issues.apache.org/jira/browse/TEZ-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843659#comment-17843659
 ] 

László Bodor commented on TEZ-4551:
---

merged to master, thanks [~slfan1989] for the patch!

> Upgrade commons-io to 2.16.0. 
> --
>
> Key: TEZ-4551
> URL: https://issues.apache.org/jira/browse/TEZ-4551
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> We are currently using commons-io version 2.8.0, which is an older version 
> (Sep 09, 2020). Commons-io has been upgraded to 2.16.0 (Mar 28, 2024). We can 
> try to upgrade the version to 2.16.0.





[jira] [Resolved] (TEZ-4551) Upgrade commons-io to 2.16.0.

2024-05-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor resolved TEZ-4551.
---
Resolution: Fixed

> Upgrade commons-io to 2.16.0. 
> --
>
> Key: TEZ-4551
> URL: https://issues.apache.org/jira/browse/TEZ-4551
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> We are currently using commons-io version 2.8.0, which is an older version 
> (Sep 09, 2020). Commons-io has been upgraded to 2.16.0 (Mar 28, 2024). We can 
> try to upgrade the version to 2.16.0.





[jira] [Updated] (TEZ-4551) Upgrade commons-io to 2.16.0.

2024-05-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4551:
--
Fix Version/s: 0.10.4

> Upgrade commons-io to 2.16.0. 
> --
>
> Key: TEZ-4551
> URL: https://issues.apache.org/jira/browse/TEZ-4551
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> We are currently using commons-io version 2.8.0, which is an older version 
> (Sep 09, 2020). Commons-io has been upgraded to 2.16.0 (Mar 28, 2024). We can 
> try to upgrade the version to 2.16.0.





[jira] [Updated] (TEZ-4560) Upgrade bouncycastle to 1.77 due to CVE.

2024-05-04 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated TEZ-4560:

Description: 
There are 2 CVE issues in bcprov-jdk15on, CVE-2023-33202 and CVE-2023-33201. We 
can find more information at the following link:

[https://mvnrepository.com/artifact/org.bouncycastle/bcprov-jdk15on/1.70]

The links to the CVEs are as follows: 

[CVE-2023-33202|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-33202]
[CVE-2023-33201|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-33201]

We can upgrade bcprov-jdk15on to bcprov-jdk18on to address the CVE issues.

  was:
There are 2 CVE issues in bcprov-jdk15on, CVE-2023-33202 and CVE-2023-33201. We 
can find more information at the following link:

[https://mvnrepository.com/artifact/org.bouncycastle/bcprov-jdk15on/1.70]

 

[CVE-2023-33202|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-33202]
[CVE-2023-33201|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-33201]


> Upgrade bouncycastle to 1.77 due to CVE.
> 
>
> Key: TEZ-4560
> URL: https://issues.apache.org/jira/browse/TEZ-4560
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>
> There are 2 CVE issues in bcprov-jdk15on, CVE-2023-33202 and CVE-2023-33201. 
> We can find more information at the following link:
> [https://mvnrepository.com/artifact/org.bouncycastle/bcprov-jdk15on/1.70]
> The links to the CVEs are as follows: 
> [CVE-2023-33202|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-33202]
> [CVE-2023-33201|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-33201]
> We can upgrade bcprov-jdk15on to bcprov-jdk18on to address the CVE issues.





[jira] [Created] (TEZ-4560) Upgrade bouncycastle to 1.77 due to CVE.

2024-05-04 Thread Shilun Fan (Jira)
Shilun Fan created TEZ-4560:
---

 Summary: Upgrade bouncycastle to 1.77 due to CVE.
 Key: TEZ-4560
 URL: https://issues.apache.org/jira/browse/TEZ-4560
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Shilun Fan
Assignee: Shilun Fan


There are 2 CVE issues in bcprov-jdk15on, CVE-2023-33202 and CVE-2023-33201. We 
can find more information at the following link:

[https://mvnrepository.com/artifact/org.bouncycastle/bcprov-jdk15on/1.70]

 

[CVE-2023-33202|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-33202]
[CVE-2023-33201|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-33201]





[jira] [Updated] (TEZ-4552) Upgrade protobuf to 3.24.4 due to CVE.

2024-05-04 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated TEZ-4552:

Summary: Upgrade protobuf to 3.24.4 due to CVE.  (was: Upgrade protobuf to 
3.23.4. )

> Upgrade protobuf to 3.24.4 due to CVE.
> --
>
> Key: TEZ-4552
> URL: https://issues.apache.org/jira/browse/TEZ-4552
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>






[jira] [Updated] (TEZ-4552) Upgrade protobuf to 3.24.4 due to CVE.

2024-05-04 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated TEZ-4552:

Description: 
I found that there are 3 CVE issues that we need to deal with. These CVE issues 
are related to protobuf. Our protobuf uses 3.21.1, which is an old version. 
This PR will try to upgrade the protobuf version to solve the CVE issue.
 * [CVE-2022-3171|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-3171]
 * [CVE-2022-3509|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-3509]
 * [CVE-2022-3510|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-3510]

> Upgrade protobuf to 3.24.4 due to CVE.
> --
>
> Key: TEZ-4552
> URL: https://issues.apache.org/jira/browse/TEZ-4552
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> I found that there are 3 CVE issues that we need to deal with. These CVE 
> issues are related to protobuf. Our protobuf uses 3.21.1, which is an old 
> version. This PR will try to upgrade the protobuf version to solve the CVE 
> issue.
>  * 
> [CVE-2022-3171|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-3171]
>  * 
> [CVE-2022-3509|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-3509]
>  * 
> [CVE-2022-3510|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-3510]





[jira] [Commented] (TEZ-4543) Throw a special exception to DagClient when there is no current DAG

2024-05-03 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843166#comment-17843166
 ] 

Ayush Saxena commented on TEZ-4543:
---

This is leading to some test failures:
TestAMRecovery, TestDAGRecovery, TestRecovery
ref: https://ci-hadoop.apache.org/job/Tez-qbt-0.10-Build/183/testReport/

I have created TEZ-4559, maybe it is breaking the Recovery code

> Throw a special exception to DagClient when there is no current DAG
> ---
>
> Key: TEZ-4543
> URL: https://issues.apache.org/jira/browse/TEZ-4543
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> given the following scenario:
> 1. DAG is assigned to an AM
> 2. AM is killed (e.g. OOMKilled by k8s), HS2 keeps asking the status, facing 
> network errors:
> {code}
> hiveserver2 <14>1 2024-02-26T15:59:56.538Z hiveserver2-0 hiveserver2 1 
> dedef3f4-339f-4ba3-a6ae-300751d3561d [mdc@18060 class="client.DAGClientImpl" 
> dagId="dag_1708961199044_0003_1" level="INFO" operationLogLevel="EXECUTION" 
> queryId="hive_20240226155836_6b1e9eb9-efd7-42fd-8872-f4189c5dda3a" 
> sessionId="9e4cb344-ad7f-4344-9b24-aedaf0e73bf4" 
> thread="HiveServer2-Background-Pool: Thread-129"] Cannot retrieve DAG Status 
> due to IOException: DestHost:destPort 
> query-coordinator-0-0.query-coordinator-0-service.compute-1708603165-qlg5.svc.cluster.local:2
>  , LocalHost:localPort hiveserver2-0/100.100.83.80:0. Failed on local 
> exception: java.io.IOException: java.io.IOException: Connection reset by peer
> {code}
> by this time, HS2 cannot tell whether the AM is lost forever or there is a 
> recoverable intermittent network issue
> 3. AM restarts quite quickly and the DagClient in HS2 tries to fetch the DAG 
> status (getDagStatus call) from the restarted coordinator; HS2 isn't even 
> able to realize it is talking to a new AM, and keeps asking for the DAG status
> 4. in the AM, the exception below keeps being thrown, and it's not handled by 
> the DagClient
> {code}
>  <14>1 2024-02-05T02:06:58.065Z query-coordinator-0-4 query-coordinator 1 
> 10757dcc-1e4c-4dd2-ba76-8a2411ab1bdf [mdc@18060 class="ipc.Server" 
> level="INFO" thread="IPC Server handler 0 on 2"] IPC Server handler 0 on 
> 2, call Call#15312255 Retry#0 
> org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPB.getDAGStatus 
> from 127.0.0.6:56221
> org.apache.tez.dag.api.TezException: No running dag at present
> at 
> org.apache.tez.dag.api.client.DAGClientHandler.getDAG(DAGClientHandler.java:99)
> at 
> org.apache.tez.dag.api.client.DAGClientHandler.getACLManager(DAGClientHandler.java:181)
> at 
> org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.getDAGStatus(DAGClientAMProtocolBlockingPBServerImpl.java:102)
> at 
> org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:8513)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917)
> at java.base/java.security.AccessController.doPrivileged(Native Method)
> at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894)
> {code}
> AM should be able to return a specialized exception which can be handled by 
> the client
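
The idea in the last sentence can be sketched roughly as follows. All names here (NoCurrentDagException, DagStatusHandlerSketch, DagClientPollSketch, the "NO_CURRENT_DAG" marker) are hypothetical and not taken from the Tez code base; the sketch only shows a dedicated exception type that a client can catch separately from generic errors:

```java
// Hypothetical exception type; the name is an assumption, not the one used in Tez.
class NoCurrentDagException extends Exception {
  NoCurrentDagException(String message) {
    super(message);
  }
}

// Hypothetical stand-in for the AM-side handler.
final class DagStatusHandlerSketch {
  private String currentDagId; // null when the AM has no running DAG

  void setCurrentDag(String dagId) {
    currentDagId = dagId;
  }

  String getDagStatus(String requestedDagId) throws NoCurrentDagException {
    if (currentDagId == null || !currentDagId.equals(requestedDagId)) {
      // A dedicated type instead of a generic "No running dag at present" error.
      throw new NoCurrentDagException("No running DAG at present: " + requestedDagId);
    }
    return "RUNNING";
  }
}

// Hypothetical stand-in for the client side.
final class DagClientPollSketch {
  static String poll(DagStatusHandlerSketch am, String dagId) {
    try {
      return am.getDagStatus(dagId);
    } catch (NoCurrentDagException e) {
      // The client can now recognize this condition explicitly; whether it
      // fails the query or keeps waiting is a separate decision.
      return "NO_CURRENT_DAG";
    }
  }
}
```

A dedicated type lets the client tell "the AM is up but does not know this DAG" apart from network errors, which the generic exception in the log above does not allow.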





[jira] [Created] (TEZ-4559) Fix Retry logic in case of Recovery

2024-05-03 Thread Ayush Saxena (Jira)
Ayush Saxena created TEZ-4559:
-

 Summary: Fix Retry logic in case of Recovery
 Key: TEZ-4559
 URL: https://issues.apache.org/jira/browse/TEZ-4559
 Project: Apache Tez
  Issue Type: Bug
Reporter: Ayush Saxena


These tests are broken
TestAMRecovery, TestDAGRecovery, TestRecovery





[jira] [Commented] (TEZ-4558) Update build setup maven version and enforcer minimum to correct minimum

2024-05-02 Thread Jonathan Turner Eagles (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843003#comment-17843003
 ] 

Jonathan Turner Eagles commented on TEZ-4558:
-

Personally, I've been using 3.6.3, but recently I have been using mvnvm, which 
reads the required Maven version from the pom file and builds the project with 
that version.

> Update build setup maven version and enforcer minimum to correct minimum
> 
>
> Key: TEZ-4558
> URL: https://issues.apache.org/jira/browse/TEZ-4558
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Jonathan Turner Eagles
>Assignee: Jonathan Turner Eagles
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Build can't succeed according to build instructions and maven required 
> version enforcement.
> maven-enforcer-plugin: requireMavenVersion 3.0.2
> [MVNVM] Using maven: 3.1.0
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:3.0.0:enforce 
> (enforce-maven-version) on project tez: The plugin 
> org.apache.maven.plugins:maven-enforcer-plugin:3.0.0 requires Maven version 
> 3.1.1 -> [Help 1]





[jira] [Updated] (TEZ-4558) Update build setup maven version and enforcer minimum to correct minimum

2024-05-02 Thread Jonathan Turner Eagles (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Turner Eagles updated TEZ-4558:

Description: 
Build can't succeed according to build instructions and maven required version 
enforcement.

maven-enforcer-plugin: requireMavenVersion 3.0.2
[MVNVM] Using maven: 3.1.0
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-enforcer-plugin:3.0.0:enforce 
(enforce-maven-version) on project tez: The plugin 
org.apache.maven.plugins:maven-enforcer-plugin:3.0.0 requires Maven version 
3.1.1 -> [Help 1]

> Update build setup maven version and enforcer minimum to correct minimum
> 
>
> Key: TEZ-4558
> URL: https://issues.apache.org/jira/browse/TEZ-4558
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Jonathan Turner Eagles
>Assignee: Jonathan Turner Eagles
>Priority: Major
>
> Build can't succeed according to build instructions and maven required 
> version enforcement.
> maven-enforcer-plugin: requireMavenVersion 3.0.2
> [MVNVM] Using maven: 3.1.0
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:3.0.0:enforce 
> (enforce-maven-version) on project tez: The plugin 
> org.apache.maven.plugins:maven-enforcer-plugin:3.0.0 requires Maven version 
> 3.1.1 -> [Help 1]





[jira] [Created] (TEZ-4558) Update build setup maven version and enforcer minimum to correct minimum

2024-05-02 Thread Jonathan Turner Eagles (Jira)
Jonathan Turner Eagles created TEZ-4558:
---

 Summary: Update build setup maven version and enforcer minimum to 
correct minimum
 Key: TEZ-4558
 URL: https://issues.apache.org/jira/browse/TEZ-4558
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Jonathan Turner Eagles
Assignee: Jonathan Turner Eagles








[jira] [Comment Edited] (TEZ-4557) Revert TEZ-4303, NoClassDefFoundError because of missing httpclient jar

2024-05-02 Thread Raghav Aggarwal (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842941#comment-17842941
 ] 

Raghav Aggarwal edited comment on TEZ-4557 at 5/2/24 11:38 AM:
---

[~ayushtkn], I am using Hive 3.1.2, Hadoop 3.3.6 and Tez 0.10.2.

The issue should also happen in Hive 4 with Tez 0.10.3, as the httpclient jar is 
missing from tez/lib. I haven't tested it explicitly with those versions, as 
Ranger integration would be required.


was (Author: JIRAUSER295901):
I am using Hive 3.1.2 , hadoop 3.3.6 and tez 0.10.2.

The issue should happen in hive 4 with tez 0.10.3, as httpclient jar is missing 
from tez/lib. 

> Revert TEZ-4303, NoClassDefFoundError because of missing httpclient jar
> ---
>
> Key: TEZ-4557
> URL: https://issues.apache.org/jira/browse/TEZ-4557
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Raghav Aggarwal
>Assignee: Raghav Aggarwal
>Priority: Major
>
> Inserting data into a table located in an encryption zone using Hive with Tez 
> fails because the httpclient jar has been excluded from Hadoop's transitive 
> dependencies. The same query passes with MR.
> Tez: 0.10.2,0.10.3
> Hadoop: 3.3.6
> Hive: 3.1.2
>  
> Steps to reproduce issue:
> 1. Create a encryption key using ranger keyadmin user.
> 2. hdfs crypto -createZone -keyName test_key -path /user/raghav/encrypt_zone
> 3. create table tbl(id int) location '/user/raghav/encrypt_zone';
> 4. insert into tbl values(1);
>  
> Stacktrace:
> {code:java}
> Caused by: java.lang.NoClassDefFoundError: org/apache/http/client/utils/URIBuilder
>     at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createURL(KMSClientProvider.java:468)
>     at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:823)
>     at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:354)
>     at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:350)
>     at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:175)
>     at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:350)
>     at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:535)
>     at org.apache.hadoop.hdfs.HdfsKMSUtil.decryptEncryptedDataEncryptionKey(HdfsKMSUtil.java:216)
>     at org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:1002)
>     at org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:983)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.safelyCreateWrappedOutputStream(DistributedFileSystem.java:734)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.access$300(DistributedFileSystem.java:149)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:572)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:566)
>     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:580)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:507)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1233)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1109)
>     at org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter(HiveIgnoreKeyTextOutputFormat.java:81)
>     at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:297)
>     at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:282)
>     at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:801)
>     at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:752)
>     at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:922)
>     at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
>     at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOpera

[jira] [Commented] (TEZ-4557) Revert TEZ-4303, NoClassDefFoundError because of missing httpclient jar

2024-05-02 Thread Raghav Aggarwal (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842941#comment-17842941
 ] 

Raghav Aggarwal commented on TEZ-4557:
--

I am using Hive 3.1.2, Hadoop 3.3.6, and Tez 0.10.2.

The issue should also occur with Hive 4 and Tez 0.10.3, as the httpclient jar is 
missing from tez/lib. 


[jira] [Commented] (TEZ-4557) Revert TEZ-4303, NoClassDefFoundError because of missing httpclient jar

2024-05-02 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842938#comment-17842938
 ] 

Ayush Saxena commented on TEZ-4557:
---

My standard question:
Does this reproduce on the latest Tez release and Hive 4.0?

Hive 4.0 and Tez 0.10.3 support Hadoop 3.3.6; Hive 3.x supports Hadoop 3.1.0.


[jira] [Updated] (TEZ-4557) Revert TEZ-4303, NoClassDefFoundError because of missing httpclient jar

2024-05-02 Thread Raghav Aggarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghav Aggarwal updated TEZ-4557:
-
Description: 
Inserting data into a table located in an encryption zone using Hive with Tez 
fails because the httpclient jar has been excluded from Hadoop's transitive 
dependencies. The same query passes with MR.

Tez: 0.10.2,0.10.3

Hadoop: 3.3.6

Hive: 3.1.2

 

Steps to reproduce issue:

1. Create an encryption key using the Ranger keyadmin user.
2. hdfs crypto -createZone -keyName test_key -path /user/raghav/encrypt_zone
3. create table tbl(id int) location '/user/raghav/encrypt_zone';
4. insert into tbl values(1);

 

Stacktrace:
{code:java}
Caused by: java.lang.NoClassDefFoundError: org/apache/http/client/utils/URIBuilder
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createURL(KMSClientProvider.java:468)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:823)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:354)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:350)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:175)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:350)
    at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:535)
    at org.apache.hadoop.hdfs.HdfsKMSUtil.decryptEncryptedDataEncryptionKey(HdfsKMSUtil.java:216)
    at org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:1002)
    at org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:983)
    at org.apache.hadoop.hdfs.DistributedFileSystem.safelyCreateWrappedOutputStream(DistributedFileSystem.java:734)
    at org.apache.hadoop.hdfs.DistributedFileSystem.access$300(DistributedFileSystem.java:149)
    at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:572)
    at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:566)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:580)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:507)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1233)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1109)
    at org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter(HiveIgnoreKeyTextOutputFormat.java:81)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:297)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:282)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:801)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:752)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:922)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:133)
    at org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:45)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:110)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFInline.process(GenericUDTFInline.java:64)
    at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:154

[jira] [Commented] (TEZ-4557) Revert TEZ-4303, NoClassDefFoundError because of missing httpclient jar

2024-05-02 Thread Raghav Aggarwal (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842937#comment-17842937
 ] 

Raghav Aggarwal commented on TEZ-4557:
--

CC [~abstractdog] 


[jira] [Updated] (TEZ-4557) Revert TEZ-4303, NoClassDefFoundError because of missing httpclient jar

2024-05-02 Thread Raghav Aggarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghav Aggarwal updated TEZ-4557:
-
Description: 
Inserting data into a table located in an encryption zone using Hive with Tez 
fails because the httpclient jar has been excluded from Hadoop's transitive 
dependencies. 

Tez: 0.10.2,0.10.3

Hadoop: 3.3.6

Hive: 3.1.2

 

Steps to reproduce issue:

1. Create an encryption key using the Ranger keyadmin user.
2. hdfs crypto -createZone -keyName test_key -path /user/raghav/encrypt_zone
3. create table tbl(id int) location '/user/raghav/encrypt_zone';
4. insert into tbl values(1);

 

Stacktrace:
{code:java}
Caused by: java.lang.NoClassDefFoundError: org/apache/http/client/utils/URIBuilder
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createURL(KMSClientProvider.java:468)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:823)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:354)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:350)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:175)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:350)
    at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:535)
    at org.apache.hadoop.hdfs.HdfsKMSUtil.decryptEncryptedDataEncryptionKey(HdfsKMSUtil.java:216)
    at org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:1002)
    at org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:983)
    at org.apache.hadoop.hdfs.DistributedFileSystem.safelyCreateWrappedOutputStream(DistributedFileSystem.java:734)
    at org.apache.hadoop.hdfs.DistributedFileSystem.access$300(DistributedFileSystem.java:149)
    at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:572)
    at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:566)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:580)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:507)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1233)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1109)
    at org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter(HiveIgnoreKeyTextOutputFormat.java:81)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:297)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:282)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:801)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:752)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:922)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:133)
    at org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:45)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:110)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFInline.process(GenericUDTFInline.java:64)
    at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:154

[jira] [Updated] (TEZ-4557) Revert TEZ-4303, NoClassDefFoundError because of missing httpclient jar

2024-05-02 Thread Raghav Aggarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghav Aggarwal updated TEZ-4557:
-
Description: 
Inserting data into a table located in an encryption zone using Hive with Tez 
fails because the httpclient jar has been excluded from Hadoop's transitive 
dependencies. 

 

Steps to reproduce issue:

1. Create an encryption key using the Ranger keyadmin user.
2. hdfs crypto -createZone -keyName test_key -path /user/raghav/encrypt_zone
3. create table tbl(id int) location '/user/raghav/encrypt_zone';
4. insert into tbl values(1);

 

Stacktrace:
{code:java}
Caused by: java.lang.NoClassDefFoundError: org/apache/http/client/utils/URIBuilder
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createURL(KMSClientProvider.java:468)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:823)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:354)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:350)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:175)
    at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:350)
    at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:535)
    at org.apache.hadoop.hdfs.HdfsKMSUtil.decryptEncryptedDataEncryptionKey(HdfsKMSUtil.java:216)
    at org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:1002)
    at org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:983)
    at org.apache.hadoop.hdfs.DistributedFileSystem.safelyCreateWrappedOutputStream(DistributedFileSystem.java:734)
    at org.apache.hadoop.hdfs.DistributedFileSystem.access$300(DistributedFileSystem.java:149)
    at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:572)
    at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:566)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:580)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:507)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1233)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1109)
    at org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter(HiveIgnoreKeyTextOutputFormat.java:81)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:297)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:282)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:801)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:752)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:922)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:133)
    at org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:45)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:110)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFInline.process(GenericUDTFInline.java:64)
    at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:154)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:556

[jira] [Created] (TEZ-4557) Revert TEZ-4303, NoClassDefFoundError because of missing httpclient jar

2024-05-02 Thread Raghav Aggarwal (Jira)
Raghav Aggarwal created TEZ-4557:


 Summary: Revert TEZ-4303, NoClassDefFoundError because of missing 
httpclient jar
 Key: TEZ-4557
 URL: https://issues.apache.org/jira/browse/TEZ-4557
 Project: Apache Tez
  Issue Type: Bug
Reporter: Raghav Aggarwal
Assignee: Raghav Aggarwal


Steps to reproduce issue:

1. Create an encryption key using the Ranger keyadmin user.
2. hdfs crypto -createZone -keyName test_key -path /user/raghav/encrypt_zone
3. create table tbl(id int) location '/user/raghav/encrypt_zone';
4. insert into tbl values(1);
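
The stack trace below ends in a NoClassDefFoundError for org/apache/http/client/utils/URIBuilder, so one quick way to confirm the missing jar is to probe the classpath for that class. The following is only a minimal sketch: the class name `ClasspathProbe` and its `isPresent` helper are hypothetical and not part of Tez, Hive or Hadoop. It is only meaningful when run with the same classpath as the failing task.

```java
final class ClasspathProbe {

  /** Returns true if the named class can be located by this class's loader. */
  static boolean isPresent(String className) {
    try {
      // initialize=false: only locate the class, do not run its static initializers.
      Class.forName(className, false, ClasspathProbe.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Class named in the NoClassDefFoundError of this report.
    String name = "org.apache.http.client.utils.URIBuilder";
    System.out.println(name + " present: " + isPresent(name));
  }
}
```

If the probe prints `false` on the task classpath, the httpclient jar is not visible there, which would match the error reported here.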

 

Stacktrace:
{code:java}
Caused by: java.lang.NoClassDefFoundError: 
org/apache/http/client/utils/URIBuilder
    at 
org.apache.hadoop.crypto.key.kms.KMSClientProvider.createURL(KMSClientProvider.java:468)
    at 
org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:823)
    at 
org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:354)
    at 
org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:350)
    at 
org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:175)
    at 
org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:350)
    at 
org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:535)
    at 
org.apache.hadoop.hdfs.HdfsKMSUtil.decryptEncryptedDataEncryptionKey(HdfsKMSUtil.java:216)
    at 
org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:1002)
    at 
org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:983)
    at 
org.apache.hadoop.hdfs.DistributedFileSystem.safelyCreateWrappedOutputStream(DistributedFileSystem.java:734)
    at 
org.apache.hadoop.hdfs.DistributedFileSystem.access$300(DistributedFileSystem.java:149)
    at 
org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:572)
    at 
org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:566)
    at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:580)
    at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:507)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1233)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1109)
    at 
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter(HiveIgnoreKeyTextOutputFormat.java:81)
    at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:297)
    at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:282)
    at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:801)
    at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:752)
    at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:922)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at 
org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at 
org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:133)
    at 
org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:45)
    at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:110)
    at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDTFInline.process(GenericUDTFInline.java:64)
    at 
org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
    at 
org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
    at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
    at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:154

[jira] [Resolved] (TEZ-4553) Default task scheduler to DagAwareTaskScheduler to avoid hang in TEZ-3535

2024-05-01 Thread Jonathan Turner Eagles (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Turner Eagles resolved TEZ-4553.
-
Fix Version/s: 0.10.4
   Resolution: Fixed

> Default task scheduler to DagAwareTaskScheduler to avoid hang in TEZ-3535
> -
>
> Key: TEZ-4553
> URL: https://issues.apache.org/jira/browse/TEZ-4553
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Jonathan Turner Eagles
>Assignee: Jonathan Turner Eagles
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (TEZ-4555) Fail fast in LocalClient if the dirs (log, local) haven't been created

2024-04-25 Thread Jira


[ 
https://issues.apache.org/jira/browse/TEZ-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840916#comment-17840916
 ] 

László Bodor commented on TEZ-4555:
---

thanks [~ayushtkn] for the review!

> Fail fast in LocalClient if the dirs (log, local) haven't been created
> --
>
> Key: TEZ-4555
> URL: https://issues.apache.org/jira/browse/TEZ-4555
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> https://github.com/apache/tez/blob/f080031f5c72bc4bfd8090ccdc670bdc0f7fd090/tez-dag/src/main/java/org/apache/tez/client/LocalClient.java#L332-L335
> {code}
>   Path logDir = new Path(userDir, "localmode-log-dir");
>   Path localDir = new Path(userDir, "localmode-local-dir");
>   localFs.mkdirs(logDir);
>   localFs.mkdirs(localDir);
> {code}
> in case of a non-writable local fs path (/base), this mkdirs silently returns 
> with false, whereas I can see that it's not writable on my mac:
> {code}
>  mkdir -p /base
> mkdir: /base: Read-only file system
> {code}
> leading to a confusing error message later:
> {code}
> 2024-04-24T02:03:52,101 ERROR [DAGAppMaster Thread] client.LocalClient: Error 
> starting DAGAppMaster
> java.io.FileNotFoundException: 
> /base/scratch/laszlobodor/_tez_session_dir/b76689bc-d25e-4d65-a339-44206ff57ce2/.tez/application_1713949431891_0001_wd/tez-conf.pb
>  (No such file or directory)
>   at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_292]
>   at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_292]
>   at java.io.FileInputStream.<init>(FileInputStream.java:138) 
> ~[?:1.8.0_292]
>   at 
> org.apache.tez.common.TezUtilsInternal.readUserSpecifiedTezConfiguration(TezUtilsInternal.java:84)
>  ~[tez-common-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
>   at 
> org.apache.tez.client.LocalClient.createDAGAppMaster(LocalClient.java:394) 
> ~[tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
>   at org.apache.tez.client.LocalClient$1.run(LocalClient.java:357) 
> [tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
>   at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
> {code}
> actually, the fix should be done in HIVE-28212, but we need to fail fast here 
> and give a hint to the user about the folder





[jira] [Resolved] (TEZ-4555) Fail fast in LocalClient if the dirs (log, local) haven't been created

2024-04-25 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved TEZ-4555.
---
Fix Version/s: 0.10.4
   Resolution: Fixed

> Fail fast in LocalClient if the dirs (log, local) haven't been created
> --
>
> Key: TEZ-4555
> URL: https://issues.apache.org/jira/browse/TEZ-4555
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Fix For: 0.10.4
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> https://github.com/apache/tez/blob/f080031f5c72bc4bfd8090ccdc670bdc0f7fd090/tez-dag/src/main/java/org/apache/tez/client/LocalClient.java#L332-L335
> {code}
>   Path logDir = new Path(userDir, "localmode-log-dir");
>   Path localDir = new Path(userDir, "localmode-local-dir");
>   localFs.mkdirs(logDir);
>   localFs.mkdirs(localDir);
> {code}
> in case of a non-writable local fs path (/base), this mkdirs silently returns 
> with false, whereas I can see that it's not writable on my mac:
> {code}
>  mkdir -p /base
> mkdir: /base: Read-only file system
> {code}
> leading to a confusing error message later:
> {code}
> 2024-04-24T02:03:52,101 ERROR [DAGAppMaster Thread] client.LocalClient: Error 
> starting DAGAppMaster
> java.io.FileNotFoundException: 
> /base/scratch/laszlobodor/_tez_session_dir/b76689bc-d25e-4d65-a339-44206ff57ce2/.tez/application_1713949431891_0001_wd/tez-conf.pb
>  (No such file or directory)
>   at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_292]
>   at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_292]
>   at java.io.FileInputStream.<init>(FileInputStream.java:138) 
> ~[?:1.8.0_292]
>   at 
> org.apache.tez.common.TezUtilsInternal.readUserSpecifiedTezConfiguration(TezUtilsInternal.java:84)
>  ~[tez-common-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
>   at 
> org.apache.tez.client.LocalClient.createDAGAppMaster(LocalClient.java:394) 
> ~[tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
>   at org.apache.tez.client.LocalClient$1.run(LocalClient.java:357) 
> [tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
>   at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
> {code}
> actually, the fix should be done in HIVE-28212, but we need to fail fast here 
> and give a hint to the user about the folder





[jira] [Commented] (TEZ-4555) Fail fast in LocalClient if the dirs (log, local) haven't been created

2024-04-25 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840746#comment-17840746
 ] 

Ayush Saxena commented on TEZ-4555:
---

Committed to master.

Thanx [~abstractdog] for the contribution!!!

> Fail fast in LocalClient if the dirs (log, local) haven't been created
> --
>
> Key: TEZ-4555
> URL: https://issues.apache.org/jira/browse/TEZ-4555
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> https://github.com/apache/tez/blob/f080031f5c72bc4bfd8090ccdc670bdc0f7fd090/tez-dag/src/main/java/org/apache/tez/client/LocalClient.java#L332-L335
> {code}
>   Path logDir = new Path(userDir, "localmode-log-dir");
>   Path localDir = new Path(userDir, "localmode-local-dir");
>   localFs.mkdirs(logDir);
>   localFs.mkdirs(localDir);
> {code}
> in case of a non-writable local fs path (/base), this mkdirs silently returns 
> with false, whereas I can see that it's not writable on my mac:
> {code}
>  mkdir -p /base
> mkdir: /base: Read-only file system
> {code}
> leading to a confusing error message later:
> {code}
> 2024-04-24T02:03:52,101 ERROR [DAGAppMaster Thread] client.LocalClient: Error 
> starting DAGAppMaster
> java.io.FileNotFoundException: 
> /base/scratch/laszlobodor/_tez_session_dir/b76689bc-d25e-4d65-a339-44206ff57ce2/.tez/application_1713949431891_0001_wd/tez-conf.pb
>  (No such file or directory)
>   at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_292]
>   at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_292]
>   at java.io.FileInputStream.<init>(FileInputStream.java:138) 
> ~[?:1.8.0_292]
>   at 
> org.apache.tez.common.TezUtilsInternal.readUserSpecifiedTezConfiguration(TezUtilsInternal.java:84)
>  ~[tez-common-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
>   at 
> org.apache.tez.client.LocalClient.createDAGAppMaster(LocalClient.java:394) 
> ~[tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
>   at org.apache.tez.client.LocalClient$1.run(LocalClient.java:357) 
> [tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
>   at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
> {code}
> actually, the fix should be done in HIVE-28212, but we need to fail fast here 
> and give a hint to the user about the folder





[jira] [Created] (TEZ-4556) Apache Tez Release 0.10.4

2024-04-24 Thread Jira
László Bodor created TEZ-4556:
-

 Summary: Apache Tez Release 0.10.4
 Key: TEZ-4556
 URL: https://issues.apache.org/jira/browse/TEZ-4556
 Project: Apache Tez
  Issue Type: Improvement
Reporter: László Bodor








[jira] [Updated] (TEZ-4555) Fail fast in LocalClient if the dirs (log, local) haven't been created

2024-04-24 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4555:
--
Summary: Fail fast in LocalClient if the dirs (log, local) haven't been 
created  (was: Fail fast in LocalClient if the dirs haven't been created)

> Fail fast in LocalClient if the dirs (log, local) haven't been created
> --
>
> Key: TEZ-4555
> URL: https://issues.apache.org/jira/browse/TEZ-4555
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> https://github.com/apache/tez/blob/f080031f5c72bc4bfd8090ccdc670bdc0f7fd090/tez-dag/src/main/java/org/apache/tez/client/LocalClient.java#L332-L335
> {code}
>   Path logDir = new Path(userDir, "localmode-log-dir");
>   Path localDir = new Path(userDir, "localmode-local-dir");
>   localFs.mkdirs(logDir);
>   localFs.mkdirs(localDir);
> {code}
> in case of a non-writable local fs path (/base), this mkdirs silently returns 
> with false, whereas I can see that it's not writable on my mac:
> {code}
>  mkdir -p /base
> mkdir: /base: Read-only file system
> {code}
> leading to a confusing error message later:
> {code}
> 2024-04-24T02:03:52,101 ERROR [DAGAppMaster Thread] client.LocalClient: Error 
> starting DAGAppMaster
> java.io.FileNotFoundException: 
> /base/scratch/laszlobodor/_tez_session_dir/b76689bc-d25e-4d65-a339-44206ff57ce2/.tez/application_1713949431891_0001_wd/tez-conf.pb
>  (No such file or directory)
>   at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_292]
>   at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_292]
>   at java.io.FileInputStream.<init>(FileInputStream.java:138) 
> ~[?:1.8.0_292]
>   at 
> org.apache.tez.common.TezUtilsInternal.readUserSpecifiedTezConfiguration(TezUtilsInternal.java:84)
>  ~[tez-common-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
>   at 
> org.apache.tez.client.LocalClient.createDAGAppMaster(LocalClient.java:394) 
> ~[tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
>   at org.apache.tez.client.LocalClient$1.run(LocalClient.java:357) 
> [tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
>   at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
> {code}
> actually, the fix should be done in HIVE-28212, but we need to fail fast here 
> and give a hint to the user about the folder





[jira] [Updated] (TEZ-4555) Fail fast in LocalClient if the dirs haven't been created

2024-04-24 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4555:
--
Description: 
https://github.com/apache/tez/blob/f080031f5c72bc4bfd8090ccdc670bdc0f7fd090/tez-dag/src/main/java/org/apache/tez/client/LocalClient.java#L332-L335
{code}
  Path logDir = new Path(userDir, "localmode-log-dir");
  Path localDir = new Path(userDir, "localmode-local-dir");
  localFs.mkdirs(logDir);
  localFs.mkdirs(localDir);
{code}

in case of a non-writable local fs path (/base), this mkdirs silently returns 
with false, whereas I can see that it's not writable on my mac:
{code}
 mkdir -p /base
mkdir: /base: Read-only file system
{code}

leading to a confusing error message later:
{code}
2024-04-24T02:03:52,101 ERROR [DAGAppMaster Thread] client.LocalClient: Error 
starting DAGAppMaster
java.io.FileNotFoundException: 
/base/scratch/laszlobodor/_tez_session_dir/b76689bc-d25e-4d65-a339-44206ff57ce2/.tez/application_1713949431891_0001_wd/tez-conf.pb
 (No such file or directory)
at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_292]
at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_292]
at java.io.FileInputStream.<init>(FileInputStream.java:138) 
~[?:1.8.0_292]
at 
org.apache.tez.common.TezUtilsInternal.readUserSpecifiedTezConfiguration(TezUtilsInternal.java:84)
 ~[tez-common-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
at 
org.apache.tez.client.LocalClient.createDAGAppMaster(LocalClient.java:394) 
~[tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
at org.apache.tez.client.LocalClient$1.run(LocalClient.java:357) 
[tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
{code}

actually, the fix should be done in HIVE-28212, but we need to fail fast here 
and give a hint to the user about the folder
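
The fail-fast behaviour asked for above can be sketched as follows. This is a hedged illustration, not the committed Tez patch: it uses `java.io.File` instead of Hadoop's `FileSystem` so that it runs without extra dependencies, and the class name `FailFastDirs` and helper `mkdirsOrFail` are hypothetical. The point is only that the boolean result of `mkdirs` is checked and turned into an exception that names the offending path.

```java
import java.io.File;
import java.io.IOException;

final class FailFastDirs {

  /** Creates dir (and any missing parents) or throws with the path in the message. */
  static File mkdirsOrFail(File dir) throws IOException {
    // File.mkdirs() returns false both on failure and when the directory
    // already exists, so an existing directory is checked separately.
    if (!dir.mkdirs() && !dir.isDirectory()) {
      throw new IOException("Could not create directory: " + dir.getAbsolutePath()
          + " (check that the parent path exists and is writable)");
    }
    return dir;
  }

  public static void main(String[] args) throws IOException {
    // Demo location under the JVM temp dir; the directory names mirror the snippet above.
    File userDir = new File(System.getProperty("java.io.tmpdir"), "failfast-demo");
    mkdirsOrFail(new File(userDir, "localmode-log-dir"));
    mkdirsOrFail(new File(userDir, "localmode-local-dir"));
    System.out.println("created directories under " + userDir);
  }
}
```

With a check like this, a read-only base path would surface as an error naming the directory at creation time, rather than as the later FileNotFoundException for tez-conf.pb.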


> Fail fast in LocalClient if the dirs haven't been created
> -
>
> Key: TEZ-4555
> URL: https://issues.apache.org/jira/browse/TEZ-4555
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> https://github.com/apache/tez/blob/f080031f5c72bc4bfd8090ccdc670bdc0f7fd090/tez-dag/src/main/java/org/apache/tez/client/LocalClient.java#L332-L335
> {code}
>   Path logDir = new Path(userDir, "localmode-log-dir");
>   Path localDir = new Path(userDir, "localmode-local-dir");
>   localFs.mkdirs(logDir);
>   localFs.mkdirs(localDir);
> {code}
> in case of a non-writable local fs path (/base), this mkdirs silently returns 
> with false, whereas I can see that it's not writable on my mac:
> {code}
>  mkdir -p /base
> mkdir: /base: Read-only file system
> {code}
> leading to a confusing error message later:
> {code}
> 2024-04-24T02:03:52,101 ERROR [DAGAppMaster Thread] client.LocalClient: Error 
> starting DAGAppMaster
> java.io.FileNotFoundException: 
> /base/scratch/laszlobodor/_tez_session_dir/b76689bc-d25e-4d65-a339-44206ff57ce2/.tez/application_1713949431891_0001_wd/tez-conf.pb
>  (No such file or directory)
>   at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_292]
>   at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_292]
>   at java.io.FileInputStream.<init>(FileInputStream.java:138) 
> ~[?:1.8.0_292]
>   at 
> org.apache.tez.common.TezUtilsInternal.readUserSpecifiedTezConfiguration(TezUtilsInternal.java:84)
>  ~[tez-common-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
>   at 
> org.apache.tez.client.LocalClient.createDAGAppMaster(LocalClient.java:394) 
> ~[tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
>   at org.apache.tez.client.LocalClient$1.run(LocalClient.java:357) 
> [tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
>   at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
> {code}
> actually, the fix should be done in HIVE-28212, but we need to fail fast here 
> and give a hint to the user about the folder





[jira] [Assigned] (TEZ-4555) Fail fast in LocalClient if the dirs haven't been created

2024-04-24 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned TEZ-4555:
-

Assignee: László Bodor

> Fail fast in LocalClient if the dirs haven't been created
> -
>
> Key: TEZ-4555
> URL: https://issues.apache.org/jira/browse/TEZ-4555
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>






[jira] [Created] (TEZ-4555) Fail fast in LocalClient if the dirs haven't been created

2024-04-24 Thread Jira
László Bodor created TEZ-4555:
-

 Summary: Fail fast in LocalClient if the dirs haven't been created
 Key: TEZ-4555
 URL: https://issues.apache.org/jira/browse/TEZ-4555
 Project: Apache Tez
  Issue Type: Bug
Reporter: László Bodor








[jira] [Updated] (TEZ-4554) Counter for used nodes within a DAG

2024-04-17 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4554:
--
Description: 
This is the tez container node counter corresponding to HIVE-28201.
Considering tez containers as ephemeral ones (instead of LLAP nodes), it 
doesn't make sense to distinguish between all and used nodes, we count all 
nodes that ran at least 1 task attempt:
NODE_USED_COUNT
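
The counting rule described above (a node counts once if it ran at least one task attempt) can be sketched as follows. This is an illustration only: the class `UsedNodeTracker` and its methods are hypothetical, node IDs are assumed to be plain strings, and none of this is taken from the Tez implementation.

```java
import java.util.HashSet;
import java.util.Set;

final class UsedNodeTracker {

  // Distinct IDs of nodes that ran at least one task attempt.
  private final Set<String> usedNodes = new HashSet<>();

  /** Records that a task attempt ran on the given node; repeated nodes are ignored. */
  void onTaskAttemptStarted(String nodeId) {
    usedNodes.add(nodeId);
  }

  /** Value that would be reported as the used-nodes counter. */
  int nodesUsedCount() {
    return usedNodes.size();
  }

  public static void main(String[] args) {
    UsedNodeTracker tracker = new UsedNodeTracker();
    tracker.onTaskAttemptStarted("node-1");
    tracker.onTaskAttemptStarted("node-1"); // second attempt on the same node
    tracker.onTaskAttemptStarted("node-2");
    System.out.println("used nodes: " + tracker.nodesUsedCount());
  }
}
```

A set keeps the counter independent of how many attempts each node ran, which matches the "at least 1 task attempt" wording.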

> Counter for used nodes within a DAG
> ---
>
> Key: TEZ-4554
> URL: https://issues.apache.org/jira/browse/TEZ-4554
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: László Bodor
>Priority: Major
>
> This is the tez container node counter corresponding to HIVE-28201.
> Considering tez containers as ephemeral ones (instead of LLAP nodes), it 
> doesn't make sense to distinguish between all and used nodes, we count all 
> nodes that ran at least 1 task attempt:
> NODE_USED_COUNT





[jira] [Created] (TEZ-4554) Counter for used nodes within a DAG

2024-04-17 Thread Jira
László Bodor created TEZ-4554:
-

 Summary: Counter for used nodes within a DAG
 Key: TEZ-4554
 URL: https://issues.apache.org/jira/browse/TEZ-4554
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: László Bodor








[jira] [Updated] (TEZ-4554) Counter for used nodes within a DAG

2024-04-17 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4554:
--
Description: 
This is the tez container node counter corresponding to HIVE-28201.
Considering tez containers as ephemeral ones (instead of LLAP nodes), it 
doesn't make sense to distinguish between all and used nodes, we count all 
nodes that ran at least 1 task attempt:
NODES_USED_COUNT

The number of used containers has been implemented in TEZ-2119

  was:
This is the tez container node counter corresponding to HIVE-28201.
Considering tez containers as ephemeral ones (instead of LLAP nodes), it 
doesn't make sense to distinguish between all and used nodes, we count all 
nodes that ran at least 1 task attempt:
NODE_USED_COUNT

The number of used containers has been implemented in TEZ-2119


> Counter for used nodes within a DAG
> ---
>
> Key: TEZ-4554
> URL: https://issues.apache.org/jira/browse/TEZ-4554
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: László Bodor
>Priority: Major
>
> This is the tez container node counter corresponding to HIVE-28201.
> Considering tez containers as ephemeral ones (instead of LLAP nodes), it 
> doesn't make sense to distinguish between all and used nodes, we count all 
> nodes that ran at least 1 task attempt:
> NODES_USED_COUNT
> The number of used containers has been implemented in TEZ-2119





[jira] [Updated] (TEZ-4554) Counter for used nodes within a DAG

2024-04-17 Thread Jira


 [ 
https://issues.apache.org/jira/browse/TEZ-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated TEZ-4554:
--
Description: 
This is the tez container node counter corresponding to HIVE-28201.
Considering tez containers as ephemeral ones (instead of LLAP nodes), it 
doesn't make sense to distinguish between all and used nodes, we count all 
nodes that ran at least 1 task attempt:
NODE_USED_COUNT

The number of used containers has been implemented in TEZ-2119

  was:
This is the tez container node counter corresponding to HIVE-28201.
Considering tez containers as ephemeral ones (instead of LLAP nodes), it 
doesn't make sense to distinguish between all and used nodes, we count all 
nodes that ran at least 1 task attempt:
NODE_USED_COUNT


> Counter for used nodes within a DAG
> ---
>
> Key: TEZ-4554
> URL: https://issues.apache.org/jira/browse/TEZ-4554
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: László Bodor
>Priority: Major
>
> This is the tez container node counter corresponding to HIVE-28201.
> Considering tez containers as ephemeral ones (instead of LLAP nodes), it 
> doesn't make sense to distinguish between all and used nodes, we count all 
> nodes that ran at least 1 task attempt:
> NODE_USED_COUNT
> The number of used containers has been implemented in TEZ-2119




