Aggarwal-Raghav commented on PR #456:
URL: https://github.com/apache/tez/pull/456#issuecomment-3807049845

   Thanks for the pointers, @abstractdog.
   1. Yes, the implementation is modeled on Hive (TBH, pom.xml, 
build-docker.sh, and some parts of the Dockerfile are taken from Hive to some 
extent).
   2. For a basic startup of the Tez AM without Hadoop jars, I didn't observe 
any issue. The Tez tarball contains a few Hadoop jars, and I think they and 
their transitive dependency jars are sufficient for the Tez AM to act as a 
client of Hadoop services (but I have a commit ready in case we later want to 
remove the Hadoop tarball).
   3. **No update. I believe a code change in DAGAppMaster is required for 
segregation.**
   4. Raised https://github.com/apache/tez/pull/458
   
   **A few additional things:**
   DAGAppMaster#serviceInit() => DAGAppMaster#createTaskSchedulerManager 
tries to connect to the ResourceManager even in ZooKeeper mode. I think we 
shouldn't use the YARN scheduler here and could maybe move to 
[YuniKorn](https://yunikorn.apache.org/) (we use it with Spark internally). 
Let me know how to proceed on this. For now, should I raise a PR that skips 
the YARN scheduler when ZK mode is enabled? The log below shows the current 
behavior:
   ```
   2026-01-27 19:13:06,207 INFO zookeeper.ZkAMRegistry: Added AMRecord to 
zkpath /tez-external-sessions/tez_am/server/application_1769280834537_0000
   2026-01-27 19:13:06,208 INFO app.DAGAppMaster: Added AMRecord: 
{hostName=2d0733bd53ae, externalId=tez-session-, hostIp=172.17.0.2, port=10001, 
computeName=default-compute, appId=application_1769280834537_0000} to registry..
   2026-01-27 19:13:06,210 INFO rm.TaskSchedulerManager: Creating YARN 
TaskScheduler: org.apache.tez.dag.app.rm.DagAwareYarnTaskScheduler
   2026-01-27 19:13:06,253 INFO conf.Configuration: resource-types.xml not found
   2026-01-27 19:13:06,253 INFO resource.ResourceUtils: Unable to find 
'resource-types.xml'.
   2026-01-27 19:13:06,259 INFO Configuration.deprecation: 
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
   2026-01-27 19:13:06,259 INFO Configuration.deprecation: 
yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, 
use yarn.system-metrics-publisher.enabled
   2026-01-27 19:13:06,263 INFO rm.DagAwareYarnTaskScheduler: scheduler 
initialized with maxRMHeartbeatInterval:1000 reuseEnabled:true reuseRack:true 
reuseAny:false localityDelay:250 preemptPercentage:10 preemptMaxWaitTime:60000 
numHeartbeatsBetweenPreemptions:3 idleContainerMinTimeout:5000 
idleContainerMaxTimeout:10000 sessionMinHeldContainers:0
   2026-01-27 19:13:06,267 INFO client.DefaultNoHARMFailoverProxyProvider: 
Connecting to ResourceManager at /0.0.0.0:8030
   2026-01-27 19:13:07,572 INFO ipc.Client: Retrying connect to server: 
0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
   2026-01-27 19:13:08,580 INFO ipc.Client: Retrying connect to server: 
0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
   2026-01-27 19:13:09,588 INFO ipc.Client: Retrying connect to server: 
0.0.0.0/0.0.0.0:8030. Already tried 2 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
   2026-01-27 19:13:10,595 INFO ipc.Client: Retrying connect to server: 
0.0.0.0/0.0.0.0:8030. Already tried 3 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
   ```
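
To make the question concrete, here is a minimal, self-contained sketch of the kind of guard I have in mind. Everything in it is hypothetical: `SchedulerSelectionSketch`, `SchedulerKind`, `ZK_MODE_KEY`, and `chooseScheduler` are illustrative names, not existing Tez classes or configuration keys, and a plain `Map` stands in for the real configuration object. The only point is that the choice of scheduler would be made before anything tries to reach the ResourceManager.

```java
import java.util.Map;

/** Hypothetical sketch only; none of these names are actual Tez APIs. */
public class SchedulerSelectionSketch {

    /** Illustrative scheduler kinds the AM could start. */
    public enum SchedulerKind { YARN, EXTERNAL }

    // Assumed placeholder key standing in for whatever flag marks ZooKeeper mode.
    public static final String ZK_MODE_KEY = "example.am.zookeeper.mode.enabled";

    /**
     * Picks the scheduler kind from the configuration.
     * When ZooKeeper mode is on, the YARN scheduler is skipped,
     * so no ResourceManager connection would be attempted.
     */
    public static SchedulerKind chooseScheduler(Map<String, String> conf) {
        boolean zkMode = Boolean.parseBoolean(conf.getOrDefault(ZK_MODE_KEY, "false"));
        return zkMode ? SchedulerKind.EXTERNAL : SchedulerKind.YARN;
    }

    public static void main(String[] args) {
        // ZooKeeper mode enabled: the YARN scheduler is not selected.
        System.out.println(chooseScheduler(Map.of(ZK_MODE_KEY, "true")));
        // Flag absent: fall back to the YARN scheduler as today.
        System.out.println(chooseScheduler(Map.of()));
    }
}
```

In the real code the equivalent check would presumably sit around the scheduler creation in DAGAppMaster#createTaskSchedulerManager, but where exactly it belongs is part of what I am asking.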


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
