SteNicholas commented on PR #3518:
URL: https://github.com/apache/celeborn/pull/3518#issuecomment-3456422278
@pan3793, thanks for review. I have verified the packaging via
`./build/make-distribution.sh -Pspark-3.5`, which generates the native folder
for `celeborn-client-spark-3-shaded_2.12-0.7.0-SNAPSHOT.jar`.
```
$ jar -xvf celeborn-client-spark-3-shaded_2.12-0.7.0-SNAPSHOT.jar
...
$ ls META-INF/native/
libnetty_transport_native_epoll_riscv64.so
liborg_apache_celeborn_shaded_netty_transport_native_epoll_aarch_64.so
liborg_apache_celeborn_shaded_netty_transport_native_kqueue_x86_64.jnilib
liborg_apache_celeborn_shaded_netty_resolver_dns_native_macos_aarch_64.jnilib
liborg_apache_celeborn_shaded_netty_transport_native_epoll_x86_64.so
liborg_apache_celeborn_shaded_netty_resolver_dns_native_macos_x86_64.jnilib
liborg_apache_celeborn_shaded_netty_transport_native_kqueue_aarch_64.jnilib
```
```
25/10/28 20:14:18,582 INFO [main] SignalUtils: Registering signal handler
for TERM
25/10/28 20:14:18,625 INFO [main] SignalUtils: Registering signal handler
for HUP
25/10/28 20:14:18,626 INFO [main] SignalUtils: Registering signal handler
for INT
25/10/28 20:14:24,231 WARN [main] JVMSource: Add gauge jvm.thread.deadlocks
failed, the value type class java.util.Collections$EmptySet is not a number
25/10/28 20:14:24,297 INFO [main] Dispatcher: Celeborn dispatcher
numThreads: 64
25/10/28 20:14:24,330 INFO [main] TransportContext: SSL not enabled for
module = rpc_service
25/10/28 20:14:25,319 INFO [main] TransportClientFactory: Module rpc_service
mode KQUEUE threads 64
25/10/28 20:14:25,364 INFO [main] NettyRpcEnvFactory: Starting RPC Server
[Master] on 30.177.49.196:9097 with advertised endpoint 30.177.49.196:9097
25/10/28 20:14:30,480 INFO [main] Utils: Successfully started service
'Master' on port 9097.
25/10/28 20:14:30,690 INFO [main] Master: Metrics system enabled.
25/10/28 20:14:30,724 INFO [main] log: Logging initialized @18339ms to
org.eclipse.jetty.util.log.Slf4jLog
25/10/28 20:14:31,908 INFO [main] ContextHandler: Started
o.e.j.s.ServletContextHandler@29a23c3d{/,null,AVAILABLE}
25/10/28 20:14:31,910 WARN [main] ContextHandler:
o.e.j.s.ServletContextHandler@4fdca00a{/,null,STOPPED} contextPath ends with /
25/10/28 20:14:31,911 INFO [main] ContextHandler: Started
o.e.j.s.ServletContextHandler@4fdca00a{/swagger-static,null,AVAILABLE}
25/10/28 20:14:31,911 INFO [main] ContextHandler: Started
o.e.j.s.ServletContextHandler@5a8c93{/swagger,null,AVAILABLE}
25/10/28 20:14:31,912 INFO [main] ContextHandler: Started
o.e.j.s.ServletContextHandler@119b0892{/help,null,AVAILABLE}
25/10/28 20:14:31,912 INFO [main] ContextHandler: Started
o.e.j.s.ServletContextHandler@4ed4a7e4{/docs,null,AVAILABLE}
25/10/28 20:14:31,919 INFO [main] Master: Adding metrics servlet handler
with path /metrics/prometheus
25/10/28 20:14:31,920 INFO [main] ContextHandler: Started
o.e.j.s.ServletContextHandler@4086d8fb{/metrics/prometheus,null,AVAILABLE}
25/10/28 20:14:31,920 INFO [main] Master: Adding metrics servlet handler
with path /metrics/json
25/10/28 20:14:31,920 INFO [main] ContextHandler: Started
o.e.j.s.ServletContextHandler@66236a0a{/metrics/json,null,AVAILABLE}
25/10/28 20:14:31,921 INFO [main] Server: jetty-9.4.56.v20240826; built:
2024-08-26T17:15:05.868Z; git: ec6782ff5ead824dabdcf47fa98f90a4aedff401; jvm
1.8.0_202-b08
25/10/28 20:14:31,931 INFO [main] Server: Started @19548ms
25/10/28 20:14:31,942 INFO [main] AbstractConnector: Started
ServerConnector@1ee47d9e{HTTP/1.1, (http/1.1)}{30.177.49.196:9098}
25/10/28 20:14:31,943 INFO [main] HttpServer: master: HttpServer started on
30.177.49.196:9098.
25/10/28 20:14:31,944 INFO [main] Master: Master started.
25/10/28 20:15:35,740 INFO [celeborn-dispatcher-7] Master: Registered worker
Host: 30.177.49.196
RpcPort: 55893
PushPort: 55894
FetchPort: 55896
ReplicatePort: 55895
InternalPort: 55893
SlotsUsed: 0
LastHeartbeat: 0
HeartbeatElapsedSeconds: 1761653735
Disks:
DiskInfo0: DiskInfo(maxSlots: 7366, availableSlots: 4166, committed
shuffles 0, running applications 0, mountPoint: /, usableSpace: 260.4 GiB,
totalSpace: 460.4 GiB, avgFlushTime: 0 ns, avgFetchTime: 0 ns, activeSlots: 0,
storageType: HDD) status: HEALTHY dirs shuffleAllocations: Map()
UserResourceConsumption: empty
WorkerRef: null
WorkerStatus: WorkerStatus{state=Normal, stateStartTime=1761653735704}
NetworkLocation: /default-rack
NextInterruptionNotice: None
.
25/10/28 20:19:30,693 INFO [master-partition-size-updater] Master: Cluster
estimate partition size 64.0 MiB
25/10/28 20:29:30,711 INFO [master-partition-size-updater] Master: Cluster
estimate partition size 64.0 MiB
25/10/28 20:36:51,379 INFO [celeborn-dispatcher-28] Master: Unregister
shuffle 1761654853643-ee5683af89d367555a3089ff88f0bdd8-0
25/10/28 20:36:51,392 INFO [celeborn-dispatcher-29] Master: Unregister
shuffle 1761654853643-ee5683af89d367555a3089ff88f0bdd8-1
25/10/28 20:39:30,733 INFO [master-partition-size-updater] Master: Cluster
estimate partition size 64.0 MiB
25/10/28 20:42:55,494 INFO [celeborn-dispatcher-27] Master: Unregister
shuffle 1761655214105-5f7c7a8bb9eb8aea1f39663cb8423189-0
25/10/28 20:42:55,506 INFO [celeborn-dispatcher-28] Master: Unregister
shuffle 1761655214105-5f7c7a8bb9eb8aea1f39663cb8423189-1
25/10/28 20:43:55,496 INFO [celeborn-dispatcher-37] Master: Unregister
shuffle 1761655214105-5f7c7a8bb9eb8aea1f39663cb8423189-2
25/10/28 20:43:55,511 INFO [celeborn-dispatcher-38] Master: Unregister
shuffle 1761655214105-5f7c7a8bb9eb8aea1f39663cb8423189-3
25/10/28 20:44:20,785 WARN [celeborn-dispatcher-45] TagsManager: No tags
provided
25/10/28 20:44:20,813 INFO [celeborn-dispatcher-45] Master: Successfully
offered slots for 1 reducers of
1761655214105-5f7c7a8bb9eb8aea1f39663cb8423189-4 on 1 workers.
25/10/28 20:44:30,809 WARN [celeborn-dispatcher-49] Master: Application
1761654853643-ee5683af89d367555a3089ff88f0bdd8 timeout, trigger applicationLost
event.
25/10/28 20:44:30,815 INFO [master-noneager-handler-0] Master: Removed
application 1761654853643-ee5683af89d367555a3089ff88f0bdd8
25/10/28 20:45:55,520 INFO [celeborn-dispatcher-61] Master: Unregister
shuffle 1761655214105-5f7c7a8bb9eb8aea1f39663cb8423189-4
25/10/28 20:49:30,748 INFO [master-partition-size-updater] Master: Cluster
estimate partition size 64.0 MiB
25/10/28 20:50:12,364 INFO [celeborn-dispatcher-40] Master: Unregister
shuffle 1761655631769-7cb65d4ac219c0d27851bc7fdddd0245-0
25/10/28 20:52:00,826 WARN [celeborn-dispatcher-54] Master: Application
1761655214105-5f7c7a8bb9eb8aea1f39663cb8423189 timeout, trigger applicationLost
event.
25/10/28 20:52:00,835 INFO [master-noneager-handler-1] Master: Removed
application 1761655214105-5f7c7a8bb9eb8aea1f39663cb8423189
25/10/28 20:57:00,840 WARN [celeborn-dispatcher-3] Master: Application
1761655631769-7cb65d4ac219c0d27851bc7fdddd0245 timeout, trigger applicationLost
event.
25/10/28 20:57:00,845 INFO [master-noneager-handler-2] Master: Removed
application 1761655631769-7cb65d4ac219c0d27851bc7fdddd0245
25/10/28 20:59:30,762 INFO [master-partition-size-updater] Master: Cluster
estimate partition size 64.0 MiB
25/10/28 21:00:44,393 WARN [celeborn-dispatcher-42] TagsManager: No tags
provided
25/10/28 21:00:44,396 INFO [celeborn-dispatcher-42] Master: Successfully
offered slots for 1 reducers of
1761656290656-706ab0b05e029b488a48ca824a735696-3 on 1 workers.
25/10/28 21:00:47,892 INFO [celeborn-dispatcher-43] Master: Unregister
shuffle 1761656290656-706ab0b05e029b488a48ca824a735696-0
25/10/28 21:01:47,889 INFO [celeborn-dispatcher-54] Master: Unregister
shuffle 1761656290656-706ab0b05e029b488a48ca824a735696-1
25/10/28 21:01:47,911 INFO [celeborn-dispatcher-55] Master: Unregister
shuffle 1761656290656-706ab0b05e029b488a48ca824a735696-2
25/10/28 21:01:47,922 INFO [celeborn-dispatcher-53] Master: Unregister
shuffle 1761656290656-706ab0b05e029b488a48ca824a735696-3
25/10/28 21:09:30,777 INFO [master-partition-size-updater] Master: Cluster
estimate partition size 64.0 MiB
25/10/28 21:09:30,865 WARN [celeborn-dispatcher-23] Master: Application
1761656290656-706ab0b05e029b488a48ca824a735696 timeout, trigger applicationLost
event.
25/10/28 21:09:30,884 INFO [master-noneager-handler-3] Master: Removed
application 1761656290656-706ab0b05e029b488a48ca824a735696
tHandler@2dd8ff1d{/,null,STOPPED} contextPath ends with /
25/10/28 20:15:35,443 INFO [main] ContextHandler: Started
o.e.j.s.ServletContextHandler@2dd8ff1d{/swagger-static,null,AVAILABLE}
25/10/28 20:15:35,443 INFO [main] ContextHandler: Started
o.e.j.s.ServletContextHandler@36b9cb99{/swagger,null,AVAILABLE}
25/10/28 20:15:35,443 INFO [main] ContextHandler: Started
o.e.j.s.ServletContextHandler@2bfaba70{/help,null,AVAILABLE}
25/10/28 20:15:35,444 INFO [main] ContextHandler: Started
o.e.j.s.ServletContextHandler@9301672{/docs,null,AVAILABLE}
25/10/28 20:15:35,451 INFO [main] Worker: Adding metrics servlet handler
with path /metrics/prometheus
25/10/28 20:15:35,451 INFO [main] ContextHandler: Started
o.e.j.s.ServletContextHandler@7fd987ef{/metrics/prometheus,null,AVAILABLE}
25/10/28 20:15:35,451 INFO [main] Worker: Adding metrics servlet handler
with path /metrics/json
25/10/28 20:15:35,451 INFO [main] ContextHandler: Started
o.e.j.s.ServletContextHandler@7209ffb5{/metrics/json,null,AVAILABLE}
25/10/28 20:15:35,452 INFO [main] Server: jetty-9.4.56.v20240826; built:
2024-08-26T17:15:05.868Z; git: ec6782ff5ead824dabdcf47fa98f90a4aedff401; jvm
1.8.0_202-b08
25/10/28 20:15:35,464 INFO [main] Server: Started @19824ms
25/10/28 20:15:35,479 INFO [main] AbstractConnector: Started
ServerConnector@57bbebe7{HTTP/1.1, (http/1.1)}{30.177.49.196:9096}
25/10/28 20:15:35,483 INFO [main] HttpServer: worker: HttpServer started on
30.177.49.196:9096.
25/10/28 20:15:35,485 INFO [main] Worker: Starting Worker
30.177.49.196:55894:55896:55895 with {/=DiskInfo(maxSlots: 0, availableSlots:
0, committed shuffles 0, running applications 0, mountPoint: /, usableSpace:
260.4 GiB, totalSpace: 460.4 GiB, avgFlushTime: 0 ns, avgFetchTime: 0 ns,
activeSlots: 0, storageType: HDD) status: HEALTHY dirs
/private/tmp/celeborn_worker/celeborn-worker/shuffle_data shuffleAllocations:
Map()} slots.
25/10/28 20:15:35,642 INFO [main] MasterClient: connect to master
30.177.49.196:9097.
25/10/28 20:15:35,763 INFO [main] Worker: Register worker successfully.
25/10/28 20:15:35,771 INFO [main] Worker: Worker started.
```
```
2025-10-28 21:10:50,855 INFO
org.apache.celeborn.plugin.flink.RemoteShuffleMaster [] - Register job:
ec0e0012f46b3f03a5371d637584be00.
2025-10-28 21:10:50,856 INFO
org.apache.celeborn.plugin.flink.RemoteShuffleMaster [] -
CelebornAppId: 1761657018509-ec0e0012f46b3f03a5371d637584be00
2025-10-28 21:10:56,005 INFO
org.apache.celeborn.common.rpc.netty.Dispatcher [] - Celeborn
dispatcher numThreads: 32
2025-10-28 21:10:56,021 INFO
org.apache.celeborn.common.network.TransportContext [] - SSL not
enabled for module = rpc_app_lifecyclemanager
2025-10-28 21:10:56,749 INFO
org.apache.celeborn.common.network.client.TransportClientFactory [] - Module
rpc_app_lifecyclemanager mode KQUEUE threads 14
2025-10-28 21:10:56,783 INFO
org.apache.celeborn.common.rpc.netty.NettyRpcEnvFactory [] - Starting RPC
Server [LifecycleManager] on 30.177.49.196:0 with advertised endpoint
30.177.49.196:0
2025-10-28 21:11:01,877 INFO org.apache.celeborn.common.util.Utils
[] - Successfully started service 'LifecycleManager' on port
59322.
2025-10-28 21:11:01,881 INFO org.apache.celeborn.client.LifecycleManager
[] - Starting LifecycleManager on 30.177.49.196:59322
2025-10-28 21:11:01,894 INFO
org.apache.celeborn.common.client.StaticMasterEndpointResolver [] -
masterEndpoints = List(30.177.49.196:9097)
2025-10-28 21:11:01,961 INFO
org.apache.celeborn.client.ApplicationHeartbeater [] - Send app
heartbeat with written: 0.0 B, file count: 0, shuffle count: 0, shuffle
fallback counts: Map(), application count: 1, application fallback counts: Map()
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]