Hi all,

I am using the Flink DataSet API for a batch job that reads some logs, then groups and sorts them. Our cluster has almost 2000 servers, and we are used to running traditional MapReduce (MR) jobs. I tried Flink for an experimental job, but I encountered the error below and cannot continue. Can anyone help with it?
Our MR jobs also encounter such connection errors sometimes, but they retry several times and then succeed. In Flink, it seems that the whole job fails when a single task fails.

java.io.IOException: Cannot get library with hash 858478de9791c1a5fbbb138c02ec182b916f7962
    at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerReferenceToBlobKeyAndGetURL(BlobLibraryCacheManager.java:262)
    at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerTask(BlobLibraryCacheManager.java:116)
    at org.apache.flink.runtime.taskmanager.Task.createUserCodeClassloader(Task.java:721)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:472)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Failed to fetch BLOB 858478de9791c1a5fbbb138c02ec182b916f7962 from /10.132.99.150:42927 and store it under /tmp/blobStore-a2b79e70-74b9-49e8-a5bb-f2842aeec3b0/cache/blob_858478de9791c1a5fbbb138c02ec182b916f7962
    at org.apache.flink.runtime.blob.BlobCache.getURL(BlobCache.java:177)
    at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerReferenceToBlobKeyAndGetURL(BlobLibraryCacheManager.java:253)
    ... 4 more
Caused by: java.io.IOException: Could not connect to BlobServer at address /10.132.99.150:42927
    at org.apache.flink.runtime.blob.BlobClient.<init>(BlobClient.java:88)
    at org.apache.flink.runtime.blob.BlobCache.getURL(BlobCache.java:124)
    ... 5 more
Caused by: java.net.ConnectException: Connection timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at java.net.Socket.connect(Socket.java:538)
    at org.apache.flink.runtime.blob.BlobClient.<init>(BlobClient.java:84)
    ... 6 more

--
Best regards
Sili Liu