Hi Colleagues,

Perhaps some of these questions should be asked on the Flink mailing list. Please let me know if that's the case so I can post them there as well.
I am facing something new when running my Beam app on a 4-node Flink cluster. I'll list the behavior items:

1- The dashboard shows all nodes actively running.
2- All slots are being consumed.
3- There is a Flink instance daemon running on every node.
4- I submit the app fat jar from one of the servers to the cluster where the JM is running; the submission succeeds.
5- All TaskManagers log progress reports.
6- Only one of the nodes, picked seemingly at random, writes anything to its *.out log (output produced by the app, not by Flink). The other nodes' *.out files stay at zero size.
7- The app runs under heavy load for a reasonable time before it starts crunching, i.e. slowing down, which is expected.
8- The node where the *.out gets written runs out of space and "/" hits 100%. On the other nodes "/" stays at about 10%.
9- I get the following exception in the *.out of the only node that writes to *.out.
Questions:

1- Why don't the other nodes write to *.out at runtime, and why does only one random node do so?
2- What should I do in my app to avoid this exception?
3- What can I configure in Flink, in my environment, and/or in the servers' configuration to avoid this issue? (One config idea I have been toying with is sketched below.)
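
For question 3, one thing I have been considering on my side (I have not applied it yet, and the file-size / backup-count values below are just my own guesses, not anything taken from the Flink or Beam docs) is editing conf/log4j.properties on each TaskManager so that logging goes to a size-capped rolling file, and so that the chatty KafkaIO watermark warnings that show up in the trace below are quieted, roughly along these lines:

    # sketch only -- values are my assumptions, not recommendations
    log4j.rootLogger=INFO, file
    log4j.appender.file=org.apache.log4j.RollingFileAppender
    log4j.appender.file.File=${log.file}
    log4j.appender.file.MaxFileSize=100MB
    log4j.appender.file.MaxBackupIndex=3
    log4j.appender.file.layout=org.apache.log4j.PatternLayout
    log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %c - %m%n
    # assumption: silence the per-poll WARNs coming from KafkaIO.getWatermark
    log4j.logger.org.apache.beam.sdk.io.kafka=ERROR

I am not sure this is the right knob, though, since the file that actually fills up is the *.out (the stdout/stderr capture) rather than the log4j *.log, so pointers on what actually feeds *.out would be appreciated.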
I really appreciate your valuable time and help. Getting past this issue is crucial for our benchmarking efforts and product selection process.
Cheers,
Amir

Here is the exception:
log4j:ERROR Failed to flush writer,
java.io.IOException: No space left on device
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:326)
        at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
        at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
        at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
        at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
        at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
        at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59)
        at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324)
        at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
        at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
        at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
        at org.apache.log4j.Category.callAppenders(Category.java:206)
        at org.apache.log4j.Category.forcedLog(Category.java:391)
        at org.apache.log4j.Category.log(Category.java:856)
        at org.slf4j.impl.Log4jLoggerAdapter.warn(Log4jLoggerAdapter.java:420)
        at org.apache.beam.sdk.io.kafka.KafkaIO$UnboundedKafkaReader.getWatermark(KafkaIO.java:1078)
        at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.trigger(UnboundedSourceWrapper.java:366)
        at org.apache.flink.streaming.runtime.tasks.StreamTask$TriggerTask.run(StreamTask.java:710)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
