Hi Yang,
We are running a self-compiled Flink 1.12-SNAPSHOT and cannot see any 
taskmanager/jobmanager logs.
I have checked the log4j.properties file, and it is in the right format, and 
FLINK_CONF_DIR is set.
When checking the java dynamic options of the task manager, I found that the 
log-related options are not set.
This is the output when issuing "ps -ef | grep <container_id>":


yarn     31049 30974  9 13:57 ?        00:03:31 
/usr/lib/jvm/jdk1.8.0_121/bin/java -Xmx536870902 -Xms536870902 
-XX:MaxDirectMemorySize=268435458 -XX:MaxMetaspaceSize=268435456 
org.apache.flink.yarn.YarnTaskExecutorRunner -D 
taskmanager.memory.framework.off-heap.size=134217728b -D 
taskmanager.memory.network.max=134217730b -D 
taskmanager.memory.network.min=134217730b -D 
taskmanager.memory.framework.heap.size=134217728b -D 
taskmanager.memory.managed.size=536870920b -D taskmanager.cpu.cores=1.0 -D 
taskmanager.memory.task.heap.size=402653174b -D 
taskmanager.memory.task.off-heap.size=0b --configDir . 
-Djobmanager.rpc.address=dhpdn09-113 
-Dtaskmanager.resource-id=container_1604585185669_635512_01_000713 -Dweb.port=0 
-Dweb.tmpdir=/tmp/flink-web-1d373ec2-0cbe-49b8-9592-3ac1d207ad63 
-Djobmanager.rpc.port=40093 -Drest.address=dhpdn09-113
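For what it's worth, a quick way to single out the log-related options from such a process line (a sketch run on a shortened sample; the log path below is a placeholder, not the real one):

```shell
# Filter a TaskManager's JVM arguments down to the log4j-related options.
# ARGS is a shortened sample; in practice pipe the real ps output instead.
ARGS='-Xmx536870902 -Dlog.file=/path/to/taskmanager.log -Dlog4j.configurationFile=file:./log4j.properties -Dweb.port=0'
echo "$ARGS" | tr ' ' '\n' | grep -E '^-Dlog'
```

Run against the real process line above, this grep comes back empty, which matches the missing -Dlog.file/-Dlog4j options.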


My question is: what might be causing this, and do you have any suggestions?


By the way, we submit the program from a Java program instead of from the 
command line.


Thanks.


PS: I accidentally sent this mail to the Spark user mailing list, so I am 
resending it to the Flink user mailing list. Sorry for the inconvenience, @Yang Wang 

At 2020-11-03 20:56:19, "Yang Wang" <danrtsey...@gmail.com> wrote:

You could issue "ps -ef | grep container_id_for_some_tm". You should then find 
the following java options related to log4j:


-Dlog.file=/var/log/hadoop-yarn/containers/application_xx/container_xx/taskmanager.log
-Dlog4j.configuration=file:./log4j.properties
-Dlog4j.configurationFile=file:./log4j.properties



Best,
Yang


Diwakar Jha <diwakar.n...@gmail.com> wrote on Mon, Nov 2, 2020 at 11:37 PM:

Sure. I will check that and get back to you. could you please share how to 
check java dynamic options?


Best,
Diwakar


On Mon, Nov 2, 2020 at 1:33 AM Yang Wang <danrtsey...@gmail.com> wrote:

If you have already updated the log4j.properties and it still does not work, 
then I suggest logging in to the Yarn NodeManager machine and checking that the 
log4j.properties in the container workdir is correct. You could also verify 
that the java dynamic options are set correctly.


I think it should work if the log4j.properties and java dynamic options are set 
correctly.
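A minimal sketch of what that check could look like (the demo directory below is fabricated so the commands run anywhere; on a real NodeManager machine, look under the directories configured in yarn.nodemanager.local-dirs):

```shell
# Inspect the log4j.properties shipped into each container workdir.
# LOCAL_DIR stands in for one entry of yarn.nodemanager.local-dirs (demo path).
LOCAL_DIR=/tmp/yarn-demo
mkdir -p "$LOCAL_DIR/container_demo_000001"
printf 'rootLogger.level = INFO\n' > "$LOCAL_DIR/container_demo_000001/log4j.properties"
for dir in "$LOCAL_DIR"/container_*; do
  echo "== $dir"
  grep -m1 'rootLogger' "$dir/log4j.properties" || echo "no log4j2-style rootLogger found"
done
```

If the grep finds no log4j2-style `rootLogger` key, the file shipped to the container is still in log4j1 format.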


BTW, could you share the new yarn logs?


Best,
Yang


Diwakar Jha <diwakar.n...@gmail.com> wrote on Mon, Nov 2, 2020 at 4:32 PM:


Hi Yang,


Thank you so much for taking a look at the log files. I changed my 
log4j.properties. Below is the actual file that I got from the EMR 6.1.0 
distribution of Flink 1.11. I observed that it is different from the Flink 1.11 
that I downloaded, so I changed it. Still, I didn't see any logs.


Actual:
log4j.rootLogger=INFO,file

# Log all infos in the given file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.file=${log.file}
log4j.appender.file.append=false
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n

# suppress the irrelevant (wrong) warnings from the netty channel handler
log4j.logger.org.jboss.netty.channel.DefaultChannelPipeline=ERROR,file





Modified: I commented out the above and added the new logging configuration 
from the stock Flink log4j.properties file.


#log4j.rootLogger=INFO,file

# Log all infos in the given file
#log4j.appender.file=org.apache.log4j.FileAppender
#log4j.appender.file.file=${log.file}
#log4j.appender.file.append=false
#log4j.appender.file.layout=org.apache.log4j.PatternLayout
#log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n

# suppress the irrelevant (wrong) warnings from the netty channel handler
#log4j.logger.org.jboss.netty.channel.DefaultChannelPipeline=ERROR,file

# This affects logging for both user code and Flink
rootLogger.level = INFO
rootLogger.appenderRef.file.ref = MainAppender

# Uncomment this if you want to _only_ change Flink's logging
#logger.flink.name = org.apache.flink
#logger.flink.level = INFO

# The following lines keep the log level of common libraries/connectors on
# log level INFO. The root logger does not override this. You have to manually
# change the log levels here.
logger.akka.name = akka
logger.akka.level = INFO
logger.kafka.name= org.apache.kafka
logger.kafka.level = INFO
logger.hadoop.name = org.apache.hadoop
logger.hadoop.level = INFO
logger.zookeeper.name = org.apache.zookeeper
logger.zookeeper.level = INFO

# Log all infos in the given file
appender.main.name = MainAppender
appender.main.type = File
appender.main.append = false
appender.main.fileName = ${sys:log.file}
appender.main.layout.type = PatternLayout
appender.main.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n

# Suppress the irrelevant (wrong) warnings from the Netty channel handler
logger.netty.name = org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline
logger.netty.level = OFF



**********************************
I also think it's related to the log4j settings, but I'm not able to figure it 
out.
Please let me know if you need any other log files or configuration.

Thanks.


On Sun, Nov 1, 2020 at 10:06 PM Yang Wang <danrtsey...@gmail.com> wrote:

Hi Diwakar Jha,


From the logs you have provided, everything seems to be working as expected. 
The JobManager and TaskManager java processes have been started with the 
correct dynamic options, especially for logging.


Could you share the content of $FLINK_HOME/conf/log4j.properties? I think 
there's something wrong with the log4j config file. For example, it may be in 
log4j1 format, but we are using log4j2 in Flink 1.11.




Best,
Yang


Diwakar Jha <diwakar.n...@gmail.com> wrote on Mon, Nov 2, 2020 at 1:57 AM:

Hi,
I'm running Flink 1.11 on EMR 6.1.0. I can see that my job is running fine, 
but I'm not seeing any taskmanager/jobmanager logs.
I see the below error in stdout:
18:29:19.834 [flink-akka.actor.default-dispatcher-28] ERROR 
org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerLogFileHandler - 
Failed to transfer file fromTaskExecutor container_1604033334508_0001_01_000004.
java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException: 
The file LOG does not exist on the TaskExecutor.


I'm stuck at this step for a couple of days now and am not able to migrate to 
Flink 1.11. I would appreciate it if anyone could help me.
I have the following setup:
a) I'm deploying Flink using YARN. I have attached the yarn application id logs.
b) StatsD setup:

metrics.reporters: stsd
metrics.reporter.stsd.factory.class: org.apache.flink.metrics.statsd.StatsDReporterFactory
metrics.reporter.stsd.host: localhost
metrics.reporter.stsd.port: 8125
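
As a side check for the metrics path, this verifies that something is listening on the configured StatsD port (a sketch; 8125 comes from the config above, and ss is from iproute2):

```shell
# Look for a UDP listener on the StatsD port configured above.
ss -lun | grep -w 8125 || echo "nothing listening on UDP 8125"
```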
