Stephen Gran created FLINK-6408:
-----------------------------------

             Summary: Repeated loading of configuration files in hadoop 
filesystem code paths
                 Key: FLINK-6408
                 URL: https://issues.apache.org/jira/browse/FLINK-6408
             Project: Flink
          Issue Type: Bug
    Affects Versions: 1.2.1
            Reporter: Stephen Gran
            Priority: Minor


We are running Flink on Mesos in AWS.  Checkpointing is enabled with an S3 
backend, configured via the Hadoop s3a filesystem implementation, and a 
checkpoint is taken every second.

We are seeing roughly 3 million log events per hour from a relatively small 
job, and it appears that this is because every S3 copy event reloads the Hadoop 
configuration, which in turn reloads the Flink configuration.  The Flink 
configuration loader logs each key/value pair every time it is invoked, 
leading to this volume of logs.

While the logging is relatively easy to deal with - just a log4j setting - the 
behaviour is probably suboptimal.  It seems that the configuration loader could 
easily be changed over to a singleton pattern to prevent the constant rereading 
of files.
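
To make the singleton suggestion concrete, here is a minimal sketch of a 
load-once configuration holder.  It is not Flink's actual loader: the class 
name CachedConfigLoader, the LOAD_COUNT counter, the loadFromDisk() placeholder 
and the example key are all hypothetical and exist only to illustrate the idea.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a configuration loader; not Flink's real class. */
final class CachedConfigLoader {

    // Counts how often the expensive load actually runs (illustration only).
    public static final AtomicInteger LOAD_COUNT = new AtomicInteger();

    private CachedConfigLoader() {}

    // Initialization-on-demand holder: the JVM runs this static initializer
    // once, on first access to Holder, and does so in a thread-safe way.
    private static final class Holder {
        static final Map<String, String> CONFIG = loadFromDisk();
    }

    /** Returns the cached configuration; the load happens on the first call only. */
    public static Map<String, String> get() {
        return Holder.CONFIG;
    }

    // Placeholder for reading the config file and logging each key/value pair.
    private static Map<String, String> loadFromDisk() {
        LOAD_COUNT.incrementAndGet();
        Map<String, String> values = new HashMap<>();
        values.put("example.key", "example-value"); // hypothetical entry
        return Collections.unmodifiableMap(values);
    }

    public static void main(String[] args) {
        // Simulate many filesystem operations that each ask for the configuration.
        for (int i = 0; i < 1000; i++) {
            get();
        }
        System.out.println("loads performed: " + LOAD_COUNT.get());
    }
}
```

With something along these lines, repeated callers share one parsed 
configuration, so the per-key logging would happen once per JVM rather than 
once per copy.  The trade-off is that changes to the file on disk are not 
picked up until the process restarts.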

If you're interested, we can probably knock up a patch for this in a relatively 
short time.

Cheers,



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
