Hi Jared,

This problem looks strange to me. Logback should not change its
configuration unless something explicitly reconfigures it.

Could you briefly explain how your Mesos setup works? Are you submitting
the job via the Web UI? I'm asking because I see client-side as well as
cluster-side logging statements in your log snippet. It would also help
to get access to the complete cluster logs (including the client's) in
order to pinpoint the problem. Would that be possible?
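
In case it helps to tell the two sides apart in a combined log, here is a
minimal sketch. It assumes the client and the cluster processes can each be
given their own logback.xml; the context name "flink-cluster" is a made-up
placeholder, not something Flink defines. Logback's <contextName> element
together with the %contextName conversion word tags every line with the
name of the configuration that produced it:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration debug="true">
    <!-- Hypothetical name; pick a different one (e.g. for the client) in the other config -->
    <contextName>flink-cluster</contextName>

    <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <!-- %contextName prints the name set above on every log line -->
            <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} %contextName [%thread] %-5level %logger{60} - %msg%n</pattern>
        </encoder>
    </appender>

    <root level="INFO">
        <appender-ref ref="CONSOLE" />
    </root>
</configuration>
```

Lines written with a different pattern would then lack the context name,
which might make the switch-over point easier to spot.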

Have you tried using a different Logback version? Just to rule out that
this is a Logback-specific problem.

Concerning the verbose GlobalConfiguration logging, this could be related
to [1], which is fixed in the latest master.

[1] https://issues.apache.org/jira/browse/FLINK-7643

On Mon, Oct 23, 2017 at 10:17 AM, Piotr Nowojski <pi...@data-artisans.com>
wrote:

> Till could you take a look at this?
>
> Piotrek
>
> On 18 Oct 2017, at 20:32, Jared Stehler <jared.stehler@
> intellifylearning.com> wrote:
>
> I’m having an issue where I’ve got logging setup and functioning for my
> flink-mesos deployment, and works fine up to a point (the same point every
> time) where it seems to fall back to “defaults” and loses all of my
> configured filtering.
>
> 2017-10-11 21:37:17.454 [flink-akka.actor.default-dispatcher-17] INFO
>  o.a.f.m.runtime.clusterframework.MesosFlinkResourceManager  -
> TaskManager taskmanager-00008 has started.
> 2017-10-11 21:37:17.454 [flink-akka.actor.default-dispatcher-16] INFO
>  org.apache.flink.runtime.instance.InstanceManager  - Registered
> TaskManager at ip-10-80-54-201 (akka.tcp://flink@ip-10-80-54-
> 201.us-west-2.compute.internal:31014/user/taskmanager) as
> 697add78bd00fe7dc6a7aa60bc8d75fb. Current number of registered hosts is
> 39. Current number of alive task slots is 39.
> 2017-10-11 21:37:18.820 [flink-akka.actor.default-dispatcher-17] INFO
>  org.apache.flink.runtime.instance.InstanceManager  - Registered
> TaskManager at ip-10-80-54-201 (akka.tcp://flink@ip-10-80-54-
> 201.us-west-2.compute.internal:31018/user/taskmanager) as
> a6cff0f18d71aabfb3b112f5e2c36c2b. Current number of registered hosts is
> 40. Current number of alive task slots is 40.
> 2017-10-11 21:37:18.821 [flink-akka.actor.default-dispatcher-17] INFO
>  o.a.f.m.runtime.clusterframework.MesosFlinkResourceManager  -
> TaskManager taskmanager-00010 has started.
> 2017-10-11 21:39:04,371:6171(0x7f67fe9cd700):ZOO_WARN@
> zookeeper_interest@1570: Exceeded deadline by 13ms
>
> --- here is where it turns over into default pattern layout ---
> *21:39:05.616 [nioEventLoopGroup-5-6] INFO
>  o.a.flink.runtime.blob.BlobClient - Blob client connecting to
> akka://flink/user/jobmanager*
>
> 21:39:09.322 [nioEventLoopGroup-5-6] INFO  o.a.flink.runtime.client.JobClient
> - Checking and uploading JAR files
> 21:39:09.322 [nioEventLoopGroup-5-6] INFO  o.a.flink.runtime.blob.BlobClient
> - Blob client connecting to akka://flink/user/jobmanager
> 21:39:09.788 [flink-akka.actor.default-dispatcher-4] INFO
>  o.a.f.m.r.c.MesosJobManager - Submitting job 005b570ff2866023aa905f2bc850f7a3
> (Sa-As-2b-Submission-Join-V3 := demos-demo500--data-canvas-2-sa-qs-as-v3).
> 21:39:09.789 [flink-akka.actor.default-dispatcher-4] INFO
>  o.a.f.m.r.c.MesosJobManager - Using restart strategy
> FailureRateRestartStrategy(failuresInterval=120000 msdelayInterval=1000
> msmaxFailuresPerInterval=3) for 005b570ff2866023aa905f2bc850f7a3.
> 21:39:09.789 [flink-akka.actor.default-dispatcher-4] INFO
>  o.a.f.r.e.ExecutionGraph - Job recovers via failover strategy: full graph
> restart
> 21:39:09.790 [flink-akka.actor.default-dispatcher-4] INFO
>  o.a.f.m.r.c.MesosJobManager - Running initialization on master for job
> Sa-As-2b-Submission-Join-V3 := demos-demo500--data-canvas-2-sa-qs-as-v3 (
> 005b570ff2866023aa905f2bc850f7a3).
> 21:39:09.790 [flink-akka.actor.default-dispatcher-4] INFO
>  o.a.f.m.r.c.MesosJobManager - Successfully ran initialization on master in
> 0 ms.
> 21:39:09.791 [flink-akka.actor.default-dispatcher-4] WARN
>  o.a.f.configuration.Configuration - Config uses deprecated configuration
> key 'high-availability.zookeeper.storageDir' instead of proper key
> 'high-availability.storageDir'
> 21:39:09.791 [flink-akka.actor.default-dispatcher-4] INFO
>  o.a.f.c.GlobalConfiguration - Loading configuration property:
> mesos.failover-timeout, 60
> 21:39:09.791 [flink-akka.actor.default-dispatcher-4] INFO
>  o.a.f.c.GlobalConfiguration - Loading configuration property:
> mesos.initial-tasks, 1
> 21:39:09.791 [flink-akka.actor.default-dispatcher-4] INFO
>  o.a.f.c.GlobalConfiguration - Loading configuration property:
> mesos.maximum-failed-tasks, -1
> 21:39:09.791 [flink-akka.actor.default-dispatcher-4] INFO
>  o.a.f.c.GlobalConfiguration - Loading configuration property:
> mesos.resourcemanager.framework.role, '*'
>
> The reason this is a vexing issue is that the app master then proceeds to
> dump megabytes of " o.a.f.c.GlobalConfiguration - Loading configuration
> property:” messages into the log, and I’m unable to filter them out.
>
> My logback config is:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <configuration debug="true">
>     <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
>         <encoder>
>             <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{60} %X{sourceThread} - %msg%n</pattern>
>         </encoder>
>     </appender>
>
>     <appender name="SENTRY" class="io.sentry.logback.SentryAppender">
>         <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
>             <level>ERROR</level>
>         </filter>
>     </appender>
>
>     <logger name="org.apache.flink.runtime.metrics.MetricRegistry"
> level="OFF" />
>     <logger name="org.apache.kafka.clients.ClientUtils" level="OFF" />
>     <logger 
> name="org.apache.flink.runtime.webmonitor.files.StaticFileServerHandler"
> level="OFF" />
>     <logger 
> name="org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSinkBase"
> level="OFF" />
>
>     <logger name="org.apache.flink.configuration.GlobalConfiguration"
> level="WARN" />
>     <logger name="org.apache.flink.runtime.checkpoint.CheckpointCoordinator"
> level="WARN" />
>
>     <logger name="org.elasticsearch.client.transport" level="DEBUG" />
>
>     <root level="INFO">
>         <appender-ref ref="CONSOLE" />
>         <appender-ref ref="SENTRY" />
>     </root>
> </configuration>
>
>
>
> --
> Jared Stehler
> Chief Architect - Intellify Learning
> o: 617.701.6330 x703
>
>
>
>
>
