[ https://issues.apache.org/jira/browse/FLINK-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
SUBRAMANYA SURESH updated FLINK-9471:
-------------------------------------
Description:
We are using Flink SQL. Exceptions that end a job are logged at INFO level,
which makes it very hard to filter out INFO messages in the logging
configuration without also losing these exceptions. Note: if I do keep INFO
enabled, the same ExecutionGraph logger prints the entire query for the
operationGraph in every INFO statement, and this fills up the logs quickly
when we have, say, 100-200 queries.
Note: each "-" below stands for an entire line of the execution graph for this
query (redacted for privacy).
2018-03-30 03:32:09,942 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom
Source -> -
-
-
-
- (208/725).}
at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:948)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.Exception: Could not materialize checkpoint 1 for operator
Source: Custom Source -> (-
-
-
-
-
) (208/725).
... 6 more
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Could
not flush and close the file system output stream to
hdfs://security-temp/savedSearches/checkpoint/561eb649376bef2f2d8daa1e3a0fa6db/chk-1/067924e4-c861-4de1-823e-b255a0bf9998
in order to obtain the stream state handle
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:43)
at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:892)
was:
We are using Flink SQL. Exceptions that end a job are logged at INFO level,
which makes it very hard to filter out INFO messages in the logging
configuration without also losing these exceptions. Note: if I do keep INFO
enabled, the same ExecutionGraph logger prints the entire query for the
operationGraph in every INFO statement, and this fills up the logs quickly
when we have, say, 100-200 queries.
Note: each "-" below stands for an entire line of the execution graph for this
query (redacted for privacy).
2018-03-30 03:32:09,943 INFO
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Discarding
checkpoint 1 because: java.lang.Exception: Could not materialize checkpoint 1
for operator Source: Custom Source -> (Map -> where: (AND(=-
-
-
-
-.}
at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:948)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.Exception: Could not materialize checkpoint 1 for operator
Source: Custom Source -> (Map -> where: (AND(=(Environment,
_UTF-16LE'SFDC-IT'), =(RuleMatch, _UTF-16LE'SFA'), =(LogType,
_UTF-16LE'SAML-AUTH'), =(Outcome, _UTF-16LE'DENY'))), select: (proctime,
CAST(_UTF-16LE'SFDC-IT') AS Environment, CollectedTimestamp, EventTimestamp,
_raw, Aggregator), Map -> where: (AND(=(Environment, _UTF-16LE'SFDC-IT'),
=(RuleMatch, _UTF-16LE'SFA'), =(LogType, _UTF-16LE'SAML-AUTH'), =(Outcome,
_UTF-16LE'DENY'))), select: (proctime, CAST(_UTF-16LE'SFDC-IT') AS Environment,
CollectedTimestamp, EventTimestamp, _raw, Aggregator)) (353/725).
... 6 more
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Could
not flush and close the file system output stream to
hdfs://security-temp/savedSearches/checkpoint/561eb649376bef2f2d8daa1e3a0fa6db/chk-1/31b94717-9e6d-49b8-b64d-2a1a8ba04425
in order to obtain the stream state handle
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:43)
at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:892)
... 5 more
Suppressed: java.lang.Exception: Could not properly cancel managed operator state future.
at org.apache.flink.streaming.api.operators.OperatorSnapshotResult.cancel(OperatorSnapshotResult.java:99)
at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.cleanup(StreamTask.java:976)
at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:939)
... 5 more
> Job ending exceptions being logged at Info level
> ------------------------------------------------
>
> Key: FLINK-9471
> URL: https://issues.apache.org/jira/browse/FLINK-9471
> Project: Flink
> Issue Type: Bug
> Affects Versions: 1.4.2
> Reporter: SUBRAMANYA SURESH
> Priority: Major
>