In case you experience an exception similar to the following:
org.apache.flink.fs.s3base.shaded.com.amazonaws.SdkClientException: Data
read has a different length than the expected: dataLength=53562;
expectedLength=65536; includeSkipped=true; in.getClass()=class
No, I didn't, because it's inconvenient for us to have two different Docker
images for streaming and batch jobs.
The problem happens in batch jobs (the ones that use ExecutionEnvironment)
that use the State Processor API to bootstrap an initial savepoint for a
streaming job.
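For context, the bootstrap job is shaped roughly like the sketch below (this assumes the Flink 1.9 State Processor API; the Counter* names, the state name, and the S3 path are placeholders, not our real code):

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.memory.MemoryStateBackend;
import org.apache.flink.state.api.BootstrapTransformation;
import org.apache.flink.state.api.OperatorTransformation;
import org.apache.flink.state.api.Savepoint;
import org.apache.flink.state.api.functions.KeyedStateBootstrapFunction;

public class BootstrapSavepointJob {

    // Writes each input element into keyed state under the name the streaming job reads from.
    public static class CounterBootstrapper extends KeyedStateBootstrapFunction<String, Long> {
        private transient ValueState<Long> counter;

        @Override
        public void open(Configuration parameters) {
            counter = getRuntimeContext().getState(new ValueStateDescriptor<>("counter", Long.class));
        }

        @Override
        public void processElement(Long value, Context ctx) throws Exception {
            counter.update(value);
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Placeholder input; in the real job this would come from files, JDBC, etc.
        DataSet<Long> initialValues = env.fromElements(1L, 2L, 3L);

        BootstrapTransformation<Long> transformation = OperatorTransformation
                .bootstrapWith(initialValues)
                .keyBy(value -> Long.toString(value))
                .transform(new CounterBootstrapper());

        Savepoint
                .create(new MemoryStateBackend(), 128)                 // 128 = max parallelism
                .withOperator("counter-operator-uid", transformation)  // uid of the streaming operator
                .write("s3a://my-bucket/savepoints/bootstrap");        // this is where S3A comes in

        env.execute("bootstrap initial savepoint");
    }
}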
We are building a single Docker image for the streaming and batch versions of
the job. In that image we put both presto (which we use for
Actually, I forgot to mention that it happens when there's also a presto
library in the plugins folder (we are using presto for checkpoints and hadoop
for file sinks in the job itself).
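For reference, the layout that is supposed to keep the two filesystems apart looks roughly like this (jar versions are placeholders); the presto filesystem registers the s3p:// scheme and the hadoop one registers s3a://, so the checkpoint path and the sink path can each pick their implementation explicitly:

plugins/s3-fs-presto/flink-s3-fs-presto-<version>.jar   (checkpoints, e.g. s3p://my-bucket/checkpoints)
plugins/s3-fs-hadoop/flink-s3-fs-hadoop-<version>.jar   (file sinks, e.g. s3a://my-bucket/output)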
We've added the flink-s3-fs-hadoop library to the plugins folder and are trying
to bootstrap state to S3 using the S3A protocol. The following exception happens
(unless the hadoop library is put into the lib folder instead of plugins). It
looks like the S3A filesystem is trying to use the "local" filesystem for
temporary files and
Due to OptimizerPlanEnvironment.execute() throwing an exception on the last
line, there is no way to post-process the batch job's execution result, like:
JobExecutionResult r = env.execute(); // execute batch job
analyzeResult(r); // this will never get executed due to plan optimization
The following code:
val MAILBOX_SET_TYPE_INFO = object : TypeHint<Set<Mailbox>>() {}.typeInfo
val env = StreamExecutionEnvironment.getExecutionEnvironment()
println(MAILBOX_SET_TYPE_INFO.createSerializer(env.config))
prints:
org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer@2c39e53
While there
Switching to 1.8 didn't help. The timeout exception from Kinesis is a
consequence, not the cause.
Looks like this is the issue:
https://issues.apache.org/jira/browse/FLINK-11164
We'll try switching to 1.8 and see if it helps.
I've looked into this problem a little bit more, and it looks like it is
caused by some problem with the Kinesis sink. There is an exception in the
logs at the moment the job gets restored after being stalled for about 15
minutes:
Encountered an unexpected expired iterator
The image should be visible now at
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpoints-timing-out-for-no-apparent-reason-td28793.html#none
It doesn't look like a disk performance or network issue. It feels more
like some buffer overflowing or a timeout due to slightly
We have an issue with a job that occasionally times out while creating
snapshots for no apparent reason:
Details:
- Flink 1.7.2
- Checkpoints are saved to S3 with presto
- Incremental checkpoints are used
What might be the cause of this issue? It feels like some internal S3 client
timeout.
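A minimal sketch of the setup described above (RocksDB with incremental checkpoints going to S3 through the presto filesystem); the bucket, interval, and timeout values below are placeholders, not the ones from the real job:

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointSetup {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.enableCheckpointing(60_000);                              // checkpoint every minute
        env.getCheckpointConfig().setCheckpointTimeout(10 * 60_000);  // the timeout that gets hit

        // The second argument enables incremental checkpoints; the s3:// path is served
        // by the presto filesystem from the plugins folder.
        env.setStateBackend(new RocksDBStateBackend("s3://my-bucket/checkpoints", true));

        // ... job definition and env.execute() ...
    }
}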
I've tried that, but the problem is:
- FileInputFormat#getInputSplitAssigner return type is
LocatableInputSplitAssigner
- LocatableInputSplitAssigner is final
which unfortunately makes it impossible to override the split assigner.
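To illustrate where it breaks down (the declaration below is roughly what Flink's FileInputFormat contains; MyAssigner is a hypothetical custom assigner):

// FileInputFormat declares the method with the concrete class as the return type:
//
//     public LocatableInputSplitAssigner getInputSplitAssigner(FileInputSplit[] splits) {
//         return new LocatableInputSplitAssigner(splits);
//     }
//
// so a subclass cannot widen the return type back to the interface:
//
//     @Override
//     public InputSplitAssigner getInputSplitAssigner(FileInputSplit[] splits) {
//         return new MyAssigner(splits);           // does not compile: incompatible return type
//     }
//
// and the other escape hatch is gone too, because the class is final:
//
//     class MyAssigner extends LocatableInputSplitAssigner { ... }  // does not compile: class is final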
Why is FileInputFormat#getInputSplitAssigner not configurable though? It
makes sense to let those who use FileInputFormat set the desired split
assigner (and make LocatableInputSplitAssigner just a default one).
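Something along these lines would be enough (purely a hypothetical sketch of the proposal, not an existing Flink API):

// Hypothetical change inside FileInputFormat -- just the shape of the proposal:
//
//     public interface SplitAssignerFactory {
//         InputSplitAssigner create(FileInputSplit[] splits);
//     }
//
//     // keep the current behaviour as the default
//     private SplitAssignerFactory assignerFactory = splits -> new LocatableInputSplitAssigner(splits);
//
//     public void setSplitAssignerFactory(SplitAssignerFactory factory) {
//         this.assignerFactory = factory;
//     }
//
//     @Override
//     public InputSplitAssigner getInputSplitAssigner(FileInputSplit[] splits) {
//         return assignerFactory.create(splits);
//     }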