Hi Addison,
There are a few things I don't understand. Does your source generate its own
messages? Why can't a source receive input? If a source cannot accept input, then
why is it called a source? Isn't the kafka-connector an input acting as a source?
If you mean that under normal circumstances it can't receive
Thanks, I'll look into it.
On Fri, Aug 24, 2018, 19:44 vino yang wrote:
> Hi Hao Sun,
>
> From the error log, it seems that the jar package for the job was not
> found.
> You must make sure your Jar is in the classpath.
> Related documentation may not be up-to-date, and there is a discussion on
Hi Averell,
Checkpoints are triggered automatically and periodically according to the
checkpoint interval set by the user; I believe you have no doubt about this.
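For illustration, a minimal sketch of setting that interval on the execution environment (the 60-second value and exactly-once mode are assumptions, not taken from this thread):

import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Trigger a checkpoint every 60 seconds (assumed interval) with exactly-once guarantees.
env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);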
There are many possible reasons for a job failure.
The technical definition is that the job does not normally enter the final
Hi Hao Sun,
From the error log, it seems that the jar package for the job was not
found.
You must make sure your Jar is in the classpath.
Related documentation may not be up-to-date, and there is a discussion on
this issue on this mailing list. [1]
I see that the status of FLINK-10001 [2] is
I got an error like this.
$ docker run -it flink-job:latest job-cluster
Starting the job-cluster
config file:
jobmanager.rpc.address: localhost
jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m
taskmanager.heap.size: 1024m
taskmanager.numberOfTaskSlots: 1
parallelism.default: 1
rest.port:
Hi,
Using StreamingFileSink is not a convenient option for us in production, as it
doesn't support s3*. I could use StreamingFileSink just to verify, but I don't
see much point in doing so. Please consider my previous comment:
> I realized that BucketingSink must not play any role in this
Hi Chang,
A time-saving tip for finding which library contains a class: go to
https://search.maven.org/
and enter fc: followed by the fully-qualified name of the class. You should
get the library as a search result.
In this case for example, you'd search for
Ok, I think before further debugging the window reduce state,
could you try the new 'StreamingFileSink' [1] introduced in Flink 1.6.0 instead
of the previous 'BucketingSink'?
Cheers,
Andrey
[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html
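For what it's worth, a minimal sketch of that sink in row format (the output path, String element type, and encoder are assumptions, not taken from this thread):

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

// Writes each record as a UTF-8 text line; the path is a placeholder.
StreamingFileSink<String> sink = StreamingFileSink
        .forRowFormat(new Path("file:///tmp/output"), new SimpleStringEncoder<String>("UTF-8"))
        .build();
stream.addSink(sink); // 'stream' is assumed to be a DataStream<String>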
Yes, sorry for my confusing comment. I just meant that it seems like
there's a bug somewhere now that the output is missing some data.
> I would wait and check the actual output in s3 because it is the main
result of the job
Yes, and that's what I have already done. There always seems to be some
Hi Juho,
So it is a per-key deduplication job.
Yes, I would wait and check the actual output in s3 because it is the main
result of the job and
> The late data around the time of taking the savepoint might not be included in
> the savepoint, but it should be behind the snapshotted offset in
Hi Timo,
Thanks for your answer
I was looking for a Tuple to feed into a BatchTableSink subclass, but perhaps
that can be achieved with a Row instead?
All the best
François
2018-08-24 10:21 GMT+02:00 Timo Walther :
> Hi,
>
> Tuples are just a subcategory of rows, because the tuple arity is limited
> to
Thanks for your answer!
I check for the missing data in the final output on s3. So I wait until
the next day, then run the same logic re-implemented as a batch job, and compare
the output.
> The late data around the time of taking the savepoint might not be included
in the savepoint but it should be
Hi Juho,
Where exactly is the data missing? When do you notice it?
Do you check it:
- by debugging `DistinctFunction.reduce` right after resuming in the middle of the
day
or
- by looking for distinct records missing in the final output of BucketingSink in s3 after
the window result is actually triggered and
Hi Vino,
Regarding the statement "Checkpoints are taken automatically and are used for
automatically restarting the job in case of a failure", I do not quite
understand the definition of a failure, or how to simulate one while
testing my application. Possible scenarios that I can think of:
(1)
This thread is also useful in this context:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/difference-between-checkpoints-amp-savepoints-td14787.html
Hi Henry,
In addition to Vino's answer, there are several things to keep in mind about
"checkpoints vs. savepoints".
Checkpoints are designed mostly for fault tolerance of a running Flink job and
automatic recovery, which is why by default Flink manages their storage itself.
Though it is correct
No worries, I found it here:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-runtime_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
    <type>test-jar</type>
    <scope>test</scope>
</dependency>
Best regards/祝好,
Chang Liu 刘畅
On Fri, Aug 24, 2018 at 1:16 PM Chang Liu wrote:
> Hi Hequn,
>
> I have added the following dependencies:
>
>
>
Hi Hequn,
I have added the following dependencies:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
    <type>test-jar</type>
    <scope>test</scope>
</dependency>

<dependency>
    <groupId>org.mockito</groupId>
    <artifactId>mockito-core</artifactId>
    <version>2.21.0</version>
    <scope>test</scope>
</dependency>
But I got the exception: java.lang.NoClassDefFoundError:
Good day everyone,
I'm writing a unit test for the bug fix FLINK-9940, and found that some
existing tests in flink-fs-tests cannot detect the issue where the file
monitoring function emits duplicate files (i.e. the same file is reported
multiple times).
Could I just fix this as part of that
Hi Henry,
A good answer from stackoverflow:
Apache Flink's checkpoints and savepoints are similar in that they are both
mechanisms for preserving the internal state of Flink applications.
Checkpoints are taken automatically and are used for automatically restarting
the job in case of a failure.
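As a small illustration of that difference, here is a hedged sketch of retaining (externalizing) checkpoints on cancellation so they can also be used to resume a job; it assumes checkpointing is already enabled on the environment:

import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Assumes env.enableCheckpointing(...) has been called elsewhere.
// Retain externalized checkpoints when the job is cancelled, so the latest one
// can later be used as a resume point, similar to a savepoint.
env.getCheckpointConfig().enableExternalizedCheckpoints(
        CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);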
Hello, thank you very much.
I took a look at the link, but how can I check the conditions to get the
aggregator results?
On Fri., Aug 24, 2018 at 5:27, aitozi () wrote:
> Hi,
>
> CEP does not yet support aggregator functions in IterativeCondition, so for
> now you may need to check
Hi,
CEP does not yet support aggregator functions in IterativeCondition, so for now
you may need to check the condition yourself to get the aggregator result. I
will work on this in the coming days; you can take a look at this
issue:
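Until then, a hedged sketch of aggregating by hand inside an IterativeCondition; the Transaction type, getAmount() accessor, pattern name "start", and threshold are all assumptions for illustration:

import org.apache.flink.cep.pattern.conditions.IterativeCondition;

public class AmountThresholdCondition extends IterativeCondition<Transaction> {
    private final double threshold;

    public AmountThresholdCondition(double threshold) {
        this.threshold = threshold;
    }

    @Override
    public boolean filter(Transaction value, Context<Transaction> ctx) throws Exception {
        double sum = value.getAmount();
        // Sum over the events already matched for the (assumed) pattern named "start".
        for (Transaction previous : ctx.getEventsForPattern("start")) {
            sum += previous.getAmount();
        }
        return sum >= threshold; // match once the cumulative amount reaches the threshold
    }
}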
Hi Juho,
As Aljoscha mentioned, the current TTL implementation was mostly targeted at
data privacy applications where only processing time matters.
I think event time could also be useful for TTL and should address your
concerns.
The event-time extension is on the roadmap for the future
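For reference, a minimal sketch of the processing-time state TTL configuration available as of Flink 1.6 (the TTL duration and state name are assumptions):

import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.time.Time;

// Expire entries one hour (assumed) after creation or last write, measured in processing time.
StateTtlConfig ttlConfig = StateTtlConfig
        .newBuilder(Time.hours(1))
        .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
        .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
        .build();

ValueStateDescriptor<String> descriptor = new ValueStateDescriptor<>("lastValue", String.class);
descriptor.enableTimeToLive(ttlConfig);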
Hello
I am developing an application where I use Flink (v1.4.2) CEP. Is there
any aggregation function to match cumulative amounts or counts in an
IterativeCondition within a period of time for keyed elements?
If a cumulative amount reaches a threshold, fire a result.
Thank you
Regards
One safer approach is to execute cancel-with-savepoint on all jobs first
>> this sounds great!
Thanks
Youjun
From: vino yang
Sent: Friday, August 24, 2018 2:43 PM
To: Yuan,Youjun ; user
Subject: Re: jobmanager holds too many CLOSE_WAIT connection to datanode
Hi Youjun,
You can see if there
Hi,
Tuples are just a subcategory of rows, because the tuple arity is
limited to 25 fields. I think the easiest solution would be to write
your own converter that maps rows to tuples, if you know that you will
not need more than 25 fields. Otherwise it might be easier to just use a
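A minimal sketch of such a converter, assuming the first two fields are a String and an Integer (the positions and types are assumptions about your schema):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.types.Row;

// Maps the first two fields of a Row into a Tuple2; the casts encode the assumed schema.
public class RowToTuple2 implements MapFunction<Row, Tuple2<String, Integer>> {
    @Override
    public Tuple2<String, Integer> map(Row row) {
        return Tuple2.of((String) row.getField(0), (Integer) row.getField(1));
    }
}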
Hi All,
I checked the documentation of Flink release 1.6 and found that I can also use
checkpoints to resume the program. As I encountered some problems when using
savepoints, I have the following questions: 1. Can I use checkpoints only, and
not savepoints, since they can also be used to resume
Hi Youjun,
You can check whether there is any real data transfer on these connections.
I guess there may be some connection leak here, and if so, it's a bug.
On the other hand, version 1.4 is a bit old; can you check whether the same
problem exists in 1.5 or 1.6?
I suggest you create an