Hi Vino,
Thanks for your quick reply, but I think these two questions are different.
The checkpoint in that question finally finished, but my checkpoint failed
due to an s3 client timeout. You can see from my screenshot that the
checkpoint failed in a short time.
According to the configuration,
Hi Tony,
A while ago, I answered a similar question.[1]
You can try to increase this value appropriately. This configuration cannot
go in flink-conf.yaml; instead, put it in the job's submit command[2] or in
the configuration file you specify.
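For illustration, passing such a value at submit time might look like the following shell sketch (YARN mode, where -yD forwards a dynamic property; the s3 key name, its value, the entry class, and the jar name are all assumptions, so check your s3 filesystem connector's documentation):

```shell
# Sketch only: the key name and value below are assumptions.
# -yD forwards a per-job dynamic property in YARN mode;
# -c names the entry class; my-job.jar is a placeholder.
flink run -m yarn-cluster \
  -yD presto.s3.connect-timeout=2min \
  -c com.example.MyJob my-job.jar
```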
[1]:
Hi Encho,
From your description, it sounds like there may be additional bugs.
About your description:
*- Start both job managers*
*- Start a batch job in JobManager 1 and let it finish*
*The jobgraphs in both Zookeeper and HDFS remained.*
Does it happen every time?
In the Standalone
Hi Elias,
Could you state the question more clearly?
The KeyedStream object exists because it needs to provide some transform
methods that differ from those of the DataStream object.
These transform methods are only available after keyBy.
Why do you need to execute keyBy twice to get a KeyedStream object?
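To make the point concrete, here is a toy Python sketch (not Flink's real API; the class and method names are only illustrative) of why a transform such as map drops keyed-ness: the mapped record type no longer tells the framework how to extract the key, so keyBy must be called again.

```python
class DataStream:
    """Toy stand-in for a stream; not Flink's API."""

    def key_by(self, key_selector):
        # Keying remembers how to extract the key, enabling keyed-only ops.
        return KeyedStream(key_selector)

    def map(self, fn):
        # The mapped records may no longer contain the key, so the result
        # falls back to a plain (un-keyed) stream.
        return DataStream()


class KeyedStream(DataStream):
    def __init__(self, key_selector):
        self.key_selector = key_selector

    def reduce(self, fn):
        # Example of a transform only available on keyed streams.
        return DataStream()


stream = DataStream()
keyed = stream.key_by(lambda r: r["id"])   # a KeyedStream
mapped = keyed.map(lambda r: r)            # a plain DataStream again
rekeyed = mapped.key_by(lambda r: r["id"]) # keyBy needed a second time
```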
Hi Elias,
From the source code, this exception is thrown because StateTtlConfig is
set to StateTtlConfig.DISABLED.
Please refer to the usage and description in the official Flink
documentation for details.[1]
There is also a note you should pay attention to: Only TTLs in reference
Is there a reason queryable state can't work with state TTL? Trying to use
both at the same time leads to an "IllegalArgumentException: Queryable state
is currently not supported with TTL"
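As background on what TTL means here, state TTL can be pictured as a per-entry last-update timestamp; below is a minimal Python sketch of that idea (not Flink's implementation, all names are made up):

```python
import time


class TtlState:
    """Toy keyed state where each value expires `ttl` seconds after its
    last write (roughly the OnCreateAndWrite update type). Not Flink code."""

    def __init__(self, ttl, clock=time.time):
        self.ttl = ttl
        self.clock = clock   # injectable clock, handy for testing
        self._store = {}     # key -> (value, last_write_timestamp)

    def put(self, key, value):
        self._store[key] = (value, self.clock())

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, ts = entry
        if self.clock() - ts >= self.ttl:
            del self._store[key]  # lazily drop the expired entry
            return None
        return value
```

With a fake clock, a value written at t=0 with ttl=10 is visible at t=0 and gone at t=11.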
Operators on a KeyedStream don't return a new KeyedStream. Is there a
reason for this? You need to perform `keyBy` again to get a KeyedStream.
Presumably if you key by the same value there won't be any shuffled data,
but the key may no longer be available within the stream record.
Hi Henry,
Fabian is right. You can try a window join if you want a bounded join.
According to your description, I think what you want is (correct me if I'm
wrong):
- Only join data within 3 days
- Calculate the score in a bounded way
- Retract previous scores that exceed 3 days
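The intended semantics of those three points can be sketched in plain Python (this is only the semantics, not Flink code; the 3-day bound comes from the thread, the data shape is assumed):

```python
from datetime import datetime, timedelta

BOUND = timedelta(days=3)


def bounded_score(praise_times, now):
    """Score = number of praises within the last 3 days. Praises older
    than the bound no longer count, which acts as the 'retraction' of
    their earlier contribution to the score."""
    return sum(1 for t in praise_times if timedelta(0) <= now - t <= BOUND)
```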
So,
Hi Fabian,
I am working on an application that computes the “score” of an article from
the number of praises and decays the score over time. I am weighing two
choices:
1. Use a global window to join the article and article-praise streams, with
3 days of state retention, but I cannot get the
Thank you Fabian.
Regards,
Averell
Thanks Till for the follow up, I can run my job now.
On Tue, Aug 28, 2018, 00:57 Till Rohrmann wrote:
> Hi Hao,
>
> Vino is right, you need to specify the -j/--job-classname option which
> specifies the job name you want to execute. Please make sure that the jar
> containing this class is on
+1 for removing these slides.
On Mon, Aug 27, 2018 at 10:03 AM Fabian Hueske wrote:
> I agree to remove the slides section.
> A lot of the content is out-dated and hence not only useless but might
> sometimes even cause confusion.
>
> Best,
> Fabian
>
>
>
> On Mon, Aug 27, 2018 at 08:29
Hi,
CMCF is not a source; only the file monitoring function is. Barriers are
injected by the FMF when the JM sends a checkpoint message. The barriers
then travel to the CMCF and trigger the checkpointing.
Fabian
Averell wrote on Tue, Aug 28, 2018, at 12:02:
> Hello Fabian,
>
> Thanks for
Hi Zhengwen,
I have tested my job manually (both by submitting a job and through
execute()) and I am now trying to write a test.
The following project states that it has the feature "Test stream windowing
with timestamped input", but I do not want to rely on a project other than
Flink.
Hello Till,
I spent a few more hours testing and looking at the logs, and it seems like
there's a more general problem here. While the two job managers are active,
neither of them can properly delete jobgraphs. The problem I described
above comes from the fact that Kubernetes gets JobManager 1
Hi,
Currently, Flink's window operators require increasing timestamp
attributes. This limitation exists to be able to clean up the state of a
window operator. A join operator does not preserve the order of timestamps.
Hence, timestamp attributes lose their monotonicity property and a window
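A tiny Python sketch (toy code with made-up data, not Flink's join operator) of how a join destroys timestamp monotonicity: each result carries the max of the two input timestamps, but results are emitted in left-stream order.

```python
def interval_join(left, right, bound):
    """Toy join of two timestamp-ordered streams of (ts, key) records.
    A matching pair is emitted with the max of the two timestamps, but
    in left-stream order, so output timestamps need not be increasing."""
    out = []
    for lts, lkey in left:
        for rts, rkey in right:
            if lkey == rkey and abs(lts - rts) <= bound:
                out.append((max(lts, rts), lkey))
    return out
```

For left = [(1, "a"), (2, "b")] and right = [(5, "a"), (1, "b")] with bound 4, the output timestamps are 5 and then 2, so they are no longer increasing.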
Hi Nicos,
Under the flink-examples module, there are many examples, including batch
and streaming. You can build the project from source; that way you will
find many jars under the target directory. You can submit these jars to
the Flink cluster. Also, you can run these examples directly
Hi all,
How can I test, in Java, a streaming job that has a time window?
best,
Nicos
Hello Fabian,
Thanks for the answer. However, my question is a little different.
Let me rephrase my example and my question:
* I have 10,000 unsplittable small files to read, which, in total, produce
about 10M output lines.
* From Flink's reporting web GUI, I can see that CFMF and
Hi,
How do you reduce the speed to avoid this issue? Do you mean reducing the
parallelism of the source or of the downstream tasks?
As far as I know, data buffering is managed by Flink's internal buffer pool
and memory manager, so it should not cause an OOM issue.
I just wonder whether the OOM may be caused by temporary byte
Hi Hequn,
You can't use window or other bounded operators after a non-window join.
The time attribute fields cannot be passed through because of a semantic
conflict.
Why does Flink have this limitation?
I have a temp view
var finalTable =
Hi Ning,
Could you replace the Kafka source with a custom SourceFunction
implementation which just produces the new events in a loop as fast as
possible? This way we can rule out that the ingestion is responsible for
the performance jump or the limit at 5000 events/s, and we can benchmark
the Flink
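A rough Python stand-in for such a loop source (Flink's SourceFunction is a Java interface; this generator only shows the shape of the idea, with a hypothetical event format):

```python
def loop_source(limit):
    """Emit synthetic events in a tight loop, with no external system
    involved, so ingestion speed cannot be the bottleneck being measured."""
    for i in range(limit):
        yield {"id": i}
```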
Thanks for the explanation – I suspected it might be like this and wanted
to double check before building inconsistent programs ☺
Would it be interesting for the community to also have something that could
share/broadcast items from one task to the other tasks? Spark
Hi Averell,
Barriers are injected into the regular data flow by source functions.
In the case of a file monitoring source, the barriers are injected into the
stream of file splits that are passed to the
ContinuousFileMonitoringFunction.
The CFMF puts the splits into a queue and processes them with a
Hi spoon_lz,
Thank you for asking this question. Kafka 1.0's connector is currently in
PR review status. I still need some time to refactor it. You can track its
status through FLINK-7964.[1]
As for the timing, I hope it will be released along with 1.7.0.
[1]:
Hi Hao,
Vino is right, you need to specify the -j/--job-classname option which
specifies the job name you want to execute. Please make sure that the jar
containing this class is on the class path.
I recently pushed some fixes which generate a better error message than the
one you've received. If
Hi,
Xingcan is right. There is no hidden state synchronization happening.
You have to ensure that the broadcast state is the same on every parallel
instance. Hence, it should only be modified in the
processBroadcastElement() method, which receives the same broadcasted
elements on all task instances.
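That requirement can be pictured with a toy Python sketch (not Flink's API; names are invented): every parallel instance applies the same broadcast elements with deterministic logic, so all copies of the state stay identical.

```python
class BroadcastInstance:
    """Toy model of one parallel task holding a copy of broadcast state."""

    def __init__(self):
        self.state = {}

    def process_broadcast_element(self, rule_id, rule):
        # Deterministic update only; no per-instance randomness or input.
        self.state[rule_id] = rule


# The runtime delivers the same broadcast elements to every instance.
instances = [BroadcastInstance() for _ in range(3)]
for inst in instances:
    inst.process_broadcast_element("r1", "drop-nulls")
    inst.process_broadcast_element("r2", "mask-ids")
```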
Hequn is right. If you know the maximum delay of your position corrections,
then you need to buffer the enrichment information for that long.
Cheers,
Till
On Thu, Aug 23, 2018 at 9:04 AM Hequn Cheng wrote:
> Hi Harsh,
>
> > What I don't get is, how would this work when I have more than 2
>
Hi Encho,
thanks a lot for reporting this issue. The problem arises whenever the old
leader maintains the connection to ZooKeeper. If this is the case, then
ephemeral nodes which we create to protect against faulty delete operations
are not removed and consequently the new leader is not able to
Hi Encho,
A temporary workaround is to determine whether it has been cleaned up by
monitoring the specific JobID under ZooKeeper's "/jobgraph" path.
Another solution is to modify the source code, forcibly changing the
cleanup to a synchronous form, but the ZooKeeper path Flink operates on
needs to