One of the key challenges is isolation, e.g. ensuring that one job cannot
access the credentials of another. The easiest solution today is to use
the YARN deployment mode, with a separate app per job. Meanwhile,
improvements being made under the FLIP-6 banner for 1.4+ are laying the
groundwork for a
Note that the built-in `BoundedOutOfOrdernessTimestampExtractor` generates
watermarks based only on the timestamp of incoming events. Without new
events, `BoundedOutOfOrdernessTimestampExtractor` will not advance the
event-time clock. That explains why the window doesn't trigger immediately.
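To make that concrete, here is a stand-alone model of the bounded-out-of-orderness logic (illustrative names, not Flink's actual API): the watermark is derived purely from the maximum timestamp observed so far, so without new elements it never advances.

```java
// Hypothetical model of bounded-out-of-orderness watermarking.
// The watermark trails the largest timestamp seen by a fixed bound;
// polling it repeatedly without new elements returns the same value.
public class BoundedOutOfOrdernessModel {
    private final long maxOutOfOrdernessMs;
    private long maxSeenTimestamp = Long.MIN_VALUE;

    public BoundedOutOfOrdernessModel(long maxOutOfOrdernessMs) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
    }

    // Called for each incoming element's timestamp.
    public void onElement(long timestampMs) {
        maxSeenTimestamp = Math.max(maxSeenTimestamp, timestampMs);
    }

    // Polled periodically by the runtime; flat when no elements arrive.
    public long currentWatermark() {
        return maxSeenTimestamp - maxOutOfOrdernessMs;
    }
}
```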
I think your snippet looks good. The Jackson ObjectMapper is designed to
be reused by numerous threads, and routinely stored as a static field. It
is somewhat expensive to create.
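The usual pattern looks something like this (a sketch assuming Jackson on the classpath; the `JsonUtil` class name is hypothetical):

```java
import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonUtil {
    // Created once and shared: ObjectMapper is thread-safe for reads and
    // writes after configuration, and construction is relatively expensive.
    private static final ObjectMapper MAPPER = new ObjectMapper();

    public static <T> T parse(String json, Class<T> type) throws Exception {
        return MAPPER.readValue(json, type);
    }
}
```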
Hope this helps,
-Eron
On Thu, Aug 3, 2017 at 7:46 AM, Nico Kruber wrote:
> Hi Peter,
>
Flink 1.3 relies on a two-step approach of deploying the cluster on Mesos
and then deploying the job. The cron task seems like a good approach;
maybe retry until the job is successfully deployed (presumably using the
'detached' option). One complication is that the slots take some time to
come
For reference: [FLINK-6466] Build Hadoop 2.8.0 convenience binaries
On Wed, Aug 9, 2017 at 6:41 AM, Aljoscha Krettek
wrote:
> So you're saying that this works if you manually compile Flink for Hadoop
> 2.8.0? If yes, I think the solution is that we have to provide binaries
It sounds to me like the TGT is expiring (usually after 12 hours). This
shouldn't happen in the keytab scenario because of a background thread
provided by Hadoop that periodically performs a re-login using the keytab.
More details on the Hadoop internals here:
The directory referred to by `blob.storage.directory` is best described as
a local cache. For recovery purposes the JARs are also stored in
`high-availability.storageDir`. At least that's my reading of the code in
1.2. Maybe there's some YARN specific behavior too, sorry if this
information
When you submit a program via the REST API, the main method executes inside
the JobManager process. Unfortunately a static variable is used to
establish the execution environment that the program obtains from
`ExecutionEnvironment.getExecutionEnvironment()`. From the stack trace it
appears
Yes, registered timers are stored in managed keyed state and should be
fault-tolerant.
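For illustration, a sketch of registering such a timer in a `ProcessFunction` applied to a keyed stream (assumes Flink on the classpath; the timeout value is arbitrary):

```java
// Sketch: a processing-time timer registered in a keyed ProcessFunction.
// The timer is stored in managed keyed state and restored on recovery.
public class TimeoutFunction extends ProcessFunction<String, String> {
    @Override
    public void processElement(String value, Context ctx, Collector<String> out)
            throws Exception {
        // Survives failure/recovery along with the checkpointed state.
        ctx.timerService().registerProcessingTimeTimer(
            ctx.timerService().currentProcessingTime() + 60_000);
        out.collect(value);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) {
        // Fires at (roughly) the registered time, even across restarts.
    }
}
```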
-Eron
On Thu, May 25, 2017 at 9:28 AM, Moiz S Jinia wrote:
> With a checkpointed RocksDB based state backend, can I expect the
> registered processing timers to be fault tolerant?
Try setting the assigner on the Kafka consumer, rather than on the
DataStream:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/connectors/kafka.html#kafka-consumers-and-timestamp-extractionwatermark-emission
I believe this will produce a per-partition assigner and forward only
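A sketch of what that looks like (assumes the Flink Kafka 0.10 connector; `parseTimestamp` is a hypothetical helper for your message format):

```java
// Sketch: attach the assigner to the consumer, not the DataStream,
// to get per-partition watermarking inside the Kafka source.
FlinkKafkaConsumer010<String> consumer =
    new FlinkKafkaConsumer010<>("topic", new SimpleStringSchema(), properties);

consumer.assignTimestampsAndWatermarks(
    new BoundedOutOfOrdernessTimestampExtractor<String>(Time.seconds(10)) {
        @Override
        public long extractTimestamp(String element) {
            return parseTimestamp(element); // hypothetical helper
        }
    });

DataStream<String> stream = env.addSource(consumer);
```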
Gordon's suggestion seems like a good way to provide per-job credentials
based on application-specific properties. In contrast, Flink's built-in
JAAS features are aimed at making the Flink cluster's Kerberos credentials
available to jobs.
I want to reiterate that all jobs (for a given Flink
It appears that you're building Flink from source, and attempting to start
the cluster from the 'flink-dist/src/...' directory. Please use
'flink-dist/target/...' instead since that's where the built distribution
is located. For example, check:
I think the short-term approach is to place an nginx proxy in front, in
combination with some form of isolation of the underlying endpoint. That
addresses the authentication piece but not fine-grained authorization. Be
aware that the Flink JM is not multi-user due to lack of isolation among
Unfortunately Flink does not yet support SSL mutual authentication nor any
form of client authentication. There is an ongoing discussion about it:
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Service-Authorization-redux-td18890.html
A workaround that I've seen is to
Good news: it can be done if you carefully encode the target directory with
percent-encoding, as per:
https://tools.ietf.org/html/rfc3986#section-2.1
For example, given the directory `s3:///savepoint-bucket/my-awesome-job`,
which encodes to `s3%3A%2F%2F%2Fsavepoint-bucket%2Fmy-awesome-job`, I was
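For what it's worth, plain Java can produce that encoding. Note that `URLEncoder` performs form-encoding rather than strict RFC 3986 percent-encoding, but for a path containing no spaces the two coincide (a sketch; the class name is hypothetical):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SavepointPathEncoder {
    // Percent-encodes a savepoint target directory for use in a URL path.
    public static String encode(String dir) {
        try {
            return URLEncoder.encode(dir, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```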
Hello, the current behavior is that Flink holds onto received offers for up
to two minutes while it attempts to provision the TMs. Flink can combine
small offers to form a single TM, to combat fragmentation that develops
over time in a Mesos cluster. Are you saying that unused offers aren't
> It's a bit confusing because the cancel api works with either and the
> proxy url sometimes works as this was successful
> http://{ip}:20888/proxy/application_1504649135200_0001/jobs/cca2dd609c716a7b0a195700777e5b1f/cancel-with-savepoint/target-dir
> public abstract Instant getWatermark();
>
> public abstract CheckpointMark getCheckpointMark();
> }
>
> Here the driver would sit outside and call the source whenever data should
> be provided. Stop/cancel would not be a feature of the source function but
> of the code that calls it
I too am curious about stop vs cancel. I'm trying to understand the
motivations a bit more.
The current behavior of stop is basically that the sources become bounded,
leading to the job winding down.
The interesting question is how best to support 'planned' maintenance
procedures such as app
Yes, a given host may be used as both a job manager and as a task manager.
On Mon, Sep 11, 2017 at 8:27 AM, AndreaKinn wrote:
> Hi,
> I'm configuring a cluster composed by just three nodes.
> Looking at cluster setup guide
>
Hi,
You do not need to install Flink onto the nodes. The approach used by
Flink is that the task managers download a copy of Flink from the
appmaster. The entire installation tree of Flink is downloaded (i.e. the
bin/lib/conf directories). The only assumed dependency is Java, which may
be
As mentioned earlier, the watermark is the basis for reasoning about the
overall progression of time. Many operators use the watermark to
correctly organize records, e.g. into the correct time-based window.
Within that window the records may still be unordered. That said, some
operators do
There was also a talk about containerization at Flink Forward that touches
on some of your questions.
https://www.youtube.com/watch?v=w721NI-mtAA&t=2s&list=PLDX4T_cnKjD0JeULl1X6iTn7VIkDeYX_X&index=33
Eron
On Wed, Sep 27, 2017 at 9:12 AM, Stefan Richter wrote:
> Hi,
>
> from
Hello,
You can run the WordCount program directly within your IDE. An embedded
Flink environment is automatically created.
-Eron
On Thu, Aug 17, 2017 at 8:03 AM, P. Ramanjaneya Reddy
wrote:
> Thanks.
> But I want to run in debug mode..how to run without jar and using the
Raja,
According to those configuration values, the delegation token would be
automatically renewed every 24 hours, then expire entirely after 7 days.
You say that the job ran without issue for 'a few days'. Can we conclude
that the job hit the 7-day DT expiration?
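For reference, the Hadoop properties that typically govern those two intervals are shown below with their default values (verify against your cluster's hdfs-site.xml):

```xml
<!-- hdfs-site.xml: Hadoop defaults, in milliseconds -->
<property>
  <name>dfs.namenode.delegation.token.renew-interval</name>
  <value>86400000</value>   <!-- 24 hours between renewals -->
</property>
<property>
  <name>dfs.namenode.delegation.token.max-lifetime</name>
  <value>604800000</value>  <!-- 7 days, the hard expiration -->
</property>
```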
Flink supports the use of
Please mail me more information, in particular the JM log and the
information on the 'offers' tab on the Mesos UI. Also, are you using any
Mesos roles?
Thanks
On Thu, Aug 31, 2017 at 9:02 AM, prashantnayak <
prash...@intellifylearning.com> wrote:
> Hi Eron
>
> No, unfortunately we did not
Hello, did you resolve this issue?
Thanks,
Eron Wright
On Wed, Jul 12, 2017 at 11:09 AM, Prashant Nayak <
prash...@intellifylearning.com> wrote:
>
> Hi
>
> We’re running Flink 1.3.1 on Mesos.
>
> From time-to-time, the Flink app master seems to have trouble with Meso
By following Chesney's recommendation we will hopefully uncover an SSL
error that is being masked. Another thing to try is to disable hostname
verification (it is enabled by default) to see whether the certificate is
being rejected.
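If you want to try that, I believe the relevant setting in flink-conf.yaml is:

```yaml
# For debugging only; re-enable verification once the issue is understood.
security.ssl.verify-hostname: false
```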
On Wed, Oct 4, 2017 at 5:15 AM, Chesnay Schepler
Take a look at the section of Flink documentation titled "Event Time and
Watermarks":
https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/event_time.html#event-time-and-watermarks
Also read the excellent series "Streaming 101" and "Streaming 102", which
have useful animations depicting the flow of
Jared, I think you're correct that the shaded `ObjectMapper` is missing.
Based on the above details and a quick look at the Fenzo code, it appears
that this bug is expressed when you've configured some hard constraints
that Mesos couldn't satisfy. Do you agree? Thanks also for suggesting a
fix,
To my knowledge the various RPC clients take care of renewal (whether
reactively or using a renewal thread). Some examples:
https://github.com/apache/hadoop/blob/release-2.7.3-RC2/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java#L638
Since the connection pool is per-TM, and a given job may span numerous TMs,
I think you're imagining a per-TM cleanup.
Your idea of using reference counting seems like the best option. Maintain
a static count of open instances. Use a synchronization block to manage
the count and to
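A sketch of that reference-counting pattern (all names hypothetical): a static counter guarded by a lock, opening the shared resource on the first acquire and closing it on the last release.

```java
// Hypothetical sketch: a TM-wide shared resource whose lifetime is
// managed by a static reference count, as described above.
public class SharedPool {
    private static final Object LOCK = new Object();
    private static int openInstances = 0;
    private static boolean poolOpen = false;

    public static void acquire() {
        synchronized (LOCK) {
            if (openInstances == 0) {
                poolOpen = true; // lazily create the real pool here
            }
            openInstances++;
        }
    }

    public static void release() {
        synchronized (LOCK) {
            openInstances--;
            if (openInstances == 0) {
                poolOpen = false; // close the real pool here
            }
        }
    }

    public static boolean isOpen() {
        synchronized (LOCK) {
            return poolOpen;
        }
    }
}
```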
A couple of other details worth mentioning:
1. Your choice of connector matters a lot. Some connectors provide
limited ordering guarantees, especially across keys, which leads to a
highly disordered stream of events (with respect to event time) from the
perspective of the watermark generator.
Thanks for investigating this, Jared. I would summarize it as
Flink-on-Mesos cannot be used in Hadoop-free mode in Flink 1.4.0. I filed
an improvement bug to support this scenario: FLINK-8247
On Tue, Dec 12, 2017 at 11:46 AM, Jared Stehler <
jared.steh...@intellifylearning.com> wrote:
> I
It might be possible to apply backpressure to the channels that are
significantly ahead in event time. Tao, it would not be trivial, but if
you'd like to investigate more deeply, take a look at the Flink runtime's
`StatusWatermarkValve` and the associated stream input processors to see
how an
If I understand you correctly, the high-availability path isn't being
changed but other TM-related settings are, and the recovered TMs aren't
picking up the new configuration. I don't think that Flink supports
on-the-fly reconfiguration of a Task Manager at this time.
As a workaround, to
As a follow-up question, how well does the history server work for
observing a running job? I'm trying to understand whether, in the
cluster-per-job model, a user would be expected to hop from the Web UI to
the History Server once the job completed.
Thanks
On Wed, Oct 4, 2017 at 3:49 AM,
Consider the watermarks that are generated by your chosen watermark
generator as an +assertion+ about the progression of time, based on domain
knowledge, observation of elements, and connector specifics. The generator
is asserting that any elements observed after a given watermark will come
later
You must specify the full class name, in this case `org.example.WordCount`,
for the `--class` argument.
On Fri, Jan 19, 2018 at 9:35 AM, Jesse Lacika wrote:
> I feel like this is probably the simplest thing, but I can't seem to
> figure it out, and I've searched and searched and
host.
Eron
On Thu, Jan 11, 2018 at 7:18 AM, Gary Yao <g...@data-artisans.com> wrote:
> Hi Dongwon,
>
> I am not familiar with the deployment on DC/OS. However, Eron Wright and
> Jörg
> Schad (cc'd), who have worked on the Mesos integration, might be able to
> help
I believe you can extend the `KeyedDeserializationSchema` that you pass to
the consumer to check for end-of-stream markers.
https://ci.apache.org/projects/flink/flink-docs-release-1.4/api/java/org/apache/flink/streaming/util/serialization/KeyedDeserializationSchema.html#isEndOfStream-T-
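A sketch of such a schema (assumes Flink on the classpath; the `__END__` marker is a hypothetical sentinel for your topic):

```java
import java.nio.charset.StandardCharsets;

import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.streaming.util.serialization.KeyedDeserializationSchema;

// Sketch: stop consuming when a sentinel record is observed.
public class MarkerAwareSchema implements KeyedDeserializationSchema<String> {
    @Override
    public String deserialize(byte[] messageKey, byte[] message,
                              String topic, int partition, long offset) {
        return new String(message, StandardCharsets.UTF_8);
    }

    @Override
    public boolean isEndOfStream(String nextElement) {
        return "__END__".equals(nextElement); // hypothetical marker
    }

    @Override
    public TypeInformation<String> getProducedType() {
        return BasicTypeInfo.STRING_TYPE_INFO;
    }
}
```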
Eron
On
I took a brief look as to why the queryable state server would bind to the
loopback address. Both the qs server and the
org.apache.flink.runtime.io.network.netty.NettyServer do bind the local
address based on the TM address. That address is based on the
"taskmanager.hostname" configuration
I think there's a misconception about `setAutoWatermarkInterval`. It
establishes the rate at which your periodic watermark generator is polled
for the current watermark. Like most generators,
`BoundedOutOfOrdernessTimestampExtractor` produces a watermark based solely
on observed elements.
> unauthorised mode if zookeeper allows.
>
> And then it connected successfully!
>
>
>
> Am I missing any configuration on the flink/zookeeper side?
>
> Expecting your suggestion here.
>
>
>
> Regards
>
> Sarthak Sahu
>
>
>
> *From:* Eron Wright [mailto:
Here's a PR for queryable state TLS that I closed because I didn't have
time, and because I get the impression that the queryable state feature
isn't used very often. Feel free to take it up, if you like.
https://github.com/apache/flink/pull/6626
-Eron
On Wed, Jul 3, 2019 at 11:21 PM Avi Levi