Hi,
It is recommended to always call update().
State modifications by mutating objects are only possible because the heap
based backends do not serialize or copy records, in order to avoid additional
costs. Hence, this is a side effect rather than a provided API. As soon as you
change the state backend,
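For illustration, a minimal sketch of the recommended pattern with a keyed
ValueState (the class, state name, and types are invented for this example):
read the value, compute the new one, and write it back with update(), so the
code keeps working if you later switch to a serializing backend like RocksDB.

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;

public class CountingMap extends RichMapFunction<String, Long> {
  private transient ValueState<Long> count;

  @Override
  public void open(Configuration parameters) {
    count = getRuntimeContext().getState(
        new ValueStateDescriptor<>("count", Long.class));
  }

  @Override
  public Long map(String value) throws Exception {
    Long current = count.value();
    long updated = (current == null) ? 1L : current + 1;
    // Always write back explicitly instead of relying on object mutation.
    count.update(updated);
    return updated;
  }
}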
Hi Paul,
Maybe Aljoscha (in CC) can help you with this question.
AFAIK, he has some experience with Flink and Kerberos.
Best, Fabian
2018-08-13 14:51 GMT+02:00 Paul Lam :
> Hi,
>
> I built Flink from the latest 1.5.x source code, and got some strange
> outputs from the command line when
Hi,
It is sufficient to implement the CheckpointedFunction interface.
Since SourceFunctions emit records in a separate thread, you need to ensure
that no record is emitted while the snapshotState method is called.
Flink provides a lock to synchronize data emission and state snapshotting.
See the
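As a hedged sketch of the locking pattern described above (the source, its
counter, and the state name are invented for the example), using the
CheckpointedFunction interface mentioned before:

import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.source.SourceFunction;

public class CountingSource implements SourceFunction<Long>, CheckpointedFunction {
  private volatile boolean running = true;
  private long current = 0L;
  private transient ListState<Long> checkpointedState;

  @Override
  public void run(SourceContext<Long> ctx) throws Exception {
    while (running) {
      // Emit and advance the counter under the checkpoint lock, so that
      // no record is emitted while snapshotState() runs.
      synchronized (ctx.getCheckpointLock()) {
        ctx.collect(current++);
      }
    }
  }

  @Override
  public void cancel() {
    running = false;
  }

  @Override
  public void snapshotState(FunctionSnapshotContext context) throws Exception {
    checkpointedState.clear();
    checkpointedState.add(current);
  }

  @Override
  public void initializeState(FunctionInitializationContext context) throws Exception {
    checkpointedState = context.getOperatorStateStore().getListState(
        new ListStateDescriptor<>("offset", Long.class));
    for (Long value : checkpointedState.get()) {
      current = value;
    }
  }
}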
Hi Mingliang,
let me answer your second question first:
> Another question is about the alignment buffer, I thought it was only
used for multiple input stream cases. But for keyed process function , what
is actually aligned?
When a task sends records to multiple downstream tasks (task not
ong configuration of the cluster; there was only 1
> task manager with 1 slot.
>
> If I submit a job with "flink run -p 24 ...", will the job hang until at
> least 24 slots are available?
>
> Regards,
> Alexis.
>
> On Fri, 10 Aug 2018, 14:01 Fabian Hueske wrote:
>
Hi,
StreamExecutionEnvironment.socketTextStream is deprecated and it is very likely
that it will be removed because of its limited use.
I would recommend to have a look at the implementation of the SourceFunction
[1] and adapt it to your needs.
Best, Fabian
[1]
Hi Henry,
The problem is that the table that results from the query does not have a
unique key.
You can only use an upsert sink if the table has a (composite) unique key.
Since this is not the case, you cannot use an upsert sink.
However, you can implement a RetractStreamTableSink, which allows to
Hi Averell,
Conceptually, you are right. Checkpoints are taken at every operator at the
same "logical" time.
It is not important that each operator checkpoints at the same wallclock
time. Instead, they need to take a checkpoint when they have processed the
same input.
This is implemented with
upBy(0, 1)
>> .reduceGroup(groupReducer)
>> .withForwardedFields("_1")
>> .output(outputFormat)
>>
>> It seems to work well, and the semantic annotation does remove a hash
>> partition from the execution plan.
>>
>> Regards,
>> Ale
Hi,
Elias and Paul have good points.
I think the performance degradation is mostly due to the lack of function
chaining in the rebalance case.
If all steps are just map functions, they can be chained in the
no-rebalance case.
That means, records are passed via function calls.
If you add rebalancing,
Hi Will,
The distinct operator is implemented as a groupBy(distinctKeys) and a
ReduceFunction that returns the first argument.
Hence, it depends on the order in which the records are processed by the
ReduceFunction.
Flink does not maintain a deterministic order because it is quite expensive
in
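To make this concrete, a hedged sketch of the equivalent DataSet program
(input data and types are invented); which record of each group survives
depends on processing order:

import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class DistinctSketch {
  public static void main(String[] args) throws Exception {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Tuple2<String, Integer>> input = env.fromElements(
        Tuple2.of("a", 1), Tuple2.of("a", 2), Tuple2.of("b", 3));

    // distinct on field 0 behaves like groupBy(0) plus a ReduceFunction
    // that keeps the first argument, so the surviving record per group is
    // non-deterministic.
    input.groupBy(0)
        .reduce(new ReduceFunction<Tuple2<String, Integer>>() {
          @Override
          public Tuple2<String, Integer> reduce(
              Tuple2<String, Integer> first, Tuple2<String, Integer> second) {
            return first;
          }
        })
        .print();
  }
}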
dian and the caller indicates
> UTF-16BE, Flink should rewrite the charsetName as UTF-16LE.
>
> I hope this makes sense and that I haven't been testing incorrectly or
> misreading the code.
>
> Thank you,
> David
>
> On Thu, Aug 9, 2018 at 4:04 AM Fabian Hueske wro
Hi Averell,
One comment regarding what you said:
> As my files are small, I think there would not be much benefit in
checkpointing file offset state.
Checkpointing is not about efficiency but about consistency.
If the position in a split is not checkpointed, your application won't
operate with
Hi,
Regarding the plans: there are no plans to support custom window assigners
and evictors.
There were some thoughts about supporting different result update
strategies that could be used to return early results or update results in
case of late data.
However, these features are currently not
takes two parameters:
> partitionNumber and totalNumberOfPartitions. Should I assume that there are
> 2 splits divided into 24 partitions?
>
> Regards,
> Alexis.
>
>
>
> On Wed, Aug 8, 2018 at 11:57 AM Fabian Hueske wrote:
>
>> Hi Alexis,
>>
>> First o
Hi David,
Did you try to set the encoding on the TextInputFormat with
TextInputFormat tif = ...
tif.setCharsetName("UTF-16");
Best, Fabian
2018-08-08 17:45 GMT+02:00 David Dreyfus :
> Hello -
>
> It does not appear that Flink supports a charset encoding of "UTF-16". In
> particular, it
Hi Juan,
The state will be purged if you return None instead of a Some.
However, this only happens when the function is called for a specific key,
i.e., state won't be automatically removed after some time.
If this is your use case, you have to implement a ProcessFunction and use
timers to
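A hedged sketch of that timer-based expiry (the state name, types, and the
one-hour timeout are invented), applied on a keyed stream:

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

public class ExpiringStateFunction extends ProcessFunction<String, String> {
  private transient ValueState<String> lastSeen;

  @Override
  public void open(Configuration parameters) {
    lastSeen = getRuntimeContext().getState(
        new ValueStateDescriptor<>("lastSeen", String.class));
  }

  @Override
  public void processElement(String value, Context ctx, Collector<String> out)
      throws Exception {
    lastSeen.update(value);
    // Register a timer one hour ahead; a production version would also
    // track and re-register the timer per key.
    ctx.timerService().registerProcessingTimeTimer(
        ctx.timerService().currentProcessingTime() + 60 * 60 * 1000);
    out.collect(value);
  }

  @Override
  public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out)
      throws Exception {
    // Remove the state for the current key when the timer fires.
    lastSeen.clear();
  }
}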
Thanks Amit!
I've added Limeroad to the list with your description.
Best, Fabian
2018-08-08 14:12 GMT+02:00 amit.jain :
> Hi Fabian,
>
> We at Limeroad, are using Flink for multiple use-cases ranging from ETL
> jobs, ClickStream data processing, real-time dashboard to CEP. Could you
> list us
Hi everybody,
The Flink community maintains a directory of organizations and projects
that use Apache Flink [1].
Please reply to this thread if you'd like to add an entry to this list.
Thanks,
Fabian
[1] https://cwiki.apache.org/confluence/display/FLINK/Powered+by+Flink
Hi Alexis,
First of all, I think you can leverage the partitioning and sorting properties
of the data returned by the database using SplitDataProperties.
However, please be aware that SplitDataProperties are a rather experimental
feature.
If used without query parameters, the JDBCInputFormat
The code or the execution plan (ExecutionEnvironment.getExecutionPlan()) of
the job would be interesting.
2018-08-08 10:26 GMT+02:00 Chesnay Schepler :
> What have you tried so far to increase performance? (Did you try different
> combinations of -yn and -ys?)
>
> Can you provide us with your
I've created FLINK-10100 [1] to track the problem and suggest a solution
and workaround.
Best, Fabian
[1] https://issues.apache.org/jira/browse/FLINK-10100
2018-08-08 10:39 GMT+02:00 Fabian Hueske :
> Hi Dylan,
>
> Yes, that's a bug.
> As you can see from the plan, the parti
> Thanks for the reply. I was mainly thinking of the use case of a streaming
> job.
> In the approach to port to Flink's SQL API, is it possible to read parquet
> data from S3 and register table in flink?
>
>
> On Tue, Aug 7, 2018 at 1:05 PM, Fabian Hueske wrote:
>
>> Hi Mugun
Hi Dylan,
Yes, that's a bug.
As you can see from the plan, the partitioning step is pushed past the
Filter.
This is possible, because the optimizer knows that a Filter function cannot
modify the data (it only removes records).
A workaround should be to implement the filter as a FlatMapFunction.
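A hedged sketch of that workaround (the predicate and tuple type are invented):
the same filter logic in a FlatMapFunction, which the optimizer must treat as
potentially modifying the data:

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public class FilterAsFlatMap
    implements FlatMapFunction<Tuple2<String, Integer>, Tuple2<String, Integer>> {
  @Override
  public void flatMap(Tuple2<String, Integer> value,
      Collector<Tuple2<String, Integer>> out) {
    // Same semantics as a FilterFunction, but the partitioning is not
    // pushed past this operator.
    if (value.f1 > 0) {
      out.collect(value);
    }
  }
}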
Hi Averell,
As Vino said, checkpoints store the state of all operators of an
application.
The state of a monitoring source function is the position in the currently
read split and all splits that have been received and are currently pending.
In case of a recovery, the splits are recovered and
Hi Mugunthan,
this depends on the type of your job. Is it a batch or a streaming job?
Some queries could be ported to Flink's SQL API as suggested by the link
that Hequn posted. In that case, the query would be executed in Flink.
Other options are to use a JDBC InputFormat or persisting the
cannot be applied to (String, org.apache.flink.streaming.
>>>> api.environment.StreamExecutionEnvironment, Symbol, Symbol, Symbol,
>>>> Symbol)
>>>> [error] tableEnv.registerDataStream("table1", streamExecEnv, 'key,
>>>> 'ticker, 'timeissued
Hi,
By setting the time characteristic to EventTime, you enable the internal
handling of record timestamps and watermarks.
In contrast to EventTime, ProcessingTime does not require any additional
data.
You can use both EventTime and ProcessingTime in the same application and
discussion of your document.
Elias, do you want to put your document into Markdown and open a PR for the
documentation?
Thanks,
Fabian
2018-07-31 18:16 GMT+02:00 Fabian Hueske :
> Hi Elias,
>
> Sorry for the delay. I just made a pass over the document.
> I think it is very good.
>
>
Hi,
Paul is right.
Which and how much data is stored in state for a window depends on the type
of the function that is applied on the windows:
- ReduceFunction: Only the reduced value is stored
- AggregateFunction: Only the accumulator value is stored
- WindowFunction or ProcessWindowFunction:
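To make the AggregateFunction case concrete, a hedged sketch where only a Long
accumulator (a running count) is kept in window state, regardless of how many
elements arrive (types invented for the example):

import org.apache.flink.api.common.functions.AggregateFunction;

// Only the Long accumulator is stored per key and window, not the elements.
public class CountAggregate implements AggregateFunction<String, Long, Long> {
  @Override
  public Long createAccumulator() { return 0L; }

  @Override
  public Long add(String value, Long acc) { return acc + 1; }

  @Override
  public Long getResult(Long acc) { return acc; }

  @Override
  public Long merge(Long a, Long b) { return a + b; }
}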
Hi,
I think you are mixing Java and Scala dependencies.
org.apache.flink.streaming.api.datastream.DataStream is the DataStream of
the Java DataStream API.
You should use the DataStream of the Scala DataStream API.
Best, Fabian
2018-08-01 14:01 GMT+02:00 Mich Talebzadeh :
> Hi,
>
> I believed I
Hi Averell,
please find my answers inlined.
Best, Fabian
2018-07-31 13:52 GMT+02:00 Averell :
> Hi Fabian,
>
> Thanks for the information. I will try to look at the change to that
> complex
> logic that you mentioned when I have time. That would save one more shuffle
> (from 1 to 0), wouldn't
:
> Fabian,
>
> You have any time to review the changes?
>
> On Thu, Jul 19, 2018 at 2:19 AM Fabian Hueske wrote:
>
>> Hi Elias,
>>
>> Thanks for the update!
>> I'll try to have another look soon.
>>
>> Best, Fabian
>>
>> 2018-07-11 1:3
Hi,
If you are using a custom source, you can call
SourceContext.markAsTemporarilyIdle() to indicate that a task is currently
not producing new records [1].
Best, Fabian
2018-07-31 8:50 GMT+02:00 Reza Sameei :
> It's not a real solution; but why you don't change the parallelism for
> your
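A hedged sketch of how a source can signal idleness (the polling logic is a
stand-in for real I/O against some external system):

import org.apache.flink.streaming.api.functions.source.SourceFunction;

public class PollingSource implements SourceFunction<String> {
  private volatile boolean running = true;

  @Override
  public void run(SourceContext<String> ctx) throws Exception {
    while (running) {
      String record = poll();  // hypothetical fetch from an external system
      if (record != null) {
        synchronized (ctx.getCheckpointLock()) {
          ctx.collect(record);
        }
      } else {
        // Tell downstream operators not to wait for watermarks from this
        // currently inactive source task.
        ctx.markAsTemporarilyIdle();
        Thread.sleep(1000);
      }
    }
  }

  private String poll() {
    return null;  // stand-in for real I/O
  }

  @Override
  public void cancel() {
    running = false;
  }
}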
Hi Averell,
The records emitted by the monitoring tasks are "just" file splits, i.e.,
meta information that defines which data to read from where.
The reader tasks receive these splits and process them by reading the
corresponding files.
You could of course partition the splits based on the file
Hi,
Watermarks do not hold back records. Instead, they define the event-time
at an operator (as Vino said) and can trigger the processing of data if the
logic of an operator is based on time.
For example, a window operator can emit complete results for a window once
the time passed the
Hi Chang,
The state handle objects are not created per key but just once per function
instance.
Instead they route state accesses to the backend (JVM heap or RocksDB) for
the currently active key.
Best, Fabian
2018-07-30 12:19 GMT+02:00 Chang Liu :
> Hi Andrey,
>
> Thanks for your reply. My
Hi,
First of all, the ticket reports a bug (or improvement or feature
suggestion) such that others are aware of the problem and understand its
cause.
At some point it might be picked up and implemented. In general, there is
no guarantee whether or when this happens, but the Flink community is of
Hi,
Thanks for creating the Jira issue.
I'm not sure if I would consider this a blocker but it is certainly an
important problem to fix.
Anyway, in the original version Flink checkpoints the modification
timestamp up to which all files have been read (or at least up to which
point it *thinks* to
Hi,
The problem is that Flink tracks which files it has read by remembering the
modification time of the file that was added (or modified) last.
We use the modification time to avoid having to remember the names
of all files that were ever consumed, which would be expensive to check and
Hi Darshan,
The join implementation in SQL / Table API does what is demanded by the SQL
semantics.
Hence, what results to emit and also what data to store (state) to compute
these results is pretty much given.
You can think of the semantics of the join as writing both streams into a
relational
ing).
>
> I can see the point of making the checkpoint triggering more flexible and
> giving some control to the user. In contrast to savepoints, checkpoints are
> considered for recovery. My question here would be, what would be the
> triggering condition in your case (other than
Hi Henkka,
You might want to consider implementing a dedicated job for state
bootstrapping that uses the same operator UUID and state names. That might
be easier than integrating the logic into your regular job.
I think you have to use the monitoring file source because AFAIK it won't
be
Hi,
Flink guarantees order only within a partition. For example, if you have
the program map_1 -> map_2 and both map functions run with parallelism 4,
the order of records in each of the 4 partitions is not changed.
In case of a shuffle (such as a keyBy or change in parallelism) records are
Hi James,
Yes, that should also do the trick.
Best, Fabian
2018-07-19 16:06 GMT+02:00 Porritt, James :
> It looks like the following gives me the result I’m interested in:
>
>
>
> batchEnv
>
> .createInput(dataset)
>
> .groupBy("id")
>
>
Hi Nick,
What Ken said is correct, but let me add two more things.
1) State
Usually, you only need to partition (keyBy()) the data if you want to
process tuples with the same key together.
Therefore, it is necessary to hold some tuples or intermediate results
(like partial or running
>>> I think you did not enable access for comments for the link. Would
>>> you mind enabling comments for the google doc?
>>>
>>> Thanks,
>>> Rong
>>>
>>>
>>> On Thu, Jul 5, 2018 at 8:39 AM Fabian Hueske wrote:
>
Hi Chirag,
Stop with savepoint is not mentioned in the 1.5.0 release notes [1].
Since it's a frequently requested feature, I'm pretty sure that it would
have been mentioned if it was added.
Best, Fabian
[1] http://flink.apache.org/news/2018/05/25/release-1.5.0.html
2018-07-19 8:39 GMT+02:00
Hi Soheil,
Hequn is right. This might be an issue with advancing event-time.
You can monitor that by checking the watermarks in the web dashboard or
print-debug it with a ProcessFunction which can look up the current
watermark.
Best, Fabian
2018-07-19 3:30 GMT+02:00 Hequn Cheng :
> Hi Soheil,
>
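A hedged sketch of such a watermark print-debugging function (the element type
is assumed to be String), applied to a stream via process():

import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

public class WatermarkDebugger extends ProcessFunction<String, String> {
  @Override
  public void processElement(String value, Context ctx, Collector<String> out) {
    // Log the element's timestamp together with the current watermark.
    System.out.println("element ts=" + ctx.timestamp()
        + ", watermark=" + ctx.timerService().currentWatermark());
    out.collect(value);
  }
}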
Hi Shay,
This sounds very much like the off-by-one bug described by FLINK-9857 [1].
The problem was identified in another recent user ml thread and fixed for
Flink 1.5.2 and 1.6.0.
Best, Fabian
[1] https://issues.apache.org/jira/browse/FLINK-9857
2018-07-18 19:00 GMT+02:00 Andrey Zagrebin :
>
Hi everyone,
I'd like to announce the program for Flink Forward Berlin 2018.
The program committee [1] assembled a program of about 50 talks on use
cases, operations, ecosystem, tech deep dive, and research topics.
The conference will host speakers from Airbnb, Amazon, Google, ING, Lyft,
Hi Alexei,
Till (in CC) is familiar with Flink's Mesos support in 1.4.x.
Best, Fabian
2018-07-13 15:07 GMT+02:00 NEKRASSOV, ALEXEI :
> Can someone please clarify how Flink on Mesos is containerized?
>
>
>
> On 5-node Mesos cluster I started Flink (1.4.2) with two Task Managers.
> Mesos shows
Hi Gerard,
Thanks for reporting this issue. I'm pulling in Nico and Piotr who have
been working on the networking stack lately and might have some ideas
regarding your issue.
Best, Fabian
2018-07-13 13:00 GMT+02:00 Zhijiang(wangzhijiang999) <
wangzhijiang...@aliyun.com>:
> Hi Gerard,
>
> I
Hi,
I don't think that is possible.
The Evictor interface does not provide access to a state store, so there is
no way to access state.
Best, Fabian
2018-07-10 13:26 GMT+02:00 Jayant Ameta :
> Hi,
> I'm using the GlobalWindow with a custom CountTrigger (similar to the
> CountTrigger provided
ver, a second PoC I was considering is related to Flink CEP. Let's
>>> say I am elaborating sensor data, I want to have a rule which is working on
>>> the following principle:
>>> - If the temperature is more than 40
>>> - If the temperature yesterday at noon was more
Hi Yennie,
You might want to have a look at the OVER windows of Flink's Table API or
SQL [1].
An OVER window computes an aggregate (such as a count) for each incoming
record over a range of previous events.
For example the query:
SELECT ip, successful, COUNT(*) OVER (PARTITION BY ip, successful
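The query above is cut off in this snippet; a hedged completion of a query in
the same spirit (the table name, columns, event-time attribute rowtime, and
the one-hour range are assumed for illustration):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.java.StreamTableEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tableEnv = TableEnvironment.getTableEnvironment(env);

// Count, per incoming row, the login attempts with the same (ip, successful)
// combination over the preceding hour of event time.
Table result = tableEnv.sqlQuery(
    "SELECT ip, successful, " +
    "  COUNT(*) OVER (" +
    "    PARTITION BY ip, successful " +
    "    ORDER BY rowtime " +
    "    RANGE BETWEEN INTERVAL '1' HOUR PRECEDING AND CURRENT ROW) AS cnt " +
    "FROM logins");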
Hi Elias,
Thanks for the great document!
I made a pass over it and left a few comments.
I think we should definitely add this to the documentation.
Thanks,
Fabian
2018-07-04 10:30 GMT+02:00 Fabian Hueske :
> Hi Elias,
>
> I agree, the docs lack a coherent discussion of event time
Hi,
> Flink doesn't support connecting multiple streams with heterogeneous
schema
This is not correct.
Flink is very well able to connect streams with different schemas. However,
you cannot union two streams with different schemas.
In order to reconfigure an operator with changing rules, you can
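A hedged sketch of connecting two streams with different schemas (types and
data invented), which also shows the usual pattern for reconfiguring an
operator via a control stream of rules:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<String> events = env.fromElements("a", "bb", "ccc");
DataStream<Integer> rules = env.fromElements(1, 2);

events.connect(rules)
    .flatMap(new CoFlatMapFunction<String, Integer, String>() {
      private int threshold = 0;  // not checkpointed in this sketch

      @Override
      public void flatMap1(String event, Collector<String> out) {
        if (event.length() > threshold) {
          out.collect(event);
        }
      }

      @Override
      public void flatMap2(Integer rule, Collector<String> out) {
        threshold = rule;  // reconfigure the operator with the new rule
      }
    })
    .print();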
ble.getSchema.getTypes)
> tableEnv.toRetractStream[Row](outTable).print()
>
>
> Thanks again,
> Jungtaek Lim (HeartSaVioR)
>
> [1] https://issues.apache.org/jira/browse/FLINK-9742
>
> On Wed, Jul 4, 2018 at 10:03 PM, Fabian Hueske wrote:
>
>> Hi,
>>
>> Glad you
Hi Yersinia,
The main idea of an event-driven application is to hold the state (i.e.,
the account data) in the streaming application and not in an external
database like Couchbase.
This design is very scalable (state is partitioned) and avoids look-ups
from the external database because all state
("eventTime"),
> new BoundedOutOfOrderTimestamps(Time.minutes(1).toMilliseconds)
> )
> .build()
>
> Thanks again!
> Jungtaek Lim (HeartSaVioR)
>
> On Wed, Jul 4, 2018 at 8:18 PM, Fabian Hueske wrote:
>
>> Hi Jungtaek,
>>
>> If it is "only
Hi Xilang,
I thought about this again.
The bucketing sink would need to roll on event-time intervals (similar to
the current processing time rolling) which are triggered by watermarks in
order to support consistency.
However, it would also need to maintain a write ahead log of all received
rows
e tricky (as the semantic of SQL query is not for
> multiple outputs).
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Wed, Jul 4, 2018 at 5:49 PM, Fabian Hueske wrote:
>
>> Hi Jungtaek,
>>
>> Flink 1.5.0 features a TableSource for Kafka and JSON [1], incl.
>> ti
Hi Ahmad,
Some tricks that might help to bring down the effort per tenant if you run
one job per tenant (or key per tenant):
- Pre-aggregate records in a 5 minute Tumbling window. However,
pre-aggregation does not work for FoldFunctions.
- Implement the window as a custom ProcessFunction that
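A hedged sketch of the first trick, pre-aggregating per tenant in a 5-minute
tumbling window with a ReduceFunction (the tenant/count tuple type and data
are invented):

import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<Tuple2<String, Long>> counts = env.fromElements(
    Tuple2.of("tenantA", 1L), Tuple2.of("tenantB", 1L));

counts.keyBy(0)
    .window(TumblingProcessingTimeWindows.of(Time.minutes(5)))
    // A ReduceFunction keeps a single value per key and window in state.
    .reduce(new ReduceFunction<Tuple2<String, Long>>() {
      @Override
      public Tuple2<String, Long> reduce(Tuple2<String, Long> a,
          Tuple2<String, Long> b) {
        return Tuple2.of(a.f0, a.f1 + b.f1);
      }
    })
    .print();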
ording to above conversation Flink will persist state forever for non
> windowed operations. I want to know how Flink persists the state, i.e.,
> database, file system, or in memory, etc.
>
> On Wed, 4 Jul 2018 at 2:12 PM, Fabian Hueske wrote:
>
>> Hi Amol,
>>
>> The m
Hi Jungtaek,
Flink 1.5.0 features a TableSource for Kafka and JSON [1], incl. timestamp
& watermark generation [2].
It would be great if you could let us know, if that addresses your use case
and if not what's missing or not working.
So far Table API / SQL does not have support for late-data side
Hi Amol,
The memory consumption depends on the query/operation that you are doing.
Time-based operations like group-window-aggregations,
over-window-aggregations, or window-joins can automatically clean up their
state once the data is no longer needed.
Operations such as non-windowed aggregations
Hi,
The Evictor is useful if you want to remove some elements from the window
state but not all.
This also implies that a window is evaluated multiple times because
otherwise you could just filter in the user function (as you suggested)
and purge the whole window afterwards.
Evictors are
Hi Elias,
I agree, the docs lack a coherent discussion of event time features.
Thank you for this write up!
I just skimmed your document and will provide more detailed feedback later.
It would be great to add such a page to the documentation.
Best, Fabian
2018-07-03 3:07 GMT+02:00 Elias Levy :
:37 GMT+02:00 ashish pok :
> Thanks Fabian! It sounds like KeyGroup will do the trick if that can be
> made publicly accessible.
>
> On Monday, July 2, 2018, 5:43:33 AM EDT, Fabian Hueske
> wrote:
>
>
> Hi Ashish, hi Vijay,
>
> Flink does not distinguish between di
Looking at the other threads, I assume you solved this issue.
The problem should have been that FlinkKafkaConsumer09 is not included in
the flink-connector-kafka-0.11 module, because it is the connector for
Kafka 0.9 and not Kafka 0.11.
Best, Fabian
2018-07-02 11:20 GMT+02:00 Mich Talebzadeh :
There is also the SQL:2003 MERGE statement that can be used to implement
UPSERT logic.
It is a bit verbose but supported by Derby [1].
Best, Fabian
[1] https://issues.apache.org/jira/browse/DERBY-3155
2018-07-04 10:10 GMT+02:00 Fabian Hueske :
> Hi Chris,
>
> MySQL (and maybe o
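A hedged sketch of what such a MERGE-based upsert can look like over JDBC
(the tables counts and staging, their columns, and the in-memory Derby URL
are all invented and assumed to exist):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class MergeUpsert {
  public static void main(String[] args) throws Exception {
    // Upsert rows from a staging table into "counts" via SQL:2003 MERGE.
    try (Connection conn =
             DriverManager.getConnection("jdbc:derby:memory:db;create=true");
         Statement stmt = conn.createStatement()) {
      stmt.executeUpdate(
          "MERGE INTO counts t USING staging s ON t.id = s.id "
              + "WHEN MATCHED THEN UPDATE SET t.cnt = s.cnt "
              + "WHEN NOT MATCHED THEN INSERT (id, cnt) VALUES (s.id, s.cnt)");
    }
  }
}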
Hi,
In addition to what Rong said:
- The types look OK.
- You can also use Types.STRING, and Types.LONG instead of BasicTypeInfo.xxx
- Beware that in the failure case, you might have multiple entries in the
database table. Some databases support an upsert syntax which (together
with key or
Hi Xilang,
Let me try to summarize your requirements.
If I understood you correctly, you are not only concerned about the
exactly-once guarantees but also need a consistent view of the data.
The data in all files that are finalized need to originate from a prefix of
the stream, i.e., all records
Hi,
The docs explain that the ExternalCatalog interface *can* be used to
implement a catalog for HCatalog or Metastore.
However, there is no such implementation in Flink yet. You would need to
implement such a catalog connector yourself.
I think there would be quite a few people interested in
Hi Will,
The community is currently working on improving the Kafka Avro integration
for Flink SQL.
There's a PR [1]. If you like, you could try it out and give some feedback.
Timo (in CC) has been working on Kafka Avro and should be able to help with
any specific questions.
Best, Fabian
[1]
Hi Mich,
FlinkKafkaConsumer09 is the connector for Kafka 0.9.x.
Have you tried to use FlinkKafkaConsumer011 instead of FlinkKafkaConsumer09?
Best, Fabian
2018-07-02 22:57 GMT+02:00 Mich Talebzadeh :
> This is becoming very tedious.
>
> As suggested I changed the kafka dependency from
>
>
Hi,
Let me summarize:
1) Sometimes you get the error message
"org.apache.flink.client.program.ProgramMissingJobException: The program
didn't contain a Flink job.". when submitting a program through the
YarnClusterClient
2) The logs and the dashboard state that the job ran successful
3) The job
e physical partitioning in a way where physical partitioning happens
>> first by parent key and localize grouping by child key, is there a need to
>> using custom partitioner? Obviously we can keyBy twice but was wondering if
>> we can minimize the re-partition stress.
>>
>
ne.
>
> I guess I might have to use a ThreadPool within each Slot(cam partition)
> to work on each seq# ??
>
> TIA
>
> On Tue, Jun 26, 2018 at 1:06 AM Fabian Hueske wrote:
>
>> Hi,
>>
>> keyBy() does not work hierarchically. Each keyBy() overrides the pre
Hi Osh,
You can certainly apply multiple reduce functions on a DataSet; however, you
should make sure that the data is only partitioned and sorted once.
Moreover, you would end up with multiple data sets that you need to join
afterwards.
I think the easier approach is to wrap your functions in a
ta is
>>>> loaded into the system before the watermark advances. At that point the
>>>> checkpoints stall indefinitely with a couple of the tasks in the 'over'
>>>> operator never acknowledging. Any thoughts on what would cause that? Or how
>>>
Hi,
The OVER window operator can only emit result when the watermark is
advanced, due to SQL semantics which define that all records with the same
timestamp need to be processed together.
Can you check if the watermarks make sufficient progress?
Btw. did you observe state size or IO issues? The
Hi Elias,
Till (in CC) is familiar with Flink's HA implementation.
He might be able to answer your question.
Thanks,
Fabian
2018-06-25 23:24 GMT+02:00 Elias Levy :
> I noticed in one of our cluster that they are relatively old
> submittedJobGraph* and completedCheckpoint* files. I was
Hi,
You can just add a cast to StateBackend to get rid of the deprecation
warning:
env.setStateBackend((StateBackend) new
FsStateBackend("hdfs://myhdfsmachine:9000/flink/checkpoints"));
Best, Fabian
2018-06-27 5:47 GMT+02:00 Rong Rong :
> Hmm.
>
> If you have a wrapper function like this,
Hi Sagar,
That's more a question for the ORC community, but AFAIK, the top-level type
is always a struct because it needs to wrap the fields, e.g.,
struct(name:string, age:int)
Best, Fabian
2018-06-26 22:38 GMT+02:00 sagar loke :
> @zhangminglei,
>
> Question about the schema for ORC format:
>
Hi,
Measuring latency is tricky and you have to be careful about what you
measure.
Aggregations like window operators make things even more difficult because
you need to decide which timestamp(s) to forward (smallest? largest? all?).
Depending on the operation, the measurement code might even
nt
>> slots/threads on the same Task Manager instance(aka cam1 partition) using
>> keyBy(seq#) & setParallelism() ? Can *forward* Strategy be used to
>> achieve this ?
>>
>> TIA
>>
>>
>> On Mon, Jun 25, 2018 at 1:03 AM Fabian Hueske wrote:
>>
Hi Vishal,
1. I don't think a rolling update is possible. Flink 1.5.0 changed the
process orchestration and how the processes communicate. IMO, the way to go is
to
start a Flink 1.5.0 cluster, take a savepoint on the running job, start
from the savepoint on the new cluster and shut the old job down.
2.
Hi,
Flink distributes task instances to slots and does not expose physical
machines.
Records are partitioned to task instances by hash partitioning. It is also
not possible to guarantee that the records in two different operators are
sent to the same slot.
Sharing information by side-passing it
Hi,
I would not encode this information in watermarks. Watermarks are rather an
internal mechanism to reason about event-time.
Flink also generates watermarks internally. This makes the behavior less
predictive.
You could either inject special meta data records (which Flink handles just
like
Great, thank you!
2018-06-22 10:16 GMT+02:00 Vinay Patil :
> Hi Fabian,
>
> Created a JIRA ticket : https://issues.apache.org/jira/browse/FLINK-9643
>
> Regards,
> Vinay Patil
>
>
> On Fri, Jun 22, 2018 at 1:25 PM Fabian Hueske wrote:
>
>> Hi Vinay,
>>
Hi Vinay,
This looks like a bug.
Would you mind creating a Jira ticket [1] for this issue?
Thank you very much,
Fabian
[1] https://issues.apache.org/jira/projects/FLINK
2018-06-21 9:25 GMT+02:00 Vinay Patil :
> Hi,
>
> I have deployed Flink 1.3.2 and enabled SSL settings. From the ssl debug
>
Hi,
Although this solution looks straightforward, custom triggers cannot be
added that easily.
The problem is that a window operator with a Trigger that emits early
results produces updates, i.e., results that have been emitted might be
updated later.
The default Trigger only emits the final
seems not supported in Flink-1.3 .
> I found this in Flink-1.3:
> Broadcasting
> DataStream → DataStream
>
> Broadcasts elements to every partition.
>
> dataStream.broadcast();
>
> But I don’t know how to convert it to list and get it in stream context .
>
> On Jun 2018
Hi,
if the list is static and not too large, you can pass it as a parameter to
the function.
Function objects are serialized (using Java's default serialization) and
shipped to the workers for execution.
If the data is dynamic, you might want to have a look at Broadcast state
[1].
Best, Fabian
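A minimal sketch of the first option (the lookup list and function are
invented for the example): the list is a plain constructor argument and
travels with the serialized function object.

import java.util.Arrays;
import java.util.List;
import org.apache.flink.api.common.functions.FilterFunction;

public class WhitelistFilter implements FilterFunction<String> {
  // Shipped to the workers via Java serialization of the function object.
  private final List<String> whitelist;

  public WhitelistFilter(List<String> whitelist) {
    this.whitelist = whitelist;
  }

  @Override
  public boolean filter(String value) {
    return whitelist.contains(value);
  }
}

// usage: stream.filter(new WhitelistFilter(Arrays.asList("a", "b")));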
ger.clear(), not
> Trigger.onClose().
>
> Best,
> - Dongwon
>
>
> On Wed, Jun 20, 2018 at 7:30 PM, Chesnay Schepler
> wrote:
>
>> Checkpointing of metrics is a manual process.
>> The operator must write the current value into state, retrieve it on
>> re
Hi Manuel,
I had a look and couldn't find a way to do it.
However, this sounds like a very useful feature to me.
Would you mind creating a Jira issue [1] for that?
Thanks, Fabian
[1] https://issues.apache.org/jira/projects/FLINK
2018-06-18 16:23 GMT+02:00 Haddadi Manuel :
> Hi all,
>
>
> I