Re: [ANNOUNCE] Weekly Community Update 2019/24

2019-06-17 Thread Konstantin Knauf
Hi Zili,

Thank you for adding these threads :) I would otherwise have picked them up
next week; I just couldn't fit everything into one email.

Cheers,

Konstantin


Re: [ANNOUNCE] Weekly Community Update 2019/24

2019-06-16 Thread Zili Chen
Hi Konstantin and all,

Thanks very much, Konstantin, for reviving this tradition! It reminds
me of the joyful time when I could easily catch up on interesting ongoing threads.
Thanks for Till's work, too.

Besides the exciting updates and news above, I'd like to point out
some other threads you may be interested in.

* xiaogang has recently started a discussion[1] on allowing
at-most-once delivery in case of failures, which adapts Flink
to more scenarios.

* vino has raised a discussion[2] on supporting local aggregation
in Flink, which has received a lot of positive feedback, and there
is now an ongoing FLIP-44 thread[3].

* Jeff Zhang has raised a discussion[4] and drafted a design doc[5]
on Flink client API enhancement, which aims at overcoming limitations
when integrating Flink with projects such as Zeppelin or Beam.

Best,
tison.

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Allow-at-most-once-delivery-in-case-of-failures-td29464.html
[2]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-Local-Aggregation-in-Flink-td29307.html
[3]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-44-Support-Local-Aggregation-in-Flink-td29513.html
[4]
https://lists.apache.org/thread.html/ce99cba4a10b9dc40eb729d39910f315ae41d80ec74f09a356c73938@%3Cdev.flink.apache.org%3E
[5]
https://docs.google.com/document/d/1VavBrYn8vJeZs-Mhu5VzKO6xrWCF40aY0nlQ_UVVTRg/



[ANNOUNCE] Weekly Community Update 2019/24

2019-06-16 Thread Konstantin Knauf
Dear community,

Last year Till did a great job of summarizing recent developments in the
Flink community in a "Weekly community update" thread. I found this very
helpful and would like to revive this tradition with a focus on topics &
threads which are particularly relevant to the wider community of Flink
users.

As we haven't had such an update for some time (since December 2018), I
find it impossible to cover everything that's currently going on in this
email. I'll try to include most ongoing discussions and FLIPs over the
course of the next weeks to catch up. Afterwards I will go back to
focusing only on news since the last update.

You are welcome to share any additional news and updates with the community
in this thread.

Flink Development
===

* [releases] The community is currently working on a Flink 1.8.1 release
[1]. The first release candidate should be ready soon (one critical bug to
fix as of writing, FLINK-12863).
* [releases] Kurt and Gordon stepped up as release managers for Flink 1.9
and started a thread [2] to sync on the status of various development
threads targeted for Flink 1.9. Check it out to see if the feature you are
waiting for is likely to make it or not.
* [savepoints] Gordon, Kostas and Congxian have recently started a
discussion [3] on unifying the savepoint format across StateBackends, which
will enable users to switch between StateBackends when recovering from a
Savepoint. The discussion on introducing Stop-With-Checkpoint [4],
initiated by Yu Li, is closely related and worth a read to understand the
long-term vision.
* [savepoints] Seth and Gordon have started a discussion [5] to add a State
Processing API ("Savepoint Connector"), which will allow reading &
modifying existing Savepoints as well as creating new Savepoints from
scratch with the DataSet API. The feature is targeted for Flink 1.9.0 as a
new *library*.
* [python-support] Back in April we had a discussion on the mailing list
about adding Python Support to the Table API [6]. This support will likely
be available in Flink 1.9 (without UDFs and later with UDF support as
well). Therefore, Stephan has started a discussion [7] to deprecate the
current Python API in Flink 1.9. This has gotten a lot of positive feedback
and the only open question as of writing is whether to only deprecate it or
to remove it directly.

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Releasing-Flink-1-8-1-td29154.html
[2]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Features-for-Apache-Flink-1-9-0-td28701.html
[3]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-41-Unify-Keyed-State-Snapshot-Binary-Format-for-Savepoints-td29197.html
[4] https://issues.apache.org/jira/browse/FLINK-12619
[5]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discuss-FLIP-43-Savepoint-Connector-td29232.html
[6]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-38-Support-python-language-in-flink-TableAPI-td28061.html#a28096
[7]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Deprecate-previous-Python-APIs-td29483.html#a29522

Notable Bugs
===

In this section I am going to list some recently discovered bugs, which
might be relevant to a larger audience. I'll try to explain them to the
best of my knowledge, but no guarantees.

* [FLINK-12296] [1.6.4] [1.7.2] [1.8.0] State can be silently lost when
recovering a job with two stateful Operators within the same Operator
Chain. This can only happen when using reinterpretAsKeyedStream, and
the bug only affects the RocksDBStateBackend with incremental checkpointing.
Fixed in 1.7.3, 1.8.1 and 1.9.0. [8]
* [FLINK-12688] [1.6.4] [1.7.2] [1.8.0] A race condition while initializing
the TypeSerializer within a StateDescriptor could lead to rare
NullPointerExceptions when a StateDescriptor is shared between threads. Fixed
in 1.7.3, 1.9.0 and 1.8.1. [9]
* [FLINK-12653] [1.6.4] [1.7.2] [1.8.0] After rescaling a job, recovery
might fail if some state was only registered in a subset of all Sub-Tasks.
This only affects the FileSystemStatebackend. Unresolved. [10]
* [FLINK-11820] [1.7.2] [1.8.0] The SimpleStringSchema of the
FlinkKafkaConsumer fails on "null" records. Unresolved, but PR available.
[11]
* [FLINK-11162] [1.6.4] [1.7.2] [1.8.0] Due to the way the checkpoint
directory is cleaned up by the CheckpointCoordinator, tasks might fail
during materialization of a checkpoint if another task has already
declined the same checkpoint. The resolution is part of a larger
rework of how checkpoint failures are managed and seems to be targeted for
Flink 1.9.0. [12]
* [FLINK-10317] [1.6.4] [1.7.2] [1.8.0] Admittedly not a new bug, but still
unresolved and discussed: limiting the Java Metaspace size for Flink
processes by default. It is not clear right now whether limiting the
MetaspaceSize is a good idea. This ticket is a good starting point when
running