Hi,

I like the idea of putting the roadmap on the website because it is much
more visible (and, IMO, more credible and binding) there.
However, I share the concerns about frequent updates.

I think it would be great to update the "official" roadmap on the website
once per release (excluding bugfix releases), i.e., every three months.
We can use the wiki to collect and draft the roadmap for the next update.

Best, Fabian


On Thu, Feb 14, 2019 at 11:03 AM Jeff Zhang <zjf...@gmail.com> wrote:

> Hi Stephan,
>
> Thanks for this proposal. It is a good idea to track the roadmap. One
> suggestion: it might be better to put it on a wiki page first, because it
> is easier to update the roadmap on the wiki than on the Flink website. And
> I guess we may need to update the roadmap very often in the beginning, as
> there are so many discussions and proposals in the community recently. We
> can move it to the Flink website later when we feel it has been nailed
> down.
>
> Stephan Ewen <se...@apache.org> wrote on Thu, Feb 14, 2019 at 5:44 PM:
>
>> Thanks Jincheng and Rong Rong!
>>
>> I am not deciding a roadmap or making a call on which features should be
>> developed or not. I was only collecting broader issues that are already
>> happening or have an active FLIP/design discussion plus committer support.
>>
>> Do we have that for the suggested issues as well? If yes, we can add them
>> (can you point me to the issue/mail thread)? If not, let's try to move the
>> discussion forward and add them to the roadmap overview then.
>>
>> Best,
>> Stephan
>>
>>
>> On Wed, Feb 13, 2019 at 6:47 PM Rong Rong <walter...@gmail.com> wrote:
>>
>>> Thanks Stephan for the great proposal.
>>>
>>> This would not only be beneficial for new users but also for
>>> contributors who want to keep track of all upcoming features.
>>>
>>> I think that better window operator support can also be grouped
>>> separately into its own category, as it affects both the future DataStream
>>> API and batch/stream unification.
>>> Can we also include:
>>> - OVER aggregate for the DataStream API separately, as @jincheng suggested.
>>> - Improving the sliding window operator [1]
>>>
>>> One additional suggestion: can we also include the more extensible
>>> security module [2,3] that @shuyi and I are currently working on?
>>> This would significantly improve the usability of Flink in corporate
>>> environments where proprietary or 3rd-party security integration is needed.
>>>
>>> Thanks,
>>> Rong
>>>
>>>
>>> [1]
>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improvement-to-Flink-Window-Operator-with-Slicing-td25750.html
>>> [2]
>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html
>>> [3]
>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Kerberos-Improvement-td25983.html
>>>
>>> On Wed, Feb 13, 2019 at 3:39 AM jincheng sun <sunjincheng...@gmail.com>
>>> wrote:
>>>
>>>> Very excited, and thank you for launching such a great discussion,
>>>> Stephan!
>>>>
>>>> Just one small suggestion: in the Batch Streaming Unification section, do
>>>> we need to add an item:
>>>>
>>>> - Same window operators on bounded/unbounded Table API and DataStream API
>>>> (currently the OVER window only exists in SQL/Table API; the DataStream
>>>> API does not support it yet)
>>>>
>>>> Best,
>>>> Jincheng
>>>>
>>>> Stephan Ewen <se...@apache.org> wrote on Wed, Feb 13, 2019 at 7:21 PM:
>>>>
>>>>> Hi all!
>>>>>
>>>>> Recently, several contributors, committers, and users asked about making
>>>>> it more visible in which direction the project is currently going.
>>>>>
>>>>> Users and developers can track the direction by following the discussion
>>>>> threads and JIRA, but due to the sheer number of discussions and open
>>>>> issues, it is very hard to get a good overall picture.
>>>>> Especially for new users and contributors, it is very hard to get a
>>>>> quick overview of the project direction.
>>>>>
>>>>> To fix this, I suggest adding a brief roadmap summary to the homepage.
>>>>> It is a bit of a commitment to keep that roadmap up to date, but I think
>>>>> the benefit for users justifies it.
>>>>> The Apache Beam project has added such a roadmap [1], which was received
>>>>> very well by the community. I would suggest following a similar structure
>>>>> here.
>>>>>
>>>>> If the community is in favor of this, I would volunteer to write a
>>>>> first version of such a roadmap. The points I would include are below.
>>>>>
>>>>> Best,
>>>>> Stephan
>>>>>
>>>>> [1] https://beam.apache.org/roadmap/
>>>>>
>>>>> ========================================================
>>>>>
>>>>> Disclaimer: Apache Flink is not governed or steered by any single
>>>>> entity, but by its community and Project Management Committee (PMC). This
>>>>> is not an authoritative roadmap in the sense of a plan with a specific
>>>>> timeline. Instead, we share our vision for the future and the major
>>>>> initiatives that are receiving attention, to give users and contributors
>>>>> an understanding of what they can look forward to.
>>>>>
>>>>> *Future Role of Table API and DataStream API*
>>>>>   - Table API becomes a first-class citizen
>>>>>   - Table API becomes primary API for analytics use cases
>>>>>       * Declarative, automatic optimizations
>>>>>       * No manual control over state and timers
>>>>>   - DataStream API becomes primary API for applications and data
>>>>> pipeline use cases
>>>>>       * Physical, user controls data types, no magic or optimizer
>>>>>       * Explicit control over state and time (see the sketch below)
>>>>>
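>>>>> To make the intended split concrete, here is a minimal, purely
>>>>> illustrative sketch in Java. The Click type, the clicks DataStream, the
>>>>> registered "clicks" table, and tableEnv are assumptions made up for this
>>>>> example, not part of the proposal. The Table API program is declarative
>>>>> and leaves state handling to the planner, while the DataStream program
>>>>> declares and updates its keyed state explicitly.
>>>>>
>>>>> import org.apache.flink.api.common.state.ValueState;
>>>>> import org.apache.flink.api.common.state.ValueStateDescriptor;
>>>>> import org.apache.flink.configuration.Configuration;
>>>>> import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
>>>>> import org.apache.flink.table.api.Table;
>>>>> import org.apache.flink.util.Collector;
>>>>>
>>>>> // Table API: declarative aggregation, the planner decides how state is kept.
>>>>> Table totals = tableEnv
>>>>>     .scan("clicks")                    // assumes a registered "clicks" table
>>>>>     .groupBy("user")
>>>>>     .select("user, amount.sum as total");
>>>>>
>>>>> // DataStream API: the user declares and updates keyed state explicitly.
>>>>> clicks                                 // assumes a DataStream<Click>
>>>>>     .keyBy(click -> click.user)
>>>>>     .process(new KeyedProcessFunction<String, Click, Long>() {
>>>>>         private transient ValueState<Long> total;
>>>>>
>>>>>         @Override
>>>>>         public void open(Configuration parameters) {
>>>>>             // state is created and owned by the user code
>>>>>             total = getRuntimeContext().getState(
>>>>>                 new ValueStateDescriptor<>("total", Long.class));
>>>>>         }
>>>>>
>>>>>         @Override
>>>>>         public void processElement(Click click, Context ctx, Collector<Long> out)
>>>>>                 throws Exception {
>>>>>             long current = total.value() == null ? 0L : total.value();
>>>>>             total.update(current + click.amount);
>>>>>             out.collect(current + click.amount);
>>>>>         }
>>>>>     });
>>>>>
>>>>> The method names roughly follow today's APIs; the point is only the
>>>>> contrast in who controls state, not a finalized interface.
>>>>>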
>>>>> *Batch Streaming Unification*
>>>>>   - Table API unification (environments) (FLIP-32)
>>>>>   - New unified source interface (FLIP-27)
>>>>>   - Runtime operator unification & code reuse between DataStream /
>>>>> Table
>>>>>   - Extending the Table API to make it a convenient API for all
>>>>> analytical use cases (easier mixing-in of UDFs)
>>>>>   - Same join operators on bounded/unbounded Table API and DataStream
>>>>> API
>>>>>
>>>>> *Faster Batch (Bounded Streams)*
>>>>>   - Much of this comes via Blink contribution/merging
>>>>>   - Fine-grained Fault Tolerance on bounded data (Table API)
>>>>>   - Batch Scheduling on bounded data (Table API)
>>>>>   - External Shuffle Services Support on bounded streams
>>>>>   - Caching of intermediate results on bounded data (Table API)
>>>>>   - Extending DataStream API to explicitly model bounded streams (API
>>>>> breaking)
>>>>>   - Add fine-grained fault tolerance, scheduling, and caching to the
>>>>> DataStream API as well
>>>>>
>>>>> *Streaming State Evolution*
>>>>>   - Let all built-in serializers support stable evolution
>>>>>   - First class support for other evolvable formats (Protobuf, Thrift)
>>>>>   - Savepoint input/output format to modify / adjust savepoints
>>>>>
>>>>> *Simpler Event Time Handling*
>>>>>   - Event Time Alignment in Sources
>>>>>   - Simpler out-of-the-box support in sources
>>>>>
>>>>> *Checkpointing*
>>>>>   - Consistency of Side Effects: suspend / end with savepoint (FLIP-34)
>>>>>   - Failed checkpoints explicitly aborted on TaskManagers (not only on
>>>>> the coordinator)
>>>>>
>>>>> *Automatic scaling (adjusting parallelism)*
>>>>>   - Reactive scaling
>>>>>   - Active scaling policies
>>>>>
>>>>> *Kubernetes Integration*
>>>>>   - Active Kubernetes Integration (Flink actively manages containers)
>>>>>
>>>>> *SQL Ecosystem*
>>>>>   - Extended Metadata Stores / Catalog / Schema Registries support
>>>>>   - DDL support
>>>>>   - Integration with Hive Ecosystem
>>>>>
>>>>> *Simpler Handling of Dependencies*
>>>>>   - Scala in the APIs, but not in the core (hide in separate class
>>>>> loader)
>>>>>   - Hadoop-free by default
>>>>>
>>>>>
>
> --
> Best Regards
>
> Jeff Zhang
>
