Re: [DISCUSS] FLIP-398: Improve Serialization Configuration And Usage In Flink

Yong Fang Wed, 20 Dec 2023 05:13:22 -0800

Hi Ken,

Thanks for your feedback. The purpose of this FLIP is to improve how
serialization is used in Flink: letting users configure serializers,
providing serializers for composite data types, resolving the problem of
Kryo being enabled by default, and so on. Introducing a better serialization
framework would be a great help for Flink's performance, and it's great to
see your tests on Fury. However, as @Xintong mentioned, that would be a
large piece of work and is beyond the scope of this FLIP. If you're
interested, I think we could create a new FLIP for it and discuss it
further. What do you think? Thanks.


Best,
Fang Yong

On Mon, Dec 18, 2023 at 11:16 AM Xintong Song <tonysong...@gmail.com> wrote:

> Hi Ken,
>
> I think the main purpose of this FLIP is to change how users interact with
> the knobs for customizing the serialization behaviors, from requiring code
> changes to working with pure configurations. Redesigning the knobs (i.e.,
> names, semantics, etc.), on the other hand, is not the purpose of this
> FLIP. Preserving the existing names and semantics should also help minimize
> the migration cost for existing users. Therefore, I'm in favor of not
> changing them.
>
> Concerning decoupling from Kryo, and introducing other serialization
> frameworks like Fury, I think that's a bigger topic that is worth further
> discussion. At the moment, I'm not aware of any community consensus on
> doing so. And even if in the future we decide to do so, the changes needed
> should be the same w/ or w/o this FLIP. So I'd suggest not to block this
> FLIP on these issues.
>
> WDYT?
>
> Best,
>
> Xintong
>
>
>
> On Fri, Dec 15, 2023 at 1:40 AM Ken Krugler <kkrugler_li...@transpac.com>
> wrote:
>
> > Hi Yong,
> >
> > Looks good, thanks for creating this.
> >
> > One comment - related to my recent email about Fury, I would love to see
> > the v2 serialization decoupled from Kryo.
> >
> > As part of that, instead of using xxxKryo in methods, call them
> > xxxGeneric.
> >
> > A more extreme change would be to totally rely on Fury (so no more POJO
> > serializer). Fury is faster than the POJO serializer in my tests, but this
> > would be a much bigger change.
> >
> > Though it could dramatically simplify the Flink serialization support.
> >
> > — Ken
> >
> > PS - a separate issue is how to migrate state from Kryo to something like
> > Fury, which supports schema evolution. I think this might be possible, by
> > having a smarter deserializer that identifies state as being created by
> > Kryo, and using (shaded) Kryo to deserialize, while still writing as Fury.
> >
> > > On Dec 6, 2023, at 6:35 PM, Yong Fang <zjur...@gmail.com> wrote:
> > >
> > > Hi devs,
> > >
> > > I'd like to start a discussion about FLIP-398: Improve Serialization
> > > Configuration And Usage In Flink [1].
> > >
> > > Currently, users can register custom data types and serializers in Flink
> > > jobs through various methods, including registration in code,
> > > configuration, and annotations. These lead to difficulties in upgrading
> > > Flink jobs and priority issues.
> > >
> > > In flink-2.0 we would like to manage job data types and serializers
> > > through configurations. This FLIP will introduce a unified option for
> > > data type and serializer and users can configure all custom data types
> > > and pojo/kryo/custom serializers. In addition, this FLIP will add more
> > > built-in serializers for complex data types such as List and Map, and
> > > optimize the management of Avro Serializers.
> > >
> > > Looking forward to hearing from you, thanks!
> > >
> > > [1]
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-398%3A+Improve+Serialization+Configuration+And+Usage+In+Flink
> > >
> > > Best,
> > > Fang Yong
> >
> > --------------------------
> > Ken Krugler
> > http://www.scaleunlimited.com
> > Custom big data solutions
> > Flink & Pinot
> >
> >
> >
> >
>
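
For reference, the idea in Ken's PS (identify state that was written by Kryo,
read it with a shaded Kryo, and always write in the new format) could be
sketched roughly as follows. This is only an illustration under assumptions
that are not stated in the thread: the class name, the Format enum, and the
three function parameters are all hypothetical, the format of each piece of
state is assumed to be known from metadata stored alongside it, and plain
functions stand in for the Kryo and Fury code paths. It is not a Flink, Kryo,
or Fury API.

```java
import java.util.function.Function;

/**
 * Hypothetical sketch of a "migrating" serializer: it can read state written
 * by a legacy framework, but always writes with the current one.
 * None of these names come from Flink, Kryo, or Fury.
 */
final class MigratingSerializer<T> {

    /** Which framework wrote the bytes; assumed to be known from stored metadata. */
    enum Format { LEGACY, CURRENT }

    private final Function<byte[], T> legacyReader;   // stand-in for a shaded Kryo read path
    private final Function<byte[], T> currentReader;  // stand-in for the new framework's read path
    private final Function<T, byte[]> currentWriter;  // stand-in for the new framework's write path

    MigratingSerializer(
            Function<byte[], T> legacyReader,
            Function<byte[], T> currentReader,
            Function<T, byte[]> currentWriter) {
        this.legacyReader = legacyReader;
        this.currentReader = currentReader;
        this.currentWriter = currentWriter;
    }

    /** Deserialize with whichever reader matches the format the bytes were written in. */
    T read(Format writtenWith, byte[] bytes) {
        return writtenWith == Format.LEGACY
                ? legacyReader.apply(bytes)
                : currentReader.apply(bytes);
    }

    /** Always serialize with the current framework, never the legacy one. */
    byte[] write(T value) {
        return currentWriter.apply(value);
    }

    /** Read old or new bytes and re-emit them in the current format. */
    byte[] migrate(Format writtenWith, byte[] bytes) {
        return write(read(writtenWith, bytes));
    }
}
```

With this shape, legacy state would be converted lazily: each value is read
with the legacy reader once and is in the new format from the next write
onward. How the writing framework is actually detected is the open question
and is left out of the sketch.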
