Re: [DISCUSS] FLIP-398: Improve Serialization Configuration And Usage In Flink

Xintong Song Sun, 17 Dec 2023 19:16:47 -0800

Hi Ken,

I think the main purpose of this FLIP is to change how users interact with
the knobs for customizing the serialization behaviors, from requiring code
changes to working with pure configurations. Redesigning the knobs (i.e.,
names, semantics, etc.), on the other hand, is not the purpose of this
FLIP. Preserving the existing names and semantics should also help minimize
the migration cost for existing users. Therefore, I'm in favor of not
changing them.


Concerning decoupling from Kryo, and introducing other serialization
frameworks like Fury, I think that's a bigger topic that is worth further
discussion. At the moment, I'm not aware of any community consensus on
doing so. And even if in the future we decide to do so, the changes needed
should be the same w/ or w/o this FLIP. So I'd suggest not to block this
FLIP on these issues.

WDYT?

Best,

Xintong



On Fri, Dec 15, 2023 at 1:40 AM Ken Krugler <[email protected]>
wrote:

> Hi Yong,
>
> Looks good, thanks for creating this.
>
> One comment - related to my recent email about Fury, I would love to see
> the v2 serialization decoupled from Kryo.
>
> As part of that, instead of using xxxKryo in methods, call them xxxGeneric.
>
> A more extreme change would be to totally rely on Fury (so no more POJO
> serializer). Fury is faster than the POJO serializer in my tests, but this
> would be a much bigger change.
>
> Though it could dramatically simplify the Flink serialization support.
>
> — Ken
>
> PS - a separate issue is how to migrate state from Kryo to something like
> Fury, which supports schema evolution. I think this might be possible, by
> having a smarter deserializer that identifies state as being created by
> Kryo, and using (shaded) Kryo to deserialize, while still writing as Fury.
>
> > On Dec 6, 2023, at 6:35 PM, Yong Fang <[email protected]> wrote:
> >
> > Hi devs,
> >
> > I'd like to start a discussion about FLIP-398: Improve Serialization
> > Configuration And Usage In Flink [1].
> >
> > Currently, users can register custom data types and serializers in Flink
> > jobs through various methods, including registration in code,
> > configuration, and annotations. These lead to difficulties in upgrading
> > Flink jobs and priority issues.
> >
> > In flink-2.0 we would like to manage job data types and serializers
> > through configurations. This FLIP will introduce a unified option for
> > data type and serializer and users can configure all custom data types
> > and pojo/kryo/custom serializers. In addition, this FLIP will add more
> > built-in serializers for complex data types such as List and Map, and
> > optimize the management of Avro Serializers.
> >
> > Looking forward to hearing from you, thanks!
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-398%3A+Improve+Serialization+Configuration+And+Usage+In+Flink
> >
> > Best,
> > Fang Yong
>
> --------------------------
> Ken Krugler
> http://www.scaleunlimited.com
> Custom big data solutions
> Flink & Pinot
