Would a custom Kryo serializer that uses the coders to perform
serialization help?

There are various ways Kryo lets you attach such a serializer without full
surgery, including @Bind at the field level or @DefaultSerializer at the
class level.
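
A minimal sketch of such a coder-backed Kryo serializer (the CoderSerializer
name and the exact wiring are illustrative, not from Beam; it relies on
Kryo's Output/Input extending OutputStream/InputStream and on Beam's
Coder#encode/#decode stream overloads):

```java
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.Serializer;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import org.apache.beam.sdk.coders.Coder;
import java.io.IOException;

/** Hypothetical Kryo serializer that delegates to a Beam Coder. */
public class CoderSerializer<T> extends Serializer<T> {
  private final Coder<T> coder;

  public CoderSerializer(Coder<T> coder) {
    this.coder = coder;
  }

  @Override
  public void write(Kryo kryo, Output output, T value) {
    try {
      // Kryo's Output is an OutputStream, so the coder can write to it directly.
      coder.encode(value, output);
    } catch (IOException e) {
      throw new RuntimeException("Coder failed to encode value", e);
    }
  }

  @Override
  public T read(Kryo kryo, Input input, Class<T> type) {
    try {
      // Kryo's Input is an InputStream, so the coder can read from it directly.
      return coder.decode(input);
    } catch (IOException e) {
      throw new RuntimeException("Coder failed to decode value", e);
    }
  }
}
```

One could then register it per class, e.g. `kryo.register(WindowedValue.class,
new CoderSerializer<>(windowedValueCoder))`, so Spark's cache path goes through
the coder instead of FieldSerializer.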

Thanks,
Thomas


On Thu, Aug 24, 2017 at 9:57 AM, Lukasz Cwik <lc...@google.com.invalid>
wrote:

> How does Kryo's FieldSerializer fail on WindowedValue/PaneInfo? Those seem
> like pretty simple types.
>
> Also, I don't see why they can't be tagged with Serializable, but I think
> the original reasoning was that coders should be used to guarantee a stable
> representation across JVM versions.
>
> On Thu, Aug 24, 2017 at 5:38 AM, Kobi Salant <kobi.sal...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > I am working on Spark runner issue
> > https://issues.apache.org/jira/browse/BEAM-2669
> > "Kryo serialization exception when DStreams containing
> > non-Kryo-serializable data are cached"
> >
> > Currently, the Spark runner enforces the Kryo serializer, and in shuffling
> > scenarios we always use coders to transfer bytes over the network. But
> > Spark can also serialize data for caching/persisting, and currently it
> uses
> > Kryo for this.
> >
> > Today, when the user uses a class that is not Kryo-serializable, the
> > caching fails, and we thought to open up the option of using Java
> > serialization as a fallback.
> >
> > The RDDs/DStreams behind our PCollections are usually typed
> > RDD/DStream<WindowedValue<InputT>>,
> > and when Spark tries to Java-serialize them for caching purposes it fails
> > because WindowedValue is not Java-serializable.
> >
> > Is there any objection to having SDK classes like WindowedValue, PaneInfo
> > and others implement Serializable?
> >
> > Again, I want to emphasise that coders are a big part of the Spark runner
> > code and we do use them whenever a shuffle is expected. Changing the
> types
> > of the RDDs/DStreams to byte[] would make the code pretty unreadable and
> > would weaken type-safety checks.
> >
> > Thanks
> > Kobi
> >
>
