Hi Ivan,

Beam does not use Java serialization for checkpoint data. It uses Beam
coders which are wrapped in Flink's TypeSerializers. That said, Beam
does not support serializer migration yet.

I'm curious, what do you consider a "backwards-compatible" change? If
you are attempting to upgrade the snapshot from one Beam version to
another, there is a high change this won't work due to coder changes
introduced between two Beam releases.

Best,
Max

PS:
This blog post might shed some more light on the matter:
https://flink.apache.org/ecosystem/2020/02/22/apache-beam-how-beam-runs-on-top-of-flink.html

On 19.05.20 13:45, Ivan San Jose wrote:
> Hi, I've been started to use Apache Beam not so long ago (so bear with
> me please) and I have a question about Beam coders and how are they
> related with under the hood runner serializer... Let me explain myself
> better:
> 
> As far I've understood from Beam documentation, coders are being used
> in order to describe how PCollection elements are going to be
> serialized/deserialized when needed. Is that correct?
> 
> In the other hand, we have the runner, which is executing the
> application under the hood, and needs to send elements among different
> workers, drivers, whatever other thing... And it seems is using its
> proper serializer, not Beam coder defined for that type. Is that
> correct?
> 
> I'm asking this because, in our case, we are using Flink, and we are
> having issues trying to restore checkpoints because it seems to be
> using standard Java serialization.
> 
> So, first thing we would like to know is how coders are related with
> runner's serializer.. Is Beam serializing elements and then the runner
> is serializing again the result or what?
> 
> And the other question is how we could get rid of Java serializer when
> using Flink as runner, because we are getting following error when
> trying to restore a checkpoint with backward compatible changes:
> 
> Could not Java-deserialize TypeSerializer while restoring checkpoint
> metadata for serializer snapshot
> 'org.apache.beam.runners.flink.translation.types.CoderTypeSerializer$Le
> gacySnapshot'. Please update to the TypeSerializerSnapshot interface
> that removes Java Serialization to avoid this problem in the future.
> 
> But we don't know how to tell Beam to tell Flink to use
> TypeSerializerSnapshot as serializer, to be honest.
> 
> Thank you. Regards
> 
> 
> Este correo electrónico y sus adjuntos son de naturaleza confidencial. A no 
> ser que usted sea el destinatario, no puede utilizar, copiar o desvelar tanto 
> el mensaje como cualquier información contenida en el mensaje. Si no es el 
> destinatario, debe borrar este correo y notificar al remitente 
> inmediatamente. Cualquier punto de vista u opinión expresada en este correo 
> electrónico son únicamente del remitente, a no ser que se indique lo 
> contrario. Todos los derechos de autor en cualquier material de este correo 
> son reservados. Todos los correos electrónicos, salientes o entrantes, pueden 
> ser grabados y monitorizados para uso legítimo del negocio. Nos encontramos 
> exentos de toda responsabilidad ante cualquier perdida o daño que surja o 
> resulte de la recepción, uso o transmisión de este correo electrónico hasta 
> el máximo permitido por la ley.
> 
> This email and any attachment to it are confidential. Unless you are the 
> intended recipient, you may not use, copy or disclose either the message or 
> any information contained in the message. If you are not the intended 
> recipient, you should delete this email and notify the sender immediately. 
> Any views or opinions expressed in this email are those of the sender only, 
> unless otherwise stated. All copyright in any of the material in this email 
> is reserved. All emails, incoming and outgoing, may be recorded and monitored 
> for legitimate business purposes. We exclude all liability for any loss or 
> damage arising or resulting from the receipt, use or transmission of this 
> email to the fullest extent permitted by law.
> 

Reply via email to