Hi, I've been started to use Apache Beam not so long ago (so bear with me please) and I have a question about Beam coders and how are they related with under the hood runner serializer... Let me explain myself better:
As far I've understood from Beam documentation, coders are being used in order to describe how PCollection elements are going to be serialized/deserialized when needed. Is that correct? In the other hand, we have the runner, which is executing the application under the hood, and needs to send elements among different workers, drivers, whatever other thing... And it seems is using its proper serializer, not Beam coder defined for that type. Is that correct? I'm asking this because, in our case, we are using Flink, and we are having issues trying to restore checkpoints because it seems to be using standard Java serialization. So, first thing we would like to know is how coders are related with runner's serializer.. Is Beam serializing elements and then the runner is serializing again the result or what? And the other question is how we could get rid of Java serializer when using Flink as runner, because we are getting following error when trying to restore a checkpoint with backward compatible changes: Could not Java-deserialize TypeSerializer while restoring checkpoint metadata for serializer snapshot 'org.apache.beam.runners.flink.translation.types.CoderTypeSerializer$Le gacySnapshot'. Please update to the TypeSerializerSnapshot interface that removes Java Serialization to avoid this problem in the future. But we don't know how to tell Beam to tell Flink to use TypeSerializerSnapshot as serializer, to be honest. Thank you. Regards Este correo electrónico y sus adjuntos son de naturaleza confidencial. A no ser que usted sea el destinatario, no puede utilizar, copiar o desvelar tanto el mensaje como cualquier información contenida en el mensaje. Si no es el destinatario, debe borrar este correo y notificar al remitente inmediatamente. Cualquier punto de vista u opinión expresada en este correo electrónico son únicamente del remitente, a no ser que se indique lo contrario. Todos los derechos de autor en cualquier material de este correo son reservados. Todos los correos electrónicos, salientes o entrantes, pueden ser grabados y monitorizados para uso legítimo del negocio. Nos encontramos exentos de toda responsabilidad ante cualquier perdida o daño que surja o resulte de la recepción, uso o transmisión de este correo electrónico hasta el máximo permitido por la ley. This email and any attachment to it are confidential. Unless you are the intended recipient, you may not use, copy or disclose either the message or any information contained in the message. If you are not the intended recipient, you should delete this email and notify the sender immediately. Any views or opinions expressed in this email are those of the sender only, unless otherwise stated. All copyright in any of the material in this email is reserved. All emails, incoming and outgoing, may be recorded and monitored for legitimate business purposes. We exclude all liability for any loss or damage arising or resulting from the receipt, use or transmission of this email to the fullest extent permitted by law.
