Flink needs to know up front what kinds of types it deals with in order to set up the serialization stack between operators.

As such, generally speaking, you will have to use some generic container for transmitting data (e.g., a String or a Jackson ObjectNode) and either work on those directly or map them to specific types /within the scope of a single function/ based on some custom logic.
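To illustrate the generic-container approach, here is a minimal sketch (assuming Jackson is on the classpath): events stay as raw JSON strings, and the per-record logic parses them dynamically and probes only the shared fields, so nothing about the 300+ schemas needs to be known at compile time. In a real job this logic would live inside a single Flink function (e.g., a MapFunction or FlatMapFunction); the field name "eventType" is an assumption for the example.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

// Hypothetical sketch: treat every event as opaque JSON and inspect only the
// few fields shared across schemas, instead of modeling each type as a POJO.
public class EventProbe {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Returns the value of an assumed shared field, or null if it is absent.
    public static String eventType(String rawJson) throws Exception {
        JsonNode node = MAPPER.readTree(rawJson); // dynamic parse, no compile-time schema
        JsonNode type = node.get("eventType");    // "eventType" is an assumed field name
        return type == null ? null : type.asText();
    }
}
```

Because the function's input and output are plain Strings (or ObjectNode), Flink's serialization stack stays simple; all schema-specific handling is confined to the body of this one function.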

There may be other approaches, but we'd need to know more about the specific use case and requirements to help you there (e.g., what does /your user/ interact with?). My understanding is that you have a single source for all these events, and you now want a user to be able to define a pipeline that processes a specific subset of them?

On 1/28/2021 5:44 AM, Devin Bost wrote:
I want to know whether it's possible in Flink to parse strings into a dynamic JSON object without having to know the primitive type details at compile time. We have over 300 event types to process, and I need a way to load the types at runtime. I only need to know whether certain fields exist on the incoming objects, and the object schemas are all different except for certain shared fields. Every example I can find shows Flink users specifying the full type information at compile time, but there's no way that will scale.

It's possible for us to look up primitive type details at runtime from the JSON, but I'll still need a way to process that JSON in Flink to extract the metadata when it's required. So that brings me back to the original issue.

How can I do this in Flink?

--
Devin G. Bost
