Hi Avi,

I'm not entirely sure I understand the question. Let's say you have sources A, B, and C, all with different schemas, but all of them containing an id field. You could use the ParquetMapInputFormat, which provides each record as a map, and then filter with a plain map lookup.
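Roughly something like this (an untested sketch against Flink 1.12; the path and object name are made up, and since ParquetMapInputFormat's constructor still wants a Parquet MessageType, I'm reading it from the file footer here rather than hand-writing it):

  import org.apache.flink.api.common.io.InputFormat
  import org.apache.flink.api.scala._
  import org.apache.flink.core.fs.{Path => FlinkPath}
  import org.apache.flink.formats.parquet.ParquetMapInputFormat
  import org.apache.hadoop.conf.Configuration
  import org.apache.hadoop.fs.{Path => HadoopPath}
  import org.apache.parquet.hadoop.ParquetFileReader
  import org.apache.parquet.hadoop.util.HadoopInputFile

  object FilterParquetById {
    def main(args: Array[String]): Unit = {
      val env = ExecutionEnvironment.getExecutionEnvironment
      val file = "hdfs:///data/source-a/part-0.parquet" // made-up path

      // Read the schema from the Parquet footer so we don't have to
      // hard-code a MessageType per source.
      val reader = ParquetFileReader.open(
        HadoopInputFile.fromPath(new HadoopPath(file), new Configuration()))
      val schema =
        try reader.getFooter.getFileMetaData.getSchema
        finally reader.close()

      // Each record comes back as a java.util.Map, so the filter is a
      // plain key lookup and no converter has to be implemented. The cast
      // is needed because the format uses a raw Map type on the Java side.
      val format = new ParquetMapInputFormat(new FlinkPath(file), schema)
        .asInstanceOf[InputFormat[java.util.Map[String, AnyRef], _]]

      env.createInput(format)
        .filter(m => m.get("id") != "123") // assumes id is read as a String
        .print()
    }
  }

The map lookup avoids implementing a converter, but to write the records back "as is" you would still need a writer schema per source.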
However, I'm not sure how you want to write these records with different schemas into the same Parquet file. Maybe you just want to extract the common fields of A, B, and C? Then you could also use the Table API and declare only the fields that are common, along the lines of the sketch below. Or do you have sinks A, B, and C and actually 3 separate topologies?
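A rough sketch (again untested; table name and path are invented, and this assumes Flink 1.11+ with the filesystem connector and the Parquet format):

  import org.apache.flink.table.api.{EnvironmentSettings, TableEnvironment}

  object FilterWithTableApi {
    def main(args: Array[String]): Unit = {
      val tEnv = TableEnvironment.create(
        EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build())

      // Declare only the fields that A, B, and C have in common.
      tEnv.executeSql(
        """CREATE TABLE raw_events (
          |  id STRING
          |) WITH (
          |  'connector' = 'filesystem',
          |  'path' = 'hdfs:///data/events',
          |  'format' = 'parquet'
          |)""".stripMargin)

      // Equivalent of the Spark one-liner in your mail.
      tEnv.executeSql("SELECT * FROM raw_events WHERE id <> '123'").print()
    }
  }

Since the Parquet format resolves columns by name, the extra fields in the individual files are simply not read. Note, though, that only the declared columns survive, so this does not write the records "as is" either.

On Wed, Mar 10, 2021 at 10:50 AM Avi Levi <a...@theneura.com> wrote:

> Hi all,
> I am trying to filter lines from parquet files. The problem is that they
> have different schemas; however, the field that I am using to filter
> exists in all schemas.
> In Spark this is quite straightforward:
>
> *val filtered = rawsDF.filter(col("id") != "123")*
>
> I tried to do it in Flink by extending the ParquetInputFormat, but in
> this case I need the schema (message type) and have to implement the
> convert method, which I want to avoid since I do not want to convert the
> line (I want to write it as is to another parquet file).
>
> Any ideas?
>
> Cheers
> Avi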