Hello, I have a BigQuery table which ingests a Protobuf stream from Kafka with a Beam pipeline. The Protobuf has a `log Map<String,String>` column which translates to a field "log" of type RECORD with unknown fields in BigQuery.
So I scanned my whole stream to know which schema fields to expect and created an empty daily-partitioned table in BigQuery with correct fields for this RECORD type. Now someone is pushing a new field into this RECORD and the import fails as the schema is not matching anymore. My pipeline is configured to use FILE_LOADS and so these do fail - but the Flink pipeline just continues and does not throw any error. My Beam version is 2.10-SNAPSHOT running on Flink 1.6, my two questions are: 1. How can I make the pipeline fail hard in this situation? 2. How can I prevent this from happening? If I create the table with the correct schema in the beginning the table schema seems to be overwritten and autodetected again (without the added field). This is causing the load to fail. If I alter the schema and add the field manually when the pipeline is running it fixes it, but then I already have a dozen of failed imports. The main issue seems to be that I have a Protobuf schema which only defines a Map<String,String> type, that is not specific enough to create the full matching schema from it.
