derjust commented on issue #8371: [FLINK-9679] - Add AvroSerializationSchema URL: https://github.com/apache/flink/pull/8371#issuecomment-548919899 I used this PR fairly successful over a week producing ~50 mil messages in total. There is only one catch in the PR that needs to be fixed: https://github.com/apache/flink/pull/8371/files#diff-c60754c7ce564f4229c22913d783c339R128 ```java int schemaId= schemaCoderProvider.get() .writeSchema(getSchema()); ``` This creates a new http client each time a message is serialized which creates a significant overhead due to establishing the HTTP connections but also never leveraging the schema caching The `schemaCoderProvider.get()` should be cached/kept local after it has been fetched and used for all subsequent `serialize()` calls. In fact, it should be done in the same fashion as in the existing source implementation: ```java @Override protected void checkAvroInitialized() { super.checkAvroInitialized(); if (schemaCoder == null) { this.schemaCoder = schemaCoderProvider.get(); } ```
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
