Re: major reduction is performance when using schema registry - KafkaIO

2023-04-13 Thread Alexey Romanenko
Thanks for testing this! It requires some additional investigations, so I created an issue for that: https://github.com/apache/beam/issues/26262 Feel free to add more details if you have there. — Alexey > On 13 Apr 2023, at 12:45, Sigalit Eliazov wrote: > > I have made the suggested change

Re: major reduction is performance when using schema registry - KafkaIO

2023-04-13 Thread Sigalit Eliazov
I have made the suggested change and used ConfluentSchemaRegistryDeserializerProvider the results are slightly better.. average of 8000 msg/sec Thank you both for your response and i'll appreciate if you can keep me in the loop in the planned work with kafka schema or let me know if i can assist

Re: major reduction is performance when using schema registry - KafkaIO

2023-04-12 Thread Alexey Romanenko
Mine was the similar but "org.apache.beam.sdk.io.kafka,ConfluentSchemaRegistryDeserializerProvider" is leveraging “io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient” that I guessed should reduce this potential impact. — Alexey > On 12 Apr 2023, at 17:36, John Casey via user

Re: major reduction is performance when using schema registry - KafkaIO

2023-04-12 Thread John Casey via user
My initial guess is that there are queries being made in order to retrieve the schemas, which would impact performance, especially if those queries aren't cached with Beam splitting in mind. I'm looking to improve our interaction with Kafka schemas in the next couple of quarters, so I'll keep

Re: major reduction is performance when using schema registry - KafkaIO

2023-04-11 Thread Alexey Romanenko
I don’t have an exact answer why it’s so much slower for now (only some guesses but it requires some profiling), though could you try to test the same Kafka read but with “ConfluentSchemaRegistryDeserializerProvider” instead of KafkaAvroDeserializer and AvroCoder? More details and an example

Re: major reduction is performance when using schema registry - KafkaIO

2023-04-09 Thread Sigalit Eliazov
hi, KafkaIO.read() .withBootstrapServers(bootstrapServers) .withTopic(topic) .withConsumerConfigUpdates(Map.ofEntries( Map.entry("schema.registry.url", registryURL), Map.entry(ConsumerConfig.GROUP_ID_CONFIG, consumerGroup+

Re: major reduction is performance when using schema registry - KafkaIO

2023-04-09 Thread Reuven Lax via user
How are you using the schema registry? Do you have a code sample? On Sun, Apr 9, 2023 at 3:06 AM Sigalit Eliazov wrote: > Hello, > > I am trying to understand the effect of schema registry on our pipeline's > performance. In order to do sowe created a very simple pipeline that reads > from

major reduction is performance when using schema registry - KafkaIO

2023-04-09 Thread Sigalit Eliazov
Hello, I am trying to understand the effect of schema registry on our pipeline's performance. In order to do sowe created a very simple pipeline that reads from kafka, runs a simple transformation of adding new field and writes of kafka. the messages are in avro format I ran this pipeline with