A quick follow-on: I think you could mimic the concept of a schema registry
by implementing some basic forward-compatibility tests for the data schema
and running them either as pre-commit hooks or in whatever CI/CD
framework is currently in place.
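As a rough sketch of what such a test might look like (the field names and
schema shape below are illustrative assumptions, not UserAle's actual
payload definition): compare the current schema against the last released
version and fail if any previously published field was removed or retyped.

```typescript
// Hypothetical forward-compatibility check for a payload schema.
// The schemas here are stand-ins; in practice the "previous" schema
// would be loaded from a checked-in JSON file or a registry.

type FieldType = "string" | "number" | "boolean" | "object";
type Schema = Record<string, FieldType>;

// Last released schema (illustrative fields).
const previousSchema: Schema = {
  clientTime: "number",
  type: "string",
  target: "string",
};

// Schema on the development branch. Adding a field is fine;
// removing or retyping one is a breaking change for consumers.
const currentSchema: Schema = {
  clientTime: "number",
  type: "string",
  target: "string",
  pageUrl: "string", // additive change: forward-compatible
};

// Every previously published field must still exist with the same type.
function isForwardCompatible(prev: Schema, curr: Schema): string[] {
  const errors: string[] = [];
  for (const [field, type] of Object.entries(prev)) {
    if (!(field in curr)) {
      errors.push(`field removed: ${field}`);
    } else if (curr[field] !== type) {
      errors.push(`type changed: ${field} (${type} -> ${curr[field]})`);
    }
  }
  return errors;
}

const errors = isForwardCompatible(previousSchema, currentSchema);
if (errors.length > 0) {
  // A nonzero exit / thrown error is what makes the hook fail the commit.
  throw new Error(`Schema is not forward-compatible: ${errors.join(", ")}`);
}
console.log("Schema is forward-compatible");
```

Run under a pre-commit hook or CI, a failure here blocks the change until
the schema version is bumped or the breaking edit is reverted.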

Best

Evan Jones
Website: www.ea-jones.com


On Fri, Mar 17, 2023 at 12:46 PM Evan Jones <[email protected]> wrote:

> A few questions:
> 1. Is the primary intention to track schema on the development side to
> minimize accidentally breaking user pipelines (forward-compatibility)?
> 2. Do you want to have a typed and versioned schema that is documented
> online somewhere for users?
> 3. Do you want run-time validation of the schema or would you rather
> offload validation onto users?
> 4. How big of a priority is keeping the dependency footprint small?
>
> Typescript will solve #1. I know it's a fairly large refactor, but usually
> you can type only the parts of the code base you really care about (here,
> the payload schema), keep everything else generic, and increase type
> coverage over time. There are libraries written to track TS schemas and
> produce OpenAPI documentation for them, I believe. That would solve #2.
>
> If you want more rigorous validation there are different options but they
> are either more invasive or a heavier lift.
>
> For example, Confluent's schema registry exists for just this reason: it
> compares changes in producer schemas against the previous version in the
> registry and ensures the changes meet whatever compatibility requirements
> you've set, so as not to break downstream consumers. Of course, this is
> specific to Kafka and Confluent's ecosystem, but the concept maps almost
> directly to what's under discussion here. UserAle is a producer within
> an event-driven architecture; we just so happen to be completely agnostic
> about the event brokering and consumers.
>
> Best
>
> Evan Jones
> Website: www.ea-jones.com
>
>
> On Wed, Mar 15, 2023 at 9:47 PM lewis john mcgibbney <[email protected]>
> wrote:
>
>> Big +1 on this. Would be useful as we are thinking about potentially
>> pushing data into OpenSearch in the future.
>> A schema and data types would be very useful.
>> Lewis
>>
>> On Wed, Mar 15, 2023 at 1:48 PM Gedd Johnson <[email protected]>
>> wrote:
>> >
>> > Hi all,
>> >
>> > As discussed in this PR, we'd like to ideate on the topic of
>> implementing a schema for the Userale client payloads that are sent to
>> backend servers.
>> >
>> > First stab at a problem statement: Userale in its current state does
>> not implement any sort of schema for its payloads. Changes to the payload's
>> shape (as referenced in the PR linked above) can break data pipelines for
>> downstream users. How might we:
>> >
>> > 1. Validate and version a schema so that downstream users know the
>> shape of data they will receive
>> >
>> > 2. Maintain the flexible schema management that Userale currently offers
>> >
>> > Looking forward to the discussion!
>> >
>> > Best,
>> > Gedd Johnson
>> >
>>
>> --
>> http://home.apache.org/~lewismc/
>> http://people.apache.org/keys/committer/lewismc
>>
>