A quick follow-on: I think you could mimic the concept of a schema registry by implementing some basic forward-compatibility tests of the data schema and running them either as pre-commit hooks or in whatever CI/CD framework is currently in place.
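The check above could be sketched roughly as follows. This is a minimal, hypothetical sketch, not part of UserALE: it assumes the payload schema is kept in the repo as a simple field-name-to-type map, and the names (`Schema`, `findBreakingChanges`) are made up for illustration.

```typescript
// A schema here is just a map of payload field name -> type name.
type Schema = Record<string, string>;

// A change is forward-compatible for existing consumers when every field
// the old schema promised still exists in the new schema with the same type.
// Returns a list of human-readable breaking changes; empty means compatible.
function findBreakingChanges(oldSchema: Schema, newSchema: Schema): string[] {
  const breaks: string[] = [];
  for (const [field, type] of Object.entries(oldSchema)) {
    if (!(field in newSchema)) {
      breaks.push(`field removed: ${field}`);
    } else if (newSchema[field] !== type) {
      breaks.push(`type changed: ${field} (${type} -> ${newSchema[field]})`);
    }
  }
  return breaks;
}
```

A pre-commit hook or CI step could load the schema from the previous release (or the main branch), run this against the working copy, and fail when the list is non-empty. Adding new optional fields passes; removing or retyping fields fails.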
Best
Evan Jones
Website: www.ea-jones.com

On Fri, Mar 17, 2023 at 12:46 PM Evan Jones <[email protected]> wrote:

> A few questions:
>
> 1. Is the primary intention to track schema on the development side to
> minimize accidentally breaking user pipelines (forward compatibility)?
> 2. Do you want to have a typed and versioned schema that is documented
> online somewhere for users?
> 3. Do you want run-time validation of the schema, or would you rather
> offload validation onto users?
> 4. How big of a priority is keeping the dependency footprint small?
>
> TypeScript will solve #1. I know it's a fairly large refactor, but usually
> you can type only the parts of the code base you really care about (here,
> the payload schema), make everything else generic, and increase type
> coverage over time. There are libraries written to track TS schemas and
> produce OpenAPI documentation for them, I believe. That would solve #2.
>
> If you want more rigorous validation, there are different options, but
> they are either more invasive or a heavier lift.
>
> For example, Confluent's Schema Registry exists for just this reason:
> comparing changes in producer schemas against the previous version in the
> registry and ensuring the changes meet whatever compatibility requirements
> you've set, so as not to break downstream consumers. Of course, this is
> specific to Kafka and Confluent's ecosystem, but the concept maps almost
> directly to what's under discussion here. UserALE is a producer within an
> event-driven architecture; we just happen to be completely agnostic about
> the event brokering and consumers.
>
> Best
>
> Evan Jones
> Website: www.ea-jones.com
>
>
> On Wed, Mar 15, 2023 at 9:47 PM lewis john mcgibbney <[email protected]>
> wrote:
>
>> Big +1 on this. Would be useful as we are thinking about potentially
>> pushing data into OpenSearch in the future.
>> A schema and data types would be very useful.
>> Lewis
>>
>> On Wed, Mar 15, 2023 at 1:48 PM Gedd Johnson <[email protected]>
>> wrote:
>> >
>> > Hi all,
>> >
>> > As discussed in this PR, we'd like to ideate on the topic of
>> > implementing a schema for the UserALE client payloads that are sent to
>> > backend servers.
>> >
>> > First stab at a problem statement: UserALE in its current state does
>> > not implement any sort of schema for its payloads. Changes to the
>> > payload's shape (as referenced in the PR linked above) can break data
>> > pipelines for downstream users. How might we:
>> >
>> > 1. Validate and version a schema so that downstream users know the
>> > shape of data they will receive?
>> >
>> > 2. Maintain the flexible schema management that UserALE currently
>> > offers?
>> >
>> > Looking forward to the discussion!
>> >
>> > Best,
>> > Gedd Johnson
>>
>> --
>> http://home.apache.org/~lewismc/
>> http://people.apache.org/keys/committer/lewismc
