Hi, Steve:

I think we're largely on the same page, but I just wanted to review this
bit from your note:

> {
>     "http://registry.gpii.net/common/fontSize": {
>         name: "Font Size",
>         description: "",
>         schema: { $ref: "stringSchema" }
>     },

It wasn't clear from your example whether this was meant to be a definition
or a property.  As long as it's a definition, people will automatically
pick up improvements as we (for example) evolve settings that are currently
simple strings into enumerated lists or strings that must validate against
a pattern.
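
For example (just a sketch, the setting name and pattern here are invented
for illustration), a definition that starts out as a plain string:

backgroundColor: {
    title: "Background Color",
    type: "string"
}

could later be tightened to:

backgroundColor: {
    title: "Background Color",
    type: "string",
    pattern: "^#[0-9a-fA-F]{6}$"
}

and every schema that refers to that definition would pick up the stricter
validation without any other change.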

If it's meant to be a property, then it's a bit of a problem.  If you want
to validate fontSize in multiple documents, you end up copying and pasting
links to generic rules, and not to anything specific to fontSize.  I would
argue that "one definition per setting" is a key safety valve: it gives us
the freedom to link to generic common definitions while still being able to
update each setting independently.
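
To put it another way, each consuming schema's property would point at a
definition that is specific to fontSize, even if that definition is trivial
for now.  Roughly (just a sketch, the same shape as the example further
down in the thread):

common.json (sketch):

definitions: {
    "fontSize": {
        id: "fontSize",
        title: "Font Size",
        type: "integer"
    }
}

a consuming schema (sketch):

properties: {
    "http://registry.gpii.net/common/fontSize": { $ref: "common.json#fontSize" }
}

If fontSize later gains a minimum/maximum or becomes an enumeration, every
consuming schema gets the change for free.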

> At the very least it makes sense to put a title and description in the
> actual JSON schema, but there still may be other non JSON Schema
> information that warrants the level up still


At least from your example, it doesn't seem like you're quite convinced
about using "name" and "description" inline.  I would argue that the
supported metadata fields are a reasonable fit for the early work you're
doing, where they can be mapped to field labels and instructions.  I would
only expect us to use them until JSON UI Schema matures, or until we come
up with our own overlay.

By "overlay" I mean a way of associating arbitrary metadata (instructions
in multiple languages, etc.) with a particular field.  The JSON Schema
standard only supports three metadata fields ("title", "description", and
"default").  There isn't really a way to add our own additional metadata
without creating invalid schemas.  We'd have to handle it externally, for
example, using a map of JSON pointers
<https://tools.ietf.org/html/rfc6901> that
associates arbitrary metadata with existing parts of a schema, as in:

highlightColors: {
    "#/properties/foo": "#ff0000",
    "#/properties/bar": "#00ff00"
}

or perhaps:

extraMetadata: {
    "#/properties/foo": {
        "color": "#ff0000",
        "font-weight": "bold"
    },
    "#/properties/bar": {
        "color": "#00ff00"
    }
}

The JSON pointers used above are "schema-centric", such as you would see in
validation output from AJV.  You could also have "document-centric"
pointers like "#/foo" and "#/bar".  Depending on the use case, I could
also see using viewComponent selectors or model variable "dot path"
references, as demonstrated in gpii-binder
<https://github.com/GPII/gpii-binder>.
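
To make that concrete, a dot-path keyed overlay might look something like
the following (the paths, languages, and wording are just placeholders):

extraMetadata: {
    "fontSize": {
        instructions: {
            "en": "Enter the font size in points.",
            "nl": "Voer de lettergrootte in punten in."
        }
    },
    "highContrastEnabled": {
        instructions: {
            "en": "Turn high contrast on or off."
        }
    }
}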

Yeah, I think this would be a great set of stuff for the hackathon as well.


We have our next meeting next Thursday, so I'd like to sketch something out
before then, preferably this week.  I do think a PR against your repo would
be a simpler place to start if you're willing.  Otherwise, I can sketch out
something to talk about within the live solutions registry instead.  Just
let me know what you think.

Cheers,


Tony

On Mon, Apr 24, 2017 at 6:16 AM, Steven Githens <[email protected]> wrote:

> Hello!
>
> On Apr 21, 2017, at 3:16 AM, Tony Atkins <[email protected]> wrote:
>
> Hi, Steve:
>
> Happy to take over GPII-111, as that seems very much like what I've
> planned to focus on for the next month or so.
>
>
> Awesome, thanks much!
>
> I guess my point is:  Breaking these up so they aren't just one file per
>> OS sounds great, but I would recommend against requiring a 5-10 line file
>> for every single setting.  And while these are in files now, I hope we can
>> remember with any API we encounter that the JSON data is what's important,
>> and whatever physical storage it sits in is mostly accidental complexity.
>
>
> I disagree somewhat with the considered choice of using a GitHub repo as
> being "mostly accidental complexity".  Although as you say we have the
> freedom to change back-ends in the future, I am proposing a filesystem and
> a github-based workflow as an initial choice because it has "intentional
> simplicity" versus building our own back end, mechanisms for managing the
> data, workflow, workflow tools, et cetera.
>
>
> You're right, having a clear way for organizing the data in some type of
> tree system and using the github workflow is essential complexity for our
> organization structure.
>
> If I remember right, what spurred me to think this at all was during the
> APCP meeting when we were discussing that part of the reason they may have
> to be in separate files was a limitation of a 3rd party validation
> framework, or whether it was worth our time to script its API.  That is
> largely what I considered to be the accidental complexity, that we were
> forced into some layout and storage situation by the validator requiring
> that referenced schemas absolutely had to be in some sort of posix file.
>
> In the back of my head though, I still imagine that the essential
> complexity you're working on is for organizing all this data and developing
> a standard way for people to contribute to it, and should successfully
> translate to another storage methodology that could represent blocks of
> texts in a tree structure and have some sort of versioning.  Be that couch
> documents, or whatever.  At the current state in time, the most robust and
> nicest set of tooling for this just happens to be files+git+github.
>
>
> Setting that aside, the key distinction here is not "one schema per
> setting".  It's "one definition per setting".  Individual schemas will
> represent what is valid for a particular use case, and will reuse one or
> more definitions
> <https://spacetelescope.github.io/understanding-json-schema/structuring.html#reuse>.
> As long as we distinctly define each setting in a way that can be referred
> to externally, the grouping is up to us.  The key concerns to me are:
>
>    1. Supporting reuse.
>    2. Keeping the files small to support contribution and review.
>
> So, with that in mind, here is a proposal.
>
>    1. "common" settings definitions that occur across platforms would be
>    in a central file.
>    2. "platform-unique" settings definitions would be in a file per OS.
>    3. "solution-unique" settings definitions could be in one file per
>    solution (or family of related solutions that use the same conventions).
>
> This seems super-reasonable, and something I might have to think about
> some more too.  I had never thought too much about #2, ( I can guess what
> they might be, but am having a hard time pulling out examples right now).
>
> With the actual ability to inherit settings from grades, it is nice to
> think about things that are essentially the same in the linux.json and
> win32.json file being merged into one sort of entry.  Like the chrome and
> easy123 type ones.
>
>
> This level of organization means that most contributors and reviewers
> would be working with a "solution-unique" definitions file that is
> relatively small.  As we do with code reviews, part of a good PR would
> involve identifying more widely useful common patterns that need to be
> pulled upward to a new "platform-unique" or "common" settings.  The larger
> the definitions block, the less frequently it would be modified.
>
> The above just refers to definitions to be used in individual schemas.  I
> am still reviewing prior work before I start sketching out actual schemas,
> but can start showing you what I mean using your existing work.  Let's
> start with the first entry in your data
> <https://github.com/sgithens/gpii-devpmt/blob/master/src/js/commonTerms.json5#L3>.
> Right now it looks like:
>
>> {
>>     "http://registry.gpii.net/common/fontSize": {
>>         name: "Font Size",
>>         description: "",
>>         schema: {
>>             type: "integer"
>>         }
>>     },
>
> Although you would likely mature this approach once the JSON UI Schema
> spec comes into being, for now I would suggest using the metadata
> supported by the standard
> <https://spacetelescope.github.io/understanding-json-schema/reference/generic.html#metadata>,
> namely the "title", and "description" fields.   This allows you to collapse
> down your proto-schema to look more like:
>
> "http://registry.gpii.net/common/fontSize": {
>>>
>>         title: "Font Size",
>>
>>         description: "The font size to use in a particular context.",
>>
>>         type: "integer"
>>>
>> }
>>>
>>
>
> Interesting, putting those into the actual JSON schema.  I do wonder
> though if we won't need to keep the schema as an embedded resource, as we
> might have items that legit don't belong in it?  I had pulled 'name' and
> 'description' from Javi's document:
>
> https://wiki.gpii.net/w/Personal_Control_Panel_API#Settings.27_Metadata
>
> starting out just because it was the info I needed to render the DevPTT.
> Looking at that full list (and maybe other random things we need), it will
> be interesting to see whether it could *all* just go in the schema
> document.  It probably depends on how much extra information we might want for
> each setting.  At the very least it makes sense to put a title and
> description in the actual JSON schema, but there still may be other non
> JSON Schema information that warrants the level up still... I just wonder
> if things could get wordy to a point that they shouldn't be in there,
> paragraph documentation with like links to API's from the vendors website
> and stuff (still on a per-setting basis).
>
>
> This now represents a single field in a larger object, with the key "
> http://registry.gpii.net/common/fontSize".  To make your schema reusable,
> you would move the definition into a definitions block.  It doesn't
> necessarily have to live in a separate file, but just to illustrate the
> beginnings of the structure I mentioned above, let's assume I have a
> "common.json" definitions file that looks like:
>
>> {
>>     "$schema": "http://json-schema.org/schema#",
>>     id: "common.json",
>>     definitions: {
>>         "fontSize": {
>>             id: "fontSize",
>>             title: "Font Size",
>>             description: "The font size to use in a particular context.",
>>             type: "integer"
>>         }
>>     }
>> }
>
> The actual schema used for validation could now look more like:
>
>> {
>>     "$schema": "http://json-schema.org/schema#",
>>     id: "devpmt.json",
>>     properties: {
>>         "http://registry.gpii.net/common/fontSize": { $ref: "common.json#fontSize" }
>>     }
>> }
>
> That does look super cool, and I like how it could either be in or not in
> a file, depending on how complex it might be, and again, looking at these
> libraries, it doesn't look like it should be too hard for us to ever
> process them from nodes if need be.
>
> Part of the 'separate file' worry I have was to make sure we don't have a
> situation where, like for JAWS you'd have to make like:
>
> jaws-solution.json
> jaws-setting-Options.PrimarySynthesizer.json
> jaws-setting-jaws-ENU-Global.Rate.json
> jaws-setting-ENU-JAWSCursor.Rate.json
> jaws-setting-ENU-Keyboard.Rate.json
> jaws-setting-ENU-MenuAndDialog.Rate.json
> ...this goes on for like a ream of paper
>
> ALSO, here is what I'm hoping we can do.  As you've noticed, a lot of
> the stuff I'm stubbing in is just like { type: "string" }, or { type:
> "integer" }.  I realize that eventually some of these should be more
> detailed, but sometimes you're really just stubbing in like 100 values that
> are all of like 2 or 3 types. In that case I was hoping I could go:
>
>> {
>>     "http://registry.gpii.net/common/fontSize": {
>>         name: "Font Size",
>>         description: "",
>>         schema: { $ref: "stringSchema" }
>>     },
>
> That does put us back to our previous larger structure... I don't know
> maybe they'd have to be parsed and generated on the fly.  It would be nice
> to have just those 5 or 8 base schemas you could just also use when you're
> crazy fast stubbing out something like JAWS that has a ton of settings that
> are for all practical purposes the same.
> I might have to think more about this use case, but I feel like I want to
> do something close to it.
>
>
>
> If this seems reasonable enough to flesh out a bit further, I would be
> happy to take a few minutes and prepare a PR against the devpmt,
> demonstrating this with all of your examples.  My idea is that that would
> be something tangible (and tested) to review in our next meeting, and that
> I would start work on a fuller set of definitions in the live registries
> based on our discussion.
>
> Anyway, please review and comment.
>
>
> Yeah, I think this would be a great set of stuff for the hackathon as
> well.  Talking about schemas and validation, I feel like some people have
> started certain ones, but I would really love to get a set of JSON schemas
> in for our most important formats that people edit... NP Sets, Solutions,
> probably other things.  I want the DevPTT as well, to be able to validate
> an existing or new NP Set against all the schemas we're making above,
> especially as the first step in avoiding settings injection issues down the
> road.
>
> Cheers to the max,
> Steve
>
> Cheers,
>
>
> Tony
>
>>
> On Fri, Apr 21, 2017 at 6:06 AM, Steven Githens <[email protected]> wrote:
>
>> Hi Tony!
>>
>> This all sounds cool... and I might just assign over GPII-111 to you, if
>> you think it's more or less the actual work you're doing right now.  You
>> can take a look, and if you think it is, feel free to take it.
>>
>> Just one comment below.
>>
>> On Apr 12, 2017, at 5:43 AM, Tony Atkins <[email protected]>
>> wrote:
>>
>> Hi, All:
>>
>> As we have long discussed, currently the solutions and settings used
>> within the GPII are stored in massive JSON files in the "universal"
>> repository.  I have been tasked with helping move us towards the kind of
>> granularity, inheritance, and testability we discussed in Toronto.  I have
>> been sketching out initial documentation and a loading/validation harness
>> <https://github.com/the-t-in-rtf/gpii-live-registries>, and wanted to
>> summarize for wider discussion.
>>
>> First, as discussed in Toronto, the idea is that the "live" registries
>> would be a separate repo that contains the data that currently lives in
>> universal, more finely broken down.  Changes to the data would be submitted
>> as pull requests against this repo.  The platform-specific repos would use
>> a versioned release of the "live" data (more on that in a bit).
>>
>> Each solution and setting would be a distinct grade, saved to a single
>> JSON(5) file.  We would use the effective path and filename to create an
>> implicit and unique grade name for each options file.  This accomplishes
>> two things:
>>
>>
>>
>> I think having a default convention for the repo of having each solution
>> in its own file is probably good, but I would hope that each setting for
>> that solution wouldn't have to be in its own file... maybe it could be if
>> you want, but hopefully optional.  Having barely survived that decade in
>> the early 2000's when J2EE was cool and having to put every single public
>> java class in its own file has left some anxiety in my stomach thinking
>> about this.
>>
>> Mostly, I want to make sure that we future proof ourselves, and have this
>> validation and solutions registry tooling work well in any situation where
>> you have some JSON data that makes up a schema.  Regardless of whether it's
>> in a file, a couch document, a node in another JSON document, or being
>> dynamically and temporarily stored in the local storage of an awesome web
>> based authoring tool, which is most likely just over the horizon for us.
>>
>> Out of a paranoia [1], I did start reading through the spec and looking
>> at some of the validation libraries, and everything seems like it should be
>> Ok.  And even though I don't have the schema directive in yet (although
>> it's only 1 five minute vim macro away ;)  ), they do actually seem to
>> validate fine.  From what I mentioned on the APCP call, I am actually
>> hoping that we can start with the metadata I've created for the Generic
>> Preferences and filling in some of the apps settings metadata (JAWS
>> mostly), since it is like a good half days worth of typing.  I'm happy to
>> spend the 10 minutes with a vim macro to make them look however we need
>> them, and split them up into files.
>>
>> I guess my point is:  Breaking these up so they aren't just one file per
>> OS sounds great, but I would recommend against requiring a 5-10 line file
>> for every single setting.  And while these are in files now, I hope we can
>> remember with any API we encounter that the JSON data is what's important,
>> and whatever physical storage it sits in is mostly accidental complexity.
>>
>> Cheers,
>> Steve
>>
>> [1] and left over stress/nightmares of when I was still working on J2EE
>> projects.
>>
>>
>>    1. We will have an easier time detecting namespace collisions with
>>    this model.
>>    2. We can detect the existence of and perform standard tests against
>>    each grade in isolation (see below).
>>
>> So, what do I mean by "grades" in this context?  Basically, anything you
>> can do in an options block without writing code can be stored in one of
>> these JSON(5) files.  Settings and solutions derive from concrete
>> *gpii.setting* and *gpii.solution* grades.  Abstract grades are also
>> possible, such as platform and platform version mix-ins.
>>
>> A new loader would scan through an "options file hierarchy" and associate
>> each block of options with its namespace, as though the user had called 
>> *fluid.defaults(namespace,
>> options)*.  Once all grades have their defaults defined, we can search
>> for any grades that extend *gpii.solution* or *gpii.setting*, and do
>> things like:
>>
>>    1. Confirm that each component can be safely instantiated.
>>    2. Confirm that the component satisfies the contract defined for the
>>    base grade, for example, that it provides an "isInstalled" invoker.
>>    3. For "abstract" grades, we would not attempt to instantiate them,
>>    only to confirm that each is extended by at least one "concrete" grade 
>> that
>>    has been tested.
>>
>> Platform specific tests would take place within the platform-specific
>> repos, which would test their version of the "live" data, for example
>> calling each solution's "isInstalled" method to confirm that nothing
>> breaks.  As with any versioned dependency change, we would submit a PR
>> against a platform repo and confirm that the new version of the "live" data
>> does not break anything before merging and releasing a new version of the
>> platform repo.
>>
>> So, that's the proposed workflow and test harness, which are independent
>> of the data format.  Please comment.  Once we have even "lazy consensus"
>> agreement on that, we will immediately need to move forward with
>> discussions about how we represent each solution/setting and the
>> relationships between settings.
>>
>> Cheers,
>>
>>
>> Tony
>>
>> _______________________________________________
>> Architecture mailing list
>> [email protected]
>> http://lists.gpii.net/mailman/listinfo/architecture
>>
>>
>>
>
>
_______________________________________________
Architecture mailing list
[email protected]
http://lists.gpii.net/mailman/listinfo/architecture
