Hi Tony,

Sorry for the delay. I've had to read through this a few times to pick up some of the subtleties, and I'm still a bit shaky on some of it, enough that I couldn't start typing out examples. (I did go through your JSON Pointers link as well.)
> On Apr 24, 2017, at 5:02 AM, Tony Atkins <[email protected]> wrote:
>
> Hi, Steve:
>
> I think we're largely on the same page, but I just wanted to review this
> bit from your note:
>
>> {
>>     "http://registry.gpii.net/common/fontSize": {
>>         name: "Font Size",
>>         description: "",
>>         schema: { $ref: "stringSchema" }
>>     },
>
> It wasn't clear from your example whether this was a definition or a
> property. As long as it's a definition, people will actually pick up
> improvements as we (for example) evolve settings that are currently
> simply strings into enumerated lists or strings that must validate
> against a pattern.

I guess I'd like to say it could potentially be both. As a definition, starting out, it would just be a simple way to re-use simple schemas. That's actually the use case here. But I imagine we'd also like to be able to reuse properties.

> If it's meant to be a property, then it's a bit of a problem. I mean, if
> you want to validate fontSize in multiple documents, you're in essence
> copying and pasting links to generic rules, and not anything specific to
> fontSize. I would argue that "one definition per setting" is a key
> safety valve that gives us the freedom to link to generic common
> definitions but also to update settings independently.
>
> At the very least it makes sense to put a title and description in the
> actual JSON schema, but there still may be other non-JSON-Schema
> information that warrants the level up still.
>
> At least from your example, it doesn't seem like you're quite convinced
> about using "name" and "description" inline. I would argue that the
> supported metadata fields are a reasonable fit for the early work you're
> doing, where they can be mapped to field labels and instructions. I
> would only expect us to use them until JSON UI Schema matures, or until
> we come up with our own overlay.
I definitely think we should fill in the title and description metadata on the schema, but what I meant is that there's a likely enough chance we'll have so much other random information that we'd still need our surrounding JSON to specify all of it, with the actual JSON schema just being one part of the information that comes along with a setting. I feel like the metadata over time would end up resembling an API doc... it could have paragraphs of info, links to the vendors' pages, tags, notes, comments, etc.

> By "overlay" I mean a way of associating arbitrary metadata
> (instructions in multiple languages, etc.) with a particular field. The
> JSON Schema standard only supports three metadata fields ("title",
> "description", and "default"). There isn't really a way to add our own
> additional metadata without creating invalid schemas. We'd have to
> handle it externally, for example, using a map of JSON pointers
> <https://tools.ietf.org/html/rfc6901> that associate arbitrary metadata
> with existing parts of a schema, as in:
>
>     highlightColors: {
>         "#/properties/foo": "#ff0000",
>         "#/properties/bar": "#00ff00"
>     }
>
> or perhaps:
>
>     extraMetadata: {
>         "#/properties/foo": {
>             "color": "#ff0000",
>             "font-weight": "bold"
>         },
>         "#/properties/bar": {
>             "color": "#00ff00"
>         }
>     }
>
> The JSON pointers used above are "schema-centric", such as you would see
> in validation output from AJV. You could also have "document-centric"
> pointers like "#/foo" and "#/bar". Depending on the use case, I could
> also see using viewComponent selectors or model variable "dot path"
> references, as demonstrated in gpii-binder
> <https://github.com/GPII/gpii-binder>.

So, this would be like a post-processing event of some type? Are these paths pulling in items from other JSON to add to the schema during runtime when it's assembled?

> Yeah, I think this would be a great set of stuff for the hackathon as well.
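[Editor's note: the "post-processing" question above can be made concrete with a small sketch. The helper names below (resolvePointer, applyOverlay) are made up for illustration, not an existing GPII API; the idea is simply that the overlay is merged into a copy of the schema at assembly/render time, so the schema itself stays standard-valid.]

```javascript
// Minimal sketch of the "metadata overlay" idea: extraMetadata maps
// schema-centric JSON Pointers (RFC 6901) to arbitrary extra fields.
// Helper names are hypothetical, not an existing GPII API.

// Resolve a pointer like "#/properties/foo" against a schema object.
function resolvePointer(schema, pointer) {
    var segments = pointer.replace(/^#\//, "").split("/");
    return segments.reduce(function (node, segment) {
        // Unescape per RFC 6901: "~1" => "/", "~0" => "~"
        var key = segment.replace(/~1/g, "/").replace(/~0/g, "~");
        return node ? node[key] : undefined;
    }, schema);
}

// Produce a merged, "decorated" view of the schema for UI purposes,
// leaving the original schema untouched (and therefore still valid).
function applyOverlay(schema, extraMetadata) {
    var decorated = JSON.parse(JSON.stringify(schema)); // deep copy
    Object.keys(extraMetadata).forEach(function (pointer) {
        var target = resolvePointer(decorated, pointer);
        if (target) {
            Object.assign(target, extraMetadata[pointer]);
        }
    });
    return decorated;
}

var schema = {
    properties: {
        foo: { type: "string", title: "Foo" },
        bar: { type: "integer" }
    }
};

var overlay = {
    "#/properties/foo": { color: "#ff0000", "font-weight": "bold" },
    "#/properties/bar": { color: "#00ff00" }
};

var decorated = applyOverlay(schema, overlay);
console.log(decorated.properties.foo.color); // "#ff0000"
```

So in this reading, yes: it is a post-processing step over an assembled schema, and the extra data never lives inside the schema document itself.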
> We have our next meeting next Thursday, so I'd like to sketch something
> out before then, preferably this week. I do think a PR against your repo
> would be a simpler place to start if you're willing. Otherwise, I can
> sketch out something to talk about within the live solutions registry
> instead. Just let me know what you think.

Yeah, a PR is fine, as are any other sort of full examples, even if they're just in a gist... and again, sorry for taking a while to respond to this. At the moment, what I really want to make sure of is that:

1. The schema for each setting validates and is capable of validating, so that I can use it to validate the input when someone changes a needs and preferences setting.
2. We can dynamically pull and assemble any needed schemas to validate a full NP set, which would entail:
   a) Validating the structure of an NP set to ensure it's correct (the nesting of contexts, preferences, sections, etc.).
   b) Validating each preference with the schema for that setting.

Whether 2 can be done by assembling one massive schema that contains all possible preference settings for those nodes, or whether it has to be done in two passes, validating the overall structure and then running through the individual preference settings, will be interesting.

Hopefully these responses aren't too vague... I've been so focused on just getting the basic structure of the app and Bern's initial mockups specced out that I haven't gone into anything like pointers and complex stuff yet. But I think some full examples would be helpful to walk through.

~Steve

> Cheers,
>
>
> Tony
>
> On Mon, Apr 24, 2017 at 6:16 AM, Steven Githens <[email protected]> wrote:
>
> Hello!
>
>> On Apr 21, 2017, at 3:16 AM, Tony Atkins <[email protected]> wrote:
>>
>> Hi, Steve:
>>
>> Happy to take over GPII-111, as that seems very much like what I've
>> planned to focus on for the next month or so.
>
> Awesome, thanks much!
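[Editor's note: the "one massive schema" approach from point 2 above can be sketched roughly as follows. The NP-set shape and the one-level validator here are deliberately simplified stand-ins; a real implementation would hand the assembled schema to a library such as AJV.]

```javascript
// Sketch of option (a): assemble one combined schema for a flat
// "preferences" block of an NP set from per-setting schemas.
// The NP-set shape here is deliberately simplified and hypothetical.

var settingSchemas = {
    "http://registry.gpii.net/common/fontSize": { type: "integer" },
    "http://registry.gpii.net/common/highContrastTheme": { type: "string" }
};

function buildPreferencesSchema(settingSchemas) {
    return {
        type: "object",
        // Reject settings we have no definition for, so unknown keys
        // are caught instead of silently passing validation.
        additionalProperties: false,
        properties: settingSchemas
    };
}

// Extremely naive one-level validator, standing in for a real JSON
// Schema library such as AJV (which would handle $ref, nesting, etc.).
function validatePreferences(schema, prefs) {
    return Object.keys(prefs).every(function (key) {
        var propSchema = schema.properties[key];
        if (!propSchema) { return schema.additionalProperties !== false; }
        if (propSchema.type === "integer") {
            return typeof prefs[key] === "number" && prefs[key] % 1 === 0;
        }
        return typeof prefs[key] === propSchema.type;
    });
}

var schema = buildPreferencesSchema(settingSchemas);
console.log(validatePreferences(schema, {
    "http://registry.gpii.net/common/fontSize": 24
})); // true
console.log(validatePreferences(schema, {
    "http://registry.gpii.net/common/fontSize": "big"
})); // false
```

The two-pass alternative (structure first, then per-setting) would keep the structural schema and the per-setting schemas separate and simply run both validations in sequence.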
>> I guess my point is: breaking these up so they aren't just one file per
>> OS sounds great, but I would recommend against requiring a 5-10 line
>> file for every single setting. And while these are in files now, I hope
>> we can remember with any API we encounter that the JSON data is what's
>> important, and whatever physical storage it sits in is mostly
>> accidental complexity.
>>
>> I disagree somewhat with the characterization of the considered choice
>> of using a GitHub repo as "mostly accidental complexity". Although, as
>> you say, we have the freedom to change back ends in the future, I am
>> proposing a filesystem and a GitHub-based workflow as an initial choice
>> because it has "intentional simplicity" versus building our own back
>> end, mechanisms for managing the data, workflow, workflow tools, et
>> cetera.
>
> You're right, having a clear way of organizing the data in some type of
> tree system and using the GitHub workflow is essential complexity for
> our organizational structure.
>
> If I remember right, what spurred me to think this at all was the APCP
> meeting, when we were discussing that part of the reason they might have
> to be in separate files was a limitation of a third-party validation
> framework, and whether it was worth our time to script its API. That is
> largely what I considered to be the accidental complexity: that we were
> forced into some layout and storage situation by the validator requiring
> that referenced schemas absolutely had to be in some sort of POSIX file.
>
> In the back of my head, though, I still imagine that the essential
> complexity you're working on is organizing all this data and developing
> a standard way for people to contribute to it, and it should
> successfully translate to another storage methodology that can represent
> blocks of text in a tree structure and have some sort of versioning. Be
> that Couch documents, or whatever.
> At the current point in time, the most robust and nicest set of tooling
> for this just happens to be files + git + GitHub.
>
>> Setting that aside, the key distinction here is not "one schema per
>> setting". It's "one definition per setting". Individual schemas will
>> represent what is valid for a particular use case, and will reuse one
>> or more definitions
>> <https://spacetelescope.github.io/understanding-json-schema/structuring.html#reuse>.
>> As long as we distinctly define each setting in a way that can be
>> referred to externally, the grouping is up to us. The key concerns to
>> me are:
>>
>> 1. Supporting reuse.
>> 2. Keeping the files small to support contribution and review.
>>
>> So, with that in mind, here is a proposal:
>>
>> 1. "Common" settings definitions that occur across platforms would be
>>    in a central file.
>> 2. "Platform-unique" settings definitions would be in a file per OS.
>> 3. "Solution-unique" settings definitions could be in one file per
>>    solution (or family of related solutions that use the same
>>    conventions).
>
> This seems super-reasonable, and something I might have to think about
> some more too. I had never thought too much about #2 (I can guess what
> they might be, but am having a hard time pulling out examples right
> now).
>
> With the actual ability to inherit settings from grades, it is nice to
> think about things that are essentially the same in the linux.json and
> win32.json files being merged into one sort of entry, like the Chrome
> and easy123 type ones.
>
>> This level of organization means that most contributors and reviewers
>> would be working with a "solution-unique" definitions file that is
>> relatively small. As we do with code reviews, part of a good PR would
>> involve identifying more widely useful common patterns that need to be
>> pulled upward into a new "platform-unique" or "common" setting. The
>> larger the definitions block, the less frequently it would be modified.
>> The above just refers to definitions to be used in individual schemas.
>> I am still reviewing prior work before I start sketching out actual
>> schemas, but I can start showing you what I mean using your existing
>> work. Let's start with the first entry in your data
>> <https://github.com/sgithens/gpii-devpmt/blob/master/src/js/commonTerms.json5#L3>.
>> Right now it looks like:
>>
>>     {
>>         "http://registry.gpii.net/common/fontSize": {
>>             name: "Font Size",
>>             description: "",
>>             schema: {
>>                 type: "integer"
>>             }
>>         },
>>
>> Although you would likely mature this approach once the JSON UI Schema
>> spec comes into being, for now I would suggest using the metadata
>> supported by the standard
>> <https://spacetelescope.github.io/understanding-json-schema/reference/generic.html#metadata>,
>> namely the "title" and "description" fields. This allows you to
>> collapse down your proto-schema to look more like:
>>
>>     "http://registry.gpii.net/common/fontSize": {
>>         title: "Font Size",
>>         description: "The font size to use in a particular context.",
>>         type: "integer"
>>     }
>
> Interesting, putting those into the actual JSON schema. I do wonder,
> though, whether we won't need to keep the schema as an embedded
> resource, as we might have items that legitimately don't belong in it. I
> had pulled 'name' and 'description' from Javi's document:
>
> https://wiki.gpii.net/w/Personal_Control_Panel_API#Settings.27_Metadata
>
> starting out, just because it was the info I needed to render the
> DevPTT. Looking at that full list (and maybe other random things we
> need), it will be interesting to see whether it could *all* just go in
> the schema document. It probably depends on how much extra information
> we might want for each setting.
> At the very least it makes sense to put a title and description in the
> actual JSON schema, but there still may be other non-JSON-Schema
> information that warrants the level up still... I just wonder if things
> could get wordy to the point that they shouldn't be in there: paragraph
> documentation with links to APIs from the vendor's website and stuff
> (still on a per-setting basis).
>
>> This now represents a single field in a larger object, with the key
>> "http://registry.gpii.net/common/fontSize". To make your schema
>> reusable, you would move the definition into a definitions block. It
>> doesn't necessarily have to live in a separate file, but just to
>> illustrate the beginnings of the structure I mentioned above, let's
>> assume I have a "common.json" definitions file that looks like:
>>
>>     {
>>         "$schema": "http://json-schema.org/schema#",
>>         id: "common.json",
>>         definitions: {
>>             "fontSize": {
>>                 id: "fontSize",
>>                 title: "Font Size",
>>                 description: "The font size to use in a particular context.",
>>                 type: "integer"
>>             }
>>         }
>>     }
>>
>> The actual schema used for validation could now look more like:
>>
>>     {
>>         "$schema": "http://json-schema.org/schema#",
>>         id: "devpmt.json",
>>         properties: {
>>             "http://registry.gpii.net/common/fontSize": { $ref: "common.json#fontSize" }
>>         }
>>     }
>
> That does look super cool, and I like how it could either be in or not
> in a file, depending on how complex it might be. And again, looking at
> these libraries, it doesn't look like it should be too hard for us to
> process them from Node if need be.
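[Editor's note: to illustrate what "processing them from Node" might involve, here is a hand-rolled sketch of resolving the $ref: "common.json#fontSize" style used above against a registry of loaded schema documents. A real validator (for example AJV, via addSchema) does this resolution for you; the function names here are hypothetical.]

```javascript
// Illustration only: hand-rolled resolution of refs of the form
// "<schemaId>#<definitionId>", mimicking what a real validator (e.g.
// AJV via addSchema) would do for us. Not production code.

var common = {
    "$schema": "http://json-schema.org/schema#",
    id: "common.json",
    definitions: {
        fontSize: {
            id: "fontSize",
            title: "Font Size",
            description: "The font size to use in a particular context.",
            type: "integer"
        }
    }
};

var devpmt = {
    "$schema": "http://json-schema.org/schema#",
    id: "devpmt.json",
    properties: {
        "http://registry.gpii.net/common/fontSize": { "$ref": "common.json#fontSize" }
    }
};

// Register each loaded schema document by its id.
var registry = {};
[common].forEach(function (doc) { registry[doc.id] = doc; });

function resolveRef(ref) {
    var parts = ref.split("#");     // ["common.json", "fontSize"]
    var doc = registry[parts[0]];
    var defs = doc.definitions || {};
    // Find the definition whose inner "id" matches the fragment.
    var match = Object.keys(defs).filter(function (key) {
        return defs[key].id === parts[1];
    })[0];
    return defs[match];
}

var fontSize = resolveRef(
    devpmt.properties["http://registry.gpii.net/common/fontSize"].$ref
);
console.log(fontSize.title); // "Font Size"
console.log(fontSize.type);  // "integer"
```

Because the documents are just JavaScript objects once loaded, it makes no difference whether they came from files, Couch documents, or nodes in a larger JSON document.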
> Part of the 'separate file' worry I have was to make sure we don't end
> up in a situation where, for JAWS, you'd have to make something like:
>
>     jaws-solution.json
>     jaws-setting-Options.PrimarySynthesizer.json
>     jaws-setting-jaws-ENU-Global.Rate.json
>     jaws-setting-ENU-JAWSCursor.Rate.json
>     jaws-setting-ENU-Keyboard.Rate.json
>     jaws-setting-ENU-MenuAndDialog.Rate.json
>     ...this goes on for like a ream of paper
>
> ALSO, here is what I'm hoping we can do. As you've noticed, a lot of
> the stuff I'm stubbing in is just like { type: "string" } or
> { type: "integer" }. I realize that eventually some of these should be
> more detailed, but sometimes you're really just stubbing in like 100
> values that are all of 2 or 3 types. In that case I was hoping I could
> go:
>
>> {
>>     "http://registry.gpii.net/common/fontSize": {
>>         name: "Font Size",
>>         description: "",
>>         schema: { $ref: "stringSchema" }
>>     },
>
> That does put us back to our previous larger structure... I don't know,
> maybe they'd have to be parsed and generated on the fly. It would be
> nice to have just those 5 or 8 base schemas you could also use when
> you're crazy-fast stubbing out something like JAWS, which has a ton of
> settings that are for all practical purposes the same. I might have to
> think more about this use case, but I feel like I want to do something
> close to it.
>
>> If this seems reasonable enough to flesh out a bit further, I would be
>> happy to take a few minutes and prepare a PR against the devpmt,
>> demonstrating this with all of your examples. My idea is that that
>> would be something tangible (and tested) to review in our next meeting,
>> and that I would start work on a fuller set of definitions in the live
>> registries based on our discussion.
>>
>> Anyway, please review and comment.
>
> Yeah, I think this would be a great set of stuff for the hackathon as well.
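[Editor's note: one way to read the { $ref: "stringSchema" } shorthand above is as input to a small preprocessing pass that expands each stub into a full inline schema, carrying name/description over as title/description. The baseSchemas map and the function below are hypothetical, just to show the shape of the idea.]

```javascript
// Sketch of the "stub 100 settings from a handful of base schemas"
// idea. baseSchemas and the shorthand names are hypothetical.

var baseSchemas = {
    stringSchema:  { type: "string" },
    integerSchema: { type: "integer" },
    booleanSchema: { type: "boolean" }
};

// Expand { schema: { $ref: "stringSchema" } } entries into full inline
// schemas, mapping name/description to the standard title/description.
function expandStubs(settings) {
    var expanded = {};
    Object.keys(settings).forEach(function (uri) {
        var entry = settings[uri];
        var ref = entry.schema && entry.schema.$ref;
        // Fall back to the entry's own schema when the ref is not a
        // known base-schema shorthand.
        var base = baseSchemas[ref] || entry.schema;
        expanded[uri] = Object.assign(
            { title: entry.name, description: entry.description },
            base
        );
    });
    return expanded;
}

var stubbed = {
    "http://registry.gpii.net/common/fontSize": {
        name: "Font Size",
        description: "",
        schema: { "$ref": "stringSchema" }
    }
};

var full = expandStubs(stubbed);
console.log(full["http://registry.gpii.net/common/fontSize"].type); // "string"
```

This would let the stubbed shorthand coexist with the "one definition per setting" layout: the expansion could run once at load time, before validation.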
> Talking about schemas and validation, I feel like some people have
> started certain ones, but I would really love to get a set of JSON
> schemas in for our most important formats that people edit: NP sets,
> solutions, and probably other things. I want the DevPTT, as well, to be
> able to validate an existing or new NP set against all the schemas
> we're making above, especially as the first step in avoiding
> settings-injection issues down the road.
>
> Cheers to the max,
> Steve
>
>> Cheers,
>>
>>
>> Tony
>>
>> On Fri, Apr 21, 2017 at 6:06 AM, Steven Githens <[email protected]> wrote:
>>
>> Hi Tony!
>>
>> This all sounds cool... and I might just assign GPII-111 over to you,
>> if you think it's more or less the actual work you're doing right now.
>> You can take a look, and if you think it is, feel free to take it.
>>
>> Just one comment below.
>>
>>> On Apr 12, 2017, at 5:43 AM, Tony Atkins <[email protected]> wrote:
>>>
>>> Hi, All:
>>>
>>> As we have long discussed, the solutions and settings used within the
>>> GPII are currently stored in massive JSON files in the "universal"
>>> repository. I have been tasked with helping move us towards the kind
>>> of granularity, inheritance, and testability we discussed in Toronto.
>>> I have been sketching out initial documentation and a
>>> loading/validation harness
>>> <https://github.com/the-t-in-rtf/gpii-live-registries>, and wanted to
>>> summarize for wider discussion.
>>>
>>> First, as discussed in Toronto, the idea is that the "live" registries
>>> would be a separate repo that contains the data that currently lives
>>> in universal, more finely broken down. Changes to the data would be
>>> submitted as pull requests against this repo. The platform-specific
>>> repos would use a versioned release of the "live" data (more on that
>>> in a bit).
>>>
>>> Each solution and setting would be a distinct grade, saved to a single
>>> JSON(5) file.
>>> We would use the effective path and filename to create an implicit
>>> and unique grade name for each options file. This accomplishes two
>>> things:
>>
>> I think having a default convention for the repo of having each
>> solution in its own file is probably good, but I would hope that each
>> setting for that solution wouldn't have to be in its own file... maybe
>> it could be if you want, but hopefully optionally. Having barely
>> survived that decade in the early 2000s when J2EE was cool, having to
>> put every single public Java class in its own file has left some
>> anxiety in my stomach thinking about this.
>>
>> Mostly, I want to make sure that we future-proof ourselves, and have
>> this validation and solutions-registry tooling work well in any
>> situation where you have some JSON data that makes up a schema,
>> regardless of whether it's in a file, a Couch document, a node in
>> another JSON document, or dynamically and temporarily stored in the
>> local storage of an awesome web-based authoring tool, which is most
>> likely just over the horizon for us.
>>
>> Out of paranoia [1], I did start reading through the spec and looking
>> at some of the validation libraries, and everything seems like it
>> should be OK. And even though I don't have the schema directive in yet
>> (although it's only one five-minute vim macro away ;) ), they do
>> actually seem to validate fine. From what I mentioned on the APCP
>> call, I am actually hoping that we can start with the metadata I've
>> created for the Generic Preferences and fill in some of the apps'
>> settings metadata (JAWS mostly), since it is like a good half day's
>> worth of typing. I'm happy to spend the 10 minutes with a vim macro to
>> make them look however we need them, and to split them up into files.
>>
>> I guess my point is: breaking these up so they aren't just one file
>> per OS sounds great, but I would recommend against requiring a 5-10
>> line file for every single setting.
>> And while these are in files now, I hope we can remember with any API
>> we encounter that the JSON data is what's important, and whatever
>> physical storage it sits in is mostly accidental complexity.
>>
>> Cheers,
>> Steve
>>
>> [1] and leftover stress/nightmares from when I was still working on
>> J2EE projects.
>>
>>> 1. We will have an easier time detecting namespace collisions with
>>>    this model.
>>> 2. We can detect the existence of and perform standard tests against
>>>    each grade in isolation (see below).
>>>
>>> So, what do I mean by "grades" in this context? Basically, anything
>>> you can do in an options block without writing code can be stored in
>>> one of these JSON(5) files. Settings and solutions derive from
>>> concrete gpii.setting and gpii.solution grades. Abstract grades are
>>> also possible, such as platform and platform-version mix-ins.
>>>
>>> A new loader would scan through an "options file hierarchy" and
>>> associate each block of options with its namespace, as though the
>>> user had called fluid.defaults(namespace, options). Once all grades
>>> have their defaults defined, we can search for any grades that extend
>>> gpii.solution or gpii.setting, and do things like:
>>>
>>> 1. Confirm that each component can be safely instantiated.
>>> 2. Confirm that the component satisfies the contract defined for the
>>>    base grade, for example, that it provides an "isInstalled" invoker.
>>>
>>> For "abstract" grades, we would not attempt to instantiate them, only
>>> to confirm that each is extended by at least one "concrete" grade
>>> that has been tested.
>>>
>>> Platform-specific tests would take place within the platform-specific
>>> repos, which would test their version of the "live" data, for example
>>> by calling each solution's "isInstalled" method to confirm that
>>> nothing breaks.
>>> As with any versioned dependency change, we would submit a PR against
>>> a platform repo and confirm that the new version of the "live" data
>>> does not break anything before merging and releasing a new version of
>>> the platform repo.
>>>
>>> So, that's the proposed workflow and test harness, which are
>>> independent of the data format. Please comment. Once we have even
>>> "lazy consensus" agreement on that, we will immediately need to move
>>> forward with discussions about how we represent each solution/setting
>>> and the relationships between settings.
>>>
>>> Cheers,
>>>
>>>
>>> Tony
>>>
>>> _______________________________________________
>>> Architecture mailing list
>>> [email protected]
>>> http://lists.gpii.net/mailman/listinfo/architecture
_______________________________________________
Architecture mailing list
[email protected]
http://lists.gpii.net/mailman/listinfo/architecture
