+1 to what Jacob said.

On Tue, Jan 3, 2017 at 11:39 AM, Jacob Barrett <jbarr...@pivotal.io> wrote:
> A little late to the game here but I want to go back to Dan's idea of
> storing the JSON or other self describing objects as a first class object
> in Geode. As it stands right now an entry can be a POJO, Java serialized
> object, or PDX, so why not other types? Seems perfectly reasonable to allow
> first class storage of JSON (or BSON) without sacrificing any features that
> we support today with PDX or POJO. In fact one could even implement a
> version of PdxInstance that wraps a JSON document.
>
> It seems to me that any attempt to add structure to something without
> structure is going to give us way more heartache than trying to add support
> for truly unstructured data in Geode. We are part of the way there with PDX
> since get/set field operations hide any real formal structure.
>
> In my last deep dive into PDX over a year ago it seemed very doable to me
> at the time to add support for something like this.
>
> -Jake
>
>
> On Wed, Dec 28, 2016 at 9:52 AM Udo Kohlmeyer <ukohlme...@pivotal.io>
> wrote:
>
> > You are correct here. Ordering the fields would be a simple solution IF
> > the only problem was that fields were incorrectly ordered. In most cases
> > not all fields are provided thus causing an explosion of type
> > definitions that would be generated.
> >
> >
> > On 12/22/16 16:11, Darrel Schneider wrote:
> > > When generating a pdx type for a JSON document couldn't we sort the
> > > field names from the JSON document so that field order would not
> > > generate different pdx types?
> > > Also when choosing a pdx field type if we always picked a "wider" type
> > > then it would reduce the number of types generated because of different
> > > field types.
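
A minimal sketch of the two points above: sorting field names removes ordering differences, but documents that carry different subsets of fields still each produce their own type. The class and method names are hypothetical and this is not Geode's PDX type registry code; a "signature" here is simply the sorted, comma-joined field names of one document.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class FieldSignatureSketch {

    // Hypothetical helper: builds a type signature from field names only,
    // sorted so that the order of fields in the document does not matter.
    static String signature(List<String> fieldNames) {
        return String.join(",", new TreeSet<>(fieldNames));
    }

    // Counts how many distinct signatures a batch of documents produces.
    static int distinctSignatures(List<List<String>> documents) {
        Set<String> seen = new HashSet<>();
        for (List<String> fields : documents) {
            seen.add(signature(fields));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // Same fields, different order: sorting collapses these to one signature.
        List<List<String>> reordered = Arrays.asList(
            Arrays.asList("firstName", "surname", "dob"),
            Arrays.asList("dob", "surname", "firstName"));
        System.out.println(distinctSignatures(reordered)); // prints 1

        // Different subsets of fields: each subset still gets its own signature.
        List<List<String>> partial = Arrays.asList(
            Arrays.asList("firstName", "surname"),
            Arrays.asList("firstName", "surname", "dob"),
            Arrays.asList("firstName", "surname", "currentAddress"));
        System.out.println(distinctSignatures(partial)); // prints 3
    }
}
```

With n optional fields there can be up to 2^n such subsets, which is the growth that sorting alone does not address.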
> > >
> > >
> > > On Thu, Dec 22, 2016 at 10:02 AM, Udo Kohlmeyer <ukohlme...@pivotal.io>
> > > wrote:
> > >
> > >> Hi there Dan,
> > >>
> > >> You are correct, the thought is there to add a flag to the registry to
> > >> indicate that a definition is custom and thus should not conflict with
> > >> the existing ids. Even if the types were to be stored with the current
> > >> Pdx type definitions, upon loading/registration of the custom type
> > >> definitions, any conflict will be reported and the custom set will not
> > >> be registered until all issues are addressed.
> > >>
> > >> I also had the opinion of the "if they can provide me a typeId, then
> > >> surely they can provide me with a fully populated JSON document".
> > >> Referencing the example document from the wiki, a user can be created
> > >> with just a first and surname. It is not required to provide
> > >> currentAddress, previousAddresses, dob, etc. Whilst one could force the
> > >> client to provide all fields in the JSON document, it is not always
> > >> possible nor feasible to do so. In the POJO world we have a structured
> > >> data definition and the generation of a type definition is simple. This
> > >> is because from a serialization perspective we always make sure that
> > >> all fields are serialized. BUT if we were to change the serialization,
> > >> i.e. not serialize a field because it is null, the type definition
> > >> behavior would be exactly the same as JSON. Only, in this case, because
> > >> we changed the type definition for the 'com.demo.User' object (at
> > >> runtime) the deserialization step for previous versions would fail.
> > >>
> > >> I believe that if we were to be able to describe WHAT the structure of
> > >> a JSON document should be and define the type according to that
> > >> definition, we could improve performance (as we don't have to determine
> > >> type definitions for every JSON document), be more flexible in
> > >> consuming JSON documents that are only partially populated and lastly
> > >> not potentially cause a vast amount of JSON-based type definitions to
> > >> be generated.
> > >>
> > >> In addition to just the JSON benefits, having a formal way of
> > >> describing the type definitions will allow us to better maintain the
> > >> current registered type definitions. In addition to this, it would
> > >> allow customers/clients to create type definitions, by hand, if they
> > >> were to have lost their type registry.
> > >>
> > >> As a final thought, the addition of the external type registration
> > >> process is not meant to replace the current behavior, but rather to
> > >> enhance its capabilities. If no external types have been defined OR the
> > >> client does not provide a '@typeId' tag, the current JSON type
> > >> definition behavior will stay the same.
> > >>
> > >> --Udo
> > >>
> > >>
> > >> On 12/21/16 18:20, Dan Smith wrote:
> > >>
> > >>> I'm assuming the type ids here are a different set than the type ids
> > >>> used with regular PDX serialization so they won't conflict if the pdx
> > >>> registry assigns 1 to some class and a user puts @typeId: 1 in their
> > >>> json?
> > >>>
> > >>> I'm concerned that this won't really address the type explosion issue.
> > >>> Users that are able to go to the effort of adding these typeIds to all
> > >>> of their json are probably users that can produce consistently
> > >>> formatted json in the first place. Users that have inconsistently
> > >>> formatted json are probably not going to want or be able to add these
> > >>> type ids.
> > >>>
> > >>> It might be better for us to pursue a way to store arbitrary documents
> > >>> that are self describing. Our current approach for json documents is
> > >>> assuming that the documents are all consistently formatted. We infer a
> > >>> schema for their documents and store the field names in the type
> > >>> registry and the field values in the serialized data. If we give
> > >>> people the option to store and query self describing values, then
> > >>> users with inconsistent json could just use that option and pay the
> > >>> extra storage cost.
> > >>>
> > >>> -Dan
> > >>>
> > >>> On Tue, Dec 20, 2016 at 4:53 PM, Udo Kohlmeyer <ukohlme...@gmail.com>
> > >>> wrote:
> > >>>
> > >>>> Hey there,
> > >>>> I've just completed a new proposal on the wiki for a new mechanism
> > >>>> that could be used to define a type definition for an object.
> > >>>> https://cwiki.apache.org/confluence/display/GEODE/Custom+External+Type+Definition+Proposal+for+JSON
> > >>>>
> > >>>> Primarily the new type definition proposal will hopefully help with
> > >>>> the "structuring" of JSON document definitions in a manner that will
> > >>>> allow users to submit JSON documents for data types without the need
> > >>>> to provide every field of the whole domain object type.
> > >>>>
> > >>>> Please review and comment as required.
> > >>>>
> > >>>> --Udo

--
-John
john.blum10101 (skype)