When generating a pdx type for a JSON document, couldn't we sort the field names from the JSON document so that field order would not generate different pdx types? Also, when choosing a pdx field type, if we always picked a "wider" type, it would reduce the number of types generated because of differing field types.
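A rough sketch of what I mean, in Python just for illustration. The type names, the `type_signature` helper, and the assumption that inference currently picks the narrowest fitting type are all hypothetical and are not taken from the actual pdx code:

```python
import json

# Hypothetical integer families, narrowest to widest. Illustrative only;
# these are not the real pdx field types.
INT_RANGES = [
    ("byte", -2**7, 2**7 - 1),
    ("short", -2**15, 2**15 - 1),
    ("int", -2**31, 2**31 - 1),
    ("long", -2**63, 2**63 - 1),
]


def narrow_type(value):
    """Pick the smallest type that fits this particular value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        for name, lo, hi in INT_RANGES:
            if lo <= value <= hi:
                return name
        return "object"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    return "object"  # null, nested objects and arrays are not broken down here


def wide_type(value):
    """Always pick the widest type of the family, regardless of the value."""
    if isinstance(value, int) and not isinstance(value, bool):
        return "long" if -2**63 <= value <= 2**63 - 1 else "object"
    return narrow_type(value)


def type_signature(json_text, sort_fields=True, widen=True):
    """Return the (field name, field type) tuple that would identify a type.

    Assumes the top-level JSON value is an object.
    """
    doc = json.loads(json_text)  # a dict that keeps the document's field order
    pick = wide_type if widen else narrow_type
    fields = [(name, pick(value)) for name, value in doc.items()]
    if sort_fields:
        fields.sort()  # canonical order, independent of the document's order
    return tuple(fields)


a = '{"name": "Ann", "age": 30}'
b = '{"age": 70000, "name": "Bob"}'

# Same fields in a different order and with a different integer magnitude
# collapse into one signature when sorting and widening are both on.
print(type_signature(a) == type_signature(b))
print(type_signature(a, sort_fields=False, widen=False)
      == type_signature(b, sort_fields=False, widen=False))
```

With sorting and widening both enabled the two documents map to the same signature; with both disabled they produce two different ones, which is the kind of duplication I would like to avoid.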
On Thu, Dec 22, 2016 at 10:02 AM, Udo Kohlmeyer <ukohlme...@pivotal.io> wrote:

> Hi there Dan,
>
> You are correct, the thought is there to add a flag to the registry to
> indicate that a definition is custom and thus should not conflict with the
> existing ids. Even if the types were to be stored with the current Pdx
> type definitions, upon loading/registration of the custom type definitions,
> any conflict will be reported and the custom set will not be registered
> until all issues are addressed.
>
> I also had the opinion of the "if they can provide me a typeId, then
> surely they can provide me with a fully populated JSON document".
> Referencing the example document from the wiki, a user can be created with
> just a first name and surname. It is not required to provide
> currentAddress, previousAddresses, dob, etc. Whilst one could force the
> client to provide all fields in the JSON document, it is not always
> possible nor feasible to do so. In the POJO world we have a structured
> data definition and the generation of a type definition is simple. This is
> because, from a serialization perspective, we always make sure that all
> fields are serialized. BUT if we were to change the serialization, i.e.
> not serialize a field because it is null, the type definition behavior
> would be exactly the same as JSON. Only, in this case, because we changed
> the type definition for the 'com.demo.User' object (at runtime), the
> deserialization step for previous versions would fail.
>
> I believe that if we were able to describe WHAT the structure of a JSON
> document should be and define the type according to that definition, we
> could improve performance (as we don't have to determine type definitions
> for every JSON document), be more flexible in consuming JSON documents
> that are only partially populated and, lastly, not potentially cause a
> vast number of JSON-based type definitions to be generated.
>
> In addition to just the JSON benefits, having a formal way of describing
> the type definitions will allow us to better maintain the current
> registered type definitions. In addition to this, it would allow
> customers/clients to create type definitions, by hand, if they were to
> lose their type registry.
>
> As a final thought, the addition of the external type registration process
> is not meant to replace the current behavior, but rather to enhance its
> capabilities. If no external types have been defined OR the client does
> not provide a '@typeId' tag, the current JSON type definition behavior
> will stay the same.
>
> --Udo
>
>
> On 12/21/16 18:20, Dan Smith wrote:
>
>> I'm assuming the type ids here are a different set than the type ids used
>> with regular PDX serialization so they won't conflict if the pdx registry
>> assigns 1 to some class and a user puts @typeId: 1 in their json?
>>
>> I'm concerned that this won't really address the type explosion issue.
>> Users that are able to go to the effort of adding these typeIds to all of
>> their json are probably users that can produce consistently formatted json
>> in the first place. Users that have inconsistently formatted json are
>> probably not going to want or be able to add these type ids.
>>
>> It might be better for us to pursue a way to store arbitrary documents
>> that are self describing. Our current approach for json documents assumes
>> that the documents are all consistently formatted. We infer a schema for
>> their documents and store the field names in the type registry and the
>> field values in the serialized data. If we give people the option to store
>> and query self describing values, then users with inconsistent json could
>> just use that option and pay the extra storage cost.
>>
>> -Dan
>>
>> On Tue, Dec 20, 2016 at 4:53 PM, Udo Kohlmeyer <ukohlme...@gmail.com>
>> wrote:
>>
>>> Hey there,
>>>
>>> I've just completed a new proposal on the wiki for a new mechanism that
>>> could be used to define a type definition for an object.
>>> https://cwiki.apache.org/confluence/display/GEODE/Custom+External+Type+Definition+Proposal+for+JSON
>>>
>>> Primarily the new type definition proposal will hopefully help with the
>>> "structuring" of JSON document definitions in a manner that will allow
>>> users to submit JSON documents for data types without the need to provide
>>> every field of the whole domain object type.
>>>
>>> Please review and comment as required.
>>>
>>> --Udo