Re: New proposal for type definitions

John Blum Tue, 03 Jan 2017 11:51:25 -0800

+1 to what Jacob said.

On Tue, Jan 3, 2017 at 11:39 AM, Jacob Barrett <jbarr...@pivotal.io> wrote:


> A little late to the game here, but I want to go back to Dan's idea of
> storing JSON or other self-describing objects as first-class objects
> in Geode. As it stands right now, an entry can be a POJO, a Java serialized
> object, or PDX, so why not other types? It seems perfectly reasonable to
> allow first-class storage of JSON (or BSON) without sacrificing any features
> that we support today with PDX or POJO. In fact, one could even implement a
> version of PdxInstance that wraps a JSON document.
>
> It seems to me that any attempt to add structure to something without
> structure is going to give us way more heartache than trying to add support
> for truly unstructured data in Geode. We are part of the way there with PDX
> since get/set field operations hide any real formal structure.
>
> In my last deep dive into PDX over a year ago it seemed very doable to me
> at the time to add support for something like this.
>
> -Jake
>
>
> On Wed, Dec 28, 2016 at 9:52 AM Udo Kohlmeyer <ukohlme...@pivotal.io>
> wrote:
>
> > You are correct here. Ordering the fields would be a simple solution IF
> > the only problem was that fields were incorrectly ordered. In most cases,
> > not all fields are provided, which causes an explosion of generated type
> > definitions.
> >
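A minimal Python sketch of the point above, under the simplifying assumption that a type definition is identified only by its field names (the real PDX type registry also tracks field types and is not modeled here). The documents and helper names are hypothetical. Sorting collapses reorderings into one type, but documents with different subsets of fields still produce distinct types:

```python
def type_key_ordered(doc):
    """Type identity that depends on the order in which fields appear."""
    return tuple(doc.keys())


def type_key_sorted(doc):
    """Type identity with field names sorted, so field order no longer matters."""
    return tuple(sorted(doc.keys()))


# Hypothetical user documents: the same logical type, sent with
# varying field order and varying optional fields.
docs = [
    {"firstName": "A", "surname": "B"},
    {"surname": "B", "firstName": "A"},
    {"firstName": "A", "surname": "B", "dob": "1970-01-01"},
    {"dob": "1970-01-01", "surname": "B", "firstName": "A"},
    {"firstName": "A", "surname": "B", "currentAddress": "x"},
]

ordered_types = {type_key_ordered(d) for d in docs}
sorted_types = {type_key_sorted(d) for d in docs}

print(len(ordered_types), "types when field order matters")
print(len(sorted_types), "types after sorting field names")
```

With n optional fields, the sorted approach can still produce up to 2**n distinct types, one per subset of fields that happens to be present.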
> >
> > On 12/22/16 16:11, Darrel Schneider wrote:
> > > When generating a pdx type for a JSON document, couldn't we sort the
> > > field names from the JSON document so that field order would not
> > > generate different pdx types?
> > > Also, when choosing a pdx field type, if we always picked a "wider"
> > > type then it would reduce the number of types generated because of
> > > different field types.
> > >
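The "wider" type idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the type labels ("int", "long", "double") and the helper functions are hypothetical and do not represent Geode's actual JSON-to-PDX type selection.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def exact_type(value):
    """Pick the narrowest illustrative type that fits the value."""
    if isinstance(value, bool):  # bool is a subclass of int, so check it first
        return "boolean"
    if isinstance(value, int):
        return "int" if INT32_MIN <= value <= INT32_MAX else "long"
    if isinstance(value, float):
        return "double"
    return "string"


def wider_type(value):
    """Always pick the widest integer type, regardless of the actual value."""
    narrow = exact_type(value)
    return {"int": "long"}.get(narrow, narrow)


def type_key(doc, pick):
    """Type identity: sorted (field name, field type) pairs."""
    return tuple(sorted((name, pick(value)) for name, value in doc.items()))


# Two hypothetical documents whose "id" values need different integer widths.
docs = [{"id": 7}, {"id": 5_000_000_000}]

exact_types = {type_key(d, exact_type) for d in docs}
wider_types = {type_key(d, wider_type) for d in docs}

print(len(exact_types), "types with narrowest-fit field types")
print(len(wider_types), "types when the wider type is always chosen")
```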
> > >
> > > On Thu, Dec 22, 2016 at 10:02 AM, Udo Kohlmeyer <ukohlme...@pivotal.io>
> > > wrote:
> > >
> > >> Hi there Dan,
> > >>
> > >> You are correct, the thought is to add a flag to the registry to
> > >> indicate that a definition is custom and thus should not conflict with
> > >> the existing ids. Even if the types were to be stored with the current
> > >> Pdx type definitions, upon loading/registration of the custom type
> > >> definitions, any conflict will be reported and the custom set will not
> > >> be registered until all issues are addressed.
> > >>
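The all-or-nothing registration described above might look roughly like this in Python. It is a sketch only: the function name, the dict-based registry, and the example class names other than 'com.demo.User' are assumptions made for illustration, not part of the proposal or of Geode.

```python
def register_custom_types(existing, custom):
    """Return a merged registry, or raise if any custom type id is taken.

    Registration is all-or-nothing: if a single id conflicts, none of the
    custom definitions are registered and `existing` is left unchanged.
    """
    conflicts = sorted(set(existing) & set(custom))
    if conflicts:
        raise ValueError(f"conflicting type ids: {conflicts}")
    merged = dict(existing)
    merged.update(custom)
    return merged


# Hypothetical registries keyed by type id.
existing = {1: "com.demo.Order"}
custom = {1: "com.demo.User", 2: "com.demo.Address"}

try:
    register_custom_types(existing, custom)
    rejected = False
except ValueError as err:
    rejected = True
    print("custom set not registered:", err)
```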
> > >> I also had the opinion of "if they can provide me a typeId, then
> > >> surely they can provide me with a fully populated JSON document".
> > >> Referencing the example document from the wiki, a user can be created
> > >> with just a first name and surname. It is not required to provide
> > >> currentAddress, previousAddresses, dob, etc. Whilst one could force the
> > >> client to provide all fields in the JSON document, it is not always
> > >> possible or feasible to do so. In the POJO world we have a structured
> > >> data definition and the generation of a type definition is simple. This
> > >> is because, from a serialization perspective, we always make sure that
> > >> all fields are serialized. BUT if we were to change the serialization,
> > >> i.e. not serialize a field because it is null, the type definition
> > >> behavior would be exactly the same as JSON. Only, in this case, because
> > >> we changed the type definition for the 'com.demo.User' object (at
> > >> runtime), the deserialization step for previous versions would fail.
> > >>
> > >> I believe that if we were able to describe WHAT the structure of a
> > >> JSON document should be and define the type according to that
> > >> definition, we could improve performance (as we don't have to determine
> > >> type definitions for every JSON document), be more flexible in consuming
> > >> JSON documents that are only partially populated and, lastly, avoid
> > >> potentially generating a vast number of JSON-based type definitions.
> > >>
> > >> In addition to just the JSON benefits, having a formal way of
> > >> describing the type definitions will allow us to better maintain the
> > >> currently registered type definitions. In addition to this, it would
> > >> allow customers/clients to create type definitions, by hand, if they
> > >> were to have lost their type registry.
> > >>
> > >> As a final thought, the addition of the external type registration
> > >> process is not meant to replace the current behavior, but rather to
> > >> enhance its capabilities. If no external types have been defined OR the
> > >> client does not provide a '@typeId' tag, the current JSON type
> > >> definition behavior will stay the same.
> > >>
> > >> --Udo
> > >>
> > >>
> > >> On 12/21/16 18:20, Dan Smith wrote:
> > >>
> > >>> I'm assuming the type ids here are a different set than the type ids
> > >>> used with regular PDX serialization so they won't conflict if the pdx
> > >>> registry assigns 1 to some class and a user puts @typeId: 1 in their
> > >>> json?
> > >>>
> > >>> I'm concerned that this won't really address the type explosion
> > >>> issue. Users that are able to go to the effort of adding these typeIds
> > >>> to all of their json are probably users that can produce consistently
> > >>> formatted json in the first place. Users that have inconsistently
> > >>> formatted json are probably not going to want or be able to add these
> > >>> type ids.
> > >>>
> > >>> It might be better for us to pursue a way to store arbitrary
> > >>> documents that are self-describing. Our current approach for json
> > >>> documents assumes that the documents are all consistently formatted.
> > >>> We infer a schema for their documents, then store the field names in
> > >>> the type registry and the field values in the serialized data. If we
> > >>> give people the option to store and query self-describing values, then
> > >>> users with inconsistent json could just use that option and pay the
> > >>> extra storage cost.
> > >>>
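The storage trade-off described above can be illustrated with a small Python sketch. It assumes a toy registry that maps sorted field names to a numeric type id and uses JSON text as a stand-in for the serialized form; none of this reflects the actual PDX wire format, and all names are hypothetical.

```python
import json

# Hypothetical type registry: tuple of field names -> type id.
registry = {}


def encode_with_registry(doc):
    """Store field names once in the registry; serialize only id and values."""
    names = tuple(sorted(doc))
    type_id = registry.setdefault(names, len(registry) + 1)
    return json.dumps([type_id] + [doc[n] for n in names])


def encode_self_describing(doc):
    """Serialize field names together with every value."""
    return json.dumps(doc)


doc = {"firstName": "A", "surname": "B"}
compact = encode_with_registry(doc)
verbose = encode_self_describing(doc)

print(compact, len(compact), "characters")
print(verbose, len(verbose), "characters")
```

The self-describing form costs more per entry because the field names travel with each value, but it never adds a new registry entry, however irregular the documents are.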
> > >>> -Dan
> > >>>
> > >>> On Tue, Dec 20, 2016 at 4:53 PM, Udo Kohlmeyer <ukohlme...@gmail.com>
> > >>> wrote:
> > >>>
> > >>>> Hey there,
> > >>>> I've just completed a new proposal on the wiki for a new mechanism
> > >>>> that could be used to define a type definition for an object.
> > >>>> https://cwiki.apache.org/confluence/display/GEODE/Custom+External+Type+Definition+Proposal+for+JSON
> > >>>>
> > >>>> Primarily the new type definition proposal will hopefully help with
> > >>>> the "structuring" of JSON document definitions in a manner that will
> > >>>> allow users to submit JSON documents for data types without the need
> > >>>> to provide every field of the whole domain object type.
> > >>>>
> > >>>> Please review and comment as required.
> > >>>>
> > >>>> --Udo
> > >>>>
> > >>>>
> > >>>>
> >
> >
>



-- 
-John
john.blum10101 (skype)
