That's absolutely a good idea. One thing is describing the JSON structure,
which I guess would not be different from other features (although more
elaborate than average). But the moment you mention this, I start thinking
about also querying "into" the JSON, or any other map structure. Many of
our NoSQL-like modules also have more advanced data structures - for instance
MongoDB, CouchDB, ElasticSearch and so on.

Querying into those data structures could be considered a separate feature
though. I mean - that could be useful regardless of having metadata about
the data structure. Even today I would love to somehow query with a
function or something like that where I could provide a kind of path (XPath
like or so) into the nested structure. Ankit, if this is also what you
would like, then I suggest starting a separate mail thread on it (unless
you feel it is tied very closely to this metadata thing).
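
To make the path idea a bit more concrete, here is a minimal sketch. It
assumes the nested value is available as plain java.util.Map and
java.util.List objects and that the path is a simple dot-separated string
with numeric list indexes. The class name NestedPathResolver and the method
resolve are hypothetical, made up for illustration; nothing like this is
claimed to exist in MetaModel today.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of MetaModel: walks a dot-separated path
// through nested Map and List values.
public class NestedPathResolver {

    /**
     * Resolves a path such as "address.lines.1" against nested Map and List
     * values. Returns null as soon as a step cannot be resolved.
     */
    public static Object resolve(Object root, String path) {
        Object current = root;
        for (String key : path.split("\\.")) {
            if (current instanceof Map) {
                // Map step: the path element is used as the key.
                current = ((Map<?, ?>) current).get(key);
            } else if (current instanceof List) {
                // List step: the path element must be a valid index.
                List<?> list = (List<?>) current;
                int index;
                try {
                    index = Integer.parseInt(key);
                } catch (NumberFormatException e) {
                    return null;
                }
                if (index < 0 || index >= list.size()) {
                    return null;
                }
                current = list.get(index);
            } else {
                // Scalar or null reached before the path was consumed.
                return null;
            }
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> address = Map.of(
                "city", "Copenhagen",
                "lines", List.of("Main Street 1", "2nd floor"));
        Map<String, Object> row = Map.of("name", "Jane", "address", address);

        System.out.println(resolve(row, "address.city"));    // Copenhagen
        System.out.println(resolve(row, "address.lines.1")); // 2nd floor
        System.out.println(resolve(row, "address.zip"));     // null
    }
}
```

Returning null for an unresolvable step keeps the sketch short; a real
design might rather distinguish "missing" from "present but null".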

2015-07-15 18:55 GMT+02:00 Ankit Kumar <[email protected]>:

> Hi Kasper,
>
> Sounds like a nice idea.
>
> Would this also allow defining complex metadata on a single column? For
> example, if the content of the column is a JSON object, would it be possible
> to define the structure of the JSON, and which fields within the JSON
> structure have which behavioural aspect (metadata as well)? Maybe an
> extreme case for later, but I can already see the benefit of this feature.
>
> Regards
> Ankit
>
> On Wed, Jul 15, 2015 at 4:58 PM, Kasper Sørensen <[email protected]> wrote:
>
> > Glad that you like the idea :-) And yes, I also would like to involve as
> > many as possible in building it.
> >
> > My high level design ideas:
> >
> > We would leave the existing DataContext API as-is to the extent possible. I
> > would see this as a service that is connected to the DataContext, but since
> > the "new" metadata and the existing metadata might differ (as Alberto
> > mentioned before) it is also not bound directly to the schema, table,
> > column etc. Rather you should be able to resolve things using either
> > Schema, Table and Column objects, or simply by paths (strings).
> >
> > Here are some example method calls and class names that I could imagine. I
> > haven't thought long about the naming so it's just an "idea generating"
> > draft. I imagine the metadata "features" to be pluggable, kind of like
> > annotations on Java constructs:
> >
> > DataContext dc = ...
> > MetadataService svc = ...
> >
> > Table table = dc.getDefaultSchema().getTable(0);
> >
> > List<ColumnGroupMetadata> groups = svc.getColumnGroupMetadata(table);
> > boolean isAddressGroup =
> >     groups.get(0).hasFeatureOfType(AddressColumnGroupFeature.class);
> > AddressColumnGroupFeature addressFeature =
> >     groups.get(0).getFeatureOfType(AddressColumnGroupFeature.class);
> > ColumnMetadata countryColumnMetadata = addressFeature.getCountryColumn();
> > List<ColumnMetadata> addressLineColumnMetadataList =
> >     addressFeature.getAddressLineColumns();
> >
> > Column column = table.getColumn(0);
> >
> > ColumnMetadata cm = svc.getColumnMetadata(column);
> > List<ColumnFeature> features = cm.getFeatures();
> > boolean isNominal = cm.hasFeatureOfType(NominalColumnFeature.class);
> > NominalColumnFeature ncf =
> >     cm.getFeatureOfType(NominalColumnFeature.class);
> >
> >
> > BR,
> > Kasper
> >
> >
> > 2015-07-15 13:07 GMT+02:00 Alberto Rodriguez <[email protected]>:
> >
> > > That's pretty cool. Sorry, but I was not taking into account the powerful
> > > queries that you might bring into the equation with this stuff. It all
> > > makes sense now.
> > >
> > > It would definitely be a big feature that might be divided into smaller
> > > tasks. If we go ahead with this stuff I really would like to help with the
> > > implementation.
> > >
> > > Kind regards,
> > >
> > > Alberto
> > >
> > > 2015-07-15 12:41 GMT+02:00 Kasper Sørensen <[email protected]>:
> > >
> > > > That's absolutely true. It's not that I want to stop discovering what we
> > > > can, but I was more thinking of also adding a mechanism to plug in your
> > > > own metadata. I guess it's pretty rare that a database itself offers a
> > > > "domain oriented" metadata system where I could tell it that "this field
> > > > is a zip code, and together with fields X, Y and Z it forms a single
> > > > address".
> > > >
> > > > From a querying perspective, what I would love to achieve is that I
> > > > could express something like the following:
> > > >
> > > > "Query all the addresses in the database".
> > > > "Do a SUM, AVG, MAX and MIN on all the ordinal-scale numbers in the
> > > > database".
> > > >
> > > > It will also help a lot in generating templates for data integration. If
> > > > I am trying to move data from one table to another then I can probably do
> > > > a lot of automatic mapping based on the metadata. Same goes for
> > > > reporting, I guess, and stuff like that.
> > > >
> > > > Best regards,
> > > > Kasper
> > > >
> > > > 2015-07-15 12:06 GMT+02:00 Alberto Rodriguez <[email protected]>:
> > > >
> > > > > OK, so you are not thinking of discovering more metadata "on-the-fly";
> > > > > your approach is to statically define metadata for the datasource, load
> > > > > it and mix it in with the existing metadata, right?
> > > > >
> > > > > IMHO this approach adds a strong dependency between the data itself
> > > > > and the "external" metadata. Correct me if I'm wrong or don't fully
> > > > > understand your proposal, but if the datastore changes (one column is
> > > > > deleted or a new column is added) the metadata will become obsolete.
> > > > >
> > > > > Kind regards,
> > > > >
> > > > > Alberto
> > > > >
> > > > > 2015-07-13 19:26 GMT+02:00 Kasper Sørensen <[email protected]>:
> > > > >
> > > > > > I was thinking of having something like pluggable annotations or
> > > > > > features that could be added to tables, columns or groups of columns.
> > > > > > Maybe also to other entities. But since a lot of this is not available
> > > > > > as a thing that can be explored in the datastore itself, I guess it
> > > > > > would need to be stored externally.
> > > > > >
> > > > > > Examples of features that I could imagine:
> > > > > >
> > > > > > Data type features: NominalScale, OrdinalScale, ...
> > > > > > Data conversion features: IntegerAsString, DateAsString,
> > > > > > TimestampAsLong ...
> > > > > > Domain features: FirstName, LastName, AddressLine, AddressCity,
> > > > > > AddressCountry, DateYear, DateMonth ...
> > > > > > And groupings of columns, also as domain-like features: Name (composed
> > > > > > of e.g. first and last name), Address (composed of multiple address
> > > > > > fields), Date (composed of year, month etc.)
> > > > > >
> > > > > > I would like to store them so that I can save and load them, saving
> > > > > > the developer from the work of restoring all the metadata again and
> > > > > > again. Is that not sensible?
> > > > > >
> > > > > > Best regards,
> > > > > > Kasper
> > > > > >
> > > > > >
> > > > > > 2015-07-13 9:55 GMT+02:00 Alberto Rodriguez <[email protected]>:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > We are also facing similar issues, so I completely agree with this
> > > > > > > feature. In fact, we recently added to our service layer a new field
> > > > > > > for our metadata called "format". We needed this field to specify
> > > > > > > different format types for the dates returned by our datastores.
> > > > > > >
> > > > > > > However, I'm not really sure how to implement this feature... I guess
> > > > > > > we should keep getting the metadata "core" from our datasources, but
> > > > > > > what about the new metadata?
> > > > > > >
> > > > > > >    - How to fill it out? Will the integrator of MM provide functions
> > > > > > >      to define when an element is going to be "x" and when it is going
> > > > > > >      to be "y"? (thinking here of providing lambda functions)
> > > > > > >    - Do we really need a metadata store?
> > > > > > >
> > > > > > > Kind regards,
> > > > > > >
> > > > > > >
> > > > > > > 2015-07-10 12:51 GMT+02:00 Kasper Sørensen <[email protected]>:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > All the time I see more and more need for us to add metadata to our
> > > > > > > > MetaModel based connectors. That could be for instance metadata
> > > > > > > > about scale (nominal, ordinal etc.) so that we can automate some
> > > > > > > > stats collection etc. or it could be more "meaning" oriented
> > > > > > > > features to describe e.g. "This is a first name" or "This is a city"
> > > > > > > > or "These two fields (first and last name) are together defining a
> > > > > > > > name of a person".
> > > > > > > >
> > > > > > > > We have such mechanisms in many places at our application levels,
> > > > > > > > but not in the core framework of MetaModel, and that's a pity
> > > > > > > > because it makes it harder for us to share.
> > > > > > > >
> > > > > > > > So I'm thinking of adding such a layer to the metadata of MetaModel.
> > > > > > > > But one thing that's difficult is then about representing that
> > > > > > > > metadata in some "metadata store" which isn't necessarily the same
> > > > > > > > as the data source itself. It could be an XML file or it could be a
> > > > > > > > complete metadata database. And I think that this metadata would be
> > > > > > > > mutable by the integrator of MetaModel because it is rarely fully
> > > > > > > > revealed by the data source itself.
> > > > > > > >
> > > > > > > > What do you think? Nice feature or?
> > > > > > > >
> > > > > > > > Kasper
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
