That's absolutely a good idea. One thing is describing the JSON structure, which I guess would be no different from other features (although more elaborate than average). But the moment you mention this, I start thinking about also querying "into" the JSON, or any other map structure. Also, many of our NoSQL-like modules have more advanced data structures - for instance MongoDB, CouchDB, ElasticSearch and so on.
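To make the "query into" idea a bit more concrete, here is a rough sketch of resolving a dotted path against nested maps and lists. To be clear, this is only an illustration: `NestedPathResolver`, `resolve` and the dotted path syntax are made-up names for this mail and not part of MetaModel's API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not an existing MetaModel class. */
final class NestedPathResolver {

    private NestedPathResolver() {
    }

    /**
     * Resolves a dot-separated path such as "address.city" or "tags.1"
     * against nested Map and List values.
     * Returns null as soon as a step cannot be followed.
     */
    static Object resolve(Object root, String path) {
        Object current = root;
        for (String step : path.split("\\.")) {
            if (current instanceof Map) {
                // Map step: the path segment is used as the key
                current = ((Map<?, ?>) current).get(step);
            } else if (current instanceof List) {
                // List step: the path segment must be a valid index
                List<?> list = (List<?>) current;
                int index;
                try {
                    index = Integer.parseInt(step);
                } catch (NumberFormatException e) {
                    return null;
                }
                if (index < 0 || index >= list.size()) {
                    return null;
                }
                current = list.get(index);
            } else {
                // Scalar or null: nothing left to descend into
                return null;
            }
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> address = new HashMap<>();
        address.put("city", "Copenhagen");

        Map<String, Object> document = new HashMap<>();
        document.put("address", address);
        document.put("tags", Arrays.asList("customer", "vip"));

        System.out.println(resolve(document, "address.city"));
        System.out.println(resolve(document, "tags.1"));
        System.out.println(resolve(document, "address.zip"));
    }
}
```

A real feature would of course need a proper path syntax (escaping dots in keys, wildcards and so on); the point is only that the lookup itself does not depend on having metadata about the structure.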
Querying into those data structures could be considered a separate feature though. I mean - that could be useful regardless of having metadata about the data structure. Even today I would love to somehow query with a function or something like that where I could provide a kind of path (XPath like or so) into the nested structure. Ankit, if this is also what you would like, then I suggest starting a separate mail thread on it (unless you feel it is tied very closely to this metadata thing).

2015-07-15 18:55 GMT+02:00 Ankit Kumar <[email protected]>:

> Hi Kasper,
>
> Sounds like a nice idea.
>
> Would this also allow to define a complex metadata on a single column. For
> e.g. if the content of the column is a JSON object, would it be possible to
> define what is the structure of the JSON, what specific fields within the
> JSON structure have what behavioural aspect (metadata as well)? May be an
> extreme case for later, but I can already see the benefit of this feature.
>
> Regards
> Ankit
>
> On Wed, Jul 15, 2015 at 4:58 PM, Kasper Sørensen <[email protected]> wrote:
>
> > Glad that you like the idea :-) And yes, I also would like to involve as
> > many as possible in building it.
> >
> > My high level design ideas:
> >
> > We would leave the existing DataContext API as-is to the extent possible. I
> > would see this as a service that is connected to the DataContext but since
> > the "new" metadata and the existing metadata might differ (as Alberto
> > mentioned before) it is also not bound directly to the schema, table,
> > column etc. Rather you should be able to resolve things using either
> > Schema, Table and Column objects, or simply by paths (strings).
> >
> > Here's some example method calls and class names that I could imagine. I
> > haven't thought long about the naming so it's just a "idea generating"
> > draft. I imagine the metadata "features" to be pluggable kinda like
> > annotations on java constructs:
> >
> > DataContext dc = ...
> > MetadataService svc = ...
> >
> > Table table = dc.getDefaultSchema().getTable(0);
> >
> > List<ColumnGroupMetadata> groups = svc.getColumnGroupMetadata(table);
> > boolean isAddressGroup =
> > groups.get(0).hasFeatureOfType(AddressColumnGroupFeature.class);
> > AddressColumnGroupFeature addressFeature =
> > groups.get(0).getFeatureOfType(AddressColumnGroupFeature.class);
> > ColumnMetadata countryColumnMetadata = addressFeature.getCountryColumn();
> > List<ColumnMetadata> addressLineColumnMetadataList =
> > addressFeature.getAddressLineColumns();
> >
> > Column column = table.getColumn(0);
> >
> > ColumnMetadata cm = svc.getColumnMetadata(column);
> > List<ColumnFeature> features = cm.getFeatures();
> > boolean isNominal = cm.hasFeatureOfType(NominalColumnFeature.class);
> > NominalColumnFeature ncf = cm.getFeatureOfType(NominalColumnFeature.class);
> >
> > BR,
> > Kasper
> >
> > 2015-07-15 13:07 GMT+02:00 Alberto Rodriguez <[email protected]>:
> >
> > > That's pretty cool. Sorry but I was not taking into account the powerful
> > > queries that you might bring into the equation with this stuff. It all
> > > makes sense now.
> > >
> > > It would definitely be a big feature that might be divided into smaller
> > > tasks. If we go ahead with this stuff I really would like to help with
> > > the implementation.
> > >
> > > Kind regards,
> > >
> > > Alberto
> > >
> > > 2015-07-15 12:41 GMT+02:00 Kasper Sørensen <[email protected]>:
> > >
> > > > That's absolutely true. It's not that I want to stop discovering what we
> > > > can, but I was more thinking of also adding a mechanism to plug your own
> > > > metadata. I guess it's pretty rare that a database itself offers a
> > > > "domain oriented" metadata system where I could tell it that "this field
> > > > is a zip code, and together with field X, Y and Z it forms a single
> > > > address".
> > > >
> > > > While from a querying perspective what I would love to achieve is that I
> > > > could express something like the following:
> > > >
> > > > "Query all the addresses in the database".
> > > > "Do a SUM, AVG, MAX and MIN on all the ordinal-scale numbers in the
> > > > database".
> > > >
> > > > It will also help a lot in generating templates for data integration. If
> > > > I am trying to move data from one table to another then I can probably do
> > > > a lot of automatic mapping based on the metadata. Same goes for reporting
> > > > I guess and stuff like that.
> > > >
> > > > Best regards,
> > > > Kasper
> > > >
> > > > 2015-07-15 12:06 GMT+02:00 Alberto Rodriguez <[email protected]>:
> > > >
> > > > > Ok, so you are not thinking of discovering more metadata "on-the-fly";
> > > > > your approach is to statically define metadata for the datasource, then
> > > > > load it and mix it in with the existing metadata, right?
> > > > >
> > > > > IMHO with this approach we will add a strong dependency between the
> > > > > data itself and the "external" metadata. Correct me if I'm wrong or
> > > > > don't fully understand your proposal, but if the datastore changes (one
> > > > > column is deleted or a new column is added) the metadata will become
> > > > > obsolete.
> > > > >
> > > > > Kind regards,
> > > > >
> > > > > Alberto
> > > > >
> > > > > 2015-07-13 19:26 GMT+02:00 Kasper Sørensen <[email protected]>:
> > > > >
> > > > > > I was thinking of having something like pluggable annotations or
> > > > > > features that could be added to tables, columns or groups of columns.
> > > > > > Maybe also to other entities. But since a lot of this is not
> > > > > > available as a thing that can be explored in the datastore itself I
> > > > > > guess it would need to be stored externally.
> > > > > >
> > > > > > Examples of features that I could imagine:
> > > > > >
> > > > > > Data type features: NominalScale, OrdinalScale, ...
> > > > > > Data conversion features: IntegerAsString, DateAsString,
> > > > > > TimestampAsLong ...
> > > > > > Domain features: FirstName, LastName, AddressLine, AddressCity,
> > > > > > AddressCountry, DateYear, DateMonth ...
> > > > > > And groupings of columns also in domain-like features: Name (composed
> > > > > > of e.g. first and last name), Address (composed of multiple address
> > > > > > fields), Date (composed of year, month etc.)
> > > > > >
> > > > > > I would like to store them so that I can save and load them, saving
> > > > > > the developer the work of restoring all the metadata again and again.
> > > > > > Is that not sensible?
> > > > > >
> > > > > > Best regards,
> > > > > > Kasper
> > > > > >
> > > > > > 2015-07-13 9:55 GMT+02:00 Alberto Rodriguez <[email protected]>:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > We are also facing similar issues so I completely agree with this
> > > > > > > feature. In fact, we recently added a new field called "format" to
> > > > > > > the metadata in our service layer; we needed this field to specify
> > > > > > > different format types for the dates returned by our datastores.
> > > > > > >
> > > > > > > However, I'm not really sure how to implement this feature... I
> > > > > > > guess we should keep getting the metadata "core" from our
> > > > > > > datasources but what about the new metadata?
> > > > > > >
> > > > > > > - How to fill it out? Will the integrator of MM provide functions
> > > > > > >   to define when an element is going to be "x" and when it is going
> > > > > > >   to be "y"? (thinking here of providing lambda functions)
> > > > > > > - Do we really need a metadata store?
> > > > > > >
> > > > > > > Kind regards,
> > > > > > >
> > > > > > > 2015-07-10 12:51 GMT+02:00 Kasper Sørensen <[email protected]>:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > All the time I see more and more need for us to add metadata to
> > > > > > > > our MetaModel based connectors. That could be for instance
> > > > > > > > metadata about scale (nominal, ordinal etc.) so that we can
> > > > > > > > automate some stats collection etc. or it could be more "meaning"
> > > > > > > > oriented features to describe e.g. "This is a first name" or
> > > > > > > > "This is a city" or "These two fields (first and last name) are
> > > > > > > > together defining a name of a person".
> > > > > > > >
> > > > > > > > We have such mechanisms in many places at our application levels,
> > > > > > > > but not at the core framework of MetaModel and that's a pity
> > > > > > > > because it makes it harder for us to share.
> > > > > > > >
> > > > > > > > So I'm thinking of adding such a layer to the metadata of
> > > > > > > > MetaModel. But one thing that's difficult is then about
> > > > > > > > representing that metadata in some "metadata store" which isn't
> > > > > > > > necessarily the same as the data source itself. It could be an
> > > > > > > > XML file or it could be a complete metadata database. And I think
> > > > > > > > that this metadata would be mutable by the integrator of
> > > > > > > > MetaModel because it is rarely fully revealed by the data source
> > > > > > > > itself.
> > > > > > > >
> > > > > > > > What do you think? Nice feature or?
> > > > > > > >
> > > > > > > > Kasper
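
As a rough illustration of the pluggable feature idea discussed in this thread, the `hasFeatureOfType` / `getFeatureOfType` calls from the draft could be backed by something as simple as a list of feature objects per column. This is only a sketch under assumptions, not a proposed implementation: the class bodies, the `addFeature` method, the `pattern` field on `DateAsStringFeature` and the use of a plain string path to identify the column are all invented here for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Marker type for anything that can be attached to a column (hypothetical). */
interface ColumnFeature {
}

/** Feature named in the thread: the column holds nominal-scale data. */
final class NominalColumnFeature implements ColumnFeature {
}

/** Conversion feature named in the thread; the pattern field is an assumption. */
final class DateAsStringFeature implements ColumnFeature {
    final String pattern;

    DateAsStringFeature(String pattern) {
        this.pattern = pattern;
    }
}

/** Holds the features attached to one column, identified here by a string path. */
final class ColumnMetadata {
    private final String columnPath;
    private final List<ColumnFeature> features = new ArrayList<>();

    ColumnMetadata(String columnPath) {
        this.columnPath = columnPath;
    }

    String getColumnPath() {
        return columnPath;
    }

    /** Attaches a feature and returns this object so calls can be chained. */
    ColumnMetadata addFeature(ColumnFeature feature) {
        features.add(feature);
        return this;
    }

    /** Read-only view of all attached features. */
    List<ColumnFeature> getFeatures() {
        return Collections.unmodifiableList(features);
    }

    boolean hasFeatureOfType(Class<? extends ColumnFeature> type) {
        return getFeatureOfType(type) != null;
    }

    /** Returns the first feature of the given type, or null if there is none. */
    <F extends ColumnFeature> F getFeatureOfType(Class<F> type) {
        for (ColumnFeature feature : features) {
            if (type.isInstance(feature)) {
                return type.cast(feature);
            }
        }
        return null;
    }
}
```

With this shape, `new ColumnMetadata("orders.order_date").addFeature(new DateAsStringFeature("yyyy-MM-dd"))` gives a column whose date format can be looked up by feature type. How such objects would be persisted in an external metadata store is left open, as it is in the thread.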
