An additional note/idea that just came to mind: Some features like these
might be detectable, and some might even be intrinsically available in the
data source. For instance, in Salesforce.com many of the fields and tables
are standard ones. That means that for this data source we could add
features coming from the DataContext itself. My suggestion would be to add
an optional interface for DataContexts that could expose such information.
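
To make the optional interface idea concrete, here is a minimal sketch. All
names in it (IntrinsicFeatureSource, StandardFieldsContext, the column paths)
are made up for illustration, and DataContext is a stand-in type so the
snippet compiles on its own; it is not the real MetaModel interface.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IntrinsicFeaturesSketch {

    /** Stand-in for MetaModel's DataContext; left empty for this sketch. */
    interface DataContext {
    }

    /** Hypothetical optional interface: a DataContext implements it to expose features it already knows. */
    interface IntrinsicFeatureSource {
        /** Returns feature names for a column path; empty if none are known. */
        List<String> getIntrinsicFeatures(String columnPath);
    }

    /** Toy data source with a few standard fields (illustrative values only). */
    static class StandardFieldsContext implements DataContext, IntrinsicFeatureSource {
        private final Map<String, List<String>> known = new HashMap<>();

        StandardFieldsContext() {
            known.put("Contact.FirstName", Collections.singletonList("FirstName"));
            known.put("Contact.LastName", Collections.singletonList("LastName"));
        }

        @Override
        public List<String> getIntrinsicFeatures(String columnPath) {
            return known.getOrDefault(columnPath, Collections.<String>emptyList());
        }
    }

    /** A DataContext that does not opt in to the optional interface. */
    static class PlainContext implements DataContext {
    }

    /** Asks the DataContext for intrinsic features only if it opted in. */
    static List<String> intrinsicFeatures(DataContext dc, String columnPath) {
        if (dc instanceof IntrinsicFeatureSource) {
            return ((IntrinsicFeatureSource) dc).getIntrinsicFeatures(columnPath);
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        System.out.println(intrinsicFeatures(new StandardFieldsContext(), "Contact.FirstName"));
        System.out.println(intrinsicFeatures(new PlainContext(), "Contact.FirstName"));
    }
}
```

Because the interface is optional, existing DataContext implementations would
not need to change; only those that know something about their own fields
would opt in.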

As for detection I need to think a bit more, or maybe some of you have
ideas. Maybe something a la the schema inference stuff we have where a
detector object gets a preview of the data in order to try and determine
metadata.
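
A rough sketch of what such a detector could look like, assuming it is handed
a column name and a small preview of values. The ColumnFeatureDetector
interface and the detectAll helper are hypothetical names; "IntegerAsString"
is just one of the conversion features mentioned earlier in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class FeatureDetectorSketch {

    /** Hypothetical detector: inspects a preview of a column's values and names the features it finds. */
    interface ColumnFeatureDetector {
        List<String> detect(String columnName, List<?> previewValues);
    }

    /** Flags a column whose previewed values are all strings that look like integers. */
    static class IntegerAsStringDetector implements ColumnFeatureDetector {
        @Override
        public List<String> detect(String columnName, List<?> previewValues) {
            if (previewValues.isEmpty()) {
                return Collections.emptyList(); // nothing to base a guess on
            }
            for (Object value : previewValues) {
                boolean looksLikeInteger = value instanceof String
                        && ((String) value).trim().matches("-?\\d+");
                if (!looksLikeInteger) {
                    return Collections.emptyList();
                }
            }
            return Collections.singletonList("IntegerAsString");
        }
    }

    /** Runs every detector over the same preview and collects what they report. */
    static List<String> detectAll(List<? extends ColumnFeatureDetector> detectors,
                                  String columnName, List<?> previewValues) {
        List<String> found = new ArrayList<>();
        for (ColumnFeatureDetector detector : detectors) {
            found.addAll(detector.detect(columnName, previewValues));
        }
        return found;
    }

    public static void main(String[] args) {
        List<ColumnFeatureDetector> detectors =
                Collections.<ColumnFeatureDetector>singletonList(new IntegerAsStringDetector());
        System.out.println(detectAll(detectors, "age", Arrays.asList("1", "42", "-7")));
        System.out.println(detectAll(detectors, "name", Arrays.asList("Ann", "Bob")));
    }
}
```

A preview can only ever suggest a feature, so detected features would
probably need to be treated as proposals that the integrator can confirm or
override.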

2015-07-15 19:41 GMT+02:00 Kasper Sørensen <[email protected]>:

> That's absolutely a good idea. One thing is describing the JSON structure,
> which I guess would not be different from other features (although more
> elaborate than average). But the moment you mention this, I start thinking
> about also querying "into" the JSON, or any other map structure. Many of
> our NoSQL-like modules also have more advanced data structures - for
> instance MongoDB, CouchDB, ElasticSearch and so on.
>
> Querying into those data structures could be considered a separate feature
> though. I mean - that could be useful regardless of having metadata about
> the data structure. Even today I would love to somehow query with a
> function or something like that where I could provide a kind of path (XPath
> like or so) into the nested structure. Ankit, if this is also what you
> would like, then I suggest starting a separate mail thread on it (unless
> you feel it is tied very closely to this metadata thing).
>
> 2015-07-15 18:55 GMT+02:00 Ankit Kumar <[email protected]>:
>
>> Hi Kasper,
>>
>> Sounds like a nice idea.
>>
>> Would this also allow defining complex metadata on a single column? For
>> example, if the content of the column is a JSON object, would it be
>> possible to define the structure of the JSON and which fields within it
>> have which behavioural aspects (as metadata)? It may be an extreme case
>> for later, but I can already see the benefit of this feature.
>>
>> Regards
>> Ankit
>>
>> On Wed, Jul 15, 2015 at 4:58 PM, Kasper Sørensen <
>> [email protected]> wrote:
>>
>> > Glad that you like the idea :-) And yes, I also would like to involve as
>> > many as possible in building it.
>> >
>> > My high level design ideas:
>> >
>> > We would leave the existing DataContext API as-is to the extent
>> > possible. I would see this as a service that is connected to the
>> > DataContext, but since the "new" metadata and the existing metadata
>> > might differ (as Alberto mentioned before) it is not bound directly to
>> > the schema, table, column etc. Rather, you should be able to resolve
>> > things using either Schema, Table and Column objects, or simply by
>> > paths (strings).
>> >
>> > Here are some example method calls and class names that I could
>> > imagine. I haven't thought long about the naming so it's just an "idea
>> > generating" draft. I imagine the metadata "features" to be pluggable,
>> > kind of like annotations on Java constructs:
>> >
>> > DataContext dc = ...
>> > MetadataService svc = ...
>> >
>> > Table table = dc.getDefaultSchema().getTable(0);
>> >
>> > List<ColumnGroupMetadata> groups = svc.getColumnGroupMetadata(table);
>> > boolean isAddressGroup =
>> >     groups.get(0).hasFeatureOfType(AddressColumnGroupFeature.class);
>> > AddressColumnGroupFeature addressFeature =
>> >     groups.get(0).getFeatureOfType(AddressColumnGroupFeature.class);
>> > ColumnMetadata countryColumnMetadata = addressFeature.getCountryColumn();
>> > List<ColumnMetadata> addressLineColumnMetadataList =
>> >     addressFeature.getAddressLineColumns();
>> >
>> > Column column = table.getColumn(0);
>> >
>> > ColumnMetadata cm = svc.getColumnMetadata(column);
>> > List<ColumnFeature> features = cm.getFeatures();
>> > boolean isNominal = cm.hasFeatureOfType(NominalColumnFeature.class);
>> > NominalColumnFeature ncf = cm.getFeatureOfType(NominalColumnFeature.class);
>> >
>> >
>> > BR,
>> > Kasper
>> >
>> >
>> > 2015-07-15 13:07 GMT+02:00 Alberto Rodriguez <[email protected]>:
>> >
>> > > That's pretty cool. Sorry, but I was not taking into account the
>> > > powerful queries that you might bring into the equation with this
>> > > stuff. It all makes sense now.
>> > >
>> > > It would definitely be a big feature that might be divided into
>> > > smaller tasks. If we go ahead with this stuff I really would like to
>> > > help with the implementation.
>> > >
>> > > Kind regards,
>> > >
>> > > Alberto
>> > >
>> > > 2015-07-15 12:41 GMT+02:00 Kasper Sørensen <[email protected]>:
>> > >
>> > > > That's absolutely true. It's not that I want to stop discovering
>> > > > what we can, but I was more thinking of also adding a mechanism to
>> > > > plug in your own metadata. I guess it's pretty rare that a database
>> > > > itself offers a "domain oriented" metadata system where I could tell
>> > > > it that "this field is a zip code, and together with field X, Y and
>> > > > Z it forms a single address".
>> > > >
>> > > > From a querying perspective, what I would love to achieve is that
>> > > > I could express something like the following:
>> > > >
>> > > > "Query all the addresses in the database".
>> > > > "Do a SUM, AVG, MAX and MIN on all the ordinal-scale numbers in the
>> > > > database".
>> > > >
>> > > > It will also help a lot in generating templates for data
>> > > > integration. If I am trying to move data from one table to another
>> > > > then I can probably do a lot of automatic mapping based on the
>> > > > metadata. The same goes for reporting and similar tasks, I guess.
>> > > >
>> > > > Best regards,
>> > > > Kasper
>> > > >
>> > > > 2015-07-15 12:06 GMT+02:00 Alberto Rodriguez <[email protected]>:
>> > > >
>> > > > > Ok, so you are not thinking of discovering more metadata
>> > > > > "on-the-fly"; your approach is to statically define metadata for
>> > > > > the datasource, then load it and mix it in with the existing
>> > > > > metadata, right?
>> > > > >
>> > > > > IMHO this approach adds a strong dependency between the data
>> > > > > itself and the "external" metadata. Correct me if I'm wrong or
>> > > > > have not fully understood your proposal, but if the datastore
>> > > > > changes (one column is deleted or a new column is added) the
>> > > > > metadata will become obsolete.
>> > > > >
>> > > > > Kind regards,
>> > > > >
>> > > > > Alberto
>> > > > >
>> > > > > 2015-07-13 19:26 GMT+02:00 Kasper Sørensen <[email protected]>:
>> > > > >
>> > > > > > I was thinking of having something like pluggable annotations or
>> > > > > > features that could be added to tables, columns or groups of
>> > > > > > columns. Maybe also to other entities. But since a lot of this is
>> > > > > > not available as something that can be explored in the datastore
>> > > > > > itself, I guess it would need to be stored externally.
>> > > > > >
>> > > > > > Examples of features that I could imagine:
>> > > > > >
>> > > > > > Data type features: NominalScale, OrdinalScale, ...
>> > > > > > Data conversion features: IntegerAsString, DateAsString,
>> > > > > > TimestampAsLong ...
>> > > > > > Domain features: FirstName, LastName, AddressLine, AddressCity,
>> > > > > > AddressCountry, DateYear, DateMonth ...
>> > > > > > And groupings of columns, also as domain-like features: Name
>> > > > > > (composed of e.g. first and last name), Address (composed of
>> > > > > > multiple address fields), Date (composed of year, month etc.)
>> > > > > >
>> > > > > > I would like to store them so that I can save and load them,
>> > > > > > saving the developer the work of re-creating all the metadata
>> > > > > > again and again. Is that not sensible?
>> > > > > >
>> > > > > > Best regards,
>> > > > > > Kasper
>> > > > > >
>> > > > > >
>> > > > > > 2015-07-13 9:55 GMT+02:00 Alberto Rodriguez <[email protected]>:
>> > > > > >
>> > > > > > > Hi all,
>> > > > > > >
>> > > > > > > We are also facing similar issues so I completely agree with
>> > > > > > > this feature. In fact, we recently added a new field called
>> > > > > > > "format" to the metadata in our service layer; we needed this
>> > > > > > > field to specify different format types for the dates returned
>> > > > > > > by our datastores.
>> > > > > > >
>> > > > > > > However, I'm not really sure how to implement this feature... I
>> > > > > > > guess we should keep getting the "core" metadata from our
>> > > > > > > datasources, but what about the new metadata?
>> > > > > > >
>> > > > > > >    - How to fill it out? Will the integrator of MM provide
>> > > > > > >      functions to define when an element is going to be "x" and
>> > > > > > >      when it is going to be "y"? (thinking here of providing
>> > > > > > >      lambda functions)
>> > > > > > >    - Do we really need a metadata store?
>> > > > > > >
>> > > > > > > Kind regards,
>> > > > > > >
>> > > > > > >
>> > > > > > > 2015-07-10 12:51 GMT+02:00 Kasper Sørensen <[email protected]>:
>> > > > > > >
>> > > > > > > > Hi all,
>> > > > > > > >
>> > > > > > > > More and more often I see a need for us to add metadata to
>> > > > > > > > our MetaModel based connectors. That could be for instance
>> > > > > > > > metadata about scale (nominal, ordinal etc.) so that we can
>> > > > > > > > automate some stats collection etc., or it could be more
>> > > > > > > > "meaning" oriented features to describe e.g. "This is a first
>> > > > > > > > name" or "This is a city" or "These two fields (first and
>> > > > > > > > last name) together define the name of a person".
>> > > > > > > >
>> > > > > > > > We have such mechanisms in many places at our application
>> > > > > > > > level, but not in the core framework of MetaModel, and that's
>> > > > > > > > a pity because it makes it harder for us to share.
>> > > > > > > >
>> > > > > > > > So I'm thinking of adding such a layer to the metadata of
>> > > > > > > > MetaModel. But one difficult thing is representing that
>> > > > > > > > metadata in some "metadata store" which isn't necessarily the
>> > > > > > > > same as the data source itself. It could be an XML file or it
>> > > > > > > > could be a complete metadata database. And I think that this
>> > > > > > > > metadata would be mutable by the integrator of MetaModel
>> > > > > > > > because it is rarely fully revealed by the data source itself.
>> > > > > > > >
>> > > > > > > > What do you think? Nice feature or?
>> > > > > > > >
>> > > > > > > > Kasper
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
