[jira] [Comment Edited] (ATLAS-1839) Area 2 of the open metadata model

2017-07-20 Thread Nigel Jones (JIRA)

[ 
https://issues.apache.org/jira/browse/ATLAS-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094613#comment-16094613
 ] 

Nigel Jones edited comment on ATLAS-1839 at 7/20/17 12:37 PM:
--

[~davidrad] the toxic combination support in Ranger policies is primarily 
geared to controlling what a user may access, whilst the validation [~ivarea] 
is suggesting is primarily about creates and updates, i.e. defining the data 
model itself. That's not to say Ranger couldn't do this (since it can address 
any operation, such as a create), but I don't think that's Ranger's intent. 
I agree it's a fine line, though, and it could well vary significantly between 
environments.

As such, I think it makes sense to define validations in Atlas and to be able 
to link them to the code artifacts and services that implement them, probably 
through a combination of discovery and stewardship. It should also be easier, 
when writing pipelines for, say, ETL or streaming, to pull in Atlas metadata 
and capture a link between a validation implemented by a pipeline author (or 
used from a library) and its definition in Atlas. Atlas thus ends up with both 
the "intent" (the business spec, if you like) and links to the implementations, 
yet does not constrain those implementations, since they can be so varied. This 
approach is also incremental and easy to adopt, and it means one can get value 
without everything being in place.
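
To make the "intent plus links" idea concrete, here is a minimal sketch in 
Python. It is illustrative only: the class names, field names, and the example 
URI are hypothetical and are not Atlas types or part of the open metadata 
model. It only shows a validation definition holding loose references to 
whatever implements it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImplementationLink:
    """Pointer to code or a service that enforces a validation (hypothetical)."""
    kind: str         # e.g. "library", "etl-job", "service"
    location: str     # URI or coordinates of the implementing artifact
    captured_by: str  # e.g. "discovery", "stewardship", "pipeline-author"


@dataclass
class ValidationDefinition:
    """The business intent of a validation, as it might be held as metadata."""
    name: str
    description: str
    implementations: List[ImplementationLink] = field(default_factory=list)

    def link(self, impl: ImplementationLink) -> None:
        # Loose coupling: record where the rule is implemented without
        # storing or constraining the implementation itself.
        self.implementations.append(impl)

    def is_implemented(self) -> bool:
        return bool(self.implementations)


# The definition is usable before any implementation has been linked.
rule = ValidationDefinition(
    name="postcode-format",
    description="Postcode values must match the national format.",
)
implemented_before = rule.is_implemented()

# A pipeline author later records which job enforces the rule.
rule.link(
    ImplementationLink("etl-job", "example://jobs/clean-addresses", "pipeline-author")
)
implemented_after = rule.is_implemented()
```

The definition carries no executable logic, so a full rules engine, a library 
call, or a hand-written check can all be linked in the same way.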

Following on from this, some of those validations could absolutely be 
implemented as complex rules using a full-featured rules engine, but I think it 
would be tricky and constraining to capture all of that in Atlas, which is why 
I'd go for the link approach and some relatively loose coupling.

With that done, sure, we could have a more complex rules engine embedded in, 
or used by, Ranger plugins, but this would be one of a number of different 
approaches.

I'd be inclined to start with us figuring out how to model this, along with 
some use cases where we can explore authoring (i.e. in Atlas), assisted 
authoring (when writing a job), and metadata capture (from those other systems, 
which also relates to lineage). It is probably best to do that in ATLAS-1995? 
This also touches on RANGER-1869 (metadata capture).

Certainly this is an interesting area!


> Area 2 of the open metadata model
> -
>
> Key: ATLAS-1839
> URL: https://issues.apache.org/jira/browse/ATLAS-1839
> Project: Atlas
> Issue Type: Task
> Components: atlas-core
> Affects Versions: 0.9-incubating
> Reporter: Mandy Chessell
> Assignee: David Radley
> Labels: OpenMetadata, VirtualDataConnector
> Attachments: 0005LinkedMediaTypes.json, 0210Glossary.json, 
> 0220CategoryHierarchy.json, 0230Terms.json, 0240Dictionary.json, 
> 0250RelatedTerms.json, 0260Contexts.json, 0270SemanticAssignment.json, 
> 

[jira] [Comment Edited] (ATLAS-1839) Area 2 of the open metadata model

2017-07-20 Thread Nigel Jones (JIRA)

[ 
https://issues.apache.org/jira/browse/ATLAS-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094613#comment-16094613
 ] 

Nigel Jones edited comment on ATLAS-1839 at 7/20/17 12:36 PM:
--

[~davidrad] the toxic combination support in Ranger policies is primarily 
geared to controlling what a user may access, whilst the validation [~ivarea] 
is suggesting is primarily about creates and updates, i.e. defining the data 
model itself. That's not to say Ranger couldn't do this (since it can address 
any operation, such as a create), but I don't think that's Ranger's intent. 
I agree it's a fine line, though, and it could well vary significantly between 
environments.

As such, I think it makes sense to define validations in Atlas and to be able 
to link them to the code artifacts and services that implement them, probably 
through a combination of discovery and stewardship. It should also be easier, 
when writing pipelines for, say, ETL or streaming, to pull in Atlas metadata 
and capture a link between a validation implemented by a pipeline author (or 
used from a library) and its definition in Atlas. Atlas thus ends up with both 
the "intent" (the business spec, if you like) and links to the implementations, 
yet does not constrain those implementations, since they can be so varied. 

Following on from this, some of those validations could absolutely be 
implemented as complex rules using a full-featured rules engine, but I think it 
would be tricky and constraining to capture all of that in Atlas, which is why 
I'd go for the link approach and some relatively loose coupling.

With that done, sure, we could have a more complex rules engine embedded in, 
or used by, Ranger plugins, but this would be one of a number of different 
approaches.

I'd be inclined to start with us figuring out how to model this, along with 
some use cases where we can explore authoring (i.e. in Atlas), assisted 
authoring (when writing a job), and metadata capture (from those other systems, 
which also relates to lineage). It is probably best to do that in ATLAS-1995? 
This also touches on RANGER-1869 (metadata capture).

Certainly this is an interesting area!


was (Author: jonesn):
[~davidrad] the toxic combination support in Ranger policies is primarily 
geared to controlling what a user may access, whilst the validation [~ivarea] 
is suggesting is primarily about creates and updates, i.e. defining the data 
model itself. That's not to say Ranger couldn't do this (since it can address 
any operation, such as a create), but I don't think that's Ranger's intent. 
I agree it's a fine line, though, and it could well vary significantly between 
environments.

As such, I think it makes sense to define validations in Atlas and to be able 
to link them to the code artifacts and services that implement them, probably 
through a combination of discovery and stewardship. It should also be easier, 
when writing pipelines for, say, ETL or streaming, to pull in Atlas metadata 
and capture a link between a validation implemented by a pipeline author (or 
used from a library) and its definition in Atlas. Atlas thus ends up with both 
the "intent" (the business spec, if you like) and links to the implementations, 
yet does not constrain those implementations, since they can be so varied. 

Following on from this, some of those validations could absolutely be 
implemented as complex rules, but I think it would be tricky and constraining 
to capture all of that in Atlas, which is why I'd go for the link approach and 
some relatively loose coupling.

With that done, sure, we could have a more complex rules engine embedded in, 
or used by, Ranger plugins, but this would be one of a number of different 
approaches.

I'd be inclined to start with us figuring out how to model this, along with 
some use cases where we can explore authoring (i.e. in Atlas), assisted 
authoring (when writing a job), and metadata capture (from those other systems, 
which also relates to lineage). It is probably best to do that in ATLAS-1995? 
This also touches on RANGER-1869 (metadata capture).

Certainly this is an interesting area!

> Area 2 of the open metadata model
> -
>
> Key: ATLAS-1839
> URL: https://issues.apache.org/jira/browse/ATLAS-1839
> Project: Atlas
> Issue Type: Task
> Components: atlas-core
> Affects Versions: 0.9-incubating
> Reporter: Mandy Chessell
> Assignee: David Radley
> Labels: OpenMetadata, VirtualDataConnector
> Attachments: 0005LinkedMediaTypes.json, 0210Glossary.json, 
> 0220CategoryHierarchy.json, 0230Terms.json, 0240Dictionary.json, 
> 0250RelatedTerms.json, 0260Contexts.json, 0270SemanticAssignment.json, 
> 0280SpineObjects.json
>
>
> This task delivers the JSON files for the new models that describe types for 
> Area 2 in the open metadata 

[jira] [Comment Edited] (ATLAS-1839) Area 2 of the open metadata model

2017-06-28 Thread David Radley (JIRA)

[ 
https://issues.apache.org/jira/browse/ATLAS-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066435#comment-16066435
 ] 

David Radley edited comment on ATLAS-1839 at 6/28/17 1:22 PM:
--

I have specified these glossary files as JSON. They depend on the 0005 file 
that I have attached to the area 0 Jira. 
I have not included Taxonomy in the glossary, as this type already exists in 
my Atlas due to the legacy terms. 
I have not included versionable in these files and have created the Glossary 
entityDefs subclassing Referenceable. Recent discussions have talked about 
implementing versioning using system attributes rather than explicitly in a 
model. A versioning design will describe the approach in more detail so the 
community can review it. Any comments at this stage, on versioning or 
anything else, are welcome.

The association between classificationDefs and entityDefs cannot be specified 
yet. 
In related terms I have called ISA IsA - I can revert if the capitalized form 
is required. 

The files should be loaded in order. I am testing using a Java app rather than 
putting the files in a place where they are picked up during initialization.
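
As a rough sketch of what "loaded in order" could look like, the Python 
snippet below sorts model files by their leading number. This is an 
illustration, not the Java app mentioned above: the `load_order` helper is 
hypothetical, and it assumes the intended order is ascending numeric prefix, 
which the thread does not spell out. The file names are attachment names 
taken from this thread.

```python
import re
from typing import List


def load_order(filenames: List[str]) -> List[str]:
    """Return model files sorted by their leading numeric prefix."""
    def prefix(name: str) -> int:
        match = re.match(r"(\d+)", name)
        if match is None:
            # Fail loudly rather than guess where an unnumbered file belongs.
            raise ValueError(f"no numeric prefix in {name!r}")
        return int(match.group(1))

    return sorted(filenames, key=prefix)


attachments = [
    "0230Terms.json",
    "0210Glossary.json",
    "0005LinkedMediaTypes.json",
    "0220CategoryHierarchy.json",
]
ordered = load_order(attachments)
for name in ordered:
    print(name)
```

Sorting on the parsed integer rather than the raw string keeps the order 
correct even if a file name is not zero-padded.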

The 230 file fails to load at this time. I have not tried the later ones.

I am publishing these files so the community can play with them.  



 



was (Author: davidrad):
I have specified these glossary files as JSON. They depend on the 0005 file 
that I have attached to the area 0 Jira. 
I have not included Taxonomy in the glossary, as this type already exists in 
my Atlas due to the legacy terms. 
The association between classificationDefs and entityDefs cannot be specified 
yet. 
In related terms I have called ISA IsA - I can revert if the capitalized form 
is required. 

The files should be loaded in order. I am testing using a Java app rather than 
putting the files in a place where they are picked up during initialization.

The 230 file fails to load at this time. I have not tried the later ones.

I am publishing these files so the community can play with them.  



 


> Area 2 of the open metadata model
> -
>
> Key: ATLAS-1839
> URL: https://issues.apache.org/jira/browse/ATLAS-1839
> Project: Atlas
> Issue Type: Task
> Components: atlas-core
> Affects Versions: 0.9-incubating
> Reporter: Mandy Chessell
> Assignee: Mandy Chessell
> Labels: OpenMetadata, VirtualDataConnector
> Attachments: 0210Glossary.json, 0220CategoryHierarchy.json, 
> 0230Terms.json, 0240Dictionary.json, 0250RelatedTerms.json
>
>
> This task delivers the JSON files for the new models that describe types for 
> Area 2 in the open metadata model. This area covers the glossary.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)