[ 
https://issues.apache.org/jira/browse/ATLAS-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16130253#comment-16130253
 ] 

Israel Varea commented on ATLAS-1955:
-------------------------------------

I can't find a strong argument for either option. A useful question for 
choosing between them is: must validations be reusable across different 
attributes?
If the answer is yes, go with Option 2. If reusable validations are not that 
important, go with Option 1, since it is simpler.
I think both options will cover most of the common use cases anyway.
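
To make the reusability question concrete, here is a rough sketch of the two styles. The two options are not spelled out in this message, so reading Option 1 as "validation defined inline on the attribute" and Option 2 as "named validation referenced by attributes" is my assumption. All names here (inline_model, shared_model, Company.ContactEmail) are hypothetical and not part of any Atlas API.

```python
import re

# Pattern taken from the issue description, with the dot escaped so it
# matches a literal "." (assumption: that is what was intended).
EMAIL_PATTERN = r"[0-9a-z]+@[0-9a-z]+\.[0-9a-z]+"

# Inline style: each attribute carries its own copy of the rule.
inline_model = {
    "Person.Email": {"regex": EMAIL_PATTERN},
    "Company.ContactEmail": {"regex": EMAIL_PATTERN},  # duplicated rule
}

# Reusable style: the rule is defined once and referenced by name.
validations = {"EmailRegex": EMAIL_PATTERN}
shared_model = {
    "Person.Email": {"hasValidation": "EmailRegex"},
    "Company.ContactEmail": {"hasValidation": "EmailRegex"},
}


def is_valid_inline(attribute: str, value: str) -> bool:
    """Check a value against the rule stored on the attribute itself."""
    return re.fullmatch(inline_model[attribute]["regex"], value) is not None


def is_valid_shared(attribute: str, value: str) -> bool:
    """Resolve the named validation first, then check the value."""
    name = shared_model[attribute]["hasValidation"]
    return re.fullmatch(validations[name], value) is not None
```

With the reusable style, changing the email rule means editing one entry in validations. With the inline style, every attribute that copied the rule has to be updated separately.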



> Validation for Attributes
> -------------------------
>
>                 Key: ATLAS-1955
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1955
>             Project: Atlas
>          Issue Type: New Feature
>          Components:  atlas-core
>    Affects Versions: 0.9-incubating
>            Reporter: Israel Varea
>            Assignee: Richard Ding
>             Fix For: 0.9-incubating
>
>
> It would be very nice if the Atlas model had a way to represent 
> attribute validation. 
> A simple example is that we would like to model a Person, with attributes 
> Name, Email and Country. Now we would like to specify that Email has to 
> follow a specific regular expression, so it would be nice if we could set 
> Email -> hasValidation -> EmailRegex, with EmailRegex having:
> Name: Email Regular Expression
> Expression: /[0-9a-z]+@[0-9a-z]+\.[0-9a-z]+/
> For more complex types of validation, e.g. checking card number validity, 
> an external validator function or service could be added.
> Name: Credit Card Number Validator
> Validator: org.apache.atlas.validators.creditcard or 
> https://host:port/creditCardValidator
> For validations from a reference table, for example a country name, it could 
> be:
> Name: Country Name Ref Validator
> Reference Column: <country_name_column>
> where <country_name_column> would be an instance of type Hive_Column or 
> HBase_Column.
> Since this is a kind of standardization, it could be placed in [Area 
> 5|https://cwiki.apache.org/confluence/display/ATLAS/Area+5+-+Standards].
> A similar approach is followed in 
> [Kylo|https://github.com/Teradata/kylo/tree/master/integrations/spark/spark-validate-cleanse].
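
The three kinds of validation in the quoted description (regular expression, external validator, reference column) could all sit behind one small interface. This is only a sketch under assumptions, not how Atlas does or would implement it. The Luhn checksum stands in for the external credit card validator, which the description only names by class or URL. The hard-coded country set stands in for the values of the reference column. The email pattern escapes the dot so that it matches a literal ".". All function names are hypothetical.

```python
import re
from typing import Callable, Dict, Set

# A validator is any function from an attribute value to True/False.
Validator = Callable[[str], bool]


def regex_validator(pattern: str) -> Validator:
    """Build a validator that requires the whole value to match pattern."""
    compiled = re.compile(pattern)
    return lambda value: compiled.fullmatch(value) is not None


def luhn_validator(value: str) -> bool:
    """Stand-in for an external card validator: the Luhn checksum."""
    compact = value.replace(" ", "")
    if len(compact) < 2 or not compact.isdigit():
        return False
    total = 0
    # Walk the digits right to left, doubling every second one.
    for index, char in enumerate(reversed(compact)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def reference_validator(allowed: Set[str]) -> Validator:
    """Build a validator that accepts only values from a reference set."""
    return lambda value: value in allowed


# Illustrative data only; a real reference column would be read from
# the Hive or HBase table it points at.
COUNTRIES = {"Spain", "France", "Germany"}

VALIDATORS: Dict[str, Validator] = {
    "Email Regular Expression": regex_validator(
        r"[0-9a-z]+@[0-9a-z]+\.[0-9a-z]+"
    ),
    "Credit Card Number Validator": luhn_validator,
    "Country Name Ref Validator": reference_validator(COUNTRIES),
}


def validate(validator_name: str, value: str) -> bool:
    """Look up a validator by its Name field and apply it to value."""
    return VALIDATORS[validator_name](value)
```

For example, validate("Country Name Ref Validator", "Spain") accepts the value, while validate("Email Regular Expression", "bob@example") rejects it because the domain has no dot.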



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
