[
https://issues.apache.org/jira/browse/ATLAS-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16129558#comment-16129558
]
Richard Ding commented on ATLAS-1955:
-------------------------------------
I had an offline discussion with [~davidrad] on how to add attribute
validations to the type system. Option 2 seems cleaner because it separates
validation rules from the attributes they validate, but it does add another
top-level type to the system. Please let me know what you think.
*Option 1: Embed the validation rules inside the attribute definition*
Add a subclass AtlasValidationDef to the AtlasStructDef class and make
AtlasValidationDef an element of AtlasAttributeDef. For example, an email
attribute can contain a validation definition:
{code}
{
"name": "email",
"typeName": "string",
"cardinality": "SINGLE",
"validations": [
{
"type": "regex",
"validator": "[0-9a-z]+@[0-9a-z]+\\.[0-9a-z]+"
}
],
"isIndexable": false,
"isOptional": true,
"isUnique": false
}
{code}
Notes:
# The validationDefs will be serialized / deserialized as part of the
attributeDef JSON string.
# The validationDefs associated with attributeDefs will be retrieved and
invoked when validateValue or validateValueForUpdate (of the AtlasStructType
class) is called.
# Initially, we’ll support three validation types: regex, lookup and class
#* regex: the validator value is a regex string
#* lookup: the validator value is the name of an existing AtlasEnumDef
#* class: the validator value is the name of a validator class (e.g.
org.apache.atlas.model.validation.CreditCardValidator). Validator classes all
implement the AttributeValidator interface. These classes can be built in (part
of Atlas) or dynamically loaded via the Java provider framework.
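To make the three validation types concrete, here is a minimal Java sketch of how a validator could be built from a (type, validator) pair. It is only an illustration under assumptions: the isValid signature, the create factory, the NonBlankValidator class and the map standing in for AtlasEnumDef lookups are all hypothetical and are not existing Atlas APIs.

```java
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

final class AttributeValidationSketch {

    /** Hypothetical shape of the AttributeValidator interface; the signature is assumed. */
    interface AttributeValidator {
        boolean isValid(Object value);
    }

    /** "regex": the validator string is a regular expression the whole value must match. */
    static final class RegexValidator implements AttributeValidator {
        private final Pattern pattern;

        RegexValidator(String regex) {
            this.pattern = Pattern.compile(regex);
        }

        @Override
        public boolean isValid(Object value) {
            return value instanceof String && pattern.matcher((String) value).matches();
        }
    }

    /** "lookup": a plain set of allowed values stands in for an AtlasEnumDef here. */
    static final class LookupValidator implements AttributeValidator {
        private final Set<String> allowed;

        LookupValidator(Set<String> allowed) {
            this.allowed = allowed;
        }

        @Override
        public boolean isValid(Object value) {
            return value instanceof String && allowed.contains(value);
        }
    }

    /** Hypothetical target for the "class" type; a real validator would check more. */
    static final class NonBlankValidator implements AttributeValidator {
        @Override
        public boolean isValid(Object value) {
            return value instanceof String && !((String) value).trim().isEmpty();
        }
    }

    /** Builds a validator from the "type" and "validator" fields of a validation definition. */
    static AttributeValidator create(String type, String validator,
                                     Map<String, Set<String>> enums) {
        switch (type) {
            case "regex":
                return new RegexValidator(validator);
            case "lookup":
                Set<String> allowed = enums.get(validator);
                if (allowed == null) {
                    throw new IllegalArgumentException("unknown enum: " + validator);
                }
                return new LookupValidator(allowed);
            case "class":
                try {
                    // Instantiate the named class through its no-argument constructor.
                    return Class.forName(validator)
                            .asSubclass(AttributeValidator.class)
                            .getDeclaredConstructor()
                            .newInstance();
                } catch (ReflectiveOperationException e) {
                    throw new IllegalArgumentException("cannot load validator: " + validator, e);
                }
            default:
                throw new IllegalArgumentException("unsupported validation type: " + type);
        }
    }

    public static void main(String[] args) {
        AttributeValidator email =
                create("regex", "[0-9a-z]+@[0-9a-z]+\\.[0-9a-z]+", Map.of());
        System.out.println(email.isValid("alice@example.com")); // true
        System.out.println(email.isValid("alice_at_example"));  // false
    }
}
```

In this sketch the regex must match the whole value (matches() rather than find()), which is one possible reading of "the validator value is a regex string"; the actual matching semantics would need to be decided as part of the design.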
*Option 2: Validation rules as a top-level type definition*
Here we define _AtlasAttributeValidationType_ (and
_AtlasAttributeValidationDef_) as a top-level Atlas type, similar to
_AtlasEnumType_ (and _AtlasEnumDef_), and add an optional validation field to
_AtlasAttributeDef_. For example, we first define
_AtlasAttributeValidationDefs_:
{code}
"validationDefs": [
{
"name": "email_validation",
"typeVersion": "1.0",
"type": "regex",
"validator": "[0-9a-z]+@[0-9a-z]+\\.[0-9a-z]+"
},
{
"name": "country_code_validation",
"typeVersion": "1.0",
"type": "lookup",
"validator": "country_code_enum_type"
},
{
"name": "credit_card_validation",
"typeVersion": "1.0",
"type": "class",
"validator": "org.apache.atlas.model.validation.CreditCardValidator"
}
]
{code}
Then we define the validation field in the email attributeDef:
{code}
{
"name": "email",
"typeName": "string",
"cardinality": "SINGLE",
"validation": "email_validation",
"isIndexable": false,
"isOptional": true,
"isUnique": false
}
{code}
Notes:
# As a top-level Atlas type, an AtlasAttributeValidationType instance will be
stored as a vertex in the backing graph db.
# If the validation field exists in an AtlasAttributeDef object, the attribute
value will be validated against the referenced validationDef in the
validateValue or validateValueForUpdate method (of the AtlasStructType class).
# Initially, we’ll support three validation types: regex, lookup and class
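As a rough illustration of the by-name indirection in option 2, the following Java sketch keeps validation definitions in a registry and resolves the attribute's validation field at validation time. Everything here is assumed for the sake of the example: ValidationRegistrySketch, ValidationDef, AttributeDef and the method names are hypothetical stand-ins rather than Atlas classes, and only the regex type is handled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class ValidationRegistrySketch {

    /** Hypothetical stand-in for an AtlasAttributeValidationDef. */
    static final class ValidationDef {
        final String name;
        final String type;
        final String validator;

        ValidationDef(String name, String type, String validator) {
            this.name = name;
            this.type = type;
            this.validator = validator;
        }
    }

    /** Hypothetical stand-in for an attributeDef; validation may be null (field is optional). */
    static final class AttributeDef {
        final String name;
        final String validation;

        AttributeDef(String name, String validation) {
            this.name = name;
            this.validation = validation;
        }
    }

    private final Map<String, ValidationDef> defsByName = new HashMap<>();

    /** Registers a top-level validation definition under its name. */
    void register(ValidationDef def) {
        defsByName.put(def.name, def);
    }

    /** Validates a value against the definition the attribute refers to, if any. */
    boolean validateValue(AttributeDef attr, Object value) {
        if (attr.validation == null) {
            return true; // no validation field, nothing to check
        }
        ValidationDef def = defsByName.get(attr.validation);
        if (def == null) {
            throw new IllegalStateException("unknown validation: " + attr.validation);
        }
        if (!"regex".equals(def.type)) {
            // lookup and class are left out to keep the sketch short
            throw new UnsupportedOperationException("type not handled here: " + def.type);
        }
        return value instanceof String && Pattern.matches(def.validator, (String) value);
    }
}
```

With "email_validation" registered as a regex definition, an attribute whose validation field is "email_validation" would accept "alice@example.com" and reject "alice_at_example", while an attribute without a validation field accepts any value. One consequence of this design is that several attributes can share one definition, at the cost of a dangling-reference case that option 1 does not have.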
> Validation for Attributes
> -------------------------
>
> Key: ATLAS-1955
> URL: https://issues.apache.org/jira/browse/ATLAS-1955
> Project: Atlas
> Issue Type: New Feature
> Components: atlas-core
> Affects Versions: 0.9-incubating
> Reporter: Israel Varea
> Assignee: Richard Ding
> Fix For: 0.9-incubating
>
>
> It would be very nice if the Atlas model could contain a way to represent
> attribute validation.
> A simple example is that we would like to model a Person, with attributes
> Name, Email and Country. Now we would like to specify that Email has to
> follow a specific regular expression, so it would be nice if we could set
> Email -> hasValidation -> EmailRegex, with EmailRegex having:
> Name: Email Regular Expression
> Expression: /[0-9a-z]+@[0-9a-z]+.[0-9a-z]+/
> For more complex types of validation, e.g. checking card number validity, an
> external validator function/service could be added.
> Name: Credit Card Number Validator
> Validator: org.apache.atlas.validators.creditcard or
> https://host:port/creditCardValidator
> For validations from a reference table, for example a country name, it could
> be:
> Name: Country Name Ref Validator
> Reference Column: <country_name_column>
> where <country_name_column> would be an instance of type Hive_Column or
> HBase_Column.
> Since this is a kind of Standardization, it could be placed in [Area
> 5|https://cwiki.apache.org/confluence/display/ATLAS/Area+5+-+Standards].
> A similar approach is followed in software
> [Kylo|https://github.com/Teradata/kylo/tree/master/integrations/spark/spark-validate-cleanse]